Page 425 - 《软件学报》2025年第4期

P. 425

邹慧琪等: 基于图神经网络的复杂时空数据挖掘方法综述 1831

(1) 大模型作为增强器
文本属性图 (TAG) 作为普通图的一种, 图上每个节点包括丰富的文本特征, 其中一个常见的例子是论文引用图.
He 等人 [63] 提出的 TAPE 设计了一种新颖的架构, 如图 10 所示, 它将大语言模型作为增强器, 先将原始的文章
相关信息通过设计 prompt 和问题输入到 LLM 后生成预测和相应的解释, 再将生成的解释文本序列和原始文本序
列输入到语言模型中进行微调, 从而生成更丰富的节点特征表示. 最后, 将微调语言模型生成的解释表示和原始文
本和 LLM 生成的 top-k 预测的表示作为输入分别训练图神经网络并对输出取平均可用于下游预测任务中. 其中,
TAPE 中的大模型使用了 GPT3.5 [64] , 通过语言建模即服务 (LMaaS) 进行访问 [65] , 而无需进行改动和微调, 用于微
调的语言模型是 DeBERTa [66] . TAPE 较好地解决了 LLM 训练开销大的问题, 并结合了语言模型规模较小, 适合进
行微调和图神经网络对图拓扑结构信息提取能力强的特点, 是文本属性图大模型相关工作的一个重要探索.

Step l : Node feature extraction Step 2: Downstream tasks
on TAGs
Prediction: cs.CV, cs.IR, cs.CL, cs.LG, cs.AI.
Explanation: The paper is about a new dataset for scene text
Abstract: Text in curve orientation, despite being one of
(2) 大模型作为预测器
the common text orientations in real world environment… detection and recognition, which is a topic related to computer vision
Title: Total Text A Comprehensive Dataset For Scene Text (cs.CV). The paper also mentions the use of deep learning techniques
Detection And Recognition. such as DeconvNet, which falls under the sub-category of artificial
Question: Which arXiv CS sub-category does this paper intelligence (cs.AI). The dataset is annotated and involves text
belong to? Give 5 likely arXiv CS sub-categories as a recognition, which could also fall under the sub-categories of
comma-separated list ordered from most to least likely, in information retrieval (cs.IR) and natural language processing (cs.CL).
the form "cs.XX", and provide your reasoning. Finally, the paper discusses the effectiveness of different solutions,
Answer: which could be evaluated using machine learning techniques, falling Graph structure
under the sub-category of machine learning (cs.LG).
A
h pred
Prompt
Prediction:
Query Response Fine-tune h expl
Title: LLM LM GNN
Abstract:
(GPT3.5 (175B)) Explanation: Deberta RevGAT Y
(129M) h orig (1.8M)
Frozen
Fine-tune
Trainable Trainable
Without fine-tuning Shallow embedding techniques h ogb
e.g., skip-gram/bags of words
Shallow embedding pipeline (e.g., OGB) LM-based pipeline (e.g., GLANT) LLM-based pipeline (Ours)
图 10 TAPE 的架构 [63]

除了普通的文本属性图的相关应用, 还有一类常见的基于普通图的大模型应用于推荐系统, 比如 Ren 等人 [67]
提出的 RLMRec 相对于传统的仅基于 ID (即用户/物品标识) 的信息推荐, 该方法还额外提取了用户和物品相关的
文本信息, 它将 GNN 增强协同过滤 [68] 得到的用户-物品协作关系与通过 LLM 得到的增强语义表示通过对比建模
进行对齐, 最后通过最大化互信息以最优化推荐系统的优化目标. 其中, RLMRec 中的 LLM 将设计的系统 prompt
和用户/物品概述的 prompt 作为输入, 从而生成更准确的用户/物品概述, 所以该模型中 LLM 充当增强器的作用.
用户/物品概述的 prompt 中包括标题、原始描述、数据集特定属性和用户评论等信息.

Tang 等人 [69] 提出的 GraphGPT 中的 LLM 则是作为预测器, 它将文本信息和图数据进行对齐, 并使用双阶段
的图指令微调方法, 分别是自监督的图匹配任务和特定任务的图数据指令微调来提高对图数据结构的理解能力和
对不同的下游任务的泛化能力, 并通过融合“思维链”的方式提升模型的推理能力. 具体来说, 在对齐阶段, 为了将
文本信息融合到图数据结构编码过程, GraphGPT 引入了对比学习的方法, 通过使用真值转换函数和交叉熵损失函
数进行对比对齐, 其中图编码器用的是 graph Transformer [70] , 文本编码器用的是经典 Transformer [26] . 在双阶段的图
指令微调部分, 第 1 阶段需要将图结构的相关信息融入到大语言模型中, 在指令构建阶段, GraphGPT 将图中的节

420 421 422 423 424 425 426 427 428 429 430