A Classification Framework Based on Enhanced Global Feature Extraction for Large Models

Affiliation:

1. College of Computer Science and Mathematics, Fujian University of Technology, Fuzhou 350118, Fujian, China; 2. Fujian Provincial Key Laboratory of Information Processing and Intelligent Control, Minjiang University, Fuzhou 350108, Fujian, China

Author profile:

CHEN Kewei (born 2000), male, master's student; research interests: natural language processing and large language models. E-mail: 1019375578@qq.com.

CLC number:

TP399

Fund Project:

National Natural Science Foundation of China (62172095); Natural Science Foundation of Fujian Province (2023J01349); Open Project of the Fujian Provincial Key Laboratory of Information Processing and Intelligent Control, Minjiang University (MJUKF-IPIC2024402)

    Abstract:

    Large language models (LLMs) are usually adapted to downstream tasks through instruction fine-tuning, which improves their generalization ability. For classification tasks, however, this approach has performance limitations and sometimes fails to meet task requirements. To address this problem, a classification framework for large models based on global feature extraction is proposed. The framework uses the global feature extraction enhancement method proposed in this paper, which releases global features at the attention layer and then enhances them. During fine-tuning, the deep low-rank fine-tuning optimization loss proposed in this paper is applied. Together these steps yield a classification large model built on global feature extraction. Compared with the baseline model RoBERTa, accuracy on the general sentiment analysis datasets SST-2 and AGNews improves by 1.44 and 0.95 percentage points, respectively. Compared with the baseline model PIQN, F1 scores on the general named entity recognition (NER) datasets OntoNotes and CoNLL2003 improve by 0.79% and 1.99%, respectively. The experimental results show that, without complex prompt engineering or external knowledge, a large model using this framework significantly outperforms LLMs several times its size.
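
    The abstract names two ingredients, a sequence-level ("global") feature taken from the attention layer and low-rank fine-tuning, without giving their exact form. The sketch below is only an illustration under stated assumptions: mean pooling stands in for the global feature extraction step, and a LoRA-style update W + (alpha / r) * B @ A stands in for the low-rank fine-tuning. All function and variable names are hypothetical, and none of this should be read as the paper's actual method.

```python
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]

def mean_pool(hidden_states):
    """Average token vectors into one sequence-level vector.
    Assumption: a stand-in for the paper's global feature extraction."""
    n = len(hidden_states)
    return [sum(col) / n for col in zip(*hidden_states)]

def low_rank_weight(w, a, b, alpha):
    """Return W + (alpha / r) * B @ A, with B of shape d x r and A of shape r x d.
    Assumption: a LoRA-style update, not necessarily the paper's loss or scheme."""
    r = len(a)
    scale = alpha / r
    delta = matmul(b, a)
    return [[wij + scale * dij for wij, dij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]

def classify(hidden_states, w_proj, w_head):
    """Project tokens, pool them into a global vector, and pick the top class."""
    projected = matmul(hidden_states, transpose(w_proj))
    pooled = mean_pool(projected)
    logits = [sum(h * x for h, x in zip(row, pooled)) for row in w_head]
    return max(range(len(logits)), key=logits.__getitem__)

random.seed(0)
d, r, n_classes, n_tokens = 16, 2, 2, 5

# Frozen pretrained projection (random values here, purely for illustration).
w = [[random.gauss(0, 0.1) for _ in range(d)] for _ in range(d)]
# Trainable low-rank factors; B starts at zero so the update is initially a no-op.
a = [[random.gauss(0, 0.1) for _ in range(d)] for _ in range(r)]
b = [[0.0] * r for _ in range(d)]
# Classification head mapping the pooled vector to class logits.
w_head = [[random.gauss(0, 0.1) for _ in range(d)] for _ in range(n_classes)]

tokens = [[random.gauss(0, 1) for _ in range(d)] for _ in range(n_tokens)]
w_eff = low_rank_weight(w, a, b, alpha=4)
label = classify(tokens, w_eff, w_head)

full_params = d * d          # parameters updated by full fine-tuning of W
low_rank_params = 2 * d * r  # parameters updated when only A and B are trained
```

    With these toy sizes, only the two low-rank factors (64 values) would be trained instead of the full 256-value projection, which is the usual motivation for low-rank fine-tuning.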

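Cite this article

CHEN Kewei, LIU Jianhua, CHEN Zhiming, et al. A classification framework based on enhanced global feature extraction for large models[J]. Journal of East China Jiaotong University, 2026, 43(2): 115-126.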

History
  • Received: 2024-12-24
  • Published online: 2026-05-20