A Long Text Classification Model Based on Graph Contrastive Learning
刘宇昊,高榕,严灵毓,叶志伟
Abstract:
Character-level text classification methods run into two problems on long texts: the input dimensionality becomes so large that computation is difficult, and the sheer length makes long-range dependencies hard to capture, both of which degrade accuracy. To address this, a graph contrastive learning model for long text classification based on an adaptive view generator and debiased negative sampling is proposed. A long text is first split into paragraphs, and each paragraph is embedded with a BERT-derived model. Guided by the high-level structure of the text, the paragraph embeddings are then treated as nodes to construct a graph. An adaptive view generator augments the graph, and graph contrastive learning yields the text representation; during the negative-sampling stage of graph contrastive learning, PU Learning is introduced to correct the bias of negative sampling. Finally, the resulting text representation is classified by two linear layers. Experiments on two Chinese datasets show that the method outperforms mainstream state-of-the-art models.
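As a concrete illustration of the first two steps, below is a minimal Python sketch (all names hypothetical; PyTorch and the HuggingFace transformers library assumed): a long text is split into paragraphs, each paragraph is embedded with a BERT-derived encoder, and the paragraph embeddings become the nodes of a document graph. The abstract only says that edges follow the text's high-level structure; linking consecutive paragraphs here is a simplifying assumption, not the paper's exact construction.

```python
# Minimal sketch of the front half of the pipeline (names hypothetical):
# split a long text into paragraphs, embed each with a BERT-derived encoder,
# and treat the paragraph embeddings as nodes of a document graph.

import torch
from transformers import AutoModel, AutoTokenizer

ENCODER_NAME = "bert-base-chinese"  # any BERT-derived model would do
tokenizer = AutoTokenizer.from_pretrained(ENCODER_NAME)
encoder = AutoModel.from_pretrained(ENCODER_NAME)


def embed_paragraphs(text: str) -> torch.Tensor:
    """One embedding per paragraph: the node features of the document graph."""
    paragraphs = [p.strip() for p in text.split("\n") if p.strip()]
    batch = tokenizer(paragraphs, padding=True, truncation=True,
                      max_length=512, return_tensors="pt")
    with torch.no_grad():
        hidden = encoder(**batch).last_hidden_state
    return hidden[:, 0]  # [CLS] vector of each paragraph


def build_document_graph(num_nodes: int) -> torch.Tensor:
    """Adjacency with self-loops, linking consecutive paragraphs
    (a simplifying assumption about the text's high-level structure)."""
    adj = torch.eye(num_nodes)
    idx = torch.arange(num_nodes - 1)
    adj[idx, idx + 1] = 1.0
    adj[idx + 1, idx] = 1.0
    return adj
```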
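The PU-based correction of negative sampling is only named in the abstract, not specified. The sketch below follows the closely related debiased-contrastive idea built on PU Learning and non-negative (nnPU) risk estimation (cf. refs [15] and [19]): sampled "negatives" are treated as unlabeled, the expected contribution of false negatives is subtracted using an assumed class prior, and the estimate is clamped from below. The function name and the prior value are hypothetical, not the paper's.

```python
import torch
import torch.nn.functional as F


def pu_corrected_info_nce(anchor, positive, negatives, tau=0.5, prior=0.1):
    """InfoNCE whose negative term is corrected in the PU spirit (a sketch):
    `prior` is the assumed probability that a sampled 'negative' is actually
    positive; the corrected estimate is clamped from below as in nnPU.

    anchor, positive: (d,) tensors; negatives: (m, d) tensor.
    """
    anchor = F.normalize(anchor, dim=-1)
    positive = F.normalize(positive, dim=-1)
    negatives = F.normalize(negatives, dim=-1)

    pos = torch.exp(anchor @ positive / tau)    # positive-pair score
    neg = torch.exp(negatives @ anchor / tau)   # scores of unlabeled samples
    m = negatives.size(0)

    # Debias: remove the expected false-negative contribution, then clamp.
    corrected = (neg.mean() - prior * pos) / (1.0 - prior)
    floor = torch.exp(torch.tensor(-1.0 / tau))  # cosine similarity >= -1
    corrected = torch.maximum(corrected, floor)

    return -torch.log(pos / (pos + m * corrected))
```

In use, `anchor` and `positive` would be the embeddings of the two augmented views of the same document graph, and `negatives` the embeddings of other documents in the batch.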
Keywords: text representation; long text classification; graph contrastive learning; negative sampling
References:
- [1]KOWSARI K,JAFARI MEIMANDI K,HEIDARYSAFA M,et al.Text classification algorithms:A survey[J].Information,2019,10(04):150.
- [2]MEDHAT W,HASSAN A,KORASHY H.Sentiment analysis algorithms and applications:A survey[J].Ain Shams Engineering Journal,2014,5(04):1093-1113.
- [3]YATES A,NOGUEIRA R,LIN J.Pretrained transformers for text ranking:BERT and beyond[C]∥Proceedings of the 14th ACM International Conference on Web Search and Data Mining.2021:1154-1156.
- [4]MA X,ZHU Q,ZHOU Y,et al.Improving question generation with sentence-level semantic matching and answer position inferring[C]∥Proceedings of the AAAI Conference on Artificial Intelligence.2020,34(05):8464-8471.
- [5]WANG Z,LIU X,YANG P,et al.Cross-lingual text classification with heterogeneous graph neural network[C]∥Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing(ACL/IJCNLP 2021,Volume 2:Short Papers).2021:612-620.
- [6]CHAFFAR S,INKPEN D.Using a heterogeneous dataset for emotion analysis in text[C]∥Canadian conference on artificial intelligence.Springer,Berlin,Heidelberg,2011:62-67.
- [7]MIKOLOV T,CHEN K,CORRADO G,et al.Efficient estimation of word representations in vector space[C]∥1st International Conference on Learning Representations(ICLR 2013),Scottsdale,Arizona,USA,2013.http://arxiv.org/abs/1301.3781.
- [8]LILLEBERG J,ZHU Y,ZHANG Y.Support vector machines and word2vec for text classification with semantic features[C]∥2015 IEEE 14th International Conference on Cognitive Informatics & Cognitive Computing(ICCI*CC).IEEE,2015:136-140.
- [9]VASWANI A,SHAZEER N,PARMAR N,et al.Attention is all you need[C]∥Advances in Neural Information Processing Systems 30:Annual Conference on Neural Information Processing Systems 2017.2017:5998-6008.
- [10]DEVLIN J,CHANG M W,LEE K,et al.BERT:Pre-training of deep bidirectional transformers for language understanding[C]∥Proceedings of NAACL-HLT.2019:4171-4186.
- [11]VAN DEN OORD A,LI Y,VINYALS O.Representation learning with contrastive predictive coding[J/OL].CoRR,2018,abs/1807.03748.http://arxiv.org/abs/1807.03748.
- [12]SUN C,QIU X,XU Y,et al.How to fine-tune BERT for text classification?[C]∥China National Conference on Chinese Computational Linguistics.Springer,Cham,2019:194-206.
- [13]MOHANTY I,GOYAL A,DOTTERWEICH A.Emotions are subtle:learning sentiment based text representations using contrastive learning[J/OL].[2021-04-15].CoRR,2021,abs/2112.01054.https://arxiv.org/abs/2112.01054.
- [14]XU P,CHEN X,MA X,et al.Contrastive Document Representation Learning with Graph Attention Networks[C]∥Findings of the Association for Computational Linguistics:EMNLP 2021:3874-3884.
- [15]DU PLESSIS M,NIU G,SUGIYAMA M.Convex formulation for learning from positive and unlabeled data[C]∥International conference on machine learning.PMLR,2015:1386-1394.
- [16]VELIČKOVIĆ P,CUCURULL G,CASANOVA A,et al.Graph attention networks[C]∥6th International Conference on Learning Representations(ICLR 2018),Vancouver,Canada,2018.
- [17]JANG E,GU S,POOLE B.Categorical reparameterization with gumbel-softmax[C]∥5th International Conference on Learning Representations(ICLR 2017),Toulon,France,2017.https://openreview.net/forum?id=rke3y85ee.
- [18]CHU G,WANG X,SHI C,et al.CuCo:Graph representation with curriculum contrastive learning[C]∥Proceedings of the 30th International Joint Conference on Artificial Intelligence(IJCAI 2021).2021:2300-2306.
- [19]KIRYO R,NIU G,DU PLESSIS M C,et al.Positive-unlabeled learning with non-negative risk estimator[C]∥Advances in Neural Information Processing Systems 30:Annual Conference on Neural Information Processing Systems 2017.2017:1675-1685.
- [20]HINTON G,VINYALS O,DEAN J.Distilling the knowledge in a neural network[J/OL].[2022-04-15].CoRR,2015,abs/1503.02531.http://arxiv.org/abs/1503.02531.
- [21]WANG F,LIU H.Understanding the behaviour of contrastive loss[C]∥Proceedings of the IEEE/CVF conference on computer vision and pattern recognition.2021:2495-2504.
- [22]LI J Y,SUN M S.Non-independent term selection for Chinese text categorization[J].Tsinghua Science and Technology,2009(01):115-122.
- [23]WANG C,ZHANG M,MA S,et al.Automatic online news issue construction in web environment[C]∥Proceedings of the 17th International Conference on World Wide Web(WWW'08).New York,NY,USA:ACM,2008:457-466.
- [24]LAI S,XU L,LIU K,et al.Recurrent convolutional neural networks for text classification[C]∥Twenty-ninth AAAI conference on artificial intelligence.2015.
- [25]ZHOU P,SHI W,TIAN J,et al.Attention-based bidirectional long short-term memory networks for relation classification[C]∥Proceedings of the 54th annual meeting of the association for computational linguistics(volume 2:Short papers).2016:207-212.
- [26]KIM J,JANG S,PARK E,et al.Text classification using capsules[J].Neurocomputing,2020,376:214-221.
- [27]BELTAGY I,PETERS M E,COHAN A.Longformer:The long-document transformer[J/OL].[2022-04-15].CoRR,2020,abs/2004.05150.https://arxiv.org/abs/2004.05150.
- [28]HUAN H,YAN J,XIE Y,et al.Feature-enhanced nonequilibrium bidirectional long short-term memory model for Chinese text classification[J].IEEE Access,2020,8:199629-199637.
- [29]YANG Z,YANG D,DYER C,et al.Hierarchical attention networks for document classification[C]∥Proceedings of the 2016 conference of the North American chapter of the association for computational linguistics:human language technologies.2016:1480-1489.
- [30]YOU Y,CHEN T,SUI Y,et al.Graph contrastive learning with augmentations[J].Advances in Neural Information Processing Systems,2020,33:5812-5823.