Supervised Contrastive Learning with Term Weighting for Improving Chinese Text Classification  

Authors: Jiabao Guo, Bo Zhao, Hui Liu, Yifan Liu, Qian Zhong

Affiliation: [1] School of Cyber Science and Engineering, Wuhan University, Wuhan 430000, China

Source: Tsinghua Science and Technology, 2023, Issue 1, pp. 59-68 (10 pages). Chinese journal title: 清华大学学报(自然科学版)(英文版)

Funding: Supported by the National Natural Science Foundation of China (No. U1936122) and the Primary Research & Development Plan of Hubei Province (Nos. 2020BAB101 and 2020BAA003).

Abstract: With the rapid growth of information retrieval technology, Chinese text classification, which is the basis of information content security, has become a widely discussed topic. Because Chinese differs greatly from English, Chinese text tasks involve more complex semantic representations. However, most existing Chinese text classification approaches treat feature representation and feature selection as the key points but fail to take into account a learning strategy that adapts to the task. Moreover, these approaches compress each Chinese word into a representation vector without considering the distribution of the term among the categories of interest. To improve Chinese text classification performance, this paper proposes a unified method called Supervised Contrastive Learning with Term Weighting (SCL-TW). Supervised contrastive learning makes full use of a large amount of unlabeled data to improve model stability. In SCL-TW, we calculate term weighting scores to optimize the data augmentation of Chinese text. Subsequently, the transformed features are fed into a temporal convolution network for feature representation. Experiments are conducted on two Chinese benchmark datasets. The results demonstrate that SCL-TW outperforms other advanced Chinese text classification approaches by a substantial margin.

Keywords: Chinese text classification; Supervised Contrastive Learning (SCL); Term Weighting (TW); Temporal Convolution Network (TCN)

Classification code: H31 [Language and Linguistics: English]
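
Note: The abstract names supervised contrastive learning as the training strategy but does not give the loss. As a rough illustration only, the sketch below implements the commonly used supervised contrastive loss, in which samples sharing a class label are pulled together and all others are pushed apart. It is not taken from the paper: the function name `supcon_loss`, the `temperature` default, and the plain-list embedding format are all assumptions, and the term weighting and temporal convolution network steps are not modeled here.

```python
import math


def supcon_loss(embeddings, labels, temperature=0.1):
    """Mean supervised contrastive loss over anchors that have a positive.

    Hypothetical helper for illustration; not the SCL-TW implementation.
    embeddings: list of equal-length float vectors, one per sample.
    labels: list of class labels, aligned with embeddings.
    """
    # L2-normalise each vector so dot products are cosine similarities.
    normed = []
    for vec in embeddings:
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        normed.append([x / norm for x in vec])

    n = len(normed)
    total, counted = 0.0, 0
    for i in range(n):
        # Positives: every other sample with the same label as the anchor.
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue  # an anchor without positives contributes nothing

        # Temperature-scaled similarity of the anchor to every other sample.
        sims = {
            j: sum(a * b for a, b in zip(normed[i], normed[j])) / temperature
            for j in range(n)
            if j != i
        }

        # Numerically stable log-sum-exp for the softmax denominator.
        peak = max(sims.values())
        log_denom = peak + math.log(sum(math.exp(s - peak) for s in sims.values()))

        # Average negative log-probability assigned to the positives.
        total += -sum(sims[j] - log_denom for j in positives) / len(positives)
        counted += 1

    return total / counted if counted else 0.0
```

With this formulation, a batch whose labels agree with the geometry of the embeddings yields a lower loss than the same embeddings paired with labels that split each cluster, which is the property a contrastive objective relies on.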

 
