Authors: XU Guixian [1]; LI Xiaorong (School of Information Engineering, Minzu University of China, Beijing 100081, China)
Source: Journal of Minzu University of China (Natural Sciences Edition), 2024, No. 3, pp. 62-72 (11 pages)
Funding: Beijing Social Science Fund Project (20YYB011).
Abstract: Unsupervised clustering aims to partition data into meaningful or useful clusters according to distances in the representation space, but different categories often overlap in that space. To separate the categories well, this work starts from the instance contrastive learning model SCCL, changes its activation function to Tanh, and replaces its single-layer perceptron (SLP) with a multilayer perceptron, yielding the Clustering with Deep Contrastive Learning model (CDCL). The model first feeds the original Chinese long-text dataset into BERT, which serves as the neural feature extraction layer. It then passes all extracted features to the Instance-wise Contrastive Learning (Instance-CL) layer, which optimizes them. Finally, it clusters the optimized features with K-means. On the THUCNews dataset, CDCL improves Chinese long-text clustering accuracy by 10%-25% compared with unsupervised clustering. The model better separates data whose categories overlap, and its experimental results are significantly better than those of other existing related models.
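Keywords: instance contrastive learning model; deep contrastive learning clustering model; long text clustering; K-means; instance contrastive learning layer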
Classification code: TP301 [Automation and Computer Technology: Computer System Architecture]
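Illustrative sketch: the abstract describes an MLP with Tanh activation feeding an instance-wise contrastive objective, but it gives no layer sizes, loss formula, or temperature. The following minimal Python sketch therefore rests on assumptions: it uses the NT-Xent form of instance-wise contrastive loss (a common choice, not confirmed by the record), applies Tanh to hidden layers only, and introduces the hypothetical names mlp_head, cosine, and instance_contrastive_loss and the temperature value 0.5. BERT feature extraction and the final K-means step are omitted; plain lists of floats stand in for the extracted features.

```python
import math

def mlp_head(x, layers):
    """Project a feature vector through a multilayer perceptron.

    `layers` is a list of (weight_matrix, bias_vector) pairs. Tanh is applied
    after every layer except the last one (an assumption, not from the paper).
    """
    for index, (weights, biases) in enumerate(layers):
        x = [sum(w * v for w, v in zip(row, x)) + b
             for row, b in zip(weights, biases)]
        if index < len(layers) - 1:
            x = [math.tanh(v) for v in x]
    return x

def cosine(u, v):
    """Cosine similarity between two vectors, guarded against zero norms."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)

def instance_contrastive_loss(view_a, view_b, temperature=0.5):
    """NT-Xent loss: view_a[i] and view_b[i] are two views of instance i.

    Each vector is pulled toward its paired view and pushed away from every
    other vector in the batch.
    """
    z = view_a + view_b
    n = len(view_a)
    total = 0.0
    for i in range(2 * n):
        positive = (i + n) % (2 * n)  # index of the other view of the same instance
        logits = [cosine(z[i], z[j]) / temperature
                  for j in range(2 * n) if j != i]
        positive_logit = cosine(z[i], z[positive]) / temperature
        # Log-sum-exp over all other vectors, shifted by the max for stability.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_denominator - positive_logit
    return total / (2 * n)
```

In this sketch, a batch whose paired views agree yields a lower loss than a batch whose pairs are mismatched, which is the pressure that pulls overlapping categories apart before a clustering step such as K-means is run on the optimized features.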