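Authors: CUI Dingjie (崔丁洁); XU Bing (徐冰) [1] (Harbin Institute of Technology, Harbin 150001, China)
Source: Intelligent Computer and Applications (《智能计算机与应用》), 2023, No. 5, pp. 197-202, F0003 (7 pages in total)
Funding: National Key Research and Development Program of China (2020YFB1406902).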
Abstract: A growing volume of user-generated content on the internet provides a new window and channel for promoting mainstream values. To address the new and relatively complex task of text quality evaluation oriented to mainstream values, this paper defines text quality according to mainstream values and constructs a text quality evaluation dataset for this task. To ease the burden of manual data annotation and the difficulty of obtaining in-domain data, it proposes a text quality evaluation method based on an unsupervised data augmentation framework. Experiments show that the method significantly improves model performance when the amount of data is small. To obtain more data, the authors built a large-scale Chinese microblog retrieval database and expanded the dataset through retrieval. The final model reaches an F1 score of 86.2%, an improvement of 1.22% over BERT.
Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]