Chinese-English News Domain Dictionary Construction and Text Classification (汉英新闻领域词典构建及文本分类)

Author: ZHANG Yanyan (张彦彦) (Henan University of Economics and Law, Zhengzhou 450046, China; Information Engineering University, Zhengzhou 450001, China)

Affiliations: [1] Henan University of Economics and Law, Zhengzhou, Henan 450046 [2] Information Engineering University, Zhengzhou, Henan 450001

Source: Journal of Information Engineering University (《信息工程大学学报》), 2023, No. 6, pp. 669-674 (6 pages)

Abstract: To address domain crossover and sparse semantic features in news text, this paper proposes a fine-grained news text classification method that appends domain dictionary feature vectors, built with a hierarchical network of concepts word knowledge base, to meet the need for multi-level domain classification of news text. Experimental results show that the multi-layer text classifier with appended domain dictionary feature vectors performs well in both parent-domain and sub-domain classification. Overall, first-layer classification outperforms second-layer classification, and second-layer results are affected by the layer above: parent domains that are classified more accurately also perform better in sub-domain classification.
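The idea of appending a domain dictionary feature vector to a base text representation, then classifying in two layers, can be illustrated with a small sketch. This is not the paper's implementation: the toy dictionary, the training sentences, the bag-of-words base features, and the nearest-centroid classifier are all assumptions chosen to keep the example self-contained, and the hierarchical network of concepts knowledge base is not modeled here.

```python
from collections import Counter

# Hypothetical toy domain dictionary: term -> (parent domain, sub-domain).
DOMAIN_DICT = {
    "election": ("politics", "domestic"), "parliament": ("politics", "domestic"),
    "treaty": ("politics", "diplomacy"), "summit": ("politics", "diplomacy"),
    "goal": ("sports", "football"), "striker": ("sports", "football"),
    "dunk": ("sports", "basketball"), "rebound": ("sports", "basketball"),
}
# Hypothetical labeled examples: (text, parent domain, sub-domain).
TRAIN = [
    ("election parliament vote", "politics", "domestic"),
    ("treaty summit talks", "politics", "diplomacy"),
    ("goal striker match", "sports", "football"),
    ("dunk rebound match", "sports", "basketball"),
]
PARENTS = sorted({p for p, _ in DOMAIN_DICT.values()})
SUBS = {p: sorted({s for q, s in DOMAIN_DICT.values() if q == p}) for p in PARENTS}
VOCAB = sorted({w for text, _, _ in TRAIN for w in text.split()})


def dictionary_features(tokens, labels, level):
    """Share of dictionary hits per label (level 0 = parent, 1 = sub-domain)."""
    hits = Counter(
        DOMAIN_DICT[t][level] for t in tokens
        if t in DOMAIN_DICT and DOMAIN_DICT[t][level] in labels
    )
    total = sum(hits.values()) or 1
    return [hits[label] / total for label in labels]


def featurize(tokens, labels, level):
    """Bag-of-words counts with the dictionary feature vector appended."""
    base = [tokens.count(w) for w in VOCAB]
    return base + dictionary_features(tokens, labels, level)


def fit_centroids(samples):
    """Average the feature vectors of each label; samples are (vector, label)."""
    sums, counts = {}, Counter()
    for vec, label in samples:
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, v in enumerate(vec):
            acc[i] += v
        counts[label] += 1
    return {lab: [v / counts[lab] for v in acc] for lab, acc in sums.items()}


def predict(centroids, vec):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    def dist(label):
        return sum((a - b) ** 2 for a, b in zip(centroids[label], vec))
    return min(sorted(centroids), key=dist)


# Layer 1: one classifier over parent domains.
LAYER1 = fit_centroids(
    [(featurize(t.split(), PARENTS, 0), parent) for t, parent, _ in TRAIN]
)
# Layer 2: one classifier per parent domain, trained only on that parent's texts.
LAYER2 = {
    p: fit_centroids(
        [(featurize(t.split(), SUBS[p], 1), sub) for t, parent, sub in TRAIN if parent == p]
    )
    for p in PARENTS
}


def classify(text):
    """Predict the parent domain first, then the sub-domain within that parent."""
    tokens = text.split()
    parent = predict(LAYER1, featurize(tokens, PARENTS, 0))
    sub = predict(LAYER2[parent], featurize(tokens, SUBS[parent], 1))
    return parent, sub
```

Because the second layer is selected by the first layer's prediction, a wrong parent domain cannot be recovered at the sub-domain level, which is consistent with the abstract's observation that second-layer results depend on the layer above.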

Keywords: news text; hierarchical network of concepts; text classification; domain dictionary

Classification number: TP393 [Automation and Computer Technology: Computer Application Technology]

 
