An Automatic Algorithm of Text Categorization Imitating Human's (一种模仿人类的自动文本分类算法)  Cited by: 5


Authors: Wang Shumei (王树梅)[1], Dai Baocun (戴保存)[1], Huang Heyan (黄河燕)[2], Chen Zhaoxiong (陈肇雄)[2]

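Affiliations: [1] Department of Computer Science, Nanjing University of Science and Technology, Nanjing 210014; [2] Research Center of Computer and Language Information Engineering, Chinese Academy of Sciences, Beijing 100083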

Source: Computer Science (《计算机科学》), 2003, No. 3, pp. 44-45, 53 (3 pages)

Abstract: This paper presents a text classification algorithm that imitates how humans read. First, when the feature vector is processed, the algorithm increases the weight of theme terms, on the assumption that the title of a document reflects its content. Second, a weight parameter vector is introduced into the calculation of the document cluster center to simulate human skimming and skipping behavior, and the weight of features that occur in more positive examples than negative ones is increased. The experiment shows that the algorithm greatly improves the performance of a text classification system.
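The abstract gives no formulas, so the following is only a minimal sketch of the three ideas it names, not the authors' implementation. Every name and constant here is an assumption made for illustration: `title_boost` stands in for the extra weight given to title terms, `skip_weights` is a hypothetical per-term multiplier standing in for the skimming and skipping weight vector, `pos_boost` is a hypothetical factor for features seen in more positive than negative examples, and cosine similarity to the class center is an assumed decision rule.

```python
import math
from collections import Counter


def doc_vector(title, body, title_boost=3.0):
    """Term-frequency vector; each title term adds title_boost (assumed value)."""
    vec = Counter()
    for term in body.lower().split():
        vec[term] += 1.0
    for term in title.lower().split():
        vec[term] += title_boost
    return dict(vec)


def class_center(pos_vecs, neg_vecs, skip_weights=None, pos_boost=1.5):
    """Mean of the positive vectors, rescaled per term.

    skip_weights (hypothetical): term -> multiplier, where 0.0 means the term
    is skipped and values below 1.0 mean it is only skimmed.
    pos_boost (hypothetical): applied to terms that occur in more positive
    documents than negative documents.
    """
    skip_weights = skip_weights or {}
    center = {}
    for vec in pos_vecs:
        for term, value in vec.items():
            center[term] = center.get(term, 0.0) + value / len(pos_vecs)
    for term in center:
        pos_df = sum(term in v for v in pos_vecs)
        neg_df = sum(term in v for v in neg_vecs)
        if pos_df > neg_df:
            center[term] *= pos_boost
        center[term] *= skip_weights.get(term, 1.0)
    return center


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(value * b.get(term, 0.0) for term, value in a.items())
    norm_a = math.sqrt(sum(x * x for x in a.values()))
    norm_b = math.sqrt(sum(x * x for x in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(vec, centers):
    """Return the label whose center is most similar to vec."""
    return max(centers, key=lambda label: cosine(vec, centers[label]))
```

A classifier built this way would compute one center per category, treating that category's training documents as positive examples and all others as negative, then assign a new document to the category returned by `classify`.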

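Keywords: automatic text categorization algorithm; text information processing; document classification; natural language processing; Internet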

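Classification codes: TP391 [Automation and Computer Technology: Computer Application Technology]; TP393.4 [Automation and Computer Technology: Computer Science and Technology]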

 
