An improved TF-IDF approach for text classification  (Cited by: 5)

Authors: Zhang Yuntao (张云涛), Gong Ling (龚玲), Wang Yongcheng (王永成)

Affiliations: [1] Network & Information Center School of Electronic & Information Technology, Shanghai Jiaotong University, Shanghai 200030, China; [2] School of Electronic & Information Technology, Shanghai Jiaotong University, Shanghai 200030, China

Source: Journal of Zhejiang University-Science A (Applied Physics & Engineering), 2005, Issue 1, pp. 49-55 (7 pages)

Funding: Project (No. 60082003) supported by the National Natural Science Foundation of China

Abstract: This paper presents an improved term frequency/inverse document frequency (TF-IDF) approach that uses confidence, support and characteristic words to enhance the recall and precision of text classification. Synonyms defined by a lexicon are processed in the improved TF-IDF approach. We discuss and analyze in detail the relationship among confidence, recall and precision. Experiments in the science and technology domain gave promising results, showing that the new TF-IDF approach improves the precision and recall of text classification compared with the conventional TF-IDF approach.
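
For orientation, the quantities named in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' method: the abstract does not give the paper's formulas, so the code uses the textbook TF-IDF weight tf(t, d) * log(N / df(t)) as the "conventional" baseline and assumes association-rule-style definitions of support and confidence for a rule "term -> category". The function names and the toy corpus are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Textbook TF-IDF: relative term frequency times log(N / document frequency)."""
    n = len(docs)
    # Document frequency: number of documents in which each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append(
            {t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return weights


def support_confidence(docs, labels, term, category):
    """Assumed association-rule definitions for the rule `term -> category`.

    support    = share of all documents that contain `term` and have `category`
    confidence = share of documents containing `term` that have `category`
    """
    with_term = [lab for doc, lab in zip(docs, labels) if term in doc]
    both = sum(1 for lab in with_term if lab == category)
    support = both / len(docs) if docs else 0.0
    confidence = both / len(with_term) if with_term else 0.0
    return support, confidence


def precision_recall(predicted, actual, category):
    """Standard per-category precision and recall."""
    tp = sum(p == category and a == category for p, a in zip(predicted, actual))
    predicted_pos = sum(p == category for p in predicted)
    actual_pos = sum(a == category for a in actual)
    precision = tp / predicted_pos if predicted_pos else 0.0
    recall = tp / actual_pos if actual_pos else 0.0
    return precision, recall


# Hypothetical toy corpus: tokenized documents with one category label each.
docs = [["laser", "optics", "laser"], ["laser", "market"], ["market", "stock"]]
labels = ["physics", "physics", "finance"]

weights = tf_idf(docs)
laser_rule = support_confidence(docs, labels, "laser", "physics")
market_rule = support_confidence(docs, labels, "market", "finance")
scores = precision_recall(["physics", "physics", "physics"], labels, "physics")
```

In this toy corpus every document containing "laser" is labeled "physics", so that rule has high confidence, whereas "market" is split between the two categories. A word with high confidence for one category is the kind of term one might treat as characteristic of it, though how the paper combines these quantities with TF-IDF is not stated in the abstract.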

Keywords: Term frequency/inverse document frequency (TF-IDF); Text classification; Confidence; Support; Characteristic words

Classification code: TP391.1 [Automation and Computer Technology - Computer Application Technology]

 
