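Affiliations: [1] Shanghai Publishing and Printing College, Shanghai 200093; [2] College of Information Engineering, Shanghai Maritime University, Shanghai 201306
Source: Computer and Modernization (《计算机与现代化》), 2012, No. 7, pp. 120-123 (4 pages)
Funding: Shanghai Maritime University Research Fund project (20100091)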
Abstract: As e-mail classification technology develops, organizing and managing mail more effectively requires extracting dynamic features from a continually changing stream of messages and classifying the messages according to those features. This paper approaches the problem from the dynamic features of e-mail: a mail client program is written, the ICTCLAS segmentation tool from the Chinese Academy of Sciences is used to segment Chinese e-mail accurately, an improved TF-IDF algorithm is used to compute feature weights, and the WEKA mining tool is used to run simulation experiments. The experimental results show that classifying e-mail by its dynamic features is feasible and, to a certain extent, yields a reasonable and effective classification.
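The abstract does not say how the TF-IDF algorithm was improved, so the sketch below shows only the textbook TF-IDF weighting that such an improvement would start from. It assumes each message has already been segmented into tokens (for example by ICTCLAS). The function name `tf_idf` and the sample tokens are hypothetical and do not come from the paper.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Textbook TF-IDF (not the paper's improved variant).

    docs: list of token lists, one per already-segmented message.
    Returns one {term: weight} dict per message.
    """
    n_docs = len(docs)

    # Document frequency: number of messages containing each term.
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))

    weights = []
    for tokens in docs:
        counts = Counter(tokens)
        total = len(tokens)
        # tf = relative frequency in the message; idf = log(N / df).
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical, already-segmented messages.
messages = [
    ["meeting", "schedule", "meeting"],
    ["invoice", "payment"],
    ["meeting", "invoice"],
]
for message_weights in tf_idf(messages):
    print(message_weights)
```

With this weighting, a term that occurs in every message gets weight 0, since log(N / N) = 0. The resulting weight vectors are the kind of feature values that could then be exported to a tool such as WEKA for classification.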