Affiliation: [1] School of Information Science and Engineering, Shandong Normal University, Jinan 250014, China
Source: Computer Science (《计算机科学》), 2009, No. 12, pp. 227-230, 256 (5 pages)
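Funding: Supported by the National Natural Science Foundation of China (60873247), the Natural Science Foundation of Shandong Province (Y2006G20), and the Shandong Province High-Tech Independent Innovation Special Project (2008ZZ28)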
Abstract: This paper introduces a paragraph-level matching method centered on logical concepts. Built on a concept dictionary, the method analyzes the logical concepts contained in the text to be classified and clusters the paragraphs that express the same meaning, which yields a logical hierarchy. Logical paragraphs are then defined on the basis of this hierarchy and used to measure how much each paragraph contributes to representing the topic of the text. To handle polysemy and synonymy during matching, the method also applies synonym-concept expansion and related-word expansion. Experimental results show that the method achieves higher accuracy in content filtering and improves classification performance.
Classification code: TP301 [Automation and Computer Technology: Computer System Architecture]