Affiliation: [1] School of Computer Engineering, Huaiyin Institute of Technology (淮阴工学院), Huai'an, Jiangsu 233003
Source: Computer Engineering (《计算机工程》), 2010, No. 22, pp. 81-82, 85 (3 pages)
Abstract: Based on the characteristics of short message text, a density-based clustering method for Chinese short messages is presented. The method partitions high-density regions of the text data into clusters. It builds a seed queue, sorted in ascending order of reachable similarity, to hold the messages awaiting expansion, and selects objects that are reachable at a high similarity threshold; that is, it quickly locates text objects in dense regions so that higher-density clusters are completed first. Experimental results show that the method is roughly 10 times more efficient than K-means.
Classification No.: TP311 [Automation and Computer Technology: Computer Software and Theory]
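The seed-queue idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the similarity measure, the thresholds, or the queue details, so the character-bigram Jaccard similarity, the parameters `eps` and `min_pts`, and all function names below are assumptions made for illustration. The sketch expands a cluster by always taking the pending message with the highest similarity first, which is one plausible reading of "higher-density clusters are completed first".

```python
import heapq


def bigrams(text):
    """Character bigrams: an assumed stand-in feature set for short Chinese text."""
    return {text[i:i + 2] for i in range(len(text) - 1)} or {text}


def jaccard(a, b):
    """Jaccard similarity between two feature sets (assumed measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_messages(messages, eps=0.3, min_pts=2):
    """Density-based clustering sketch.

    eps     -- hypothetical similarity threshold for being "reachable"
    min_pts -- hypothetical minimum neighbourhood size (including the point itself)
    Returns one label per message: a cluster id >= 0, or -1 for noise.
    """
    feats = [bigrams(m) for m in messages]
    n = len(messages)
    labels = [None] * n  # None = unvisited

    def neighbours(i):
        out = []
        for j in range(n):
            if j != i:
                s = jaccard(feats[i], feats[j])
                if s >= eps:
                    out.append((s, j))
        return out

    cluster_id = 0
    for start in range(n):
        if labels[start] is not None:
            continue
        nbrs = neighbours(start)
        if len(nbrs) + 1 < min_pts:
            labels[start] = -1  # provisionally noise; may become a border point
            continue
        labels[start] = cluster_id
        # Seed queue: heapq is a min-heap, so similarities are negated
        # to pop the most similar pending message first.
        seeds = [(-s, j) for s, j in nbrs]
        heapq.heapify(seeds)
        while seeds:
            _, j = heapq.heappop(seeds)
            if labels[j] == -1:
                labels[j] = cluster_id  # former noise becomes a border point
                continue
            if labels[j] is not None:
                continue  # already assigned to a cluster
            labels[j] = cluster_id
            nbrs_j = neighbours(j)
            if len(nbrs_j) + 1 >= min_pts:  # j is a core point: keep expanding
                for s, k in nbrs_j:
                    if labels[k] is None or labels[k] == -1:
                        heapq.heappush(seeds, (-s, k))
        cluster_id += 1
    return labels
```

Calling `cluster_messages(["今晚一起吃饭吗", "今晚一起吃饭吧", "今晚一起吃饭", "您的验证码是1234", "您的验证码是5678", "天气不错"])` groups the three dinner messages together, the two verification-code messages together, and marks the last message as noise. The pairwise similarity scan here is quadratic and is kept only for clarity; it says nothing about the efficiency reported in the paper.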