Authors: ZHAO Yi (赵一)[1]; LI Zhao (李昭)[2]; CHEN Peng (陈鹏)[2]; HE Jing-sha (何泾沙); HE Ke-qing (何克清)[1]
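Affiliations: [1] School of Computer Science, Wuhan University, Wuhan 430072, China; [2] College of Computer and Information, China Three Gorges University, Yichang, Hubei 443002, China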
Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), 2019, No. 1, pp. 81-88 (8 pages)
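Funding: National Key Research and Development Program of China (2016YFC0802500; 2016YFB0800403); National Natural Science Foundation of China (61562073); China Three Gorges University Talent Special Fund (8000303)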
Abstract: Most Web services currently published on the Internet are described in natural language, and such unstructured descriptions make automatic analysis and processing by machines very difficult. How to improve the efficiency and precision of service discovery has become a hot topic in service computing. Service clustering is an important supporting technology for service discovery: clustering and organizing semantically similar services helps improve discovery results. Current service clustering techniques mainly apply models such as LDA (Latent Dirichlet Allocation) and K-means within a single domain, and these methods still have limitations. For example, they do not fully exploit the semantic relations between words to reduce dimensionality, so service discovery results are not ideal. To address this problem, this paper first uses a neural network model (word2vec) to obtain a synonym table from service descriptions and to generate a domain feature word set, reducing the dimensionality of the service feature vectors as far as possible, and uses a decision tree classifier to classify service domains. On this basis, an S-LDA (Semantic Latent Dirichlet Allocation) model is proposed to cluster services within the same domain, yielding a domain-oriented Web service clustering framework, Domain Semantic aided Web Service Clustering (DSWSC). Experiments on a service data set published on the ProgrammableWeb website show that the approach outperforms methods such as LDA and K-means in entropy, clustering purity, and F-measure, which helps improve the accuracy of service search.
Keywords: semantic latent Dirichlet allocation; word2vec; Web service clustering
Classification: TP311 [Automation and Computer Technology: Computer Software and Theory]
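The abstract's idea of using a synonym table to shrink the service feature vectors can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `SYNONYMS` table, the sample descriptions, and the helper names `bag_of_words` and `vocabulary` are all hypothetical, and the table is written by hand here, whereas the paper derives it with word2vec.

```python
from collections import Counter

# Hypothetical synonym table mapping each word to a canonical representative.
# In the paper this table is obtained with word2vec; here it is hand-written.
SYNONYMS = {
    "photo": "image",
    "picture": "image",
}

def bag_of_words(text, table=None):
    """Count tokens in a description, optionally merging synonyms first."""
    tokens = text.lower().split()
    if table is not None:
        # Words without an entry in the table are kept unchanged.
        tokens = [table.get(t, t) for t in tokens]
    return Counter(tokens)

def vocabulary(descriptions, table=None):
    """Return the sorted set of distinct feature words over all descriptions."""
    vocab = set()
    for d in descriptions:
        vocab.update(bag_of_words(d, table))
    return sorted(vocab)

# Hypothetical service descriptions, for illustration only.
descriptions = [
    "photo sharing service",
    "picture hosting service",
    "image search api",
]

raw_vocab = vocabulary(descriptions)               # synonyms kept apart
merged_vocab = vocabulary(descriptions, SYNONYMS)  # synonyms merged
```

With the synonyms merged, the three descriptions share the feature word `image`, so the feature vectors passed to a topic model such as LDA have fewer dimensions and overlap more between semantically similar services.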