Affiliation: [1] Institute of Scientific and Technical Information, Shandong University of Technology, Zibo 255049, China
Source: Library and Information Service (《图书情报工作》), 2015, No. 14, pp. 126-134 (9 pages)
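Funding: One of the research outputs of the Ministry of Education Humanities and Social Sciences Youth Fund project "Research on Information Query Expansion in Long-Sentence Retrieval" (No. 12YJC870001) and the Ministry of Culture Science and Technology Innovation project "Research on Parallel Processing and Automatic Classification of Large-Scale Academic Literature".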
Abstract: [Purpose/significance] Traditional probabilistic methods for identifying science and technology innovation themes ignore the semantics of the text. Identifying themes more accurately therefore calls for semantic recognition of innovation themes. [Method/process] This article proposes an LDA-based method for the semantic recognition of science and technology innovation themes. It uses semantic role labeling to semantically index the innovation content of scientific literature, builds an LDA topic semantic recognition model, and identifies innovation themes from the probabilities of the hypernyms that correspond to the semantic roles of the keywords characterizing the innovation content. [Result/conclusion] Experiments on data from the 3D printing field show that the method identifies innovation themes more accurately and forms mixed-distribution clusters of innovation theme, topic word, and scientific document. It reduces interference from irrelevant data such as research background, and it avoids double counting of innovation topic words that share the same meaning.
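The abstract's last point, that mapping keywords to hypernyms keeps synonymous topic words from being counted twice, can be illustrated with a minimal sketch. This is not the authors' LDA model: it replaces topic inference with plain relative frequencies, and the `HYPERNYMS` table, its entries, and the function name `theme_distribution` are all hypothetical, chosen only to show the merging step.

```python
from collections import Counter

# Hypothetical keyword -> hypernym table (an assumption for illustration;
# the paper derives hypernyms from semantic role labeling, not a fixed dict).
HYPERNYMS = {
    "FDM": "additive manufacturing process",
    "fused deposition modeling": "additive manufacturing process",
    "selective laser sintering": "additive manufacturing process",
    "PLA filament": "printing material",
    "ABS resin": "printing material",
}


def theme_distribution(keywords):
    """Return the relative frequency of each theme after hypernym merging.

    Keywords missing from HYPERNYMS are kept as their own theme.
    This is a frequency count, not LDA inference.
    """
    counts = Counter(HYPERNYMS.get(k, k) for k in keywords)
    total = sum(counts.values())
    return {theme: n / total for theme, n in counts.items()}


if __name__ == "__main__":
    sample = ["FDM", "fused deposition modeling", "PLA filament", "FDM"]
    print(theme_distribution(sample))
```

Here "FDM" and "fused deposition modeling" are counted as a single theme rather than as two competing topic words, which is the effect the abstract attributes to hypernym-based identification.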