Affiliations: [1] East China University of Science and Technology, Shanghai 200237 [2] Shihezi University, Shihezi, Xinjiang 832003
Source: 《情报理论与实践》 (Information Studies: Theory & Application), 2018, No. 9, pp. 123-129, 160 (8 pages)
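Funding: Supported by the National Natural Science Foundation of China project "Research on Information Intention Detection, Modeling, and Group Intention Reasoning Techniques for Event Analysis" (Grant No. 61462073) and the Science and Technology Commission of Shanghai Municipality project "Knowledge-Base-Driven Data Search Engine Technology" (Grant No. 17DZ1101003)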
Abstract: [Purpose/significance] Patent keywords summarize the core content of a patent. Extracting them efficiently and accurately helps users find patents quickly and supports patent classification, clustering, retrieval, and translation. [Method/process] This paper proposes a new approach to keyword extraction based on the idea that "the keywords are in the key sentences". It first builds a ranking model for patent abstract sentences that combines sentence-network semantic graph features with heuristic rule features, then lets only the Top-Ks% of sentences take part in keyword calculation. Sentence semantic weights are also introduced into the keyword weight calculation, so that the importance of a sentence is passed on to the words it contains. [Result/conclusion] Experiments on a real Chinese patent dataset show that selecting an appropriate proportion of key sentences for keyword extraction improves the F-score by roughly 6% to 13% over traditional keyword extraction algorithms, effectively reducing noise in the original document and improving extraction performance.
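The pipeline outlined in the abstract (rank sentences, keep the top Ks%, pass sentence weights down to words) might be sketched roughly as follows. This is an illustrative sketch only, not the authors' model: the Jaccard-overlap graph stands in for the paper's sentence-network semantic graph features, the opening-sentence bonus is a made-up example of a heuristic rule, and the function names and the `ks`, `lead_bonus`, and `top_n` parameters are all hypothetical. Input sentences are assumed to be already segmented into tokens.

```python
import math
from collections import Counter


def jaccard(a, b):
    """Token-set overlap of two sentences (assumed stand-in for semantic similarity)."""
    sa, sb = set(a), set(b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def rank_sentences(sentences, lead_bonus=0.1):
    """Score each tokenized sentence: graph centrality plus a position heuristic."""
    n = len(sentences)
    scores = []
    for i, sent in enumerate(sentences):
        # Degree centrality: average similarity to every other sentence.
        sim = sum(jaccard(sent, other) for j, other in enumerate(sentences) if j != i)
        centrality = sim / (n - 1) if n > 1 else 0.0
        # Hypothetical heuristic rule: the opening sentence gets a small bonus.
        scores.append(centrality + (lead_bonus if i == 0 else 0.0))
    return scores


def extract_keywords(sentences, ks=0.5, top_n=5):
    """Keep the top `ks` fraction of sentences and weight words by sentence score."""
    scores = rank_sentences(sentences)
    keep = max(1, math.ceil(ks * len(sentences)))
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:keep]
    weights = Counter()
    for i in top:
        # Each occurrence of a word inherits the weight of its sentence.
        for token, count in Counter(sentences[i]).items():
            weights[token] += scores[i] * count
    return [token for token, _ in weights.most_common(top_n)]
```

Calling `extract_keywords(sentences, ks=0.6)` on a list of token lists returns candidate keywords drawn only from the highest-ranked sentences, so words that appear solely in low-ranked (noisy) sentences never enter the ranking.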
Classification No.: TP391.1 [Automation and Computer Technology: Computer Application Technology]