Authors: Hu Yamin, Wu Xiaoyan[1], Liao Xingbin, Qian Yangge, Chen Fang[1,2] (Chengdu Library and Information Center, Chinese Academy of Sciences, Chengdu 610299; Department of Library, Information and Archives Management, School of Economics and Management, University of Chinese Academy of Sciences, Beijing 100190; Chengdu Information Technology of Chinese Academy of Sciences Co., Ltd., Chengdu 610299)
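Affiliations: [1] Chengdu Library and Information Center, Chinese Academy of Sciences, Chengdu 610299; [2] Department of Library, Information and Archives Management, School of Economics and Management, University of Chinese Academy of Sciences, Beijing 100190; [3] Chengdu Institute of Computer Applications, Chinese Academy of Sciences, Chengdu 610299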
Source: Library and Information Service (《图书情报工作》), 2022, No. 24, pp. 92-103 (12 pages)
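Funding: One of the research outputs of the 2021 Innovation Fund Youth Project of the Chengdu Library and Information Center, Chinese Academy of Sciences, "Research on a Knowledge-Gene-Based Framework for Analyzing Domain Innovation Paths" (Grant No. E1Z0000202).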
Abstract: [Purpose/Significance] Identifying technical entities and predicting technology at a finer granularity from patent texts helps capture the layout and trends of patent technology in more detail. [Method/Process] First, deep learning methods were used to automatically identify technical-term entities in patents, and several groups of deep learning algorithms were compared experimentally. Second, new semi-supervised labeling and custom labeling schemes were proposed to improve the efficiency of manual labeling. Finally, the best model obtained from training was applied and combined with a link prediction method to make fine-grained technology predictions for synthetic biotechnology. [Result/Conclusion] The empirical results show that the RoBERTa-BiLSTM-CRF model is better suited to recognizing semantically complex patent technical entities, reaching an F1 value of 86.8%, and that its recognition results are finer-grained than those of the traditional IPC analysis method. The fine-grained technology predictions indicate that the synthesis methods of synthetic biology are continually being improved and innovated, and that research on synthesized products is moving toward synthetic fuels.
Classification: TP391.1 [Automation and Computer Technology: Computer Application Technology]