Key Information Extraction Model of Unstructured Text in Knowledge Database  Cited by: 5

Authors: GUO Wei-jie [1], BAO Xiao-an [1]

Affiliation: [1] Zhejiang Sci-Tech University, Hangzhou, Zhejiang 310018, China

Source: Computer Simulation (《计算机仿真》), 2021, No. 9, pp. 357-360, 394 (5 pages)

Funding: Key Research and Development Program of Zhejiang Province (2020C03094).

Abstract: Traditional models for extracting key information from text suffer from poor extraction quality and long extraction times. To address this, the paper designs a key information extraction model for unstructured text in a knowledge database. First, a six-tuple is used to optimize the hidden Markov model (HMM), which yields the occurrence probability of the model and smooths incomplete training samples. Second, initialization and termination operations are applied to the observation sequences emitted at different times to obtain the optimal state sequence. Third, the observation sequence is decoded in both forward and reverse order, the two decoded sequences are compared, and states without decoding ambiguity are filtered out, which completes ambiguity elimination. Finally, the key information to be extracted is determined from the resulting maximum-probability state sequence, completing the design of the extraction model. Experimental results show that the proposed model extracts key information from unstructured text more effectively and in less time.
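The decoding steps named in the abstract (smoothing, initialization, termination, forward versus reverse decoding) can be illustrated with a minimal sketch. The abstract does not define the six-tuple or the smoothing scheme, so the code below assumes a standard HMM (start, transition, and emission probabilities) with add-one smoothing and Viterbi decoding. It also assumes that "reverse-order decoding" means decoding the reversed observation sequence with a model trained on reversed sequences. The function names, the two labels `O` (other) and `K` (key information), and the toy data are all hypothetical and are not taken from the paper.

```python
import math

def estimate_hmm(tagged_sequences, states, vocab, alpha=1.0):
    """Estimate log-probabilities from (word, state) sequences.

    Add-alpha smoothing (an assumption) keeps unseen events from
    receiving zero probability when training samples are incomplete.
    """
    start = {s: alpha for s in states}
    trans = {s: {t: alpha for t in states} for s in states}
    emit = {s: {w: alpha for w in vocab} for s in states}
    for seq in tagged_sequences:
        prev = None
        for word, state in seq:
            if prev is None:
                start[state] += 1
            else:
                trans[prev][state] += 1
            emit[state][word] += 1
            prev = state

    def normalize(counts):
        total = sum(counts.values())
        return {k: math.log(v / total) for k, v in counts.items()}

    return (normalize(start),
            {s: normalize(trans[s]) for s in states},
            {s: normalize(emit[s]) for s in states})

def viterbi(obs, states, start, trans, emit):
    """Return the maximum-probability state sequence for obs."""
    # Initialization: score of each state at the first observation.
    delta = [{s: start[s] + emit[s][obs[0]] for s in states}]
    back = []
    # Recursion: best predecessor for each state at each later step.
    for o in obs[1:]:
        prev = delta[-1]
        col, ptr = {}, {}
        for s in states:
            best = max(states, key=lambda r: prev[r] + trans[r][s])
            col[s] = prev[best] + trans[best][s] + emit[s][o]
            ptr[s] = best
        delta.append(col)
        back.append(ptr)
    # Termination: pick the best final state, then backtrack.
    path = [max(states, key=lambda s: delta[-1][s])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    path.reverse()
    return path

def compare_decodings(obs, states, fwd_model, rev_model):
    """Decode in forward and reverse order; list positions that disagree."""
    fwd = viterbi(obs, states, *fwd_model)
    rev = viterbi(obs[::-1], states, *rev_model)[::-1]
    ambiguous = [i for i, (a, b) in enumerate(zip(fwd, rev)) if a != b]
    return fwd, rev, ambiguous

# Toy data (hypothetical): "K" marks key information, "O" everything else.
states = ["O", "K"]
vocab = ["a", "b"]
train = [[("a", "O"), ("a", "O"), ("b", "K")]] * 3
fwd_model = estimate_hmm(train, states, vocab)
rev_model = estimate_hmm([seq[::-1] for seq in train], states, vocab)
fwd, rev, amb = compare_decodings(["a", "a", "b"], states, fwd_model, rev_model)
print(fwd, rev, amb)
```

Positions where the two decodings agree are treated here as unambiguous, and only the positions listed in `amb` would need further disambiguation. The sketch assumes a closed vocabulary, so an observation outside `vocab` raises a `KeyError`.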

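Keywords: knowledge database; unstructured; text key information; information extraction; hidden Markov model; maximum-probability state sequence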

Classification: TP311 [Automation and Computer Technology: Computer Software and Theory]

 
