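Affiliation: [1] Department of Information Engineering, Engineering University of the Chinese People's Armed Police Force, Xi'an 710086, Shaanxi, China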
Source: Computer Applications and Software (《计算机应用与软件》), 2015, No. 11, pp. 281-284, 292 (5 pages)
Abstract: Traditional text information extraction algorithms are usually built on dictionaries, rules, or other models. Because dictionaries are hard to construct, rules are vaguely specified, or the model structure is too simple, their extraction accuracy is typically low. To address these shortcomings, this paper proposes a text information extraction algorithm based on a hybrid model. The algorithm combines several information extraction methods, introduces a support vector machine (SVM) to classify the information, adjusts the model parameters by fitting a sigmoid (S-shaped) function, and optimizes the model's probability space with a data smoothing technique. Experimental results show that, compared with traditional text information extraction algorithms, the proposed algorithm markedly improves the precision and recall of information extraction and is feasible in practice.
Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]
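
The abstract names two numerical steps without giving details: fitting a sigmoid function to adjust model parameters, and smoothing the model's probability space. The sketch below illustrates one common reading of each, and it is not the paper's implementation. It assumes the sigmoid is fitted to raw classifier scores in the style of Platt scaling, and that the smoothing is simple additive (Laplace) smoothing; the abstract confirms neither choice. The function names, the learning rate, and the toy data are all hypothetical.

```python
import math


def sigmoid_probability(score, a, b):
    """Map a raw classifier score to a probability: 1 / (1 + exp(a*score + b))."""
    return 1.0 / (1.0 + math.exp(a * score + b))


def fit_sigmoid(scores, labels, lr=0.1, epochs=2000):
    """Fit a and b by gradient descent on the log loss (labels are 0 or 1).

    Assumption: the scores stand in for SVM decision values; the paper's
    actual fitting procedure is not described in the abstract.
    """
    a, b = -1.0, 0.0
    n = len(scores)
    for _ in range(epochs):
        grad_a = 0.0
        grad_b = 0.0
        for s, y in zip(scores, labels):
            p = sigmoid_probability(s, a, b)
            # For this parameterization, d(loss)/d(a*s + b) = y - p.
            grad_a += (y - p) * s
            grad_b += (y - p)
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


def additive_smoothing(counts, vocabulary, alpha=1.0):
    """Return smoothed probabilities so that unseen items get non-zero mass.

    Assumption: additive smoothing is used here only as an example of a
    data smoothing technique.
    """
    total = sum(counts.get(w, 0) for w in vocabulary) + alpha * len(vocabulary)
    return {w: (counts.get(w, 0) + alpha) / total for w in vocabulary}


# Toy, non-separable data: higher scores tend to belong to class 1.
toy_scores = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
toy_labels = [0, 0, 1, 0, 1, 1]
a, b = fit_sigmoid(toy_scores, toy_labels)

# Toy counts: "c" was never observed but still receives probability mass.
probs = additive_smoothing({"a": 3, "b": 1}, ["a", "b", "c"])
```

With these toy inputs, the fitted sigmoid assigns a probability above 0.5 to the highest score and below 0.5 to the lowest, and the smoothed distribution still sums to 1 while giving the unseen item "c" a small non-zero probability.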