Structured Approach for Pathological Microscopy Text (病理镜检文本数据的结构化处理方法)

Cited by: 2

Authors: Chen Dehua (陈德华)[1], Liu Qianqian (刘茜茜)[1], Le Jiajin (乐嘉锦)[1], Pan Qiao (潘乔)[1], Zhu Lifeng (朱立峰)[2]

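Affiliations: [1] School of Computer Science and Technology, Donghua University, Shanghai 201620; [2] Computer Center, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai 201620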

Source: Computer and Modernization (《计算机与现代化》), 2016, No. 4, pp. 1-6 (6 pages)

Funding: Science and Technology Innovation Action Plan of the Shanghai Science and Technology Commission (15511106900)

Abstract: Current approaches to structuring medical text data mostly depend on general-purpose word segmentation tools or medical knowledge bases. General-purpose segmenters recognize specialist terminology poorly, however, and the standardization of Chinese medical terminology in China is still immature. To address these problems, this paper proposes a method that structures pathological microscopy text using statistical information. Starting from clustered text, the method segments the text using breakpoint words and coincident (repeated) strings, uses the statistics of the resulting word strings to obtain keywords and word category information, and then expands the vocabulary to produce a final lexicon that serves as the dictionary. A dictionary-based bidirectional maximum matching algorithm then segments the text, and added negation detection rules yield the structured data. Experiments show that the medical lexicon obtained by this method reaches an accuracy of 80%, and that structured data can be obtained without relying on a segmentation tool.
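The final stage described in the abstract (dictionary-based bidirectional maximum matching followed by negation detection) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the tiny `TERMS` lexicon, the `NEGATION_CUES` and `CLAUSE_BREAKS` sets, the tie-breaking rule (fewer tokens, then fewer single-character tokens, otherwise the backward result), and the rule that a negation cue applies until the next clause break.

```python
# Illustrative sketch only. The lexicon, cue words, and tie-breaking
# rules are hypothetical and are not the paper's actual resources.

TERMS = {"癌细胞", "炎细胞浸润"}            # hypothetical medical lexicon
NEGATION_CUES = {"未见"}                    # hypothetical negation cue words
CLAUSE_BREAKS = {"，", "。", "；"}           # punctuation that ends a negation scope
DICTIONARY = TERMS | NEGATION_CUES | {"可见"}


def forward_max_match(text, dictionary, max_len):
    """Greedy longest-match segmentation, scanning left to right."""
    tokens, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # Fall back to a single character when nothing matches.
            if size == 1 or piece in dictionary:
                tokens.append(piece)
                i += size
                break
    return tokens


def backward_max_match(text, dictionary, max_len):
    """Greedy longest-match segmentation, scanning right to left."""
    tokens, j = [], len(text)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            piece = text[j - size:j]
            if size == 1 or piece in dictionary:
                tokens.insert(0, piece)
                j -= size
                break
    return tokens


def bidirectional_max_match(text, dictionary):
    """Run both scans and keep the result with fewer (then fewer 1-char) tokens."""
    max_len = max(map(len, dictionary), default=1)
    fwd = forward_max_match(text, dictionary, max_len)
    bwd = backward_max_match(text, dictionary, max_len)
    if len(fwd) != len(bwd):
        return fwd if len(fwd) < len(bwd) else bwd

    def single_chars(tokens):
        return sum(len(t) == 1 for t in tokens)

    # On a full tie, prefer the backward result (an assumed convention).
    return fwd if single_chars(fwd) < single_chars(bwd) else bwd


def extract_findings(tokens):
    """Map each lexicon term to True (present) or False (negated)."""
    findings, negated = {}, False
    for tok in tokens:
        if tok in NEGATION_CUES:
            negated = True          # negation holds until the clause ends
        elif tok in CLAUSE_BREAKS:
            negated = False
        elif tok in TERMS:
            findings[tok] = not negated
    return findings


tokens = bidirectional_max_match("未见癌细胞，可见炎细胞浸润", DICTIONARY)
findings = extract_findings(tokens)
```

For the sample sentence ("no cancer cells seen; inflammatory cell infiltration visible"), `findings` marks 癌细胞 as negated and 炎细胞浸润 as present. In the method the abstract describes, the dictionary would come from the statistically derived lexicon rather than a hand-written set.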

Keywords: medical text data; text data structuring; statistical word segmentation; bidirectional maximum matching

Classification: TP391.1 [Automation and Computer Technology: Computer Application Technology]
