BioTSA: Annotating Token Semantic Association to Support Biomedical Text Mining  (Cited by: 2)

Authors: WEI Xiaomei, HUANG Sixing, CHEN Bo, JI Donghong

Affiliations: [1] School of Computer, Wuhan University; [2] College of Informatics, Huazhong Agriculture University

Source: Wuhan University Journal of Natural Sciences (武汉大学学报(自然科学英文版)), 2015, Issue 2, pp. 134-140 (7 pages)

Funding: Supported by the National Natural Science Foundation of China (61202304, 61173095, 61173062, 61202193)

Abstract: Corpora are an important resource for knowledge acquisition in natural language processing (NLP). However, in the biomedical domain, comparatively few corpora focus on the semantic associations among all tokens in a sentence. We propose an annotation scheme based on feature structure theory for enriching biomedical domain corpora with token semantic association (TSA). We annotated 227 documents of the BioNLP GE ST training data to form the TSA corpus, in which each annotated item represents a token semantic association expressed as a triple. The annotation of token semantic association has the potential to significantly advance biomedical text mining by providing rich token semantic information for NLP systems, especially sophisticated IE systems such as bio-event extraction.

Keywords: annotation; token semantic association; feature structure; triple

Classification: TP391.1 [Automation and Computer Technology: Computer Application Technology]

 
