Development and Application of Chinese Language Knowledge Resources over the Past 30 Years (近30年来中文语言知识资源发展及应用)  Cited by: 6

An Overview of the Advances and Applications of Online Chinese Language Resources over Three Decades

Author: Zhan Weidong (詹卫东) [1]

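Affiliation: [1] Department of Chinese Language and Literature / Center for Chinese Linguistics / Key Laboratory of Computational Linguistics (Ministry of Education), Peking University, Beijing 100871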

Source: Chinese Journal of Language Policy and Planning (《语言战略研究》), 2018, No. 4, pp. 58-69 (12 pages)

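Funding: Supported by the National Key Basic Research Program of China (2014CB340504) and by Major Projects of the Key Research Bases in Humanities and Social Sciences, Ministry of Education (13JJD740001; 15JJD740002)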

Abstract: Using Internet search engines, this paper surveys the current state of Chinese language knowledge resources, including corpora, knowledge bases, and their associated retrieval systems, in mainland China, Hong Kong, and Taiwan, as well as in North America, Europe, and elsewhere. Thanks to the rapid development over the past 30 years of the empiricist research paradigm in natural language processing and other areas of applied language research, the language knowledge resources available for Chinese have grown to a considerable scale. The paper discusses, from four perspectives, the value of these resources for Chinese language research and teaching, and briefly analyzes the challenges facing resource construction and their possible impact on the future development of Chinese linguistics. It argues that the ideal path for Chinese linguistic research is to combine the research paradigm based on rational introspection with empirical analysis based on large volumes of authentic language data, rather than to set the two in opposition.

In the past three decades, the empiricist paradigm has prevailed in natural language processing and other language application fields, leading to a boom in online language data resources, including corpora, knowledge bases, and related search engines. For Chinese, numerous corpora, lexicons, and dictionaries, large and small, have been built and opened for search and research purposes, giving great impetus to Chinese language studies. This paper examines the development and application of online Chinese language resources, and discusses their possible impact on linguistics and the challenges for their further development. First, it gives a brief introduction to the background of corpus development. Second, it presents an overview of the Chinese language resources constructed from the 1990s to date. Third, it uses concrete examples to demonstrate the application of online resources in linguistic research and language teaching. Fourth, it discusses the challenges in constructing Chinese language online resources and the difficulties in applying them. In conclusion, it suggests a closer integration of introspection-based theoretical analysis and data-driven statistical analysis to benefit language studies.

Keywords: language knowledge resources; corpus; knowledge base; retrieval system

Classification code: H002 [Language and Script: Linguistics]

 
