A Study of Several Issues in the Co-word Analysis Process  Cited by: 109

Co-word Analysis: Limitations and Solutions


Authors: 李纲 (LI Gang) [1], 巴志超 (BA Zhichao)

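Affiliations: [1] Center for Studies of Information Resources, Wuhan University; [2] School of Information Management, Wuhan University (doctoral student), Wuhan, Hubei 430072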

Source: Journal of Library Science in China (《中国图书馆学报》), 2017, No. 4, pp. 93-113 (21 pages)

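Funding: One of the research outputs of the National Natural Science Foundation of China project "Research on the Dynamic Evolution Patterns of Research Teams" (No. 71273196)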

Abstract: To improve and optimize the co-word analysis method, this paper systematically summarizes the limitations of co-word analysis in four respects: selection of the term source for concept terms, selection of high-frequency words, calculation of term relatedness, and multivariate statistical analysis. On term source selection, it discusses how different types of document analysis units, term normalization, and differences in term representation affect co-word analysis. On high-frequency word selection, it analyzes the shortcomings of existing domestic and international research in setting high-frequency thresholds, accounting for the semantic type characteristics of terms, and handling low-frequency keywords, and it proposes corresponding solutions. On term relatedness calculation, it argues that terms are linked not only by direct frequency co-occurrence but also by indirect semantic relatedness, summarizes existing measures of semantic relatedness between terms, and analyzes their characteristics. On multivariate statistical analysis, it discusses the statistical methods and application strategies commonly used in co-word analysis. By evaluating the strengths and weaknesses of co-word analysis rigorously and objectively, the paper aims to support the continued refinement of the method and to offer theoretical and practical reference for researchers who continue to work on co-word analysis.

Co-word analysis is a content analysis technique based on the assumption that the subject of a paper can be summarized in a limited number of key terms. If two terms co-occur within one paper, the two research topics they represent are related, and a higher co-occurrence frequency indicates a stronger correlation between the term pair. However, co-word analysis still operates on words and is extremely sensitive to the selection of terms; its quality depends on a variety of factors, such as the quality of terms and indexes, the extraction of high-frequency terms, and the adequacy of statistical methods. It is therefore necessary to examine the limitations of co-word analysis at each stage in order to improve and optimize it. The co-word analysis conducted in the present study involved six sequential steps: determination of the problem to analyze, term source selection, high-frequency term extraction, term relevance calculation, multivariate statistical analysis, and visual presentation of results. This paper focuses on those six key issues, analyzing and demonstrating the main problems by induction from and summary of existing research. The results indicate the following conclusions. 1) In term source selection, the biggest problem of early co-word analysis is its sole reliance on keywords and index words, which researchers call the "indexer effect". Keywords are uncontrolled vocabulary, which introduces problems of homonyms and synonyms. Meanwhile, term expression differs among different parts of an analysis unit, and ignoring those differences introduces errors into co-word analysis. To address these problems, the semantic structure of the text and the phenomenon of different quality with different quantity of terms can be considered. 2) Researchers engaged in co-word analysis have never departed from the pattern of using high-frequency terms as the basis for multivariate statistical analysis. The extraction of high-frequency te

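Keywords: co-word analysis; term source selection; term normalization; high-frequency word selection; semantic association; multivariate statistical analysis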

Classification number: G250.7 [Cultural Science: Library Science]
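The core assumption stated in the abstract (two terms that co-occur in a paper are related, and more co-occurrences mean a stronger relation) can be illustrated with a minimal Python sketch. It is not the authors' method: the sample keyword lists are invented, the function names are hypothetical, and the equivalence index E_ij = c_ij^2 / (c_i * c_j) is only one commonly used normalization, chosen here for illustration. The frequency threshold is likewise an arbitrary assumption, which is exactly the kind of choice the abstract flags as problematic.

```python
from collections import Counter
from itertools import combinations


def count_cooccurrence(papers):
    """Count per-term document frequency and pairwise co-occurrence."""
    term_freq = Counter()
    pair_freq = Counter()
    for keywords in papers:
        # Deduplicate within a paper and sort so each pair has one canonical order.
        terms = sorted(set(keywords))
        term_freq.update(terms)
        pair_freq.update(combinations(terms, 2))
    return term_freq, pair_freq


def equivalence_index(term_freq, pair_freq):
    """Normalize raw co-occurrence: E_ij = c_ij^2 / (c_i * c_j)."""
    return {
        (a, b): c * c / (term_freq[a] * term_freq[b])
        for (a, b), c in pair_freq.items()
    }


def select_high_frequency(term_freq, min_count):
    """Keep terms at or above an (assumed, arbitrary) frequency threshold."""
    return {term for term, count in term_freq.items() if count >= min_count}


# Hypothetical keyword lists, one list per paper (illustrative data only).
papers = [
    ["co-word analysis", "bibliometrics"],
    ["co-word analysis", "bibliometrics", "clustering"],
    ["clustering", "semantic similarity"],
]

term_freq, pair_freq = count_cooccurrence(papers)
strength = equivalence_index(term_freq, pair_freq)
high_freq_terms = select_high_frequency(term_freq, min_count=2)
```

With a threshold of 2, "semantic similarity" is dropped even though it co-occurs with "clustering", which mirrors the low-frequency keyword problem the abstract raises. Raw co-occurrence also captures only the direct relation; the indirect semantic relatedness discussed in the abstract would need an additional measure that this sketch does not attempt.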

 
