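Affiliation: [1] School of Computer Science and Technology, Donghua University, Shanghai 201620
Source: Computer Engineering (《计算机工程》), 2014, No. 12, pp. 126-131 (6 pages)
Funding: National Natural Science Foundation of China (60973121)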
Abstract: Many programming resources on the Internet are conceptually related, yet no hyperlinks connect them. If the names of algorithmic knowledge items are extracted from these resources and organized into an expert knowledge-base file, that file can be used to recognize the knowledge points each resource covers, so that resources can be linked to one another by knowledge point. To obtain these names automatically, this paper proposes a method for discovering algorithmic knowledge names based on natural language processing. The method discovers the string patterns of sentences that contain algorithmic knowledge names, uses those patterns to extract strings from programming resources that are likely to contain such names, identifies the segmented words most likely to appear in the names, and derives the names from those words. Experimental results show that, compared with the previous manually compiled collection of algorithmic knowledge names, the method adds 11.2% more algorithmic knowledge points and 13.6% more algorithmic knowledge names.
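Keywords: knowledge discovery; pattern discovery; natural language processing; algorithmic knowledge names; Chinese word segmentation; part-of-speech tagging
Classification: TP391.1 [Automation and Computer Technology - Computer Application Technology]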
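The pipeline summarized in the abstract (pattern matching, candidate extraction, token selection, name derivation) can be illustrated with a minimal Python sketch. It is not the paper's implementation: the cue patterns, the function names, the `min_count` threshold, and the sample texts are all hypothetical, and plain whitespace tokenization on English text stands in for the Chinese word segmentation and part-of-speech tagging that the paper relies on.

```python
import re
from collections import Counter

# Hypothetical cue patterns; the abstract does not list the paper's actual patterns.
PATTERNS = [
    re.compile(r"implementation of (?:the )?([\w\- ]+?) algorithm", re.IGNORECASE),
    re.compile(r"solved (?:with|by) (?:the )?([\w\- ]+?) algorithm", re.IGNORECASE),
]


def extract_candidates(texts):
    """Collect the strings captured by any cue pattern, lowercased."""
    candidates = []
    for text in texts:
        for pattern in PATTERNS:
            for match in pattern.finditer(text):
                candidates.append(match.group(1).strip().lower())
    return candidates


def frequent_tokens(candidates, min_count=2):
    """Return tokens that occur in at least `min_count` candidate strings."""
    # Whitespace splitting is an assumption standing in for word segmentation.
    counts = Counter(tok for cand in candidates for tok in set(cand.split()))
    return {tok for tok, n in counts.items() if n >= min_count}


def discover_names(texts, min_count=2):
    """Keep candidates that contain at least one frequent token."""
    candidates = extract_candidates(texts)
    keep = frequent_tokens(candidates, min_count)
    return sorted({c for c in candidates if any(t in keep for t in c.split())})


# Made-up sample texts, for illustration only.
texts = [
    "An implementation of the binary search algorithm in C.",
    "This problem is solved with the binary indexed tree algorithm.",
    "Implementation of quick sort algorithm for beginners.",
    "It can be solved by the following algorithm.",
]
names = discover_names(texts)
print(names)
```

On this tiny sample the noise candidate "following" is filtered out because none of its tokens recur, but so is "quick sort". A frequency filter of this kind only becomes useful on a corpus large enough for genuine name tokens to repeat.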