Authors: CHEN Yuhang (陈宇航); WANG Shizhou (王世宙); TANG Zhengting (汤正婷); CHEN Liangyu (陈良育)[1]; JIANG Ningkang (姜宁康)[1] (Software Engineering Institute, East China Normal University, Shanghai 200062, China)
Source: Journal of East China Normal University (Natural Science), 2025, No. 1, pp. 46-58 (13 pages)
Funding: National Natural Science Foundation of China (62272416).
Abstract: Third-party software systems play an important role in modern software development. Developers search third-party software repositories for dependency libraries that fit their requirements, which avoids duplicated work and speeds up development. Retrieving a suitable library can be difficult, however. Repositories typically provide preset tags (categories) for developers to search by, but when a software system's preset tags are labeled incorrectly, developers cannot find the library they need, and development suffers. This study proposes a software classification model to address this problem. The model combines method vectors, method importance, and text vectors to assign software of unknown category to one of a set of known categories. Because no public dataset existed for this problem, we built one and made it publicly available; it comprises 120 software systems in 30 categories from the Maven repository. On this self-built dataset, the prediction accuracy was 70% with one candidate (top-1) and 90% with three candidates (top-3). The experimental results show that the proposed model can be used effectively to classify software systems in open-source repositories and can help developers quickly locate third-party libraries.
Keywords: software classification; third-party software systems; method importance score; code2vec
Classification number: TP311.5 [Automation and Computer Technology: Computer Software and Theory]
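The top-1 and top-3 accuracy figures reported in the abstract follow the usual top-k convention: a prediction counts as correct when the true category appears among the k highest-ranked candidates. The sketch below is not taken from the paper; the function name `top_k_accuracy` and the category labels are hypothetical and serve only to illustrate how such a metric is commonly computed.

```python
from typing import Sequence


def top_k_accuracy(
    ranked_predictions: Sequence[Sequence[str]],
    true_labels: Sequence[str],
    k: int,
) -> float:
    """Fraction of samples whose true label is among the first k ranked candidates.

    ranked_predictions[i] lists candidate categories for sample i, best first.
    This is an illustrative helper, not the evaluation code used in the paper.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    if len(ranked_predictions) != len(true_labels):
        raise ValueError("predictions and labels must have the same length")
    if not true_labels:
        raise ValueError("at least one sample is required")

    hits = sum(
        1
        for ranked, truth in zip(ranked_predictions, true_labels)
        if truth in ranked[:k]
    )
    return hits / len(true_labels)


if __name__ == "__main__":
    # Hypothetical category names, chosen only for the example.
    ranked = [
        ["json", "xml", "logging"],
        ["logging", "testing", "http"],
        ["http", "json", "xml"],
        ["testing", "logging", "json"],
    ]
    truths = ["json", "testing", "logging", "testing"]
    print(top_k_accuracy(ranked, truths, k=1))
    print(top_k_accuracy(ranked, truths, k=3))
```

Top-3 accuracy can never be lower than top-1 accuracy on the same predictions, since every top-1 hit is also a top-3 hit; this is consistent with the 70% and 90% figures in the abstract.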