Author: Yu Ping (于平) (Guangzhou Huanan Business College, Guangzhou 510000, Guangdong)
Source: Journal of Wuhan Engineering Institute (《武汉工程职业技术学院学报》), 2024, No. 1, pp. 36-41 (6 pages)
Abstract: Because software development involves collaboration among multiple teams and individuals, documents often contain inconsistencies, errors, or omissions. If these problems are not discovered and handled promptly, they can seriously reduce the efficiency and quality of software development. To obtain the required data accurately and to improve developer productivity and development speed, a deep-level mining method for multi-source heterogeneous data in software development information repositories is proposed. First, missing values and noisy data in the multi-source heterogeneous data obtained from different sources are handled on the basis of time series. Next, features are extracted from the processed data and used as input to a self-organizing feature map (SOM) neural network, which clusters the data. The ATPRK method is then used to predict the demand for multi-source heterogeneous data in the software information repository, and on this basis the clustering results output by the SOM network are clustered again, achieving deep-level mining of the data. Experimental results indicate that the method mines 99% of the multi-source heterogeneous data in the software development information repository and effectively removes data that are not needed. Clustering accuracy is best when the number of clusters is 16, and the minimum clustering entropy is only 0.31, indicating good deep-level mining performance.
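The SOM clustering step and the clustering-entropy measure mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`train_som`, `assign_clusters`, `clustering_entropy`), the learning-rate and neighbourhood schedules, and the entropy definition (size-weighted Shannon entropy of reference labels within each cluster) are all assumptions made for illustration. The time-series preprocessing, the ATPRK prediction, and the second clustering pass are omitted.

```python
import numpy as np


def train_som(data, rows, cols, epochs=20, lr0=0.5, sigma0=None, seed=0):
    """Train a small rectangular SOM; returns weights of shape (rows*cols, n_features)."""
    rng = np.random.default_rng(seed)
    n_units = rows * cols
    # Initialise each unit's weight vector from a randomly chosen sample.
    weights = data[rng.integers(0, len(data), size=n_units)].astype(float)
    # Grid coordinates of every unit, used by the neighbourhood function.
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    sigma0 = sigma0 if sigma0 is not None else max(rows, cols) / 2.0
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                  # linearly decaying learning rate
            sigma = sigma0 * (1.0 - frac) + 1e-3     # shrinking neighbourhood radius
            # Best-matching unit: the unit whose weights are closest to x.
            bmu = np.argmin(np.sum((weights - x) ** 2, axis=1))
            # Gaussian neighbourhood measured on the grid, centred on the BMU.
            grid_dist2 = np.sum((grid - grid[bmu]) ** 2, axis=1)
            h = np.exp(-grid_dist2 / (2.0 * sigma ** 2))
            weights += lr * h[:, None] * (x - weights)
            step += 1
    return weights


def assign_clusters(data, weights):
    """Map each sample to the index of its nearest SOM unit."""
    d2 = ((data[:, None, :] - weights[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)


def clustering_entropy(cluster_ids, true_labels):
    """Size-weighted mean Shannon entropy (bits) of reference labels per cluster.

    Assumed definition; the source does not state how its entropy is computed.
    """
    total = len(cluster_ids)
    entropy = 0.0
    for c in np.unique(cluster_ids):
        members = true_labels[cluster_ids == c]
        _, counts = np.unique(members, return_counts=True)
        p = counts / counts.sum()
        entropy += (len(members) / total) * -(p * np.log2(p)).sum()
    return entropy
```

A 4 x 4 grid, as in `train_som(features, 4, 4)`, yields 16 units, matching the cluster count the abstract reports as most accurate; the grid shape itself is an assumption. Under this definition, lower entropy means each cluster is dominated by a single reference label.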
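Keywords: software development; multi-source heterogeneous data; data mining; data preprocessing; feature extraction; data clustering; SOM neural network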
Classification number: TP311 [Automation and Computer Technology: Computer Software and Theory]