Authors: LIU Haiyang[1]; ZHANG Yang[1]; TIAN Quanquan; WANG Xiaohong[1]
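Affiliation: [1] School of Information Science and Engineering, Hebei University of Science and Technology, Shijiazhuang, Hebei 050018, China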
Source: Hebei Journal of Industrial Science and Technology (《河北工业科技》), 2024, No. 5, pp. 330-335 (6 pages)
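Funding: National Natural Science Foundation of China (61440012); Natural Science Foundation of Hebei Province (F2023208001); Hebei Province Funding Project for Introduced Overseas-Educated Personnel (C20230358).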
Abstract: To improve the accuracy of multi-label code smell detection, a method named DMSmell (Deep Multi-Smell), based on a pre-trained model and BiLSTM-CNN, is proposed. First, a static analysis tool is used to extract textual information and structural metric information from the source code, and two detection rules are applied to label code smell instances. Second, the CodeBERT pre-trained model generates word vectors for the textual information, and BiLSTM and CNN perform deep feature extraction on the word vectors and the structural metric information, respectively. Finally, an attention mechanism is combined with a multi-layer perceptron to detect multi-label code smells, and the performance of DMSmell is evaluated. The results show that DMSmell improves the accuracy of multi-label code smell detection to a certain extent: compared with the classifier-chain-based method, the exact match ratio improves by 1.36 percentage points, micro-recall by 2.45 percentage points, and micro-F1 by 1.1 percentage points. This indicates that combining textual information with structural metric information, and using deep learning for feature extraction and classification, can effectively improve the accuracy of code smell detection, providing an important reference for research on and application of multi-label code smell detection.
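The three evaluation measures named in the abstract (exact match ratio, micro-recall, micro-F1) can be illustrated with a minimal sketch using their standard multi-label definitions. This is not the authors' evaluation code: the function names and the toy label matrices are assumptions made for illustration, and the two columns stand for two hypothetical code smell labels.

```python
from typing import List, Tuple

Labels = List[List[int]]  # one row per code sample, one 0/1 column per smell label


def exact_match_ratio(y_true: Labels, y_pred: Labels) -> float:
    """Fraction of samples whose entire label vector is predicted correctly."""
    matches = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return matches / len(y_true)


def micro_counts(y_true: Labels, y_pred: Labels) -> Tuple[int, int, int]:
    """Pool true positives, false positives, false negatives over all labels."""
    tp = fp = fn = 0
    for t_row, p_row in zip(y_true, y_pred):
        for t, p in zip(t_row, p_row):
            if t == 1 and p == 1:
                tp += 1
            elif t == 0 and p == 1:
                fp += 1
            elif t == 1 and p == 0:
                fn += 1
    return tp, fp, fn


def micro_recall(y_true: Labels, y_pred: Labels) -> float:
    """Recall computed from the pooled counts: TP / (TP + FN)."""
    tp, _, fn = micro_counts(y_true, y_pred)
    return tp / (tp + fn) if (tp + fn) else 0.0


def micro_f1(y_true: Labels, y_pred: Labels) -> float:
    """F1 computed from the pooled counts: 2TP / (2TP + FP + FN)."""
    tp, fp, fn = micro_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy data (hypothetical): 4 samples, 2 smell labels each.
y_true = [[1, 0], [1, 1], [0, 0], [0, 1]]
y_pred = [[1, 0], [1, 0], [0, 0], [1, 1]]

print(exact_match_ratio(y_true, y_pred))  # 0.5
print(micro_recall(y_true, y_pred))       # 0.75
print(micro_f1(y_true, y_pred))           # 0.75
```

Exact match ratio is the strictest of the three, since a sample counts only if every label is right, which is why gains on it are typically smaller than gains on the micro-averaged measures.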
Keywords: software engineering; code smell; pre-trained model; multi-label classification; deep learning
Classification code: TP311 [Automation and Computer Technology: Computer Software and Theory]