Authors: SUN Yikang, GAO Jianhua [1]
Affiliation: [1] Department of Computer Science and Technology, Shanghai Normal University, Shanghai 200234, China
Source: Computer Engineering (《计算机工程》), 2025, No. 2, pp. 223-237 (15 pages)
Funding: National Natural Science Foundation of China (61672355).
Abstract: Dead code is a code smell that leads to the gradual deterioration of software quality. Traditional dead code detection methods rely mainly on static analysis, code structure metrics, and heuristic rules. Their results vary considerably among developers, and they pay little attention to the textual information in source code and overlook how the code behaves during actual execution, which limits them significantly. To address these problems, a new dead code detection method is designed that combines a Convolutional Neural Network (CNN) with Long Short-Term Memory (LSTM); its main idea is to combine code text information with code metric information to improve detection accuracy. First, dead code instances in an application are identified with tools such as DUM-Tool, supplemented by manual inspection, and labeled. The textual information of the source code is obtained by a depth-first traversal of the abstract syntax tree, the label values are matched to the textual information, and code metric information is extracted with the CK code metric extraction tool. Next, Word2Vec converts the textual information into word vectors, a CNN extracts features from the code metric information, and the two are concatenated to form the dataset for dead code detection. Finally, an LSTM network is trained on the dataset, and a Sigmoid function performs the classification. Experimental results show that combining code text information and metric information detects dead code effectively; compared with traditional detection methods, the average F1 value improves by up to 12.58 percentage points.
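The depth-first traversal of the abstract syntax tree mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it uses Python's standard `ast` module on a Python snippet purely as a stand-in for whatever parser the paper applies to its subject programs, and the function name `dfs_tokens` and the choice of which node attributes to record are assumptions made for illustration.

```python
import ast


def dfs_tokens(source: str) -> list[str]:
    """Collect node type names and identifiers in depth-first (pre-order) order.

    Hypothetical helper: the token vocabulary chosen here is an assumption,
    not the scheme described in the paper.
    """
    tokens: list[str] = []

    def visit(node: ast.AST) -> None:
        # Record the syntactic category of the node first (pre-order).
        tokens.append(type(node).__name__)
        # Record identifier text where the node carries one.
        if isinstance(node, ast.Name):
            tokens.append(node.id)
        elif isinstance(node, (ast.FunctionDef, ast.ClassDef)):
            tokens.append(node.name)
        elif isinstance(node, ast.arg):
            tokens.append(node.arg)
        # Descend into each child before moving on to the next sibling.
        for child in ast.iter_child_nodes(node):
            visit(child)

    visit(ast.parse(source))
    return tokens


sample = "def unused(x):\n    return x + 1\n"
tokens = dfs_tokens(sample)
print(tokens)
```

A token sequence of this kind is the sort of textual input that could then be passed to a word-embedding step such as Word2Vec; the embedding, CNN, and LSTM stages are outside the scope of this sketch.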
Classification number: TP311 [Automation and Computer Technology: Computer Software and Theory]