Authors: ZHOU Wenjie (周文杰); XIE Qi (谢琪); CUI Mengtian (崔梦天) (The Key Laboratory for Computer Systems of State Ethnic Affairs Commission, Southwest Minzu University, Chengdu 610041, P.R. China)
Affiliation: [1] The Key Laboratory for Computer Systems of State Ethnic Affairs Commission, Southwest Minzu University, Chengdu 610041
Source: Journal of Chongqing University (《重庆大学学报》), 2023, No. 7, pp. 53-62 (10 pages)
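Funding: National Natural Science Foundation of China (61502401, 12050410248); Sichuan Science and Technology Program (2021YFH0120); Fundamental Research Funds for the Central Universities, Southwest Minzu University (2020YYXS59).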
Abstract: To address semantic long-distance dependence and the reliance on a single type of bug report feature in existing research on duplicate bug report detection, this paper proposes a duplicate bug report detection model that strengthens textual semantic relevance and extracts multiple features. The model introduces a self-attention mechanism to capture semantic relevance within the bug report text sequence, dynamically computing contextual semantic vectors for semantic analysis and thereby resolving the long-distance dependence problem. It uses the latent Dirichlet allocation (LDA) algorithm to capture the topic features of the bug report text and, for the category information of bug reports, constructs a feature extraction network that computes category difference features. Finally, detection is performed on the combination of the three types of feature vectors. Experimental results show that the model achieves better detection performance.
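The self-attention step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes plain scaled dot-product attention with identity query/key/value projections (a real model would learn these), and the function names `softmax` and `self_attention` and the toy embeddings are hypothetical. It only shows how each token's contextual vector is a weighted mix of every token in the sequence, regardless of distance.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(embeddings):
    """Scaled dot-product self-attention over a token sequence.

    Assumption: queries, keys, and values are the embeddings themselves
    (identity projections), purely for illustration.
    Returns (context_vectors, attention_weights).
    """
    dim = len(embeddings[0])
    scale = math.sqrt(dim)
    contexts, weights = [], []
    for query in embeddings:
        # Similarity of this token to every token, near or far.
        scores = [
            sum(q * k for q, k in zip(query, key)) / scale
            for key in embeddings
        ]
        row = softmax(scores)
        # Contextual vector: attention-weighted sum of all token vectors.
        context = [
            sum(row[j] * embeddings[j][i] for j in range(len(embeddings)))
            for i in range(dim)
        ]
        weights.append(row)
        contexts.append(context)
    return contexts, weights


# Toy sequence: tokens 0 and 2 are identical but not adjacent.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
contexts, weights = self_attention(tokens)
```

In this toy sequence, token 0 attends to token 2 as strongly as to itself even though another token sits between them, which is the property the abstract relies on for long-distance dependence.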
Keywords: duplicate bug report detection; long-distance dependence; self-attention mechanism; semantic analysis; multi-feature extraction
Classification: TP311.5 [Automation and Computer Technology: Computer Software and Theory]