Authors: LI Hua[1], CHEN Yu-yuan, GAO Hong, HE Si-min, QIAO Zheng-yuan (School of Resources Engineering, Xi'an University of Architecture and Technology, Xi'an 710055, China; The Northwest Company of China Construction Third Engineering Bureau, Xi'an 710000, China)
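Affiliations: [1] School of Resources Engineering, Xi'an University of Architecture and Technology, Xi'an 710055; [2] The Northwest Company of China Construction Third Engineering Bureau, Xi'an 710000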
Source: Journal of Safety and Environment (《安全与环境学报》), 2022, No. 3, pp. 1421-1429 (9 pages)
Funding: Natural Science Special Project of the Xi'an University of Architecture and Technology University Fund (X20180011).
Abstract: To automatically classify and identify hidden-danger inspection records in the safety management of intelligent construction sites, this paper proposes an improved Bert model for classifying construction accident hidden dangers, with the further aim of improving the practicability and applicability of safety inspection notifications. First, multi-category term weighting is combined with the word embedding scheme, so that a term receives a different weight for each hidden-danger category. Second, the original cross-entropy loss function is replaced by a focal loss function whose category weights α_t are optimized by a genetic algorithm, so that each hidden-danger category is assigned an optimal weight. Third, three improved classification algorithms are built on the Bert model to classify the hidden-danger corpus effectively. Finally, 612 safety inspection reports collected by a construction company over the past eight years are cleaned, denoised, and otherwise manually preprocessed, which removes corpus noise such as special characters, useless information, and mixed full-width (SBC) and half-width (DBC) characters. The hidden dangers are then divided into accident categories according to standard specifications and labeled through two-way exchange annotation. The resulting data set contains 16033 hidden-danger texts with 12 category labels and is used to compare and verify three groups of algorithms. The results show that the overall F_1 score of the ga_Bert+TFIDF+focal model across the hidden-danger categories reaches 92.86%, which is 5.9%, 1.6%, and 0.66% higher than that of Bert+enc, Bert+focal, and ga_Bert+focal, respectively, so the model is well suited to classifying construction accident hidden-danger texts. The improved Bert model addresses the problem that a term has different importance in documents with different category labels, and it reduces the impact of unbalanced class distributions on classification performance in multi-class tasks, providing theoretical support for intelligent project safety management in construction enterprises.
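As an illustration of the loss described in the abstract, the following is a minimal sketch of a class-weighted focal loss for a single sample. It is not the authors' implementation: the function names `softmax` and `focal_loss` are hypothetical, the focusing parameter `gamma=2.0` is an assumed default that the abstract does not state, and the weights `alpha` are simply passed in, whereas the paper searches for them with a genetic algorithm.

```python
import math


def softmax(logits):
    """Convert raw class scores to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def focal_loss(logits, target, alpha, gamma=2.0):
    """Class-weighted focal loss for one sample.

    logits: raw scores, one per class
    target: index of the true class
    alpha:  per-class weights alpha_t (assumed given here; the paper
            tunes them with a genetic algorithm)
    gamma:  focusing parameter (assumed value, not from the source)
    """
    p_t = softmax(logits)[target]
    p_t = max(p_t, 1e-12)  # guard against log(0)
    # (1 - p_t)^gamma down-weights samples the model already gets right,
    # and alpha[target] rebalances rare versus frequent classes.
    return -alpha[target] * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    alpha = [1.0, 3.0]  # hypothetical weights: class 1 is the rarer class
    print(focal_loss([2.0, 0.5], target=0, alpha=alpha))
    print(focal_loss([2.0, 0.5], target=1, alpha=alpha))
```

With `gamma=0` and all weights equal to 1, the expression reduces to the ordinary cross-entropy loss, which is the loss the abstract says the focal loss replaces.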
Keywords: safety social engineering; Focal loss; Bert; term weighting; imbalanced data set; accident hidden-danger classification
Classification code: X947 [Environmental Science and Engineering: Safety Science]