Authors: CHEN Jiao-na (陈娇娜); TAO Wei-jun (陶伟俊); JIN Yin-li (靳引利) [2]
Affiliations: [1] School of Electronic Engineering, Xi'an Shiyou University, Xi'an, Shaanxi 710065, China; [2] School of Electronics and Control Engineering, Chang'an University, Xi'an, Shaanxi 710061, China
Source: Journal of Highway and Transportation Research and Development (《公路交通科技》), 2024, No. 4, pp. 186-193, 213 (9 pages)
Funding: National Natural Science Foundation of China (52002315); National Key Research and Development Program of China (2019YFB1600700).
Abstract: To extract emergency response information from natural-language descriptions of traffic accidents, a named entity recognition method for traffic accidents is proposed, based on pre-trained models and BiLSTM-CRF. First, using multimodal traffic accident data from expressways in Shaanxi Province from June 2021 to August 2022, three deep learning models are compared on recognition performance and training time. Second, a traffic accident corpus from official microblog accounts serves as an out-of-bag test set to check the robustness of the entity recognition models. Third, evaluation indicators for multimodal traffic accident information content, covering text information and structured data, are constructed along two dimensions: consistency and richness. Finally, traffic accident information is recognized on the test set, the correlation between the number of emergency response entities and accident duration is analyzed, and the information content indicators are computed and discussed. The results show that BERT-BiLSTM-CRF achieves weighted F1 values of 97.0294% on the test set and 69.1555% on the out-of-bag test set, giving the best overall performance across model accuracy, training efficiency, and robustness. The correlation coefficients between accident duration and the entity counts for response agency, response equipment, not yet handled, in progress, and response outcome are 0.309, 0.151, 0.137, 0.220, and 0.178 respectively, all positive. The information content consistency for weather, road property loss, traffic diversion, accident type, and casualties is 7.06%, 45.79%, 1.59%, 67.65%, and 47.59% respectively; the proportion of emergency response information is 36%, and the variability is 1.305. This indicates that the text information contains rich emergency response information, while the consistency between text information and structured data describing the same accident still needs to improve. The results can serve as a reference for improving the quality and effectiveness of traffic accident information collection.
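The weighted F1 value reported in the abstract can be illustrated with a minimal sketch. It assumes the common definition, in which per-entity-type F1 scores are averaged with weights proportional to each type's number of gold entities (its support); the record does not state the paper's exact formula. The entity type names and counts below are hypothetical toy values, not data from the study.

```python
from typing import Dict, Tuple

# (true positives, false positives, false negatives) per entity type
Counts = Tuple[int, int, int]


def f1(tp: int, fp: int, fn: int) -> float:
    """F1 for one entity type; defined as 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def weighted_f1(counts: Dict[str, Counts]) -> float:
    """Support-weighted average of per-type F1 (assumed definition)."""
    # Support of a type = number of gold entities = tp + fn.
    total_support = sum(tp + fn for tp, _fp, fn in counts.values())
    if total_support == 0:
        return 0.0
    weighted_sum = sum((tp + fn) * f1(tp, fp, fn) for tp, fp, fn in counts.values())
    return weighted_sum / total_support


# Hypothetical toy counts, for illustration only.
toy_counts = {
    "response_agency": (8, 2, 2),     # F1 = 0.8, support 10
    "response_equipment": (3, 1, 2),  # F1 = 2/3, support 5
}

if __name__ == "__main__":
    print(f"weighted F1 = {weighted_f1(toy_counts):.4f}")
```

Under this definition, frequent entity types dominate the score, so a high weighted F1 can coexist with weak performance on rare types.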
Keywords: intelligent transportation; traffic accident; multimodal data; pre-trained model; bidirectional long short-term memory
Classification: U491.3 [Transportation Engineering: Transportation Planning and Management]