Authors: ZHAO Guizhong (赵贵中); HUANG Miaohua (黄淼华) (Huizhou Power Supply Bureau, Guangdong Power Grid Corporation, Huizhou 516001, China)
Affiliation: [1] Huizhou Power Supply Bureau, Guangdong Power Grid Corporation, Huizhou 516001, Guangdong, China
Source: Integrated Intelligent Energy (《综合智慧能源》), 2024, No. 11, pp. 19-28 (10 pages)
Funding: China Southern Power Grid Company science and technology project (031300KK52222091).
Abstract: Investigating patterns in electric power accidents and establishing a personal safety warning model require accurate, automated information extraction from large-scale accident samples for multidimensional analysis. However, traditional methods for extracting Chinese information entity features have shown low accuracy. Therefore, based on a novel named entity recognition technique for Chinese processing and leveraging multiple machine learning and deep learning models, a BERT-BiLSTM-CRF model tailored to the power grid accident domain was proposed. High-quality word vectors were generated by a pre-trained bidirectional encoder representations from transformers (BERT) model. A semantic enhancement masking strategy was employed to improve the model's understanding of the overall text structure. Then, a bidirectional long short-term memory (BiLSTM) model was applied to capture contextual information in both directions, completing feature extraction. The conditional random field (CRF) model produced the optimal prediction sequence. Experimental results demonstrated the superiority of this customized model, as its accuracy, recall, and F1 score exceeded those of three existing entity recognition models, including a general large model pre-trained using generative pre-trained transformer (GPT) technology. These experiments validate that the proposed method achieves high accuracy and displays significant advantages in Chinese electric power accident information extraction.
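Keywords: electric power accidents; information extraction; pre-trained bidirectional encoder representations (BERT); bidirectional long short-term memory network (BiLSTM); conditional random field (CRF)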
Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]