Author: Wu Kejie (吴克介) [1,2]
Affiliations: [1] State Key Laboratory of Coal Mine Disaster Prevention and Control, Chongqing 400037, China; [2] CCTEG Chongqing Research Institute, Chongqing 400037, China
Source: China Energy and Environmental Protection (《能源与环保》), 2024, No. 10, pp. 14-20 (7 pages)
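Funding: Tiandi Science and Technology Innovation and Entrepreneurship Fund Special Project (2023-TD-QN010); 2022 Xinjiang Uygur Autonomous Region Third Batch of Key R&D Task Special Projects, Department-Department and Department-Local Joint Project (2022B03031-3-1); National Key Research and Development Program of China (2018YFC0808300); Chongqing Technology Innovation and Application Development Special Key Project (cstc2019jscx-mbdxX0007).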
Abstract: Accident and penalty data needed for mine safety analysis are hard to obtain, so publicly available Web data on the Internet was chosen as the data source. Based on an analysis and summary of the visual characteristics of Web query result pages, a Web data extraction method based on vision and the DOM tree (VDLE) is proposed. First, the center-of-gravity offset of visual blocks is introduced to locate the data region. A spectral clustering algorithm then locates clusters of structurally similar nodes within that region, and the data records are located by combining these clusters with the diversity of text organization. Experimental results show that VDLE achieves a precision of 99%, which is 8.51% higher than D-EEM and 4.32% higher than ViDE, and a recall of 98.75%, which is 13.33% higher than D-EEM and 8.17% higher than ViDE. On this basis, a coal mine safety Web data collection system was developed. Field experiments show that the accident information collected by the system supplements and improves the reserve of mine safety information, laying a data foundation for mine safety analysis.
Keywords: vision; DOM tree; Web data extraction; coal mine safety; accident analysis
Classification code: TP67 [Automation and Computer Technology: Control Theory and Control Engineering]
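The abstract reports results as precision and recall of the extracted data records. As a minimal sketch of how those two metrics are conventionally computed, assuming each record can be identified by a hashable key: the function name `precision_recall` and the toy record IDs below are hypothetical illustrations, not taken from the paper, and the paper's own evaluation procedure may differ.

```python
def precision_recall(extracted, ground_truth):
    """Return (precision, recall) for a set of extracted record IDs.

    precision = correctly extracted records / all extracted records
    recall    = correctly extracted records / all true records
    """
    extracted = set(extracted)
    ground_truth = set(ground_truth)
    correct = extracted & ground_truth  # records that were extracted and are real
    precision = len(correct) / len(extracted) if extracted else 0.0
    recall = len(correct) / len(ground_truth) if ground_truth else 0.0
    return precision, recall


# Toy data (hypothetical): four true records on a page, four extracted,
# of which one ("x9") is a false positive and one ("r4") was missed.
truth = {"r1", "r2", "r3", "r4"}
found = {"r1", "r2", "r3", "x9"}
p, r = precision_recall(found, truth)
print(f"precision={p:.2%}, recall={r:.2%}")
```

Using sets keeps the comparison independent of extraction order; a real evaluation would also need a rule for deciding when an extracted record matches a true one.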