Author: YU Xiangqian (余向前) (State Grid Gansu Electric Power Company, Lanzhou 730030, China)
Source: Process Automation Instrumentation (《自动化仪表》), 2023, No. 1, pp. 92-95, 100 (5 pages in total)
Abstract: The growth of power informatization has steadily increased the volume of data in power marketing systems. As a result, data conversion during extraction performs poorly, which leads to a high recall rate in the extraction results. To address this, a new intelligent extraction method for power marketing data is designed around the conversion capability of Extensible Markup Language (XML). The power marketing data is normalized into small-range data chains, and the hyperlink-induced topic search (HITS) algorithm is applied to obtain the data sources. An XML data conversion tool is configured, and XML location descriptors are used to locate data regions. With the extraction rules and the extraction content defined, data mapping is then used to extract the power marketing data. Performance was tested in two environments: stable operation and data intrusion. The comparison shows that the XML-based method keeps the recall rate below 7% and the extraction time below 800 ms, both better than the traditional method, demonstrating the effectiveness of the method.
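The abstract gives no implementation details, so the following is only a minimal sketch of the general idea of locating data regions in XML and mapping their fields to target names. It is not the author's method. The sample document, the `RULES` table, and the `extract_records` function are all hypothetical, and the limited XPath support in Python's standard `xml.etree.ElementTree` stands in for the "XML location descriptors" mentioned above.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample of power marketing data; the real schema is not given in the abstract.
SAMPLE_XML = """
<marketing>
  <record id="1001">
    <customer>Alice</customer>
    <usage unit="kWh">320.5</usage>
  </record>
  <record id="1002">
    <customer>Bob</customer>
    <usage unit="kWh">88</usage>
  </record>
</marketing>
"""

# Hypothetical extraction rules: target field -> (path relative to a region, converter).
RULES = {
    "customer_name": ("customer", str),
    "usage_kwh": ("usage", float),
}


def extract_records(xml_text, region_path="./record", rules=RULES):
    """Locate each data region by path, then map its fields to target names."""
    root = ET.fromstring(xml_text.strip())
    rows = []
    for region in root.findall(region_path):  # region location step
        row = {"record_id": region.get("id")}
        for target, (path, convert) in rules.items():  # field mapping step
            node = region.find(path)
            has_text = node is not None and node.text
            row[target] = convert(node.text.strip()) if has_text else None
        rows.append(row)
    return rows


records = extract_records(SAMPLE_XML)
for row in records:
    print(row)
```

Keeping the rules in a plain dictionary separates what to extract from how the XML is traversed, so a new field can be added without touching the traversal code.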