Authors: XU Chuanyun; MA Yingli; LI Gang; SHU Tao; LI Xingguang (School of Artificial Intelligence, Chongqing University of Technology, Chongqing 401135, China; School of Computer and Information Science, Chongqing Normal University, Chongqing 401331, China)
Affiliations: [1] Liangjiang School of Artificial Intelligence, Chongqing University of Technology, Chongqing 401135 [2] School of Computer and Information Science, Chongqing Normal University, Chongqing 401331
Source: Journal of Chongqing University of Technology: Natural Science, 2024, No. 1, pp. 150-159 (10 pages)
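Funding: Banan District Science and Technology Commission Project, Chongqing (2020QC413); Chongqing Science and Technology Commission Projects (cstc2020jscx-msxmX0086, cstc2019jscx-zdztzx0043); Chongqing Municipal Education Commission Project (KJQN202001137); Chongqing University of Technology Graduate Innovation Project (gzlcx20222137).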
Abstract: For instrumentation companies, responding quickly and automatically to users' requests for quotation, and thereby achieving unmanned quotation, is of great significance. However, the bill-of-materials spreadsheets supplied by different users follow no unified, standardized format, so instrumentation companies receive only semi-structured quotation spreadsheets that an unmanned quotation system struggles to analyze and understand. The key to building such a system is accurate automatic extraction of instrument parameters, and parameter extraction presupposes a correct understanding of the table structure. With the goal of building an unmanned quotation system, this paper therefore studies structure recognition for instrument quotation spreadsheets and proposes hybrid similarity metrics for table structure recognition (HSMTSR). The method combines Levenshtein distance, the Dice coefficient, and cell type similarity (TySim), and parses and identifies the table structure from the similarity of cells and of row data. In addition, a flowmeter spreadsheet dataset (FSDS) containing 714 spreadsheets and 8,574 rows of data is built to study and analyze the structure of instrument quotation spreadsheets. Practical application shows that the method accurately and efficiently identifies instrument quotation spreadsheets with a variety of complex structures, and that it achieves good results on several evaluation metrics.
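The abstract names three ingredients (Levenshtein distance, the Dice coefficient, and a cell type similarity) but does not say how they are defined at row level or how they are combined. The Python sketch below is one plausible illustration only, not the authors' HSMTSR implementation. Its assumptions: Levenshtein distance is normalized by the longer string, Dice is computed over sets of character bigrams, cell types are limited to three hypothetical classes (`empty`, `number`, `text`), and the three scores are averaged with equal weights. All function names are invented for this example.

```python
from typing import Sequence, Tuple


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    if len(a) < len(b):
        a, b = b, a
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def levenshtein_similarity(a: str, b: str) -> float:
    """Assumption: normalize the distance by the longer string's length."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def dice_coefficient(a: str, b: str) -> float:
    """Dice coefficient over sets of character bigrams (an assumed tokenization)."""
    x = {a[i:i + 2] for i in range(len(a) - 1)}
    y = {b[i:i + 2] for i in range(len(b) - 1)}
    if not x and not y:  # both strings shorter than two characters
        return 1.0 if a == b else 0.0
    return 2 * len(x & y) / (len(x) + len(y))


def cell_type(cell: str) -> str:
    """Hypothetical three-way typing: empty, number, or text."""
    text = cell.strip()
    if not text:
        return "empty"
    try:
        float(text)
        return "number"
    except ValueError:
        return "text"


def type_similarity(row_a: Sequence[str], row_b: Sequence[str]) -> float:
    """Fraction of column positions whose cell types agree (shorter row padded)."""
    width = max(len(row_a), len(row_b))
    if width == 0:
        return 1.0
    pad_a = list(row_a) + [""] * (width - len(row_a))
    pad_b = list(row_b) + [""] * (width - len(row_b))
    matches = sum(cell_type(p) == cell_type(q) for p, q in zip(pad_a, pad_b))
    return matches / width


def hybrid_row_similarity(row_a: Sequence[str], row_b: Sequence[str],
                          weights: Tuple[float, float, float] = (1 / 3, 1 / 3, 1 / 3)) -> float:
    """Weighted mix of the three scores; equal weights are an assumption."""
    text_a, text_b = " ".join(row_a), " ".join(row_b)
    w_lev, w_dice, w_type = weights
    return (w_lev * levenshtein_similarity(text_a, text_b)
            + w_dice * dice_coefficient(text_a, text_b)
            + w_type * type_similarity(row_a, row_b))
```

Under these assumptions, two data rows such as `["1", "DN50", "2.5"]` and `["2", "DN80", "4.0"]` score higher against each other than either does against a header row such as `["Item", "Size", "Flow"]`, which is the kind of signal a structure recognizer could use to separate header rows from data rows.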
Keywords: spreadsheet; structure recognition; similarity metrics; type similarity; instrument quotation
Classification: TP391 [Automation and Computer Technology: Computer Application Technology]