Authors: YAN Haifeng (严海峰) [1], JIAN Zihong (简梓红), JIANG Xiuming (江秀明)
Affiliations: [1] Map Institute of Guangdong Province, Guangzhou, Guangdong 510075, China; [2] Guangdong Surveying and Mapping Engineering Company Limited, Guangzhou, Guangdong 510663, China
Source: Beijing Surveying and Mapping (《北京测绘》), 2024, No. 9, pp. 1271-1276 (6 pages)
Funding: Science and Technology Program of Guangdong Province (2021B1111610001).
Abstract: Handling duplicate records is an important task in the governance of geographical name and address data. To address duplicate records in the geographical name and address database of Guangdong Province, this paper proposes a method for computing Chinese character similarity based on sound-shape codes (音形码, codes that combine a character's pronunciation with its written form), and describes the principle, workflow, and method of sound-shape-code-based de-duplication of geographical names and addresses. Deduplication software for geographical name and address data was developed on these principles. Using the geographical name and address data of Liwan District as experimental data, the software computes the similarity between records in the Liwan District database and then judges whether records are duplicates by combining the de-duplication rules with the difference in distance between them, which resolves the duplication problem and safeguards the accuracy of the database. The experimental results show that the software matches duplicate records well, and that duplication in geographical name and address data can be handled effectively by a dual-driven approach combining sound-shape codes and distance, offering a dependable solution for the governance of geographical name and address data in other regions.
Classification No.: P281 [Astronomy and Earth Sciences: Cartography and Geographic Information Engineering]
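The abstract describes judging duplicates from two signals: the sound-shape similarity of the names and the distance between the records. The sketch below illustrates that general idea only; it is not the paper's encoding, rules, or software. The `CODES` table, its four fields (initial, final, tone, structure), the function names, the thresholds `sim_threshold` and `max_dist_m`, and the sample coordinates are all assumptions made for illustration.

```python
import math

# Hypothetical toy table, NOT the paper's encoding:
# character -> (pinyin initial, pinyin final, tone, structure)
# "LR" = left-right structure, "TB" = top-bottom structure.
CODES = {
    "荔": ("l", "i", "4", "TB"),
    "湾": ("w", "an", "1", "LR"),
    "弯": ("w", "an", "1", "TB"),
    "路": ("l", "u", "4", "LR"),
    "璐": ("l", "u", "4", "LR"),
}


def char_similarity(a, b):
    """Fraction of matching code fields between two characters."""
    if a == b:
        return 1.0
    code_a, code_b = CODES.get(a), CODES.get(b)
    if code_a is None or code_b is None:
        return 0.0  # no code available: only identical characters match
    matches = sum(x == y for x, y in zip(code_a, code_b))
    return matches / len(code_a)


def name_similarity(name_a, name_b):
    """Position-wise average; unmatched trailing characters score 0."""
    longest = max(len(name_a), len(name_b))
    if longest == 0:
        return 1.0
    total = sum(char_similarity(x, y) for x, y in zip(name_a, name_b))
    return total / longest


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres on a sphere of radius 6371 km."""
    radius = 6371000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    h = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * radius * math.asin(math.sqrt(h))


def is_duplicate(rec_a, rec_b, sim_threshold=0.9, max_dist_m=50.0):
    """Records are (name, lat, lon); both signals must agree (assumed rule)."""
    name_a, lat_a, lon_a = rec_a
    name_b, lat_b, lon_b = rec_b
    similar = name_similarity(name_a, name_b) >= sim_threshold
    near = haversine_m(lat_a, lon_a, lat_b, lon_b) <= max_dist_m
    return similar and near


if __name__ == "__main__":
    a = ("荔湾路", 23.1200, 113.2400)  # made-up coordinates
    b = ("荔弯路", 23.1201, 113.2401)  # near-homophone, a few metres away
    c = ("荔弯路", 23.2000, 113.3000)  # same name, several kilometres away
    print(is_duplicate(a, b), is_duplicate(a, c))
```

Requiring both conditions means a similar-sounding name far away is not merged, which is the motivation for pairing name similarity with distance. The toy code is deliberately coarse: 路 and 璐 receive identical codes here, so a practical encoding would need more shape fields than this four-field example.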