A Method of Geographic Entity Significance Ranking with Tour Guide Speeches for Scenic Spots  Cited by: 2

Authors: WU Yue; ZHANG Ling [1,2]; LONG Yi

Affiliations: [1] School of Geography, Nanjing Normal University / Key Laboratory of Virtual Geographic Environment of Ministry of Education, Nanjing 210023, Jiangsu, China; [2] Jiangsu Center for Collaborative Innovation in Geographical Information Resource Development and Application, Nanjing 210023, Jiangsu, China

Source: Geography and Geo-Information Science, 2022, No. 3, pp. 9-16 (8 pages)

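Funding: National Natural Science Foundation of China project "Research on the Fusion Mechanism and Key Technologies of Multimodal Geographic Information" (42171403).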

Abstract: Geographic entity significance ranking is an important part of research on natural-language-oriented hierarchical scene cognition. Tour guide speeches, a natural language form that systematically describes the environment, attractions, and important resources of a specific scenic spot, contain a large number of scenic-spot geographic entities with different degrees of significance. However, traditional entity ranking methods ignore the important role of geospatial information and struggle to handle the unstructured or semi-structured geospatial features specific to geographic entities. To address this problem, this paper proposes a geographic entity significance ranking (GESR) model for tour guide speeches. First, drawing on the spatial cognition results in tour guide speeches, the objective ranking function is constructed from five extracted features: geographic entity frequency, clustering coefficient, a co-occurrence-based feature, spatial topological relations, and ambiguity levels of morphological descriptions. Second, linearly weighted weak learners are generated iteratively based on the sample error distribution and stochastic gradient descent, which strengthens learning on difficult samples. Finally, weighted averaging and reduced-error pruning are applied to obtain the boosted strong learner, namely the ranking model. Experiments on Chinese tour guide speeches show that: 1) compared with three baseline methods, the GESR model achieves a normalized discounted cumulative gain (NDCG) of 0.8841 and an area under the curve (AUC) of 0.7579, the best ranking performance; 2) the spatial topological relation feature and the ambiguity level of morphological descriptions have the most significant influence on the GESR model; 3) compared with popular attention (how much attention crowds pay to an entity), the GESR model better reflects the geospatial features of geographic entities in tour guide speeches.
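The two metrics reported in the abstract can be illustrated with a minimal Python sketch. It is not the authors' evaluation code. It assumes the common linear-gain form of DCG with a log2 position discount (the abstract does not say which DCG variant was used) and a pairwise definition of AUC with ties counted as one half. The function names `dcg`, `ndcg`, and `auc` are hypothetical.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain of relevance grades given in ranked order.

    Assumption: linear gain (rel) and a log2(position + 1) discount,
    where position is 1-based. Other variants use 2**rel - 1 as the gain.
    """
    return sum(rel / math.log2(pos + 2) for pos, rel in enumerate(relevances))


def ndcg(relevances):
    """DCG normalized by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0


def auc(scores, labels):
    """Fraction of (positive, negative) pairs ranked correctly by score.

    Labels are 1 (significant) or 0 (not significant); ties count as 0.5.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical relevance grades of entities in the order a model ranked them.
    print(round(ndcg([3, 1, 2, 0]), 4))
    # Hypothetical model scores and binary significance labels.
    print(auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))
```

Under these definitions, an NDCG of 1.0 means the predicted order matches the ideal order of relevance grades, and an AUC of 0.5 corresponds to random ordering of significant versus non-significant entities.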

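Keywords: tour guide speeches; geographic entity; significance; entity ranking; spatial topological relations; fuzzy morphological description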

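Classification numbers: P208 [Astronomy and Earth Sciences: Cartography and Geographic Information Engineering]; TP391.1 [Astronomy and Earth Sciences: Surveying and Mapping Science and Technology]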

 
