Authors: Teng Jie; Hu Guangwei [1,2]; Wang Ting [1,2]
Affiliations: [1] School of Information Management, Nanjing University, Jiangsu 210023; [2] Government Data Resources Institution of Nanjing University, Jiangsu 210023
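Source: Information and Documentation Services (《情报资料工作》), 2022, No. 3, pp. 20-33 (14 pages)
Funding: A phased research outcome of the National Social Science Fund of China Major Project "Research on the Precise Construction of an Urban and Rural Community Service System Driven by Big Data" (Grant No. 20&ZD154) and of the project "Research on an Evaluation System for Marketing Service Channel Effectiveness and Channel Synergy Effectiveness" (Grant No. SGJSYF00YHJS2000144).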
Abstract: [Purpose/significance] Efficiently and accurately grasping the turning points of social demands, identifying their topics, and tracking how those topics evolve, so as to support the harmonious and orderly development of government services and social governance, has become an important issue. [Method/process] This paper proposes a method for topic identification and evolution-path analysis based on semantic dependency relations. First, for the core terms of each document, "Source-Target" word pairs are constructed by the full combination method, and a dynamic semantic dependency network is built through time-interval division and a Word2Vec model. Second, a community discovery algorithm identifies the sub-communities of the semantic dependency network in each interval, and the PageRank algorithm identifies the topic label of each sub-community; the similarity of topics in adjacent time intervals is measured to reflect their evolutionary relationships, showing the process of topic generation, splitting, fusion, and decay. Finally, the model is validated on governor's mailbox data published by the People's Government of Gansu Province: its topic identification is compared with that of the K-means method and evaluated by precision, recall, and F1 value. [Result/conclusion] The improvement difference curves of the method are all greater than 0, and the difference curves of all three indicators lie overall above the 0.5 cut-off value, indicating a clear optimization effect. The study builds a panoramic view of the public concerns reflected in the leader's mailbox module of government websites, and the method offers new ideas for exploring other social text mining methods and for supporting big data analysis practice in national governance.
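Two steps of the pipeline summarized in the abstract can be illustrated with a minimal Python sketch: building word pairs by full combination from one document's core terms, and linking topics in adjacent time intervals by similarity. The abstract does not state how pairs are ordered or which similarity measure is used, so the unordered pairs, the Jaccard measure, the `threshold` value, and all function names below are assumptions made for illustration. The Word2Vec weighting, community discovery, and PageRank steps are omitted.

```python
from itertools import combinations


def source_target_pairs(core_terms):
    """Return every unordered pair of distinct core terms from one document.

    Assumption: pairs are unordered and duplicates are dropped; the
    abstract only says pairs are built by "full combination".
    """
    unique_terms = sorted(set(core_terms))
    return list(combinations(unique_terms, 2))


def jaccard(topic_a, topic_b):
    """Jaccard similarity of two term sets (an assumed similarity measure)."""
    a, b = set(topic_a), set(topic_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def link_topics(previous, current, threshold=0.3):
    """Link topics of adjacent intervals whose similarity reaches threshold.

    `previous` and `current` map a topic label to its set of terms.
    The threshold of 0.3 is a hypothetical value, not taken from the paper.
    """
    links = []
    for prev_label, prev_terms in previous.items():
        for cur_label, cur_terms in current.items():
            similarity = jaccard(prev_terms, cur_terms)
            if similarity >= threshold:
                links.append((prev_label, cur_label, similarity))
    return links
```

Under this reading, a topic in the earlier interval with several links forward would count as splitting, several earlier topics linking to one later topic as fusion, and a topic with no forward link as decay.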
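Keywords: semantic dependency relations; social demands; topic identification; topic evolution path; Word2Vec
Classification number: TP391.1 [Automation and Computer Technology: Computer Application Technology]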