Research on the Construction and Data Processing of a Knowledge Graph for Guangxi Intangible Cultural Heritage


Authors: ZHANG Tao [1], ZHOU Wei [2]

Affiliations: [1] College of Electronic Information, Guangxi Minzu University, Nanning 530006, China; [2] College of Artificial Intelligence, Guangxi Minzu University, Nanning 530006, China

Source: Intelligent Computer and Applications (《智能计算机与应用》), 2025, No. 3, pp. 72-78 (7 pages)

Abstract: Intangible cultural heritage embodies the accumulated cultural history of a region. It is an important component of China's fine traditional culture and a precious asset of human civilization, with irreplaceable historical and cultural value. Protecting and passing on this heritage is essential to maintaining cultural diversity. On the web today, however, information about Guangxi's intangible cultural heritage is scattered and poorly structured. To address this problem, the authors systematically collect Guangxi intangible cultural heritage information with a Python web crawler, then apply natural language processing models, in particular named entity recognition and relation extraction, to convert the unstructured text into structured data, which is subsequently organized and cleaned. Finally, drawing on the information integration and representation capabilities of knowledge graph technology, they construct a clearly structured knowledge graph of Guangxi's intangible cultural heritage.

Keywords: knowledge graph; Python web crawler; named entity recognition; relation extraction; Neo4j graph database; RoBERTa

Classification: TP393 [Automation and Computer Technology: Computer Application Technology]
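
The abstract outlines a pipeline in which extracted entities and relations are cleaned and then loaded into a graph database. The following is a minimal sketch of only the last two stages, under stated assumptions: the triple format, the `Entity` node label, the relation names, and the sample triples are all illustrative and are not taken from the paper, whose actual schema and code are not given in this record. The crawler and the NER and relation extraction models are out of scope here.

```python
from typing import Iterable

# Assumed shape of one extracted fact: (head entity, relation, tail entity).
Triple = tuple[str, str, str]

# Illustrative sample only; not data from the paper.
SAMPLE_TRIPLES: list[Triple] = [
    (" Zhuang brocade weaving ", "belongs_to", "traditional craftsmanship"),
    ("Zhuang brocade weaving", "BELONGS_TO", "traditional craftsmanship"),
    ("", "located_in", "Guangxi"),
]


def clean_triples(raw: Iterable[Triple]) -> list[Triple]:
    """Trim whitespace, normalize relation case, drop incomplete
    triples, and remove duplicates while keeping first-seen order."""
    seen: set[Triple] = set()
    cleaned: list[Triple] = []
    for head, rel, tail in raw:
        triple = (head.strip(), rel.strip().upper(), tail.strip())
        if not all(triple) or triple in seen:
            continue
        seen.add(triple)
        cleaned.append(triple)
    return cleaned


def to_cypher(triple: Triple) -> tuple[str, dict[str, str]]:
    """Build a parameterized Cypher MERGE statement for one triple.

    Cypher does not allow a relationship type to be passed as a
    parameter, so the type is validated and then inlined; entity
    names are passed as parameters.
    """
    head, rel, tail = triple
    if not rel.isidentifier():
        raise ValueError(f"unsafe relationship type: {rel!r}")
    query = (
        "MERGE (h:Entity {name: $head}) "
        "MERGE (t:Entity {name: $tail}) "
        f"MERGE (h)-[:{rel}]->(t)"
    )
    return query, {"head": head, "tail": tail}


if __name__ == "__main__":
    for t in clean_triples(SAMPLE_TRIPLES):
        print(to_cypher(t))
```

Using `MERGE` rather than `CREATE` keeps repeated loads from duplicating nodes or relationships. With the official Neo4j Python driver, each query and parameter dictionary could then be passed to a session's `run` method; that step is omitted so the sketch runs without a database.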

 
