Policy text analysis of the list of environmental permit based on natural language processing (NLP)

Authors: WEI Zeyang; WANG Zishu; GONG Manli; XIE Dan[1]; YANG Yang[1]; LIU Yi[1]

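Affiliations: [1] School of Environment, Tsinghua University; [2] School of Resources and Environmental Economics, Inner Mongolia University of Finance and Economics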

Source: Journal of Environmental Engineering Technology (《环境工程技术学报》), 2025, No. 1, pp. 1-10 (10 pages)

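Funding: National Key Research and Development Program of China (2022YFC3203500); Tsinghua University National High-End Think Tank Research Project (2024WTJF0454).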

Abstract: The list of environmental permit (LEP) is the core lever of the ecological environment zoning-based regulation (EZR) system. It aims to prevent ecological and environmental damage at the source along four dimensions: spatial layout constraints, pollution emission control, environmental risk prevention and control, and resource and energy utilization efficiency control. LEPs are characterized by voluminous policy texts, diverse control measures, and complex expressions, so identifying the objects, methods, and intensity of LEP controls is an important basis for supporting EZR implementation. In this study, we used unsupervised natural language machine-learning techniques to mine policy vocabulary patterns in LEPs and to assign multidimensional quantitative labels to the policy texts, and we then applied natural language deep learning models to classify and evaluate the control measures in LEPs. Hebei Province is one of the provinces in China with the most complete range of industrial categories and the most complex resource and environmental issues, which makes its environmental access regulation typical and representative. Taking the industrial control measures of the LEPs in Hebei Province as an example, we identified 10 categories of policy keyword features and 64 main policy keywords; sentences containing these keywords covered 95% of the sentences in the entire lists. We constructed 24 control measure-industry classification labels, and applied and compared the BERT, RoBERTa, and ALBERT deep learning models for classifying the policy texts. The highest precision, recall, and F1 score reached 0.95, 0.79, and 0.86, respectively, and the trained models identified the policy content of the LEPs well. The results show that the LEPs of Hebei Province still fall short in making control measures clear, specific, and quantitative, and that content on refined industrial control, assessment indicators, and time limits needs to be supplemented and refined. The proposed method has good prospects for application; we suggest building on it with frontier artificial intelligence methods to further improve the model's automated processing efficiency, its dynamic analysis, and its capacity to provide refined policy adjustment suggestions.
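The two quantitative measures mentioned in the abstract, keyword sentence coverage and the F1 score, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the sample sentences, the two keywords, and the helper names `split_sentences`, `sentence_coverage`, and `f1_score` are all hypothetical, and the sentence-splitting rule (splitting on Chinese and Western sentence-ending punctuation) is an assumption.

```python
import re


def split_sentences(text):
    """Split text on common sentence-ending punctuation (assumed rule)."""
    parts = re.split(r"[。；;！？!?\n]+", text)
    return [p.strip() for p in parts if p.strip()]


def sentence_coverage(sentences, keywords):
    """Fraction of sentences that contain at least one keyword."""
    if not sentences:
        return 0.0
    hits = sum(1 for s in sentences if any(k in s for k in keywords))
    return hits / len(sentences)


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical sample text and keywords, for illustration only:
# "Prohibit new steel projects. Strictly control cement capacity;
#  strengthen environmental management of industrial parks."
sample_text = "禁止新建钢铁项目。严格控制水泥行业产能；加强园区环境管理。"
keywords = ["禁止", "严格控制"]  # "prohibit", "strictly control"

sentences = split_sentences(sample_text)
coverage = sentence_coverage(sentences, keywords)

# Precision and recall values as reported in the abstract.
f1 = f1_score(0.95, 0.79)

print(f"sentence coverage: {coverage:.2f}")
print(f"F1: {f1:.2f}")
```

If the reported precision (0.95) and recall (0.79) come from the same model, their harmonic mean rounds to the reported F1 of 0.86; the abstract only states that these are the highest values, so that pairing is an assumption.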

Keywords: ecological environment zoning-based regulation; list of environmental permit; policy text; natural language processing (NLP)

Classification code: X321 [Environmental Science and Engineering: Environmental Engineering]
