Design and Implementation of University Website Sensitive Information Monitoring Based on Spark Streaming (基于Spark Streaming的高校网站敏感信息监测设计与实现)

Cited by: 3


Authors: WANG Dan [1], DENG Qian, LIU Jiao

Affiliation: [1] Information Construction and Management Office, Jiangsu University of Science and Technology, Zhenjiang, Jiangsu 212000

Source: The Guide of Science & Education (《科教导刊》), 2021, No. 5, pp. 40-41, 44 (3 pages)

Abstract: As universities move toward smart campuses, they face the risk that information published on their websites, or the content of their web pages, is tampered with by hackers and turned into material that violates national or school regulations. A review of existing academic research finds that current techniques generally suffer from low efficiency and poor real-time performance. This paper proposes a university website sensitive information monitoring system based on Spark Streaming. The system uses Kafka as intermediate storage. Built on the Spark Streaming framework, it consumes data from Kafka in real time for link parsing, and stores the retrieved web page content in Elasticsearch, where sensitive information is matched through an inverted index. Data collection and data processing therefore proceed in step, which improves the efficiency of website monitoring.

Keywords: Spark Streaming; Kafka; Elasticsearch; sensitive information monitoring; inverted index

Classification: TP311 [Automation and Computer Technology: Computer Software and Theory]
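
The abstract describes matching sensitive information against an inverted index of fetched page content. The following is a minimal sketch of that matching step only, not the authors' implementation: an in-memory `deque` stands in for the Kafka topic, a small `InvertedIndex` class stands in for Elasticsearch, and the URLs, page texts, and term list are hypothetical. The regex tokenizer is also an assumption; Chinese page text would need a proper word segmenter.

```python
import re
from collections import defaultdict, deque


def tokenize(text):
    """Split text into lowercase word tokens (assumed stand-in for a real analyzer)."""
    return re.findall(r"\w+", text.lower())


class InvertedIndex:
    """Maps each term to the set of URLs whose content contains it."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, url, text):
        # Record the page under every term it contains.
        for term in tokenize(text):
            self.postings[term].add(url)

    def lookup(self, term):
        # .get avoids creating empty posting lists for unseen terms.
        return self.postings.get(term.lower(), set())


def monitor(queue, sensitive_terms):
    """Drain the queue into the index, then report pages hit by each sensitive term."""
    index = InvertedIndex()
    while queue:
        url, text = queue.popleft()
        index.add(url, text)
    alerts = {}
    for term in sensitive_terms:
        hits = index.lookup(term)
        if hits:
            alerts[term] = sorted(hits)
    return alerts


# Hypothetical crawled pages, queued as (url, content) pairs.
pages = deque([
    ("http://example.edu/news", "Campus news and lecture schedule"),
    ("http://example.edu/notice", "Notice replaced with gambling advertisement"),
])
alerts = monitor(pages, ["gambling", "lottery"])
print(alerts)
```

Looking up a term costs one dictionary access regardless of how many pages have been indexed, which is the property that makes an inverted index attractive for checking a fixed list of sensitive terms against a growing set of pages.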

 
