Authors: TONG Mu-yu (童沐雨), LIU Jian-ping (刘建平), LIN Yi-lai (林熠来)
Affiliations: [1] Nanjing Institute of Product Quality Inspection (南京市产品质量监督检验院), Nanjing 210012, Jiangsu, China; [2] Jiangsu Suce Inspection and Certification Co., Ltd. (江苏苏测检测认证有限公司), Nanjing 210012, Jiangsu, China
Source: Techniques of Automation and Applications (《自动化技术与应用》), 2024, No. 3, pp. 116-119 (4 pages)
Abstract: High similarity between sentences lowers the efficiency of text mining for APP user privacy protection. To address this, the paper designs a text mining system for APP user privacy protection based on a genetic algorithm. Text is collected with the Heritrix crawler structure, whose multithreaded ToePool manages the crawl threads. An ARM processor preprocesses the text, and a vertical search engine module writes the index fields into the index. The similarity between sentences of two documents is computed from the frequency with which word features occur in each sentence; this determines the query relevance of the guided summary and measures how relevant the summary is to the query. A text extraction process for APP user privacy protection is then designed around the selected data sources. In the experiments, the system's results showed five frequency errors relative to the data source and were otherwise consistent with it, which the authors take as evidence of good mining performance.
Keywords: genetic algorithm; APP users; privacy protection; text mining; Heritrix crawler structure
Classification: TP309.2 [Automation and Computer Technology: Computer System Architecture]
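The abstract states that sentence similarity is computed from how often word features occur in each sentence, but it does not give the formula. As a rough illustration only, the sketch below assumes cosine similarity over raw term-frequency vectors with whitespace tokenization; the function names `term_frequencies` and `cosine_similarity` are hypothetical and are not taken from the paper, and Chinese text would need a proper word segmenter in place of `split()`.

```python
import math
from collections import Counter


def term_frequencies(sentence: str) -> Counter:
    """Count how often each lowercase, whitespace-separated token occurs."""
    # Assumption: whitespace tokenization stands in for the paper's
    # unspecified word-feature extraction.
    return Counter(sentence.lower().split())


def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity between the term-frequency vectors of two sentences."""
    tf_a, tf_b = term_frequencies(a), term_frequencies(b)
    # Only tokens present in both sentences contribute to the dot product.
    dot = sum(tf_a[t] * tf_b[t] for t in tf_a.keys() & tf_b.keys())
    norm_a = math.sqrt(sum(c * c for c in tf_a.values()))
    norm_b = math.sqrt(sum(c * c for c in tf_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty sentence is treated as similar to nothing
    return dot / (norm_a * norm_b)
```

Under this assumed measure, two sentences with identical token counts score 1.0 and two with no shared tokens score 0.0, so near-duplicate sentences can be flagged by thresholding the score.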