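Author: CUI Yan-qing (崔彦青) (Institute of Computer Information, Inner Mongolia Medical University, Hohhot, Inner Mongolia 010110, China)
Affiliation: [1] Institute of Computer Information, Inner Mongolia Medical University
Source: Computer Simulation (《计算机仿真》), 2019, No. 10, pp. 349-352, 377 (5 pages)
Funding: National Natural Science Foundation of China (51167010)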
Abstract: Existing methods for automatically extracting topic information from dynamically partitioned web pages suffer from low accuracy, high error rates, and long running times. This paper uses a mixed weighting method to extract that information automatically. After the topic information of the dynamically partitioned page is preprocessed, a hierarchical tree model of it is built to determine the internal relationships among the pieces of topic information. A binary set sequence describes the topic information targeted for extraction, and the contribution of each topic-information text to the topic information of the whole page is calculated. A vector space model then describes the features of the topic information, and the mixed weighting method extracts the topic information from that model. Simulation results show that the running time of the method stays within 0.1 s and that extraction accuracy on the sample data is high, indicating that the method can extract topic information from dynamically partitioned web pages accurately and efficiently.
Classification number: TP311.13 [Automation and Computer Technology - Computer Software and Theory]
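
The abstract names a vector space model and a mixed weighting step but gives no formulas, so the following is only a minimal sketch of the general idea, not the paper's algorithm. It assumes each page block carries a structural weight in [0, 1] (for example, derived from its depth in a block tree), represents each block as a TF-IDF vector, measures the block's contribution to the whole page by cosine similarity with the summed page vector, and mixes the two with a hypothetical parameter `alpha`. The names `rank_blocks`, `alpha`, and the sample blocks are all illustrative assumptions.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer; a real system would need proper segmentation.
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (a dict) per tokenized block."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * (math.log((1 + n) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_blocks(blocks, alpha=0.7):
    """Rank blocks by a mixed score (assumed form, not from the paper).

    blocks: list of (text, structural_weight) with structural_weight in [0, 1].
    Returns (block_index, score) pairs sorted by descending score.
    """
    vectors = tfidf_vectors([tokenize(text) for text, _ in blocks])
    # Whole-page vector: the sum of all block vectors.
    page = {}
    for vec in vectors:
        for term, w in vec.items():
            page[term] = page.get(term, 0.0) + w
    scores = [
        (i, alpha * cosine(vec, page) + (1 - alpha) * weight)
        for i, (vec, (_, weight)) in enumerate(zip(vectors, blocks))
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical blocks: navigation bar, main content, footer.
blocks = [
    ("home login register contact about", 0.2),
    ("topic extraction web page topic extraction vector model", 0.9),
    ("copyright all rights reserved", 0.1),
]
ranking = rank_blocks(blocks)
print(ranking)
```

With these sample weights the content block ranks first; lowering `alpha` gives the structural weight more influence, raising it favors the text-similarity term.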