Distributed user trace collection and storage system

Authors: XIA Qianchen; LYU Jianghua [1]; MENG Xiangxi; MA Shilong [1] (School of Computer Science and Engineering, Beihang University, Beijing 100083, China)

Affiliation: [1] School of Computer Science and Engineering, Beihang University, Beijing 100083, China

Source: Journal of Beijing University of Aeronautics and Astronautics, 2020, No. 3, pp. 548-562 (15 pages)

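Funding: National Natural Science Foundation of China (61300007, 61305054); Independent Exploration Fund of the State Key Laboratory of Software Development Environment (SKLSDE-2012ZX-28, SKLSDE-2014ZX-06).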

Abstract: In the distributed environment of a complex network, accurately and comprehensively collecting the behavioral data of a large number of users as they browse a website, together with the website's process data, and storing both efficiently, is the precondition and basis of user behavior analysis. To address the diversity of data types and the differences in how they are stored, to improve the efficiency of data retrieval, and to support user behavior analysis tailored to the individual needs of enterprises, this paper designs a white-box user trace collection and storage system. When users access a Web server, interaction/transaction data and user operations are generated, and browsing a website produces files of many types, such as pictures, videos, and product descriptions. These interfaces and data are called user browsing traces, while the operation sequence records the actual order of the user's actions. Analyzing user data together with operation sequences can accurately reflect user characteristics. The collection model is built on an interface window tree and provides a unified data access interface. Data are stored in different locations according to their type, so that user traces are collected completely. An application passes parameters that specify the storage location in order to create the database file, and through the access interface user data can be stored and retrieved by type and on demand. The system solves the problem of capturing, storing, and retrieving Internet-oriented user interaction traces, and has good accuracy and completeness.
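The design described in the abstract (an interface window tree, plus a unified access interface that routes data to different locations by type) can be illustrated with a minimal sketch. This is not the authors' implementation: the names `WindowNode` and `TraceStore`, the table layout, the use of SQLite for structured records, and the use of plain files for unstructured data are all assumptions made only for illustration.

```python
from __future__ import annotations

import sqlite3
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class WindowNode:
    """One node of a hypothetical interface window tree (a page or sub-window)."""
    name: str
    children: list[WindowNode] = field(default_factory=list)

    def add(self, child: WindowNode) -> WindowNode:
        self.children.append(child)
        return child

    def paths(self, prefix: str = "") -> list[str]:
        """Return the path of this node and of every descendant, depth-first."""
        here = f"{prefix}/{self.name}"
        result = [here]
        for child in self.children:
            result.extend(child.paths(here))
        return result


class TraceStore:
    """Hypothetical unified access interface.

    Text records go into a SQLite database file; binary payloads
    (pictures, video, ...) go to files in a separate directory.
    The caller chooses the storage location via `root`.
    """

    def __init__(self, root: Path) -> None:
        self.blob_dir = Path(root) / "blobs"
        self.blob_dir.mkdir(parents=True, exist_ok=True)
        self.db = sqlite3.connect(Path(root) / "traces.db")
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS trace ("
            "id INTEGER PRIMARY KEY, window TEXT, kind TEXT, "
            "is_blob INTEGER, value TEXT)"
        )

    def put(self, window: str, kind: str, payload: str | bytes) -> int:
        """Store one trace item and return its id; the row order is the operation sequence."""
        is_blob = isinstance(payload, bytes)
        cur = self.db.execute(
            "INSERT INTO trace (window, kind, is_blob, value) VALUES (?, ?, ?, ?)",
            (window, kind, int(is_blob), "" if is_blob else payload),
        )
        trace_id = cur.lastrowid
        if is_blob:
            # Unstructured data lives outside the database, keyed by the row id.
            (self.blob_dir / f"{trace_id}.bin").write_bytes(payload)
        self.db.commit()
        return trace_id

    def get(self, trace_id: int) -> str | bytes:
        """Fetch a trace item through the same interface, whatever its type."""
        row = self.db.execute(
            "SELECT is_blob, value FROM trace WHERE id = ?", (trace_id,)
        ).fetchone()
        if row is None:
            raise KeyError(trace_id)
        is_blob, value = row
        if is_blob:
            return (self.blob_dir / f"{trace_id}.bin").read_bytes()
        return value

    def by_kind(self, kind: str) -> list[int]:
        """Return the ids of all items of one type, in insertion order."""
        rows = self.db.execute(
            "SELECT id FROM trace WHERE kind = ? ORDER BY id", (kind,)
        )
        return [r[0] for r in rows]
```

In this sketch the caller sees a single `put`/`get` pair regardless of data type, and the window path returned by `WindowNode.paths()` can serve as the `window` argument that ties each stored item back to the interface where it was produced.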

Keywords: user behavior; user trace collection; interface window tree; unified storage; unstructured data

Classification: TP311 [Automation and Computer Technology: Computer Software and Theory]

 
