Affiliations: [1] Computer College, Huazhong University of Science and Technology; [2] Wuhan National Laboratory for Optoelectronic
Source: Journal of Shanghai University (English Edition) (上海大学学报(英文版)), 2011, No. 6, pp. 574-588 (15 pages)
Funding: Project supported by the National Basic Research Program of China (Grant Nos. 2004CB318201, 2011CB302300); the US National Science Foundation (Grant No. CCF-0621526); the National Natural Science Foundation of China (Grant No. 60703046); HUST-SRF (Grant No. 2007Q021B)
Abstract: File semantics have proven effective in optimizing large-scale distributed file systems. Because of the elaborate and rich I/O interfaces between upper-layer applications and file systems, a file system can provide useful and insightful semantic information. Hence, file semantic mining has become an increasingly important practice in both the engineering and research communities. Unfortunately, exploiting file semantic knowledge is challenging because a variety of factors could affect this information exploration process. Even worse, the intricate interdependency between these factors exacerbates the challenge and makes it difficult to fully exploit the potentially important correlations among various kinds of semantic knowledge. This article proposes a file access correlation mining and evaluation reference (FARMER) model, in which each file is represented in a multivariate vector space and each item within the vector corresponds to a separate factor of the given file. The selection of factors depends on the application; examples of factors are file path, creator, and executing program. If one particular factor occurs in both files, its value is non-zero. The extent of inter-file relationships can therefore be measured by the likeness of the factor values in the semantic vectors. Benefiting from this model, FARMER represents files as structured vectors of identifiers, and basic vector operations can be leveraged to quantify the correlation between two file vectors. The FARMER model uses a linear regression model to estimate the strength of the relationship between file correlation and a set of influencing factors so that the "bad knowledge" can be filtered out. To demonstrate the ability of the new FARMER model, FARMER is incorporated into a real large-scale object-based storage system as a case study to dynamically infer file correlations. In addition, a FARMER-enabled optimization service for the metadata prefetching algorithm and the object data layout algorithm is implemented. Experimental results show that is FARMER-enabled prefetching al
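As a rough illustration of the vector-space idea described in the abstract, the following Python sketch represents each file as a sparse vector of identifiers and compares two files with cosine similarity. The abstract does not state which vector operation, weighting, or path encoding FARMER actually uses, so the function names (`semantic_vector`, `cosine`), the binary weights, the choice of cosine similarity, and the sample files are all assumptions made for illustration only.

```python
import math

def semantic_vector(attrs):
    """Map a file's factors to a sparse vector {(factor, identifier): weight}.

    Assumption: every identifier gets a binary weight of 1.0, and a path
    contributes one identifier per directory component (file name excluded).
    """
    vec = {}
    for factor, value in attrs.items():
        if factor == "path":
            for part in value.strip("/").split("/")[:-1]:
                vec[("path", part)] = 1.0
        else:
            vec[(factor, value)] = 1.0
    return vec

def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * v[k] for k, w in u.items() if k in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

# Hypothetical files: two sources in the same project, plus an unrelated log.
va = semantic_vector({"path": "/home/alice/proj/main.c",
                      "creator": "alice", "program": "gcc"})
vb = semantic_vector({"path": "/home/alice/proj/util.c",
                      "creator": "bob", "program": "gcc"})
vc = semantic_vector({"path": "/var/log/syslog",
                      "creator": "root", "program": "rsyslogd"})

print(cosine(va, vb))  # files sharing a directory and a program score high
print(cosine(va, vc))  # files with no shared identifiers score zero
```

Only identifiers present in both files contribute to the dot product, which mirrors the abstract's statement that a factor shared by both files has a non-zero value. The regression step that filters out "bad knowledge" is not sketched here, because the abstract gives no detail on its inputs or training data.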
Keywords: storage management; file correlation; file system management; mining method and algorithms
Classification: TP391.1 [Automation and Computer Technology: Computer Application Technology]