Empirical Study on Dependencies and Updates of R Packages  Cited by: 1


Authors: 程弘正 (CHENG Hongzheng) [1]; 杨文华 (YANG Wenhua) [1][2]

Affiliations: [1] College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China [2] Collaborative Innovation Center of Novel Software Technology and Industrialization, Nanjing 210023, China

Source: Computer Science (《计算机科学》), 2024, No. 6, pp. 1-11 (11 pages)

Abstract: R is a capable tool for statistical analysis and statistical graphics that is widely used in statistical analysis and artificial intelligence. It has a rich open-source ecosystem, and the number of R packages keeps growing. New R packages typically implement their functionality by importing existing packages, which makes the dependency relationships between R packages very complex and can even lead to dependency conflicts. Besides the dependencies themselves, package updates also contribute to this problem. An in-depth empirical study of the dependencies and updates of R packages is therefore needed to understand the current state of R package development. However, existing empirical studies on R have focused mainly on the R ecosystem as a whole, without a specific analysis of package dependencies and updates. To bridge this gap, this paper uses data from CRAN (Comprehensive R Archive Network) and GitHub to analyze commonly used R packages in four respects: their dependency relationships, their updates, potential dependency conflicts, and the updates of their dependencies. The study finds that dependency relationships between R packages are complex and that each package generally depends on a large number of packages, yet the dependencies are concentrated on a subset of R packages. Although commonly used R packages are updated frequently, many conflicts (inconsistencies) between dependencies remain; the paper also detects and classifies these dependency conflicts. The results give R developers and users a better understanding of the current state of R packages, offer suggestions that can help package developers avoid pitfalls during development, and outline directions for further research on R package dependencies and updates.

Keywords: R package; empirical study; dependency; update; dependency conflict

Classification number: TP311 [Automation and Computer Technology: Computer Software and Theory]

 
