TY - GEN
T1 - How R Developers explain their Package Choice
T2 - 17th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, ESEM 2023
AU - Malviya-Thakur, Addi
AU - Mockus, Audris
AU - Zaretzki, Russell
AU - Bichescu, Bogdan
AU - Bradley, Randy
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - Background: Contemporary software development relies heavily on reusing already implemented functionality, usually in the form of packages. Aims: We aim to shed light on developers' preferences when selecting packages in the R language. Method: To do that, we create and administer a survey to over 1000 developers who have added one of two common dataframe enhancement libraries in R to their projects: data.table or tidyr. We design a questionnaire using the Social Contagion Theory (SCT), following prior work on technology adoption, and ensure that key dimensions affecting developer choice are considered. Results: Of the 1085 developers we contacted, 803 completed the survey asking them to prioritize various factors known to affect developer perceptions of package quality and to provide their background. Most developers self-identified as data scientists with two to five years of work experience. We found significant differences between the preferences of developers who chose data.table and tidyr. Surprisingly, package reputation based on easy-to-see measures, such as the number of stars on GitHub, was not an important factor for either group. Conclusions: Our findings demonstrate the inherently social nature of package adoption. They can help design future studies on how different populations of developers make decisions on which software packages to use in their projects. Finally, package developers and maintainers can benefit by better understanding the prime concerns of the users of their packages.
AB - Background: Contemporary software development relies heavily on reusing already implemented functionality, usually in the form of packages. Aims: We aim to shed light on developers' preferences when selecting packages in the R language. Method: To do that, we create and administer a survey to over 1000 developers who have added one of two common dataframe enhancement libraries in R to their projects: data.table or tidyr. We design a questionnaire using the Social Contagion Theory (SCT), following prior work on technology adoption, and ensure that key dimensions affecting developer choice are considered. Results: Of the 1085 developers we contacted, 803 completed the survey asking them to prioritize various factors known to affect developer perceptions of package quality and to provide their background. Most developers self-identified as data scientists with two to five years of work experience. We found significant differences between the preferences of developers who chose data.table and tidyr. Surprisingly, package reputation based on easy-to-see measures, such as the number of stars on GitHub, was not an important factor for either group. Conclusions: Our findings demonstrate the inherently social nature of package adoption. They can help design future studies on how different populations of developers make decisions on which software packages to use in their projects. Finally, package developers and maintainers can benefit by better understanding the prime concerns of the users of their packages.
KW - Code reuse
KW - Empirical software engineering
KW - R System
KW - Social aspects
KW - Social Contagion Theory
KW - Software engineering research
KW - Software measurement
KW - Software supply chains
KW - User behavior
UR - http://www.scopus.com/inward/record.url?scp=85178661768&partnerID=8YFLogxK
U2 - 10.1109/ESEM56168.2023.10304869
DO - 10.1109/ESEM56168.2023.10304869
M3 - Conference contribution
AN - SCOPUS:85178661768
T3 - International Symposium on Empirical Software Engineering and Measurement
BT - 2023 ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, ESEM 2023
PB - IEEE Computer Society
Y2 - 26 October 2023 through 27 October 2023
ER -