TY - GEN
T1 - The debsources dataset
T2 - 12th Working Conference on Mining Software Repositories, MSR 2015, co-located with the 37th ACM/IEEE International Conference on Software Engineering, ICSE 2015
AU - Zacchiroli, Stefano
N1 - Publisher Copyright: © 2015 IEEE.
PY - 2015/8/4
Y1 - 2015/8/4
N2 - We present the Debsources Dataset: distribution metadata and source code metrics spanning two decades of Free and Open Source Software (FOSS) history, seen through the lens of the Debian distribution. Debsources is a software platform used to gather, search, and publish on the Web the full source code of the Debian operating system, as well as measures about it. A notable public instance of Debsources is available at http://sources.debian.net; it includes both current and historical releases of Debian. Plugins to compute popular source code metrics (lines of code, defined symbols, disk usage) and other derived data (e.g., checksums) have been written, integrated, and run on all the source code available on sources.debian.net. The Debsources Dataset is a PostgreSQL database dump of sources.debian.net metadata, as of February 10th, 2015. The dataset contains both Debian-specific metadata - e.g., which software packages are available in which release, which source code files belong to which package, release dates, etc. - and source code information gathered by running Debsources plugins. The Debsources Dataset offers a very long-term historical view of the macro-level evolution and constitution of FOSS through the lens of popular, representative FOSS projects of their times.
AB - We present the Debsources Dataset: distribution metadata and source code metrics spanning two decades of Free and Open Source Software (FOSS) history, seen through the lens of the Debian distribution. Debsources is a software platform used to gather, search, and publish on the Web the full source code of the Debian operating system, as well as measures about it. A notable public instance of Debsources is available at http://sources.debian.net; it includes both current and historical releases of Debian. Plugins to compute popular source code metrics (lines of code, defined symbols, disk usage) and other derived data (e.g., checksums) have been written, integrated, and run on all the source code available on sources.debian.net. The Debsources Dataset is a PostgreSQL database dump of sources.debian.net metadata, as of February 10th, 2015. The dataset contains both Debian-specific metadata - e.g., which software packages are available in which release, which source code files belong to which package, release dates, etc. - and source code information gathered by running Debsources plugins. The Debsources Dataset offers a very long-term historical view of the macro-level evolution and constitution of FOSS through the lens of popular, representative FOSS projects of their times.
KW - Debian
KW - Free software
KW - Open source
KW - Software evolution
KW - Source code
U2 - 10.1109/MSR.2015.65
DO - 10.1109/MSR.2015.65
M3 - Conference contribution
AN - SCOPUS:84957062075
T3 - IEEE International Working Conference on Mining Software Repositories
SP - 466
EP - 469
BT - Proceedings - 12th Working Conference on Mining Software Repositories, MSR 2015
PB - IEEE Computer Society
Y2 - 16 May 2015 through 17 May 2015
ER -