TY - GEN
T1 - Wild SBOMs
T2 - 22nd IEEE/ACM International Conference on Mining Software Repositories, MSR 2025
AU - Soeiro, Luís
AU - Robert, Thomas
AU - Zacchiroli, Stefano
N1 - Publisher Copyright: © 2025 IEEE.
PY - 2025/1/1
Y1 - 2025/1/1
AB - Developers gain productivity by reusing readily available Free and Open Source Software (FOSS) components. Such reuse also brings difficulties, such as managing licensing, components, and related security. One approach to handling those difficulties is to use Software Bills of Materials (SBOMs). While there have been studies on the readiness of practitioners to embrace SBOMs and on the SBOM tools ecosystem, a large-scale study on SBOM practices based on SBOM files produced in the wild is still lacking. A starting point for such a study is a large dataset of SBOM files found in the wild. We introduce such a dataset, consisting of over 78 thousand unique SBOM files, deduplicated from those found in over 94 million repositories. We include metadata that contains the standard and format used, the quality score generated by the sbomqs tool, the number of revisions, filenames, and provenance information. Finally, we give suggestions and examples of research that could bring new insights into assessing and improving real-world SBOM practices.
KW - SBOM dataset
KW - SBOM scores
KW - SBOM standards
KW - SBOM usage in the wild
UR - https://www.scopus.com/pages/publications/105009044792
U2 - 10.1109/MSR66628.2025.00036
DO - 10.1109/MSR66628.2025.00036
M3 - Conference contribution
AN - SCOPUS:105009044792
T3 - Proceedings - 2025 IEEE/ACM 22nd International Conference on Mining Software Repositories, MSR 2025
SP - 164
EP - 168
BT - Proceedings - 2025 IEEE/ACM 22nd International Conference on Mining Software Repositories, MSR 2025
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 27 April 2025 through 29 April 2025
ER -