Automatic Classification of Software Repositories: a Systematic Mapping Study

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

The rapid growth of software repositories on development platforms such as GitHub, as well as archives like Software Heritage, prompts the need for better repository classification. Machine learning is increasingly used to automate this classification, but there are no secondary studies analyzing this research landscape. We present a systematic mapping study of 43 primary sources published between 2002 and 2023, in which we examine the goals, inputs, outputs, training, and evaluation processes involved in automatic repository classification. Our findings reveal a growing interest in automatic classification, particularly to enhance the discoverability and recommendation of relevant repositories. Other applications, such as classification for mining studies, were surprisingly underrepresented. We also observe that a lack of standardized datasets, classification tasks, and evaluation metrics makes it difficult to compare the performance of different techniques.

Original language: English
Title of host publication: Proceedings of the 29th International Conference on Evaluation and Assessment in Software Engineering, EASE, 2025 edition, EASE 2025
Editors: Muhammad Ali Babar, Ayse Tosun, Stefan Wagner, Viktoria Stray
Publisher: Association for Computing Machinery, Inc
Pages: 102-113
Number of pages: 12
ISBN (Electronic): 9798400713859
Publication status: Published - 24 Dec 2025
Event: 29th International Conference on Evaluation and Assessment of Software Engineering, EASE 2025 - Istanbul, Turkey
Duration: 17 Jun 2025 to 20 Jun 2025

Conference

Conference: 29th International Conference on Evaluation and Assessment of Software Engineering, EASE 2025
Country/Territory: Turkey
City: Istanbul
Period: 17/06/25 to 20/06/25

Keywords

  • repository classification
  • software repositories
  • systematic mapping study
