TY - GEN
T1 - Fair Play for Individuals, Foul Play for Groups? Auditing Anonymization's Impact on ML Fairness
AU - Arcolezi, Héber H.
AU - Alishahi, Mina
AU - Bendoukha, Adda Akram
AU - Kaaniche, Nesrine
N1 - Publisher Copyright:
© 2025 The Authors.
PY - 2025/10/21
Y1 - 2025/10/21
AB - Machine learning (ML) algorithms rely heavily on the availability of training data, which, depending on the domain, often includes sensitive information about data providers. This raises critical privacy concerns. Anonymization techniques have emerged as a practical solution to address these issues by generalizing features or suppressing data to make it more difficult to accurately identify individuals. Although recent studies have shown that privacy-enhancing technologies can influence ML predictions across different subgroups, thus affecting fair decision-making, the specific effects of anonymization techniques, such as k-anonymity, l-diversity, and t-closeness, on ML fairness remain largely unexplored. In this work, we systematically audit the impact of anonymization techniques on ML fairness, evaluating both individual and group fairness. Our quantitative study reveals that anonymization can degrade group fairness metrics by up to fourfold. Conversely, similarity-based individual fairness metrics tend to improve under stronger anonymization, largely as a result of increased input homogeneity. By analyzing varying levels of anonymization across diverse privacy settings and data distributions, this study provides critical insights into the trade-offs between privacy, fairness, and utility, offering actionable guidelines for responsible AI development. Our code is publicly available at: https://github.com/hharcolezi/anonymity-impact-fairness.
UR - https://www.scopus.com/pages/publications/105024415494
U2 - 10.3233/FAIA250909
DO - 10.3233/FAIA250909
M3 - Conference contribution
AN - SCOPUS:105024415494
T3 - Frontiers in Artificial Intelligence and Applications
SP - 1009
EP - 1018
BT - ECAI 2025 - 28th European Conference on Artificial Intelligence, including 14th Conference on Prestigious Applications of Intelligent Systems, PAIS 2025 - Proceedings
A2 - Lynce, Ines
A2 - Murano, Nello
A2 - Vallati, Mauro
A2 - Villata, Serena
A2 - Chesani, Federico
A2 - Milano, Michela
A2 - Omicini, Andrea
A2 - Dastani, Mehdi
PB - IOS Press BV
T2 - 28th European Conference on Artificial Intelligence, ECAI 2025, including 14th Conference on Prestigious Applications of Intelligent Systems, PAIS 2025
Y2 - 25 October 2025 through 30 October 2025
ER -