TY - GEN
T1 - LUMIA
T2 - 30th European Symposium on Research in Computer Security, ESORICS 2025
AU - Ibanez-Lissen, Luis
AU - Gonzalez-Manzano, Lorena
AU - de Fuentes, Jose Maria
AU - Anciaux, Nicolas
AU - Garcia-Alfaro, Joaquin
N1 - Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2026.
PY - 2026/1/1
Y1 - 2026/1/1
N2 - Large Language Models (LLMs) are increasingly used in a variety of applications. In parallel, concerns have grown about inferring whether given data samples belong to an LLM's training dataset. Previous efforts focus on black- to grey-box models, thus neglecting the potential benefit of internal LLM information. To address this problem, we propose Linear Probes (LPs) as a method to assess Membership Inference Attacks (MIAs) by examining the internal activations of LLMs. Our approach, dubbed LUMIA, applies LPs layer by layer to obtain fine-grained data on the model's inner workings. We test this method across several model architectures, sizes, and datasets, covering both unimodal and multimodal tasks. In unimodal MIA, LUMIA achieves an average gain of 14.90% in Area Under the Curve (AUC) over previous techniques. Notably, LUMIA reaches AUC > 60% in 65.33% of cases, an increase of 46.80% over the state of the art. Furthermore, our approach reveals key insights, such as the model layers at which MIAs are most detectable. In multimodal models, LPs indicate that visual inputs contribute significantly to MIAs, with AUC > 60% reached in 85.90% of the experiments.
AB - Large Language Models (LLMs) are increasingly used in a variety of applications. In parallel, concerns have grown about inferring whether given data samples belong to an LLM's training dataset. Previous efforts focus on black- to grey-box models, thus neglecting the potential benefit of internal LLM information. To address this problem, we propose Linear Probes (LPs) as a method to assess Membership Inference Attacks (MIAs) by examining the internal activations of LLMs. Our approach, dubbed LUMIA, applies LPs layer by layer to obtain fine-grained data on the model's inner workings. We test this method across several model architectures, sizes, and datasets, covering both unimodal and multimodal tasks. In unimodal MIA, LUMIA achieves an average gain of 14.90% in Area Under the Curve (AUC) over previous techniques. Notably, LUMIA reaches AUC > 60% in 65.33% of cases, an increase of 46.80% over the state of the art. Furthermore, our approach reveals key insights, such as the model layers at which MIAs are most detectable. In multimodal models, LPs indicate that visual inputs contribute significantly to MIAs, with AUC > 60% reached in 85.90% of the experiments.
KW - Large Language Models
KW - Large Multimodal Models
KW - Linear Probes
KW - Membership Inference Attacks
UR - https://www.scopus.com/pages/publications/105020261314
U2 - 10.1007/978-3-032-07884-1_10
DO - 10.1007/978-3-032-07884-1_10
M3 - Conference contribution
AN - SCOPUS:105020261314
SN - 9783032078834
T3 - Lecture Notes in Computer Science
SP - 186
EP - 206
BT - Computer Security – ESORICS 2025 - 30th European Symposium on Research in Computer Security, Proceedings
A2 - Nicomette, Vincent
A2 - Benzekri, Abdelmalek
A2 - Boulahia-Cuppens, Nora
A2 - Vaidya, Jaideep
PB - Springer Science and Business Media Deutschland GmbH
Y2 - 22 September 2025 through 24 September 2025
ER -