TY - JOUR
T1 - Inverting Gradient Attacks Makes Powerful Data Poisoning
AU - Bouaziz, Wassim Wes
AU - Usunier, Nicolas
AU - El-Mhamdi, El Mahdi
N1 - Publisher Copyright:
© 2025, Transactions on Machine Learning Research. All rights reserved.
PY - 2025/1/1
Y1 - 2025/1/1
N2 - Gradient attacks and data poisoning tamper with the training of machine learning algorithms to maliciously alter them and have been proven to be equivalent in convex settings. The extent of harm these attacks can produce in non-convex settings is still to be determined. Gradient attacks are practical for fewer systems than data poisoning but have been argued to be more harmful since they can be arbitrary, whereas data poisoning reduces the attacker’s power to only being able to inject data points to training sets, via e.g. legitimate participation in a collaborative dataset. This raises the question whether the harm made by gradient attacks can be matched by data poisoning in non-convex settings. In this work, we provide a positive answer and show how data poisoning can mimic gradient attacks to perform an availability attack on (non-convex) neural networks. Through gradient inversion, commonly used to reconstruct data points from actual gradients, we show how reconstructing data points out of malicious gradients can be sufficient to perform a range of attacks. This allows us to show, for the first time, a worst-case availability attack on neural networks through data poisoning, degrading the model’s performances to random-level through a minority (as low as 1%) of poisoned points. Code available at https://github.com/wesbz/inverting-gradient.
AB - Gradient attacks and data poisoning tamper with the training of machine learning algorithms to maliciously alter them and have been proven to be equivalent in convex settings. The extent of harm these attacks can produce in non-convex settings is still to be determined. Gradient attacks are practical for fewer systems than data poisoning but have been argued to be more harmful since they can be arbitrary, whereas data poisoning reduces the attacker’s power to only being able to inject data points to training sets, via e.g. legitimate participation in a collaborative dataset. This raises the question whether the harm made by gradient attacks can be matched by data poisoning in non-convex settings. In this work, we provide a positive answer and show how data poisoning can mimic gradient attacks to perform an availability attack on (non-convex) neural networks. Through gradient inversion, commonly used to reconstruct data points from actual gradients, we show how reconstructing data points out of malicious gradients can be sufficient to perform a range of attacks. This allows us to show, for the first time, a worst-case availability attack on neural networks through data poisoning, degrading the model’s performances to random-level through a minority (as low as 1%) of poisoned points. Code available at https://github.com/wesbz/inverting-gradient.
UR - https://www.scopus.com/pages/publications/105025534598
M3 - Article
AN - SCOPUS:105025534598
SN - 2835-8856
VL - 2025-December
JO - Transactions on Machine Learning Research
JF - Transactions on Machine Learning Research
ER -