TY - JOUR
T1 - On the Resilience of Traditional AI Algorithms Toward Poisoning Attacks for Vulnerability Detection
AU - González-Manzano, Lorena
AU - Garcia-Alfaro, Joaquin
N1 - Publisher Copyright: © 2025 Lorena González-Manzano and Joaquin Garcia-Alfaro. IET Information Security published by John Wiley & Sons Ltd.
PY - 2025/1/1
Y1 - 2025/1/1
N2 - The complexity of implementations and the interconnection of assorted systems and devices facilitate the emergence of vulnerabilities. Detection systems are developed to counter this security issue, and the use of artificial intelligence (AI) is a common practice. However, the use of AI is not without its problems, especially those affecting the training phase. This article tackles this issue by characterizing resilience against poisoning attacks using a benchmark for vulnerability detection, extracting simple code features and applying traditional AI algorithms. These choices favor the fast processing of vulnerabilities required in a triage process. The study is carried out in C#, C/C++, and PHP. Results show that the vulnerability detection process is especially affected when more than 20% of the data is false. Remarkably, the detection of some of the most frequent Common Weakness Enumeration (CWE) entries is altered even at lower poison rates. Overall, K-nearest-neighbor (KNN) and support vector machine (SVM) are the most resilient in C# and C/C++, while multilayer perceptron (MLP) is the most resilient in PHP. Vulnerability detection in PHP is less affected by attacks, while C# and C/C++ present comparable results.
KW - artificial intelligence
KW - deadcode insertion
KW - function renaming
KW - label flipping
KW - poison attack
KW - vulnerability detection
UR - https://www.scopus.com/pages/publications/105018667080
U2 - 10.1049/ise2/9997989
DO - 10.1049/ise2/9997989
M3 - Article
AN - SCOPUS:105018667080
SN - 1751-8709
VL - 2025
JO - IET Information Security
JF - IET Information Security
IS - 1
M1 - 9997989
ER -