TY - GEN
T1 - Characterizing obfuscated JavaScript using abstract syntax trees
T2 - 26th IEEE International Conference on Advanced Information Networking and Applications Workshops, WAINA 2012
AU - Blanc, Gregory
AU - Miyamoto, Daisuke
AU - Akiyama, Mitsuaki
AU - Kadobayashi, Youki
PY - 2012/5/14
Y1 - 2012/5/14
N2 - Obfuscation, code transformations that make the code unintelligible, is still an issue for web malware analysts and is still a weapon of choice for attackers. Worse, some researchers have arbitrarily decided to consider obfuscated contents as malicious, although this has been proven wrong. Yet, we can assume that some web attack kits only feature a fraction of existing obfuscating transformations, which may make it easy to detect malicious scripting contents. However, because of the undecidability of obfuscated contents, we propose to survey, classify and design deobfuscation methods for each obfuscating transformation. In this paper, we apply abstract syntax tree (AST) based methods to characterize obfuscating transformations found in malicious JavaScript samples. We are able to classify similar obfuscated codes based on AST fingerprints regardless of the original attack code. We are also able to quickly detect these obfuscating transformations by matching them in an analyzed sample's AST using a pushdown automaton (PDA). The PDA accepts a set of subtrees representing previously learned obfuscating transformations. Such a quick and lightweight subtree matching algorithm has the potential to detect obfuscated pieces of code in a script, to be later extracted for deobfuscation.
AB - Obfuscation, code transformations that make the code unintelligible, is still an issue for web malware analysts and is still a weapon of choice for attackers. Worse, some researchers have arbitrarily decided to consider obfuscated contents as malicious, although this has been proven wrong. Yet, we can assume that some web attack kits only feature a fraction of existing obfuscating transformations, which may make it easy to detect malicious scripting contents. However, because of the undecidability of obfuscated contents, we propose to survey, classify and design deobfuscation methods for each obfuscating transformation. In this paper, we apply abstract syntax tree (AST) based methods to characterize obfuscating transformations found in malicious JavaScript samples. We are able to classify similar obfuscated codes based on AST fingerprints regardless of the original attack code. We are also able to quickly detect these obfuscating transformations by matching them in an analyzed sample's AST using a pushdown automaton (PDA). The PDA accepts a set of subtrees representing previously learned obfuscating transformations. Such a quick and lightweight subtree matching algorithm has the potential to detect obfuscated pieces of code in a script, to be later extracted for deobfuscation.
KW - JavaScript
KW - abstract syntax tree
KW - obfuscation
U2 - 10.1109/WAINA.2012.140
DO - 10.1109/WAINA.2012.140
M3 - Conference contribution
AN - SCOPUS:84860732068
SN - 9780769546520
T3 - Proceedings - 26th IEEE International Conference on Advanced Information Networking and Applications Workshops, WAINA 2012
SP - 344
EP - 351
BT - Proceedings - 26th IEEE International Conference on Advanced Information Networking and Applications Workshops, WAINA 2012
Y2 - 26 March 2012 through 29 March 2012
ER -