TY - JOUR
T1 - A survey on multi-lingual offensive language detection
AU - Mnassri, Khouloud
AU - Farahbakhsh, Reza
AU - Chalehchaleh, Razieh
AU - Rajapaksha, Praboda
AU - Jafari, Amir Reza
AU - Li, Guanlin
AU - Crespi, Noel
N1 - Publisher Copyright:
© 2024 Zeng and Asif
PY - 2024/1/1
Y1 - 2024/1/1
AB - The prevalence of offensive content on online communication and social media platforms continues to grow, which makes its detection difficult, especially in multilingual settings. The term “Offensive Language” encompasses a wide range of expressions, including various forms of hate speech and aggressive content. Exploring multilingual offensive content, which goes beyond a single language, therefore captures greater linguistic diversity and more cultural factors. By exploring multilingual offensive content, we can broaden our understanding and more effectively combat the global impact of offensive language. This survey examines the current state of multilingual offensive language detection, including a comprehensive analysis of previous multilingual approaches and existing datasets, and provides resources in the field. We also explore the community challenges related to this task, including technical, cultural, and linguistic ones, as well as their limitations. Furthermore, we propose several potential future directions toward more efficient solutions for multilingual offensive language detection, enabling a safer digital communication environment worldwide.
KW - Hate speech
KW - Literature review
KW - Multilingualism
KW - Offensive language
KW - Social media
U2 - 10.7717/peerj-cs.1934
DO - 10.7717/peerj-cs.1934
M3 - Article
AN - SCOPUS:85190380615
SN - 2376-5992
VL - 10
JO - PeerJ Computer Science
JF - PeerJ Computer Science
M1 - e1934
ER -