TY - GEN
T1 - Multilingual Fake News Detection
T2 - Intelligent Systems Conference, IntelliSys 2024
AU - Chalehchaleh, Razieh
AU - Farahbakhsh, Reza
AU - Crespi, Noel
N1 - Publisher Copyright: © The Author(s), under exclusive license to Springer Nature Switzerland AG 2024.
PY - 2024/1/1
Y1 - 2024/1/1
AB - Amidst the surge in global online news consumption, tackling the escalating challenge of fake news requires a multilingual approach. While extensive research has explored fake news detection from various perspectives, a notable gap persists: the majority of studies concentrate on the English language. This highlights the need for more research focusing on other languages, especially considering the scarcity of available non-English fake news datasets, particularly in low-resource settings. Focused on mBERT, XLM-RoBERTa, and LASER embeddings, this study addresses three key questions. Firstly, it evaluates the efficacy of several multilingual models across languages, highlighting the robust performance of mBERT and XLM-RoBERTa. Secondly, it examines the impact of multilingual and cross-lingual training data, demonstrating the effectiveness of multilingual training, including its potential in zero-shot and transfer learning scenarios. Thirdly, it compares multilingual models with translation-based strategies, revealing the superior performance of the former in multilingual fake news detection. Leveraging two datasets encompassing news in English, Spanish, French, Portuguese, Italian, Hindi, Indonesian, Swahili, and Vietnamese, our research underscores the effectiveness of multilingual approaches, offering valuable insights for future research to combat the global problem of fake news more effectively.
KW - Cross-lingual
KW - Fake news detection
KW - Low-resource
KW - Multilingual
KW - Transfer-learning
KW - Zero-shot
U2 - 10.1007/978-3-031-66428-1_5
DO - 10.1007/978-3-031-66428-1_5
M3 - Conference contribution
AN - SCOPUS:85201079420
SN - 9783031664274
T3 - Lecture Notes in Networks and Systems
SP - 73
EP - 89
BT - Intelligent Systems and Applications - Proceedings of the 2024 Intelligent Systems Conference IntelliSys Volume 2
A2 - Arai, Kohei
PB - Springer Science and Business Media Deutschland GmbH
Y2 - 5 September 2024 through 6 September 2024
ER -