Towards Cross-Lingual Audio Abuse Detection in Low-Resource Settings with Few-Shot Learning

  • Institut Polytechnique de Paris

Research output: Chapter in a book, report, anthology, or collection; Conference contribution; Peer-reviewed

Abstract

Online abusive content detection, particularly in low-resource settings and within the audio modality, remains underexplored. We investigate the potential of pre-trained audio representations for detecting abusive language in low-resource languages, specifically Indian languages, using Few-Shot Learning (FSL). Leveraging powerful representations from models such as Wav2Vec and Whisper, we explore cross-lingual abuse detection using the ADIMA dataset with FSL. Our approach integrates these representations within the Model-Agnostic Meta-Learning (MAML) framework to classify abusive language in 10 languages. We experiment with various shot sizes (50-200), evaluating the impact of limited data on performance. Additionally, we conduct a feature visualization study to better understand model behaviour. This study highlights the generalization ability of pre-trained models in low-resource scenarios and offers valuable insights into detecting abusive language in multilingual contexts.
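
The meta-learning loop named in the abstract can be illustrated with a minimal sketch. It assumes fixed feature vectors (standing in for Wav2Vec or Whisper embeddings), a linear binary classifier, and the first-order approximation of MAML rather than the full second-order update. All function names, learning rates, and the toy data are illustrative assumptions, not the authors' implementation.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def grad(w, xs, ys):
    """Gradient of the mean binary cross-entropy of a linear classifier.

    w holds one weight per feature followed by a trailing bias term.
    """
    g = [0.0] * len(w)
    for x, y in zip(xs, ys):
        logit = sum(wi * xi for wi, xi in zip(w[:-1], x)) + w[-1]
        err = sigmoid(logit) - y
        for i, xi in enumerate(x):
            g[i] += err * xi
        g[-1] += err
    return [gi / len(xs) for gi in g]


def adapt(w, xs, ys, inner_lr=0.5, steps=1):
    """Inner loop: a few gradient steps on one task's support set."""
    for _ in range(steps):
        g = grad(w, xs, ys)
        w = [wi - inner_lr * gi for wi, gi in zip(w, g)]
    return w


def fomaml_step(w, tasks, inner_lr=0.5, outer_lr=0.1):
    """Outer loop: one first-order MAML update over a batch of tasks.

    Each task is a tuple (support_x, support_y, query_x, query_y).
    """
    meta_g = [0.0] * len(w)
    for sx, sy, qx, qy in tasks:
        fast = adapt(w, sx, sy, inner_lr)
        # First-order shortcut: use the query gradient at the adapted
        # weights and ignore second derivatives through the inner step.
        g = grad(fast, qx, qy)
        meta_g = [m + gi for m, gi in zip(meta_g, g)]
    return [wi - outer_lr * m / len(tasks) for wi, m in zip(w, meta_g)]


def split_episode(xs, ys, shots, rng):
    """Hypothetical helper: `shots` random support examples, rest as query."""
    idx = list(range(len(xs)))
    rng.shuffle(idx)
    support, query = idx[:shots], idx[shots:]
    return ([xs[i] for i in support], [ys[i] for i in support],
            [xs[i] for i in query], [ys[i] for i in query])


# Toy demo on made-up 1-D features (label 1 = abusive, 0 = not abusive).
rng = random.Random(0)
toy_x = [[1.0], [-1.0], [0.8], [-0.9], [1.2], [-1.1]]
toy_y = [1, 0, 1, 0, 1, 0]
tasks = [split_episode(toy_x, toy_y, shots=4, rng=rng) for _ in range(2)]
w = [0.0, 0.0]
for _ in range(20):
    w = fomaml_step(w, tasks)
```

Full MAML differentiates through the inner adaptation step, which in practice calls for an automatic differentiation framework; the first-order variant shown here drops that term to stay dependency-free.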

Original language: English
Title: Main Conference
Editors: Owen Rambow, Leo Wanner, Marianna Apidianaki, Hend Al-Khalifa, Barbara Di Eugenio, Steven Schockaert
Publisher: Association for Computational Linguistics (ACL)
Pages: 5558-5569
Number of pages: 12
ISBN (electronic): 9798891761964
Status: Published - 1 Jan 2025
Event: 31st International Conference on Computational Linguistics, COLING 2025 - Abu Dhabi, United Arab Emirates
Duration: 19 Jan 2025 to 24 Jan 2025

Publication series

Name: Proceedings - International Conference on Computational Linguistics, COLING
ISSN (print): 2951-2093

Conference

Conference: 31st International Conference on Computational Linguistics, COLING 2025
Country/Territory: United Arab Emirates
City: Abu Dhabi
Period: 19/01/25 to 24/01/25
