TY - GEN
T1 - Using Locally Learnt Word Representations for better Textual Anomaly Detection
AU - Breidenstein, Alicia
AU - Labeau, Matthieu
N1 - Publisher Copyright: © 2024 Association for Computational Linguistics.
PY - 2024/1/1
Y1 - 2024/1/1
AB - The literature on general-purpose textual anomaly detection is quite sparse, as most textual anomaly detection methods are implemented as out-of-domain detection in the context of pre-established classification tasks. Notably, in a field where pre-trained representations and models are in common use, the impact of the pre-training data on a task that lacks supervision has not been studied. In this paper, we use the simple setting of k-classes-out anomaly detection and search for the best pairing of representation and classifier. We show that well-chosen embeddings allow a simple anomaly detection baseline such as OC-SVM to achieve results similar to, and even better than, those of deep state-of-the-art models.
UR - https://www.scopus.com/pages/publications/105000103649
M3 - Conference contribution
AN - SCOPUS:105000103649
T3 - Insights 2024 - 5th Workshop on Insights from Negative Results in NLP, Proceedings of the Workshop
SP - 82
EP - 91
BT - Insights 2024 - 5th Workshop on Insights from Negative Results in NLP, Proceedings of the Workshop
A2 - Tafreshi, Shabnam
A2 - Akula, Arjun Reddy
A2 - Sedoc, Joao
A2 - Drozd, Aleksandr
A2 - Rogers, Anna
A2 - Rumshisky, Anna
PB - Association for Computational Linguistics (ACL)
T2 - 5th Workshop on Insights from Negative Results in NLP, Insights 2024
Y2 - 20 June 2024
ER -