JuriBERT: A Masked-Language Model Adaptation for French Legal Text

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Language models have proven to be very useful when adapted to specific domains. Nonetheless, little research has been done on domain-specific BERT models for the French language. In this paper, we focus on creating a language model adapted to French legal text, with the goal of helping law professionals. We conclude that some specific tasks do not benefit from generic language models pre-trained on large amounts of data. We explore the use of smaller architectures in domain-specific sub-languages and their benefits for French legal text. We show that domain-specific pre-trained models can perform better than their generalised equivalents in the legal domain. Finally, we release JuriBERT, a new set of BERT models adapted to the French legal domain.
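The abstract refers to the masked-language-modeling (MLM) objective used to pre-train BERT-style models such as JuriBERT. As an illustration, the sketch below implements the standard BERT masking recipe in plain Python: roughly 15% of token positions are selected, and of those, 80% are replaced with the `[MASK]` token, 10% with a random token, and 10% left unchanged. The token ids (`MASK_ID`, `VOCAB_SIZE`) are placeholder assumptions, not values from the paper.

```python
import random

MASK_ID = 4          # assumed id of the [MASK] token (illustrative)
VOCAB_SIZE = 32000   # assumed vocabulary size (illustrative)
IGNORE = -100        # label value conventionally ignored by the MLM loss

def mask_tokens(token_ids, mask_prob=0.15, rng=None):
    """BERT-style dynamic masking.

    Selects ~mask_prob of positions as prediction targets; of those,
    80% become [MASK], 10% a random token, 10% stay unchanged.
    Returns (inputs, labels), where labels is IGNORE at positions
    that do not contribute to the loss.
    """
    rng = rng or random.Random()
    inputs, labels = [], []
    for tid in token_ids:
        if rng.random() < mask_prob:
            labels.append(tid)          # model must predict the original token
            r = rng.random()
            if r < 0.8:
                inputs.append(MASK_ID)  # 80%: replace with [MASK]
            elif r < 0.9:
                inputs.append(rng.randrange(VOCAB_SIZE))  # 10%: random token
            else:
                inputs.append(tid)      # 10%: keep the original token
        else:
            inputs.append(tid)
            labels.append(IGNORE)
    return inputs, labels
```

During pre-training, the model is trained with cross-entropy on the masked positions only; applying this masking freshly at each epoch ("dynamic masking") lets the model see different corruption patterns of the same legal sentence.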

Original language: English
Title of host publication: Natural Legal Language Processing, NLLP 2021 - Proceedings of the 2021 Workshop
Editors: Nikolaos Aletras, Ion Androutsopoulos, Leslie Barrett, Catalina Goanta, Daniel Preotiuc-Pietro
Publisher: Association for Computational Linguistics (ACL)
Pages: 95-101
Number of pages: 7
ISBN (Electronic): 9781954085985
Publication status: Published - 1 Jan 2021
Externally published: Yes
Event: 3rd Natural Legal Language Processing, NLLP 2021 - Punta Cana, Dominican Republic
Duration: 10 Nov 2021 → …

Publication series

Name: Natural Legal Language Processing, NLLP 2021 - Proceedings of the 2021 Workshop

Conference

Conference: 3rd Natural Legal Language Processing, NLLP 2021
Country/Territory: Dominican Republic
City: Punta Cana
Period: 10/11/21 → …

