TY - GEN
T1 - The Locality and Symmetry of Positional Encodings
AU - Chen, Lihu
AU - Varoquaux, Gaël
AU - Suchanek, Fabian M.
N1 - Publisher Copyright:
© 2023 Association for Computational Linguistics.
PY - 2023/1/1
Y1 - 2023/1/1
N2 - Positional Encodings (PEs) are used to inject word-order information into transformer-based language models. While they can significantly enhance the quality of sentence representations, their specific contribution to language models is not fully understood, especially given recent findings that various positional encodings are insensitive to word order. In this work, we conduct a systematic study of positional encodings in Bidirectional Masked Language Models (BERT-style), which complements existing work in three aspects: (1) We uncover the core function of PEs by identifying two common properties, Locality and Symmetry; (2) We show that the two properties are closely correlated with the performances of downstream tasks; (3) We quantify the weakness of current PEs by introducing two new probing tasks, on which current PEs perform poorly. We believe that these results are the basis for developing better PEs for transformer-based language models. The code is available at https://github.com/tigerchen52/locality_symmetry.
AB - Positional Encodings (PEs) are used to inject word-order information into transformer-based language models. While they can significantly enhance the quality of sentence representations, their specific contribution to language models is not fully understood, especially given recent findings that various positional encodings are insensitive to word order. In this work, we conduct a systematic study of positional encodings in Bidirectional Masked Language Models (BERT-style), which complements existing work in three aspects: (1) We uncover the core function of PEs by identifying two common properties, Locality and Symmetry; (2) We show that the two properties are closely correlated with the performances of downstream tasks; (3) We quantify the weakness of current PEs by introducing two new probing tasks, on which current PEs perform poorly. We believe that these results are the basis for developing better PEs for transformer-based language models. The code is available at https://github.com/tigerchen52/locality_symmetry.
UR - https://www.scopus.com/pages/publications/85183309360
U2 - 10.18653/v1/2023.findings-emnlp.955
DO - 10.18653/v1/2023.findings-emnlp.955
M3 - Conference contribution
AN - SCOPUS:85183309360
T3 - Findings of the Association for Computational Linguistics: EMNLP 2023
SP - 14313
EP - 14331
BT - Findings of the Association for Computational Linguistics
PB - Association for Computational Linguistics (ACL)
T2 - 2023 Findings of the Association for Computational Linguistics: EMNLP 2023
Y2 - 6 December 2023 through 10 December 2023
ER -