TY - GEN
T1 - Polysemy in Spoken Conversations and Written Texts
AU - Soler, Aina Garí
AU - Labeau, Matthieu
AU - Clavel, Chloé
N1 - Publisher Copyright:
© European Language Resources Association (ELRA), licensed under CC-BY-NC-4.0.
PY - 2022/1/1
Y1 - 2022/1/1
N2 - Our discourses are full of potential lexical ambiguities, due in part to the pervasive use of words having multiple senses. Sometimes, one word may even be used in more than one sense throughout a text. But, to what extent is this true for different kinds of texts? Does the use of polysemous words change when a discourse involves two people, or when speakers have time to plan what to say? We investigate these questions by comparing the polysemy level of texts of different nature, with a focus on spontaneous spoken dialogs; unlike previous work which examines solely scripted, written, monolog-like data. We compare multiple metrics that presuppose different conceptualizations of text polysemy, i.e., they consider the observed or the potential number of senses of words, or their sense distribution in a discourse. We show that the polysemy level of texts varies greatly depending on the kind of text considered, with dialog and spoken discourses having generally a higher polysemy level than written monologs. Additionally, our results emphasize the need for relaxing the popular “one sense per discourse” hypothesis.
AB - Our discourses are full of potential lexical ambiguities, due in part to the pervasive use of words having multiple senses. Sometimes, one word may even be used in more than one sense throughout a text. But, to what extent is this true for different kinds of texts? Does the use of polysemous words change when a discourse involves two people, or when speakers have time to plan what to say? We investigate these questions by comparing the polysemy level of texts of different nature, with a focus on spontaneous spoken dialogs; unlike previous work which examines solely scripted, written, monolog-like data. We compare multiple metrics that presuppose different conceptualizations of text polysemy, i.e., they consider the observed or the potential number of senses of words, or their sense distribution in a discourse. We show that the polysemy level of texts varies greatly depending on the kind of text considered, with dialog and spoken discourses having generally a higher polysemy level than written monologs. Additionally, our results emphasize the need for relaxing the popular “one sense per discourse” hypothesis.
KW - Document classification/Text categorisation
KW - Semantics
KW - Word Sense Disambiguation
UR - https://www.scopus.com/pages/publications/85144400135
M3 - Conference contribution
AN - SCOPUS:85144400135
T3 - 2022 Language Resources and Evaluation Conference, LREC 2022
SP - 1680
EP - 1690
BT - 2022 Language Resources and Evaluation Conference, LREC 2022
A2 - Calzolari, Nicoletta
A2 - Bechet, Frederic
A2 - Blache, Philippe
A2 - Choukri, Khalid
A2 - Cieri, Christopher
A2 - Declerck, Thierry
A2 - Goggi, Sara
A2 - Isahara, Hitoshi
A2 - Maegaard, Bente
A2 - Mariani, Joseph
A2 - Mazo, Helene
A2 - Odijk, Jan
A2 - Piperidis, Stelios
PB - European Language Resources Association (ELRA)
T2 - 13th International Conference on Language Resources and Evaluation Conference, LREC 2022
Y2 - 20 June 2022 through 25 June 2022
ER -