Sequence Graphs Realizations and Ambiguity in Language Models

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

Abstract

Several language models rely on an assumption modeling each local context as a (potentially oriented) bag of words, and have proven to be very efficient baselines. Sequence graphs are the natural structures encoding their information. However, a sequence graph may have several realizations as a sequence, leading to a degree of ambiguity. In this paper, we study such degree of ambiguity from a combinatorial and computational point of view. In particular, we present theoretical properties of sequence graphs. Several combinatorial problems are presented, depending on three levels of generalisation (window size, graph orientation, and weights), that we characterize with new complexity results. We establish different algorithms, including an integer program and a dynamic programming formulation to respectively recognize a sequence graph and to count the number of its distinct realizations.

Original languageEnglish
Title of host publicationComputing and Combinatorics - 27th International Conference, COCOON 2021, Proceedings
EditorsChi-Yeh Chen, Wing-Kai Hon, Ling-Ju Hung, Chia-Wei Lee
PublisherSpringer Science and Business Media Deutschland GmbH
Pages153-163
Number of pages11
ISBN (Print)9783030895426
DOIs
Publication statusPublished - 1 Jan 2021
Event27th International Conference on Computing and Combinatorics, COCOON 2021 - Tainan, Taiwan, Province of China
Duration: 24 Oct 202126 Oct 2021

Publication series

NameLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume13025 LNCS
ISSN (Print)0302-9743
ISSN (Electronic)1611-3349

Conference

Conference27th International Conference on Computing and Combinatorics, COCOON 2021
Country/TerritoryTaiwan, Province of China
CityTainan
Period24/10/2126/10/21

Keywords

  • Combinatorics
  • Complexity class
  • Graphs
  • Inverse problem
  • Sequences

Fingerprint

Dive into the research topics of 'Sequence Graphs Realizations and Ambiguity in Language Models'. Together they form a unique fingerprint.

Cite this