
Diverse Paraphrasing with Insertion Models for Few-Shot Intent Detection

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

In contrast to classic autoregressive generation, insertion-based models can predict multiple tokens at a time in an order-free way, which makes their generation uniquely controllable: it can be constrained to strictly include an ordered list of tokens. We propose to exploit this feature in a new diverse paraphrasing framework: first, we extract important tokens or keywords in the source sentence; second, we augment them; third, we generate new samples around them by using insertion models. We show that the generated paraphrases are competitive with state-of-the-art autoregressive paraphrasers, not only in diversity but also in quality. We further investigate their potential to create new pseudo-labelled samples for data augmentation, using a meta-learning classification framework, and find equally competitive results. In addition to proving non-autoregressive (NAR) viability for paraphrasing, we contribute our open-source framework as a starting point for further research into controllable NAR generation.
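
The three steps named in the abstract (extract keywords, augment them, generate around them by insertion) can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the authors' framework: the stopword list, the synonym table, and the `propose` lookup that stands in for a trained insertion model are all hypothetical. The sketch only shows why the ordered-keyword constraint holds by construction: tokens are only ever inserted into the slots between existing tokens, never removed or reordered.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical stopword list, for illustration only.
STOPWORDS = {"a", "an", "the", "to", "i", "want", "please", "for", "of"}

def extract_keywords(sentence: str) -> List[str]:
    """Step 1: keep non-stopword tokens, preserving their order."""
    return [t for t in sentence.lower().split() if t not in STOPWORDS]

def augment_keywords(keywords: List[str], synonyms: Dict[str, str]) -> List[str]:
    """Step 2: replace a keyword with a synonym when one is listed."""
    return [synonyms.get(k, k) for k in keywords]

Proposer = Callable[[Optional[str], Optional[str]], Optional[str]]

def insertion_round(tokens: List[str], propose: Proposer) -> List[str]:
    """Step 3, one round: query every slot between neighbours at once.

    `propose(left, right)` returns a token to insert in that slot, or None.
    A slot at the sentence boundary has None as its missing neighbour.
    """
    padded: List[Optional[str]] = [None, *tokens, None]
    out: List[str] = []
    for left, right in zip(padded, padded[1:]):
        if left is not None:
            out.append(left)          # existing tokens are always kept
        new = propose(left, right)
        if new is not None:
            out.append(new)           # insertion happens only inside slots
    return out

def generate(keywords: List[str], propose: Proposer, max_rounds: int = 5) -> List[str]:
    """Repeat insertion rounds until nothing more is inserted."""
    tokens = list(keywords)
    for _ in range(max_rounds):
        new_tokens = insertion_round(tokens, propose)
        if new_tokens == tokens:
            break
        tokens = new_tokens
    return tokens

def contains_in_order(tokens: List[str], keywords: List[str]) -> bool:
    """Check that `keywords` appear in `tokens` as an ordered subsequence."""
    it = iter(tokens)
    return all(k in it for k in keywords)

# Toy stand-in for a trained insertion model: a fixed slot lookup table.
TOY_TABLE: Dict[Tuple[Optional[str], Optional[str]], str] = {
    (None, "reserve"): "please",
    ("reserve", "flight"): "a",
    ("flight", "paris"): "to",
}

def toy_propose(left: Optional[str], right: Optional[str]) -> Optional[str]:
    return TOY_TABLE.get((left, right))

keywords = extract_keywords("I want to book a flight to Paris")
augmented = augment_keywords(keywords, {"book": "reserve"})
paraphrase = generate(augmented, toy_propose)
print(" ".join(paraphrase))
```

In this toy run a single round fills all three slots at once, which mirrors the "multiple tokens at a time" property described in the abstract; a real insertion model would score candidate tokens per slot instead of reading them from a table.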

Original language: English
Title of host publication: Advances in Intelligent Data Analysis XXI - 21st International Symposium on Intelligent Data Analysis, IDA 2023, Proceedings
Editors: Bruno Crémilleux, Sibylle Hess, Siegfried Nijssen
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 65-76
Number of pages: 12
ISBN (Print): 9783031300462
Publication status: Published - 1 Jan 2023
Event: 21st International Symposium on Intelligent Data Analysis, IDA 2023 - Louvain-la-Neuve, Belgium
Duration: 12 Apr 2023 to 14 Apr 2023

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 13876 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 21st International Symposium on Intelligent Data Analysis, IDA 2023
Country/Territory: Belgium
City: Louvain-la-Neuve
Period: 12/04/23 to 14/04/23

Keywords

  • Controllable text generation
  • Deep Learning
  • Insertion models
  • Natural language processing
  • Non-autoregressive
  • Transformers
