Construction of language models for an handwritten mail reading system

Olivier Morillot, Laurence Likforman-Sulem, Emmanuèle Grosicki

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

This paper presents a system for recognizing unconstrained handwritten mail. Its core is an HMM recognizer that uses trigraphs to model contextual information. The recognizer requires no segmentation into words or characters and works directly at the line level. To take linguistic information into account and improve performance, a language model is introduced. This language model is based on bigrams and is built from the transcriptions of the training documents only. Experiments were conducted with various vocabulary sizes and language models. Word Error Rate and perplexity values are compared to show the benefit of language models tailored to the handwritten mail recognition task.
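As an illustration of the quantities named in the abstract, here is a minimal Python sketch of a bigram language model estimated from transcription lines, together with perplexity and Word Error Rate. It is not the authors' implementation: the add-one smoothing, the `<s>`/`</s>` boundary tokens, whitespace tokenization, and all function names are assumptions made for this example only.

```python
import math
from collections import Counter

def train_bigram(lines):
    """Count bigrams and their histories over whitespace-tokenized lines."""
    histories, bigrams, vocab = Counter(), Counter(), set()
    for line in lines:
        tokens = ["<s>"] + line.split() + ["</s>"]  # assumed boundary markers
        vocab.update(tokens)
        for prev, cur in zip(tokens, tokens[1:]):
            histories[prev] += 1          # how often prev starts a bigram
            bigrams[(prev, cur)] += 1
    return histories, bigrams, len(vocab)

def bigram_prob(prev, cur, histories, bigrams, vocab_size):
    """P(cur | prev) with add-one smoothing (an assumption, not from the paper)."""
    return (bigrams[(prev, cur)] + 1) / (histories[prev] + vocab_size)

def perplexity(lines, histories, bigrams, vocab_size):
    """Perplexity of the model on the given lines: 2 ** (-mean log2 probability)."""
    log_sum, n = 0.0, 0
    for line in lines:
        tokens = ["<s>"] + line.split() + ["</s>"]
        for prev, cur in zip(tokens, tokens[1:]):
            log_sum += math.log2(bigram_prob(prev, cur, histories, bigrams, vocab_size))
            n += 1
    return 2 ** (-log_sum / n)

def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the reference length (non-empty reference assumed)."""
    ref, hyp = reference.split(), hypothesis.split()
    prev_row = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        row = [i]
        for j, h in enumerate(hyp, 1):
            row.append(min(prev_row[j] + 1,              # deletion
                           row[j - 1] + 1,               # insertion
                           prev_row[j - 1] + (r != h)))  # substitution or match
        prev_row = row
    return prev_row[-1] / len(ref)
```

In this sketch, a model trained on mail transcriptions would be expected to give lower perplexity on held-out mail text than one trained on unrelated text, which is the kind of comparison the abstract describes. Real systems typically use a more careful smoothing scheme and explicit handling of out-of-vocabulary words.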

Original language: English
Title of host publication: Proceedings of SPIE-IS and T Electronic Imaging - Document Recognition and Retrieval XIX
Publication status: Published - 27 Feb 2012
Externally published: Yes
Event: Document Recognition and Retrieval XIX - Burlingame, CA, United States
Duration: 25 Jan 2012 to 26 Jan 2012

Publication series

Name: Proceedings of SPIE - The International Society for Optical Engineering
Volume: 8297
ISSN (Print): 0277-786X

Conference

Conference: Document Recognition and Retrieval XIX
Country/Territory: United States
City: Burlingame, CA
Period: 25/01/12 to 26/01/12

Keywords

  • Hidden Markov Models
  • Offline Handwriting recognition
  • handwritten mail
  • language modeling
  • n-grams
  • text-line recognition
