Text line segmentation of historical documents: A survey

Research output: Contribution to journal, Article, peer-review

Abstract

Libraries and various National Archives hold a huge number of historical documents that have not been exploited electronically. Although automatic reading of complete pages remains, in most cases, a long-term objective, tasks such as word spotting, text/image alignment, authentication and extraction of specific fields are in use today. For all these tasks, a major step is document segmentation into text lines. Because of the low quality and the complexity of these documents (background noise, artifacts due to aging, interfering lines), automatic text line segmentation remains an open research field. The objective of this paper is to present a survey of existing methods, developed during the last decade and dedicated to documents of historical interest.

Original language: English
Pages (from-to): 123-138
Number of pages: 16
Journal: International Journal on Document Analysis and Recognition
Volume: 9
Issue number: 2-4
Publication status: Published - 1 Apr 2007
Externally published: Yes

Keywords

  • Handwriting
  • Historical documents
  • Segmentation
  • Survey
  • Text lines
