
Handwriting recognition of historical documents with few labeled data

  • University of Balamand
  • Université Paris-Saclay

Research output: Chapter in book, report, anthology, or collection; Conference contribution; Peer-reviewed

Abstract

Historical documents present many challenges for offline handwriting recognition systems, among them the segmentation and labeling steps. Carefully annotated text lines are needed to train a handwritten text recognition (HTR) system. In some scenarios, transcripts are only available at the paragraph level, with no text-line information. In this work, we demonstrate how to train an HTR system with few labeled data. Specifically, we train a deep convolutional recurrent neural network (CRNN) system on only 10% of the manually labeled text-line data from a dataset and propose an incremental training procedure that covers the rest of the data. Performance is further increased by augmenting the training set with specially crafted multi-scale data. We also propose a model-based normalization scheme that accounts for the variability in writing scale at the recognition phase. We apply this approach to the publicly available READ dataset. Our system achieved the second-best result in the ICDAR2017 competition [1].
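
The incremental procedure mentioned in the abstract can be pictured as a self-training loop. The sketch below is only an illustration under stated assumptions, not the authors' published algorithm: the abstract does not say how automatically labeled lines are selected, so the acceptance rule used here (string similarity between the model's prediction and a line of the paragraph-level transcript) is assumed. The names `best_match`, `incremental_training`, `train`, `predict`, and `threshold` are hypothetical, and the recognizer itself is left abstract as two callables.

```python
from difflib import SequenceMatcher
from typing import Any, Callable, List, Sequence, Tuple

Sample = Tuple[str, str]  # (line image identifier, transcript)


def best_match(prediction: str, candidates: Sequence[str]) -> Tuple[str, float]:
    """Return the candidate transcript closest to the prediction, with its similarity."""
    scored = [(SequenceMatcher(None, prediction, c).ratio(), c) for c in candidates]
    score, text = max(scored)
    return text, score


def incremental_training(
    seed: Sequence[Sample],
    unlabeled: Sequence[str],
    paragraph_lines: Sequence[str],
    train: Callable[[List[Sample]], Any],
    predict: Callable[[Any, str], str],
    threshold: float = 0.8,   # assumed acceptance threshold, not from the paper
    max_rounds: int = 5,
) -> Tuple[Any, List[Sample], List[str]]:
    """Grow the training set from a small manually labeled seed.

    Each round, the current model transcribes the remaining lines. A line is
    accepted when its prediction is close enough to some transcript line; the
    transcript text (not the prediction) becomes its label. The model is then
    retrained on the enlarged set.
    """
    labeled = list(seed)
    remaining = list(unlabeled)
    model = train(labeled)
    for _ in range(max_rounds):
        accepted, rejected = [], []
        for line_id in remaining:
            text, score = best_match(predict(model, line_id), paragraph_lines)
            if score >= threshold:
                accepted.append((line_id, text))
            else:
                rejected.append(line_id)
        if not accepted:      # nothing new to learn from: stop early
            break
        labeled.extend(accepted)
        remaining = rejected
        model = train(labeled)
    return model, labeled, remaining
```

In practice `train` would fit the CRNN on the labeled lines and `predict` would decode a line image. Lines that never reach the threshold stay in `remaining` and can be inspected or labeled by hand.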

Original language: English
Title of host publication: Proceedings - 13th IAPR International Workshop on Document Analysis Systems, DAS 2018
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 43-48
Number of pages: 6
ISBN (Electronic): 9781538633465
DOIs
Publication status: Published - 22 June 2018
Externally published: Yes
Event: 13th IAPR International Workshop on Document Analysis Systems, DAS 2018 - Vienna, Austria
Duration: 24 Apr 2018 to 27 Apr 2018

Publication series

Name: Proceedings - 13th IAPR International Workshop on Document Analysis Systems, DAS 2018

Conference

Conference: 13th IAPR International Workshop on Document Analysis Systems, DAS 2018
Country/Territory: Austria
City: Vienna
Period: 24/04/18 to 27/04/18
