Abstract
Handwriting recognition systems rely on predefined dictionaries. Small, static dictionaries are often used to obtain high in-vocabulary (IV) accuracy at the expense of coverage, so out-of-vocabulary (OOV) words are not recognized efficiently. To improve OOV recognition while keeping IV dictionaries small, we introduce a multi-step approach that exploits web resources. After an IV-OOV classification step, Wikipedia is used to create dynamic dictionaries adapted to each OOV sequence. A second decoding pass is then performed with the dynamic dictionary to determine the most probable word for the OOV sequence. We validate our approach with experiments conducted on the RIMES dataset using a BLSTM recognizer. Results show improvements over handwriting recognition with a static dictionary.
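The multi-step flow described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the authors' implementation: the confidence threshold used for IV-OOV classification, the hard-coded `WIKI_TEXT` string standing in for text retrieved from Wikipedia, and the use of string similarity (`difflib.SequenceMatcher`) as a stand-in for a second BLSTM decoding pass. All function names are hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical small static in-vocabulary (IV) dictionary.
STATIC_DICTIONARY = {"bonjour", "madame", "monsieur", "lettre"}

# Stand-in for text retrieved from Wikipedia (assumption: no web access here).
WIKI_TEXT = "Le facteur distribue le courrier dans la ville de Strasbourg"


def similarity(a, b):
    """String similarity in [0, 1]; a toy proxy for a recognizer score."""
    return SequenceMatcher(None, a, b).ratio()


def is_oov(hypothesis, confidence, threshold=0.5):
    """Toy IV-OOV classifier: low confidence or absence from the static dictionary."""
    return confidence < threshold or hypothesis not in STATIC_DICTIONARY


def build_dynamic_dictionary(hypothesis, text, min_similarity=0.5):
    """Keep words from the retrieved text that resemble the first-pass hypothesis."""
    words = {w.lower() for w in text.split()}
    return {w for w in words if similarity(hypothesis, w) >= min_similarity}


def second_decoding(hypothesis, dynamic_dictionary):
    """Pick the best-scoring dictionary entry; fall back to the hypothesis if empty."""
    if not dynamic_dictionary:
        return hypothesis
    # sorted() makes tie-breaking deterministic.
    return max(sorted(dynamic_dictionary), key=lambda w: similarity(hypothesis, w))


def recognize(first_pass):
    """first_pass: list of (hypothesis, confidence) pairs from the first decoding."""
    output = []
    for hypothesis, confidence in first_pass:
        if is_oov(hypothesis, confidence):
            dynamic = build_dynamic_dictionary(hypothesis, WIKI_TEXT)
            output.append(second_decoding(hypothesis, dynamic))
        else:
            output.append(hypothesis)
    return output


if __name__ == "__main__":
    print(recognize([("bonjour", 0.9), ("strasbourq", 0.3)]))
```

In this sketch, a confident IV word passes through unchanged, while a word flagged as OOV is re-decoded against a dictionary built only from the retrieved text. A real system would rescore the handwriting image with the recognizer and not compare strings.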
| Field | Value |
|---|---|
| Original language | French |
| Pages (from-to) | 77-96 |
| Number of pages | 20 |
| Journal | Document Numerique |
| Volume | 17 |
| Issue number | 3 |
| DOIs | |
| Publication status | Published - 1 Jan 2014 |
| Externally published | Yes |