TY - GEN
T1 - Handwritten word preprocessing for database adaptation
AU - Oprean, Cristina
AU - Likforman-Sulem, Laurence
AU - Mokbel, Chafic
PY - 2013/4/10
Y1 - 2013/4/10
AB - Handwriting recognition systems are typically trained on publicly available databases whose data were collected under controlled conditions (image resolution, paper background, noise level, etc.). Since this is often not the case in real-world scenarios, classification performance can degrade when novel data are presented to the word recognition system. To overcome this problem, we present a new approach called database adaptation. It consists of processing one set (training or test) in order to adapt it to the other set (test or training, respectively). Specifically, two kinds of preprocessing are considered: stroke thickness normalization and pixel intensity normalization. The advantage of such an approach is that we can re-use the existing recognition system trained on controlled data. We conduct several experiments with the Rimes 2011 word database and with a real-world database, adapting either the test set or the training set. Results show that training set adaptation achieves better results than test set adaptation, at the cost of a second training stage on the adapted data. Data set adaptation increases accuracy by 2% to 3% in absolute value over no adaptation.
KW - Handwritten word recognition
KW - database adaptation
KW - word preprocessing
UR - https://www.scopus.com/pages/publications/84875849573
U2 - 10.1117/12.2004312
DO - 10.1117/12.2004312
M3 - Conference contribution
AN - SCOPUS:84875849573
SN - 9780819494313
T3 - Proceedings of SPIE - The International Society for Optical Engineering
BT - Proceedings of SPIE-IS&T Electronic Imaging - Document Recognition and Retrieval XX
T2 - Document Recognition and Retrieval XX
Y2 - 5 February 2013 through 7 February 2013
ER -