Abstract
This paper presents an approach to convolutional neural networks (CNNs) that builds on spatial transformer networks (STNs). The architecture combines a localization CNN and a classification CNN that share most of their weights; we call it a Tied Spatial Transformer Network (TSTN). The localization CNN predicts the best affine transform for the input image, which is then warped according to the predicted parameters and passed through the classification CNN. We conducted initial experiments on the cluttered MNIST dataset, comparing the TSTN against an STN with untied weights and against the classification CNN alone. In both comparisons, the TSTN architecture obtains better results.
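As a rough illustration of the pipeline described in the abstract, the following NumPy sketch runs an image through a shared feature extractor twice: once to predict a 2x3 affine matrix, and once more, after warping, to classify. It is not the authors' implementation. The single linear layer standing in for the tied convolutional layers, the function names (`shared_features`, `tstn_forward`), the weight shapes, and the identity-plus-offset parameterization of the transform are all assumptions made for illustration.

```python
import numpy as np


def affine_grid(theta, height, width):
    """Map every output pixel to source coordinates in [-1, 1] via a 2x3 affine matrix."""
    ys, xs = np.meshgrid(
        np.linspace(-1.0, 1.0, height), np.linspace(-1.0, 1.0, width), indexing="ij"
    )
    coords = np.stack([xs.ravel(), ys.ravel(), np.ones(height * width)])  # (3, H*W)
    src = theta @ coords  # (2, H*W)
    return src[0].reshape(height, width), src[1].reshape(height, width)


def bilinear_sample(image, src_x, src_y):
    """Sample `image` at normalized coordinates; points outside the image read as 0."""
    h, w = image.shape
    x = (src_x + 1.0) * (w - 1) / 2.0  # normalized -> pixel coordinates
    y = (src_y + 1.0) * (h - 1) / 2.0
    x0, y0 = np.floor(x).astype(int), np.floor(y).astype(int)
    wx, wy = x - x0, y - y0

    def at(yy, xx):
        valid = (yy >= 0) & (yy < h) & (xx >= 0) & (xx < w)
        vals = image[np.clip(yy, 0, h - 1), np.clip(xx, 0, w - 1)]
        return np.where(valid, vals, 0.0)

    return (
        (1 - wy) * (1 - wx) * at(y0, x0)
        + (1 - wy) * wx * at(y0, x0 + 1)
        + wy * (1 - wx) * at(y0 + 1, x0)
        + wy * wx * at(y0 + 1, x0 + 1)
    )


def shared_features(image, shared_w):
    """Assumed stand-in for the tied layers: one linear map followed by ReLU."""
    return np.maximum(shared_w @ image.ravel(), 0.0)


def tstn_forward(image, shared_w, loc_w, cls_w):
    """Localize, warp, then classify, reusing `shared_w` in both passes."""
    identity = np.array([1.0, 0.0, 0.0, 0.0, 1.0, 0.0])
    # Localization pass: the head predicts an offset from the identity transform.
    theta = (identity + loc_w @ shared_features(image, shared_w)).reshape(2, 3)
    src_x, src_y = affine_grid(theta, *image.shape)
    warped = bilinear_sample(image, src_x, src_y)
    # Classification pass: the same shared weights process the warped image.
    logits = cls_w @ shared_features(warped, shared_w)
    return theta, warped, logits
```

In this sketch, setting `loc_w` to zeros makes the predicted transform the identity, so the warped image matches the input. Because `shared_w` appears in both passes, a gradient of the classification loss would reach it through the localization path as well as the classification path, which is one plausible reading of what tying the weights means here.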
| Original language | English |
|---|---|
| Pages | 285-294 |
| Number of pages | 10 |
| Publication status | Published - 1 Jan 2016 |
| Externally published | Yes |
| Event | Conference en Recherche d'Informations et Applications, CORIA 2016, 13th French Information Retrieval Conference, CIFED 2016, et Colloque International Francophone sur l'Ecrit et le Document - Conference on Information Retrieval and its Applications, CORIA 2016, 13th French Information Retrieval Conference, CIFED 2016, and International French-Speaking Colloquium on Writing and Documentation - Toulouse, France. Duration: 9 Mar 2016 → 11 Mar 2016 |
Conference
| Conference | Conference en Recherche d'Informations et Applications, CORIA 2016, 13th French Information Retrieval Conference, CIFED 2016, et Colloque International Francophone sur l'Ecrit et le Document - Conference on Information Retrieval and its Applications, CORIA 2016, 13th French Information Retrieval Conference, CIFED 2016, and International French-Speaking Colloquium on Writing and Documentation |
|---|---|
| Country/Territory | France |
| City | Toulouse |
| Period | 9/03/16 → 11/03/16 |
Keywords
- Character recognition
- Convolutional neural network
- Deep learning