TY - GEN
T1 - Automatic size and pose homogenization with spatial transformer network to improve and accelerate pediatric segmentation
AU - La Barbera, Giammarco
AU - Gori, Pietro
AU - Boussaid, Haithem
AU - Belucci, Bruno
AU - Delmonte, Alessandro
AU - Goulin, Jeanne
AU - Sarnacki, Sabine
AU - Rouet, Laurence
AU - Bloch, Isabelle
N1 - Publisher Copyright: © 2021 IEEE.
PY - 2021/4/13
Y1 - 2021/4/13
AB - Due to high heterogeneity in pose and size and to the limited amount of available data, segmentation of pediatric images is challenging for deep learning methods. In this work, we propose a new CNN architecture that is pose and scale invariant thanks to the use of a Spatial Transformer Network (STN). Our architecture is composed of three sequential modules that are estimated together during training: (i) a regression module that estimates a similarity matrix to normalize the input image to a reference one; (ii) a differentiable module that finds the region of interest to segment; (iii) a segmentation module, based on the popular UNet architecture, that delineates the object. Unlike the original UNet, which strives to learn a complex mapping, including pose and scale variations, from a finite training dataset, our segmentation module learns a simpler mapping focused on images with normalized pose and size. Furthermore, automatic bounding box detection through the STN saves time and especially memory, while keeping similar performance. We test the proposed method on kidney and renal tumor segmentation in abdominal pediatric CT scans. Results indicate that the estimated STN homogenization of size and pose accelerates the segmentation (25h), compared to standard data augmentation (33h), while obtaining a similar quality for the kidney (88.01% Dice score) and improving the renal tumor delineation (from 85.52% to 87.12%).
KW - Data augmentation
KW - Kidney
KW - Pediatric
KW - Pose size normalization
KW - Renal tumor
KW - STN
KW - Segmentation
U2 - 10.1109/ISBI48211.2021.9434090
DO - 10.1109/ISBI48211.2021.9434090
M3 - Conference contribution
AN - SCOPUS:85107234371
T3 - Proceedings - International Symposium on Biomedical Imaging
SP - 1773
EP - 1776
BT - 2021 IEEE 18th International Symposium on Biomedical Imaging, ISBI 2021
PB - IEEE Computer Society
T2 - 18th IEEE International Symposium on Biomedical Imaging, ISBI 2021
Y2 - 13 April 2021 through 16 April 2021
ER -