TY - GEN
T1 - Evidential grammars for image interpretation - application to multimodal traffic scene understanding
AU - Bordes, Jean Baptiste
AU - Davoine, Franck
AU - Xu, Philippe
AU - Denœux, Thierry
PY - 2013/1/1
Y1 - 2013/1/1
N2 - In this paper, an original framework for grammar-based image understanding that handles uncertainty is presented. The method takes as input an over-segmented image, every segment of which has been annotated during a first stage of image classification. Moreover, we assume that for every segment, the output class may be uncertain and represented by a belief function over all the possible classes. Production rules are also assumed to be provided by experts to define the decomposition of a scene into objects, as well as the decomposition of every object into its components. The originality of our framework is to make it possible to deal with uncertainty in the decomposition, which is particularly useful when the relative frequencies of the production rules cannot be estimated properly. As in traditional visual grammar approaches, the goal is to build the "parse graph" of a test image, which is its hierarchical decomposition from the scene to objects and parts of objects, while taking into account the spatial layout. In this paper, we show that the parse graph of an image can be modelled as an evidential network, and we detail a method to apply a bottom-up inference in this network. A consistency criterion is defined for any parse tree, and the search for the optimal interpretation of an image is formulated as an optimization problem. The work was validated on real and publicly available urban driving scene data.
AB - In this paper, an original framework for grammar-based image understanding that handles uncertainty is presented. The method takes as input an over-segmented image, every segment of which has been annotated during a first stage of image classification. Moreover, we assume that for every segment, the output class may be uncertain and represented by a belief function over all the possible classes. Production rules are also assumed to be provided by experts to define the decomposition of a scene into objects, as well as the decomposition of every object into its components. The originality of our framework is to make it possible to deal with uncertainty in the decomposition, which is particularly useful when the relative frequencies of the production rules cannot be estimated properly. As in traditional visual grammar approaches, the goal is to build the "parse graph" of a test image, which is its hierarchical decomposition from the scene to objects and parts of objects, while taking into account the spatial layout. In this paper, we show that the parse graph of an image can be modelled as an evidential network, and we detail a method to apply a bottom-up inference in this network. A consistency criterion is defined for any parse tree, and the search for the optimal interpretation of an image is formulated as an optimization problem. The work was validated on real and publicly available urban driving scene data.
KW - Belief functions
KW - Image understanding
KW - Visual grammars
U2 - 10.1007/978-3-642-39515-4_6
DO - 10.1007/978-3-642-39515-4_6
M3 - Conference contribution
AN - SCOPUS:84880070630
SN - 9783642395147
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 65
EP - 78
BT - Integrated Uncertainty in Knowledge Modelling and Decision Making - International Symposium, IUKM 2013, Proceedings
PB - Springer Verlag
T2 - 2013 International Symposium on Integrated Uncertainty in Knowledge Modelling and Decision Making, IUKM 2013
Y2 - 12 July 2013 through 14 July 2013
ER -