TY - GEN
T1 - Counting, generating and sampling tree alignments
AU - Chauve, Cedric
AU - Courtiel, Julien
AU - Ponty, Yann
N1 - Publisher Copyright:
© Springer International Publishing Switzerland 2016.
PY - 2016/1/1
Y1 - 2016/1/1
N2 - Pairwise ordered tree alignment are combinatorial objects that appear in RNA secondary structure comparison. However, the usual representation of tree alignments as supertrees is ambiguous, i.e. two distinct supertrees may induce identical sets of matches between identical pairs of trees. This ambiguity is uninformative, and detrimental to any probabilistic analysis. In this work, we consider tree alignments up to equivalence. Our first result is a precise asymptotic enumeration of tree alignments, obtained from a context-free grammar by means of basic analytic combinatorics. Our second result focuses on alignments between two given ordered trees. By refining our grammar to align specific trees, we obtain a decomposition scheme for the space of alignments, and use it to design an efficient dynamic programming algorithm for sampling alignments under the Gibbs-Boltzmann probability distribution. This generalizes existing tree alignment algorithms, and opens the door for a probabilistic analysis of the space of suboptimal RNA secondary structures alignments.
AB - Pairwise ordered tree alignment are combinatorial objects that appear in RNA secondary structure comparison. However, the usual representation of tree alignments as supertrees is ambiguous, i.e. two distinct supertrees may induce identical sets of matches between identical pairs of trees. This ambiguity is uninformative, and detrimental to any probabilistic analysis. In this work, we consider tree alignments up to equivalence. Our first result is a precise asymptotic enumeration of tree alignments, obtained from a context-free grammar by means of basic analytic combinatorics. Our second result focuses on alignments between two given ordered trees. By refining our grammar to align specific trees, we obtain a decomposition scheme for the space of alignments, and use it to design an efficient dynamic programming algorithm for sampling alignments under the Gibbs-Boltzmann probability distribution. This generalizes existing tree alignment algorithms, and opens the door for a probabilistic analysis of the space of suboptimal RNA secondary structures alignments.
KW - Dynamic programming
KW - RNA secondary structure
KW - Tree alignment
U2 - 10.1007/978-3-319-38827-4_5
DO - 10.1007/978-3-319-38827-4_5
M3 - Conference contribution
AN - SCOPUS:84979076084
SN - 9783319388267
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 53
EP - 64
BT - Algorithms for Computational Biology - 3rd International Conference, AlCoB 2016, Proceedings
A2 - Vega-Rodríguez, Miguel A.
A2 - Santander-Jiménez, Sergio
A2 - Botón-Fernández, María
A2 - Martín-Vide, Carlos
PB - Springer Verlag
T2 - 3rd International Conference on Algorithms for Computational Biology, AlCoB 2016
Y2 - 21 June 2016 through 22 June 2016
ER -