TY - GEN
T1 - Uncertain version control in open collaborative editing of tree-structured documents
AU - Ba, M. Lamine
AU - Abdessalem, Talel
AU - Senellart, Pierre
PY - 2013/1/1
Y1 - 2013/1/1
N2 - In order to ease content enrichment, exchange, and sharing, web-scale collaborative platforms such as Wikipedia or Google Docs enable unbounded interactions between a large number of contributors, without prior knowledge of their level of expertise and reliability. Version control is then essential for keeping track of the evolution of the shared content and its provenance. In such environments, uncertainty is ubiquitous due to the unreliability of the sources, the incompleteness and imprecision of the contributions, the possibility of malicious editing and acts of vandalism, etc. To handle this uncertainty, we use a probabilistic XML model as a basic component of our version control framework. Each version of a shared document is represented by an XML tree and the whole document, together with its different versions, is modeled as a probabilistic XML document. Uncertainty is evaluated using the probabilistic model and the reliability measure associated with each source, each contributor, or each editing event, resulting in an uncertainty measure on each version and each part of the document. We show that standard version control operations can be implemented directly as operations on the probabilistic XML model; efficiency with respect to deterministic version control systems is demonstrated on real-world datasets.
KW - collaborative work
KW - uncertain data
KW - version control
KW - xml
U2 - 10.1145/2494266.2494277
DO - 10.1145/2494266.2494277
M3 - Conference contribution
AN - SCOPUS:84887354957
SN - 9781450317894
T3 - DocEng 2013 - Proceedings of the 2013 ACM Symposium on Document Engineering
SP - 27
EP - 36
BT - DocEng 2013 - Proceedings of the 2013 ACM Symposium on Document Engineering
PB - Association for Computing Machinery
T2 - 2013 ACM Symposium on Document Engineering, DocEng 2013
Y2 - 10 September 2013 through 13 September 2013
ER -