Paper by Erik D. Demaine

Reference:
Erik D. Demaine, Stefan Langerman, and Eric Price, “Confluently Persistent Tries for Efficient Version Control”, Algorithmica, volume 57, number 3, 2010, pages 462–483. Special issue of selected papers from 11th Scandinavian Workshop on Algorithm Theory, 2008.

Abstract:
We consider a data-structural problem motivated by version control of a hierarchical directory structure in a system like Subversion. The model is that directories and files can be moved and copied between two arbitrary versions in addition to being added or removed in an arbitrary version. Equivalently, we wish to maintain a confluently persistent trie (where internal nodes represent directories, leaves represent files, and edge labels represent path names), subject to copying a subtree between two arbitrary versions, adding a new child to an existing node, and deleting an existing subtree in an arbitrary version.

Our first data structure represents an n-node degree-Δ trie with O(1) “fingers” in each version while supporting finger movement (navigation) and modifications near the fingers (including subtree copy) in O(lg Δ) time and space per operation. This data structure is essentially a locality-sensitive version of the standard practice (path copying), which costs O(d lg Δ) time and space for modification of a node at depth d and is therefore expensive when performing many deep but nearby updates. Our second data structure supports finger movement in O(lg Δ) time and no space, while modifications take O(lg n) time and space. This data structure is substantially faster for deep updates, i.e., unbalanced tries. Both of these data structures are functional, which is a stronger property than confluent persistence. Without this stronger property, we show how both data structures can be sped up to support movement in O(lg lg Δ), which is essentially optimal. Along the way, we present a general technique for global rebuilding of fully persistent data structures, which is nontrivial because amortization and persistence do not usually mix. In particular, this technique improves the best previous result for fully persistent arrays and obtains the first efficient fully persistent hash table.
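
Illustration:
The following is a minimal sketch of plain path copying, the baseline that the abstract calls the standard practice. It is not either of the paper's data structures. The names `Node`, `lookup`, and `with_subtree` are hypothetical, chosen for this example. As a simplifying assumption, each node stores its children in a Python dict that is copied whole, so a modification at depth d costs O(d Δ) here rather than the O(d lg Δ) obtainable with a balanced-tree child map.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Node:
    # Maps an edge label (a path component) to a child node.
    # Nodes are never mutated after construction; versions share them freely.
    children: Dict[str, "Node"] = field(default_factory=dict)


def lookup(root: Node, path: Tuple[str, ...]) -> Optional[Node]:
    """Follow edge labels from root; return None if the path does not exist."""
    node: Optional[Node] = root
    for label in path:
        node = node.children.get(label)
        if node is None:
            return None
    return node


def with_subtree(root: Node, path: Tuple[str, ...],
                 subtree: Optional[Node]) -> Optional[Node]:
    """Return a new root in which the node at `path` is `subtree`.

    Passing None for `subtree` deletes the node at `path`. Only the nodes
    on the root-to-path chain are copied; everything else is shared with
    the old version, which stays valid and unchanged.
    """
    if not path:
        return subtree
    label, rest = path[0], path[1:]
    child = root.children.get(label)
    if child is None:
        if rest:
            raise KeyError(f"no such directory: {label!r}")
        new_child = subtree              # adding a new child
    else:
        new_child = with_subtree(child, rest, subtree)
    children = dict(root.children)       # O(Δ) copy: the stated simplification
    if new_child is None:
        children.pop(label, None)        # deletion (no-op if already absent)
    else:
        children[label] = new_child
    return Node(children)


# Two versions branch from v1, then a subtree is copied between them.
v1 = with_subtree(Node(), ("src",), Node())
v2 = with_subtree(v1, ("src", "main.c"), Node())   # branch A adds a file
v3 = with_subtree(v1, ("docs",), Node())           # branch B adds a directory
v4 = with_subtree(v3, ("src",), lookup(v2, ("src",)))  # copy src from A into B
v5 = with_subtree(v4, ("docs",), None)             # delete docs in a new version
```

Because nodes are immutable, copying a subtree from one version into another is a single pointer assignment at the destination plus a copy of the path above it, which is what makes this baseline confluently persistent. The cost that the paper targets is visible in `with_subtree`: every modification recopies the whole root-to-node path, even when consecutive updates touch neighboring nodes deep in the trie.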

Comments:
This paper is also available from SpringerLink.

Copyright:
Copyright held by the authors.

Length:
The paper is 18 pages.

Availability:
The paper is available in PostScript (1497k), gzipped PostScript (287k), and PDF (336k).
See information on file formats.

Related papers:
ConfluentTries_SWAT2008 (Confluently Persistent Tries for Efficient Version Control)


See also other papers by Erik Demaine.
These pages are generated automagically from a BibTeX file.
Last updated July 23, 2024 by Erik Demaine.