13.2: XFastTrie - Searching in Doubly-Logarithmic Time

Last updated
Save as PDF

Page ID: 8491

\( \newcommand{\vecs}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } \)

\( \newcommand{\vecd}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash {#1}}} \)

\( \newcommand{\id}{\mathrm{id}}\) \( \newcommand{\Span}{\mathrm{span}}\)

( \newcommand{\kernel}{\mathrm{null}\,}\) \( \newcommand{\range}{\mathrm{range}\,}\)

\( \newcommand{\RealPart}{\mathrm{Re}}\) \( \newcommand{\ImaginaryPart}{\mathrm{Im}}\)

\( \newcommand{\Argument}{\mathrm{Arg}}\) \( \newcommand{\norm}[1]{\| #1 \|}\)

\( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\)

\( \newcommand{\Span}{\mathrm{span}}\)

\( \newcommand{\id}{\mathrm{id}}\)

\( \newcommand{\Span}{\mathrm{span}}\)

\( \newcommand{\kernel}{\mathrm{null}\,}\)

\( \newcommand{\range}{\mathrm{range}\,}\)

\( \newcommand{\RealPart}{\mathrm{Re}}\)

\( \newcommand{\ImaginaryPart}{\mathrm{Im}}\)

\( \newcommand{\Argument}{\mathrm{Arg}}\)

\( \newcommand{\norm}[1]{\| #1 \|}\)

\( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\)

\( \newcommand{\Span}{\mathrm{span}}\) \( \newcommand{\AA}{\unicode[.8,0]{x212B}}\)

\( \newcommand{\vectorA}[1]{\vec{#1}} % arrow\)

\( \newcommand{\vectorAt}[1]{\vec{\text{#1}}} % arrow\)

\( \newcommand{\vectorB}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } \)

\( \newcommand{\vectorC}[1]{\textbf{#1}} \)

\( \newcommand{\vectorD}[1]{\overrightarrow{#1}} \)

\( \newcommand{\vectorDt}[1]{\overrightarrow{\text{#1}}} \)

\( \newcommand{\vectE}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash{\mathbf {#1}}}} \)

\( \newcommand{\vecs}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } \)

\( \newcommand{\vecd}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash {#1}}} \)

\(\newcommand{\avec}{\mathbf a}\) \(\newcommand{\bvec}{\mathbf b}\) \(\newcommand{\cvec}{\mathbf c}\) \(\newcommand{\dvec}{\mathbf d}\) \(\newcommand{\dtil}{\widetilde{\mathbf d}}\) \(\newcommand{\evec}{\mathbf e}\) \(\newcommand{\fvec}{\mathbf f}\) \(\newcommand{\nvec}{\mathbf n}\) \(\newcommand{\pvec}{\mathbf p}\) \(\newcommand{\qvec}{\mathbf q}\) \(\newcommand{\svec}{\mathbf s}\) \(\newcommand{\tvec}{\mathbf t}\) \(\newcommand{\uvec}{\mathbf u}\) \(\newcommand{\vvec}{\mathbf v}\) \(\newcommand{\wvec}{\mathbf w}\) \(\newcommand{\xvec}{\mathbf x}\) \(\newcommand{\yvec}{\mathbf y}\) \(\newcommand{\zvec}{\mathbf z}\) \(\newcommand{\rvec}{\mathbf r}\) \(\newcommand{\mvec}{\mathbf m}\) \(\newcommand{\zerovec}{\mathbf 0}\) \(\newcommand{\onevec}{\mathbf 1}\) \(\newcommand{\real}{\mathbb R}\) \(\newcommand{\twovec}[2]{\left[\begin{array}{r}#1 \\ #2 \end{array}\right]}\) \(\newcommand{\ctwovec}[2]{\left[\begin{array}{c}#1 \\ #2 \end{array}\right]}\) \(\newcommand{\threevec}[3]{\left[\begin{array}{r}#1 \\ #2 \\ #3 \end{array}\right]}\) \(\newcommand{\cthreevec}[3]{\left[\begin{array}{c}#1 \\ #2 \\ #3 \end{array}\right]}\) \(\newcommand{\fourvec}[4]{\left[\begin{array}{r}#1 \\ #2 \\ #3 \\ #4 \end{array}\right]}\) \(\newcommand{\cfourvec}[4]{\left[\begin{array}{c}#1 \\ #2 \\ #3 \\ #4 \end{array}\right]}\) \(\newcommand{\fivevec}[5]{\left[\begin{array}{r}#1 \\ #2 \\ #3 \\ #4 \\ #5 \\ \end{array}\right]}\) \(\newcommand{\cfivevec}[5]{\left[\begin{array}{c}#1 \\ #2 \\ #3 \\ #4 \\ #5 \\ \end{array}\right]}\) \(\newcommand{\mattwo}[4]{\left[\begin{array}{rr}#1 \amp #2 \\ #3 \amp #4 \\ \end{array}\right]}\) \(\newcommand{\laspan}[1]{\text{Span}\{#1\}}\) \(\newcommand{\bcal}{\cal B}\) \(\newcommand{\ccal}{\cal C}\) \(\newcommand{\scal}{\cal S}\) \(\newcommand{\wcal}{\cal W}\) \(\newcommand{\ecal}{\cal E}\) \(\newcommand{\coords}[2]{\left\{#1\right\}_{#2}}\) \(\newcommand{\gray}[1]{\color{gray}{#1}}\) \(\newcommand{\lgray}[1]{\color{lightgray}{#1}}\) \(\newcommand{\rank}{\operatorname{rank}}\) \(\newcommand{\row}{\text{Row}}\) \(\newcommand{\col}{\text{Col}}\) \(\renewcommand{\row}{\text{Row}}\) \(\newcommand{\nul}{\text{Nul}}\) \(\newcommand{\var}{\text{Var}}\) \(\newcommand{\corr}{\text{corr}}\) \(\newcommand{\len}[1]{\left|#1\right|}\) \(\newcommand{\bbar}{\overline{\bvec}}\) \(\newcommand{\bhat}{\widehat{\bvec}}\) \(\newcommand{\bperp}{\bvec^\perp}\) \(\newcommand{\xhat}{\widehat{\xvec}}\) \(\newcommand{\vhat}{\widehat{\vvec}}\) \(\newcommand{\uhat}{\widehat{\uvec}}\) \(\newcommand{\what}{\widehat{\wvec}}\) \(\newcommand{\Sighat}{\widehat{\Sigma}}\) \(\newcommand{\lt}{<}\) \(\newcommand{\gt}{>}\) \(\newcommand{\amp}{&}\) \(\definecolor{fillinmathshade}{gray}{0.9}\)

The performance of the BinaryTrie structure is not very impressive. The number of elements, \(\mathtt{n}\), stored in the structure is at most \(2^{\mathtt{w}}\), so \(\log \mathtt{n}\le \mathtt{w}\). In other words, any of the comparison-based SSet structures described in other parts of this book are at least as efficient as a BinaryTrie, and are not restricted to only storing integers.

Next we describe the XFastTrie, which is just a BinaryTrie with \(\mathtt{w+1}\) hash tables--one for each level of the trie. These hash tables are used to speed up the \(\mathtt{find(x)}\) operation to \(O(\log \mathtt{w})\) time. Recall that the \(\mathtt{find(x)}\) operation in a BinaryTrie is almost complete once we reach a node, \(\mathtt{u}\), where the search path for \(\mathtt{x}\) would like to proceed to \(\texttt{u.right}\) (or \(\texttt{u.left}\)) but \(\mathtt{u}\) has no right (respectively, left) child. At this point, the search uses \(\texttt{u.jump}\) to jump to a leaf, \(\mathtt{v}\), of the BinaryTrie and either return \(\mathtt{v}\) or its successor in the linked list of leaves. An XFastTrie speeds up the search process by using binary search on the levels of the trie to locate the node \(\mathtt{u}\).

To use binary search, we need a way to determine if the node \(\mathtt{u}\) we are looking for is above a particular level, \(\mathtt{i}\), of if \(\mathtt{u}\) is at or below level \(\mathtt{i}\). This information is given by the highest-order \(\mathtt{i}\) bits in the binary representation of \(\mathtt{x}\); these bits determine the search path that \(\mathtt{x}\) takes from the root to level \(\mathtt{i}\). For an example, refer to Figure \(\PageIndex{1}\); in this figure the last node, \(\mathtt{u}\), on search path for 14 (whose binary representation is 1110) is the node labelled \(11{\star\star}\) at level 2 because there is no node labelled \(111{\star}\) at level 3. Thus, we can label each node at level \(\mathtt{i}\) with an \(\mathtt{i}\)-bit integer. Then, the node \(\mathtt{u}\) we are searching for would be at or below level \(\mathtt{i}\) if and only if there is a node at level \(\mathtt{i}\) whose label matches the highest-order \(\mathtt{i}\) bits of \(\mathtt{x}\).

Figure \(\PageIndex{1}\): Since there is no node labelled \(111\star\), the search path for 14 (1110) ends at the node labelled \(11{\star\star}\).

In an XFastTrie, we store, for each \(\mathtt{i}\in\{0,\ldots,\mathtt{w}\}\), all the nodes at level \(\mathtt{i}\) in a USet, \(\mathtt{t[i]}\), that is implemented as a hash table (Chapter 5). Using this USet allows us to check in constant expected time if there is a node at level \(\mathtt{i}\) whose label matches the highest-order \(\mathtt{i}\) bits of \(\mathtt{x}\). In fact, we can even find this node using \(\mathtt{t[i]\texttt{.}find(x\text{>>>}(w-i))}\)

The hash tables \(\mathtt{t[0]},\ldots,\mathtt{t[w]}\) allow us to use binary search to find \(\mathtt{u}\). Initially, we know that \(\mathtt{u}\) is at some level \(\mathtt{i}\) with \(0\le \mathtt{i}< \mathtt{w}+1\). We therefore initialize \(\mathtt{l}=0\) and \(\mathtt{h}=\mathtt{w}+1\) and repeatedly look at the hash table \(\mathtt{t[i]}\), where \(\mathtt{i}=\lfloor(\mathtt{l+h})/2\rfloor\). If \(\mathtt{t[i]}\) contains a node whose label matches \(\mathtt{x}\)'s highest-order \(\mathtt{i}\) bits then we set \(\mathtt{l=i}\) ( \(\mathtt{u}\) is at or below level \(\mathtt{i}\)); otherwise we set \(\mathtt{h=i}\) ( \(\mathtt{u}\) is above level \(\mathtt{i}\)). This process terminates when \(\mathtt{h-l}\le 1\), in which case we determine that \(\mathtt{u}\) is at level \(\mathtt{l}\). We then complete the \(\mathtt{find(x)}\) operation using \(\texttt{u.jump}\) and the doubly-linked list of leaves.

    T find(T x) {
        int l = 0, h = w+1, ix = it.intValue(x);
        Node v, u = r, q = newNode();
        while (h-l > 1) {
            int i = (l+h)/2;
            q.prefix = ix >>> w-i;
            if ((v = t[i].find(q)) == null) {
                h = i;
            } else {
                u = v;
                l = i;
            }
        }
        if (l == w) return u.x;
        Node pred = (((ix >>> w-l-1) & 1) == 1) 
                 ? u.jump : u.jump.child[0];
        return (pred.child[next] == dummy) 
                     ? null : pred.child[next].x;
    }

Each iteration of the \(\mathtt{while}\) loop in the above method decreases \(\mathtt{h-l}\) by roughly a factor of two, so this loop finds \(\mathtt{u}\) after \(O(\log \mathtt{w})\) iterations. Each iteration performs a constant amount of work and one \(\mathtt{find(x)}\) operation in a USet, which takes a constant expected amount of time. The remaining work takes only constant time, so the \(\mathtt{find(x)}\) method in an XFastTrie takes only \(O(\log\mathtt{w})\) expected time.

The \(\mathtt{add(x)}\) and \(\mathtt{remove(x)}\) methods for an XFastTrie are almost identical to the same methods in a BinaryTrie. The only modifications are for managing the hash tables \(\mathtt{t[0]}\),..., \(\mathtt{t[w]}\). During the \(\mathtt{add(x)}\) operation, when a new node is created at level \(\mathtt{i}\), this node is added to \(\mathtt{t[i]}\). During a \(\mathtt{remove(x)}\) operation, when a node is removed form level \(\mathtt{i}\), this node is removed from \(\mathtt{t[i]}\). Since adding and removing from a hash table take constant expected time, this does not increase the running times of \(\mathtt{add(x)}\) and \(\mathtt{remove(x)}\) by more than a constant factor. We omit a code listing for \(\mathtt{add(x)}\) and \(\mathtt{remove(x)}\) since the code is almost identical to the (long) code listing already provided for the same methods in a BinaryTrie.

The following theorem summarizes the performance of an XFastTrie:

Theorem \(\PageIndex{1}\).

An XFastTrie implements the SSet interface for \(\mathtt{w}\)-bit integers. An XFastTrie supports the operations

\(\mathtt{add(x)}\) and \(\mathtt{remove(x)}\) in \(O(\mathtt{w})\) expected time per operation and
\(\mathtt{find(x)}\) in \(O(\log \mathtt{w})\) expected time per operation.

The space used by an XFastTrie that stores \(\mathtt{n}\) values is \(O(\mathtt{n}\cdot\mathtt{w})\).

Search

Text Color

Text Size

Margin Size

Font Type