Paper by Erik D. Demaine
- Reference:
- Erik D. Demaine, Alejandro López-Ortiz, and J. Ian Munro, “Experiments on Adaptive Set Intersections for Text Retrieval Systems”, in Proceedings of the 3rd Workshop on Algorithm Engineering and Experiments (ALENEX 2001), Lecture Notes in Computer Science, volume 2153, Washington, DC, January 5–6, 2001, pages 91–104.
- Abstract:
-
In [3] we introduced an adaptive algorithm for computing the intersection of
k sorted sets within a factor of at most 8k comparisons of the
information-theoretic lower bound under a model that deals with an encoding of
the shortest proof of the answer. This adaptive algorithm performs better
for “burstier” inputs than a straightforward worst-case optimal
method. Indeed, we have shown that, subject to a reasonable measure of
instance difficulty, the algorithm adapts optimally up to a constant factor.
This paper explores how this algorithm behaves under actual data
distributions, compared with standard algorithms. We present experiments for
searching 114 megabytes of text from the World Wide Web using 5,000 actual
user queries from a commercial search engine. From the experiments, it is
observed that the theoretically optimal adaptive algorithm is not always the
optimal in practice, given the distribution of WWW text data. We then proceed
to study several improvement techniques for the standard algorithms. These
techniques combine improvements suggested by the observed distribution of the
data as well as the theoretical results from [3]. We perform controlled
experiments on these techniques to determine which ones result in improved
performance, resulting in an algorithm that outperforms existing algorithms in
most cases.
- Comments:
- This paper is also available from the electronic LNCS volume as http://link.springer.de/link/service/series/0558/papers/2153/21530091.pdf.
- Copyright:
- The paper is \copyright Springer-Verlag.
- Length:
- The paper is 14 pages.
- Availability:
- The paper is available in PostScript (1108k) and gzipped PostScript (263k).
- See information on file formats.
- [Google Scholar search]
- Related papers:
- SODA2000 (Adaptive Set Intersections, Unions, and Differences)
See also other papers by Erik Demaine.
These pages are generated automagically from a
BibTeX file.
Last updated July 23, 2024 by
Erik Demaine.