**Reference**:- Erik D. Demaine, Alejandro López-Ortiz, and J. Ian Munro, “Frequency Estimation of Internet Packet Streams with Limited Space”, Massachusetts Institute of Technology, August 2002.
**Abstract**:-
We consider a router on the Internet analyzing the statistical properties of a
TCP/IP packet stream. A fundamental difficulty with measuring traffic behavior
on the Internet is that there is simply too much data to be recorded for later
analysis, on the order of gigabytes a second. As a result, network routers can
collect only relatively few statistics about the data. The central problem
addressed here is to use the limited memory of routers to determine essential
features of the network traffic stream. A particularly difficult and
representative subproblem is to determine the top
*k*categories to which the most packets belong, for a desired value of*k*and for a given notion of categorization such as the destination IP address.We present an algorithm that deterministically finds (in particular) all categories having a frequency above 1/(

*m*+1) using*m*counters, which we prove is best possible in the worst case. We also present a sampling-based algorithm for the case that packet categories follow an arbitrary distribution, but their order over time is permuted uniformly at random. Under this model, our algorithm identifies flows above a frequency threshold of roughly 1/√*n**m*with high probability, where*m*is the number of counters and*n*is the number of packets observed. This guarantee is not far off from the ideal of identifying all flows (probability 1/*n*), and we prove that it is best possible up to a logarithmic factor. We show that the algorithm ranks the identified flows according to frequency within any desired constant factor of accuracy. **Length**:- The paper is 14 pages.
**Availability**:- The paper is available in PostScript (286k), gzipped PostScript (116k), and PDF (227k).
- See information on file formats.
- [Google Scholar search]
**Related papers**:- NetworkStats_ESA2002 (Frequency Estimation of Internet Packet Streams with Limited Space)

See also other papers by Erik Demaine.

Last updated January 13, 2020 by Erik Demaine.