Towards Tight Bounds for the Streaming Set Cover Problem
$\renewcommand{\Re}{{\rm I\!\hspace{-0.025em} R}}
\newcommand{\tldOmega}{\widetilde{\Omega}}%
\newcommand{\tldO}{\widetilde{O}}
\newcommand{\SetX}{\mathsf{X}}
\newcommand{\eps}{\varepsilon}
\newcommand{\VorX}[1]{\mathcal{V} \pth{#1}}
\newcommand{\Polygon}{\mathsf{P}}
\newcommand{\IntRange}[1]{[ #1 ]}
\newcommand{\Space}{\overline{\mathsf{m}}}
\newcommand{\pth}[2][\!]{#1\left({#2}\right)}$

Sariel Har-Peled, Piotr Indyk, Sepideh Mahabadi, and Ali Vakilian.
We consider the classic SetCover problem in the data stream
model. For $n$ elements and $m$ sets ($m\geq n$) we give an
$O(1/\delta)$-pass algorithm with strongly sub-linear
$\tldO(mn^{\delta})$ space and a logarithmic approximation factor. This
yields a significant improvement over the earlier algorithm of Demaine
et al., which uses an exponentially larger number of passes. We complement
this result by showing that the tradeoff between the number of passes
and space exhibited by our algorithm is tight, at least when the
approximation factor is equal to $1$. Specifically, we show that any
algorithm that computes set cover exactly using $({1 \over
2\delta}-1)$ passes must use $\tldOmega(mn^{\delta})$ space in the
regime of $m=O(n)$. Furthermore, we consider the problem in the
geometric setting where the elements are points in $\Re^2$ and sets
are either discs, axis-parallel rectangles, or fat triangles in the
plane, and show that our algorithm (with a slight modification) uses
the optimal $\tldO(n)$ space to find a logarithmic approximation in
$O(1/\delta)$ passes.
Finally, we show that any randomized one-pass algorithm that
distinguishes between covers of size 2 and 3 must use a linear (i.e.,
$\Omega(mn)$) amount of space. This is the first result showing that
a randomized, approximate algorithm cannot achieve a space bound that
is sublinear in the input size.
This indicates that using multiple passes might be necessary in order
to achieve sub-linear space bounds for this problem while guaranteeing
small approximation factors.
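As an illustrative instantiation of the pass/space tradeoff stated above (the choice $\delta = 1/6$ is an assumption made here for concreteness and does not come from the paper), the upper and lower bounds read:

$$
\begin{aligned}
\text{upper bound:}\quad & O(1/\delta) = O(6) \text{ passes},
  && \tldO(mn^{1/6}) \text{ space, logarithmic approximation},\\
\text{lower bound:}\quad & \frac{1}{2\delta} - 1 = 3 - 1 = 2 \text{ passes},
  && \tldOmega(mn^{1/6}) \text{ space, exact cover, } m = O(n).
\end{aligned}
$$

So, in this regime, an exact algorithm restricted to two passes cannot use asymptotically less space (up to the factors hidden by the tilde notation) than the logarithmic-approximation algorithm uses with a constant number of passes.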
Last modified: Mon Jul 6 13:05:52 CDT 2015