Home Bookmarks Papers Blog

Approximate Sparse Linear Regression $\renewcommand{\Re}{{\rm I\!\hspace{-0.025em} R}} \newcommand{\SetX}{\mathsf{X}} \newcommand{\norm}[1]{\left\|#1\right\|}% \newcommand{\tldO}{{\widetilde{O}}}% \newcommand{\vecA}{\tau} \newcommand{\Matrix}{M}% \newcommand{\rad}{r} \newcommand{\query}{q} \newcommand{\eps}{\varepsilon} \newcommand{\VorX}[1]{\mathcal{V} \pth{#1}} \newcommand{\Polygon}{\mathsf{P}} \newcommand{\IntRange}[1]{[ #1 ]} \newcommand{\Space}{\overline{\mathsf{m}}} \newcommand{\pth}[2][\!]{#1\left({#2}\right)} \newcommand{\polylog}{\mathrm{polylog}}$

Sariel Har-Peled, Piotr Indyk, and Sepideh Mahabadi.
In the Sparse Linear Regression (SLR) problem, given a $d \times n$ matrix $\Matrix$ and a $d$-dimensional query $\query$, we want to compute a $k$-sparse $n$-dimensional vector $\vecA$ such that the error $\norm{\Matrix \vecA-\query}$ is minimized. In this paper, we present data-structures/algorithms and conditional lower bounds for several sparse variants of this problem.

Specifically, we present approximation algorithms for the online variants of all these problems with query time $\tldO(n^{k-1})$. In the "low sparsity regime" where $k$ is small, e.g., $2$ or $3$, this provides a significant improvement over the naive algorithm that exhaustively searches all ${ n \choose k}$ subsets. For $k=d-1$ this matches, up to polylogarithmic factors, the lower bound that relies on the affinely degenerate conjecture (i.e., deciding if $n$ points in $\Re^d$ contains $d+1$ contained in a hyperplane takes $\Omega(n^d)$ time). We show also somewhat weaker lower bounds that relies on Hopcroft's problem, and the $k$-sum problem.

Last modified: Sun 2018-09-02 04:18:07 UTC 2018 by Sariel Har-Peled