Randomized Algorithms


Date: February 1, 2001

"Wir müssen wissen, wir werden wissen" (We must know, we shall know)

                        --- David Hilbert

Tail Inequalities

The Chernoff Bound

  1. $ X_1, \ldots, X_n$: $ n$ independent Bernoulli trials, where

    $\displaystyle \Pr[ X_i = 1 ] =p_i,$    and $\displaystyle \Pr[ X_i = 0 ] =q_i = 1-p_i.
$

    Each $ X_i$ is known as a Poisson trial.

  2. $ X = \sum_{i=1}^{n} X_i$. $ \mu = E[X] = \sum_i
p_i$.

  3. Question: Probability that $ X > (1+\delta)\mu$?


\begin{theorem}
For any $\delta > 0$,
\begin{displaymath}
\Pr \pbrc{ X > (1+\delta)\mu } <
\pth{\frac{e^{\delta}}{(1+\delta)^{1+\delta}}}^\mu.
\end{displaymath} \theolab{1}
\end{theorem}

Proof.

$\displaystyle \Pr[ X > (1+\delta)\mu ] = \Pr \pbrc{
e^{t X} > e^{t(1+\delta)\mu}},$    for any $ t > 0$.

By Markov's inequality, we have:

$\displaystyle \Pr \pbrc{ X > (1+\delta)\mu } < \frac{E
\pbrc{e^{t X}}}{e^{t(1+\delta)\mu}}
$

On the other hand, by the independence of the $ X_i$'s,

$\displaystyle E[e^{t X} ] = E \pbrc{ e^{t(X_1 + X_2 + \cdots + X_n)} }
= E \pbrc{e^{t X_1}} \cdots E \pbrc{e^{t X_n}},
$

where $ E \pbrc{e^{t X_i}} = p_i e^{t} + (1-p_i) = 1 + p_i(e^{t} - 1)$.

Namely,

$\displaystyle \Pr \pbrc{ X > (1+\delta)\mu } < \frac{\prod_{i=1}^n E
\pbrc{e^{t X_i}}}{e^{t(1+\delta)\mu}} =
\frac{\prod_{i=1}^n \pth{ 1+ p_i(e^{t} -1 ) }}
{e^{t(1+\delta)\mu}}.
$

Let $ y = p_i (e^{t} - 1 )$. We know that $ 1+y < e^y$ (since $ y > 0$). Thus,
\begin{eqnarray*}
\Pr \pbrc{ X > (1+\delta)\mu }
& < & \frac{\prod_{i=1}^n \exp \pth{ p_i(e^{t} -1 ) }}
           {e^{t(1+\delta)\mu}}
  =   \frac{\exp \pth{ \sum_{i=1}^n p_i(e^{t} -1 ) }}
           {e^{t(1+\delta)\mu}} \\
& = & \frac{\exp \pth{ (e^{t} -1 )\sum_{i=1}^n p_i }}
           {e^{t(1+\delta)\mu}}
  =   \frac{\exp \pth{ (e^{t} -1 )\mu }}
           {e^{t(1+\delta)\mu}}
  =   \pth{\frac{\exp \pth{ e^{t} -1 }}
           {e^{t(1+\delta)}}}^\mu \\
& = & \pth{\frac{\exp \pth{ \delta }}
           {(1+\delta)^{(1+\delta)}}}^\mu,
\end{eqnarray*}

if we set $ t = \ln ( 1+\delta)$. $ \qedsymbol$


\begin{defn}
$F^+( \mu, \delta )
= \pbrc{\frac{e^{\delta}}{(1+\delta)^{(1+\delta)}}}^\mu$.
\end{defn}
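A quick numerical sanity check makes the bound concrete. The following Python sketch (the function names are mine, not from the notes) compares the empirical upper tail of a sum of independent Bernoulli trials against $F^+(\mu,\delta)$:

```python
import math
import random

def F_plus(mu: float, delta: float) -> float:
    # F^+(mu, delta) = (e^delta / (1+delta)^(1+delta))^mu
    return (math.exp(delta) / (1 + delta) ** (1 + delta)) ** mu

def empirical_tail(n: int, p: float, delta: float,
                   trials: int = 20000, seed: int = 0) -> float:
    # Monte Carlo estimate of Pr[X > (1+delta) mu] for X a sum of n
    # independent Bernoulli(p) trials, so mu = n p.
    rng = random.Random(seed)
    mu = n * p
    hits = sum(
        sum(rng.random() < p for _ in range(n)) > (1 + delta) * mu
        for _ in range(trials)
    )
    return hits / trials

n, p, delta = 100, 0.3, 0.5
print(empirical_tail(n, p, delta), "<=", F_plus(n * p, delta))
```

For these parameters the empirical tail probability sits well below the Chernoff bound, as expected; the bound is loose but decays exponentially in $\mu$.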


\begin{example}
Arkansas Aardvarks win a game with probability $1/3$.
What is ...
...on converges to its expectation,
and this convergence is exponential.
\end{example}


\begin{exercise}
Prove that for $\delta > 2e -1$, we have
\begin{displaymath}
F^+(\mu,\delta) =
\pth{\frac{e^{\delta}}{(1+\delta)^{(1+\delta)}}}^{\mu} \leq 2^{-(1+\delta)\mu}.
\end{displaymath} \exelab{Chernoff}
\end{exercise}
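The claimed inequality can be spot-checked numerically before proving it. A minimal sketch (helper name is mine) evaluates both sides on a grid of values with $\delta > 2e-1$:

```python
import math

def F_plus(mu: float, delta: float) -> float:
    # F^+(mu, delta) = (e^delta / (1+delta)^(1+delta))^mu
    return (math.exp(delta) / (1 + delta) ** (1 + delta)) ** mu

# Spot-check F^+(mu, delta) <= 2^{-(1+delta) mu} over a grid of
# (mu, delta) pairs, all with delta > 2e - 1.
ok = all(
    F_plus(mu, delta) <= 2.0 ** (-(1 + delta) * mu)
    for mu in (0.5, 1.0, 2.0, 10.0)
    for delta in (2 * math.e - 1 + 0.01, 5.0, 8.0, 20.0)
)
print(ok)  # True
```

Of course, a finite grid is not a proof; the exercise asks for the general argument.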


\begin{theorem}
Under the same assumptions as \theoref{1}, we have
\begin{displaymath}
\Pr[ X < (1-\delta)\mu ] < e^{-\mu\delta^2/2}.
\end{displaymath}\end{theorem}


\begin{defn}
$F^-( \mu, \delta ) = {e^{-\mu\delta^2/2}}$.
\par$\Delta^-(\mu, \eps)$ denotes the value of $\delta$ for which
$F^-(\mu,\delta) = \eps$; namely,
$\Delta^-(\mu,\eps) = \sqrt{2 \ln (1/\eps) / \mu}$.
Similarly, $\Delta^+(\mu,\eps)$ denotes the value of $\delta$ for
which $F^+(\mu,\delta) = \eps$. By \exeref{Chernoff} (when the
right-hand side below exceeds $2e-1$),
\begin{displaymath}
\Delta^+( \mu,\eps) < \frac{\log_2{(1/\eps)}}{\mu} -1
\end{displaymath}\end{defn}
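Under the reading that $\Delta^-(\mu,\eps)$ is the deviation at which $F^-$ drops to $\eps$, it has a simple closed form, since solving $e^{-\mu\delta^2/2} = \eps$ for $\delta$ gives $\delta = \sqrt{2\ln(1/\eps)/\mu}$. A short sketch (names are mine) verifies this:

```python
import math

def F_minus(mu: float, delta: float) -> float:
    # F^-(mu, delta) = e^{-mu delta^2 / 2}
    return math.exp(-mu * delta ** 2 / 2)

def Delta_minus(mu: float, eps: float) -> float:
    # The delta solving F^-(mu, delta) = eps:
    # e^{-mu delta^2 / 2} = eps  =>  delta = sqrt(2 ln(1/eps) / mu).
    return math.sqrt(2 * math.log(1 / eps) / mu)

mu, eps = 50.0, 0.01
d = Delta_minus(mu, eps)
print(abs(F_minus(mu, d) - eps) < 1e-12)  # True
```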

Routing in a Parallel Computer

$ G$: A graph of processors. Packets can be sent on edges.

$ [1, \ldots, N]$: The vertices (i.e., processors) of $ G$.

$ N = 2^n$, and $ G$ is a hypercube. Each processor is labeled by a binary string $ b_1b_2\ldots b_n$.

Question: Given a permutation $ \pi$, how do we route all the packets (the packet at node $ i$ is sent to destination $ d(i) = \pi(i)$) with minimum delay?


\begin{theorem}
For any deterministic oblivious permutation routing
algorithm on a network of $N$ nodes, each of out-degree $n$,
there is a permutation for which the routing of the
permutation takes $\Omega(\sqrt{N/n})$\ time.
\end{theorem}

How do we send a packet? We use bit fixing. Namely, the packet from node $ i$ always moves to the adjacent node that fixes the first bit (scanning left to right) in which the current node differs from the destination string $ d(i)$. For example, a packet from $ (0000)$ going to $ (1101)$ would pass through $ (1000)$, $ (1100)$, $ (1101)$.
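The bit-fixing rule is easy to state in code. A minimal sketch (nodes as bit strings; the function name is mine) reproduces the example route above:

```python
def bit_fixing_route(src: str, dst: str) -> list[str]:
    """Nodes visited after src: scan the bits left to right and,
    at each step, fix the first bit in which the current node
    still differs from the destination."""
    cur = list(src)
    route = []
    for i, bit in enumerate(dst):
        if cur[i] != bit:
            cur[i] = bit
            route.append("".join(cur))
    return route

print(bit_fixing_route("0000", "1101"))  # ['1000', '1100', '1101']
```

Note that the route length is exactly the Hamming distance between source and destination, so it is at most $n$.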

We assume each edge has a FIFO queue. Here is the algorithm:

(i)
Pick a random intermediate destination $ \sigma(i)$ from $ [1, \ldots, N]$. Packet $ v_i$ travels to $ \sigma(i)$.

(ii)
Wait until all the packets arrive at their intermediate destinations.

(iii)
Packet $ v_i$ travels from $ \sigma(i)$ to its destination $ d(i)$.

We analyze only (i), as (iii) follows from the same analysis. Let $ \rho_i$ denote the route taken by $ v_i$ in (i).


\begin{exercise}
Show that once a packet $v_j$\ leaves $\rho_i$, it can not
join it again. Namely, $\rho_i \cap \rho_j$\ is a (possibly
empty) path.
\end{exercise}


\begin{lemma}
Let the route of $v_i$\ follow the sequence of edges
$\rho_i = (e_1, \ldots, e_k)$, and let $S$ be the set of packets
(other than $v_i$) whose routes pass through at least one of
$(e_1, \ldots, e_k)$. Then, the delay incurred by $v_i$ is at most $\vert S\vert$.
\end{lemma}

$ H_{i j}$ - 1 if $ \rho_i, \rho_j$ share an edge, 0 otherwise. The total delay for $ v_i$ is at most $ \sum_{j} H_{i j}$. Note that for a fixed $ i$, the variables $ H_{i1}, \ldots, H_{i N}$ are independent (note, however, that $ H_{11}, \ldots, H_{N N}$ are not independent!). Writing $ \rho_i = (e_1, \ldots, e_k)$, let $ T(e)$ be the number of routes that pass through $ e$.

$\displaystyle \sum_{j=1}^{N} H_{i j} \leq \sum_{j=1}^{k} T(e_j)$    and thus $\displaystyle E \pbrc{ \sum_{j=1}^{N} H_{i j}} \leq E \pbrc{
\sum_{j=1}^{k} T(e_j)}.
$

Because of symmetry, the variables $ T(e)$ have the same distribution for all the edges of $ G$. On the other hand, the expected length of a route is $ n/2$, there are $ N$ routes, and there are $ N n$ (directed) edges. We conclude $ E[ T(e)] =
(N \cdot n/2)/(N n) = 1/2$. Thus

$\displaystyle \mu = E \pbrc{ \sum_{j=1}^{N} H_{i j}} \leq E \pbrc{
\sum_{j=1}^{k} T(e_j)} \leq \frac{n}{2}.
$
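The claim $E[T(e)] = 1/2$ can also be checked by simulation. The sketch below (my own code, using the directed-edge count $Nn$ from above) picks a random intermediate destination for every packet on an $n$-dimensional hypercube; the average load over all $Nn$ directed edges equals the total bit-fixing route length divided by $Nn$, and should come out close to $1/2$:

```python
import random

def route_length(src: int, dst: int) -> int:
    # Under bit fixing, the route length is the Hamming distance.
    return bin(src ^ dst).count("1")

def avg_edge_load(n: int, rng: random.Random) -> float:
    # Average T(e) over the N*n directed edges, for one random
    # choice of intermediate destinations sigma(i).
    N = 2 ** n
    total = sum(route_length(i, rng.randrange(N)) for i in range(N))
    return total / (N * n)

print(avg_edge_load(8, random.Random(1)))  # close to 1/2
```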

By the Chernoff inequality (\exeref{Chernoff}), we have

$\displaystyle \Pr \pbrc{ \sum_{j} H_{i j} > 7n }
=
\Pr \pbrc{ \sum_{j} H_{i j} > (1+\delta) \mu } < 2^{-(1+\delta)\mu} = 2^{-7n} \leq 2^{-6n},
$

where $ (1+\delta)\mu = 7n$. Since $ \mu \leq n/2$, we have $ \delta \geq 13 > 2e - 1$, so the bound of \exeref{Chernoff} applies.

Since there are $ N = 2^n$ packets, by the union bound the probability that some packet incurs a delay larger than $ 7n$ is at most $ N \cdot 2^{-6n} = 2^{-5n}$. Thus, with probability at least $ 1 - 2^{-5n}$, all packets arrive at their temporary destinations with a delay of at most $ 7n$.


\begin{theorem}
Each packet arrives to its destination in $\leq 14n$\ stages,
with probability at least $1 - 1/N$\ (note that this is
very conservative).
\end{theorem>



Sariel Har-Peled 2001-02-01