CS 497 -- Randomized Algorithms

Sariel Har-Peled

Homework 3, Due 3/6

Collaboration on solving homeworks is strongly encouraged. However, each person must independently write up their own solution. You must list all collaborators on your homework.

You must not cooperate with people you already cooperated with in previous homeworks. Namely, form new groups to work on this exercise.

  1. MR 8.12. Let's analyze the number of random bits needed to implement the operations of a treap. Suppose we pick a priority $p_i$ at random from the unit interval. Then the binary representation of each $p_i$ can be generated as a potentially infinite series of bits that are the outcome of unbiased coin flips. The idea is to generate only as many bits in this sequence as is necessary for resolving comparisons between different priorities. Suppose we have only generated some prefixes of the binary representations of the priorities of the elements in the treap $T$. Now, while inserting an item $y$, we compare its priority $p_y$ to other's priorities to determine how $y$ should be rotated. While comparing $p_y$ to some $p_i$, if their current partial binary representation can resolve the comparison, then we are done. Otherwise, the have the same partial binary representations (upto the length of the shorter of the two) and we keep generating more bits for each until they first differ.
    Compute a tight upper bound on the expected number of coin flips or random bits needed for each priority comparison.

    Compute a tight upper bound on the expected number of bits generated during a single update operation. Warning: multiplying number of priority comparisons by the number of bits per comparison is making an ill-founded independence assumption.

    Generating bits one at a time like this is probably a bad idea in practice. Give a more practical scheme that generates the priorities in advance, using a small number of random bits, given an upper bound $n$ on the treap size.

  2. (Based on MR 8.10 and 8.11) Suppose you are told that for your set of items $S=x_1,\ldots,x_n$, item $x_i$ is going to be accessed $w_i$ times, but in some unknown order. Let $W=\sum w_i$. A data structuring theorem says that (modulo certain assumptions that exclude hashing) your expected time to satisfy all the access requests is

\Omega \left(\sum_i (1+w_i \log W/w_i)\right)

    An alternative way to say this is that when each access requests $x_i$ with probability $p_i$, the expected access time (per access) is

\Omega(-\sum_i p_i \log p_i),

    that is, the entropy of the access distribution. Treaps can be modified to meet this lower bound on access time, as follows. Define a random weighted treap as a treap obtained by choosing priorities for each $x_i \in
S$ as the maximum of $w_i$ independent samples from a continuous (say uniform $[0,1]$) distribution. Maintain the treap as for the unweighted case.

    Suppose that the weights $w_i$ are known in advance. Describe how you can easily generate a priority for item $i$ from the right distribution. Do something more efficient than sampling $w_i$ times from the uniform distribution. You can use an oracle that gives you uniform samples from $[0,1]$, and you can perform infinite precision arithmetic on these samples in unit time.

    Now explain how weighted treaps can be made to adapt to the observed frequency of access of the elements in the treaps and yield the same asymptotic performance. That is, after $w_i$ accesses, item $i$ should have priority for an item of weight $w_i$. Give a solution that does not explicitly keep track of the observed frequency.

    Give a different weight-adapting solution that will use no more random bits than in the case where the frequencies are known in advance.

  3. Show that there is no perfect hash family mapping from $m$ to $n$ of size polynomial in $m$ if $2n \le m
\le 2^{o(n)}$.

  4. The Aethernet corporation has developed an exciting new protocol to let a collection of $m$ machines communicate on a shared wire. Time is divided into fixed slots. In a slot, any machine that wants to send information to another can broadcast it onto the wire. If exactly one request is broadcast in the slot, the broadcast is successful (and that machine finishes). If more than one broadcast is attempted, none are successful.
    Assuming that the number of machines $n$ that wish to broadcast is known, devise a randomized protocol that will, with no communication, let some broadcast succeed with constant probability in a time slot. That is, we want $\Pr[\mbox{some
broadcast succeeds}]=\Omega(1)$.

    Assume that there is a fixed set of $n$ machines, each of which repeatedly tries (in every slot) to send (an infinite sequence of) packets using the protocol from the previous part. What is the expected time until every machine sends its first packet?

    For this subproblem and the remainder, assume that in each time slot, some set of $n$ machines decides to broadcast a packet. However, in each time slot, the set of machines might be different.

    Prove that over a sufficient length of time (number of slots), with high probability every machine broadcasts at least $\Omega(1/n)$ of the times that it wants to (thus achieving a fair share of the bandwidth on the wire). In other words, show that if there are $s$ slots in which $M$ is one of the $n$ machines trying to broadcast, then with high probability, $M$ broadcasts in $\Omega(s/n)$ slots.

    For a competitive advantage, Aethernet believes they can cut costs by using fewer random bits to make broadcast decisions. Assume that each of the $m$ machines has a distinct integer serial number in $1,\ldots,m$ ($m \geq n$). Assume that Aethernet corp can can broadcast to the entire group of machines a single $O(\log m)$-bit random number per slot, but that no other randomness is used. And assume again that $n$ is known. Devise a scheme that achieves the same result as part (a) under this model. That is, in each slot achieve $\Pr[\mbox{some
broadcast succeeds}]=\Omega(1)$. (Optional: how can such sharing of a limited number of random bits per round actually be accomplished in this system model without involving Aethernet corp?)

    In fact, $O(\log m)$ bits are all that is needed, forever! Prove the following. By distributing a single $O(\log m)$-bit value to all the machines when we turn on the system, we can run the above protocol (or something like it) such that any particular machine that tries to broadcast in every slot will, over some polynomial (in $n$) length of time, get to do so in at least $\Omega(1/n)$ of the slots with high probability.

  5. MR 4.13. The set cover problem is the following: given sets $S_1,\ldots,S_n$ over a universe $U$, find a minimum $T \subseteq U$ such that $T$ intersects every $S_i$. An alternative formulation is the following: given a $0$-$1$ matrix $M$, find a $0$-$1$ vector $c$ with the fewest possible $1$'s such that the dot product of $c$ with every row of $M$ is positive. The matrix $M$ has $n$ rows, and the $i^{th}$ row is the incidence vector for set $S_i$.

    Given matrix $M$, let $C(M)$ be the size of the smallest set cover, and let $n$ be the number of rows in $M$. Show we can adapt the linear-programming/randomized rounding technique to get an $O(\log n)$ approximation to the minimum set cover.

Sariel Har-Peled 2001-02-20
Home Bookmarks Papers Blog