Algo Scribe


CSE548 & AMS542: Analysis of Algorithms
Lecturer: Rezaul A. Chowdhury
Topic: Randomized Quicksort Analysis
Date: 16 April 2014
Scribe: Atluri Vamsi Krishna

20.1 Chernoff Bounds: Revisited with an Example

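Lower tail:

$$\Pr[X \le (1-\delta)\mu] \le \left(\frac{e^{-\delta}}{(1-\delta)^{(1-\delta)}}\right)^{\mu}$$

$$\Pr[X \le (1-\delta)\mu] \le e^{-\mu\delta^{2}/2}$$

$$\Pr[X \le \mu-\gamma] \le e^{-\gamma^{2}/(2\mu)}$$

Upper tail:

$$\Pr[X \ge (1+\delta)\mu] \le \left(\frac{e^{\delta}}{(1+\delta)^{(1+\delta)}}\right)^{\mu}$$

$$\Pr[X \ge (1+\delta)\mu] \le e^{-\mu\delta^{2}/3}$$

$$\Pr[X \ge \mu+\gamma] \le e^{-\gamma^{2}/(3\mu)}$$

There are Chernoff bounds for the lower tail and for the upper tail. Here $X$ is a sum of independent 0/1 random variables, $\mu = E[X]$, $0 < \delta < 1$, and $0 < \gamma < \mu$.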
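To get a feel for how the tight and simplified forms compare, the bounds can be evaluated numerically. The following is a minimal sketch; the function names `upper_tail_bounds` and `lower_tail_bounds` are ours, not from the lecture, and it assumes $0 < \delta < 1$.

```python
import math

def upper_tail_bounds(mu, delta):
    """Return (tight, simplified) bounds on Pr[X >= (1 + delta) * mu]."""
    tight = (math.exp(delta) / (1 + delta) ** (1 + delta)) ** mu
    simplified = math.exp(-mu * delta ** 2 / 3)
    return tight, simplified

def lower_tail_bounds(mu, delta):
    """Return (tight, simplified) bounds on Pr[X <= (1 - delta) * mu]."""
    tight = (math.exp(-delta) / (1 - delta) ** (1 - delta)) ** mu
    simplified = math.exp(-mu * delta ** 2 / 2)
    return tight, simplified

# Compare the two forms for a few values of mu with delta = 1/2.
for mu in (10, 100, 1000):
    print(mu, upper_tail_bounds(mu, 0.5), lower_tail_bounds(mu, 0.5))
```

In each pair the first value should be no larger than the second: the simplified forms are weaker, but much easier to work with by hand.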

Let us illustrate this with an example.

Q: Suppose we have $n$ bins and we throw balls into them. Each ball lands in a bin chosen uniformly at random. After we throw $n^2$ balls, how many balls are in each bin?

Sol: We have $n$ bins $U_1, U_2, U_3, \ldots, U_n$ and $n^2$ balls. If we throw all $n^2$ balls, how many balls might each bin have?

We will show that, with high probability, no bin has more than $\frac{3n}{2}$ balls or fewer than $\frac{n}{2}$ balls.

Let $b_i$ be the number of balls in bin $i$. We want to show that

$$\frac{n}{2} \le b_i \le \frac{3n}{2}$$

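The expected number of balls in each bin is $n$, since there are $n^2$ balls and $n$ bins.

We will use Chernoff bounds to prove the claim. Fix a bin and create $n^2$ 0/1 random variables, one for each ball $1 \le i \le n^2$:

$$X_i = \begin{cases} 1 & \text{if the } i\text{-th ball lands in the chosen bin} \\ 0 & \text{otherwise} \end{cases}$$

Then

$$\Pr[X_i = 1] = \frac{1}{n} \;\Rightarrow\; E[X_i] = 1 \times \Pr[X_i = 1] + 0 \times \Pr[X_i = 0] = \frac{1}{n}$$

Let $X = \sum_{i=1}^{n^2} X_i$ be the number of balls in the chosen bin. By linearity of expectation,

$$E[X] = \sum_{i=1}^{n^2} E[X_i] = \sum_{i=1}^{n^2} \frac{1}{n} = \frac{n^2}{n} = n = \mu$$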

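We use this $\mu$ in the Chernoff upper-tail bound:

$$\Pr[X \ge (1+\delta)n] \le e^{-\delta^{2} n/3}$$

If $\delta = \frac{1}{2}$:

$$\Pr\left[X \ge \left(1+\tfrac{1}{2}\right)n\right] \le e^{-\left(\frac{1}{2}\right)^{2} n/3}$$

$$\Pr\left[X \ge \tfrac{3}{2}n\right] \le e^{-n/12}$$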

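So, using the series expansion of $e^{n/12}$,

$$\frac{1}{e^{n/12}} = \frac{1}{1 + \left(\frac{n}{12}\right) + \frac{1}{2!}\left(\frac{n}{12}\right)^{2} + \frac{1}{3!}\left(\frac{n}{12}\right)^{3} + \cdots}$$

$$\frac{1}{e^{n/12}} \le \frac{1}{1 + \left(\frac{n}{12}\right) + \frac{1}{2!}\left(\frac{n}{12}\right)^{2}} = O\left(\frac{1}{n^{2}}\right)$$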

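Taking a union bound over the $n$ bins:

$$\Pr\left[X \ge \tfrac{3n}{2} \text{ holds for any of the } n \text{ bins}\right] \le n \times O\left(\frac{1}{n^{2}}\right) = O\left(\frac{1}{n}\right)$$

$$\Pr\left[X < \tfrac{3n}{2} \text{ holds for all bins}\right] \ge 1 - O\left(\frac{1}{n}\right)$$

As $n$ becomes larger and larger, the probability that $X < \frac{3n}{2}$ holds for every bin approaches 1.

So the upper bound on $b_i$ is $\frac{3n}{2}$ with high probability.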
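The union bound can also be evaluated directly, without the $O(1/n^2)$ simplification. The sketch below (the function name `any_bin_overflow_bound` is ours) computes $n \cdot e^{-n/12}$, capped at 1 since it bounds a probability.

```python
import math

def any_bin_overflow_bound(n):
    """Union bound on Pr[some bin receives >= 3n/2 balls]: n * e^(-n/12), capped at 1."""
    return min(1.0, n * math.exp(-n / 12))

for n in (10, 100, 200, 500):
    print(n, any_bin_overflow_bound(n))
```

For small $n$ the bound is vacuous (it exceeds 1 before capping), which is why the statement is phrased for $n$ growing large.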

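We will now prove the lower bound, using the Chernoff lower-tail bound:

$$\Pr[X \le (1-\delta)n] \le e^{-\delta^{2} n/2}$$

If $\delta = \frac{1}{2}$:

$$\Pr\left[X \le \left(1-\tfrac{1}{2}\right)n\right] \le e^{-\left(\frac{1}{2}\right)^{2} n/2}$$

$$\Pr\left[X \le \tfrac{1}{2}n\right] \le e^{-n/8}$$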

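$$\frac{1}{e^{n/8}} = \frac{1}{1 + \left(\frac{n}{8}\right) + \frac{1}{2!}\left(\frac{n}{8}\right)^{2} + \frac{1}{3!}\left(\frac{n}{8}\right)^{3} + \cdots}$$

$$\frac{1}{e^{n/8}} \le \frac{1}{1 + \left(\frac{n}{8}\right) + \frac{1}{2!}\left(\frac{n}{8}\right)^{2}} = O\left(\frac{1}{n^{2}}\right)$$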

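Taking a union bound over the $n$ bins:

$$\Pr\left[X \le \tfrac{n}{2} \text{ holds for any of the } n \text{ bins}\right] \le n \times O\left(\frac{1}{n^{2}}\right) = O\left(\frac{1}{n}\right)$$

$$\Pr\left[X > \tfrac{n}{2} \text{ holds for all bins}\right] \ge 1 - O\left(\frac{1}{n}\right)$$

As $n$ becomes larger and larger, the probability that $X > \frac{n}{2}$ holds for every bin approaches 1.

So the lower bound on $b_i$ is $\frac{n}{2}$ with high probability.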
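The two bounds can be checked empirically with a small simulation. This is only an illustrative sketch: the function name `bin_loads`, the fixed seed, and the choice $n = 200$ are our assumptions, not part of the lecture.

```python
import random
from collections import Counter

def bin_loads(n, seed=0):
    """Throw n*n balls uniformly at random into n bins; return the load of each bin."""
    rng = random.Random(seed)
    counts = Counter(rng.randrange(n) for _ in range(n * n))
    return [counts.get(i, 0) for i in range(n)]

n = 200
loads = bin_loads(n)
# With high probability every load lies between n/2 and 3n/2.
print(min(loads), max(loads), n / 2, 3 * n / 2)
```

In practice the loads concentrate far more tightly around $n$ than the interval $[n/2, 3n/2]$ suggests, since the standard deviation of each load is about $\sqrt{n}$.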


20.2 Expected Time of Randomized Quicksort : Revisited

Randomized Quicksort Algorithm

Input: A set of numbers $S$ (i.e., all numbers are distinct).

Output: The numbers of $S$ sorted in increasing order.

Steps:

1. Pivot Selection: Select a number $x \in S$ uniformly at random.

2. Partition: Compare each number of $S$ with $x$, and determine the sets $S_l = \{y \in S \mid y < x\}$ and $S_r = \{y \in S \mid y > x\}$.

3. Recursion: Recursively sort $S_l$ and $S_r$.

4. Output: Output the sorted version of $S_l$, followed by $x$, followed by the sorted version of $S_r$.
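The four steps above can be sketched in Python as follows. This is an illustrative sketch rather than the lecture's own code: the name `rand_qs` and the one-element list `counter` used to accumulate partition comparisons are our choices, and the input is assumed to contain distinct numbers.

```python
import random

def rand_qs(s, rng, counter):
    """Sort a list of distinct numbers with randomized quicksort.

    counter[0] is increased by one for every non-pivot element in each
    partition step, i.e. one counted comparison with the pivot per element.
    """
    if len(s) <= 1:
        return list(s)
    x = rng.choice(s)                        # 1. pivot selection
    rest = [y for y in s if y != x]
    counter[0] += len(rest)                  # 2. partition: compare each element with x
    s_l = [y for y in rest if y < x]
    s_r = [y for y in rest if y > x]
    # 3. recursion and 4. output
    return rand_qs(s_l, rng, counter) + [x] + rand_qs(s_r, rng, counter)

rng = random.Random(1)
data = rng.sample(range(10000), 500)
counter = [0]
result = rand_qs(data, rng, counter)
print(result[:5], counter[0])
```

The value left in `counter[0]` corresponds to the total number of partition comparisons over all recursive calls.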

Observations: If $|S| = n$ and $X$ is the total number of comparisons made in step 2 (Partition) across all (original and recursive) calls to RAND QS, then RAND QS sorts $S$ in $O(n + X)$ time. So all we need to do is determine $E[X]$.

Let $s_1, s_2, s_3, \ldots, s_n$ be the elements of $S$ in sorted order. Let $S_{ij} = \{s_i, s_{i+1}, \ldots, s_j\}$ for all $1 \le i < j \le n$.

For $1 \le i < j \le n$, let

$$X_{ij} = \begin{cases} 1 & \text{if } s_i \text{ is compared to } s_j \\ 0 & \text{otherwise} \end{cases}$$

Then

$$X = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} X_{ij}$$

By linearity of expectation,

$$E[X] = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} E[X_{ij}] = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \Pr[X_{ij} = 1]$$

Observations:

$X_{ij} = 0$: Once a pivot $x$ with $s_i < x < s_j$ is chosen, $s_i$ and $s_j$ will never be compared at any subsequent time.

$X_{ij} = 1$: If either $s_i$ or $s_j$ is chosen as a pivot before any other item in $S_{ij}$, then $s_i$ will be compared with $s_j$. If $X_{ij} = 1$, then $s_i$ and $s_j$ were in the same segment when the comparison was made.

Let $m$ be the length of the segment in which both $s_i$ and $s_j$ lie. Then $\Pr(s_i \text{ is chosen as pivot}) = \frac{1}{m}$. Since $s_i$ and $s_j$ are in the same segment, the segment contains all of $S_{ij}$, so $m \ge j - i + 1$.

$\therefore \Pr(s_i) \le \frac{1}{j-i+1}$, and likewise $\Pr(s_j) \le \frac{1}{j-i+1}$.


$$\Pr[X_{ij} = 1] \le \Pr(s_i) + \Pr(s_j) \le \frac{1}{j-i+1} + \frac{1}{j-i+1} = \frac{2}{j-i+1}$$

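Hence,

$$E[X] = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \Pr[X_{ij} = 1]$$

$$\le \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \frac{2}{j-i+1}$$

$$= \sum_{i=1}^{n-1} \sum_{k=1}^{n-i} \frac{2}{k+1}$$

$$\le \sum_{i=1}^{n-1} \sum_{k=1}^{n} \frac{2}{k}$$

$$= \sum_{i=1}^{n-1} O(\log n)$$

$$= O(n \log n)$$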
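As a sanity check on the derivation, the double sum $\sum_{i<j} \frac{2}{j-i+1}$ can be evaluated directly and compared with $2nH_n$, where $H_n$ is the $n$-th harmonic number. The sketch below uses our own helper names `pair_sum` and `harmonic`; the closed form $2(n+1)H_n - 4n$ used in the comparison is a standard identity for this sum and is not taken from the lecture.

```python
def pair_sum(n):
    """Evaluate sum over 1 <= i < j <= n of 2 / (j - i + 1)."""
    return sum(2.0 / (j - i + 1)
               for i in range(1, n)
               for j in range(i + 1, n + 1))

def harmonic(n):
    """n-th harmonic number H_n = 1 + 1/2 + ... + 1/n."""
    return sum(1.0 / k for k in range(1, n + 1))

for n in (2, 10, 100):
    closed_form = 2 * (n + 1) * harmonic(n) - 4 * n
    print(n, pair_sum(n), closed_form, 2 * n * harmonic(n))
```

Since $H_n = \ln n + O(1)$, the last column grows as $O(n \log n)$, matching the bound above.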

20.3 High Probability Bound on Rand QS

Input: A set of numbers $S$ (i.e., all numbers are distinct).

Output: The numbers of $S$ sorted in increasing order.

Steps:

1. Pivot Selection: Select a number $x \in S$ uniformly at random.

2. Partition: Compare each number of $S$ with $x$, and determine the sets $S_l = \{y \in S \mid y < x\}$ and $S_r = \{y \in S \mid y > x\}$.

3. Recursion: Recursively sort $S_l$ and $S_r$.

4. Output: Output the sorted version of $S_l$, followed by $x$, followed by the sorted version of $S_r$.

Let us fix an element $z$ in the original input set of size $n$.

We will trace the partition containing $z$ for $c \ln n$ levels of recursion, where $c$ is a constant that will be determined later.

If a partitioning step divides $S$ such that $\frac{|S|}{4} \le |S_l|, |S_r| \le \frac{3|S|}{4}$, then that partition is a balanced partition.

If at any point $z$ is in a partition of size $k$, then after a balanced partitioning step it ends up in a partition of size at most $\left(\frac{3}{4}\right)k$.


Since the input size is $n$, after $\frac{c}{4} \ln n$ balanced partitions, $z$ will end up in a partition of size

$$\le \left(\frac{3}{4}\right)^{\frac{c}{4} \ln n} n = n^{1 - \frac{c}{4} \ln \frac{4}{3}},$$

which is $\le 1$ for $c \ge 14$.

That means if $c \ge 14$, then $z$ ends up in a segment of size 1, i.e., in its final sorted position in the output, after undergoing $\frac{c}{4} \ln n$ balanced partitions.
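The threshold on $c$ comes from requiring the exponent $1 - \frac{c}{4}\ln\frac{4}{3}$ to be at most 0, that is, $c \ge 4/\ln\frac{4}{3}$. A short numeric check, using our own helper name `size_after_balanced`:

```python
import math

# Smallest real c with 1 - (c/4) * ln(4/3) <= 0.
c_min = 4 / math.log(4 / 3)

def size_after_balanced(n, c):
    """Upper bound on the partition size after (c/4) * ln(n) balanced partitions."""
    return n * (3 / 4) ** ((c / 4) * math.log(n))

print(c_min, math.ceil(c_min))
print(size_after_balanced(10**6, 13), size_after_balanced(10**6, 14))
```

The smallest integer satisfying the requirement is `math.ceil(c_min)`, and the bound drops below 1 once $c$ reaches it.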

For $1 \le i \le c \ln n$, let

$$Z_i = \begin{cases} 1 & \text{if the partition at recursion level } i \text{ is balanced;} \\ 0 & \text{otherwise.} \end{cases}$$

A balanced partition is obtained by choosing a pivot with rank between $\frac{k}{4}$ and $\frac{3k}{4}$, where $k$ is the size of the set being partitioned. Since the pivot is chosen uniformly at random from the set, a balancing pivot will be chosen with probability

$$\frac{\frac{3k}{4} - \frac{k}{4}}{k} = \frac{1}{2}$$

Hence $\Pr[Z_i = 1] = \frac{1}{2}$. Thus $E[Z_i] = \Pr[Z_i = 1] = \frac{1}{2}$.

Total number of balanced partitions: $Z = \sum_{i=1}^{c \ln n} Z_i$.

Then $\mu = E[Z] = \sum_{i=1}^{c \ln n} E[Z_i] = \frac{c \ln n}{2}$.

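Now applying the Chernoff lower-tail bound $\Pr[Z \le (1-\delta)\mu] \le e^{-\mu\delta^{2}/2}$ with $\delta = \frac{1}{2}$, so that $(1-\delta)\mu = \frac{c}{4} \ln n$:

$$\Pr\left[Z \le \tfrac{c}{4} \ln n\right] \le e^{-\mu\delta^{2}/2} = e^{-\frac{c}{16} \ln n} = n^{-\frac{c}{16}} = \frac{1}{n^{c/16}}$$

For $c = 32$, we have $\Pr[Z \le 8 \ln n] \le \frac{1}{n^{2}}$.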

This means that the probability that $z$ fails to reach its final sorted position even after $32 \ln n$ levels of recursion is $\le \frac{1}{n^{2}}$.

By the union bound, the probability that at least one of the $n$ input elements fails to reach its final sorted position after $32 \ln n$ levels of recursion is $\le n \times \frac{1}{n^{2}} = \frac{1}{n}$.

$\therefore$ the probability that all $n$ input elements reach their final sorted positions after $32 \ln n$ levels of recursion is $\ge 1 - \frac{1}{n}$.

But observe that the total amount of work done in each level of recursion is $O(n)$. $\therefore$ the total work done in $32 \ln n$ levels of recursion is $O(n \ln n)$.

Hence, with high probability, RAND QS terminates in $O(n \ln n)$ time.
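The depth bound can be observed empirically. The sketch below simulates only the subproblem sizes of randomized quicksort, which is all that determines the recursion depth when keys are distinct. The function name `max_depth`, the explicit stack, the seed, and the trial counts are our assumptions.

```python
import math
import random

def max_depth(n, rng):
    """Number of recursion levels at which randomized quicksort on n distinct keys partitions."""
    deepest = 0
    stack = [(n, 0)]                      # (subproblem size, levels above it)
    while stack:
        size, depth = stack.pop()
        if size <= 1:
            continue                      # nothing left to partition
        deepest = max(deepest, depth + 1)
        left = rng.randrange(size)        # pivot rank: `left` keys are smaller
        stack.append((left, depth + 1))
        stack.append((size - 1 - left, depth + 1))
    return deepest

rng = random.Random(0)
n = 1000
observed = max(max_depth(n, rng) for _ in range(20))
print(observed, 32 * math.log(n))
```

The observed depth is typically far below $32 \ln n$; the constant 32 is chosen to make the Chernoff argument go through, not to be tight.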


20.4 Random Skip Lists

Searching in a sorted Linked List:

We have a sorted linked list, and we want to find out whether an item exists in it.

In the worst case this takes $n$ comparisons, so the search time is $O(n)$.

2-level Linked List:

In a 2-level linked list we divide the elements into $\sqrt{n}$ segments and promote the first element of each segment to the next level. To search for an element, we first search the promoted level. If the element is found there, the search ends; otherwise we continue the search in the lower level, within the segment located by the first pass.


In the worst case it takes $\le 2\sqrt{n}$ comparisons. So its order is $O(\sqrt{n})$.
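A minimal sketch of the 2-level search follows, using a sorted Python list in place of linked nodes and a list of indices as the promoted level. The name `two_level_search` and the decision to count visited nodes (rather than individual key comparisons) are our assumptions.

```python
import math

def two_level_search(items, target):
    """Search a sorted list using a top level that holds the first item of each segment.

    Returns (found, visited), where visited counts the nodes inspected on both levels.
    """
    n = len(items)
    if n == 0:
        return False, 0
    seg = max(1, math.isqrt(n))               # segment length, about sqrt(n)
    top = list(range(0, n, seg))              # indices of the promoted items
    visited = 0
    start = top[0]
    for idx in top:                           # walk the top level
        visited += 1
        if items[idx] > target:
            break
        start = idx                           # last promoted item <= target
    for idx in range(start, min(start + seg, n)):   # walk one segment below
        visited += 1
        if items[idx] == target:
            return True, visited
        if items[idx] > target:
            break
    return False, visited

items = list(range(0, 200, 2))                # n = 100 sorted keys
print(two_level_search(items, 198), two_level_search(items, 7))
```

With $n = 100$ the top level has 10 nodes and each segment has 10 nodes, so no search visits more than $2\sqrt{n} = 20$ nodes. When $n$ is not a perfect square the count can exceed $2\sqrt{n}$ by a small additive constant in this sketch.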

3-level Linked List:

Here we divide the linked list into $\sqrt[3]{n}$ segments. The first level contains $n$ elements, the second level contains $n^{2/3}$ elements, and the third level contains $\sqrt[3]{n}$ elements.

In the worst case a search takes $3\sqrt[3]{n}$ comparisons. So the order is $O(\sqrt[3]{n})$.


Similarly, a search in a $k$-level linked list takes $\le k\sqrt[k]{n}$ time.

(log n)-level Linked List:

For $k = \log n$, a search takes $\le (\log n) \cdot n^{\frac{1}{\log n}} = 2 \log n$ time, since $n^{\frac{1}{\log n}} = 2$ when the logarithm is base 2.

Observations:

1. Let $n_l$ = number of items in level $l$. Then $n_{l+1} = \left\lceil \frac{n_l}{2} \right\rceil$.

2. Let $m_l = n_l - n_{l+1}$ = number of items in level $l$ that have not reached level $l+1$. Then $m_l \approx \frac{n}{2^{l}}$.
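The effect of the number of levels on the bound $k\sqrt[k]{n}$ can be tabulated directly. In this sketch the helper name `k_level_bound` and the choice $n = 2^{20}$ are ours.

```python
import math

def k_level_bound(n, k):
    """Worst-case search bound k * n^(1/k) for a k-level linked list."""
    return k * n ** (1.0 / k)

n = 2 ** 20
for k in (1, 2, 3, int(math.log2(n))):
    print(k, k_level_bound(n, k))
```

For $k = \log_2 n = 20$ the bound is $2 \log_2 n = 40$ up to floating-point rounding, compared with $2^{20}$ for a plain list and $2 \cdot 2^{10}$ for two levels.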

Random Skip List

In a random skip list we insert a sentinel at the start of the list. We then promote each non-sentinel item of level $l > 0$ to level $l + 1$ with probability $\frac{1}{2}$. If level $l + 1$ is nonempty, we promote the sentinel too.
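The promotion rule can be simulated to see how many levels typically arise. This sketch tracks only the number of non-sentinel items per level; it does not build the linked structure or model the sentinel. The function name `skip_list_levels` and the seed are our assumptions.

```python
import math
import random

def skip_list_levels(n, rng):
    """Return the number of non-sentinel items on each nonempty level.

    Each item on a level is promoted to the next level independently
    with probability 1/2.
    """
    levels = [n]
    while levels[-1] > 0:
        promoted = sum(1 for _ in range(levels[-1]) if rng.random() < 0.5)
        levels.append(promoted)
    return levels[:-1]                        # drop the final empty level

rng = random.Random(0)
n = 1024
levels = skip_list_levels(n, rng)
print(len(levels), math.log2(n), levels)
```

The number of nonempty levels usually lands within a small additive constant of $\log_2 n$, since each level holds roughly half the items of the level below it.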


The height of the structure is $O(\log n)$ with high probability.

In the next lecture we will show that a search takes $O(\log n)$ time with high probability.

References

1. Class slides