Minimum Spanning Tree in expected linear time. Epilogue: Top-in card shuffling.

Page 1:

• Minimum Spanning Tree in expected linear time.

• Epilogue: Top-in card shuffling.

Page 2:

The problem

• Input: a connected n-node, m-edge graph G with edge weight w.

• Output: a spanning tree T of G with minimum w(T).

Page 3:

Illustration

[Figure: an example graph with weighted edges.]

Page 4:

Inventor of MST

• Otakar Borůvka
– Czech scientist
– Introduced the problem
– Gave an O(m log n)-time algorithm
– The original paper was written in Czech in 1926.
– The purpose was to efficiently provide electric coverage of Bohemia.

Page 5:

[Figure: map of Bohemia, the western part of the Czech Republic.]

Page 6:

The competition

• Unit-cost RAM model
– O(m) Fredman-Willard (FOCS 1990)

• Deterministic comparison-based algorithms
– O(m log n) Borůvka, Prim, Dijkstra, Kruskal, …
– O(m log log n) Yao (1975), Cheriton-Tarjan (1976)
– O(m β(m, n)) Fredman-Tarjan (1987)
– O(m log β(m, n)) Gabow-Galil-Spencer-Tarjan (1986)
– O(m α(m, n)) Chazelle (JACM 2000)
– O(m) Holy grail

Here β(m, n) = min{ i : log^(i) n ≤ m/n } and α is the inverse Ackermann function.

Page 7:

Today’s Topic

Expected O(m)-time comparison-based algorithm for MST

[Karger-Klein-Tarjan, JACM 1995]

Page 8:

Without loss of generality

• We may assume that all edge weights are distinct.

• Why?
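One common answer to the question above, given here as an assumption since the slide leaves it open, is to break ties by a fixed edge index: compare edges by the pair (weight, index) instead of by weight alone. A minimum spanning tree under this refined order is still a minimum spanning tree under the original weights. The helper name `distinct_keys` is hypothetical.

```python
from typing import List, Tuple

Edge = Tuple[float, int, int]  # (weight, u, v)


def distinct_keys(edges: List[Edge]) -> List[Tuple[float, int]]:
    """Pair each weight with the edge's position so that no two keys compare equal."""
    return [(w, i) for i, (w, _, _) in enumerate(edges)]


# Two edges share weight 2.0, yet their keys are distinct and totally ordered.
edges = [(2.0, 0, 1), (2.0, 1, 2), (1.0, 0, 2)]
keys = distinct_keys(edges)
```

Any comparison-based MST algorithm can then compare keys wherever it would compare weights.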

Page 9:

Warm-up: Fundamental Properties of MST

(a) Cut Property

(b) Cycle Property

(c) Uniqueness Property

Page 10:

Cut Property

[Figure: an edge $uv$ of $T$ and an edge $xy$ of $G$ crossing the cut defined by $uv$.]

Suppose that $T$ is a minimum spanning tree of $G$. Clearly, each edge $uv$ of $T$ divides the nodes of $G$ into two sets, namely the two components of $T - uv$. That is, $uv$ defines a cut for $G$. Then, for any other edge $xy$ of $G$ that is cut by $uv$, i.e., the end points of $xy$ are on different sides of the cut, we have $w(uv) < w(xy)$.

Why?

Page 11:

Cycle Property

For ANY cycle $C$ of $G$, the edge on $C$ with maximum weight cannot be in ANY minimum spanning tree of $G$.

Why?

Page 12:

Uniqueness Property

[Figure: a tree $T$ containing an edge $uv$, together with an edge $xy$ not in $T$.]

$G$ has exactly one minimum spanning tree.

Assume for contradiction that $T$ and $T'$ are two distinct minimum spanning trees of $G$. Let $xy$ be an edge in $T' - T$. Adding $xy$ to $T$ forms a cycle $C$, i.e., $xy$ plus the unique path of $T$ connecting $x$ and $y$. Clearly, for each edge $uv$ of $T$ on $C$, it follows from the cut property that $w(uv) < w(xy)$. Therefore, $xy$ is the heaviest edge on $C$, contradicting the cycle property, since $xy$ belongs to the minimum spanning tree $T'$.

Page 13:

Borůvka’s algorithm

• Repeat the following procedure until the resulting graph becomes a single node.
– For each node u, mark its lightest incident edge.
– Now, the marked edges form a forest F. Add the edges of F into the set of edges to be reported.
– Contract each maximal subtree of F into a single node.
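The procedure above can be sketched in Python as follows. This is a minimal illustration, not the slides' own code: contraction is simulated with a union-find structure (an implementation choice assumed here), nodes are numbered 0..n-1, and the function name `boruvka_mst` is hypothetical. Edge weights are assumed distinct.

```python
from typing import Dict, List, Tuple

Edge = Tuple[float, int, int]  # (weight, u, v); weights assumed distinct


def boruvka_mst(n: int, edges: List[Edge]) -> List[Edge]:
    """Return the MST edges of a connected graph on nodes 0..n-1."""
    parent = list(range(n))

    def find(x: int) -> int:
        # The representative of x plays the role of the contracted node.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    reported: List[Edge] = []
    components = n
    while components > 1:
        # Mark the lightest edge incident to each contracted node.
        lightest: Dict[int, Edge] = {}
        for e in edges:
            ru, rv = find(e[1]), find(e[2])
            if ru == rv:
                continue  # self-loop in the contracted graph
            for r in (ru, rv):
                if r not in lightest or e < lightest[r]:
                    lightest[r] = e
        if not lightest:
            raise ValueError("graph is not connected")
        # Report the marked edges and contract along them.
        for e in set(lightest.values()):
            ru, rv = find(e[1]), find(e[2])
            if ru != rv:
                parent[ru] = rv
                reported.append(e)
                components -= 1
    return reported


example = [(1.0, 0, 1), (2.0, 1, 2), (3.0, 0, 2), (0.5, 2, 3)]
tree = boruvka_mst(4, example)
```

Each pass of the `while` loop is one phase: it scans all edges once and at least halves the number of contracted nodes.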

Page 14:

Illustration

[Figure: an example graph with distinct edge weights between 1 and 5.1.]

Page 15:

Running time = O(m log n)

Why?

Each phase can be done in $O(m)$ time. After each contraction phase, the number of nodes is reduced by at least one half.

Mathematically, the recurrence relation for the worst-case running time $t(m, n)$ is as follows:

$$t(m, n) \le t(m, n/2) + O(m).$$
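To see where the $O(m \log n)$ bound comes from, the recurrence can be unrolled. The following is a sketch under stated assumptions: $t(m, n)$ denotes the worst-case running time (a symbol chosen here), each phase costs at most $cm$ for some constant $c$, and the number of nodes at least halves per phase.

```latex
\begin{aligned}
t(m, n) &\le t(m, n/2) + cm \\
        &\le t(m, n/4) + 2cm \\
        &\;\;\vdots \\
        &\le t(m, 1) + cm \lceil \log_2 n \rceil \;=\; O(m \log n).
\end{aligned}
```

The number of edges is never assumed to shrink here, which is exactly the slack that the random-sampling approach exploits.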

Page 16:

Karger-Klein-Tarjan

The strategy: use random sampling to delete at least a constant fraction of the edges, on average, after each phase of edge contraction.

Mathematically, the recurrence relation for the expected running time $t(m, n)$ becomes something like

$$t(m, n) \le t(cm, n/2) + O(m) \quad \text{for some constant } c < 1,$$

implying that $t(m, n) = O(m)$.

Page 17:

Question:

What edges can be deleted without affecting the optimality of the output tree?

Resort to the cycle property!

Page 18:

T-heavy edges

[Figure: a spanning tree $T$ and an edge $uv$ of $G - T$.]

Let $T$ be any spanning tree of $G$. An edge $uv$ of $G - T$ is $T$-heavy if $uv$ is heavier than any edge on the unique path of $T$ connecting nodes $u$ and $v$.

By the cycle property, a $T$-heavy edge for any spanning tree $T$ of $G$ cannot be in the minimum spanning tree of $G$.
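The definition can be checked directly with a naive sketch: for every non-tree edge, walk the tree path between its end points and compare against the heaviest edge found. This takes O(n) time per edge and is only an illustration; the function names `path_max` and `heavy_edges` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

Edge = Tuple[float, int, int]  # (weight, u, v)


def path_max(tree: List[Edge], u: int, v: int) -> Optional[float]:
    """Heaviest weight on the tree path from u to v (None if u == v or no path exists)."""
    adj: Dict[int, List[Tuple[int, float]]] = defaultdict(list)
    for w, a, b in tree:
        adj[a].append((b, w))
        adj[b].append((a, w))
    # Depth-first search from u; a tree has no cycles, so skipping the parent suffices.
    stack = [(u, None, None)]  # (node, parent, heaviest weight seen on the way from u)
    while stack:
        node, parent, best = stack.pop()
        if node == v:
            return best
        for nxt, w in adj[node]:
            if nxt != parent:
                stack.append((nxt, node, w if best is None else max(best, w)))
    return None


def heavy_edges(tree: List[Edge], edges: List[Edge]) -> List[Edge]:
    """Edges outside the tree that are heavier than every edge on their tree path."""
    in_tree = set(tree)
    result = []
    for e in edges:
        if e in in_tree:
            continue
        heaviest = path_max(tree, e[1], e[2])
        if heaviest is not None and e[0] > heaviest:
            result.append(e)
    return result


tree = [(1.0, 0, 1), (2.0, 1, 2)]
graph = tree + [(3.0, 0, 2), (1.5, 0, 2)]
heavy = heavy_edges(tree, graph)
```

In the example, the edge of weight 3.0 is heavy with respect to the tree, while the parallel edge of weight 1.5 is not, because the tree path between nodes 0 and 2 contains an edge of weight 2.0.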

Page 19:

The Heaviness Lemma

$T$ is the minimum spanning tree of $G$ if and only if each edge of $G - T$ is $T$-heavy.

The if part follows immediately from the cycle property: if every edge of $G - T$ is $T$-heavy, then none of them is in the minimum spanning tree, which must therefore be $T$.

As for the only-if part, let $uv$ be an edge of $G - T$ that is not $T$-heavy. Then the weight of $T$ can be reduced by adding $uv$ and deleting an edge on the path of $T$ connecting $u$ and $v$ that is heavier than $uv$, so $T$ is not minimum.

Page 20:

Illustration

[Figure: the example graph from Page 14, with distinct edge weights between 1 and 5.1.]

Page 21:

Tool 1: Dixon-Rauch-Tarjan

• [SIAM J. Computing 1992]
– Given a spanning tree T of G, it takes (deterministic) O(m) time to output all T-heavy edges of G.

Page 22:

Verifying MST is easier!

• It follows from Dixon-Rauch-Tarjan that verifying whether an input tree T is the minimum spanning tree of G can be done in (deterministic) O(m) time.
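The verification idea can be sketched by applying the heaviness lemma directly. The code below is a naive check that climbs parent pointers for each non-tree edge, taking O(mn) time in the worst case; it is not the linear-time Dixon-Rauch-Tarjan algorithm. The function name `is_mst` and the node numbering 0..n-1 are assumptions.

```python
from typing import Dict, List, Tuple

Edge = Tuple[float, int, int]  # (weight, u, v)


def is_mst(n: int, tree: List[Edge], edges: List[Edge]) -> bool:
    """True if tree spans nodes 0..n-1 and no non-tree edge is lighter than its tree path."""
    if len(tree) != n - 1:
        return False
    adj: Dict[int, List[Tuple[int, float]]] = {i: [] for i in range(n)}
    for w, a, b in tree:
        adj[a].append((b, w))
        adj[b].append((a, w))
    # Root the tree at node 0, recording each node's depth and parent edge.
    depth = {0: 0}
    parent: Dict[int, Tuple[int, float]] = {}
    order = [0]
    for node in order:  # breadth-first: the list grows while it is scanned
        for nxt, w in adj[node]:
            if nxt not in depth:
                depth[nxt] = depth[node] + 1
                parent[nxt] = (node, w)
                order.append(nxt)
    if len(depth) != n:
        return False  # the tree edges do not connect all nodes
    in_tree = set(tree)
    for e in edges:
        if e in in_tree:
            continue
        w, u, v = e
        heaviest = float("-inf")
        while u != v:  # climb from the deeper end point until the two walks meet
            if depth[u] < depth[v]:
                u, v = v, u
            u, pw = parent[u]
            heaviest = max(heaviest, pw)
        if w < heaviest:
            return False  # swapping e in would give a lighter spanning tree
    return True


graph = [(1.0, 0, 1), (2.0, 1, 2), (3.0, 0, 2)]
good = is_mst(3, [(1.0, 0, 1), (2.0, 1, 2)], graph)
bad = is_mst(3, [(1.0, 0, 1), (3.0, 0, 2)], graph)
```

With distinct weights, "not lighter than every path edge" coincides with "T-heavy", so the check matches the lemma.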

Page 23:

Tool 2: A Sampling Lemma

Let $T$ be an arbitrary spanning tree of $G$. Let $R$ be a random sample of the edges in $G - T$. Let $T'$ be the minimum spanning tree of $T \cup R$. Then the expected number of $T'$-heavy edges in $G - T$ is at least […].

We will choose $|R|$ to be […]. The expected number of $T'$-heavy edges, which are disposable, is at least […].

Page 24:

The (recursive) algorithm

Run Borůvka for three phases in $O(m)$ time. Let $G$ be the contracted graph. The number of nodes in $G$ is at most $n/8$.

Randomly sample a subset $R$ of the edges of $G - T$ with $|R| =$ […].

Compute (recursively) the minimum spanning tree $T'$ of $T \cup R$, where $T$ is an arbitrary spanning tree of $G$.

Run Dixon-Rauch-Tarjan in $O(m)$ time to delete all $T'$-heavy edges. The expected number of remaining edges is at most […].

Compute (recursively) the minimum spanning tree for the remaining graph.
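The sample-and-filter steps can be illustrated with one non-recursive round. This is only a sketch under several assumptions: Kruskal's algorithm stands in for both recursive calls, the heavy-edge filter is the naive path walk rather than Dixon-Rauch-Tarjan, the Borůvka phases are skipped, and `sample_size` is a free parameter because the slides' choice of $|R|$ is not recoverable. All function names are hypothetical.

```python
import random
from typing import List, Tuple

Edge = Tuple[float, int, int]  # (weight, u, v); weights assumed distinct


def spanning_tree(n: int, edges: List[Edge]) -> List[Edge]:
    """Scan edges in the given order and keep those that join two components."""
    parent = list(range(n))

    def find(x: int) -> int:
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    kept = []
    for e in edges:
        ru, rv = find(e[1]), find(e[2])
        if ru != rv:
            parent[ru] = rv
            kept.append(e)
    return kept


def mst(n: int, edges: List[Edge]) -> List[Edge]:
    """Kruskal's algorithm, used here as a stand-in for the recursive calls."""
    return spanning_tree(n, sorted(edges))


def path_max(n: int, tree: List[Edge], u: int, v: int) -> float:
    """Heaviest weight on the tree path from u to v (-inf if u == v)."""
    adj = [[] for _ in range(n)]
    for w, a, b in tree:
        adj[a].append((b, w))
        adj[b].append((a, w))
    stack = [(u, -1, float("-inf"))]
    while stack:
        node, par, best = stack.pop()
        if node == v:
            return best
        for nxt, w in adj[node]:
            if nxt != par:
                stack.append((nxt, node, max(best, w)))
    raise ValueError("tree does not connect u and v")


def sample_and_filter(n: int, edges: List[Edge], sample_size: int,
                      rng: random.Random) -> List[Edge]:
    t = spanning_tree(n, edges)                      # arbitrary spanning tree T
    in_t = set(t)
    rest = [e for e in edges if e not in in_t]       # edges of G - T
    r = rng.sample(rest, min(sample_size, len(rest)))
    t_prime = mst(n, t + r)                          # T' = MST of T ∪ R
    in_t_prime = set(t_prime)
    # Keep T' itself plus every edge that is not T'-heavy.
    remaining = [e for e in edges
                 if e in in_t_prime or e[0] < path_max(n, t_prime, e[1], e[2])]
    return mst(n, remaining)                         # MST of the remaining graph


rng = random.Random(0)
n = 7
pairs = [(u, v) for u in range(n) for v in range(u + 1, n)]
weights = list(range(len(pairs)))
rng.shuffle(weights)
graph = [(float(w), u, v) for w, (u, v) in zip(weights, pairs)]
result = sample_and_filter(n, graph, 5, rng)
```

Whatever sample is drawn, the result equals the minimum spanning tree of the whole graph, because deleting $T'$-heavy edges never removes an MST edge; the sample only affects how many edges survive the filter.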

Page 25:

Expected Running Time

The recurrence relation for the expected running time $t(m, n)$ is as follows.

[…]

With simple calculation, we have $t(m, n) = O(m)$.

Page 26:

Comments

• The original sampling lemma, which is slightly more complicated, is due to Karger, Klein, and Tarjan.

• The version we see is due to Timothy Chan [IPL 1998].
– The statement and its proof are both extremely simple!

Page 27:

Chan’s Proof

Pick a random edge $e$ of $G - T$ (independent of $R$). Let $T''$ be the minimum spanning tree of $T \cup R \cup \{e\}$. It suffices to prove that […]

Page 28:

Claim 1: $e$ is not $T'$-heavy $\Rightarrow$ $e \in T''$

Assume that $e$ is not on the minimum spanning tree $T''$. By the heaviness lemma, we know that $e$ is $T''$-heavy. By $e \notin T''$ we also have $T'' = T'$, where $T'$ is the minimum spanning tree of $T \cup R$. Therefore, $e \notin T''$ implies that $e$ is $T'$-heavy.

As a matter of fact, the converse of the above statement also holds. The proof of Chan does not need that direction, though.

Page 29:

Claim 2: […]

Recall that $T''$ is the minimum spanning tree of $T \cup R \cup \{e\}$. We know […]. For any possible value of […], we have that […]

It follows that […]

Page 30:

Shuffling cards

Page 31:

Top-In Shuffling

• Suppose that we are given a deck of n cards.

• Each iteration, we pick the card on top, and then insert it back to the deck at a random position: there are n positions, each with probability 1/n.

Page 32:

Question

How many iterations are required to make the deck random?

Page 33:

$O(n \log n)$ iterations suffice on average

Consider the position of the card $c$ that is initially at the bottom. Clearly, $c$ always goes upward and never goes downward until it becomes the card on top.

Observe that the cards below $c$ are in uniformly random order.

We divide the process into phases. Phase $i$ specifies that $c$ is at the $i$-th position from the bottom. In phase $i$, the top card is inserted below $c$ with probability $i/n$, so the expected number of iterations required to turn phase $i$ into phase $i+1$ is $n/i$.

Page 34:

$O(n \log n)$ iterations suffice on average

As a result, the expected number of iterations required to reach phase $n$ (i.e., card $c$ appearing on top) is

$$\sum_{i=1}^{n-1} \frac{n}{i} = O(n \log n).$$

Now all cards except $c$ are in random positions. Then one more iteration moves $c$ to a random position, too.
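The argument can be tried out with a small simulation. This is a sketch with assumed names (`top_in_shuffle`, `expected_iterations`): the deck is a list whose index 0 is the top, and the shuffle stops right after the original bottom card has reached the top and been reinserted.

```python
import random
from typing import List, Tuple


def top_in_shuffle(n: int, rng: random.Random) -> Tuple[List[int], int]:
    """Shuffle cards 0..n-1 by top-in moves; return the final deck and the number of moves."""
    deck = list(range(n))       # index 0 is the top card
    bottom_card = deck[-1]
    iterations = 0
    while True:
        top = deck.pop(0)
        # n - 1 cards remain, so there are n insertion positions, each with probability 1/n.
        deck.insert(rng.randrange(n), top)
        iterations += 1
        if top == bottom_card:  # the original bottom card was on top and has been reinserted
            return deck, iterations


def expected_iterations(n: int) -> float:
    """Sum of n/i over phases i = 1..n-1, plus the final move of the bottom card."""
    return n * sum(1.0 / i for i in range(1, n)) + 1


deck, moves = top_in_shuffle(52, random.Random(2024))
average = expected_iterations(52)
```

Averaging `moves` over many seeds should approach `expected_iterations(52)`, which grows as $n \ln n$.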