A Randomized Linear-Time Algorithm to Find Minimum Spanning Trees


Transcript of A Randomized Linear-Time Algorithm to Find Minimum Spanning Trees

Page 1: A Randomized Linear-Time Algorithm to Find Minimum Spanning Trees

A Randomized Linear-Time Algorithm to Find Minimum Spanning Trees

黃則翰 R96922141, 蘇承祖 R96922077, 張紘睿 R96922136, 許智程 D95922022, 戴于晉 R96922171

David R. Karger, Philip N. Klein, Robert E. Tarjan

Page 2: A Randomized Linear-Time Algorithm to Find Minimum Spanning Trees

Outline
• Introduction
• Basic Property & Definition
• Algorithm
• Analysis

Page 3: A Randomized Linear-Time Algorithm to Find Minimum Spanning Trees

Outline
• Introduction
• Basic Property & Definition
• Algorithm
• Analysis

Page 4: A Randomized Linear-Time Algorithm to Find Minimum Spanning Trees

Introduction
• [Borůvka 1926] O(m log n)
• Gabow et al. [1984] O(m log β(m,n))
◦ β(m,n) = min { i | log^(i) n ≤ m/n }, where log^(i) denotes the logarithm applied i times
• Verification algorithm
◦ King [1993] O(m)
• A randomized algorithm that runs in O(m) time with high probability
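To get a feel for how slowly β(m,n) grows, the definition can be evaluated directly. The sketch below is illustrative only: the function name `beta` is hypothetical, and base-2 logarithms are an assumption, since the slide does not state the base.

```python
import math

def beta(m, n):
    """Smallest i such that applying log2 to n a total of i times gives a value <= m/n.

    Assumes m >= 1 and n >= 1; the base-2 logarithm is an assumption.
    """
    ratio = m / n
    value = float(n)   # log^(0) n = n
    i = 0
    while value > ratio:
        value = math.log2(value)  # value > ratio > 0 here, so the log is defined
        i += 1
    return i

print(beta(16, 16))      # sparse graph, m = n
print(beta(64, 16))      # denser graph, m = 4n
print(beta(1000, 1000))
```

Even for very large sparse graphs the value stays in the single digits, which is why the O(m log β(m,n)) bound is already close to linear.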

Page 5: A Randomized Linear-Time Algorithm to Find Minimum Spanning Trees

Outline
• Introduction
• Basic Property & Definition
• Algorithm
• Analysis

Page 6: A Randomized Linear-Time Algorithm to Find Minimum Spanning Trees

Cycle property
For any cycle C in a graph, the heaviest edge in C does not appear in the minimum spanning forest.

[Figure: example graph with edge weights 2, 3, 5, 6, shown twice]

Page 7: A Randomized Linear-Time Algorithm to Find Minimum Spanning Trees

Cut Property
For any proper nonempty subset X of the vertices, the lightest edge with exactly one endpoint in X belongs to the minimum spanning tree.

[Figure: example graph with edge weights 2, 3, 5, 6 and a vertex subset X]

Page 8: A Randomized Linear-Time Algorithm to Find Minimum Spanning Trees

Definition
• Let G be a graph with weighted edges.
◦ w(x,y): the weight of edge {x,y}
• If F is a forest in a subgraph of G:
◦ F(x,y): the path (if any) connecting x and y in F
◦ wF(x,y): the maximum weight of an edge on F(x,y)
◦ wF(x,y) = ∞ if x and y are not connected in F

Page 9: A Randomized Linear-Time Algorithm to Find Minimum Spanning Trees

F-heavy & F-light
• An edge {x,y} is F-heavy if w(x,y) > wF(x,y), and F-light otherwise.
• Edges of F are all F-light.

[Figure: two example graphs, one on vertices A, B, C, D and one on vertices E, F, G, H, each with edge weights 2, 3, 5, 6]

w(B,D) = 6, wF(B,D) = max{2,3,5} = 5: F-heavy
w(F,H) = 6, wF(F,H) = ∞: F-light
w(C,D) = 5, wF(C,D) = 5: F-light
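The three classifications above can be reproduced with a small sketch. The forest layout used here (A-B of weight 2, A-C of weight 3, C-D of weight 5) is an assumption chosen to match the values max{2,3,5} and wF(C,D) = 5; the original figure is not recoverable from the transcript. The function names are hypothetical, and the path search is a naive one, not the linear-time verification algorithm cited later.

```python
import math
from collections import defaultdict

def max_weight_on_path(forest_edges, x, y):
    """Return wF(x, y): the heaviest edge on the forest path from x to y,
    or math.inf if x and y are not connected in F."""
    adj = defaultdict(list)
    for u, v, w in forest_edges:
        adj[u].append((v, w))
        adj[v].append((u, w))
    # Depth-first search; a forest has at most one path between two vertices.
    stack = [(x, None, -math.inf)]
    while stack:
        node, parent, heaviest = stack.pop()
        if node == y:
            return heaviest
        for nxt, w in adj[node]:
            if nxt != parent:
                stack.append((nxt, node, max(heaviest, w)))
    return math.inf

def classify(forest_edges, x, y, weight):
    """An edge is F-heavy if w(x,y) > wF(x,y), and F-light otherwise."""
    return "F-heavy" if weight > max_weight_on_path(forest_edges, x, y) else "F-light"

# Assumed forest: the path B - A - C - D with weights 2, 3, 5.
forest = [("A", "B", 2), ("A", "C", 3), ("C", "D", 5)]
print(classify(forest, "B", "D", 6))  # F-heavy, since wF(B,D) = 5 < 6
print(classify(forest, "F", "H", 6))  # F-light, since F and H are not connected in F
print(classify(forest, "C", "D", 5))  # F-light, since wF(C,D) = 5
```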

Page 10: A Randomized Linear-Time Algorithm to Find Minimum Spanning Trees

Observation
• No F-heavy edge can be in the minimum spanning forest of G (cycle property).
• Discard the edges that cannot be in the minimum spanning tree.
• F-light edges are the candidate edges for the minimum spanning tree of G.

Page 11: A Randomized Linear-Time Algorithm to Find Minimum Spanning Trees

Outline
• Introduction
• Basic Property & Definition
• Algorithm
• Analysis

Page 12: A Randomized Linear-Time Algorithm to Find Minimum Spanning Trees

Borůvka Algorithm
• For each vertex, select the minimum-weight edge incident to the vertex.
• Replace each connected component defined by the selected edges by a single vertex.
• Delete all resulting isolated vertices, loops, and all but the lowest-weight edge among each set of multiple edges.
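One Borůvka step as described above might look like the following sketch. It assumes distinct edge weights and integer vertex labels, represents edges as `(weight, u, v)` tuples, and uses a union-find structure for the contraction; these representation choices and the name `boruvka_step` are illustrative, not taken from the slides.

```python
def boruvka_step(vertices, edges):
    """One Boruvka step on edges given as (weight, u, v) with distinct weights.

    Returns (contracted_vertices, contracted_edges, selected_edges).
    """
    # 1. For each vertex, select the minimum-weight incident edge.
    cheapest = {}
    for e in edges:
        w, u, v = e
        for x in (u, v):
            if x not in cheapest or w < cheapest[x][0]:
                cheapest[x] = e
    selected = set(cheapest.values())

    # 2. Contract each connected component of the selected edges (union-find).
    parent = {v: v for v in vertices}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for w, u, v in selected:
        parent[find(u)] = find(v)

    # 3. Drop loops and keep only the lightest edge among parallel edges.
    lightest = {}
    for w, u, v in edges:
        a, b = find(u), find(v)
        if a == b:
            continue  # loop after contraction
        key = (min(a, b), max(a, b))
        if key not in lightest or w < lightest[key][0]:
            lightest[key] = (w, a, b)
    new_edges = list(lightest.values())
    # Isolated vertices disappear because only endpoints of surviving edges are kept.
    new_vertices = {x for _, a, b in new_edges for x in (a, b)}
    return new_vertices, new_edges, selected

# Path 0 - 1 - 2 - 3 with weights 1, 5, 2: the step selects weights 1 and 2.
verts, rest, chosen = boruvka_step({0, 1, 2, 3}, [(1, 0, 1), (5, 1, 2), (2, 2, 3)])
print(sorted(chosen))  # [(1, 0, 1), (2, 2, 3)]
print(len(verts))      # 2
```

Because every vertex is merged with at least one neighbour, each step at least halves the number of vertices, which is where the factor of four for two successive steps comes from.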

Page 13: A Randomized Linear-Time Algorithm to Find Minimum Spanning Trees

Algorithm Step 1
• Apply two successive Borůvka steps to the graph, thereby reducing the number of vertices by at least a factor of four.

Page 14: A Randomized Linear-Time Algorithm to Find Minimum Spanning Trees

Algorithm Step 2
• Choose a subgraph H by selecting each edge independently with probability ½.
• Apply the algorithm recursively to H, producing a minimum spanning forest F of H.
• Find all the F-heavy edges and delete them from the contracted graph.

Page 15: A Randomized Linear-Time Algorithm to Find Minimum Spanning Trees

Algorithm Step 3
• Apply the algorithm recursively to the remaining graph to compute a minimum spanning forest F’.
• Return the edges contracted in Step 1 together with the edges of F’.
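Steps 1 to 3 can be put together in a compact sketch. This is not a linear-time implementation: the F-heavy test below uses a naive path search, whereas the analysis later in the slides relies on a linear-time verification algorithm. Edges are assumed to be `(weight, u, v, edge_id)` tuples of a simple graph with distinct weights and integer vertex labels; all function names are hypothetical.

```python
import math
import random
from collections import defaultdict

def boruvka_step(edges):
    """Contract every vertex along its lightest incident edge.
    Returns (edges of the contracted graph, ids of the selected edges)."""
    cheapest = {}
    for e in edges:
        for x in (e[1], e[2]):
            if x not in cheapest or e[0] < cheapest[x][0]:
                cheapest[x] = e
    parent = {x: x for x in cheapest}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for w, u, v, eid in cheapest.values():
        parent[find(u)] = find(v)
    lightest = {}  # lightest surviving edge per pair of contracted vertices
    for w, u, v, eid in edges:
        a, b = find(u), find(v)
        if a != b:
            key = (a, b) if a < b else (b, a)
            if key not in lightest or w < lightest[key][0]:
                lightest[key] = (w, a, b, eid)
    return list(lightest.values()), {e[3] for e in cheapest.values()}

def f_light_edges(edges, forest):
    """Keep the edges that are F-light with respect to `forest` (naive search)."""
    adj = defaultdict(list)
    for w, u, v, _ in forest:
        adj[u].append((v, w))
        adj[v].append((u, w))

    def w_f(x, y):
        stack = [(x, None, -math.inf)]
        while stack:
            node, par, heaviest = stack.pop()
            if node == y:
                return heaviest
            stack.extend((nxt, node, max(heaviest, wt))
                         for nxt, wt in adj[node] if nxt != par)
        return math.inf  # not connected in F

    return [e for e in edges if e[0] <= w_f(e[1], e[2])]

def kkt_msf(edges, rng):
    """Return the set of edge ids in the minimum spanning forest."""
    if not edges:
        return set()
    # Step 1: two successive Boruvka steps.
    g1, picked1 = boruvka_step(edges)
    g_star, picked2 = boruvka_step(g1)
    # Step 2: sample H, recurse to get F, delete the F-heavy edges.
    h = [e for e in g_star if rng.random() < 0.5]
    f_ids = kkt_msf(h, rng)
    forest = [e for e in g_star if e[3] in f_ids]
    remaining = f_light_edges(g_star, forest)
    # Step 3: recurse on the remaining graph and add the contracted edges.
    return picked1 | picked2 | kkt_msf(remaining, rng)

# 4-cycle 0-1-2-3 with a chord; the three lightest edges form the MSF.
example = [(1, 0, 1, 0), (2, 1, 2, 1), (3, 2, 3, 2), (4, 3, 0, 3), (5, 0, 2, 4)]
print(sorted(kkt_msf(example, random.Random(0))))  # [0, 1, 2]
```

Tracking an `edge_id` through the contractions is a design choice of this sketch: it lets the forest returned by the recursive call on H be mapped back to edges of the contracted graph without undoing the relabelling.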

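Page 16: A Randomized Linear-Time Algorithm to Find Minimum Spanning Trees

[Figure: flow of one recursive call. The original problem G becomes G* after Borůvka × 2. Sampling G* with p = 0.5 gives H, the left sub-problem, which returns the minimum forest F of H. Deleting the F-heavy edges from G* gives G’, the right sub-problem, which returns F’.]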

Page 17: A Randomized Linear-Time Algorithm to Find Minimum Spanning Trees

Correctness
• By the cut property, every edge contracted during Step 1 is in the MSF.
• By the cycle property, the edges deleted in Step 2 do NOT belong to the MSF.
• By the induction hypothesis, the MSF of the remaining graph is correctly determined in the recursive call of Step 3.

Page 18: A Randomized Linear-Time Algorithm to Find Minimum Spanning Trees

Candidate Edge of MST
• The expected number of F-light edges in G is at most n/p (negative binomial), where n is the number of vertices.
• For a sample graph H, the expected number of candidate edges for the MST of G (the F-light edges) is therefore at most n/p.

Page 19: A Randomized Linear-Time Algorithm to Find Minimum Spanning Trees

Random-sampling
• Purpose: to help discard edges that cannot be in the minimum spanning tree.
• Construct the sample graph H:
◦ Process the edges in increasing order of weight.
◦ To process an edge e:
◦ 1. Test whether both endpoints of e are in the same component of F.
◦ 2. Include the edge in H with probability p.
◦ 3. If e is in H and is F-light, add e to the forest F.
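The sequential sampling process can be simulated to see the n/p bound on F-light edges in practice. The sketch below assumes edges given as `(weight, u, v)` tuples with distinct weights on vertices `0..n-1`; the function name and the complete-graph test instance are illustrative choices. Because edges are processed in increasing order, an edge is F-light exactly when its endpoints lie in different components of the forest built so far, so a union-find structure suffices.

```python
import random

def sample_and_count(n, edges, p, rng):
    """Run the sequential sampling process once.
    Returns (number of edges in F, number of F-light edges in G)."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    forest_size = 0
    light = 0
    for w, u, v in sorted(edges):          # increasing order of weight
        ru, rv = find(u), find(v)
        is_light = ru != rv                # 1. endpoints in different components?
        in_h = rng.random() < p            # 2. include e in H with probability p
        if is_light:
            light += 1
        if in_h and is_light:              # 3. e joins the forest F
            parent[ru] = rv
            forest_size += 1
    return forest_size, light

# Complete graph on 20 vertices with shuffled distinct weights.
rng = random.Random(0)
n, p = 20, 0.5
pairs = [(u, v) for u in range(n) for v in range(u + 1, n)]
weights = list(range(1, len(pairs) + 1))
rng.shuffle(weights)
edges = [(w, u, v) for (u, v), w in zip(pairs, weights)]

runs = [sample_and_count(n, edges, p, rng) for _ in range(500)]
average_light = sum(light for _, light in runs) / len(runs)
print(average_light, "<=", n / p)
```

Each F-light edge corresponds to one coin toss and each edge added to F to one head, so the number of F-light edges is bounded by the number of tosses needed to see at most n−1 heads.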

Page 20: A Randomized Linear-Time Algorithm to Find Minimum Spanning Trees

Random-sampling

[Figure: example graph G on vertices A to G with edge weights 3, 4, 5, 6, 7, 9, 10, 11, 13, 14, its sample graph H, and the forest F of H]

w(E,G) = 14, wF(E,G) = max{5,6,9,13} = 13: F-heavy
w(E,F) = 11, wF(E,F) = max{5,6,9} = 9: F-heavy
w(D,F) = 9, wF(D,F) = 9: F-light
w(A,B) = 7, wF(A,B) = ∞: F-light

Page 21: A Randomized Linear-Time Algorithm to Find Minimum Spanning Trees

Random-sampling

[Figure: the example graph G and the forest F, built in two ways]

1. Process the edges in increasing order.
2. If the edge is F-light: throw the coin; if it comes up, select the edge.
3. Else: throw the coin; do not select the edge.

1. Randomly select edges into H.
2. Find F of H.

Page 22: A Randomized Linear-Time Algorithm to Find Minimum Spanning Trees

Observation
• No F-heavy edge can be in the minimum spanning forest of G (cycle property).
• F-light edges are the candidate edges for the minimum spanning tree of G.
• The forest F produced is the forest that Kruskal's algorithm would produce on H, and the F-light edges include every edge of the MSF of G.

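Page 23: A Randomized Linear-Time Algorithm to Find Minimum Spanning Trees

Observation
• The size of F is at most n−1.
• The expected number of F-light edges in G is at most n/p (negative binomial).

Negative binomial distribution, where k is the number of failures before the n-th success:

$f(k; n, p) = \binom{k+n-1}{k} p^n (1-p)^k$

Mean k = $\sum_{k \ge 0} k \binom{k+n-1}{k} p^n (1-p)^k = \frac{n(1-p)}{p}$

Expected number of trials = $\frac{n(1-p)}{p} + n = \frac{n}{p}$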

Page 24: A Randomized Linear-Time Algorithm to Find Minimum Spanning Trees

Outline
• Introduction
• Basic Property & Definition
• Algorithm
• Analysis

Page 25: A Randomized Linear-Time Algorithm to Find Minimum Spanning Trees

Analysis of the Algorithm
• The worst case.
• The expected running time.
• The probability of achieving the expected running time.

Page 26: A Randomized Linear-Time Algorithm to Find Minimum Spanning Trees

Running time Analysis
• Total running time = sum of the running times of the individual steps.
• Step (1): two steps of Borůvka’s algorithm
• Step (2): Dixon-Rauch-Tarjan verification algorithm
• Each step takes time linear in the number of edges.
◦ So it suffices to estimate the total number of edges.

Page 27: A Randomized Linear-Time Algorithm to Find Minimum Spanning Trees

Observe the recursion tree
• G = (V,E), |V| = n, |E| = m.
◦ m ≥ n/2 since there are no isolated vertices.
• Each problem generates at most 2 subproblems.
◦ At depth d, there are at most 2^d nodes.
◦ Each node at depth d has at most n/4^d vertices.
• The depth d is at most log_4 n.
◦ There are at most $\sum_{d=0}^{\infty} 2^d \cdot \frac{n}{4^d} = \sum_{d=0}^{\infty} \frac{n}{2^d} = 2n$ vertices in all subproblems.

Page 28: A Randomized Linear-Time Algorithm to Find Minimum Spanning Trees

The worst case
Theorem 4.1: The worst-case running time of the minimum-spanning-forest algorithm is O(min{n^2, m log n}), the same as the bound for Borůvka’s algorithm.

Proof: There are two different ways to estimate the total number of edges.

1. A subproblem at depth d contains at most (n/4^d)^2/2 edges. The total number of edges in all subproblems is:

$\sum_{d=0}^{\log_4 n} 2^d \cdot \frac{(n/4^d)^2}{2} = O(n^2)$

Page 29: A Randomized Linear-Time Algorithm to Find Minimum Spanning Trees

The worst case
2. Consider a subproblem G = (V,E). After Step (1) we have G’ = (V’,E’) with |E’| ≤ |E| − |V|/2 and |V’| ≤ |V|/4.
• Edges in the left child = |H|
• Edges in the right child ≤ |E’| − |H| + |F|
• Since |F| < |V’| ≤ |V|/4, the number of edges in the two subproblems is at most (|H|) + (|E’| − |H| + |F|) = |E’| + |F| ≤ |E| − |V|/2 + |V|/4 ≤ |E|.
• The two subproblems together contain at most |E| edges.

Page 30: A Randomized Linear-Time Algorithm to Find Minimum Spanning Trees

The worst case

[Figure: recursion tree in which every level contains at most m edges, for a total of at most m log n edges]

Page 31: A Randomized Linear-Time Algorithm to Find Minimum Spanning Trees

The worst case
• The depth is at most log_4 n and each level has at most m edges, so there are at most m log n edges in total.
• The worst-case running time of the minimum-spanning-forest algorithm is O(min{n^2, m log n}).

Page 32: A Randomized Linear-Time Algorithm to Find Minimum Spanning Trees

Analysis of the Algorithm
• The worst case.
• The expected running time.
• The probability of achieving the expected running time.

Page 33: A Randomized Linear-Time Algorithm to Find Minimum Spanning Trees

Analysis – Average Case (1/8)

Theorem: the expected running time of the minimum spanning forest algorithm is O(m).
◦ Calculating the expected total number of edges for all left path problems

[Figure: recursion tree showing the original problem, its left and right sub-problems, and the left and right sub-sub-problems]

Page 34: A Randomized Linear-Time Algorithm to Find Minimum Spanning Trees

Analysis – Average Case (2/8)

• Calculating the expected total edge number for one left path started at one problem with m’ edges
• Evaluating the total edge number for all right sub-problems

[Figure: a left path starting at a problem with # of edges = m’; expected total edge number ≤ 2m’]

Page 35: A Randomized Linear-Time Algorithm to Find Minimum Spanning Trees

Analysis – Average Case (3/8)

Calculating the expected total edge number for one left path started at one problem with m’ edges

[Figure: the original problem G becomes G* after Borůvka × 2; sampling G* with p = 0.5 gives H, the left sub-problem; G’ is the right sub-problem]

1. E[edge number of H] = 0.5 × edge number of G*
2. ∵ Borůvka × 2 ∴ edge number of G* ≤ edge number of G

Hence E[edge number of H] ≤ 0.5 × edge number of G

Page 36: A Randomized Linear-Time Algorithm to Find Minimum Spanning Trees

Analysis – Average Case (4/8)

Calculating the expected total edge number for one left path started at one problem with m’ edges

[Figure: the original problem G becomes G* after Borůvka × 2; sampling G* with p = 0.5 gives H, the left sub-problem; G’ is the right sub-problem]

E[edge number of H] ≤ 0.5 × edge number of G

• First problem on the path: # of edges = m’
• Next problem on the path: expected # of edges ≤ 0.5 × m’, and so on down the path

Expected total edge number ≤ $\sum_{i=0}^{\infty} \frac{m'}{2^i} = 2m'$

Page 37: A Randomized Linear-Time Algorithm to Find Minimum Spanning Trees

Analysis – Average Case (5/8)

• Calculating the expected total edge number for one left path L started at one problem with m’ edges
◦ Expected total edge number on L ≤ 2m’
• Evaluating the total edge number of all right sub-problems
◦ E[total edges of all right sub-problems] ≤ n

K.O.

Page 38: A Randomized Linear-Time Algorithm to Find Minimum Spanning Trees

Analysis – Average Case (6/8)

Evaluating the total edge number for all right sub-problems
◦ To prove: E[total edges of all right sub-problems] ≤ n

[Figure: the original problem G becomes G* after Borůvka × 2; sampling G* with p = 0.5 gives H, the left sub-problem, which returns the minimum forest F of H; deleting the F-heavy edges from G* gives G’, the right sub-problem]

1. ∵ Borůvka × 2 ∴ vertex number of G* ≤ 0.25 × vertex number of G
2. Based on Lemma 2.1: E[edge number of G’] ≤ 2 × vertex number of G*

Hence E[edge number of G’] ≤ 0.5 × vertex number of G

Page 39: A Randomized Linear-Time Algorithm to Find Minimum Spanning Trees

Analysis – Average Case (7/8)

E[edge number of G’] ≤ 0.5 × vertex number of G

Evaluating the total edge number for all right sub-problems
◦ To prove: E[total edges of all right sub-problems] ≤ n

[Figure: recursion tree with G, Borůvka × 2, G*, the sample H with p = 0.5 (left sub-problem) and G’ (right sub-problem), annotated level by level]

• Depth 0: # of vertices of the original problem = n; # of edges of right sub-problems ≤ n/2
• Depth 1: # of vertices of sub-problems ≤ 2×n/4; # of edges of right sub-problems ≤ 2×n/8
• Depth 2: # of vertices of sub-problems ≤ 4×n/4^2; # of edges of right sub-problems ≤ 4×n/(4^2×2)
• Depth 3: # of vertices of sub-problems ≤ 8×n/4^3; # of edges of right sub-problems ≤ 8×n/(4^3×2)
• Depth 4: # of vertices of sub-problems ≤ 16×n/4^4

Total: n/2 + n/4 + n/8 + n/16 + ... = n
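The level-by-level bound can be checked with exact arithmetic. The sketch below sums the per-depth bound 2^d × n/(4^d × 2) on the edges of the right sub-problems generated at depth d; the function name is hypothetical and the depth cut-off is an illustrative parameter.

```python
from fractions import Fraction

def right_subproblem_edge_bound(n, depth):
    """Sum over d = 0 .. depth-1 of 2^d * (n / 4^d) / 2, as an exact fraction."""
    return sum(Fraction(2**d * n, 4**d * 2) for d in range(depth))

n = 1024
for depth in (1, 2, 5, 20):
    print(depth, float(right_subproblem_edge_bound(n, depth)))
# The partial sums n/2, 3n/4, ... approach n from below and never exceed it.
```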

Page 40: A Randomized Linear-Time Algorithm to Find Minimum Spanning Trees

Analysis – Average Case (8/8)

• Calculating the expected total edge number for one left path started at one problem with m’ edges
◦ Expected total edge number for one left path ≤ 2m’
• Evaluating the total edge number for all right sub-problems
◦ E[total edges of all right sub-problems] ≤ n

[Figure: a left path starting at a problem with # of edges = m’; expected total edge number ≤ 2m’]

Every left path starts either at the original problem (m edges) or at a right sub-problem (at most n edges in expectation over all of them), so
E[processed edges in the original problem and all sub-problems] ≤ 2×(m+n)

Page 41: A Randomized Linear-Time Algorithm to Find Minimum Spanning Trees

Analysis of the Algorithm
• The worst case.
• The expected running time.
• The probability of achieving the expected running time.

Page 42: A Randomized Linear-Time Algorithm to Find Minimum Spanning Trees

The Probability of Linearity
Theorem 4.3
◦ The minimum spanning forest algorithm runs in O(m) time with probability 1 − exp(−Ω(m))

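Page 43: A Randomized Linear-Time Algorithm to Find Minimum Spanning Trees

The Probability of Linearity

Chernoff Bound: Given i.i.d. random variables $x_i$, $0 < i \le n$, with $X$ the sum of all $x_i$, for $t > 0$ we have

$\Pr[X \ge A] \le e^{-tA} \prod_{i=1}^{n} E[e^{t x_i}]$

Thus, the probability of fewer than s successes (each with chance p) within k trials is

$\Pr[X < s] \le e^{ts} \prod_{i=1}^{k} E[e^{-t x_i}] = e^{ts} (1 - p + p e^{-t})^k = e^{-\Omega(s)}$, for $t = 1/2$ and $p = 1/2$, provided k is at least a constant factor larger than s/p (k ≥ 3s in the applications that follow).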
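The Chernoff-style bound on seeing fewer than s successes in k trials, $e^{ts}(1 - p + pe^{-t})^k$, can be compared numerically with the exact binomial tail. The parameter choices t = 1/2, p = 1/2 and k = 3s are assumptions of this sketch (k = 3s matches the "3m tosses" used on the slides that follow), and the function names are hypothetical.

```python
import math

def chernoff_bound(s, k, p=0.5, t=0.5):
    """Upper bound e^{ts} * (1 - p + p e^{-t})^k on Pr[fewer than s successes in k trials]."""
    return math.exp(t * s) * (1 - p + p * math.exp(-t)) ** k

def exact_tail(s, k, p=0.5):
    """Exact Pr[X < s] for X ~ Binomial(k, p)."""
    return sum(math.comb(k, j) * p**j * (1 - p) ** (k - j) for j in range(s))

for s in (5, 10, 20, 40):
    k = 3 * s
    print(s, exact_tail(s, k), chernoff_bound(s, k))
# With k = 3s the bound equals c**s for a constant c < 1, i.e. exp(-Omega(s)).
```

The bound is loose compared with the exact tail, but what matters for the analysis is only that it decays exponentially in s.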

Page 44: A Randomized Linear-Time Algorithm to Find Minimum Spanning Trees

The Probability of Linearity
Right Subproblems
◦ The number of vertices in all right subproblems is at most n/2 (proved in Theorem 4.2).
◦ n/2 is therefore an upper bound on the total number of heads in the nickel flips.

Page 45: A Randomized Linear-Time Algorithm to Find Minimum Spanning Trees

Right Subproblems
• The probability in question
◦ Fewer than n/2 heads occur in a sequence of 3m nickel tosses.
◦ m + n ≤ 3m since n/2 ≤ m
• This probability is exp(−Ω(m)) by a Chernoff bound.

Page 46: A Randomized Linear-Time Algorithm to Find Minimum Spanning Trees

The Probability of Linearity
Left Subproblems
◦ Sequence: every sequence ends with a tail, that is, HH…HHT
◦ The number of tails is at most the number of sequences
◦ Assume that there are at most m’ edges in the root problem and in all right subproblems

Page 47: A Randomized Linear-Time Algorithm to Find Minimum Spanning Trees

Left Subproblems
• The probability in question
◦ Only m’ tails occur in a sequence of more than 3m’ coin tosses.
• This probability is exp(−Ω(m)) by a Chernoff bound.

Page 48: A Randomized Linear-Time Algorithm to Find Minimum Spanning Trees

The Probability of Linearity
Combining Right & Left Subproblems
◦ The total number of edges is O(m) with probability at least 1 − exp(−Ω(m))