Vertex-pursuit in hierarchical social networks

Post on 26-Feb-2016

TAMC’12. Vertex-pursuit in hierarchical social networks. Anthony Bonato, Ryerson University. Very happy to be in Beijing. Friendship networks: a network of friends (some real, some virtual) forms a large web of interconnected links. 6 degrees of separation.

Transcript of Vertex-pursuit in hierarchical social networks

Complex Networks 1

Vertex-pursuit in hierarchical social networks

Anthony Bonato, Ryerson University

TAMC’12

Complex Networks 2

Very happy to be in Beijing

Complex Networks 3

Friendship networks

• a network of friends (some real, some virtual) forms a large web of interconnected links

Complex Networks 4

6 degrees of separation

• (Stanley Milgram, 67): famous chain letter experiment

Complex Networks 5

6 Degrees in Facebook?

• 900 million users, > 70 billion friendship links

• (Backstrom et al., 2012)

– 4 degrees of separation in Facebook

– when considering another person in the world, a friend of your friend knows a friend of their friend, on average

• similar results for Twitter and other OSNs

Complex Networks 6

Complex Networks

• web graph, social networks, biological networks, internet networks, …

Complex Networks 7

The web graph

• nodes: web pages

• edges: links

• over 1 trillion nodes, with billions of nodes added each day

Complex Networks 8

Biological networks: proteomics

nodes: proteins

edges: biochemical interactions

Yeast: 2401 nodes, 11000 edges

Complex Networks 9

On-line Social Networks (OSNs)

Facebook, RenRen, Twitter, LinkedIn, …

Complex Networks 10

Key parameters

• degree distribution: $N_{i,n} = |\{u \in V(G) : \deg(u) = i\}|$

• average distance: $L(G) = \binom{n}{2}^{-1} \sum_{u,v \in V(G)} d(u,v)$

• clustering coefficient: $C(G) = n^{-1} \sum_{x \in V(G)} c(x)$, where $c(x) = \binom{\deg(x)}{2}^{-1} |E(N(x))|$
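The three parameters can be computed directly from an adjacency structure. The following is a minimal sketch, not code from the talk: the graph representation (a dict mapping each vertex to a set of neighbours), the function names, and the convention c(x) = 0 for vertices of degree below 2 are all assumptions made for illustration. The average distance routine assumes a connected graph with at least two vertices.

```python
from collections import Counter, deque
from itertools import combinations


def degree_counts(adj):
    """Return {i: number of vertices of degree i}, i.e. the values N_{i,n}."""
    return Counter(len(nbrs) for nbrs in adj.values())


def average_distance(adj):
    """L(G): mean of d(u, v) over unordered pairs (graph assumed connected)."""
    total = 0
    for source in adj:
        # Breadth-first search gives distances from `source`.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total += sum(dist.values())
    n = len(adj)
    # Every unordered pair was counted twice, once from each endpoint.
    return (total / 2) / (n * (n - 1) / 2)


def clustering_coefficient(adj):
    """C(G): average of c(x) = |E(N(x))| / C(deg(x), 2); c(x) = 0 if deg(x) < 2."""
    total = 0.0
    for x, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            continue
        links = sum(1 for u, w in combinations(nbrs, 2) if w in adj[u])
        total += links / (k * (k - 1) / 2)
    return total / len(adj)


# Hypothetical toy graph: a triangle 0-1-2 with a pendant vertex 3 attached to 2.
toy = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
```

On the toy graph the distances sum to 8 over 6 pairs, and the local clustering values are 1, 1, 1/3 and 0.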

Complex Networks 11

Properties of Complex Networks

• power law degree distribution (Broder et al, 01)

$N_{i,n} \approx i^{-b} n$, for some $b > 2$

Power laws in OSNs (Mislove et al,07):

Complex Networks 12

Complex Networks 13

Small World Property

• small world networks (Watts & Strogatz,98)

– low distances

• diam(G) = O(log n)

• L(G) = O(loglog n)

– higher clustering coefficient than random graph with same expected degree

Complex Networks 14

Sample data: Flickr, YouTube, LiveJournal, Orkut

• (Mislove et al,07): short average distances and high clustering coefficients

Complex Networks 15

Community structure

• (Zachary, 72)

• (Mason et al, 09)

• (Fortunato, 10)

• (Li, Peng, 11): small community property

Complex Networks 16

(Leskovec, Kleinberg, Faloutsos,05):

• densification power law: average degree is increasing with time

• decreasing distances

• (Kumar et al, 06): observed in Flickr, Yahoo! 360

Complex Networks 17

Models of complex networks

• We focus on

1. Geometric models of OSNs.

– Logarithmic dimension hypothesis

2. Vertex pursuit in hierarchical social networks

– spread of gossip and news; disrupting terrorist networks

Complex Networks 18

Why model complex networks?

• uncover and explain the generative mechanisms underlying complex networks

• predict the future

• nice mathematical challenges

• models can uncover the hidden reality of networks

Complex Networks 19

Many different models

Complex Networks 20

“All models are wrong, but some are useful.” – G.E.P. Box

Complex Networks 21

Geometry of OSNs?

• OSNs live in social space: proximity of nodes depends on common attributes (such as geography, gender, age, etc.)

• IDEA: embed OSN in 2-, 3- or higher dimensional space

Complex Networks 22

Dimension of an OSN

• dimension of OSN: minimum number of attributes needed to classify nodes

• like game of “20 Questions”: each question narrows range of possibilities

• what is a credible mathematical formula for the dimension of an OSN?

Complex Networks 23

Random geometric graphs, G(n,r)

• nodes are randomly placed in space

• each node has a constant sphere of influence

• nodes are joined if their spheres of influence overlap
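A random geometric graph is short to simulate. The sketch below is illustrative only and makes several assumptions not stated on the slide: points are uniform in the unit cube, distance is Euclidean, and two nodes are joined when their distance is at most r (equivalently, spheres of radius r/2 overlap). The function name and parameters are hypothetical.

```python
import math
import random


def random_geometric_graph(n, r, dim=2, seed=None):
    """Place n points uniformly in [0, 1]^dim; join pairs at distance <= r."""
    rng = random.Random(seed)
    points = [tuple(rng.random() for _ in range(dim)) for _ in range(n)]
    edges = set()
    for i in range(n):
        for j in range(i + 1, n):
            # Spheres of radius r/2 around i and j overlap exactly when
            # the centres are at distance at most r.
            if math.dist(points[i], points[j]) <= r:
                edges.add((i, j))
    return points, edges


points, edges = random_geometric_graph(200, 0.1, seed=1)
```

The quadratic pair loop is fine for a few thousand nodes; a grid or k-d tree would be the usual choice for larger simulations.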

Complex Networks 24

Simulation with 5000 nodes

Complex Networks 25

Spatially Preferred Attachment (SPA) model

(Aiello, Bonato, Cooper, Janssen, Prałat, 08), (Cooper, Frieze, Prałat,12)

• volume of sphere of influence proportional to in-degree

• nodes are added and spheres of influence shrink over time

• asymptotically almost surely (a.a.s.) leads to power law graphs, low directed diameter, and small separators

Complex Networks 26

Protean graphs

(Fortunato, Flammini, Menczer,06), (Łuczak, Prałat, 06), (Janssen, Prałat,09)

• parameter: α in (0,1)

• each node is ranked 1,2, …, n by some function r

– 1 is best, n is worst

• at each time-step, one new node is born, one randomly chosen node dies (and ranking is updated)

• link probability proportional to $r^{-\alpha}$

• many ranking schemes a.a.s. lead to power law graphs: random initial ranking, degree, age, etc.

Complex Networks 27

Geometric model for OSNs

• we consider a geometric model of OSNs, where

– nodes are in m-dimensional Euclidean space

– threshold value variable: a function of ranking of nodes

Complex Networks 28

Geometric Protean (GEO-P) Model

(Bonato, Janssen, Prałat, 12)

• parameters: α, β in (0,1), α+β < 1; positive integer m

• nodes live in m-dimensional hypercube (torus metric)

• each node is ranked 1,2, …, n by some function r

– 1 is best, n is worst

– we use random initial ranking

• at each time-step, one new node v is born, one randomly chosen node dies (and ranking is updated)

• each existing node u has a region of influence with volume $r^{-\alpha} n^{-\beta}$

• add edge uv if v is in the region of influence of u
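The birth and death process of GEO-P can be sketched in a few lines. This is a rough illustration under loudly stated assumptions, not the authors' simulation code: the region of influence of u is taken to be a cube centred at u of volume r^(-α) n^(-β) in the L-infinity torus metric, the death happens before the birth in each step, the newborn receives a uniformly random rank, and the process starts from n isolated nodes. All function and variable names are hypothetical.

```python
import random


def torus_distance(x, y):
    """L-infinity distance on the unit torus (an assumed choice of metric)."""
    return max(min(abs(a - b), 1 - abs(a - b)) for a, b in zip(x, y))


def simulate_geo_p(n, m, alpha, beta, steps, seed=None):
    """Run `steps` birth/death rounds; return (nodes, edges)."""
    rng = random.Random(seed)
    ranks = list(range(1, n + 1))
    rng.shuffle(ranks)  # random initial ranking
    nodes = {
        i: {"pos": tuple(rng.random() for _ in range(m)), "rank": ranks[i]}
        for i in range(n)
    }
    edges = set()
    next_id = n
    for _ in range(steps):
        # Death: remove a uniformly random node and close the gap in the ranking.
        dead = rng.choice(list(nodes))
        dead_rank = nodes.pop(dead)["rank"]
        edges = {e for e in edges if dead not in e}
        for u in nodes.values():
            if u["rank"] > dead_rank:
                u["rank"] -= 1
        # Birth: new node v gets a random position and a random rank in 1..n.
        pos = tuple(rng.random() for _ in range(m))
        rank = rng.randint(1, n)
        for u in nodes.values():
            if u["rank"] >= rank:
                u["rank"] += 1
        # Join v to every existing u whose region of influence contains v.
        for uid, u in nodes.items():
            volume = u["rank"] ** (-alpha) * n ** (-beta)
            half_side = 0.5 * volume ** (1 / m)
            if torus_distance(pos, u["pos"]) <= half_side:
                edges.add((uid, next_id))
        nodes[next_id] = {"pos": pos, "rank": rank}
        next_id += 1
    return nodes, edges


nodes, edges = simulate_geo_p(n=200, m=2, alpha=0.5, beta=0.3, steps=1000, seed=1)
```

The number of nodes stays fixed at n throughout, and the ranks always form a permutation of 1, …, n.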

Complex Networks 29

Notes on GEO-P model

• model uses both geometry and ranking

• number of nodes is static: fixed at n

– order of OSNs at most number of people (roughly…)

• top ranked nodes have larger regions of influence

Complex Networks 30

Simulation with 5000 nodes

Complex Networks 31

Simulation with 5000 nodes

random geometric GEO-P

Complex Networks 32

Properties of the GEO-P model (Bonato, Janssen, Prałat, 2012)

• a.a.s. the GEO-P model generates graphs with the following properties:

– power law degree distribution with exponent $b = 1 + 1/\alpha$

– average degree $d = (1+o(1))\, n^{1-\alpha-\beta} / 2^{1-\alpha}$

• densification

– diameter $D = O\left(n^{\beta/((1-\alpha)m)} \log^{2\alpha/((1-\alpha)m)} n\right)$

• small world: constant order if $m = C \log n$

– clustering coefficient larger than in comparable random graph

Complex Networks 33

Diameter

• eminent node:

– old: at least n/2 nodes are younger

– highly ranked: initial ranking greater than some fixed R

• partition hypercube into small hypercubes

• choose size of hypercubes and R so that

– each hypercube contains at least $\log^2 n$ eminent nodes

– sphere of influence of each eminent node covers its hypercube and all neighbouring hypercubes

• choose eminent node in each hypercube: backbone

• show all nodes in a hypercube are at distance at most 2 from the backbone

Complex Networks 34

Spectral properties

• the spectral gap λ of G is defined by the difference between the two largest eigenvalues of the adjacency matrix of G

• for G(n,p) random graphs, λ tends to 0 as order grows

• in the GEO-P model, λ is close to 1

• (Estrada, 06): bad spectral expansion in real OSN data

Complex Networks 35

Dimension of OSNs

• given the order of the network n, power law exponent b, average degree d, and diameter D, we can calculate m

• gives formula for dimension of OSN:

$m = \dfrac{\log n}{\log D} \left(1 - \dfrac{b-1}{b-2} \cdot \dfrac{\log d}{\log n}\right)$
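The dimension formula follows from the GEO-P relations b = 1 + 1/α, d ≈ n^(1-α-β) and D ≈ n^(β/((1-α)m)), ignoring constants and logarithmic factors: solving for m gives m = (log n / log D)(1 - ((b-1)/(b-2)) (log d / log n)). The sketch below evaluates it; the function name is hypothetical, and the input values are synthetic numbers generated from chosen α, β, m as a consistency check, not measurements from any real network.

```python
import math


def geo_p_dimension(n, b, d, D):
    """Estimate the dimension m from order n, power law exponent b,
    average degree d and diameter D (constants and log factors ignored)."""
    return (math.log(n) / math.log(D)) * (
        1 - ((b - 1) / (b - 2)) * (math.log(d) / math.log(n))
    )


# Synthetic check: pick alpha, beta, m, derive (b, d, D), and recover m.
alpha, beta, m_true, n = 0.5, 0.25, 3, 10**6
b = 1 + 1 / alpha                          # power law exponent
d = n ** (1 - alpha - beta)                # average degree, constants dropped
D = n ** (beta / ((1 - alpha) * m_true))   # diameter, log factor dropped
m_est = geo_p_dimension(n, b, d, D)
```

Because the synthetic inputs are built from the same relations, the estimate recovers the chosen m up to floating-point error.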

Complex Networks 36

Uncovering the hidden reality

• reverse engineering approach

– given network data (n, b, d, D), dimension of an OSN gives smallest number of attributes needed to identify users

• that is, given the graph structure, we can (theoretically) recover the social space

Complex Networks 37

6 Dimensions of Separation

OSN        Dimension
Facebook   7
YouTube    6
Twitter    4
Flickr     4
Cyworld    7

Complex Networks 38

Hierarchical social networks

• Twitter is highly directed: can view a user and followers as a directed acyclic graph (DAG)

– flow of information is top-down

• such hierarchical social networks also appear in the social organization of companies and in terrorist networks

• How to disrupt this flow? What is a model?

Complex Networks 39

Good guys vs bad guys games in graphs

(table indexed by the speeds of the two sides, labelled "bad" and "good": slow, medium, fast, helicopter)

slow: traps, tandem-win

medium: robot vacuum; Cops and Robbers; edge searching; eternal security

fast: cleaning; distance k Cops and Robbers; Cops and Robbers on disjoint edge sets; The Angel and Devil

helicopter: seepage; Helicopter Cops and Robbers; Marshals; The Angel and Devil; Firefighter; Hex

Complex Networks 40

Complex Networks 41

Seepage

• motivated by the 1973 eruption of the Eldfell volcano in Iceland

• to protect the harbour, the inhabitants poured water on the lava in order to solidify and halt it

Complex Networks 42

Seepage (Clarke, Finbow, Fitzpatrick, Messinger, Nowakowski, 2009)

• greens and sludge, played on a directed acyclic graph (DAG) with one source s

• the players take turns, with the sludge going first by contaminating s

• on subsequent moves sludge contaminates a non-protected vertex that is adjacent to a contaminated vertex

• the greens, on their turn, choose some non-protected, non-contaminated vertex to protect

– once protected or contaminated, a vertex stays in that state to the end of the game

• sludge wins if some sink is contaminated; otherwise, the greens win
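For very small DAGs the rules above can be checked by exhaustive game search. The following is a minimal sketch under stated assumptions: the greens protect exactly one vertex per turn, "adjacent" means out-neighbour, the sludge may spread from any contaminated vertex, the greens pass if nothing is left to protect, and the greens win as soon as the sludge has no legal move. The function name and the dict-of-lists DAG encoding are hypothetical, and the search is exponential, so it is only meant for toy examples.

```python
from functools import lru_cache


def greens_win(dag, source):
    """Return True if the greens have a winning strategy with one
    protected vertex per turn. `dag` maps a vertex to its out-neighbours."""
    vertices = set(dag) | {w for outs in dag.values() for w in outs}
    out = {v: tuple(dag.get(v, ())) for v in vertices}
    sinks = frozenset(v for v in vertices if not out[v])

    @lru_cache(maxsize=None)
    def greens_to_move(contaminated, protected):
        if contaminated & sinks:
            return False  # a sink is contaminated: sludge has won
        free = vertices - contaminated - protected
        if not free:
            return sludge_to_move(contaminated, protected)
        # Greens need one protecting move after which sludge cannot win.
        return any(sludge_to_move(contaminated, protected | {v}) for v in free)

    @lru_cache(maxsize=None)
    def sludge_to_move(contaminated, protected):
        moves = {w for u in contaminated for w in out[u]}
        moves = moves - contaminated - protected
        if not moves:
            return True  # sludge is stuck before reaching a sink
        # Greens must be able to answer every sludge move.
        return all(greens_to_move(contaminated | {w}, protected) for w in moves)

    # The sludge opens by contaminating the source.
    return greens_to_move(frozenset([source]), frozenset())


# Hypothetical toy DAGs.
path = {"s": ["a"], "a": ["t"]}
fork = {"s": ["a", "b"], "a": ["t1"], "b": ["t2"]}
star = {"s": ["t1", "t2"]}
```

On the path and the fork the greens can cut the sludge off; on the star, two sinks hang directly off the source and one green per turn is not enough.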

Complex Networks 43

Example

(figure: a DAG with vertices labelled G, S and x)

Complex Networks 44

Green number

• green number of a DAG G, gr(G), is the minimum number of greens needed to win

– gr(G) = 1: G is green-win; as in previous example

• (CFFMN,2009):

– characterized green-win trees

– bounds given on green number of truncated Cartesian products of paths

Complex Networks 45

Mathematical counter-terrorism

• (Farley et al. 2003-): ordered sets as simplified models of terrorist networks

– the maximal elements of the poset are the leaders

– submit plans down via the edges to the foot soldiers or minimal nodes

– only one messenger needs to receive the message for the plan to be executed.

– considered finding minimum order cuts: neutralize operatives in the network

Complex Networks 46

Seepage as a counter-terrorism model?

• seepage has a similar paradigm to model of (Farley et al)

• main difference: seepage is dynamic

– as messages move down the network towards foot soldiers, operatives are neutralized over time

Complex Networks 47

Structure of terrorist networks

• competing views; e.g. (Xu et al, 06), (Memon, Hicks, Larsen, 07), (Medina, Hepner, 08):

• complex network: power law degree distribution

– some members more influential and have high out-degree

• regular network: members have constant out-degree

– members are all about equally influential

Complex Networks 48

Our model

• we consider a stochastic DAG model

• total expected degrees of vertices are specified

– directed analogue of the G(w) model of Chung and Lu

Complex Networks 49

General setting for the model

• given a DAG G with levels $L_j$, source v, c > 0

• game $\mathcal{G}(G,v,j,c)$:

– nodes in $L_j$ are sinks

– sequence of discrete time-steps t

– nodes protected at time-step t: $c_t = \lfloor ct \rfloor - \lfloor c(t-1) \rfloor$

• $gr_j(G,v) = \inf\{c \in \mathbb{R}^+ : \text{greens win } \mathcal{G}(G,v,j,c)\}$
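A real-valued budget c becomes an integer number of protected vertices per time-step. Assuming the rule c_t = ⌊ct⌋ − ⌊c(t−1)⌋, the schedule can be tabulated as below; the function name is hypothetical, and exact fractions are used so that floors are not affected by floating-point rounding.

```python
from fractions import Fraction
from math import floor


def greens_schedule(c, steps):
    """Return [c_1, ..., c_steps] with c_t = floor(c*t) - floor(c*(t-1))."""
    c = Fraction(c)
    return [floor(c * t) - floor(c * (t - 1)) for t in range(1, steps + 1)]


half = greens_schedule(Fraction(3, 2), 4)    # alternates between 1 and 2
thirds = greens_schedule(Fraction(5, 3), 6)  # one "short" step in every three
```

Over any prefix of t steps the greens protect ⌊ct⌋ vertices in total, so the long-run average is c. With c = d − 2 − 1/s the short steps (d − 3 vertices) fall at t = si + 1; for d = 4 and s = 3 that is t = 1, 4, ….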

Complex Networks 50

Random DAG model (Bonato, Mitsche, Prałat,12+)

• parameters: sequence $(w_i : i > 0)$, integer n

• $L_0 = \{v\}$; assume $L_j$ defined

• S: set of n new vertices

• directed edges point from $L_j$ to $L_{j+1}$, a subset of S

• each $v_i$ in $L_j$ generates $\max\{w_i - \deg^-(v_i), 0\}$ randomly chosen edges to S

• edges generated independently

• nodes of S chosen at least once form $L_{j+1}$

• parallel edges possible (though rare in sparse case)
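The level-by-level construction can be sketched as follows. This is an illustration, not the authors' code: the weight sequence is passed as a function of an integer vertex id (a hypothetical indexing, since the slide does not say how vertices are numbered), endpoints in S are chosen uniformly at random, and unused vertices of S are simply discarded.

```python
import random


def random_dag_levels(weight, n, depth, seed=None):
    """Grow levels L_0, ..., L_depth. `weight(v)` is the target total
    degree of vertex v. Returns (levels, edges); edges may repeat."""
    rng = random.Random(seed)
    levels = [[0]]   # vertex 0 plays the role of the source v
    in_deg = {0: 0}
    edges = []       # a list, so parallel edges are kept
    next_id = 1
    for _ in range(depth):
        pool = list(range(next_id, next_id + n))  # the set S of n new vertices
        next_id += n
        hit = set()
        for v in levels[-1]:
            # v spends whatever is left of its weight after its in-degree.
            for _ in range(max(weight(v) - in_deg[v], 0)):
                w = rng.choice(pool)
                edges.append((v, w))
                in_deg[w] = in_deg.get(w, 0) + 1
                hit.add(w)
        if not hit:
            break    # no vertex generated an edge: the DAG stops growing
        levels.append(sorted(hit))
    return levels, edges


# Constant weights d = 3, with hypothetical parameter values.
levels, edges = random_dag_levels(lambda v: 3, n=1000, depth=6, seed=1)
```

With constant weights d, the source generates d edges and every later vertex, having in-degree at least 1, generates at most d − 1, so level j has at most d(d−1)^(j−1) vertices.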

Complex Networks 51

d-regular case

• for all i, $w_i = d > 2$ a constant

– call these random d-regular DAGs

• in this case, $|L_j| \le d(d-1)^{j-1}$

• we give bounds on grj(G,v) as a function of the levels j of the sinks

Complex Networks 52

Main results

Theorem (BMP,12+): If G is a random d-regular DAG, then a.a.s. the following hold.

1) If $2 \le j \le O(1)$, then $gr_j(G,v) = d - 2 + 1/j$.

2) If ω is any function tending to infinity with n and $\omega \le j \le \log_{d-1} n - \omega \log\log n$, then $gr_j(G,v) \le d - 2$.

3) If $\log_{d-1} n - \omega \log\log n \le j \le \log_{d-1} n - \frac{5}{2} k \log_2 \log n + \log_{d-1} \log n - O(1)$ for some integer $k > 0$, then $d - 2 - 1/k \le gr_j(G,v) \le d - 2$.

Complex Networks 53

• $gr_j(G,v)$ is smaller for larger j

Theorem (BMP,12+): For a random d-regular DAG G, for $s \ge 4$ there is a constant $C_s > 0$, such that if $j \ge \log_{d-1} n + C_s$, then a.a.s. $gr_j(G,v) \le d - 2 - 1/s$.

• proof uses a combinatorial-game theory type argument

Complex Networks 54

Sketch of proof

• greens protect $d-2$ vertices on some layers; on other layers (every $s$ steps, at $t = si + 1$ for $i \ge 0$) they protect $d-3$

• greens play greedily: protect vertices adjacent to the sludge

• at most 1 choice for sludge when the greens protect $d-2$; at most 2, otherwise

• greens can move sludge to any vertex in the $d-2$ layers

• bad vertex: in-degree at least 2

• if there is a bad vertex in the $d-2$ layers, greens can direct sludge there and sludge loses

– greens protect all children

Complex Networks 55

Sketch of proof, continued

• sludge wins implies that there are no bad vertices in the $d-2$ layers, and all vertices in the $d-3$ layers either have in-degree 1 and all but at most one child are sludge-win, or in-degree 2 and all children are sludge-win

• allows for a cut proceeding inductively from the source to a sink:

– in a given $d-3$ layer, if a vertex has in-degree 1, then we cut away any out-neighbour and all vertices not reachable from the source (after the out-neighbour is removed)

• if sludge wins, then there is a cut which gives a $(d-1,d-2)$-regular graph

• the probability that there is such a cut is o(1)

Complex Networks 56

Power law case

• fix d, exponent β > 2, and maximum degree $M = n^{\alpha}$ for some α in (0,1)

• $w_i = c\, i^{-1/(\beta-1)}$ for suitable c and range of i

– power law sequence with average degree d

• ideas:

– high degree nodes closer to source, decreasing degree from left to right

– greens prevent sludge from moving to the highest degree nodes at each time-step

Complex Networks 57

Theorem (BMP,12+): In a random power law DAG a.a.s.

Complex Networks 58

Contrasting the cases

• hard to compare d-regular and power law random DAGs, as the number of vertices and average degree are difficult to control

• consider the first case when there are Cn vertices in the d-regular and power law random DAGs

– many high degree vertices in power law case

– green number higher than in d-regular case

• interpretation: in random power law DAGs, more difficult to disrupt the network

Complex Networks 59

Problems and directions: GEO-P

• Logarithmic Dimension Hypothesis (LDH): the dimension of an OSN is best fit by about log n, where n is the number of users of the OSN

– theoretical evidence: GEO-P and MAG (Leskovec, Kim,12) models

– empirical evidence?

• generalize the GEO-P model to other ranking schemes (such as ranking by age or degree).

• SCP (small community property) in GEO-P?

Complex Networks 60

Problems and directions: Seepage

• other sequences

• empirical analysis on various hierarchical social networks

• vertex pursuit in geometric networks

– G(n,r), SPA, GEO-P

Complex Networks 61

• preprints, reprints, contact: search “Anthony Bonato”

Complex Networks 62

Thank you