On the Precision of Social and Information Networksstanford.edu/~rezab/papers/precision_slides.pdfOn...

33
Kamesh Munagala (Duke) Reza Bosagh Zadeh (Stanford) Ashish Goel (Stanford) Aneesh Sharma(Twitter, Inc.) On the Precision of Social and Information Networks

Transcript of On the Precision of Social and Information Networksstanford.edu/~rezab/papers/precision_slides.pdfOn...

Page 1: On the Precision of Social and Information Networksstanford.edu/~rezab/papers/precision_slides.pdfOn the Precision of Social and Information Networks . Information Networks ! Social

Kamesh Munagala (Duke)

Reza Bosagh Zadeh (Stanford) Ashish Goel (Stanford)

Aneesh Sharma(Twitter, Inc.)

On the Precision of Social and Information Networks

Page 2: On the Precision of Social and Information Networksstanford.edu/~rezab/papers/precision_slides.pdfOn the Precision of Social and Information Networks . Information Networks ! Social

Information Networks � Social Networks play an important

role in information dissemination �  Emergency events, product launches, sports

updates, celebrity news,…

� Their effectiveness as information dissemination mechanisms is a source of their popularity

Page 3: On the Precision of Social and Information Networksstanford.edu/~rezab/papers/precision_slides.pdfOn the Precision of Social and Information Networks . Information Networks ! Social

A Fundamental Tension Two conflicting characteristics in social networks:

� Diversity: Users are interested in diverse content � Broadcast: Users disseminate information via posts/

tweets – these are blunt broadcast mechanisms!

Page 4: On the Precision of Social and Information Networksstanford.edu/~rezab/papers/precision_slides.pdfOn the Precision of Social and Information Networks . Information Networks ! Social

Running Example

Adam interested in •  Apple •  Rap music •  Lakers

Bob tweets about: •  Christianity •  DC Politics •  Bulls

Charlie tweets about: •  Jay-Z •  Lady Gaga •  Kobe

Page 5: On the Precision of Social and Information Networksstanford.edu/~rezab/papers/precision_slides.pdfOn the Precision of Social and Information Networks . Information Networks ! Social

Running Example

Adam interested in •  Apple •  Rap music •  Lakers

Bob tweets about: •  Christianity •  DC Politics •  Bulls

Charlie tweets about: •  Jay-Z •  Lady Gaga •  Kobe

Follow Follow

Page 6: On the Precision of Social and Information Networksstanford.edu/~rezab/papers/precision_slides.pdfOn the Precision of Social and Information Networks . Information Networks ! Social

A Fundamental Tension Two conflicting characteristics in social networks:

� Diversity: Users are interested in diverse content � Broadcast: Users disseminate information via posts/

tweets – these are blunt broadcast mechanisms!

Precision: Do users receive a lot of un-interesting content?

Recall: Do users miss a lot of potentially interesting content?

Page 7: On the Precision of Social and Information Networksstanford.edu/~rezab/papers/precision_slides.pdfOn the Precision of Social and Information Networks . Information Networks ! Social

Question we study

Can information networks have high precision and recall?

Page 8: On the Precision of Social and Information Networksstanford.edu/~rezab/papers/precision_slides.pdfOn the Precision of Social and Information Networks . Information Networks ! Social

Case Study: Twitter �  A random tweet is uninteresting to a random user…

�  … but users have interests and follow others based on these

Information networks like Twitter are constructed according to users’ interests!

Page 9: On the Precision of Social and Information Networksstanford.edu/~rezab/papers/precision_slides.pdfOn the Precision of Social and Information Networks . Information Networks ! Social

Revisiting our example…

Adam interested in •  Apple •  Rap music •  Lakers

Bob tweets about: •  Christianity •  DC Politics •  Bulls

Charlie tweets about: •  Jay-Z •  Lady Gaga •  Kobe

Follow

Page 10: On the Precision of Social and Information Networksstanford.edu/~rezab/papers/precision_slides.pdfOn the Precision of Social and Information Networks . Information Networks ! Social

Small User Study on Twitter

!"

!#$"

!#%"

!#&"

!#'"

!#("

!#)"

!#*"

!#+"

!#,"

$"

$" %" &" '" (" )" *" +" ," $!"

!"#$%&%'()

*&#")(+,-#")

*&#"."/0#1)2"#$%&%'()'3)04##0&)

-./01.20"

34256/"

Page 11: On the Precision of Social and Information Networksstanford.edu/~rezab/papers/precision_slides.pdfOn the Precision of Social and Information Networks . Information Networks ! Social

Roadmap �  Assumptions:

1.  Users have immutable interests (independent of the network) 2.  Choose to connect to other users based on their interests 3.  Step (2) is optimized for precision and recall

Page 12: On the Precision of Social and Information Networksstanford.edu/~rezab/papers/precision_slides.pdfOn the Precision of Social and Information Networks . Information Networks ! Social

Roadmap �  Assumptions:

1.  Users have immutable interests (independent of the network) 2.  Choose to connect to other users based on their interests 3.  Step (2) is optimized for precision and recall

�  Question 1: What conditions on the structure of user interests are necessary for high precision and recall, and small dissemination time?

�  Question 2: Can we empirically validate these conditions as well as the conclusion on Twitter?

Page 13: On the Precision of Social and Information Networksstanford.edu/~rezab/papers/precision_slides.pdfOn the Precision of Social and Information Networks . Information Networks ! Social

User-Interest Model �  Set of interests I; Set of users U

�  Each interest i is associated with two sets of users: � Producers P(i) = Users who tweet about i � Consumers C(i) = Users who are interested in i

� Denote the mapping from users to interests as Q(I,U)

�  Assume: P(i) µ C(i) for all interests i

Page 14: On the Precision of Social and Information Networksstanford.edu/~rezab/papers/precision_slides.pdfOn the Precision of Social and Information Networks . Information Networks ! Social

Example Revisited

P(a) = {q} C(a) = {q, r, s}

P(b) = {s} C(b) = {r, s, t}

P(c) = {q, t} C(c) = {q, r, s, t}

User a

User b User c

Page 15: On the Precision of Social and Information Networksstanford.edu/~rezab/papers/precision_slides.pdfOn the Precision of Social and Information Networks . Information Networks ! Social

Social (user-user) Graph G(U,E)

Social graph

P(a) = {q} C(a) = {q, r, s}

P(b) = {s} C(b) = {r, s, t}

P(c) = {q, t} C(c) = {q, r, s, t}

User a

User b User c

User a receives interests R(a) = {q, t}

Page 16: On the Precision of Social and Information Networksstanford.edu/~rezab/papers/precision_slides.pdfOn the Precision of Social and Information Networks . Information Networks ! Social

PR Score PR(u) = Precision and recall score for user u

- Function of user-interest map Q(I,U) and social graph G(U,E) PR(u) =

R(u)!C(u)R(u)"C(u)

Interests u receives from its followees

The consumption interests of u

Page 17: On the Precision of Social and Information Networksstanford.edu/~rezab/papers/precision_slides.pdfOn the Precision of Social and Information Networks . Information Networks ! Social

Example

Social graph

P(a) = {q} C(a) = {q, r, s}

P(b) = {s, t} C(b) = {r, s, t}

P(c) = {q, t} C(c) = {q, s, t}

User a

User b User c

R(a) = {q, t} C(a) = {q, r, s} PR(a) = ¼ = 0.25

Page 18: On the Precision of Social and Information Networksstanford.edu/~rezab/papers/precision_slides.pdfOn the Precision of Social and Information Networks . Information Networks ! Social

Improved Score

Social graph

P(a) = {q} C(a) = {q, r, s}

P(b) = {s, t} C(b) = {r, s, t}

P(c) = {q, t} C(c) = {q, s, t}

User a

User b User c

R(a) = {q, s, t} C(a) = {q, r, s} PR(a) = 2/4 = 0.5

Page 19: On the Precision of Social and Information Networksstanford.edu/~rezab/papers/precision_slides.pdfOn the Precision of Social and Information Networks . Information Networks ! Social

α-PR User-Interest Maps Q(I,U)

A user-interest map Q(I,U) is α-PR if: There exists a social graph G(U,E) s.t.

all users u have PR-Score ≥α

Special case: 1-PR means that R(u) = C(u) for all users u

Page 20: On the Precision of Social and Information Networksstanford.edu/~rezab/papers/precision_slides.pdfOn the Precision of Social and Information Networks . Information Networks ! Social

Necessary Conditions for 1-PR �  Condition 1:

If Q(I,U) is “non-trivial” and G(U,E) is (strongly) connected: Then P(i) ½ C(i) for some interest i

�  Informal implication: Users have broader consumption interests and narrower

production interests

Page 21: On the Precision of Social and Information Networksstanford.edu/~rezab/papers/precision_slides.pdfOn the Precision of Social and Information Networks . Information Networks ! Social

Experimental Setup � Classify text of tweets using 48 topics

� Yields “topic distribution” for each user �  Entropy of distribution lies between 0 and log2(48) = 3.87

�  P(u)= Interest distribution in tweets produced by u �  C(u) = Interest distribution in URL clicks made by u

Page 22: On the Precision of Social and Information Networksstanford.edu/~rezab/papers/precision_slides.pdfOn the Precision of Social and Information Networks . Information Networks ! Social

Verifying Condition 1

TYPE OF INTEREST DISTRIBUTION

AVERAGE SUPPORT

AVERAGE ENTROPY

Consumption 7.78 2.00

Production 3.96 1.24

Page 23: On the Precision of Social and Information Networksstanford.edu/~rezab/papers/precision_slides.pdfOn the Precision of Social and Information Networks . Information Networks ! Social

Can Interests be chosen at Random? Different interests can have different “participation levels” Theorem: If users choose P production and C consumption interests at random preserving participation levels of the interests then:

With high probability the interest structure is not α-PR for any constant α

Technically needs:

�  n = |U| and |I| = m > n1/2 �  P = logδn forδ > 2 and C < n1/3 �  Bounded second moment of participation level distribution

Key proof idea: Q(I,U) behaves like an expander graph

Page 24: On the Precision of Social and Information Networksstanford.edu/~rezab/papers/precision_slides.pdfOn the Precision of Social and Information Networks . Information Networks ! Social

● ●

●●●

●●

● ●

●●

● ●

●●

1

2

3

4

5

6

7

8

9

10 11

12

13

14

15

161718

19

20

21 22

23

24

25

26

27

28

29

30

31 32

3334

35

36

37

38

39

40

41 42

4344

45

46

47

Condition 2: Interests have Clustered Structure

Share more users than is predicted by a random assortment

Sports/Games

Technology

Page 25: On the Precision of Social and Information Networksstanford.edu/~rezab/papers/precision_slides.pdfOn the Precision of Social and Information Networks . Information Networks ! Social

Interest Structure achieving 1-PR Kronecker graph model

User u

Kobe Gaga Obama

Attributes/Dimensions

Y N M N

d = O(log n) dimensions K = O(log n) values Lakers

Page 26: On the Precision of Social and Information Networksstanford.edu/~rezab/papers/precision_slides.pdfOn the Precision of Social and Information Networks . Information Networks ! Social

Interest Structure achieving 1-PR Kronecker graph model

User u

Kobe Gaga Obama

Attributes/Dimensions

Y N M N

d = O(log n) dimensions K = O(log n) values

Similarity graph on values Y M N

Y 1 1 0

M 1 1 1

N 0 1 1 Y M N

Lakers

Page 27: On the Precision of Social and Information Networksstanford.edu/~rezab/papers/precision_slides.pdfOn the Precision of Social and Information Networks . Information Networks ! Social

Interest Structure

Interest i

Kobe Gaga Obama

Attributes/Dimensions

Y M

Lakers

Similarity graph on values

Y M N

Y * M *

M * N *

Producer

Consumer

N * M * Not interested

Agrees exactly on all relevant dimensions

Set of relevant dimensions + their values

Similar on all relevant dimensions

Page 28: On the Precision of Social and Information Networksstanford.edu/~rezab/papers/precision_slides.pdfOn the Precision of Social and Information Networks . Information Networks ! Social

User-user Graph [Leskovec, Chakrabarti, Kleinberg, Faloutsos, Ghahramani ‘10]

User a

Kobe Gaga Obama

Attributes/Dimensions

Y M

Lakers Similarity graph on values

Y M N

Y M N Y

M N M M

User b

User c

Y M

Edge

Edge

Undirected Edge between two users iff ALL dimensions are similar

Such graphs can have: •  Super-constant average degree •  Heavy tailed degree distributions •  Constant diameter

Page 29: On the Precision of Social and Information Networksstanford.edu/~rezab/papers/precision_slides.pdfOn the Precision of Social and Information Networks . Information Networks ! Social

Main Positive Result �  The Kronecker interest structure has 100% PR!

�  Users only receive interesting information

�  Users receive all information they are interested in

�  The dissemination time is constant.

Page 30: On the Precision of Social and Information Networksstanford.edu/~rezab/papers/precision_slides.pdfOn the Precision of Social and Information Networks . Information Networks ! Social

Empirical Study of Precision

Precision(u) =C(u)!P(v)

(u,v)"E#

P(v)(u,v)"E#

0.0 0.2 0.4 0.6 0.8 1.0Precision per user

0

100

200

300

400

500

600

Freq

uenc

y

Precision distribution

Median precision = 40% Baseline precision = 17%

Interpretation: One in 2.5 interests received on any follow edge are interesting

Page 31: On the Precision of Social and Information Networksstanford.edu/~rezab/papers/precision_slides.pdfOn the Precision of Social and Information Networks . Information Networks ! Social

Caveat: This is only a first step!

� Measuring interests � Use URL clicks as a measure of consumption/relevance � Use 48 topics as proxy for interests � Not considered quality of tweets in measuring interest � Not explored structure of interests in great detail

� Empirical validation � User studies are more reliable, but our study is small � We have not measured recall or dissemination time

Page 32: On the Precision of Social and Information Networksstanford.edu/~rezab/papers/precision_slides.pdfOn the Precision of Social and Information Networks . Information Networks ! Social

Open Questions � Better empirical measures of interests and PR?

�  In-depth analysis of structure of interests � How can recall be measured?

� Can high PR information networks arise in a decentralized fashion? � How can users discover high PR links?

Page 33: On the Precision of Social and Information Networksstanford.edu/~rezab/papers/precision_slides.pdfOn the Precision of Social and Information Networks . Information Networks ! Social