Graph Analysis with Node DifferentialNode Differential PrivacyNode differentially private analysis...

27
Graph Analysis with Node Differential Privacy Node Differential Privacy Sofya Sofya Raskhodnikova Raskhodnikova Sofya Sofya Raskhodnikova Raskhodnikova Penn State University Joint work with Shiva Shiva Kasiviswanathan Kasiviswanathan (GE Research), Kobbi Kobbi Nissim Nissim (Ben-Gurion U. and Harvard U.), Ad S ih Ad S ih (P S ) Adam Smith Adam Smith (Penn State) 1

Transcript of Graph Analysis with Node DifferentialNode Differential PrivacyNode differentially private analysis...

  • Graph Analysis with Node Differential PrivacyNode Differential Privacy

    SofyaSofya RaskhodnikovaRaskhodnikovaSofyaSofya RaskhodnikovaRaskhodnikovaPenn State University

    Joint work with Shiva Shiva KasiviswanathanKasiviswanathan (GE Research),KobbiKobbi NissimNissim (Ben-Gurion U. and Harvard U.),

    Ad S i hAd S i h (P S )Adam Smith Adam Smith (Penn State)

    1

  • Publishing information about graphs

    Many datasets can be represented as graphs• “Friendships” in online social network• “Friendships” in online social network• Financial transactionsE il i ti• Email communication

    • Romantic relationships image source http://community.expressor-software.com/blogs/mtarallo/36-extracting-data-facebook-social-graph-expressor-tutorial.html

    Privacy is a big issue!

    2

    American J. Sociology, Bearman, Moody, Stovel

    big issue!

  • Private analysis of graph dataGraph G

    i )(Government,researchers

    Trustedcurator

    Users

    queries

    answers)( researchers,businesses

    (or) maliciousmaliciousadversary

    T fli i l ili d i• Two conflicting goals: utility and privacy– utility: accurate answers

    – privacy: ?

    3image source http://www.queticointernetmarketing.com/new-amazing-facebook-photo-mapper/

  • Differential privacy for graph dataGraph G

    i )(Government,researchers

    Trustedcurator

    Users

    Aqueries

    answers)( researchers,businesses

    (or) maliciousmaliciousadversary

    • Intuition: neighbors are datasets that differ only in someIntuition: neighbors are datasets  that differ only in some information we’d like to hide (e.g., one person’s data) 

    4image source http://www.queticointernetmarketing.com/new-amazing-facebook-photo-mapper/

  • Two variants of differential privacy for graphs

    • Edge differential privacy

    GG:

    Two graphs are neighbors if they differ in one edge.

    • Node differential privacyG:

    Two graphs are neighbors if one can be obtained from the other by deleting a node and its adjacent edgesby deleting a node and its adjacent edges.

    5

  • Node differentially private analysis of graphsGraph G

    i )(Government,researchers

    Trustedcurator

    Users

    Aqueries

    answers)( researchers,businesses

    (or) malicious

    T fli ti l tilit d i

    maliciousadversary

    • Two conflicting goals: utility and privacy– Impossible to get both in the worst case

    • Previously: no node differentially private• Previously: no node differentially private algorithms that are accurate on realistic graphs

    6image source http://www.queticointernetmarketing.com/new-amazing-facebook-photo-mapper/

  • Our contributions

    • First node differentially private algorithms that are accurate for sparse graphsaccurate for sparse graphs– node differentially private for all graphs

    acc rate for a s bclass f h hi h i l d– accurate for a subclass of graphs, which includes • graphs with sublinear (not necessarily constant) degree bound

    • graphs where the tail of the degree distribution is not too heavygraphs where the tail of the degree distribution is not too heavy

    • dense graphs

    • Techniques for node differentially private algorithmsec ques o ode d e e a y p a e a go s

    • Methodology for analyzing the accuracy of such algorithms on realistic networksalgorithms on realistic networks

    Concurrent work on node privacy [Blocki Blum Datta Sheffet 13]Concurrent work on node privacy [Blocki Blum Datta Sheffet 13]

    7

  • Our contributions: algorithms

    ……

    Frequency

    …Degrees

    8

  • Our contributions: accuracy analysis

    (1+o(1))-approximation

    9

  • Previous work ondifferentially private computations on graphs

    Edge differentially private algorithmsEdge differentially private algorithms• number of triangles, MST cost [Nissim Raskhodnikova Smith 07]• degree distribution [Hay Rastogi Miklau Suciu 09 Hay Li Miklau Jensen 09]• degree distribution [Hay Rastogi Miklau Suciu 09, Hay Li Miklau Jensen 09]• small subgraph counts [Karwa Raskhodnikova Smith Yaroslavtsev 11]• cuts [Blocki Blum Datta Sheffet 12]

    Edge private against Bayesian adversary (weaker privacy)ll b h t [ kl ]• small subgraph counts [Rastogi Hay Miklau Suciu 09]

    Node zero‐knowledge private (stronger privacy)Node zero knowledge private (stronger privacy)• average degree, distances to nearest connected, Eulerian, 

    cycle‐free graphs for dense graphs [Gehrke Lui Pass 12]

    10

  • Differential privacy basicsGraph G

    t ti ti f )(Government,researchers

    Trustedcurator

    Users

    Astatistic f

    approximation )( researchers,businesses

    (or) maliciousto f(G) maliciousadversary

    11

  • Global sensitivity framework [DMNS’06]

    12

  • “Projections” on graphs of small degree

    Goal: privacy for all graphs

    13

  • Method 1: Lipschitz extensions

    14

  • 1 1'1

    s

    1

    3 3' t

    2 2'

    1

    5 5'

    4 4'

    15

  • 1 1'11/

    s

    1

    3 3' t

    2 2'

    11/

    5 5'

    4 4'

    16

  • 1 1'1

    s

    1

    3 3' t

    2 2'

    1

    5 5'

    4 4'

    6'6 6'6

    17

  • via Smooth Sensitivity framework [NRS’07] via Smooth Sensitivity framework [NRS 07]

    18

  • Our results

    via Lipschitzextensions

    }  via generic reduction

    19

  • Conclusions

    • It is possible to design node differentially private  algorithms with good utility on sparse graphswith good utility on sparse graphs– One can first test whether the graph is sparse privately

    • Directions for future work– Node‐private synthetic graphs

    – What are the right notions of privacy for network data?

    20

  • Lipschitz extensions via linear/convex programs

    21

  • Our results

    ……

    Frequency

    … …Degrees…

    22

  • Our results

    ……

    (1+o(1))-approximation(1 o(1)) approximation

    23

  • T query fT(G)G A

    24S

  • Generic Reduction via TruncationFrequency

    … …d Degrees

    25

  • Smooth Sensitivity of Truncation

    #(nodes of degree above d)

    26

  • Releasing Degree Distribution via Generic Reduction

    T query fT(G)G A

    T q y fT(G)

    S

    27