Vitaly Feldman - GitHub Pages · 2020. 8. 30. · Online monitoring 4 𝑥1,1 𝑥2,1 𝑥3,1...
Transcript of Vitaly Feldman - GitHub Pages · 2020. 8. 30. · Online monitoring 4 𝑥1,1 𝑥2,1 𝑥3,1...
Amplification by Shuffling:From Local to Central Differential Privacy via Anonymity
Vitaly Feldman
Ulfar Erlingsson Ilya Mironov Ananth Raghunathan Kunal Talwar Abhradeep Thakurta
Local Differential Privacy (LDP)
For all 𝑖, 𝐴𝑖 is a local 𝜖-DP randomizer:
for all 𝑣, 𝑣′ ∈ 𝑋
[Warner ‘65; EGS ‘03; KLNRS ‘08]
𝐴𝑖(𝑥𝑖 = 𝑣) 𝐴𝑖(𝑥𝑖 = 𝑣′)
Server
𝑥2
𝑥1
𝑥3
𝑥𝑛
𝐴1
𝐴2
𝐴3
𝐴𝑛
Compute (approximately)𝑓(𝑥1, 𝑥2, … , 𝑥𝑛)
Outline
Online monitoring with LDP
3
Benefits of anonymity:
privacy amplification by shuffling
Online monitoring
4
𝑥1,1
𝑥2,1
𝑥3,1
𝑥𝑛,1
𝑥1,3
𝑥2,3
𝑥3,3
𝑥𝑛,3
𝑥1,2
𝑥2,2
𝑥3,2
𝑥𝑛,2
𝑥1,𝑑
𝑥2,𝑑
𝑥3,𝑑
𝑥𝑛,𝑑
Estimate the daily counts 𝑆𝑗 = σ𝑖=1𝑛 𝑥𝑖,𝑗 for all 𝑗 ∈ [𝑑]
𝑥𝑖,𝑗 ∈ {0,1}
Status of user 𝑖 on day 𝑗
Assume that each user’s status
changes at most 𝑘 times
• only for utility
𝑆1 𝑆2 𝑆3 𝑆𝑛
time
Monitoring with LDP
• Report the status changes (only first 𝑘)
• Maintains a tree of counters each over an interval of time
• Based on [DNPR ‘10; CSS ‘11]
5
There exists an 𝜖-LDP algorithm that constructs estimates መ𝑆1, መ𝑆2, … , መ𝑆𝑑 such that with high prob. for all 𝑗 ∈ [𝑑],
𝑆𝑗 − መ𝑆𝑗 = 𝑂𝑛𝑘 (log 𝑑)2
𝜖
Encode-Shuffle-Analyze (ESA) [Bittau et al. ‘17]
6
Server
𝑥2
𝑥1
𝑥3
𝑥𝑛
𝐴1
𝐴2
𝐴3
𝐴𝑛
Shuffle and anonymize
Privacy amplification by shuffling
7
For any 𝜖 = 𝑂(1) and any sequence of 𝜖-LDP algorithms (𝐴1, … , 𝐴𝑛), let
𝐴shuffle 𝑥1, … , 𝑥𝑛 = 𝐴1 𝑥𝜋 1 , 𝐴2 𝑥𝜋 2 , … , 𝐴𝑛 𝑥𝜋 𝑛
for a random and uniform permutation 𝜋: 𝑛 → 𝑛
Then 𝐴shuffle is 𝜖′, 𝛿 -DP in the central model for 𝜖′ = 𝑂𝜖 log 1/𝛿
𝑛
Holds for adaptive case: 𝐴𝑖 may depend on outputs of 𝐴1, … , 𝐴𝑖−1
Comparison with subsampling
Advantages of shuffling:
• does not affect the statistics of the dataset
• does not increase LDP cost
8
Running 𝜖-DP algorithm on random 𝑞-fraction of elements is ≈ 𝑞𝜖-DP (𝜖 ≤ 1) [KLNRS ‘08]
Output 𝐴1 𝑥𝑖1 , 𝐴2 𝑥𝑖2 , … , 𝐴𝑛 𝑥𝑖𝑛
where 𝑖1, 𝑖2, … , 𝑖𝑛 ∼ [𝑛] (independently) is 𝜖′, 𝛿 -DP for 𝜖′ = 𝑂𝜖 log 1/𝛿
𝑛
e.g. [BST ‘14]
Shuffling includes all elements so 𝑞 = 1
Server
𝑥2
𝑥1
𝑥3
𝑥𝑛
𝐴1
𝐴2
𝐴3
𝐴𝑛
Shuffle and anonymize
Implications for ESA
9
Set 𝑆 ⊆ [𝑛] with the
same randomizer
For every 𝑖 ∈ 𝑆, the output is 𝑂𝜖 log 1/𝛿
𝑆, 𝛿 -DP for element at position 𝑖
Output distribution is determined by 𝑚 = #1(RR(𝑥1), … , RR(𝑥𝑛))
𝑚 ∼ Bin 𝑘,2
3+ Bin 𝑛 − 𝑘,
1
3, where 𝑘 = #1(𝑥1, … , 𝑥𝑛)
For a neighboring dataset: 𝑘′ = 𝑘 ± 1
Bin 𝑘,2
3+ Bin 𝑛 − 𝑘,
1
3≈
log 1/𝛿𝑛 ,𝛿
Bin 𝑘 + 1,2
3+ Bin 𝑛 − 𝑘 − 1,
1
3
[DKMMN ‘06]
Special case: binary randomized response
10
RR: For 𝑥 ∈ 0,1 , return 𝑥 flipped with probability 1/3. Satisfies (log 2)-LDP
Also given in [Cheu,Smith,Ullman,Zeber,Zhilyaev ‘18] (independently)
Conclusions
• Monitoring with LDP and log dependence on time
• General privacy amplification technique
o Match state of the art in the central model
o Can be used to derive lower bounds for LDP
• Provable benefits of anonymity for ESA-like architectures
• To appear in SODA 2019
• arxiv.org/abs/1811.12469
11