Privacy Amplification for Correlated-Noise Mechanisms via b-Min-Sep Subsampling
YouTube · wwZdLFYnY28
Quick Read
Summary
Takeaways
- ❖Differentially Private Matrix Factorization (DPMF) adds correlated noise across training iterations for better utility, but complicates privacy analysis.
- ❖B-min-sep subsampling enforces a property where examples participate at most once within a 'B' iteration window, simplifying analysis.
- ❖The method uses Poisson subsampling and reconsiders unsampled examples immediately, unlike cyclic Poisson which imposes fixed waiting periods.
- ❖Privacy analysis is efficiently conducted via Monte Carlo accounting and a dynamic program, leveraging the Markovian nature of the subsampling.
- ❖B-min-sep subsampling extends effectively to multi-attribution settings, outperforming other methods in utility for user-level differential privacy.
- ❖Empirical results show B-min-sep subsampling achieves better accuracy and lower error on CIFAR-10 and ArXiv datasets, particularly at higher epsilon values.
Insights
1The Challenge of Privacy Amplification in DPMF with Correlated Noise
Differentially Private Matrix Factorization (DPMF) improves utility by adding Gaussian noise that is correlated across training iterations. However, this correlation breaks the independence assumption fundamental to most DP-SGD privacy amplification proofs, making standard composition analysis difficult. The goal is to find privacy amplification methods suitable for DPMF's correlated noise setting.
The speaker states, 'DPMF and privacy amplification is slightly more challenging to combine them, mainly due to the fact that DPMF adds a correlated noise into the training process, so it's harder to carry out the analysis.' and 'most DP-SGD application proofs, they rely on the fact that we have independence across different training iterations.'
2B-min-sep Subsampling for Correlated Noise Mechanisms
The proposed B-min-sep subsampling method addresses the challenge by performing Poisson subsampling on the training set while excluding any example that participated in the previous B-1 training iterations. This enforces the 'B-min-sep property,' ensuring that each coordinate of the output is affected by at most one participation, simplifying analysis. Unlike cyclic Poisson, it allows immediate reconsideration of unsampled examples in the next iteration, improving utility.
The speaker describes, 'we perform Poisson subsampling on the training set with some probability P, but we exclude any example that have participated in the previous B minus one training iterations.' and explains its advantage: 'if the example is available to be Poisson subsampled and if it's not picked by the Poisson subsample process, then it can be it can be immediately reconsidered in the next training iteration.'
3Efficient Privacy Analysis via Markovian Dynamic Programming
To analyze the privacy guarantees of B-min-sep subsampling with correlated noise, the method uses Monte Carlo accounting. The key innovation is a dynamic program that leverages the Markovian property of the subsampling scheme. This allows for efficient evaluation of the likelihood ratio between distributions (P, with participation; Q, without) by breaking down the exponential number of possible participation patterns, making the analysis feasible.
The speaker notes, 'the bottleneck of this approach is in evaluating the value of Q over P evaluated at Y.' and 'since BM set has something has this Markov chain view, we can we can leverage the Markovian property of of it.' leading to a 'likelihood ratio recursion' formula.
4Practical Extension to Multi-Attribution Settings
B-min-sep subsampling is extended to multi-attribution settings (where users can own multiple, shared examples) by pre-processing data to limit examples per user (K) and excluding examples that share a user with any example sampled in the previous B-1 iterations. This is a significant advancement as existing DPMF subsampling methods (cyclic Poisson, Balls and Bins) are difficult to apply in such complex graph structures without substantial utility loss or analytical challenges.
The speaker states, 'this is also the first practical application for multi-attribution biometric.' and explains the extension: 'we perform Poisson sub-sampling on the data set, but we exclude any examples that shares the user with any example that gets sampled in the previous Bounded Mins F iterations, which would include itself.'
5Empirical Superiority in Utility
Experiments on CIFAR-10 (example-level) and ArXiv (multi-attribution, TinyBERT fine-tuning) datasets demonstrate that B-min-sep subsampling achieves better utility (lower Mean Squared Error and test loss, higher accuracy) compared to DP-SGD with Poisson subsampling and other BandMF methods (cyclic Poisson, Balls and Bins). This advantage is particularly pronounced at higher epsilon values (lower privacy budgets), indicating a more favorable privacy-utility trade-off.
The results tables show that for epsilon values greater than or equal to 2, 'the BM and SAP method is the best' for MSE and accuracy on CIFAR-10. For multi-attribution, 'for epsilon values that are greater than two, our B min step is the best is the best performer' in MSE and achieves 'lower loss' in test loss plots.
Bottom Line
The choice of 'B' (band width/separation) and 'P' (Poisson subsampling probability) are critical parameters that influence both privacy and utility, and their optimal settings may need to be decoupled and dynamically adjusted based on the desired batch size and privacy budget.
Practitioners need sophisticated strategies for parameter tuning. A fixed 'B' might not be optimal across all epsilon values or dataset scales. The current coupling of 'B' to both noise correlation and batch size presents a practical challenge for independent control of these factors.
Develop adaptive algorithms that dynamically select 'B' and 'P' based on real-time training metrics or desired privacy-utility trade-offs. Research into the impact of decoupling these parameters on large-scale datasets could yield further utility improvements.
The Monte Carlo accounting for privacy analysis, while formally sound, can be computationally intensive, requiring parallelization or offline pre-computation for practical use.
Despite its theoretical rigor, the practical application of this analysis method requires significant computational resources. This could be a barrier for researchers or smaller organizations without access to large-scale computing infrastructure.
Explore more efficient, perhaps approximate, methods for privacy accounting that retain strong guarantees but reduce computational overhead. Develop open-source tools or pre-computed tables for common settings to lower the barrier to entry for using these advanced privacy amplification techniques.
Key Concepts
Privacy Amplification through Structured Randomness
The core idea that any source of randomness in the training process, if modeled and hidden from an adversary, can be leveraged to achieve stronger privacy guarantees than a naive analysis. This extends beyond simple data subsampling to include submodel training or dropout layers.
Markov Chain View for Subsampling
Modeling the participation status of an example in the B-min-sep subsampling scheme as a Markov chain with states representing wait times. This allows for efficient calculation of stationary distributions and conditional independence, which is crucial for the dynamic programming approach to privacy accounting.
Lessons
- For DPMF applications, implement B-min-sep subsampling with a 'warm start' variant to ensure stable batch sizes from the beginning of training.
- When analyzing privacy for B-min-sep subsampling, leverage the Markov chain view and dynamic programming for efficient and accurate Monte Carlo accounting, especially for long training sequences.
- Consider B-min-sep subsampling as a primary candidate for user-level differential privacy in multi-attribution settings, as it offers a practical and empirically superior approach compared to existing methods.
Notable Moments
Discussion on the subtle downfall of Cyclic Poisson, highlighting its deterministic cycling and unnecessary blocking of participation, which limits privacy amplification.
This explains why the proposed B-min-sep subsampling offers an advantage by allowing more flexible participation and immediate reconsideration of examples, leading to better utility.
Q&A regarding the coupling of 'B' (bands in correlation matrix) and 'B' (minimum separation for subsampling) and its impact on batch size control.
This exchange reveals a practical challenge for practitioners who might want independent control over batch size and noise correlation, suggesting an area for future research or more flexible algorithm design.
Q&A on the computational cost of Monte Carlo accounting for privacy analysis and its parallelizability.
This clarifies the practical feasibility of the analysis method, indicating that while it can be intensive, it's manageable with sufficient parallel computing resources or offline pre-computation.
Quotes
"DPMF and privacy amplification is slightly more challenging to combine them, mainly due to the fact that DPMF adds a correlated noise into the training process, so it's harder to carry out the analysis."
"The B-min-sep property says that... if each example only participates at most once, then each coordinate of the of the outcomes CX + C is only affected by at most one participation."
"This is also the first practical application for multi-attribution biometric."
Q&A
Recent Questions
Related Episodes

Differentially Private Table-Image Multimodal Data Generation
"This research introduces DP-TabImage, a novel differentially private framework for generating synthetic multimodal data (tables and images) that preserves both individual data fidelity and cross-modal correlations, significantly outperforming existing methods."

Leveraging Per-Instance Privacy for Machine Unlearning
"This research reveals a theoretical and empirical framework for understanding and quantifying the difficulty of machine unlearning for individual data points, showing that unlearning steps scale logarithmically with per-instance privacy loss."

Recursion Is The Next Scaling Law In AI
"This episode explores how recursion, applied at inference time, is emerging as a powerful scaling law in AI, enabling models to achieve advanced reasoning capabilities with significantly fewer parameters than large language models."

How to Build the Future: Demis Hassabis
"DeepMind CEO Demis Hassabis details the missing pieces for Artificial General Intelligence (AGI), the strategic role of smaller AI models, and how AI will transform scientific discovery, urging founders to combine AI with other deep tech."