
"Differential privacy is a mathematical technique designed to publish statistical information derived from a dataset without leaking information about individual samples contained in it. This is typically achieved by injecting calibrated noise into the training data in such a way that its overall statistical properties are preserved while making it more difficult to infer details about specific samples."
"In the context of a large language model, this approach ensures that the model outputs are statistically indistinguishable from those of a model trained on a dataset that excludes any given individual sample from the original dataset. This, in turn, implies that adversaries cannot infer with confidence whether a particular sample was part of the training set based on the model's outputs."
"While differential privacy provides a rigorous, quantifiable privacy guarantee, it does at a cost, as the added noise can reduce model accuracy and makes training more computationally expensive. Google's research leading to VaultGemma has in fact focused especially on this balance and attempted to identify scaling laws for DP models, or in other words define what is the optimal training configuration to achieve the lowest performance loss for a given privacy guarantee and compute budget."
VaultGemma is a 1B-parameter, Gemma 2-based large language model trained from scratch with differential privacy (DP) to limit memorization of training data. Differential privacy injects calibrated noise into training to preserve overall statistics while protecting individual samples. For DP to be effective, the injected noise must outweigh the intrinsic randomness of the data, which increases the required batch sizes and compute costs. DP training makes a model's outputs statistically indistinguishable from those of a model trained without any single sample, which prevents confident membership inference. The privacy guarantee can reduce accuracy and raises training expense. Google's research identified scaling laws for choosing the training configuration that minimizes performance loss under a given privacy and compute budget.
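To make "injecting calibrated noise into training" concrete, here is a minimal sketch of one noisy gradient update in the style of DP-SGD: each example's gradient is clipped to a fixed norm, the clipped gradients are summed, and Gaussian noise scaled to the clipping norm is added before averaging. The summary above does not describe the exact algorithm, so this is an illustration under that assumption. The function names (`clip`, `dp_sgd_step`), the parameters (`max_norm`, `noise_multiplier`), and the toy gradients are all hypothetical.

```python
import math
import random

def clip(grad, max_norm):
    """Scale a gradient vector so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]

def dp_sgd_step(weights, per_example_grads, lr, max_norm, noise_multiplier, rng):
    """One noisy update: clip each example's gradient, sum, add noise, average."""
    clipped = [clip(g, max_norm) for g in per_example_grads]
    batch_size = len(clipped)
    # Noise is calibrated to the clipping norm, i.e. to the largest
    # contribution any single example can make to the summed gradient.
    sigma = noise_multiplier * max_norm
    noisy_mean = [
        (sum(g[i] for g in clipped) + rng.gauss(0.0, sigma)) / batch_size
        for i in range(len(weights))
    ]
    return [w - lr * g for w, g in zip(weights, noisy_mean)]

# Toy usage with made-up gradients for a two-parameter model.
rng = random.Random(0)
weights = [0.0, 0.0]
grads = [[3.0, 4.0], [0.3, 0.4], [-6.0, 8.0]]
new_weights = dp_sgd_step(
    weights, grads, lr=0.1, max_norm=1.0, noise_multiplier=1.1, rng=rng
)
```

In this sketch the noise is added once to the summed gradient, so its standard deviation on the averaged update is `sigma / batch_size`. That is one way to see why larger batches help offset the noise, at the price of more compute per step.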
Read at InfoQ