Abstract
Understanding when, and how much, a model gradient leaks information about a training sample is an important question in privacy. In this talk, we present a surprising result: even without training on or memorizing the data, we can fully reconstruct the training samples from a single gradient query at a randomly chosen parameter value.
We prove the identifiability of the training data under mild conditions, covering both shallow and deep neural networks with a wide range of activation functions. We also present a statistically and computationally efficient algorithm based on low-rank tensor decomposition to reconstruct the training data. As a provable attack that reveals sensitive training data, our findings suggest severe potential threats to privacy, especially in federated learning.
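To give a flavor of why a single gradient can leak a training sample, here is a minimal sketch (not the talk's tensor-decomposition algorithm) of the simplest case: for a linear model with squared loss on one sample, the gradient with respect to the weight matrix is rank one, so the sample's direction can be read off from one gradient query via SVD. All names and dimensions below are illustrative assumptions.

```python
import numpy as np

# Illustrative toy example, not the speaker's method: for f(x) = W x with
# loss 0.5 * ||W x - y||^2 on a single sample (x, y), the gradient
# dL/dW = (W x - y) x^T is a rank-one matrix, so the direction of the
# private sample x is recoverable from one gradient query.
rng = np.random.default_rng(0)
d_out, d_in = 5, 8
W = rng.normal(size=(d_out, d_in))   # randomly chosen parameter point
x = rng.normal(size=d_in)            # the private training sample
y = rng.normal(size=d_out)           # its label

residual = W @ x - y
grad = np.outer(residual, x)         # the single gradient query dL/dW

# The top right-singular vector of the rank-one gradient spans x's direction.
_, _, vt = np.linalg.svd(grad)
x_hat = vt[0]
x_hat *= np.sign(x_hat @ x)          # resolve the sign ambiguity (illustration only)

cosine = float((x_hat @ x) / np.linalg.norm(x))
print(round(cosine, 6))              # cosine similarity with the true sample
```

For deeper networks and multiple samples the gradient is no longer rank one, which is where the low-rank tensor structure exploited in the talk comes in.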
Time
2023-05-11 10:00 - 11:00
Speaker
Qi Lei, Courant Institute of Mathematical Sciences and the Center for Data Science, NYU
Room
Tencent Meeting ID: 360-623-986; Password: 0511