Reconstructing Training Data from Model Gradient, Provably (Qi Lei)

Abstract

Understanding when and how much a model gradient leaks information about the training samples is an important question in privacy. In this talk, we present a surprising result: even without training or memorizing the data, we can fully reconstruct the training samples from a single gradient query at a randomly chosen parameter value.

We prove the identifiability of the training data under mild conditions: for shallow and deep neural networks with a wide range of activation functions. We also present a statistically and computationally efficient algorithm, based on low-rank tensor decomposition, to reconstruct the training data. As a provable attack that reveals sensitive training data, our result suggests potentially severe threats to privacy, especially in federated learning.
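To build intuition for why a single gradient query can leak data, here is a minimal toy sketch (not the talk's tensor-decomposition algorithm): for a one-sample logistic-regression loss, the weight gradient is a scalar multiple of the input, so the gradient at a random parameter value already exposes the sample's direction. All variable names below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=5)   # private training sample
y = 1.0                  # its label
w = rng.normal(size=5)   # randomly chosen parameter value

# Gradient of -log sigmoid(y * w.x) w.r.t. w is (sigma(w @ x) - y) * x,
# i.e. a scalar multiple of the private input x.
p = 1.0 / (1.0 + np.exp(-(w @ x)))
grad = (p - y) * x

# An attacker observing only `grad` recovers x up to an unknown scale:
x_hat = grad / np.linalg.norm(grad)
x_dir = x / np.linalg.norm(x)
print(abs(x_hat @ x_dir))  # cosine similarity is 1: direction fully recovered
```

Deep networks lack such a closed form, which is where the identifiability analysis and the tensor-decomposition algorithm of the talk come in.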

Time

2023-05-11, 10:00 - 11:00

Speaker

Qi Lei, Courant Institute of Mathematical Sciences and the Center for Data Science, NYU

Room

Tencent Meeting ID: 360-623-986; PW: 0511