As algorithms for statistical learning advance, the needs of social science research increasingly conflict with the privacy concerns of individuals in databases. Even when databases are “anonymized,” today’s algorithms can exploit statistical patterns to reveal sensitive information about individuals. As a result, some organizations are dramatically scaling back the extent to which they make data available for social science research.
The goal of this project is to develop new techniques for supporting social science research while maintaining formal privacy guarantees. We will build a system that takes arbitrary code written by a researcher, evaluates its behavior under a bootstrap procedure, and inserts noise to mask information about individuals. Because the code is unrestricted, this setting is incompatible with classical definitions of differential privacy. Instead, we will adopt a statistical lens: estimating the distribution of privacy losses and providing users with guarantees that are asymptotically precise. We will further study how our bootstrap system relates to regularization in machine learning, proving results that connect privacy and generalization error.
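To make the intended mechanism concrete, the following is a minimal sketch, not the project's actual system: it runs an arbitrary analysis function on bootstrap resamples, uses the bootstrap variability to calibrate Gaussian noise, and releases a noised estimate. The function name `bootstrap_private_release`, its parameters, and the choice of Gaussian noise are illustrative assumptions; the proposed system would instead estimate a distribution of privacy losses rather than a single noise scale.

```python
import numpy as np

def bootstrap_private_release(analysis, data, n_boot=200, noise_scale=None, seed=None):
    """Hypothetical sketch of a bootstrap-based noisy release.

    `analysis` is an arbitrary researcher-supplied function mapping a
    dataset (1-D array) to a scalar statistic. The spread of the statistic
    across bootstrap resamples is used to set the noise scale.
    """
    rng = np.random.default_rng(seed)
    data = np.asarray(data)
    n = len(data)

    # Evaluate the researcher's code on bootstrap resamples of the data.
    estimates = np.array([
        analysis(data[rng.integers(0, n, size=n)]) for _ in range(n_boot)
    ])

    # Calibrate noise to the bootstrap variability (an assumption here;
    # the real system would characterize the privacy-loss distribution).
    scale = noise_scale if noise_scale is not None else estimates.std(ddof=1)

    # Release the statistic on the full data, masked with Gaussian noise.
    return analysis(data) + rng.normal(scale=scale)


# Example usage with a simple mean as the "arbitrary" researcher code.
sample = np.random.default_rng(0).normal(loc=1.0, scale=2.0, size=500)
print(bootstrap_private_release(np.mean, sample, seed=1))
```

In this sketch the bootstrap plays two roles that the project aims to formalize: it probes how the researcher's code responds to perturbations of the data, and it supplies the variability estimate used to mask individual contributions.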