News

Privacy Goes Public With New Database

Differential privacy repository based on Harvard research helps companies protect sensitive data

Key Takeaways

  • Harvard researchers have launched the Differential Privacy Deployments Registry, a public database that catalogs real-world uses of differential privacy by companies and agencies to better protect individuals’ data.
  • Developed from Harvard-originated theory and shaped by a 2025 study, the registry is designed as an interactive resource for practitioners, with potential to support policymakers and the public in learning how differential privacy protects sensitive data.

When Apple identifies trending emojis, or when Google reports how busy a restaurant is, they’re analyzing large datasets built from the activity of individual people. Those people’s personal information is systematically protected thanks in large part to research by Harvard computer scientists.

Now, after two decades of work on the cryptography-adjacent mathematical framework known as differential privacy, researchers in the Harvard John A. Paulson School of Engineering and Applied Sciences (SEAS) have reached a key milestone in moving privacy best practices from academia into real-world applications.

A team led by Salil Vadhan, the Vicky Joseph Professor of Computer Science and Mathematics at SEAS, has launched the Differential Privacy Deployments Registry, a collaborative database of companies and agencies that actively use the rigorous data-protection framework, which first entered the academic literature in 2006. The framework has since grown increasingly popular among large companies and organizations that handle sensitive information. The new database should enable further adoption and refinement.

Elena Ghazi, Salil Vadhan, and Priyanka Nanayakkara.

“There’s real societal value that differential privacy has the potential to provide, but only if we can make it easy and effective enough for people to adopt,” said Vadhan, who, in 2019, co-founded the community project OpenDP, which develops open-source tools for deploying differential privacy. OpenDP emerged from a preceding National Science Foundation-supported research initiative at Harvard called the Privacy Tools Project and is led by Vadhan and Gary King, Albert J. Weatherhead III University Professor at Harvard.

The 2006 paper that laid out the foundational theory behind differential privacy was written by Cynthia Dwork, Gordon McKay Professor of Computer Science at SEAS, as first author, in collaboration with Frank McSherry, Kobbi Nissim and Adam Smith. Dwork was recently awarded the National Medal of Science for her research in cryptography and privacy.

Since that time, the theoretical framework has moved into diverse real-world applications, springboarded by the U.S. government’s high-profile deployment of the technology on U.S. Census Bureau data in 2020. Thanks to the protections afforded by differentially private algorithms, survey-takers who provided personal information to the government enjoyed an extra guarantee of privacy.

The National Institute of Standards and Technology, a government agency that plays a central role in developing guidelines for information security and privacy technology across the United States, has proposed hosting the new public registry, with a final decision pending. 

A resource for the DP community

Billed as a resource hub for the differential privacy community to support broader understanding and communication across sectors, the new database should not only help bring in new users of differential privacy but also help legal and policy teams better understand existing uses. Deployments currently in the database come from large companies like Apple and Microsoft as well as government agencies like the National Statistics Office of Korea, all of which have self-reported their differential privacy deployments.

Key insights into how to design the registry came from a 2025 research study led by Priyanka Nanayakkara, a postdoctoral researcher in Vadhan’s lab, who joined Harvard in 2024 with plans to develop the registry. The research has been accepted for publication by the IEEE Symposium on Security and Privacy. Together, Nanayakkara, Ph.D. student Elena Ghazi, and Vadhan developed a research prototype of the registry and conducted a user study with practitioners to learn how they might use the registry in their work. 

During the research process, they worked with collaborators on the OpenDP team and at Oblivious, an Ireland-based data privacy company, to incorporate their findings into a live version of the registry that Oblivious had started a year earlier.

“We said, ‘How can we build the registry concept out into an interactive interface so that it’s usable by practitioners? Longer term, it would be great to further develop the registry to be usable by policymakers and data subjects – for example, if you are contributing your personal data for model training for analysis, wouldn’t it be great to be able to use the registry to see how your data has been protected?’” Nanayakkara said.

Mathematically rigorous privacy guarantee

Differential privacy is a mathematical definition of privacy. Rather than a particular set of algorithms or equations, it is a standard that an analysis meets when its published output is constructed so that information about any one individual cannot be extracted from it, whether deliberately or by accident.

For example, if a medical database were used for a statistical analysis or to train a machine learning model, the analysis would be differentially private only if individual information would be difficult to retrieve from the published results. This standard is met by adding random statistical “noise” during computations on the data. The noise is carefully calibrated and is drawn by algorithms from specific probability distributions.
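To make the idea of calibrated noise concrete, here is a minimal sketch of one textbook approach, the Laplace mechanism, applied to a simple count. It is an illustration under stated assumptions, not the method used by any deployment in the registry: the function name `noisy_count`, the sample records, and the choice of a counting query are all hypothetical, and the privacy parameter is the one conventionally called epsilon. Plain floating-point noise like this is known to be unsafe for production use, which is one reason vetted libraries exist.

```python
import numpy as np


def noisy_count(records, predicate, epsilon, rng=None):
    """Return a count of matching records with Laplace noise added.

    Assumption: adding or removing one person's record changes the
    true count by at most 1, so the sensitivity of the query is 1.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = np.random.default_rng() if rng is None else rng

    # Exact answer, which is never released directly.
    true_count = sum(1 for record in records if predicate(record))

    # Noise scale = sensitivity / epsilon: a smaller epsilon means
    # more noise and stronger privacy, at the cost of accuracy.
    sensitivity = 1.0
    noise = rng.laplace(loc=0.0, scale=sensitivity / epsilon)
    return true_count + noise


if __name__ == "__main__":
    # Hypothetical toy data: ages of patients in a small study.
    ages = [34, 51, 29, 62, 47, 58, 41]
    released = noisy_count(ages, lambda age: age >= 50, epsilon=0.5)
    print(f"Noisy count of patients aged 50 or older: {released:.2f}")
```

Each run prints a different value near the true count of 3. How far it strays depends on epsilon, which is why choosing that parameter is a central decision in any deployment.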

The idea for a public-facing deployment registry originated in a 2018 paper by Dwork and colleagues. Computer scientists label the critical parameter that must be set when using differential privacy as “epsilon,” so the paper called the envisioned database an “epsilon registry.”
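For readers who want to see where epsilon appears, the standard definition of differential privacy can be written as follows. The symbols are notation introduced here for illustration and do not come from the article: M is the randomized analysis, D and D' are any two datasets that differ in one individual's record, and S is any set of possible outputs.

```latex
\Pr[\,M(D) \in S\,] \;\le\; e^{\varepsilon} \cdot \Pr[\,M(D') \in S\,]
```

In words, including or excluding any one person's data can change the probability of any outcome by at most a factor of e^ε. A smaller epsilon gives a tighter bound and therefore stronger privacy, usually at some cost in accuracy.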

Dwork, who has been giving talks on differential privacy for 20 years, said that the choice to implement the technology is always a policy decision, not a technical one – “yet still, every time, the first question from a general audience is, ‘How should we choose epsilon?’”

Thus, she is “thrilled” with the establishment of the registry and “in awe” of Vadhan’s leadership in building and sustaining the OpenDP community. “The collective wisdom of the community in balancing the feasible and the tolerable will aid future practice, not just in choosing epsilon but in myriad other decisions and strategies needed for the deployment of differential privacy in different settings and with different goals,” Dwork said.

While it remains to be seen how the new registry will change the differential privacy landscape, initial findings from the Harvard user study are promising: many practitioners saw potential for the registry to become a much-needed hub for the community, helping to develop best practices and inform future deployments.

Topics: Computer Science, Industry, Research

Press Contact

Anne J. Manning | amanning@seas.harvard.edu