HDSI Trust in Science Fellow OpenDP Project
The data about people being analysed for these crises (mobile phone data, transaction information, medical information) must be used and analysed with care. From a privacy perspective, it is key to be able to assure researchers and operators that the data cannot be used to re-identify individuals yet remains statistically useful. Differential privacy is a statistical approach to meeting these goals. Currently, differential privacy is implemented on a per-project, per-data-set basis, an error-prone and potentially dangerous approach that often requires specific expertise from implementers. The OpenDP project, anchored at Harvard’s Institute for Quantitative Social Science, is designed to simplify the end use of differential privacy by providing a software library with bindings to common languages that can be integrated with statistical and machine learning systems.
We seek a graduate or post-doctoral candidate who will develop and integrate differential privacy algorithms into the OpenDP library and collaborate with library users to understand the developer experience and the long-term scope and efficacy of applying the algorithms to static data sets produced daily at the US census tract level. This work will serve as an example for more expansive applications. The candidate will have a statistical and epidemiological background and experience in developing statistical software, with Python or Rust preferred. They will need to be able to collaborate with end users on assessing the privacy protections that the differential privacy algorithms provide in the context of human mobility metrics.
The candidate will work with an interdisciplinary team comprising Professors Gary King, Salil Vadhan, Merce Crosas, Caroline Buckee, Satchit Balsari, and others at Harvard.
- Support the integration of the OpenDP algorithms into the Camber Systems data pipeline, implemented in PySpark running on AWS.
- Integrate the OpenDP algorithms into the Dataverse ecosystem.
- Validate the resulting data sets against other, external data sets (e.g., Facebook Data for Good).
Please direct inquiries to email@example.com