Schedule
This class is organized as a sequence of topics. For each topic, we first cover basic concepts in an instructor-driven lecture, after which we read papers on the topic and discuss them in class. Below is the approximate class schedule, along with major dates for homework deadlines, midterm, and project milestones.(*) While those dates are set in stone, the specific lecture topics and papers that will be covered on each date may change up until the week before.
Topic 0: Course introduction
01/18 Lecture
Topic 1: Privacy risks and attacks
01/18 Lecture:
- General privacy concerns in big data
- Running example: privacy risks in modern ML ecosystems
- Privacy attacks against anonymization, aggregates, models
- Demos of privacy attacks
01/25 Reading & discussion
01/25 HW1 assigned (due 02/08)
Topic 2: Differential privacy (DP)
02/01 Lecture:
- Defining privacy in statistical data analysis
- DP definitions and interpretations
- DP parameters and semantics
- DP mechanisms
- Composite DP algorithms (DP LR, SGD, etc.)
- Code overviews, demos
02/08 Reading & discussion
02/08 HW1 due
02/08 HW2 assigned (due 02/22)
Topic 3: DP deployments and systems
02/15 Reading & discussion
- Abowd, Ashmead, Garfinkel, et al., Harvard Data Science Review 2022. The 2020 Census Disclosure Avoidance System TopDown Algorithm.
- Adeleye, Berghel, Desfontaines, et al., PEPR 2022. Publishing Wikipedia usage data with strong privacy guarantees. See also the associated blog post.
- SHORT – Aktay, Bavadekar, Cossoul, et al., ArXiv 2020. Google COVID-19 Community Mobility Reports: Anonymization Process Description. See also the associated blog post.
02/22 Reading & discussion
- Rogers, Subramaniam, Peng, et al., JPC 2021. LinkedIn’s Audience Engagements API: a Privacy Preserving Data Analytics System at Scale.
- Luo, Pan, Tholoniat, Cidon, Geambasu, and Lecuyer, OSDI 2021. Privacy Budget Scheduling.
- SHORT – Berghel, Bohannon, Desfontaines, et al., TPDP 2023. Tumult Analytics: a robust, easy-to-use, scalable, and expressive framework for differential privacy.
02/22 HW2 due
Topic 4: Homomorphic encryption (HE)
02/29 Lecture:
- Limitations of traditional encryption for data exposure risks in (ML) clouds
- Homomorphic encryption and example schemes
- Example system: homomorphic databases
- Code overviews, demos
02/29 HW3 assigned
03/07 Reading & discussion
03/21 Midterm & project launch
03/21 HW3 due
Topic 5: Secure multiparty computation (MPC)
03/28 Lecture:
- Problem settings
- Secure multiparty computation
- Federated learning
- MPC protocols
- Existing systems
03/28 Project status update: team, selected project
04/04 Reading & discussion
04/04 Project status update: three-step execution plan
Topic 6: Private web advertising
04/11 Reading & discussion
04/11 Project status update: step 1 report
Topic 7: Compositions and tensions of privacy technologies
04/18 Lecture:
- Connections and tradeoffs of advanced privacy technologies
- Composing them in systems to address privacy risks in modern ML ecosystems
04/18 Reading & discussion
04/18 Project status update: step 2 report
Topic 8: Beyond technology: legal and policy perspectives of privacy
04/25 Invited lecture
04/25 Project status update: step 3 report
Final project presentations
Final project presentations will be delivered on the final exam day as scheduled by the Registrar, though the timing is slightly modified.
(*) We are distributing the papers from our own repository to avoid problems with unstable Internet links. The originals can be easily found by searching for the papers’ titles or authors.