ML Fairness

A series of Python labs designed to train the next generation of students to identify, discuss, and address the risks posed by machine learning algorithms.

Are you an instructor who teaches students about machine learning and artificial intelligence? Are you aware that fairness and bias are important issues in AI/ML, but not quite sure how to teach students about these aspects of the technology?

The Daylight Lab at UC Berkeley’s Center for Long-Term Cybersecurity has developed a curriculum for a week-long Algorithmic Fairness Bootcamp that trains students to identify, discuss, and address the risks posed by machine learning algorithms in a variety of contexts. This unique mini-course can be integrated into any existing class on artificial intelligence.

The syllabus consists of two lectures and one interactive lab, plus an optional second lab. The first lab, a Google Colaboratory (or Jupyter) notebook, teaches students how to identify racial bias in a risk-scoring algorithm widely used in healthcare. An optional second lab teaches students how to ameliorate gender bias in a hiring algorithm. We provide programming and non-programming versions of both labs, as well as lecture slides, readings, and ideas for breakout sessions.

Bootcamp Timeline Infographic

The Daylight Lab ML Fairness Bootcamp was excellent. It gave my students a structured overview of common sources of bias, and frameworks for reasoning about fairness in applied machine learning settings. Students especially appreciated the labs, which gave them hands-on experience with the ways in which standard ML algorithms can fail in real-world environments.

Joshua Blumenstock
Associate Professor, School of Information

Lecture 1: Identifying Bias

Key Ideas:

  • Machine learning algorithms can exhibit bias against people whose characteristics have served as the basis for systematically unjust treatment in the past. This bias can emerge for a variety of reasons, and can be so severe as to be illegal.
  • Bias in machine learning algorithms is both a social and a technical problem. There are no technical “fixes,” though technical tools can help us identify bias and reduce its harmfulness.
  • Do NOT remove sensitive data (like race and gender) from the training set. Doing so makes it difficult to know when your algorithm is biased (see the sketch after this list).
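
Concretely, a per-group audit of a model's predictions is only possible if the sensitive attribute is still available at evaluation time, even when it is excluded as a model input. A minimal sketch of such an audit, assuming a pandas DataFrame with hypothetical race, y_true, and y_pred columns (not code from the labs):

```python
import pandas as pd

# Hypothetical evaluation data: model predictions stored alongside the
# sensitive attribute, which is kept for auditing even though it was not
# used as a model input.
results = pd.DataFrame({
    "race":   ["Black", "White", "Black", "White", "Black", "White"],
    "y_true": [1, 1, 1, 0, 1, 0],
    "y_pred": [0, 1, 0, 0, 1, 1],
})

# How often does the model flag each group?
selection_rate = results.groupby("race")["y_pred"].mean()

# Among people who truly needed the intervention, how often does the
# model miss them, by group?
miss_rate = (
    results[results["y_true"] == 1]
    .groupby("race")["y_pred"]
    .apply(lambda preds: (preds == 0).mean())
)

print(selection_rate)
print(miss_rate)  # Dropping the "race" column would make this comparison impossible.
```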

A slide from Lecture 1 of the Algorithmic Fairness Bootcamp on the implications of algorithmic bias in healthcare for Black patients.

Lab 1: Identifying racial bias in a health care algorithm

To manage patients effectively, health systems often need to estimate individual patients’ health risks. Using quantitative measures, or “risk scores,” healthcare providers can prioritize patients and allocate resources to those who need them most. In this lab, students examine an algorithm widely used in industry to assign quantitative risk scores to patients. They discover how this algorithm embeds a bias against Black patients, undervaluing their medical risk relative to White patients.
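
The lab's central comparison can be previewed in a few lines: at the same level of actual health need, do Black and White patients receive similar risk scores? The snippet below is a hedged illustration, not the lab notebook's code; the file path and column names (race, risk_score, num_chronic_conditions) are hypothetical stand-ins for whatever the lab's dataset uses.

```python
import pandas as pd

# Hypothetical patient-level data (the lab notebook uses its own dataset):
#   race                    -- "Black" or "White"
#   risk_score              -- the algorithm's output used to prioritize care
#   num_chronic_conditions  -- a rough proxy for actual health need
df = pd.read_csv("patient_risk_scores.csv")  # placeholder path

# Mean algorithmic risk score at each level of actual health need, by race.
score_by_need = (
    df.groupby(["num_chronic_conditions", "race"])["risk_score"]
      .mean()
      .unstack("race")
)
print(score_by_need)
# An unbiased score would give Black and White patients with the same number
# of chronic conditions roughly equal risk scores; the lab shows that Black
# patients' scores are systematically lower.
```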

Lecture 2: Ameliorating bias

Topics covered:

  • What’s the lingo? (Terminology for discussing bias and its impact)
  • What metrics can describe bias? (Quantitative indicators of bias; see the sketch after this list)
  • How do we make biased algorithms less harmful? (Technical strategies to ameliorate bias)
  • What’s next? (Issues in machine learning beyond fairness and bias)
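
To give a flavor of the metrics discussed, group-level disparities can be computed directly with a library such as fairlearn. The following is a minimal sketch with toy data, not code from the lecture or labs:

```python
import numpy as np
from fairlearn.metrics import (
    MetricFrame,
    selection_rate,
    demographic_parity_difference,
    equalized_odds_difference,
)
from sklearn.metrics import recall_score

# Toy data standing in for a model's test-set predictions.
y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0])
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0])
gender = np.array(["F", "M", "F", "M", "F", "M", "F", "M"])

# Per-group view: how often each group is selected, and how often
# truly-positive members of each group are correctly selected.
frame = MetricFrame(
    metrics={"selection_rate": selection_rate, "recall": recall_score},
    y_true=y_true,
    y_pred=y_pred,
    sensitive_features=gender,
)
print(frame.by_group)

# Single-number summaries of the disparity between groups.
print(demographic_parity_difference(y_true, y_pred, sensitive_features=gender))
print(equalized_odds_difference(y_true, y_pred, sensitive_features=gender))
```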

A slide from Lecture 2 of the Algorithmic Fairness Mini-Bootcamp highlighting more topics to explore about algorithmic bias.

(Optional) Lab 2: Ameliorating gender bias in a hiring algorithm

Various companies (including Amazon) have attempted to use machine learning to automate hiring decisions. However, when these algorithms are trained on past hiring decisions, they are likely to learn human biases: in this case, encoding a pay gap between men and women. From a social and ethical standpoint, we want to remove or minimize this bias so that our models do not perpetuate harmful stereotypes or injustices. In this lab, we take a dataset in which prior hiring decisions have adversely impacted women and show how applying fairness constraints can reduce the harm of that impact. We also prompt students to consider when, whether, and to what extent machine learning ought to be applied to hiring decisions.
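
One way such fairness constraints can be applied (a sketch of the general technique, not necessarily the lab's exact approach) is fairlearn's reductions API, which retrains a base classifier subject to a constraint such as demographic parity. The data below is a synthetic stand-in for the lab's hiring dataset:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from fairlearn.reductions import ExponentiatedGradient, DemographicParity
from fairlearn.metrics import demographic_parity_difference

# Synthetic stand-in for a hiring dataset: features X, hire/no-hire labels y,
# and a sensitive attribute (applicant gender) kept separately for auditing.
rng = np.random.default_rng(0)
n = 500
gender = rng.choice(["F", "M"], size=n)
X = rng.normal(size=(n, 3)) + (gender == "M").reshape(-1, 1) * 0.5
y = (X[:, 0] + 0.5 * (gender == "M") + rng.normal(scale=0.5, size=n) > 0.5).astype(int)

# Unconstrained baseline classifier.
baseline = LogisticRegression().fit(X, y)
print("baseline disparity:",
      demographic_parity_difference(y, baseline.predict(X), sensitive_features=gender))

# Retrain subject to a demographic-parity constraint.
mitigator = ExponentiatedGradient(
    estimator=LogisticRegression(),
    constraints=DemographicParity(),
)
mitigator.fit(X, y, sensitive_features=gender)
print("mitigated disparity:",
      demographic_parity_difference(y, mitigator.predict(X), sensitive_features=gender))
```

The constrained model typically shows a smaller disparity at some cost in raw accuracy, which is exactly the trade-off the lab asks students to weigh.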