Humans are flawed decision-makers. Decades of research show that we’re bad with numbers and easily influenced by cognitive and social biases that operate beneath our awareness. Racism still creeps into human decisions about hiring, housing, and credit.
As a result, many of us are tempted to think we’re better off having machines make inferences directly from data. But that turns out to be a dangerous assumption, as we explore in this episode of Glad You Asked.
The many subjective choices that data scientists make as they select and structure training data can increase or decrease racial bias in machine learning systems. After all, engineers are humans with biases and blind spots like everyone else.
For other data sets, historical discrimination and systematic inequality will color the data no matter how diligent the collection process. Data on crime, for instance, is ultimately derived from the choices law enforcement officers make about which neighborhoods to patrol and whom to arrest. All AI writing systems are trained on human writing samples, which reflect the perspectives of the dominant writing groups.
But even if we assume the data is unbiased, machine learning systems are designed to minimize error rates, and they’ll do that without any regard to fairness (unless programmed otherwise). That means they tend to “care” more about accuracy on majority (literally larger) groups than on minorities, to the extent that the link between a predictive attribute and the outcome differs from one group to the other.
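To make that concrete, here is a minimal toy sketch in Python. Everything in it is invented for illustration: the group sizes (90 and 10), the single feature `x`, the per-group cutoffs, and the helper names `make_group` and `error_count`. It is not taken from any real system. The "model" is just one cutoff on `x`, chosen to minimize the total number of mistakes across everyone.

```python
# Toy illustration only: all numbers and names here are made-up assumptions.

def make_group(n, true_cutoff):
    """Return n evenly spaced feature values with labels.

    The true label is 1 when x exceeds this group's own cutoff.
    """
    xs = [(i + 0.5) / n for i in range(n)]
    return [(x, int(x > true_cutoff)) for x in xs]


def error_count(data, cutoff):
    """Count the examples that the rule 'predict 1 if x > cutoff' gets wrong."""
    return sum(int(x > cutoff) != y for x, y in data)


# The same feature predicts the outcome differently in each group.
majority = make_group(90, true_cutoff=0.5)  # the larger group
minority = make_group(10, true_cutoff=0.8)  # the smaller group
everyone = majority + minority

# "Training": pick the single cutoff with the fewest total errors.
candidates = [k / 100 for k in range(101)]
best_cutoff = min(candidates, key=lambda c: error_count(everyone, c))

overall_error = error_count(everyone, best_cutoff) / len(everyone)
majority_error = error_count(majority, best_cutoff) / len(majority)
minority_error = error_count(minority, best_cutoff) / len(minority)

print(f"chosen cutoff:  {best_cutoff:.2f}")
print(f"overall error:  {overall_error:.0%}")
print(f"majority error: {majority_error:.0%}")
print(f"minority error: {minority_error:.0%}")
```

In this toy setup, the cutoff that minimizes total error lands on the majority group’s pattern. The overall error rate looks small, yet nearly all of the mistakes fall on the smaller group. Real systems are far more complex, but the same arithmetic applies whenever a single overall error rate is the only thing being optimized.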
You’ve surely heard that correlation isn’t causation. Well, AI systems are correlation machines. They can be accurate for the wrong reasons. So it’s one thing to use them to improve weather forecasts and search results; it’s quite another to deploy them in decisions about the lives of individual human beings.
None of this is to say that these problems can’t be overcome. Some predictive models may be less biased than the status quo decision-maker. But the assumption that these systems are necessarily more objective is clearly wrong. And since AI systems are both opaque and highly scalable, they demand a level of scrutiny that we haven’t even begun to ensure.
“Scrutinizing Saliency Based Image Cropping” by Vinay Prabhu
“Model Cards for Model Reporting,” arXiv
Race After Technology by Ruha Benjamin
Weapons of Math Destruction by Cathy O’Neil
The Ethical Algorithm by Michael Kearns and Aaron Roth
“Why Algorithms Can Be Racist and Sexist,” Recode
“How Big Data Is Unfair” by Moritz Hardt
“The Problem With ‘Biased Data’” by Harini Suresh
“Big Data’s Disparate Impact” by Solon Barocas and Andrew D. Selbst
“Fairness in Algorithmic Decision-Making” by Mark MacCarthy