27 September 2023

Application of bias: Why algorithms can be racist and sexist

Rebecca Heilweil* says artificial intelligence can make decisions faster than humans, but that doesn’t mean those decisions are fair.


Humans are error-prone and biased, but that doesn’t mean that algorithms are necessarily better.

Still, the tech is already making important decisions about your life: how your application to your dream job is screened, how police officers are deployed in your neighbourhood, and even how your home’s risk of fire is predicted.

But these systems can be biased based on who builds them, how they’re developed, and how they’re ultimately used.

This is commonly known as algorithmic bias.

It’s tough to figure out exactly how systems might be susceptible to algorithmic bias, especially since this technology often operates in a corporate black box.

We frequently don’t know how a particular artificial intelligence (AI) or algorithm was designed, what data helped build it, or how it works.

Typically, you only know the end result: how it has affected you, if you’re even aware that AI or an algorithm was used in the first place.

That makes the biases of AI tricky to address and, even more importantly, tricky to understand.

Machine learning–based systems are trained on data — lots of it.

When thinking about “machine learning” tools, it helps to focus on the idea of “training”.

This involves exposing a computer to a bunch of data so the computer learns to make judgements, or predictions, about the information it processes based on the patterns it notices.

Often, the data on which many of these decision-making systems are trained are not complete, balanced, or selected appropriately, and that can be a major source of bias.

Nicol Turner-Lee, at the Brookings Institution, explains that we can think about algorithmic bias in two primary ways: accuracy and impact.

An AI can have different accuracy rates for different demographic groups.

Similarly, an algorithm can make vastly different decisions when applied to different populations.
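To make the accuracy lens concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the data is synthetic, the groups are simply labelled “A” and “B”, and the “model” is a toy stand-in for a real classifier. The point is only that a system trained on data where one group dominates and has a cleaner signal can end up with very different accuracy rates for different groups, which is exactly the kind of disparity an audit would look for.

```python
# Illustrative only: synthetic data, hypothetical groups, toy "model".
import random
from collections import Counter

random.seed(0)

def make_examples(group, n, signal_strength):
    """Generate synthetic (feature, label, group) rows.

    A lower signal_strength stands in for a group whose data is noisier
    or less well captured in the training set.
    """
    rows = []
    for _ in range(n):
        label = random.randint(0, 1)
        feature = label if random.random() < signal_strength else 1 - label
        rows.append((feature, label, group))
    return rows

def train(rows):
    # "Training" here just means remembering the majority label
    # seen for each feature value in the pooled data.
    counts = {}
    for feature, label, _group in rows:
        counts.setdefault(feature, Counter())[label] += 1
    return {f: c.most_common(1)[0][0] for f, c in counts.items()}

def accuracy_by_group(model, rows):
    # The audit step: measure accuracy separately for each group.
    totals, correct = Counter(), Counter()
    for feature, label, group in rows:
        totals[group] += 1
        correct[group] += model[feature] == label
    return {g: correct[g] / totals[g] for g in totals}

# Group A dominates the training data and has a cleaner signal than group B.
train_rows = make_examples("A", 900, 0.9) + make_examples("B", 100, 0.6)
test_rows = make_examples("A", 500, 0.9) + make_examples("B", 500, 0.6)

model = train(train_rows)
print(accuracy_by_group(model, test_rows))  # roughly {'A': ~0.9, 'B': ~0.6}
```

Nothing in the sketch is a real system; it simply shows why reporting a single overall accuracy figure can hide a large gap between groups.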

Let’s take one source of data everyone has access to: the internet.

One study found that an AI taught to learn from text crawled from the internet ended up reproducing prejudices against black people and women.

Another example of how training data can produce sexism in an algorithm occurred a few years ago, when Amazon tried to use AI to build a résumé-screening tool.

According to Reuters, the company’s hope was that technology could make the process of sorting through job applications more efficient.

It built a screening algorithm using résumés the company had collected for a decade, but those résumés tended to come from men.

That meant the system learned to discriminate against women.

Shouldn’t we just make more representative datasets?

That might be part of the solution, though it’s worth noting that not all efforts aimed at building better datasets are ethical.

And it’s not just about the data.

As Karen Hao of the MIT Tech Review explains, AI could also be designed to frame a problem in a fundamentally problematic way.

For instance, an algorithm designed to determine “creditworthiness” that’s programmed to maximise profit could ultimately decide to give out predatory, subprime loans.
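As a purely hypothetical illustration of how the framing of the objective, rather than the data, can drive harmful decisions, here is a small sketch. All of the numbers, rates, and function names are invented: under a “maximise expected profit” objective, a loan to a risky borrower can still look attractive if the interest rate is high enough, while an objective framed around likely repayment would decline the same application.

```python
# Hypothetical figures only; a deliberately over-simplified lending model.

def expected_profit(p_repay, amount, rate):
    # The lender earns interest if the loan is repaid and, in this toy
    # model, loses the principal if it is not.
    return p_repay * amount * rate - (1 - p_repay) * amount

# Framing 1: maximise profit. A risky borrower can still be "worth it"
# if the interest rate charged is high enough to be predatory.
print(expected_profit(p_repay=0.7, amount=10_000, rate=0.60))  # about 1200, so "approve"

# Framing 2: lend only when repayment is likely.
def approve_if_likely_to_repay(p_repay, threshold=0.8):
    return p_repay >= threshold

print(approve_if_likely_to_repay(0.7))  # False, so "decline"
```

The numbers are made up, but the contrast shows how the same applicant can be approved or declined depending entirely on the question the algorithm was built to answer.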

Here’s another thing to keep in mind: Just because a tool is tested for bias against one group doesn’t mean it has been tested for bias against another.

This is also true when an algorithm is considering several types of identity factors at the same time: A tool may be deemed fairly accurate for white women, for instance, but that doesn’t necessarily mean it works well for black women.

So, you might have the data to build an algorithm.

Who gets to decide what level of accuracy and inaccuracy for different groups is acceptable?

Who gets to decide which applications of AI are ethical and which aren’t?

AI tends to be dominated by men.

And the tech sector, more broadly, tends to overrepresent white people.

But there’s also a broader question of what questions AI can help us answer.

Harvard researcher Lily Hu argues that for many systems, the question of building a “fair” system is essentially nonsensical, because those systems try to answer social questions that don’t necessarily have an objective answer.

Some algorithms probably shouldn’t exist, or at least they shouldn’t come with such a high risk of abuse.

Just because a technology is accurate doesn’t make it fair or ethical.

For instance, the Chinese Government has used AI to track and racially profile its largely Muslim Uyghur minority.

Transparency is a first step towards accountability

One of the reasons algorithmic bias can seem so opaque is because, on our own, we usually can’t tell when it’s happening (or if an algorithm is even in the mix).

But it is rare for consumers to be able to make apples-to-apples comparisons of algorithmic results.

How can you critique an algorithm — a sort of black box — if you don’t have true access to its inner workings or the capacity to test a good number of its decisions?

Companies will claim their systems are accurate, but won’t always reveal their training data.

Many don’t appear to be subjecting themselves to audit by a third-party evaluator or publicly sharing how their systems fare when applied to different demographic groups.

We will likely need new laws to regulate AI.

That raises the question of whether governments are prepared to study and govern this technology, and to work out how existing laws apply.

“You have a group of people that really understand it very well, and that would be technologists,” Turner-Lee cautions, “and a group of people who don’t really understand it at all, or have minimal understanding, and that would be policymakers.”

That’s not to say there aren’t technical efforts to “de-bias” flawed AI, but it’s important to keep in mind that the technology won’t be a solution to fundamental challenges of fairness and discrimination.

And there’s no guarantee companies building or using this tech will make sure it’s not discriminatory, especially without a legal mandate to do so.

It would seem it’s up to us, collectively, to push governments to rein in the tech and to make sure it helps us more than it might already be harming us.

* Rebecca Heilweil is a reporter for Open Sourced. She tweets at @rebeccaheilweil.

This article first appeared at www.vox.com/recode.
