27 September 2023

This algorithm could ruin lives

A system used by the Dutch city of Rotterdam attempted to rank people based on their risk of fraud. Matt Burgess* says the results were troubling.


It was October 2021, and Imane, a 44-year-old mother of three, was still in pain from the abdominal surgery she had undergone a few weeks earlier.

She certainly did not want to be where she was: sitting in a small cubicle in a building near the center of Rotterdam, while two investigators interrogated her.

But she had to prove her innocence or risk losing the money she used to pay rent and buy food.

Imane emigrated to the Netherlands from Morocco with her parents when she was a child.

She started receiving benefits as an adult, due to health issues, after divorcing her husband.

Since then, she has struggled to get by using welfare payments and sporadic cleaning jobs.

Imane says she would do anything to leave the welfare system, but chronic back pain and dizziness make it hard to find and keep work.

In 2019, after her health problems forced her to leave a cleaning job, Imane drew the attention of Rotterdam’s fraud investigators for the first time.

She was questioned and lost her benefits for a month.

“I could only pay rent,” she says.

She recalls the stress of borrowing food from neighbors and asking her 16-year-old son, who was still in school, to take on a job to help pay other bills.

Now, two years later, she was under suspicion again.

In the days before that meeting at the Rotterdam social services department, Imane had meticulously prepared documents: her rental contract, copies of her Dutch and Moroccan passports, and months of bank statements.

With no printer at home, she had visited the library to print them.

In the cramped office she watched as the investigators thumbed through the stack of paperwork.

One of them, a man, spoke loudly, she says, and she felt ashamed as his accusations echoed outside the thin cubicle walls.

They told her she had brought the wrong bank statements and pressured her to log in to her account in front of them.

After she refused, they suspended her benefits until she sent the correct statements two days later.

Imane, who asked that her real name not be used for fear of repercussions from city officials, isn’t alone.

Every year, thousands of people across Rotterdam are investigated by welfare fraud officers, who search for individuals abusing the system.

Since 2017, the city has been using a machine learning algorithm, trained on 12,707 previous investigations, to help it determine whether individuals are likely to commit welfare fraud.

The machine learning algorithm generates a risk score for each of Rotterdam’s roughly 30,000 welfare recipients, and city officials consider these results when deciding whom to investigate.
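The city has not said publicly which model family it used, so the following is only a minimal sketch of how such a pipeline could look: a generic scikit-learn classifier fitted to labeled past investigations and then used to score current recipients. The file names, column names, and the cutoff of 1,000 flagged cases are hypothetical.

```python
# Minimal sketch of a risk-scoring pipeline of the kind described above.
# The model family, file names, and column names are hypothetical; the
# article does not disclose these details. Assumes all columns are numeric.
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier

# Hypothetical training set: one row per past investigation, with a binary
# label recording whether wrongdoing was found (roughly 12,707 rows).
past = pd.read_csv("past_investigations.csv")
X_train = past.drop(columns=["wrongdoing_found"])
y_train = past["wrongdoing_found"]

model = GradientBoostingClassifier().fit(X_train, y_train)

# Score every current recipient; higher scores are treated as higher risk.
recipients = pd.read_csv("current_recipients.csv")  # roughly 30,000 rows
scores = model.predict_proba(recipients[X_train.columns])[:, 1]

# Hypothetical cutoff: hand the highest-scoring cases to investigators.
flagged = recipients.assign(risk_score=scores).nlargest(1000, "risk_score")
```

Everything downstream of the scoring step is just a ranked list; according to city officials, flagged cases were then passed to human consultants for review.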

Imane’s background and personal history meant the system ranked her as “high risk.”

But the process by which she was flagged is part of a project beset by ethical issues and technical challenges.

In 2021, the city paused its use of the risk-scoring model after external government-backed auditors found that it wasn’t possible for citizens to tell whether they had been flagged by the algorithm, and that some of the data it used risked producing biased outputs.

In response to an investigation by Lighthouse Reports and WIRED, Rotterdam handed over extensive details about its system.

These include its machine learning model, training data, and user operation manuals.

The disclosures provide an unprecedented view into the inner workings of a system that has been used to classify and rank tens of thousands of people.

With this data, we were able to reconstruct Rotterdam’s welfare algorithm and see how it scores people.

Doing so revealed that certain characteristics—being a parent, a woman, young, not fluent in Dutch, or struggling to find work—increase someone’s risk score.

The algorithm classes single mothers like Imane as especially high risk.

Experts who reviewed our findings expressed serious concerns that the system may have discriminated against people.

Annemarie de Rotte, director of Rotterdam’s income department, says that people flagged by the algorithm as high risk were always assessed by human consultants, who ultimately decided whether to remove benefits.

“We understand that a reexamination can cause anxiety,” de Rotte says, using the city’s preferred term for welfare investigations.

She says the city does not intend to treat anyone badly and that it tries to conduct examinations while treating people with respect.

The pattern of local and national governments turning to machine learning algorithms is being repeated around the world.

The systems are marketed to public officials on their potential to cut costs and boost efficiency.

Yet the development, deployment, and operation of such systems is often shrouded in secrecy.

Many systems do not work as intended, and they can encode troubling biases.

The people who are judged by them are often left in the dark even as they suffer devastating consequences.

From Australia to the United States, welfare fraud algorithms sold on claims that they make governments more efficient have made people’s lives worse.

In the Netherlands, Rotterdam’s algorithmic troubles have run in parallel with a nationwide machine learning scandal.

More than 20,000 families were wrongly accused of childcare benefit fraud after a machine learning system was used to try to spot wrongdoing.

Forced evictions, broken homes, and financial ruin followed, and the entire Dutch government resigned in response in January 2021.

In Rotterdam alone, thousands of people are being scored by algorithms they don’t know anything about and do not understand.

From the outside, Rotterdam’s welfare algorithm appears complex.

The system, which was originally developed by consulting firm Accenture before the city took over development in 2018, is trained on data collected by Rotterdam’s welfare department.

It assigns people risk scores based on 315 factors.

Some are objective facts, such as age or gender identity.

Others, such as a person’s appearance or how outgoing they are, are subjective and based on the judgment of social workers.
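To make that mix of inputs concrete, here is a small, hypothetical sketch of what a single recipient’s record might look like. The field names are invented stand-ins for a handful of the 315 factors; the point is that objective administrative facts sit alongside subjective caseworker judgments.

```python
# Hypothetical sketch of one recipient's input record. Field names are
# invented stand-ins for a few of the 315 factors the system uses.
from dataclasses import dataclass

@dataclass
class RecipientRecord:
    # Objective, administrative facts.
    age: int
    gender: str
    years_at_current_address: int
    number_of_children: int
    # Subjective assessments entered by caseworkers.
    appearance_judged_appropriate: bool
    comes_across_as_outgoing: bool
    dutch_language_sufficient: bool

record = RecipientRecord(
    age=52, gender="female", years_at_current_address=8,
    number_of_children=2, appearance_judged_appropriate=True,
    comes_across_as_outgoing=False, dutch_language_sufficient=False,
)
```

Because the subjective fields depend on a caseworker’s impressions, two otherwise similar recipients can enter the model with different values, one of the concerns auditors later raised about biased outputs.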

In Hoek van Holland, a town to the west of Rotterdam that is administratively part of the city, Pepita Ceelie is trying to understand how the algorithm ranked her as high risk.

Ceelie is 61 years old, heavily tattooed, and has a bright pink buzz cut.

She likes to speak English and gets to the point quickly.

For the past 10 years, she has lived with chronic illness and exhaustion, and she uses a mobility scooter whenever she leaves the house.

Ceelie has been investigated twice by Rotterdam’s welfare fraud team, first in 2015 and again in 2021.

Both times investigators found no wrongdoing.

In the most recent case, she was selected for investigation by the city’s risk-scoring algorithm.

Ceelie says she had to explain to investigators why her brother sent her €150 ($180) for her sixtieth birthday, and that it took more than five months for them to close the case.

Sitting in her blocky, 1950s house, which is decorated with photographs of her garden, Ceelie taps away at a laptop.

She’s entering her details into a reconstruction of Rotterdam’s welfare risk-scoring system created as part of this investigation.

The user interface, built on top of the city’s algorithm and data, demonstrates how Ceelie’s risk score was calculated—and suggests which factors could have led to her being investigated for fraud.

All 315 factors of the risk-scoring system are initially set to describe an imaginary person with “average” values in the data set.

When Ceelie personalizes the system with her own details, her score begins to change.

She starts at a default score of 0.3483—the closer to 1 a person’s score is, the more they are considered a high fraud risk.

When she tells the system that she doesn’t have a plan in place to find work, the score rises (0.4174).

It drops when she enters that she has lived in her home for 20 years (0.3891).

Living outside of central Rotterdam pushes it back above 0.4.

Switching the profile’s gender from male to female pushes her score to 0.5123.

“This is crazy,” Ceelie says.

Even though her adult son does not live with her, his existence, to the algorithm, makes her more likely to commit welfare fraud.

Ceelie’s divorce raises her risk score again, and she ends with a score of 0.643: high risk, according to Rotterdam’s system.
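What Ceelie is doing amounts to a one-feature-at-a-time sensitivity check: start from a baseline row filled with “average” values, override a single input, and re-score. A minimal sketch of that procedure, reusing the hypothetical `model` from the earlier sketch and an equally hypothetical `baseline` pandas Series of average feature values, might look like this:

```python
# One-feature-at-a-time sensitivity check, mirroring the walkthrough above.
# Assumes a fitted scikit-learn `model` and a `baseline` pandas Series of
# "average" feature values; the feature names below are invented.
def rescore(model, baseline, **overrides):
    """Return the risk score after overriding selected features."""
    row = baseline.copy()
    for feature, value in overrides.items():
        row[feature] = value
    return model.predict_proba(row.to_frame().T)[0, 1]

print(rescore(model, baseline))                       # baseline score
print(rescore(model, baseline, has_work_plan=0))      # article: score rose
print(rescore(model, baseline, years_at_address=20))  # article: score fell
print(rescore(model, baseline, gender_female=1))      # article: score rose
```

The direction and size of each change depend entirely on the trained model; the figures reported in the article (0.3483 rising to 0.643) come from the reconstruction of Rotterdam’s actual model and data, not from this sketch.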

After two welfare fraud investigations, Ceelie has become angry with the system.

“They’ve only opposed me, pulled me down to suicidal thoughts,” she says.

Throughout her investigations, she has heard other people’s stories, turning to a Facebook support group set up for people having problems with the Netherlands’ welfare system.

Ceelie says people have lost benefits for minor infractions, like not reporting grocery payments or money received from their parents.

“There are a lot of things that are not very clear for people when they get welfare,” says Jacqueline Nieuwstraten, a lawyer who has handled dozens of appeals against Rotterdam’s welfare penalties.

She says the system has been quick to punish people and that investigators fail to properly consider individual circumstances.

The Netherlands takes a tough stance on welfare fraud, encouraged by populist right-wing politicians.

And of all the country’s regions, Rotterdam cracks down on welfare fraud the hardest.

Of the approximately 30,000 people who receive benefits from the city each year, around a thousand are investigated after being flagged by the city’s algorithm.

In total, Rotterdam investigates up to 6,000 people annually to check if their payments are correct.

In 2019, Rotterdam issued 2,400 benefits penalties, which can include fines and cutting people’s benefits completely.

In 2022 almost a quarter of the appeals that reached the country’s highest court came from Rotterdam.

From the algorithm’s deployment in 2017 until its use was halted in 2021, it flagged up to a third of the people the city investigated each year, while others were selected by humans based on a theme—such as single men living in a certain neighborhood.

Rotterdam has moved to make its overall welfare system easier for people to navigate since 2020.

(For example, the number of benefits penalties dropped to 749 in 2021.)

De Rotte, the director of the city’s income department, says these changes include adding a “human dimension” to its welfare processes.

The city has also relaxed rules around how much money claimants can receive from friends and family, and it now allows adults to live together without any impact on their benefits.

As a result, Nieuwstraten says, the number of complaints she has received about welfare investigations has decreased in recent years.

The city’s decision to pause its use of the welfare algorithm in 2021 came after an investigation by the Rotterdam Court of Audit into the development and use of algorithms in the city.

The government auditor found there was “insufficient coordination” between the developers of the algorithms and city workers who use them, which could lead to ethical considerations being neglected.

The report also criticized the city for not evaluating whether the algorithms were better than the human systems they replaced.

Singling out the welfare fraud algorithm, the report found there was a likelihood of biased outcomes based on the types of data used to determine people’s risk scores.

Since then, the city has been working to develop a new version—though minutes from council meetings show there are doubts that it can successfully build a system that is transparent and legal.

De Rotte says that since the Court of Audit report, the city has worked to add “more safeguards” to the development of algorithms in general, including introducing an algorithm register to show what algorithms it uses.

“A new model must not have any appearance of bias, must be as transparent as possible, and must be easy to explain to the outside world,” de Rotte says.

Welfare recipients are currently being selected for investigation at random, de Rotte adds.

While the city works to rebuild its algorithm, those caught up in the welfare system have been battling to discover how it works—and whether they were selected for investigation by a flawed system.

Among them is Oran, a 35-year-old who’s lived in Rotterdam all his life.

In February 2018 he received a letter saying he was being investigated for welfare fraud.

Oran, who asked that his real name not be used for privacy reasons, has a number of health issues that make it difficult to find work.

In 2018, he was receiving a monthly loan from a family member.

Rotterdam’s local government asked him to document the loan and agree that it be paid back.

Although Oran did this, investigators pursued fraud charges against him, and the city said he should have €6,000 withheld from future benefits payments, a sum combining the amount he had been loaned with additional fines.

From 2018 to 2021, Oran fought against the local authority in court.

He says being accused of committing fraud took a huge toll.

During the investigation, he says, he couldn’t focus on anything else and didn’t think he had a future.

“It got really difficult. I thought a lot about suicide,” he says.

Two court appeals later, in June 2021, Oran cleared his name, and the city refunded the €6,000 it had deducted from his benefits payments.

“It feels like justice,” he says.

Despite the lengthy process, he did not find out why he was selected for scrutiny, what his risk scores were, or what data contributed to the creation of his scores.

So he requested it all.

Five months later, in April 2021, he received his risk scores for 2018 and 2019.

While his files revealed that he had not been selected for investigation by the algorithm but as part of a selection of single men, his risk score was among the top 15 percent of benefits recipients.

His zip code, history of depression, and assessments by social workers contributed to his high score.

“That’s not reality, that’s not me, that’s not my life, it’s just a bunch of numbers,” Oran says.

As the use of algorithmic systems grows, it could become harder for people to understand why decisions have been made and to appeal against them.

Tamilla Abdul-Aliyeva, a senior policy advisor at Amnesty International in the Netherlands, says people should be told if they are being investigated based on algorithmic analysis, what data was used to train the algorithm, and what selection criteria were used.

“Transparency is key for protecting human rights and also very important in the democratic society,” says Abdul-Aliyeva.

De Rotte says Rotterdam plans to give people more information about “why and how they were selected” and that more details of the new model will be announced “before the summer.”

For those already caught in Rotterdam’s welfare dragnet, there is little solace.

Many of them, including Oran and Ceelie, say they don’t want the city to use an algorithm to judge vulnerable people.

Ceelie says it feels like she has been “stamped” with a number and that she is considering taking Rotterdam’s government to court over its use of the algorithm.

Developing and using the algorithm won’t make people feel like they are being treated with care, she says.

“Algorithms aren’t human. Call me up, with a human being, not a number, and talk to me. Don’t do this.”

*Matt Burgess is a senior writer at WIRED focused on information security, privacy, and data regulation in Europe.

Additional reporting by Eva Constantaras, Justin-Casimir Braun, and Soizic Penicaud. Reporting was supported by the Pulitzer Center’s AI Accountability Network and the Eyebeam Center for the Future of Journalism.

A fuller version of this article first appeared at wired.co.uk
