25 September 2023

Driving force: How an evolving algorithm is training self-driving cars

Will Knight* says that a more efficient way of training the neural nets needed for self-driving cars takes inspiration from biological evolution.

Waymo’s self-driving cars now have something in common with the brains that guide regular vehicles: their intelligence comes partly from the power of evolution.

Engineers at Waymo, owned by Alphabet, teamed up with researchers at DeepMind, another Alphabet division dedicated to artificial intelligence (AI), to find a more efficient process to train and fine-tune the company’s self-driving algorithms.

They used a technique called population-based training (PBT), previously developed by DeepMind for honing video-game algorithms.

PBT, which takes inspiration from biological evolution, speeds up the selection of machine-learning algorithms and parameters for a particular task by having candidate code draw from the “fittest” specimens (the ones that perform a given task most efficiently) in an algorithmic population.
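The loop described above can be sketched in a few lines of Python. This is a toy illustration only, not Waymo’s or DeepMind’s implementation: the objective (`-theta ** 2`), the single tunable setting (`lr`), the population size, the top and bottom quarter cut-off and the 0.8/1.2 perturbation factors are all assumptions chosen to keep the example small.

```python
import random

def evaluate(theta):
    # Toy fitness (assumed for illustration): higher is better, best at theta = 0.
    return -theta ** 2

def train_step(theta, lr):
    # One gradient-ascent step on -theta^2, whose gradient is -2 * theta.
    return theta + lr * (-2.0 * theta)

def pbt(pop_size=8, rounds=20, steps_per_round=5, seed=0):
    rng = random.Random(seed)
    # Each member has its own weights (theta) and its own learning rate (lr).
    population = [
        {"theta": rng.uniform(-5, 5), "lr": 10 ** rng.uniform(-4, -1)}
        for _ in range(pop_size)
    ]
    history = []  # best fitness seen after each round
    for _ in range(rounds):
        # Train every member for a while, then score it.
        for m in population:
            for _ in range(steps_per_round):
                m["theta"] = train_step(m["theta"], m["lr"])
            m["score"] = evaluate(m["theta"])
        ranked = sorted(population, key=lambda m: m["score"], reverse=True)
        history.append(ranked[0]["score"])
        cut = max(1, pop_size // 4)
        top, bottom = ranked[:cut], ranked[-cut:]
        for loser in bottom:
            parent = rng.choice(top)
            # Exploit: copy the weights of one of the fittest members.
            loser["theta"] = parent["theta"]
            # Explore: perturb the copied learning rate (capped to stay stable).
            loser["lr"] = min(0.4, parent["lr"] * rng.choice([0.8, 1.2]))
    best = max(population, key=lambda m: evaluate(m["theta"]))
    return best, history

best, history = pbt()
print(best["lr"], history[-1])
```

The weakest members are never retrained from scratch: they restart from a strong member’s weights with slightly different settings, which is where the saving over purely random search is meant to come from.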

Refining AI algorithms in this way may also help give Waymo an edge.

The algorithms that guide self-driving cars need to be retrained and recalibrated as the vehicles collect more data and are deployed in new locations.

Dozens of companies are racing to demonstrate the best self-driving technology on real roads.

Waymo is exploring various other ways of automating and accelerating the development of its machine-learning algorithms.

Indeed, more efficient methods for retraining machine-learning code should allow AI to be flexible and useful in different contexts.

“One of the key challenges for anyone doing machine learning in an industrial system is to be able to rebuild the system to take advantage of new code,” says Matthieu Devin, Director of Machine Learning Infrastructure at Waymo.

“We need to constantly retrain the net and rewrite our code.”

“And when you retrain, you may need to tweak your parameters.”

Modern self-driving cars are controlled by an almost Rube Goldberg combination of algorithms and techniques.

Numerous machine-learning algorithms are used to spot road lines, signs, other vehicles, and pedestrians in sensor data.

These work in concert with conventional, or handwritten, code to control the vehicle and respond to different eventualities.

Each new iteration of a self-driving system has to be tested rigorously in simulation.

Today’s self-driving vehicles rely particularly heavily on deep learning.

But configuring a deep neural network with the right properties and parameters (the values that are fixed before training starts, often called hyperparameters) is a tricky art.

Candidate networks and parameters are mostly either selected manually, which is time-consuming, or tweaked at random by a computer, which requires lots of processing power.

“At Waymo we train tonnes of different neural nets, and researchers spend a lot of time figuring out how to best train these neural nets,” says Yu-hsin (Joyce) Chen, a machine-learning infrastructure engineer at Waymo.

“We had a need for it and just jumped at the opportunity.”

Chen says her team is now using PBT to improve the development of deep-learning code used to detect lane markings, vehicles, and pedestrians, and to verify the accuracy of labelled data that is fed to other machine-learning algorithms.

She says PBT has reduced the computer power required to retrain a neural net by about half and has doubled or tripled the speed of the development cycle.

Google is developing a range of techniques to help automate the process of training machine-learning models, and it already offers some of them to customers through a project known as Cloud AutoML.

Making AI training more efficient and automated will undoubtedly prove crucial to efforts to commercialise, and profit from, the technology.

Oriol Vinyals, a principal research scientist at DeepMind and one of the inventors of PBT, says the idea for using PBT at Waymo came up when he was visiting Devin.

Vinyals and colleagues first developed the technique as a way to speed up the training of a computer to play StarCraft II, a combat video game that is especially challenging for machines, through reinforcement learning.

The evolution-like process employed in PBT also makes it easier to understand how a deep-learning algorithm has been tweaked and optimised, with something that resembles a genealogical tree.
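As a rough illustration of that genealogical tree, the sketch below records which member each copy was made from and walks the links back to the original ancestor. The member names, the learning-rate values and the `trace_lineage` helper are hypothetical, made up for this example rather than taken from Waymo’s or DeepMind’s tooling.

```python
def trace_lineage(parents, member):
    """Follow parent links from a member back to its original ancestor."""
    chain = [member]
    while parents.get(chain[-1]) is not None:
        chain.append(parents[chain[-1]])
    return chain[::-1]  # oldest ancestor first

# Hypothetical records: each member maps to the member it was copied from.
parents = {"a0": None, "b0": None, "a1": "a0", "b1": "a0", "b2": "b1"}
# Hypothetical learning rate held by each member.
learning_rates = {"a0": 0.010, "b0": 0.001, "a1": 0.012, "b1": 0.008, "b2": 0.0096}

lineage = trace_lineage(parents, "b2")
for name in lineage:
    print(name, learning_rates[name])
```

Printing the settings along one branch like this shows how a parameter drifted from ancestor to descendant, which is the kind of check Vinyals describes.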

“One of the cool things is that you can visualise the evolution of parameters,” says Vinyals.

“It’s a nice way to verify that what happens actually makes sense to you.”

* Will Knight is MIT Technology Review’s Senior Editor for Artificial Intelligence. He tweets at @willknight.

This article first appeared at www.technologyreview.com.
