25 September 2023

Child’s play: How childhood learning could bring reason to machines


Karen Hao* says the sort of unsupervised learning that helps babies and toddlers make sense of the world could bring about the next AI revolution.


At six months old, a baby won’t bat an eye if a toy truck drives off a platform and seems to hover in the air.

But perform the same experiment a mere two to three months later, and she will instantly recognise that something is wrong.

She has already learned the concept of gravity.

“Nobody tells the baby that objects are supposed to fall,” said Yann LeCun, the chief AI scientist at Facebook and a professor at New York University (NYU), during a recent webinar organised by the Association for Computing Machinery, an industry body.

And because babies don’t have very sophisticated motor control, he hypothesises, “a lot of what they learn about the world is through observation.”

That theory could have important implications for researchers hoping to advance the boundaries of artificial intelligence (AI).

Deep learning, the category of AI algorithms that kick-started the field’s most recent revolution, has made immense strides in giving machines perceptual abilities like vision.

But it has fallen short in imbuing them with sophisticated reasoning, grounded in a conceptual model of reality.

In other words, machines don’t truly understand the world around them, which limits their ability to engage with it.

New techniques are helping to overcome this limitation — for example, by giving machines a kind of working memory so that as they learn and derive basic facts and principles, they can accumulate them to draw on in future interactions.

But LeCun believes that is only a piece of the puzzle.

“Obviously we’re missing something,” he said.

A baby can develop an understanding of an elephant after seeing two photos, while deep-learning algorithms need to see thousands, if not millions.

A teen can learn to drive safely by practising for 20 hours and manage to avoid crashes without first experiencing one, while reinforcement-learning algorithms (a subcategory of deep learning) must go through tens of millions of trials, including many egregious failures.

The answer, he thinks, is in the underrated deep-learning subcategory known as unsupervised learning.

While algorithms based on supervised and reinforcement learning are taught to achieve an objective through human input, unsupervised ones extract patterns in data entirely on their own.

(LeCun prefers the term “self-supervised learning”, because such an algorithm essentially uses part of the training data to predict the rest of the training data.)
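The idea can be illustrated with a deliberately tiny sketch (the corpus, model, and function names below are illustrative, not from any real system): the “label” for each word is simply the word itself, and the training signal comes from predicting it out of its own surrounding context — no human annotation involved.

```python
from collections import Counter, defaultdict

# Toy corpus: the prediction targets come from the data itself, not from humans.
corpus = [
    "the truck drives down the road",
    "the truck drives off the platform",
    "the ball rolls down the road",
]

# Self-supervision step: for every token, the target is the token itself,
# and the input is its context (here, just the previous word).
next_word = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for prev, cur in zip(words, words[1:]):
        next_word[prev][cur] += 1

def predict(prev):
    """Fill in the masked word that follows `prev`, using patterns alone."""
    counts = next_word.get(prev)
    return counts.most_common(1)[0][0] if counts else None

# "the truck [MASK]" — the model fills the blank with no labelled examples.
print(predict("truck"))  # → drives
```

Real self-supervised language models replace these bigram counts with billion-parameter neural networks, but the supervision signal is the same: the data labels itself.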

In recent years, such algorithms have gained significant traction in natural-language processing because of their ability to find the relationships between billions of words.

This proves useful for building text prediction systems like autocomplete or for generating convincing prose.

But the vast majority of AI research in other domains has focused on supervised or reinforcement learning.

LeCun believes the emphasis should be flipped.

“Everything we learn as humans — almost everything — is learned through self-supervised learning,” he said.

“There’s a thin layer we learn through supervised learning, and a tiny amount we learn through reinforcement learning.”

“If machine learning, or AI, is a cake, the vast majority of the cake is self-supervised learning.”

What does this look like in practice?

Researchers should begin by focusing on temporal prediction.

In other words, train large neural networks to predict the second half of a video when given the first.

While not everything in our world can be predicted, this is the foundational skill behind a baby’s ability to realise that a toy truck should fall.
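As a minimal sketch of that predictive skill, assume a one-dimensional stand-in for a video: the sampled positions of a falling toy. Fitting a simple model to the first half of the sequence is enough to extrapolate the second half, which is the toy-scale version of learning gravity by observation (the frame rate, drop height, and quadratic model here are all illustrative assumptions).

```python
import numpy as np

# Toy stand-in for a "video": the height of a dropped toy, sampled over
# 10 "frames" of free fall (y = y0 - 0.5 * g * t^2).
g, y0 = 9.8, 100.0
t = np.arange(10) * 0.1
frames = y0 - 0.5 * g * t**2

# Temporal prediction: learn only from the first half of the sequence...
half = len(frames) // 2
model = np.polyfit(t[:half], frames[:half], deg=2)

# ...then predict the second half, as one would predict future video frames.
predicted = np.polyval(model, t[half:])
print(np.allclose(predicted, frames[half:]))  # the learned "physics" extrapolates
```

A real system would replace the quadratic fit with a large neural network and the 1-D trajectory with raw pixels, but the training objective is the same: given the past, predict the future.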

“This is kind of a simulation of what’s going on in your head, if you want,” LeCun said.

Once the field refines the techniques behind those predictive abilities, they will have important practical uses as well.

“It’s a good idea to do video prediction in the context of self-driving cars because you might want to know in advance what other cars on the streets are going to do,” he said.

Ultimately, unsupervised learning will help machines develop a model of the world that can then predict future states of the world, he said.

It’s a lofty ambition that has eluded AI research but would open up a whole new range of capabilities.

LeCun is confident: “The next revolution of AI will not be supervised.”

* Karen Hao is the artificial intelligence reporter for MIT Technology Review. She tweets at @_KarenHao.

This article first appeared at www.technologyreview.com.
