
OpenAI’s Goofy Sumo-Wrestling Bots Are Smarter Than They Look

October 12, 2017

It could be a virtual blood sport in some absurdist techno-future.

OpenAI, a research institute backed by Elon Musk and several other Silicon Valley big shots, has revealed its latest research on developing more powerful forms of machine learning. And it’s demonstrating the technology using virtual sumo wrestling.

The virtual wrestlers might look slightly ridiculous, but they are using a very clever approach to learning in a fast-changing environment while dealing with an opponent.

The agents use a form of reinforcement learning, a technique inspired by the way animals learn through feedback. It has proved useful for training computers to play games and to control robots (see “10 Breakthrough Technologies 2017: Reinforcement Learning”).
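To make the feedback idea concrete, here is a minimal, purely illustrative sketch of a reinforcement-learning loop, using tabular Q-learning on a made-up five-state "corridor" task. The environment, reward values, and hyperparameters are assumptions for illustration and have nothing to do with OpenAI's RoboSumo setup.

```python
# Minimal sketch of the reinforcement-learning feedback loop:
# act, receive a reward, and nudge the value estimates accordingly.
import random

N_STATES = 5          # states 0..4; reaching state 4 ends the episode with a reward
ACTIONS = [-1, +1]    # step left or step right
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1

# Q[state][action_index]: the agent's current estimate of long-term reward
Q = [[0.0, 0.0] for _ in range(N_STATES)]

for episode in range(500):
    state = 0
    while state != N_STATES - 1:
        # Explore occasionally, otherwise exploit the current estimate
        if random.random() < EPSILON:
            a = random.randrange(2)
        else:
            a = 0 if Q[state][0] >= Q[state][1] else 1
        next_state = max(0, min(N_STATES - 1, state + ACTIONS[a]))
        reward = 1.0 if next_state == N_STATES - 1 else 0.0
        # Feedback step: move the estimate toward reward plus discounted future value
        Q[state][a] += ALPHA * (reward + GAMMA * max(Q[next_state]) - Q[state][a])
        state = next_state

print(Q)  # after training, rightward actions should carry higher values
```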

One big challenge with reinforcement learning is that it does not work as well in more realistic situations, where things are constantly in flux. OpenAI has already developed its own reinforcement-learning algorithm, called proximal policy optimization (PPO), which is especially well suited to changing environments.
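The central idea behind PPO is that each policy update is "clipped" so the new behavior cannot drift too far from the old one in a single step, which keeps learning stable as conditions change. The sketch below shows that clipped objective in isolation; the numbers are made-up placeholders, and real implementations (including OpenAI's) add value-function and entropy terms.

```python
# Sketch of PPO's clipped surrogate objective, the core of the algorithm.
import numpy as np

def ppo_clipped_objective(new_probs, old_probs, advantages, clip_eps=0.2):
    """Average clipped surrogate objective over a batch of sampled actions."""
    ratio = new_probs / old_probs                        # how much the policy changed
    clipped = np.clip(ratio, 1 - clip_eps, 1 + clip_eps)
    # Take the more pessimistic of the unclipped and clipped estimates
    return np.mean(np.minimum(ratio * advantages, clipped * advantages))

# Toy example: probabilities of three sampled actions before and after an update
old = np.array([0.2, 0.5, 0.3])
new = np.array([0.35, 0.45, 0.20])
adv = np.array([1.0, -0.5, 0.2])   # estimated advantages for those actions
print(ppo_clipped_objective(new, old, adv))
```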

The latest work, done in collaboration with researchers from Carnegie Mellon University and UC Berkeley, demonstrates a way for AI agents to apply what the researchers call a “meta-learning” framework. This means the agents can take what they have already learned and apply it to a new situation.
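One common way to think about meta-learning is that the agent learns a starting point that can be adapted to a new task with only a handful of updates. The toy example below, in the style of a Reptile-like meta-learning loop on a trivial one-dimensional problem, is purely illustrative and is not the method used in the RoboSumo work.

```python
# Toy meta-learning sketch: learn an initialization that adapts quickly to new tasks.
import random

def adapt(theta, target, steps=5, lr=0.1):
    """Inner loop: a few gradient steps on the loss (theta - target)^2 for one task."""
    for _ in range(steps):
        theta -= lr * 2 * (theta - target)
    return theta

meta_theta = 0.0
META_LR = 0.1

# Outer loop: tasks differ only in their target value, drawn around 3.0
for _ in range(1000):
    target = random.gauss(3.0, 0.5)
    adapted = adapt(meta_theta, target)
    # Move the shared starting point toward the solution found on this task
    meta_theta += META_LR * (adapted - meta_theta)

# A new, unseen task: starting from meta_theta, only a few steps are needed
new_target = random.gauss(3.0, 0.5)
print("learned start:", meta_theta, "after adaptation:", adapt(meta_theta, new_target))
```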

Inside the RoboSumo environment, the agents started out behaving randomly. Through thousands of iterations of trial and error, they gradually developed the ability to move and, eventually, to fight. Through further iterations, the wrestlers developed the ability to avoid each other, and even to question their own actions. This learning happened on the fly, with the agents adapting even as they wrestled each other.

Flexible learning is a very important part of human intelligence, and it will be crucial if machines are going to become capable of performing anything other than very narrow tasks in the real world. This kind of learning is very difficult to implement in machines, and the latest work is a small but significant step in that direction.

The researchers found that by using meta-learning, their sumo-bots could learn effective strategies more quickly. So even if they look a bit hapless, don’t underestimate them.
