OpenAI’s robotic hand doesn’t need humans to teach it human behaviors

Gripping something with your hand is one of the first things you learn to do as an infant, but it’s far from a simple task, and it only gets more complex and variable as you grow up. That complexity makes gripping a difficult skill for machines to teach themselves, but researchers at the Elon Musk and Sam Altman-backed OpenAI have created a system that not only holds and manipulates objects much like a human does, but developed these behaviors all on its own.

Many robots and robotic hands are already proficient at certain grips or movements — a robot in a factory can wield a bolt gun even more dexterously than a person. But the software that lets that robot do that task so well is likely to be hand-written and extremely specific to the application. You couldn’t, for example, give it a pencil and ask it to write. Even something on the same production line, like welding, would require a whole new system.

Yet for a human, picking up an apple isn’t so different from picking up a cup. There are differences, but our brains automatically fill in the gaps and we can improvise a new grip, hold an unfamiliar object securely and so on. This is one area where robots lag severely behind their human models. Furthermore, you can’t just train a bot to do what a human does — you’d have to provide millions of examples to adequately show what a human would do with thousands of given objects.

The solution, OpenAI’s researchers felt, was not to use human data at all. Instead, they let the computer try and fail over and over in a simulation, slowly learning how to move its fingers so that the object in its grasp moves as desired.
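To give a rough sense of what that try-and-fail loop looks like, here is a minimal sketch in Python. The environment, reward and one-parameter “policy” below are toy stand-ins invented for illustration — Dactyl’s actual training uses a full physics simulator and large-scale reinforcement learning — but the shape of the loop is the same: attempt a grasp, score the result, and keep whatever change scores better.

```python
import numpy as np

# Toy stand-in for the physics simulator: the "state" is how far the object
# is from the target pose, and an action nudges it. The real simulator and
# reward are far richer; this only illustrates the try/fail/improve loop.
class ToyGraspEnv:
    def reset(self):
        self.error = np.random.uniform(0.5, 1.0)
        return np.array([self.error])

    def step(self, action):
        self.error = abs(self.error - float(action))   # the action tries to reduce the error
        reward = -self.error                            # closer to the target pose = better
        done = self.error < 0.01
        return np.array([self.error]), reward, done

def run_episode(env, gain, max_steps=50):
    obs = env.reset()
    total = 0.0
    for _ in range(max_steps):
        action = gain * obs[0]                          # a one-parameter "policy"
        obs, reward, done = env.step(action)
        total += reward
        if done:
            break
    return total

# Trial and error: perturb the policy, keep the change only if it scores better.
env = ToyGraspEnv()
gain, best = 0.1, float("-inf")
for trial in range(200):
    candidate = gain + np.random.normal(scale=0.05)
    score = np.mean([run_episode(env, candidate) for _ in range(5)])
    if score > best:
        gain, best = candidate, score
print(f"learned gain: {gain:.2f}, average score: {best:.2f}")
```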

The system, which they call Dactyl, was provided only with the positions of its fingers and three camera views of the object in hand — but remember, during training all of this data was simulated, taking place in a virtual environment. There, the computer doesn’t have to work in real time — it can try a thousand different ways of gripping an object in a few seconds, analyzing the results and feeding that data forward into the next attempt. (The hand itself is a Shadow Dexterous Hand, which is also more complex than most robotic hands.)
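As a concrete illustration of what the policy actually sees, the sketch below bundles fingertip positions and three camera frames into a single observation. The array shapes, camera count and resolution here are assumptions for illustration rather than the real system’s layout; the point is that during training every value comes from the simulator, not from physical sensors.

```python
import numpy as np

NUM_FINGERTIPS = 5            # one 3D position per fingertip
NUM_CAMERAS = 3               # the three views mentioned above
IMAGE_SHAPE = (200, 200, 3)   # resolution is a guess, not the real spec

def make_observation(fingertip_xyz, camera_frames):
    """Bundle the policy's inputs: fingertip positions plus three camera views."""
    assert fingertip_xyz.shape == (NUM_FINGERTIPS, 3)
    assert camera_frames.shape == (NUM_CAMERAS, *IMAGE_SHAPE)
    return {"fingertips": fingertip_xyz, "cameras": camera_frames}

# In training both inputs are rendered by the simulator; random data stands in here.
obs = make_observation(
    np.random.rand(NUM_FINGERTIPS, 3),
    np.random.rand(NUM_CAMERAS, *IMAGE_SHAPE),
)
print(obs["fingertips"].shape, obs["cameras"].shape)
```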

In addition to the different objects and poses the system needed to learn, other parameters were randomized as well, like the amount of friction the fingertips had, the colors and lighting of the scene, and more. You can’t simulate every aspect of reality (yet), but you can make sure that your system doesn’t only work in a blue room, on cubes with special markings on them.
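A hedged sketch of what that randomization might look like in code: before each simulated episode, physical and visual parameters are resampled so the policy never sees exactly the same world twice. The parameter names and ranges below are made up for illustration, not OpenAI’s actual values.

```python
import numpy as np

def sample_randomized_world(rng):
    """Resample physical and visual parameters before each simulated episode."""
    return {
        "fingertip_friction": rng.uniform(0.5, 1.5),            # how grippy the fingertips are
        "object_mass_scale":  rng.uniform(0.8, 1.2),            # heavier or lighter than nominal
        "object_size_scale":  rng.uniform(0.95, 1.05),
        "object_color":       rng.uniform(0.0, 1.0, size=3),    # RGB, so no single color is special
        "light_direction":    rng.normal(size=3),               # scene lighting varies too
        "camera_jitter_deg":  rng.uniform(-2.0, 2.0, size=3),   # small camera pose errors
    }

rng = np.random.default_rng(0)
for episode in range(3):
    world = sample_randomized_world(rng)
    # simulator.configure(world)  # hypothetical hook into the physics/rendering engine
    print(episode, round(world["fingertip_friction"], 2), world["object_color"].round(2))
```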

They threw a lot of power at the problem: 6144 CPUs and 8 GPUs, “collecting about one hundred years of experience in 50 hours.” And then they put the system to work in the real world for the first time — and it demonstrated some surprisingly human-like behaviors.
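The headline numbers are easy to sanity-check: “one hundred years of experience in 50 hours” implies the simulation ran roughly 17,500 times faster than real time in aggregate. Assuming that work were spread evenly across the 6144 CPUs (a simplification), each one would only need to simulate the hand a few times faster than real time.

```python
# Back-of-the-envelope check on "one hundred years of experience in 50 hours".
HOURS_PER_YEAR = 365 * 24                      # 8,760
simulated_hours = 100 * HOURS_PER_YEAR         # ~876,000 hours of simulated hand time
wall_clock_hours = 50
aggregate_speedup = simulated_hours / wall_clock_hours
print(f"{aggregate_speedup:,.0f}x real time overall")      # 17,520x

# Assuming the load were split evenly across the 6144 CPUs (a simplification):
per_cpu = aggregate_speedup / 6144
print(f"~{per_cpu:.1f}x real time per CPU")                 # about 2.9x
```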

The things we do with our hands without even noticing, like turning an apple around to check for bruises or passing a mug of coffee to a friend, use lots of tiny tricks to stabilize or move the object. Dactyl recreated several of them, for example holding the object with a thumb and a single finger while using the rest to spin it to the desired orientation.

What’s great about this system is not just the naturalness of its movements, or the fact that they were arrived at independently through trial and error, but that it isn’t tied to any particular shape or type of object. Just like a human, Dactyl can grip and manipulate just about anything you put in its hand, within reason of course.

This flexibility is called generalization, and it’s important for robots that must interact with the real world. It’s impossible to hand-code separate behaviors for every object and situation in the world, but a robot that can adapt and fill in the gaps while relying on a set of core understandings can get by.

As with OpenAI’s other work, the paper describing the results is freely available, as are some of the tools they used to create and test Dactyl.