Disney Research has robots matching verbal styles with kids

Roboticists at Disney Research are investigating how to improve the quality of human-robot interactions by studying how speech patterns affect engagement with a creepy anthropomorphic bot that imitates its playmates’ speech.

It’s a natural enough line of research: computer-generated speech and interaction patterns are so wooden across the board that little is expected of them. But pairs of people — for example, kids playing with each other — who are in tune verbally (in “prosodic synchrony”) tend to be more engaged and successful in what they’re doing.

For the study, the team paired kids with a robot and had them play a little platforming game where one player tells the character to go and the other tells it to jump. The system they created listened to the child’s voice and extracted some basic properties: loudness, word length and frequency (i.e. quickness or rhythm).

One kid, for example, might be quick to respond but draw out “juuump,” while another might hesitate but say “go” quickly and loudly.
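To make that kind of profiling concrete, here is a rough sketch of how a response delay, a word length and a loudness value could be pulled from a single recorded command. It is not the researchers' code: the function name `profile_utterance`, the amplitude threshold and the assumption that recording starts at the moment the child is prompted are all made up for illustration.

```python
import math

def profile_utterance(samples, sample_rate, threshold=0.1):
    """Return (latency_s, duration_s, rms_loudness) for one spoken word.

    Hypothetical helper. `samples` are amplitudes in [-1, 1], assumed to
    start at the moment the child was prompted. `threshold` is an assumed
    cutoff separating speech from silence.
    """
    # Indices of samples loud enough to count as speech.
    voiced = [i for i, x in enumerate(samples) if abs(x) >= threshold]
    if not voiced:
        return None  # the child said nothing audible

    start, end = voiced[0], voiced[-1]
    word = samples[start:end + 1]

    # Root-mean-square amplitude as a simple stand-in for loudness.
    rms = math.sqrt(sum(x * x for x in word) / len(word))

    latency = start / sample_rate       # how long the child hesitated
    duration = len(word) / sample_rate  # how drawn-out the word was
    return latency, duration, rms


# Toy signal: 0.1 s of silence, 0.2 s of speech, 0.1 s of silence at 1 kHz.
toy = [0.0] * 100 + [0.5] * 200 + [0.0] * 100
latency, duration, loudness = profile_utterance(toy, sample_rate=1000)
```

A real system would need something sturdier than a fixed threshold to find where a word starts and ends, but the three numbers map onto the hesitating, drawing-out and loudness differences described above.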

Once the child in question was profiled, the robot would choose its vocal response from a pool of slightly different sound files, with various speeds and intonations. In one form, it picked the style most similar to the kid’s; in another, it basically picked any style but the kid’s. As the paper summarizes:

“Each child played multiple game levels with both synchronizing and non-synchronizing versions of the robot, with order of condition counterbalanced across children. The results showed both the dyadic nature of synchronization and its profound effects.”
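The two robot behaviors, matching the kid's style or picking anything but, come down to a nearest-neighbor lookup over the pool of recordings. The sketch below is only an illustration of that idea under some assumptions: the `VoiceStyle` fields, the three example styles and the use of plain Euclidean distance are invented here, not taken from the paper.

```python
import math
import random
from dataclasses import dataclass

@dataclass(frozen=True)
class VoiceStyle:
    """Hypothetical description of a speaking style (fields are assumptions)."""
    name: str
    loudness: float     # normalized 0..1
    word_length: float  # seconds
    latency: float      # seconds of hesitation before speaking

def distance(a, b):
    # Plain Euclidean distance over the three features; a real system
    # would likely rescale them so no single feature dominates.
    return math.dist(
        (a.loudness, a.word_length, a.latency),
        (b.loudness, b.word_length, b.latency),
    )

def pick_style(child, pool, synchronize, rng=random):
    """Pick the robot's recording for its next utterance."""
    closest = min(pool, key=lambda style: distance(child, style))
    if synchronize:
        return closest  # synchronizing robot: mirror the child
    # Non-synchronizing robot: any style except the best match.
    others = [style for style in pool if style is not closest]
    return rng.choice(others)


# Made-up pool of pre-recorded styles and a made-up child profile.
POOL = [
    VoiceStyle("slow_soft", loudness=0.3, word_length=0.9, latency=0.8),
    VoiceStyle("quick_loud", loudness=0.9, word_length=0.3, latency=0.2),
    VoiceStyle("neutral", loudness=0.6, word_length=0.5, latency=0.5),
]
child = VoiceStyle("child", loudness=0.85, word_length=0.35, latency=0.25)

matched = pick_style(child, POOL, synchronize=True)
mismatched = pick_style(child, POOL, synchronize=False)
```

With this toy profile the synchronizing call returns the "quick_loud" recording, while the non-synchronizing call returns one of the other two at random.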

Kids who played with the synchronizing version engaged better and scored higher, even when the synchronizing was turned off partway through. When they played with the non-synchronizing version, they scored lower and engaged less. On further analysis, the difference really only affected older kids, suggesting this cooperative aspect is a skill learned through social interactions.

Matching the tone of real-world interactions could mean the difference between robots that squick us out and robots that we don’t mind chatting with. It may be a little creepy to think that toys or household bots of the future may adopt different tones and verbal patterns when addressing different people — “oh, Siri’s just like that when Patricia asks questions” — but it could also go a long way toward pulling robots out of the uncanny valley.

Not the one they used in this experiment, though. Seriously.

The study, which is being presented at the Human-Robot Interaction conference in Vienna, was a collaboration between Disney Research, Carnegie Mellon and the University of Texas at Dallas.