Crowdsourced data can teach your phone to follow your eyes

Eye tracking has always been a tough problem. Multi-camera setups can sense the position of the eyes in 3D space, but in general tracking where your peepers point has been too hard for cellphones. Now researchers at MIT and the University of Georgia have created an eye-tracking system that depends on crowdsourced data.

The team created a simple app that showed a dot on the screen and then asked if it was on the left or right side. Subjects would tap where they saw the dot, and the phone would record video and images of the interaction. The researchers used Amazon’s Mechanical Turk to recruit hundreds of users to perform the test and then used image processing techniques to assess exactly where their eyes were pointed when tapping on certain spots.

“The field is kind of stuck in this chicken-and-egg loop,” said Aditya Khosla, an MIT graduate student. “Since few people have the external devices, there’s no big incentive to develop applications for them. Since there are no applications, there’s no incentive for people to buy the devices. We thought we should break this circle and try to make an eye tracker that works on a single mobile device, using just your front-facing camera.”

Most earlier eye-tracking systems were built on smaller sample sizes and tended to fail when put into practice. With 800 data points, however, researchers were able to “see” where eyes were pointing without trouble.

They used machine learning to estimate eye position from the 1,600 photos the system took of each user.

The researchers’ machine-learning system was a neural network, which is a software abstraction but can be thought of as a huge network of very simple information processors arranged into discrete layers. Training modifies the settings of the individual processors so that a data item — in this case, a still image of a mobile-device user — fed to the bottom layer will be processed by the subsequent layers. The output of the top layer will be the solution to a computational problem — in this case, an estimate of the direction of the user’s gaze.
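To make the layered picture concrete, here is a minimal sketch in Python of a tiny two-layer network that maps a flattened image to an (x, y) gaze estimate and adjusts its settings with one gradient-descent step at a time. It is an illustration only, not the researchers' system: the layer sizes, the 16-pixel "image", the learning rate, and every function name are assumptions made up for this example.

```python
import math
import random

def make_layer(n_in, n_out, rng):
    """One layer: weights and biases, the 'settings' that training adjusts."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out

def relu(v):
    return max(0.0, v)

def identity(v):
    return v

def apply_layer(layer, inputs, activation):
    """Each simple processor sums its weighted inputs, adds a bias, then activates."""
    weights, biases = layer
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def predict_gaze(layers, pixels):
    """Feed the image to the bottom layer; the top layer outputs (x, y)."""
    hidden = apply_layer(layers[0], pixels, relu)
    return apply_layer(layers[1], hidden, identity)

def train_step(layers, pixels, target, lr=0.02):
    """One gradient-descent step on squared error; returns the loss before the step."""
    (w1, b1), (w2, b2) = layers
    hidden = apply_layer(layers[0], pixels, relu)
    output = apply_layer(layers[1], hidden, identity)
    d_out = [o - t for o, t in zip(output, target)]
    # Push the error back to the hidden layer; inactive ReLU units get no gradient.
    d_hidden = [sum(d_out[k] * w2[k][j] for k in range(len(d_out)))
                if hidden[j] > 0 else 0.0
                for j in range(len(hidden))]
    for k, dk in enumerate(d_out):
        for j, hj in enumerate(hidden):
            w2[k][j] -= lr * dk * hj
        b2[k] -= lr * dk
    for j, dj in enumerate(d_hidden):
        for i, xi in enumerate(pixels):
            w1[j][i] -= lr * dj * xi
        b1[j] -= lr * dj
    return 0.5 * sum(d * d for d in d_out)

if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical sizes: 16 input pixels, 8 hidden units, 2 outputs (x, y).
    layers = [make_layer(16, 8, rng), make_layer(8, 2, rng)]
    image = [rng.random() for _ in range(16)]   # stand-in for a camera frame
    dot = [0.25, 0.75]                          # where the on-screen dot was
    for _ in range(200):
        loss = train_step(layers, image, dot)
    print("loss:", loss, "estimate:", predict_gaze(layers, image))
```

A real system would train on many labeled photos rather than one synthetic frame, and would use far larger layers; the sketch only shows how an image goes in at the bottom, a gaze estimate comes out at the top, and training nudges the weights in between.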

This means app developers could sense eye movement on smartphones and potentially build this functionality into their apps. It’s an interesting use of crowdsourced data and Mechanical Turk that could have implications for phone use among people with disabilities… and make it easier to target advertisements.