Waymo is opening up its significant stores of autonomous driving data with a new Open Data Set it’s making available for the purposes of research. The data set isn’t for commercial use, but its definition of “research” is fairly broad, and includes researchers at other companies as well as academics.
The data set is “one of the largest, richest and most diverse self-driving data sets ever released for research,” according to Waymo principal scientist and head of Research, Drago Anguelov, who was at both Zoox and Google prior to joining Waymo last year. Anguelov said in a briefing that he initiated the push to make this data available because Waymo and several other companies working in the field are “currently hampered by the lack of suitable data sets.”
“We decided to contribute our part to make, ultimately, researchers in academia ask the right questions, and for that, they need the right data,” Anguelov said. “And I think this will help everyone in the field; it is not an admission in any way that we have problems solving these issues. But there is always room for improvement in terms of efficiency, scalability, amount of labels to need. It’s a developing field. Mostly we’re trying to get others into thinking about our problems and working with us, as opposed to doing work that’s potentially not so impactful, given the current state of things.”
The Waymo Open Data Set tries to fill in some of these gaps for the company’s research peers by providing data collected from 1,000 driving segments done by its autonomous vehicles on roads, with each segment representing 20 seconds of continuous driving. It includes driving done in Phoenix, Ariz.; Kirkland, Wash.; Mountain View, Calif.; and San Francisco, Calif., and offers a range of driving conditions, including at night, during rain, at dusk and more. The segments include data collected from five of Waymo’s own proprietary lidars, as well as five standard cameras that face front and to the sides, providing a 360-degree view captured in high resolution, along with the synchronization Waymo uses to fuse lidar and imaging data. Objects, including vehicles, pedestrians, cyclists and signage, are all labeled.
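For a rough sense of scale, the segment figures above can be turned into total driving time. The short Python sketch below uses only the two numbers quoted in this article (1,000 segments, 20 seconds each); the constant and function names are illustrative and are not part of any Waymo tooling.

```python
# Figures quoted in the article: 1,000 segments of 20 seconds each.
NUM_SEGMENTS = 1_000
SECONDS_PER_SEGMENT = 20


def total_driving_seconds(num_segments: int, seconds_per_segment: int) -> int:
    """Return the total recorded driving time in seconds."""
    return num_segments * seconds_per_segment


total_seconds = total_driving_seconds(NUM_SEGMENTS, SECONDS_PER_SEGMENT)
total_hours = total_seconds / 3600  # 3,600 seconds per hour

print(f"{total_seconds} seconds, about {total_hours:.1f} hours")
# prints "20000 seconds, about 5.6 hours"
```

By that count the release covers a little over five and a half hours of driving in total, spread across many short clips rather than a few long trips.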
Waymo has traditionally been among the more closed companies when it comes to its collected data, and it’s also the player that often touts its own long experience as a key competitive advantage (Waymo began life as Google’s Self-Driving Car project, which officially began work out of Google’s X Lab in 2009). The company has also had a high-profile legal spat over intellectual property with autonomous driving technology rival Uber, after Uber hired a former member of Waymo’s team. Naturally, then, some might be skeptical about how “open” the company actually is about the ways this data can be used.
Vijaysai Patnaik, a product lead at Waymo, explained that “research” use actually covers a lot of ground. There’s a specific licensing agreement with the data set, as you would expect, but Patnaik also gave a general explanation during the briefing about who they expect might make use of the data and for what purposes.
“That could include universities and PhD students and professors at various universities who are interested in this field, it could include independent research labs or robotics labs, for example,” Patnaik said. “There are a number of those in the Bay Area. And […] companies can use this data set as long as they comply with our license agreements, or it could also include folks like Drago [Anguelov] and his teams in other organizations.”
Other companies working in autonomous driving have taken similar approaches, with Lyft and Argo AI as two recent examples. Waymo does have a commanding lead over the rest of the field when it comes to actual time on the road and miles driven, however, so researchers in autonomous driving and in related robotics fields, including computer vision, are probably eager to see what it’s releasing.