Scale, whose army of humans annotates raw data to train self-driving and other AI systems, nabs $18M

The artificial intelligence revolution is underway in the world of technology, but as it turns out, some of the most faithful foot soldiers are still humans. A startup called Scale, which works with a team of contractors who examine and categorise visual data to train AI systems in a two-sided marketplace model, announced that it has raised an additional $18 million in a Series B round. The aim will be to expand Scale’s business to become — in the words of CEO Alexandr Wang, the 21-year-old MIT grad who co-founded Scale with Lucy Guo — “the AWS of AI, with multiple services that help companies build AI algorithms.”

“Our mission is to accelerate the development of AI apps,” Wang said. “The first product is visual data labelling, but in the future we have a broad vision of what we hope to provide.”

Wang declined to comment on the startup’s valuation in an interview. But according to Pitchbook, which notes that this round actually closed in May of this year, the post-money valuation of Scale is now $93.50 million ($75 million pre-money).

The money comes on the back of an eventful two years since the company first launched, with revenues growing 15-fold in the last year, and “multiple millions of dollars in revenue” from individual customers. (It doesn’t disclose specific numbers, however.)

Today, Scale’s base of contractors numbers around 10,000, and it works with a plethora of businesses that are developing autonomous vehicle systems, such as General Motors’ Cruise, Lyft, Zoox, Nuro, Voyage, nuTonomy and Embark. These companies send Scale’s contractors raw, unlabelled data sets by way of Scale’s API, which provides services like Semantic Segmentation, Image Annotation, and Sensor Fusion, in conjunction with its clients’ LIDAR and RADAR data sets. In total, it says it’s annotated 200,000 “miles of data” collected by self-driving cars.

AV companies are not its only customers, though. Scale also works with several non-automotive companies, such as Airbnb and Pinterest, to help build their AI-based visual search and recommendation systems. Airbnb, for example, wants better ways to ascertain what kinds of homes repeat customers like and don’t like, and to offer ways of discovering places to stay that are based on more than location and number of bedrooms. That matters most in cities where you may have too many choices and want a selection focused on what you are most likely to rent.

This latest funding round was led by Index. Existing investors Accel and Y Combinator (where Scale was incubated) also participated in this Series B, along with some notable new individual investors such as Dropbox CEO Drew Houston and Justin Kan (two YC alums themselves who have been regular investors in other YC companies). This latest round brings the total raised by Scale to $22.7 million.

When Scale first made its debut in July 2016 as part of YC’s summer cohort, the company presented itself as a more intelligent alternative to Mechanical Turk, specifically to address the demands of artificial intelligence systems that needed more interaction and nuanced responses than the typical microtask asked of a Turker.

“We’re honing in on AI broadly,” Wang said. “Our goal is to be a pick axe in the AI goldrush.”

Early efforts covered a wide spread of applications, including categorisation and content moderation, comparison, transcription and phone calling. But more recently the company has seen particular interest from self-driving car companies, specifically in the ability to look at, understand and categorise images of what might appear on a road with the kind of recognition that only a human can provide for training purposes. That means, for example, telling a scooter from a wagon, or a piece of asphalt from an article of granite-colored clothing that could look like asphalt to an unsuspecting camera.

“This sub-segment of AI, autonomous vehicles, really took off after we launched, and that segment has been the killer use case for us,” Wang said.

My experience in talking with autonomous car companies and those who work with them has been that many of them are extremely guarded about their data, so much so that there are entire companies being built to help manage this IP standoff so that no one has to share what they know, but they can still benefit from each other.

Wang says that the same holds for Scale’s clients, and part of its unique selling point is that it not only provides data identification services but does so with the assurance that its systems retain none of that data for its own or other companies’ purposes.

“We don’t share across different silos and are very clear about that,” Wang said. “These companies are very sensitive, as are all AI companies, about their data and where it goes, and we’ve been able to gain trust as a partner because we will not share or sell data to any other parties.”

Scale uses AI itself to help select contractors. “We have built a bunch of algorithms and AI to vet and train contractors,” Wang said. In the training, “we provide feedback and determine if they are getting good enough to do the work, and in terms of ensuring the quality of their work, our algorithms go through what they are doing and verify the work against our models, too. There are a lot of algorithms.”
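
To give a rough sense of what checking a contractor's work against a model could involve, here is a minimal sketch. It is not Scale's actual system, which the company did not describe in detail: the `Box` format, the `iou` and `needs_review` helpers and the 0.5 threshold are all assumptions made for illustration. The idea is to compare a contractor's bounding box with a model's predicted box using intersection-over-union (IoU) and to flag the annotation for a second look when the two disagree too much.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned bounding box in pixel coordinates (hypothetical format)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def area(box: Box) -> float:
    # Clamp at zero so an inverted (non-overlapping) box has no area.
    return max(0.0, box.x_max - box.x_min) * max(0.0, box.y_max - box.y_min)


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two boxes, between 0.0 and 1.0."""
    overlap = Box(
        max(a.x_min, b.x_min),
        max(a.y_min, b.y_min),
        min(a.x_max, b.x_max),
        min(a.y_max, b.y_max),
    )
    intersection = area(overlap)
    union = area(a) + area(b) - intersection
    return intersection / union if union > 0 else 0.0


def needs_review(human: Box, model: Box, threshold: float = 0.5) -> bool:
    """Flag an annotation when it overlaps too little with the model's box.

    The 0.5 threshold is an assumed value, not one taken from Scale.
    """
    return iou(human, model) < threshold


human_box = Box(10, 10, 110, 110)   # box drawn by a contractor
model_box = Box(20, 20, 120, 120)   # box predicted by a model
print(round(iou(human_box, model_box), 3))   # 0.681
print(needs_review(human_box, model_box))    # False
```

In a setup like this, a low overlap would not by itself mean the contractor was wrong, since the model could be the one that is mistaken. It would only single out the annotation for another pass.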

For clients who are calling in data from the public web — for example Pinterest or Airbnb — Scale uses a broader contractor pool that could include stay-at-home moms, students or others looking for extra money.

For clients who are sensitive about the data that’s being analysed — such as the car companies — the conditions are more restricted, and sometimes include centres where Scale controls the machines that are being used as well as how the data sets can be viewed.

This is one reason why Scale isn’t simply focused on growing the numbers of contractors as its only route for growing business. “We’ve noticed that when you have people who spend more time on this they do better work,” Wang said.

Wang said the Series B funding will be used to expand the kind of work Scale does for existing customers in the area of visual data analysis, as well as to gradually add in other categories of data, such as text.

“Our first goal is to improve algorithms for customers today,” he said. “There is no limit to how accurate they want to make their systems, and they need to be constantly feeding their AI with more data. All of our customers have this, and it’s an evergreen problem.”

The second goal is to diversify beyond driving and visual data, he said. “Right now, so much of the success has been in processing imagery and robotics or other perception challenges, but we really want to be the fabric of the AI world for new applications, including text or audio. That is another use of funds to expand to those areas.”

“Fabric” is the operative word, it seems: “Scale has the potential to become the fabric that connects and powers the Artificial Intelligence world,” said Mike Volpi, General Partner, Index Ventures, in a statement. “For autonomous vehicles in particular, Scale is well-positioned to take over an emerging field of data annotation regardless of which players ultimately come out on top. Alex…has recruited a highly talented and technical team to tackle this challenge and their progress is evident in the marquee list of customers they’ve won in such a short amount of time.”