Amazon launches an automated labeling service for its SageMaker machine learning tool

You can’t build a good machine learning model without good training data. But building those training sets is hard, often manual work that involves, for example, labeling thousands and thousands of images. With SageMaker, AWS has been working on a service that makes building machine learning models a lot easier. But until today, that labeling task was still up to the user. Now, however, the company is launching SageMaker Ground Truth, a training set labeling service.

Using Ground Truth, developers can point the service at the storage buckets that hold the data and allow the service to automatically label it. What’s nifty here is that you can either set a confidence level for the fully automatic service or send the data to human labelers. Those human labelers, who probably have the most mind-numbing job in tech, can be either the company’s Mechanical Turk users or a third-party service. If you really hate your employees, you can have them do the labeling, too.

Currently the service supports text classification, image classification, object detection and semantic segmentation. Users can also create their own tasks.

As the labeling data comes in, Ground Truth pulls some of the objects and sends them to human labelers, then uses the results to build a new custom model for the user.
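To make the confidence-level idea concrete, here is a minimal sketch in plain Python of how predictions might be split between automatic labeling and human review. It is not Ground Truth's actual API or algorithm: the `Prediction` type, the `route_by_confidence` function, the 0.9 threshold and the object keys are all hypothetical, chosen only for illustration.

```python
from typing import List, NamedTuple, Tuple


class Prediction(NamedTuple):
    """A hypothetical model prediction for one object in a storage bucket."""
    object_key: str    # illustrative key, not a real bucket path
    label: str
    confidence: float  # model confidence, assumed to lie in [0, 1]


def route_by_confidence(
    predictions: List[Prediction], threshold: float = 0.9
) -> Tuple[List[Prediction], List[Prediction]]:
    """Accept labels at or above the threshold; queue the rest for humans."""
    auto_labeled, needs_human = [], []
    for p in predictions:
        if p.confidence >= threshold:
            auto_labeled.append(p)
        else:
            needs_human.append(p)
    return auto_labeled, needs_human


# Made-up example data, for illustration only.
predictions = [
    Prediction("images/cat_001.jpg", "cat", 0.97),
    Prediction("images/dog_014.jpg", "dog", 0.62),
    Prediction("images/cat_022.jpg", "cat", 0.91),
]

auto_labeled, needs_human = route_by_confidence(predictions, threshold=0.9)
print("auto-labeled:", [p.object_key for p in auto_labeled])
print("sent to humans:", [p.object_key for p in needs_human])
```

Raising the threshold in a setup like this would send more objects to human labelers; lowering it would accept more automatic labels at the risk of more mistakes.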

“This is a total game changer in being able to label your data,” said AWS CEO Andy Jassy. “So you can build those types of models that before were really difficult or nearly impossible or too expensive to do.”
