AWS announces new Inferentia machine learning chip

AWS is not content to cede any part of any market to any company. When it comes to machine learning chips, names like Nvidia or Google come to mind, but today at AWS re:Invent in Las Vegas, the company announced a new dedicated machine learning chip of its own called Inferentia.

“Inferentia will be a very high-throughput, low-latency, sustained-performance very cost-effective processor,” AWS CEO Andy Jassy explained during the announcement.

Holger Mueller, an analyst with Constellation Research, says that while Amazon is far behind, this is a good step for them as companies try to differentiate their machine learning approaches in the future.

“The speed and cost of running machine learning operations — ideally in deep learning — are a competitive differentiator for enterprises. Speed advantages will make or break success of enterprises (and nations when you think of warfare). That speed can only be achieved with custom hardware, and Inferentia is AWS’s first step to get in to this game,” Mueller told TechCrunch. As he pointed out, Google has a 2-3 year head start with its TPU infrastructure.

Inferentia supports popular data types like INT8 and FP16, as well as mixed precision. What’s more, it supports multiple machine learning frameworks, including TensorFlow, Caffe2 and ONNX.
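For readers unfamiliar with the terms, INT8 and FP16 are reduced-precision number formats that trade a little accuracy for smaller, faster arithmetic during inference. The sketch below is only an illustration of that trade-off in plain Python; it says nothing about how Inferentia itself is implemented. The helper names (`to_fp16`, `quantize_int8`, `dequantize_int8`), the sample weights and the symmetric per-tensor scaling scheme are all assumptions chosen for the example.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision (FP16) value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def quantize_int8(values):
    """Symmetric per-tensor INT8 quantization (an assumed, common scheme).

    Returns the integer codes and the scale needed to map them back to floats.
    """
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127 if max_abs else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize_int8(codes, scale):
    """Map INT8 codes back to approximate floating-point values."""
    return [c * scale for c in codes]


# Hypothetical model weights, purely for illustration.
weights = [0.5, -1.27, 0.003, 1.0]

fp16_weights = [to_fp16(w) for w in weights]
int8_codes, scale = quantize_int8(weights)
restored = dequantize_int8(int8_codes, scale)

for w, h, r in zip(weights, fp16_weights, restored):
    print(f"original={w:+.6f}  fp16={h:+.6f}  int8->float={r:+.6f}")
```

Mixed precision generally means combining such formats, for example storing values in a narrow format while accumulating results in a wider one, so that most of the speed benefit is kept without giving up as much accuracy.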

Of course, being an Amazon product, it also works with popular AWS products such as EC2, SageMaker and the new Elastic Inference service announced today.

Although the chip was announced today, Jassy indicated it won’t actually be available until next year.