As companies build more complex machine learning models, the cost of training and running these models becomes a real issue. AWS has created a series of custom instances to help bring down the cost, and today it introduced a preview of an all-new Inf2 instance for EC2 designed to process data from larger workloads more efficiently.
AWS CEO Adam Selipsky made the announcement today at AWS re:Invent in Las Vegas.
As Selipsky told the audience, “Inf1 is great for small-to-medium complexity models, but for larger models, customers have often relied on more powerful instances because they don’t actually have the optimal resource configuration for their inference workloads.”
Customers turned to those more powerful instances because, until now, there simply wasn’t another solution available to bring down the cost and complexity of processing these larger workloads.
“You want to choose the solution that is the best fit for your specific needs, which is why today I’m excited to announce a preview of the Inf2 instance powered by our new Inferentia2 chip,” he said.
For folks who need that extra power, Inf2 provides it. “Customers can deploy a 175 billion parameter model for inference on a single instance with four times higher throughput and 1/10 the latency of Inf1 instances,” he said.
The new instances are available in preview starting today.