For starters, the company is making Nvidia K80 GPUs generally available. At the same time, it’s launching beta support for Nvidia P100 GPUs along with a new sustained-use pricing model.
For companies with machine learning workloads, access to GPUs in the cloud means paying only for what they use, thanks to by-the-minute pricing. The sustained-use model goes further: customers who run GPUs for an extended period get a discount of up to 30 percent, depending on usage. In other words, they don’t get whacked with a huge bill just because they like the service.
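To make the discount concrete, here is a small sketch of how a tiered sustained-use discount works. The tier breakpoints below are an assumption modeled on Google Compute Engine’s published sustained-use tiers for VMs at the time (100%, 80%, 60% and 40% of the base rate for each successive quarter of the month), which net out to the 30 percent maximum discount mentioned above; the article does not specify the GPU tiers themselves.

```python
# Sketch of a sustained-use discount model. The tier values are assumptions
# based on Compute Engine's published VM tiers, not GPU-specific figures:
# each successive quarter of the month is billed at a lower multiplier.
TIERS = [(0.25, 1.00), (0.25, 0.80), (0.25, 0.60), (0.25, 0.40)]

def effective_rate(fraction_of_month: float) -> float:
    """Average price multiplier for running `fraction_of_month` of a month."""
    billed = 0.0
    remaining = fraction_of_month
    for width, multiplier in TIERS:
        used = min(width, remaining)
        billed += used * multiplier
        remaining -= used
        if remaining <= 0:
            break
    return billed / fraction_of_month

# Running a full month yields the maximum discount:
print(f"{1 - effective_rate(1.0):.0%} off for a full month")    # 30% off
print(f"{1 - effective_rate(0.25):.0%} off for a quarter month")  # 0% off
```

The key property is that the discount accrues automatically with usage rather than requiring an upfront commitment, which is what keeps long-running jobs from being penalized.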
And Google claims this approach can deliver bare-metal-class performance. “Cloud GPUs are offered in passthrough mode to provide bare-metal performance. Attach up to 4 P100 or 8 K80 per VM (we offer up to 4 K80 boards, that come with 2 GPUs per board),” Google wrote in the blog post announcing the GPU support.
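In practice, attaching GPUs happens at instance creation time. A sketch of what that looks like with the gcloud CLI is below; the instance name, zone and machine type are placeholders, and GPU-attached VMs must use a terminate-on-maintenance policy since passthrough devices cannot be live-migrated.

```shell
# Hypothetical example: create a VM with 8 K80 GPUs (4 boards) attached
# in passthrough mode. Name, zone and machine type are placeholders.
gcloud compute instances create gpu-instance-1 \
    --zone us-east1-d \
    --machine-type n1-standard-16 \
    --accelerator type=nvidia-tesla-k80,count=8 \
    --maintenance-policy TERMINATE \
    --restart-on-failure
```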
What’s more, Google is offering the flexibility to run GPU workloads in virtual machines or containers, and it is delivering the service in four global locations.
Google is seeing customers use this service for a variety of compute-intensive tasks such as genomics, computational finance, and training and inference on machine learning models. The two chips give teams the flexibility to choose the hardware that best fits their workloads while balancing speed and price. Among its early GPU cloud customers is Shazam, which uses GPUs to help fuel its music identification service.