Google's second-generation TPU chips take machine learning processing to a new level

Google announced the next generation of its custom Tensor Processing Unit (TPU) machine learning chips at Google I/O today. These chips, which are designed specifically to speed up machine learning tasks, are supposed to be more capable than CPUs or even GPUs at these tasks, and they are an upgrade from the first generation of chips the company announced at last year’s I/O.

And speed up they have. Google claims that each second-generation TPU can deliver up to 180 teraflops of performance. We will have to wait and see what typical benchmarks look like, but the new chips are a step forward in more than speed. The first-generation TPU was only able to handle inference. The new one can also be used for training machine learning models, which means a significant part of the machine learning workflow can now run on this single, powerful chip.

That means you can train a machine learning model, for example, to identify whether an object in a photo is a tree, a car or a cat. Inference in machine learning refers to running the trained model on new data to reach a conclusion, along with the statistical likelihood that the conclusion is correct. For example, based on the model, you may be 85 percent confident that this is actually a tree and not a broccoli stalk.
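To make the inference step concrete, here is a minimal sketch in plain Python of how a classifier's raw scores are commonly turned into a confidence figure using a softmax. The labels and scores are made up for illustration and do not come from any Google model, and the `infer` helper is a hypothetical name, not part of any TPU or TensorFlow API.

```python
import math

def softmax(scores):
    """Turn raw model scores into probabilities that sum to 1."""
    highest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def infer(labels, scores):
    """Return the most likely label and the model's confidence in it."""
    probs = softmax(scores)
    best = max(range(len(labels)), key=lambda i: probs[i])
    return labels[best], probs[best]

# Hypothetical output of an already-trained model for one photo.
LABELS = ["tree", "car", "cat", "broccoli"]
SCORES = [3.0, 0.2, 0.1, 1.2]  # illustrative values only

label, confidence = infer(LABELS, SCORES)
print(f"{label}: {confidence:.0%} confident")
```

Training is the expensive part that produces the scores in the first place; the inference step shown here is comparatively cheap, which is why a chip that handles both is notable.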

How fast are these new chips? “To put this into perspective, our new large-scale translation model takes a full day to train on 32 of the world’s best commercially available GPU’s—while one 1/8th of a TPU pod can do the job in an afternoon,” Google wrote in a statement.

Google’s second-generation Tensor Processing Unit chipset. Photo: Google

It’s always hard to know how useful these comparisons are in practice, but they should at least give you a sense of the speed compared to GPUs, which are typically the most powerful chips being used in machine learning operations today.
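For a rough sense of what Google's claim implies, the back-of-the-envelope sketch below compares device-hours on each side. Two of the inputs are assumptions that do not appear in Google's statement as quoted here: that a TPU pod contains 64 second-generation TPUs, and that "an afternoon" means about six hours. Change either and the ratio changes with it.

```python
def device_hours(devices, hours):
    """Total device-hours spent on one training run."""
    return devices * hours

# From Google's statement: 32 GPUs for a full day.
GPU_COUNT = 32
GPU_RUN_HOURS = 24.0

# Assumptions, not taken from the quoted statement:
TPUS_PER_POD = 64        # assumed pod size
POD_FRACTION = 1 / 8     # "1/8th of a TPU pod" is from the statement
AFTERNOON_HOURS = 6.0    # assumed length of "an afternoon"

tpu_count = int(TPUS_PER_POD * POD_FRACTION)
gpu_total = device_hours(GPU_COUNT, GPU_RUN_HOURS)
tpu_total = device_hours(tpu_count, AFTERNOON_HOURS)

# How much sooner the job finishes, and how many GPU-hours
# correspond to one TPU-hour under these assumptions.
wall_clock_speedup = GPU_RUN_HOURS / AFTERNOON_HOURS
per_device_ratio = gpu_total / tpu_total

print(f"TPUs used: {tpu_count}")
print(f"Wall-clock speedup: {wall_clock_speedup:.0f}x")
print(f"GPU-hours per TPU-hour: {per_device_ratio:.0f}")
```

This is only an estimate for a single workload chosen by Google, so it says little about how the chips compare on other models.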

What’s more, Google is making these chips available as a service on the Google Cloud Platform, substantially lowering the barrier to entry to this technology. It is also allowing users to start building their models on competing chips, such as Intel’s Skylake CPUs or Nvidia’s Volta GPUs, and then move the project to Google’s TPU cloud for final processing.

And if cost is a barrier, Google also announced free (as in beer) access to the TensorFlow Research Cloud, a cluster of 1,000 Cloud TPUs for researchers working on open machine learning research.