Google Cloud wants to make it easier for data scientists to share models

Today, Google Cloud announced Kubeflow pipelines and AI Hub, two tools designed to help data scientists put the models they create to work across their organizations.

Rajen Sheth, director of product management for Google Cloud’s AI and ML products, says the company recognized that data scientists too often build models that never get used. If machine learning really is a team sport, as Google believes, he says, models must get passed from data scientists to the data engineers and developers who can build applications based on them.

To help fix that, Google is announcing Kubeflow pipelines, an extension of Kubeflow, the open-source machine learning framework built on top of Kubernetes. Pipelines are essentially containerized building blocks that people in the machine learning ecosystem can string together to build and manage machine learning workflows.
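To make the "building blocks strung together" idea concrete, here is a minimal sketch in plain Python. It does not use the Kubeflow pipelines API; the function names (`run_pipeline`, `clean`, `scale`, `summarize`) are hypothetical stand-ins, and each function plays the role that a container would play in a real pipeline.

```python
from typing import Any, Callable, List, Tuple

# Hypothetical stand-in for a containerized step: a name plus a function.
# In a real pipeline each step would run in its own container image.
Step = Tuple[str, Callable[[Any], Any]]


def run_pipeline(steps: List[Step], data: Any) -> Any:
    """Run the steps in order, feeding each step's output into the next."""
    for _name, step in steps:
        data = step(data)
    return data


def clean(rows):
    # Drop missing values.
    return [r for r in rows if r is not None]


def scale(rows):
    # Divide every value by the largest one.
    top = max(rows)
    return [r / top for r in rows]


def summarize(rows):
    # Reduce the rows to their mean.
    return sum(rows) / len(rows)


# String the blocks together into a workflow.
pipeline = [("clean", clean), ("scale", scale), ("summarize", summarize)]
result = run_pipeline(pipeline, [2, None, 4, 8])
```

Because each step only depends on the output of the one before it, a single block can be replaced without touching the rest of the list, which is roughly the property the containerized approach is meant to provide.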

By placing the model in a container, data scientists can adjust the underlying model as needed and relaunch it, in an approach similar to continuous delivery. Sheth says this opens up even more possibilities for model usage in a company.

“[Kubeflow pipelines] also give users a way to experiment with different pipeline variants to identify which ones produce the best outcomes in a reliable and reproducible environment,” Sheth wrote in a blog post announcing the new machine learning features.

The company is also announcing AI Hub, which, as the name implies, is a central place where data scientists can go to find different kinds of ML content, including Kubeflow pipelines, Jupyter notebooks, TensorFlow modules and so forth. This will be a public repository seeded with resources developed by Google Cloud AI, Google Research and other teams across Google, allowing data scientists to take advantage of Google’s own research and development expertise.

But Google wanted the hub to be more than a public library — it also sees it as a place where teams can share information privately inside their organizations, giving it a dual purpose. This should provide another way to extend model usage by making essential building blocks available in a central repository.

AI Hub will be available in alpha starting today with some initial components from Google, as well as tools for sharing some internal resources, and the plan is to keep expanding the offerings and capabilities over time.

Google believes that the easier it is to share model building blocks across an organization, the more likely those models are to be put to work. These tools are a step toward achieving that.