Google Cloud Platform has long been able to run Hadoop, so developers can do advanced analytics on its distributed computing infrastructure. Today, Google is attempting to simplify this process with a new connector that the company says makes it easier to run Hadoop on the Google Cloud Platform.
The Google Cloud Storage connector for Hadoop manages the cluster and file system for developers, so they can focus on the logic of their data processing rather than on setting up and managing the infrastructure themselves.
Google described the Google File System in a paper published in 2003. Its design became the model for the storage layer of Hadoop, an open-source distributed computing environment managed by the Apache Software Foundation. Hadoop splits data into small pieces that are kept on server clusters and then processed in parallel for data analytics. A diverse ecosystem has emerged around Hadoop, anchored by companies such as Cloudera and Hortonworks.
The new Google Cloud Storage connector for Hadoop builds on Colossus, the latest iteration of the company’s storage system. The service uses a simple connector library that lets Hadoop run directly against data in Google Cloud Storage, so customers can draw on Google’s expertise in large-scale data processing.
Google lists a number of benefits that come with the new connector. Developers can start up a Hadoop cluster quickly because the data is all managed in one place on Google Cloud Storage. The connector draws on Google’s scalability, which the company says gives users higher availability. It also costs less because there is no need to keep two copies of the data. With most traditional Hadoop systems, the user has to keep one copy of the data for running Hadoop and another for backup. Here, Google handles both through its storage system.
Hadoop has hit the mainstream of the data-analytics world. As I noted in a post last month, Hadoop is an important technology for Internet companies like Twitter that process data by the petabyte. It is also increasingly important for more traditional organizations that now must process unprecedented amounts of information.
But Hadoop has traditionally been a complex undertaking, requiring people with multiple talents to unleash its potential. The Google Cloud Storage connector for Hadoop shows how Hadoop is becoming more accessible, available as a service rather than as a project that demands a wide range of specialized skills.