Yahoo Releases Internal Hadoop Source Code

Yahoo! is releasing the tested Hadoop source code that it uses to help power its sites and products. Hadoop is a free Java software framework born out of an open-source implementation of Google’s published computing infrastructure and fostered within the Apache Software Foundation. Yahoo made the announcement today at the second annual Hadoop Summit in Santa Clara, California, which was co-sponsored by other cloud computing vendors Amazon Web Services, Cloudera, IBM, and Sun Microsystems.

Yahoo! has been the primary developer of and investor in Apache’s Hadoop. In 2006, Hadoop founder Doug Cutting joined Yahoo to lead development of the open-source software. Hadoop now provides the framework for many Yahoo properties, including Yahoo Search, Yahoo Mail, and several content and ad services. At Yahoo, Hadoop runs on more than 25,000 servers and analyzes billions of Web pages.

Yahoo says it’s opening up the source code to Hadoop to “increase the pace of innovation around open and collaborative research and development.” Hadoop is currently being used by a number of cloud computing vendors, including Amazon Web Services (to power its Elastic MapReduce feature), IBM (for its Blue Cloud Initiative), and Google. Startup Cloudera offers its own Hadoop-powered computational services on top of Amazon’s EC2.

Yahoo hasn’t been doing much in the cloud computing space, but releasing this code could further its commitment to making a name in the cloud. It has a long way to go to catch up to Amazon, IBM, Google, Microsoft, and others, but this release may get more developers engaged with Hadoop.