Intel has launched its own Hadoop distribution, entering an already crowded market of major players all looking to get a piece of the big data pie. The company also announced an open-source effort to enhance security in Hadoop.
Earlier this week, EMC and HP each announced its own Hadoop distribution. But for Intel, the challenge is to fortify its market-leading position in the data center, where it will face increasing competition from an emerging ARM ecosystem.
Intel says the distribution is optimized for the Intel Xeon processor platform. In its announcement, the company states that analyzing one terabyte of data, which previously took more than four hours to fully process, can now be done in seven minutes.
Partners supporting the launch include Cisco, Datameer, Dell, Hadapt, LucidWorks, Red Hat, SAP, Tableau Software, Teradata, Wipro and Zettaset.
As part of the news, Intel has also launched Project Rhino, an open-source effort to improve the data protection capabilities of the Hadoop ecosystem and contribute the code back to the Apache Foundation.
Avik Dey, director of Hadoop Services at Intel, posted the details of Project Rhino last night on the Apache Hadoop mailing list.
The project will seek to improve encryption, provide better ways to authenticate users, and make security more granular and available at the “cell” level.
Ely Kahn is co-founder of big data startup sqrrl and the former director of cybersecurity at the White House. He said in an email interview that his team is following Rhino closely:
We are seeing more and more customers in sectors such as healthcare, finance, and government wanting to take Hadoop to the next level by integrating big data with mission-critical systems and sensitive data. In order for this to happen, Hadoop and NoSQL databases need to adopt enterprise security functionality, such as encryption, fine-grained access controls, and auditing capabilities. Project Rhino is a good validation of this.