Cloud Giants And The War For Customer Data

Timothy Howes Contributor

Timothy Howes is Chief Technology Officer of ClearStory Data, where he leads innovation on ClearStory’s Spark-based data analysis platform.

Apache Spark’s effect on the data landscape is akin to the leap in Internet applications enabled by the move from dial-up to broadband.

That revolution was not just about applications getting faster, but about the rise of new applications that consumers previously could never reach, because the “pipe” was too narrow and slow to support them.

Think real-time communication, streaming music and video, massively multiplayer gaming, and other bandwidth-intensive applications. When it comes to data and delivering information to business people who need it, Spark enables a similar quantum leap in information access.

With 90 percent of the world’s data created in the last several years, the pace of data creation is only accelerating. It’s no wonder vendors and customers alike view this as a pivotal moment in data history.

As enterprises herd their applications (and therefore data) into the cloud, traditional enterprise data vendors are in danger of being left in the dust, if not of outright extinction. But if the dinosaurs taught us anything, it’s that the extinction of one species is another’s opportunity. Enter the cloud computing giants.

Behemoths like Amazon, Google, Microsoft, and IBM aspire to own the cloud environment where companies run their applications, and more importantly, where customers store their data. This link is key. Data provides stickiness, but it follows the applications. Thus, the big battle for who owns the cloud is ultimately a battle for who owns customers and their data.

Where does Apache Spark – just massively endorsed by IBM as potentially the most important open source project of the next decade – fit in?

Like the quantum leap in Internet bandwidth, many think Spark’s real-time processing capabilities will ignite new ways of working with data, providing streams of constantly refreshing data that employees, partners, and customers can tune in to, much like consumers tune in to their TV or other media.

Mobile devices today are easy to use and fast enough to work interactively with data analytics systems driven directly by business users who need answers about what’s happening in the business. End-to-end data streams that carry data quality of service metrics, along with the insights they produce, pave the way to confident, data-driven decisions.

Some cloud giants view the data-crunching acceleration of Spark – tied to the vast data stores of companies and a host of new data channels – as an enticing way to lock their brands in with how we’ll work in the future. New ways of sharing and collaborating with the latest data streams will be the new enterprise backbone. Everyday business users will be at the receiving end of this, able to drive through information and insights as quickly and easily as they do with broadband-powered apps.

As with cable TV and its predecessors, the quality of data – from data-producing systems to data-delivery applications – will help differentiate the workplace. In essence, higher-fidelity data will help set new services apart, along with speed of delivery, business benefits, and certification that data is credible so business leaders can confidently take action.

To create data stickiness, access to data-driven insights should be fast, and information should be simple for everyone to obtain. These are a few explanations for IBM’s substantial “all in” commitment to Apache Spark – and why a host of other big brands like Amazon, Google and Microsoft are doing the same. They see we’re at a pivotal moment in data history and recognize the game-changing potential for Spark to impact how we live and work with data every day.