Amazon Kinesis, the company’s new data streaming analytics platform, is now in public beta. It allows developers to build real-time apps without managing the complexity of multiple clusters. Although it has been heralded as a new type of real-time app platform, some drawbacks have emerged since its launch at AWS Re:Invent.
AWS Kinesis handles thousands of data streams on a per-second basis. It allows developers to pull any amount of data from any number of sources, scaling up and down as needed. The power of the platform comes from its ability to process data in a world where sensors are transmitting information in any number of ways, said Amazon CTO Werner Vogels at AWS Re:Invent. Vogels made the point that sensors will increasingly be used to record data. With Kinesis, for example, builders could look at sensor data and determine the best time to pour the concrete for a foundation.
Kinesis works across multiple availability zones, and the data is replicated for high availability. The service divides each stream into shards, with each shard handling up to 1,000 write transactions and up to 20 read transactions per second.
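To make those capacity figures concrete, here is a rough sizing sketch. It assumes the quoted limits apply per shard and per second, and the function name and sample rates are hypothetical, chosen only for illustration. It is not an AWS API.

```python
import math

# Per-shard limits as quoted above; treating them as per-second
# figures is an assumption made for this sketch.
WRITES_PER_SHARD = 1000
READS_PER_SHARD = 20


def shards_needed(writes_per_sec, reads_per_sec):
    """Return the smallest shard count that covers both rates."""
    by_writes = math.ceil(writes_per_sec / WRITES_PER_SHARD)
    by_reads = math.ceil(reads_per_sec / READS_PER_SHARD)
    # A stream needs at least one shard even when it is idle.
    return max(1, by_writes, by_reads)


# Hypothetical workload: 4,500 writes and 30 reads per second.
print(shards_needed(4500, 30))
```

In this example the write rate is the bottleneck, so the count is driven by writes rather than reads.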
Here’s a video primer on the new Kinesis service, made last week by Bernard Golden, a senior director of cloud management enterprise solutions at Dell.
AWS is positioning Kinesis as an alternative to Hadoop, which has traditionally relied on batch processing for analysis. But that tells only part of the story. Hadoop has a diverse ecosystem behind it, with new pieces such as YARN, which provides real-time processing capability and sets the stage for building real-time apps.
In his tests, Worley found that Kinesis is elastic and able to scale automatically based on the load, which takes away some of the complexity of managing EC2 clusters. If it works as advertised, he writes, it will greatly simplify cluster operations compared with a Storm setup.
Kinesis has a lower barrier to entry than Hadoop batch processing, Worley writes. But overall, Kinesis is not built for complex data stream integrations.
The downside, though, is that every Kinesis application consists of just one procedure, so you can’t do complex stream processing like can be done with Storm unless you connect together multiple Kinesis applications. Naturally, I have some concerns about this.
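The idea of connecting several single-procedure applications can be sketched in a few lines. This is only a conceptual model: plain Python lists stand in for streams, no AWS calls are made, and every name here (`run_app`, `parse`, `keep_hot`) is hypothetical.

```python
def run_app(procedure, input_stream):
    """Model one application: a single procedure applied to each record.

    Records for which the procedure returns None are dropped.
    """
    output_stream = []
    for record in input_stream:
        result = procedure(record)
        if result is not None:
            output_stream.append(result)
    return output_stream


def parse(record):
    """First stage: turn a 'sensor,value' string into a dict."""
    sensor, value = record.split(",")
    return {"sensor": sensor, "value": float(value)}


def keep_hot(reading):
    """Second stage: pass along only readings above 30.0."""
    return reading if reading["value"] > 30.0 else None


# Each stage writes to an intermediate stream that the next stage reads.
raw_stream = ["a,21.5", "b,35.0", "a,31.2"]
parsed_stream = run_app(parse, raw_stream)
hot_stream = run_app(keep_hot, parsed_stream)
print([r["sensor"] for r in hot_stream])
```

Every extra processing step means another application and another intermediate stream to manage, which is the overhead the concern above points to.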
Kinesis does mark a new era in the analytics world with its data streaming capability. But AWS is not the first on the scene and is by no means the clear leader. There is a growing list of options from the open-source community that are viable alternatives to AWS and its proprietary infrastructure.