Citus Data Update Gives True Real-Time Analytics

Sometimes necessity truly is the mother of invention, and such was the case recently with Citus Data, the company that helps you extend PostgreSQL and run analytics across massive amounts of data in near real time. The “near” part was the problem: while PostgreSQL had the real-time capability, CitusDB, the Citus Data product, did not. This was a gap the company wanted to fill.

Today, Citus Data announced an update to the product that brings this real-time processing (or as close as you can get to it) to CitusDB users. Umur Cubukcu, co-founder at Citus Data, says customers were asking for this capability. Even though the underlying PostgreSQL database had it, CitusDB did not, and he was forced to ask customers to build the capability themselves. Cubukcu says he hated doing this, and after several customers made requests, he realized he needed to build it into the product.

The update is called pg_shard and it not only provides a way to deliver ad-hoc query results faster, it also builds on the CitusDB capability to grow the network and the database. “It allows you to add more machines as your data grows; and through built-in replication, automatically enjoy high availability if your machines or network fail,” Cubukcu explained.
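
To make the idea of adding machines and surviving failures through replication more concrete, here is a minimal Python sketch of hash-based sharding with replica placement. It is an illustration only, not pg_shard's actual code: the function names, the worker names, the round-robin placement and the choice of hash are all assumptions made for this example.

```python
import hashlib
from typing import Dict, FrozenSet, List


def shard_for_key(key: str, shard_count: int) -> int:
    """Map a key (e.g. a customer id) to a shard id with a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def place_shards(workers: List[str], shard_count: int,
                 replication_factor: int) -> Dict[int, List[str]]:
    """Assign every shard to `replication_factor` distinct workers (round-robin)."""
    if replication_factor > len(workers):
        raise ValueError("replication factor cannot exceed the number of workers")
    return {
        shard_id: [workers[(shard_id + r) % len(workers)]
                   for r in range(replication_factor)]
        for shard_id in range(shard_count)
    }


def route(key: str, placements: Dict[int, List[str]],
          failed: FrozenSet[str] = frozenset()) -> List[str]:
    """Return the workers that hold the key's shard and are still reachable."""
    shard_id = shard_for_key(key, len(placements))
    return [w for w in placements[shard_id] if w not in failed]


# Hypothetical cluster: three workers, six shards, two copies of each shard.
placements = place_shards(["worker-1", "worker-2", "worker-3"], 6, 2)
healthy = route("customer-42", placements)
degraded = route("customer-42", placements, failed=frozenset({healthy[0]}))
```

In this sketch, every shard lives on two different workers, so a lookup still finds a live copy when any single worker goes down. Adding capacity would mean passing a longer worker list to the hypothetical `place_shards` helper.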

The pg_shard piece works with either PostgreSQL or the CitusDB product.

The update allows users to submit a short command or request and get results back in milliseconds, for situations where that kind of time difference really counts. Sometimes you can afford to wait a few seconds or even a couple of minutes for a query to return an answer, and sometimes you need it now because it’s that important. This new feature gives users that speed when they need it.

What’s more, Cubukcu says it’s an open source extension that doesn’t require changes at the application layer, has no middleware to manage and doesn’t require you to retrain your database administrators.

This could be useful in several scenarios, but it works particularly well if you are dealing with large volumes of machine-generated data (e.g., user event logs, clickstream, sensor logs, etc.) and want to keep the familiar relational semantics and PostgreSQL reliability, he explained.

PostgreSQL is an increasingly popular database and is used by many startups, including Instagram. Citus Data has raised almost $5M to date and currently has 13 employees, with plans to add more in 2015.