Crate Lets Developers Set Up Big Data Backends In Minutes

Big data is (still) hot, but setting up the backend servers to work with huge amounts of information isn’t easy. It often involves setting up many different services and once you’re done, you still don’t know how well you’ll be able to scale everything. Crate, which is presenting at TechCrunch Disrupt Europe in London today, massively simplifies this process. With Crate, developers can quickly set up a distributed database cluster, either on their own hardware or in a public cloud, and know that it will be able to scale.

The guiding principle for Crate is simplicity. Not only is it easy to set up, but once everything is up and running, developers can use standard SQL queries to work with their data.

The company was founded by Jodok Batlogg (CEO), Christian Lutz (COO) and Bernd Dorn (CTO). Before starting Crate, the team ran a consulting business that helped companies use open source tools to meet their big data needs. Its clients included the likes of StudiVZ, Germany’s version of Facebook. “We got really good at building all kinds of backends for our clients, using a wide variety of software stacks,” Lutz told me earlier this week. About a year-and-a-half ago, however, the team decided that it could take this knowledge and turn it into a product, which eventually became Crate (and the IT services business is now being run by some of its former employees).

Crate was developed on top of a wide variety of open-source projects, including Facebook’s Presto SQL parser, the Netty network application framework, and the Apache Lucene search library. The inspiration for the project, Lutz tells me, was Elasticsearch (which Crate also uses) and the way you can set up a distributed search engine with it.

Instead of running — and trying to scale — a MongoDB system with Elasticsearch, Crate promises that its users can get most of the benefits of those systems (it can store tabular data, unstructured records and binary objects) without the hassle.

Developers get a lot of flexibility in how they can use Crate. One use case the company advocates is for developers to put Crate right onto their application servers and to dedicate about half the memory on that server for Crate. That way, data can easily be replicated across a large number of machines and because the data lives right on the application server, you also get some performance gains.

“Typically, you have a single point of failure database and lots of application servers,” Lutz said. “We say: kill that database and install Crate on every application server.”

That’s just one way of using Crate, however. Developers can also run it on a set of dedicated machines, which is what some of the companies that use it in production, including ClearVoice, do. Since earlier this month, Crate has also been available through the Docker Hub Registry, so to install it, Docker users can now simply type “docker pull crate” and start working with the service. Importing existing data into Crate should be pretty straightforward, the team argues. Because it can work with JSON objects, for example, a move from MongoDB should be simple, and the system can ingest this data through its REST API.
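
To give a rough idea of what such an import could look like, here is a minimal Python sketch that turns one JSON-style document into a parameterized SQL INSERT and wraps it in an HTTP request. It assumes a Crate node running locally on port 4200 with its SQL-over-HTTP endpoint at /_sql; the table name, the sample document and the helper function are made up for illustration and do not come from the Crate team.

```python
import json
import urllib.request

# Assumption: a Crate node is reachable locally on port 4200 and exposes
# its SQL-over-HTTP endpoint at /_sql.
CRATE_SQL_URL = "http://localhost:4200/_sql"


def build_insert_request(table, doc, url=CRATE_SQL_URL):
    """Turn one JSON-style document into a parameterized INSERT request.

    `table` and the keys of `doc` are hypothetical; they must match a
    table that already exists in the cluster.
    """
    columns = sorted(doc)  # fixed column order keeps the statement stable
    stmt = "INSERT INTO {} ({}) VALUES ({})".format(
        table, ", ".join(columns), ", ".join("?" for _ in columns)
    )
    body = json.dumps({"stmt": stmt, "args": [doc[c] for c in columns]})
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical document, e.g. one exported from MongoDB.
    request = build_insert_request("users", {"name": "Ada", "age": 36})
    print(request.data.decode("utf-8"))
    # Actually sending it requires a running node:
    # urllib.request.urlopen(request)
```

Building the request separately from sending it makes it possible to inspect the generated statement before anything touches a live cluster.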

While the project is open source, the team plans to charge for additional services like priority access to engineers, help with managing clusters, etc. It currently offers a 1,000 Euro/month plan for this and will launch an additional plan for enterprise customers in the near future. As Lutz tells me, the company doesn’t want to charge per node, however, which would be the standard business model. Instead, it’s going to charge per cluster. The reason for this is that Crate works best in clusters with many nodes and “we want people to use node,” he said.

Crate has the potential to greatly simplify the lives of developers who have to manage large database setups. The team has already raised 1.5 million Euro in seed funding from Denmark’s Sunstone Capital (which specializes in these kinds of open source projects) and DFJ Esprit.

Disrupt Q&A

Who does something similar?

Hadoop, Elasticsearch and MongoDB, but they don’t do SQL. And there are a few startups that are trying to go after every developer.

How easy is it to migrate to yours?

Pretty easy, we have client drivers for most programming languages.

How do you market?

We are experimenting with paid traffic, we produce relevant content and we are at every developer conference we can get to.

Traction?

Our growth is about 20 to 30 percent and we now have our first enterprise customer.

What is the sweet spot for your users?

Real-time analytics, Internet of Things and web-scale APIs. It works for companies that have 5,000 or 5 million users. It’s great for enterprises and small startups that want to make sure they can scale.

Correction: in a previous version, the story said the Crate team had shut down its former consulting business. That was incorrect. The company is actually still up and running and being run by some of its former employees.