JethroData, a company building an analytics database on Hadoop, announced today that it has closed a $4.5 million investment round led by Pitango Venture Capital.
JethroData, based in Israel, combines the storage scalability of Hadoop with the query performance of a fully indexed, columnar analytic database. Columnar databases have historically been used in data warehouse systems that run complex queries over large amounts of data.
Founder Eli Singer said in an email interview that the company differentiates itself by simplifying and improving the often complex process of storing data in Hadoop and then extracting it for analysis. Companies often place an analytics database next to Hadoop, but the weakness of that approach is the time-consuming task of scanning the data to be analyzed. The analytics run in batches rather than returning results in real time. By running queries natively in Hadoop, Jethro maintains, it can get better query performance.
JethroData faces its fair share of competition, Singer said. Its most direct competition comes from Hadapt, which also approaches the problem by organizing data stored in Hadoop like a database.
Cloudera is releasing Impala, which replaces MapReduce with a faster full-scan system based on Google Dremel, the successor to the search company’s pioneering work in big data analytics. MapR has announced plans to support Apache Drill, also based on Google Dremel. Last week, Hortonworks announced Tez, its own version of the technology. Citus Data has its own analytics database based on Google Dremel. Its innovation is using parallel computing in the PostgreSQL core to run its queries.
Singer said some companies are betting on HBase, the only database that is currently available on Hadoop. Drawn to Scale and Splice Machine are among them. Salesforce.com has a new open source project called Phoenix that offers SQL over HBase.
Analytic database and data warehouse companies that compete with JethroData include HP Vertica, EMC Greenplum, IBM Netezza, Teradata Aster and InfoBright.
JethroData has had one customer doing alpha testing. The product will go into beta next quarter and be available to more customers. The company has eight employees and expects to have 25 by the end of the year.
The analytics database market represents the next frontier of data analytics. JethroData addresses the Achilles heel of Hadoop: the process of extracting data for results. The challenge will be setting itself apart from the growing list of competitors in this fast-emerging market.
JethroData is an index-based SQL engine for Hadoop. It lets you run interactive ad-hoc queries, live dashboards and reports 1,000x faster. With Jethro, you get the scalability of Hadoop and the performance of an analytical database in one system. Jethro works by automatically indexing data as it is written into Hadoop. Queries use indexes to access only the data they need instead of performing a full scan of the entire dataset, leading to an increase in speed...
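To make the difference between an index lookup and a full scan concrete, here is a minimal Python sketch. It is not JethroData's implementation, which the article does not describe in detail. The sample rows and the function names `build_index`, `full_scan` and `index_lookup` are made up for illustration, and the data lives in memory rather than in Hadoop.

```python
from collections import defaultdict

def build_index(rows, column):
    """Map each distinct value in `column` to the positions of the rows holding it."""
    index = defaultdict(list)
    for pos, row in enumerate(rows):
        index[row[column]].append(pos)
    return index

def full_scan(rows, column, value):
    """Answer the query by examining every row in the dataset."""
    return [row for row in rows if row[column] == value]

def index_lookup(rows, index, value):
    """Answer the same query by reading only the rows the index points to."""
    return [rows[pos] for pos in index.get(value, [])]

# Hypothetical sample data, for illustration only.
rows = [
    {"user": "a", "country": "IL", "amount": 10},
    {"user": "b", "country": "US", "amount": 25},
    {"user": "c", "country": "IL", "amount": 40},
    {"user": "d", "country": "DE", "amount": 5},
]

# The index is built once, as the data is written.
country_index = build_index(rows, "country")

scanned = full_scan(rows, "country", "IL")            # touches all 4 rows
looked_up = index_lookup(rows, country_index, "IL")   # touches only 2 rows
```

Both calls return the same rows. The full scan does work proportional to the size of the whole dataset, while the lookup does work proportional to the number of matching rows, at the cost of building and storing the index up front.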