2017-01-12

Talend, the big data integration vendor that went public last July, announced its winter release today with new tools to help automate data preparation, a sticky problem for enterprise customers.

The volume of data keeps growing, and companies struggle to keep up. There aren’t enough data scientists in the world to fill the need. Software has to pick up some of the slack, and companies like Talend hope to help customers process that data and eventually write applications that help those customers build more data-driven businesses.

Perhaps that explains why, when Talend went public last summer, its shares popped 54 percent out of the gate and closed at $25.50. The stock has stayed close to that level ever since, fluctuating as stocks do, and closed at $23.79 yesterday. Clearly, there is demand for these kinds of services, and companies are hungry for solutions to help them out of the data morass.

From its inception, Talend’s primary job has been to remove the need to hand-code Hadoop jobs. The company provides an interface to manage the data and generate clean, Hadoop-compatible code across a variety of distributions, whether that’s Cloudera, Hortonworks, MapR or Amazon EMR, Amazon’s Hadoop processing service.

In the latest version, the company wants to take that a step further, says Ray Christopher, Talend’s product marketing manager. The new release attempts to deal with a couple of problems that have plagued companies trying to process the growing data pile.

One of the issues in dealing with new data is ensuring that it is in a proper state for whatever application you wish to create. When data is coming from a variety of internal and external sources, that can be problematic. The data needs a common vocabulary, and the cleansing required to get there can be time-consuming, painstaking work.

As Christopher and others have said, data scientists often spend the vast majority of their time just getting the data ready. The new Talend Data Preparation app lets customers create a common dictionary and, using machine learning, helps automate the data cleansing process, with the goal of reducing the time it takes to get data into a ready state. While data is messy and sometimes defies automation techniques, this should provide a good head start if it works as described.

The other issue, related to getting data prepared, is verifying the quality of that data. The new Talend Stewardship app has been designed to “crowd source” data quality by bringing in line-of-business experts to help. The tool lets admins set up a data cleansing campaign in which users review data sets and either certify them or set them aside if they see a problem. A full audit trail tracks the activity for review, so checks and balances are built in.

The new tools will be available a week from today, and existing customers get a deal: two free licenses for each new tool, should they decide to upgrade.
