RapidMiner Adds Streams To Bring Real Time To Its Big Data Processing

RapidMiner announced its new Streams service today, which lets customers capture streams of data and process them in real time. RapidMiner president Michele Chambers explained that this could be particularly useful for capturing and processing Internet of Things or industrial sensor data and getting back answers in seconds.

The new Streams product pulls the data from your source and handles data blending, streaming data analysis, and model scoring, processing all of this in Apache Storm clusters without requiring the user to write code. Chambers tells me RapidMiner does all the coding on the back end based on user choices in the graphical front end. She claims there is little or no overhead in this process and that the company is able to process the requests and the data in near real time (around 5 seconds by her definition).
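To make those three stages concrete, here is a minimal sketch in plain Python of what blending, streaming analysis, and model scoring can look like on a per-event basis. It is an illustration only: it does not use RapidMiner's or Apache Storm's APIs, and every name in it (DEVICE_INFO, blend, rolling_mean, score, process), along with the field names, weights, and threshold, is a hypothetical stand-in.

```python
from collections import deque
from statistics import mean

# Hypothetical static reference data used for blending (not from RapidMiner).
DEVICE_INFO = {"pump-1": {"site": "north"}, "pump-2": {"site": "south"}}


def blend(event, reference):
    """Data blending: join a raw event with static reference data."""
    return {**event, **reference.get(event["device"], {})}


def rolling_mean(window, value):
    """Streaming analysis: add the value and return the mean of the window."""
    window.append(value)
    return mean(window)


def score(avg_vibration, humidity):
    """Model scoring: a stand-in linear rule with made-up weights."""
    return 0.6 * avg_vibration + 0.4 * humidity


def process(events, window_size=3, threshold=0.5):
    """Run each event through blend -> analyze -> score as it arrives."""
    windows = {}  # one bounded window of recent readings per device
    for event in events:
        record = blend(event, DEVICE_INFO)
        window = windows.setdefault(record["device"], deque(maxlen=window_size))
        avg = rolling_mean(window, record["vibration"])
        risk = score(avg, record["humidity"])
        yield {**record, "avg_vibration": avg, "risk": risk, "alert": risk > threshold}


if __name__ == "__main__":
    sample = [
        {"device": "pump-1", "vibration": 0.2, "humidity": 0.5},
        {"device": "pump-1", "vibration": 0.8, "humidity": 0.9},
    ]
    for result in process(sample):
        print(result)
```

Because process is a generator, each event is scored as soon as it is read rather than after the whole batch has been collected. In a Storm deployment, stages like these would typically be spread across a cluster instead of running in one loop.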

Chambers also told me that the product was in incubation for about a year, partly because it needed to wait for the underlying open source Apache Storm technology to stabilize enough for use in a commercial product. During the testing period, RapidMiner worked with an unnamed media company that was tracking information about viewer habits through set-top boxes and web streaming behavior. Based on this data, RapidMiner helped the company recommend what individual viewers should watch and target advertising based on their viewing behavior.

In another beta installation, a concrete company captured data from sensors on its processing equipment. Chambers explained that a typical concrete operation runs at only about 75 percent of capacity because the equipment is highly sensitive and can break down rapidly from overuse. By capturing data from the sensors, though, the company was able to determine that the two biggest factors in breakage were vibration and moisture (humidity). When it controlled for those factors, it was able to increase capacity to 95 percent without taxing or breaking the equipment.

It is precisely these types of scenarios RapidMiner hopes to exploit with the Streams product. Chambers hinted that updates would be coming soon, but said the company was waiting on another piece of open source software to be sufficiently baked before releasing them, likely in the first quarter of next year.

For now, Chambers says the product fits nicely in the product family and gives customers another big-data processing option.

She also announced the company was releasing new connectors for Qlik for data visualization, Apache Solr for search and Mozenda for web scraping.