Firehose Provider DataSift Raises $42M Led By Insight Venture Partners For Global and Non-Social Ambitions

DataSift, a social data platform that provides brands and enterprises with access to content from the likes of Facebook, Twitter, Tumblr and dozens of other social networks, is today announcing a $42 million Series C round of funding. Rob Bailey, DataSift’s CEO, tells me that the company plans to use the new financing for a number of different purposes.

First up is international expansion, starting with Japan, Brazil, Turkey, South Africa, and Indonesia. DataSift also plans to add more data partners and to expand into what Bailey calls “non-social” data sources, which can include messaging and gaming services, enterprise collaboration platforms and more. DataSift has seen a “huge amount of inbound interest” for data from these sources, he says.

Bailey tells me that DataSift is not profitable today, but that this is intentional. “We could be if we wanted to but we’re playing the long game. We want to be a billion-dollar company and so we are investing for growth. Going public is our long term goal.” The company is understood to be generating in the region of $25 million annually, according to sources.

Insight Venture Partners, the VC firm that has backed Flipboard, Buddy Media and HootSuite, led the round, with existing investors Scale Venture Partners, Upfront Ventures, IA Ventures, Northgate Capital and Daher Capital also participating. As part of the round, Insight Venture Partners’ co-founder and MD Jeff Horing is joining DataSift’s board of directors.

DataSift has now raised just under $72 million.

Bailey and DataSift are not providing a post-funding valuation, but considering that another player in the data firehose game, Topsy, has just sold to Apple for reportedly over $200 million, and DataSift is “considerably” bigger in size, the number may be well north of that.

Since being founded in 2010 in the UK, DataSift has been riding a veritable social media tsunami. A swathe of popular (and free) services like Facebook, Twitter and Tumblr attract billions of users, who use the sites daily to post messages to each other and read what others have to say. That rush of consumers and their opinions is of huge interest to advertisers and others for obvious reasons, yet most of that data is unstructured and therefore hard to “read”. DataSift provides a way for those enterprises to make better use of the data from these social media platforms: each piece of data gets tagged with metadata, which can then be used in different applications to chart what people are talking about, gain insight on different trends, and so on.

DataSift says that its 1,000 corporate customers today cover 40 countries and include Bloomberg, Dow Jones, CBS Interactive and Dell, as well as social technology application innovators Marketwired, Dachis Group, Conversocial, SecondSync, HootSuite and Simply Measured.

The move to looking for new business in international markets makes sense for DataSift, Bailey says, because they are the markets “where we see the biggest amount of social activity, yet are the most underserved.”

Unsurprisingly, DataSift ate a little of its own dogfood when selecting which countries it would target first. “We did a lot of sophisticated analysis internally,” he says. “We looked at aggregated social and local networks and the size of the advertising and business intelligence markets in these countries.” And in a sense the infrastructure for growth is already in place: the company already provides detection on its platform for 150 different languages, and is built for scaling. “Right out of the gate, it will be an incredibly easy path for us to enter Brazil, for example,” he says.

I also asked him about Japan. There, a lot of the buzz has been around messaging platform Line, which is more of a private, direct service than the one-to-many nature of networks like Twitter and Tumblr. In these sorts of scenarios, it’s likely that messaging companies might tap DataSift for competitive intelligence of their own platform for their own commercial development, CTO Nick Halstead tells me. Still, the two would not comment on Line directly. “We have not announced a deal with Line yet,” Bailey said. “We cannot comment on deals that have not been announced but I think Line is one of the most important data sources in Japan, along with some other ones.” (My interpretation: watch this space.)

The move to work with messaging platforms that are not built on the one-to-many principle is indicative of a development at DataSift, says Bailey: “We are expanding outside of social.” The first area, he says, is news. The company already has a partnership with NewsCred, and he says that around nine of the world’s top 10 news organizations already use DataSift “as a part of how they identify breaking news and validate stories, optimising content and helping them publish for things like virality and pageviews.” (CBS Interactive is one news organization that has developed a company-wide platform that uses DataSift data.) Now he says that this is evolving to build bigger applications within news organizations. That makes sense when you consider that these news organizations themselves are on the hunt for new revenue streams to make up for the drop off from legacy platforms like print.

Another area will be to tap further into enterprise applications, many of which are built on social media premises and therefore provide a lot of unstructured data to tackle (70% unstructured data, according to one estimate from McKinsey). One example here is Yammer, which DataSift already uses as a data source for enterprise customers. Halstead explains it this way: “Let’s say you have 10,000 users on Yammer. What they talk about in there can flow into DataSift, where we use our processes to apply curation and context.” He says that DataSift has a waiting list of other enterprise social networks, and enterprises using them, “where conversations have no analysis done to them today.”

Although Topsy is not a direct competitor to DataSift — both are firehose data providers, but Topsy also veers into the area of search, which is thought to be of key interest to Apple — Bailey and Halstead were happy to hear about the sale to Apple. “We think it’s a fantastic validation of our market,” Bailey told me.

And on the subject of acquisitions, DataSift may use some of this latest round in that area itself. “We’ve had a number of startups come to us already,” he says. “But we’re very cautious on any kind of acquisition. Our core focus is building out our platform and constantly serving our customers.” He adds, however, that there are some opportunities out there where interesting technology to better shape big data has been created, but the companies have not managed to pick up enough customers for it, and therefore are running out of money. These kinds of companies, he says, “constantly reach out to us.”

What else is on the cards? Since DataSift is thinking a bit outside the social databox these days, I thought I’d ask about something else: Pinterest. To date, DataSift has focused on text-based data, but image-first Pinterest is an obvious area for the company to tap into as well. “We haven’t announced anything at this time but I’m a big fan of Pinterest,” Bailey says, but then moves the conversation to the bigger idea. “It’s not the individual data sources, but what you can do with them. Companies that work with social data are overwhelmed by it, and think they are wasting massive amounts of money. What we hear from companies is not that they want more data, but more precise data, more usable data.”

Photo: Flickr