Import.io Launches "Data Factory" To Simplify Converting Web Sites Into An API

By allowing its users to turn any web page into an API with just a few clicks, Import.io exists to make it easy for developers to pull data from the web.

At Disrupt Europe this morning, the company launched a new service called “Data Factory” that should make it even easier.

To explain Data Factory, it helps to contrast it with how import.io’s service currently works:

To turn a web page into a developer-friendly API (in this case, meaning something developers can use to programmatically pull selected chunks of data from a page), import.io provides what is essentially a bespoke, sandboxed browser.

You load the browser, open the URL for the page you’re interested in turning into an API, then start selecting the specific elements of the page (like, say, each result on a search page) that you want to be able to extract. After you’ve picked a few elements, import.io starts to figure out which data you’re aiming for. Hit save, give it a name, and bam — you’ve got your API. This data can also be exported as HTML, CSV, or XLS.
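To make "selecting elements and getting structured data back" a bit more concrete, here is a minimal sketch in Python. It is not import.io's code or API: the sample page, the `result` class name, and the `extract_rows` and `rows_to_csv` helpers are all assumptions made up for illustration. It only shows the general idea of pulling each repeated element off a page as a row and exporting the rows as CSV.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical sample page: a tiny search results listing.
SAMPLE_HTML = """
<ul>
  <li class="result"><a href="/a">First result</a></li>
  <li class="result"><a href="/b">Second result</a></li>
  <li class="ad">Sponsored</li>
</ul>
"""


class ResultExtractor(HTMLParser):
    """Collects link text and href from every <li class="result">."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._in_result = False
        self._current = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "li" and attrs.get("class") == "result":
            self._in_result = True
        elif tag == "a" and self._in_result:
            # Start a new row when a link opens inside a result item.
            self._current = {"title": "", "url": attrs.get("href", "")}

    def handle_data(self, data):
        if self._current is not None:
            self._current["title"] += data.strip()

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            self.rows.append(self._current)
            self._current = None
        elif tag == "li":
            self._in_result = False


def extract_rows(html):
    """Return one dict per selected element on the page."""
    parser = ResultExtractor()
    parser.feed(html)
    return parser.rows


def rows_to_csv(rows):
    """Serialize the extracted rows as CSV text."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["title", "url"])
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    print(rows_to_csv(extract_rows(SAMPLE_HTML)))
```

In this sketch the selection rule is hard-coded; the point of a tool like the one described above is that the rule is inferred from a few clicks rather than written by hand.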

With their new Data Factory, import.io is mostly gettin’ rid of the need for that standalone browser, and doing away with much of the clicking. While they will continue to offer the standalone browser option, they’re also launching a Chrome extension that adds an import.io button to your browser. The Data Factory button works in two ways:

If you click the button while on a URL that they already recognize, import.io immediately provides an API for that page and its data.
If you click it on a page it doesn’t recognize, it snaps a screenshot of the page. You visually highlight the elements of interest and send it off to import.io. Someone on their end (at one of their “factories” in London or India) will quickly prep the API and send it your way. From that point on, the page has a ready-to-go API that can be served immediately to future users.
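The two paths behind the button can be sketched as a simple lookup with a fallback queue. This is a toy model under stated assumptions, not how import.io is actually built: the `DataFactory` class, its `click` and `publish` methods, and the idea of keying recognized pages by exact URL are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DataFactory:
    """Toy model of the button's two paths (all names are hypothetical)."""

    known_apis: dict = field(default_factory=dict)  # url -> API identifier
    pending: list = field(default_factory=list)     # requests awaiting a human

    def click(self, url, highlighted=None):
        # Path 1: the page is already recognized, so serve its API at once.
        if url in self.known_apis:
            return ("ready", self.known_apis[url])
        # Path 2: unknown page, so queue the highlighted elements for a
        # person at a "factory" to turn into an API.
        self.pending.append({"url": url, "highlighted": highlighted or []})
        return ("queued", None)

    def publish(self, url, api_id):
        # Called once the API has been prepared by hand; later clicks on
        # the same URL take path 1.
        self.known_apis[url] = api_id
        self.pending = [r for r in self.pending if r["url"] != url]


if __name__ == "__main__":
    factory = DataFactory()
    page = "http://example.com/search"
    print(factory.click(page, ["result title", "result link"]))
    factory.publish(page, "api-1")
    print(factory.click(page))
```

The detail worth noticing is that the manual work happens once per page: after `publish`, every later request for that page is answered from the registry.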

Curiously, everything that import.io offers is currently completely free. In time, they plan to offer premium services on top of the APIs (like, say, API usage analytics), and are considering introducing volume limits on free accounts eventually.

Import.io says they have around 8,000 users to date, with those users having created around 15,000 APIs. With Data Factory, they’ll be able to immediately return APIs (with no user-driven training or further clicking required) for ~1,000 different sites — by the end of 2013, they aim to bump that number up to 10,000.

If hearing “import.io” and “Disrupt” in the same sentence seems familiar, you’re not crazy. This isn’t import.io’s first time at the conference, but it is their first time on the main stage. We first spotted them showing off their wares in our Startup Alley back at Disrupt SF in September, but at this week’s Disrupt Europe in Berlin, the company has fought their way all the way into the Battlefield competition.

Judges Q&A

It sounds like you guys are onto something. Who are your ideal customers?

Our customers range from an owner/operator of a yoga studio, all the way up to a national bank. It’s a broad market, and we think it’s an addressable market.

So you’re scraping as a service. You can’t write back into the servers. It’s a read-only API, right?

Anything you can do in a browser, you can program using the service. You can record actions and play them back, allowing you to POST anything like you could do in a browser.

Why hasn’t this been done before? If it has, why wasn’t it successful?

There have been tools that allow you to turn a single page into a static file. They’ve been largely technical. The difference is that we’re very simple, and it’s been designed from the ground up to work with many websites at once.

Some organizations, unlike most of us, don’t believe that data should be free. How do you avoid being taken to court?

We act as a pipe. We don’t store the data. If someone is misappropriating the data, [that data’s] TOS apply to the end user of the API.