YC-Funded Data Marketplace Is An Amazon For Structured Information
Leena Rao
Mar 18, 2010

There has always been a vibrant ecosystem around financial data. Financial institutions, such as hedge funds and investment banks, pay thousands of dollars for quantitative tabular data (financial data in spreadsheets). The web now provides a mechanism to publish and distribute large amounts of data, but much of it is raw (meaning it’s not built into a spreadsheet format) and hard to find in a Google search. Finding the data and then putting it into a format that is easy to digest can be a laborious task. Y Combinator-backed Data Marketplace hopes to change this by providing a platform where financial professionals can request data sets, and data aggregators and consultants can then find and format the appropriate data.

Founded by two former analysts at investment banks, Data Marketplace is essentially a middleman that helps financial organizations find quality data on the web. Users submit requests to Data Marketplace, and the site sends those requests to its database of 200,000 data aggregators, programmers, and consultants who specialize in finding financial data and converting it into a readable format.

Providers then post data resources to Data Marketplace, provide descriptive metadata, and set a price. The stored metadata helps consumers find relevant data through traditional search engines and when browsing Data Marketplace. Providers can also post data without a request, and users can search for it on the site. For example, here’s a data set of a complete list of Wal-Mart Store Locations, which is priced at $30.

Prices for data range from $5 to several thousand dollars. Data Marketplace co-founder Matt Hodan tells me he spent $10,000 in one year on data at one of the financial organizations he worked for. Data Marketplace takes a 14% cut of each transaction on the site, paid by the provider. Data Marketplace handles all of the payment processing and allows users to directly purchase and download resources in an accessible format online.
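
To make the fee arithmetic concrete, here is a minimal sketch of how a 14% cut taken from the provider would split the $30 Wal-Mart listing. The function name `split_sale` and the half-up rounding to the nearest cent are assumptions made for illustration; the article does not say how Data Marketplace actually rounds or pays out.

```python
from decimal import Decimal, ROUND_HALF_UP

# The 14% cut reported in the article; everything else here is hypothetical.
COMMISSION_RATE = Decimal("0.14")


def split_sale(price):
    """Return (marketplace_fee, provider_payout) in dollars for one sale.

    Assumes the fee is rounded half-up to the nearest cent, which is an
    illustrative choice and not a documented Data Marketplace rule.
    """
    price = Decimal(str(price))
    fee = (price * COMMISSION_RATE).quantize(
        Decimal("0.01"), rounding=ROUND_HALF_UP
    )
    return fee, price - fee


# The $30 store-locations data set mentioned in the article.
fee, payout = split_sale(30)
print(f"fee=${fee}, payout=${payout}")  # fee=$4.20, payout=$25.80
```

Under these assumptions, the provider of the $30 list would keep $25.80 and the site would keep $4.20.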

Hodan says that current models for selling and distributing data online are inefficient and expensive for financial organizations. On Data Marketplace, users pay only for what they need, as opposed to subscribing to plans or buying bundles of information. And providers don’t have many other platforms where they can sell their data in a marketplace.

Data Marketplace is similar in some ways to Factual, which is a Wikipedia-like site for open data, and InfoChimps, which takes a more collaborative approach to open data.

  • http://www.listia.com Gee

    Cool! This looks really promising… good luck with it! Since data is so valuable, they have a business model right from the start unlike many other companies :-)

  • http://allantyoung.com Allan

    I wished Data Marketplace existed when I used to work in venture capital and private equity. It would have made my life a lot easier and my decisions a whole lot less stupid.

    The Data Marketplace team is outstanding too. Seriously. You don’t find a lot of IBanking and financial industry types who can also code a decent web app. It’s been fun watching them refine their idea at Y Combinator.

  • http://www.aggdata.com Chris Hathaway

    I am the CEO and founder of a company called AggData (www.aggdata.com). We also sell similar sets of data, and actually, the Complete List of Walmart locations mentioned in the article is an exact copy of the list we sell at http://www.aggdata.com/store_locations/walmart, and their selling of the list is a clear violation of our terms of use, found here: http://www.aggdata.com/terms_of_use. We have not been contacted by the company for permission or made any arrangements for them to sell data they purchased from us, so we find this move unethical and perhaps illegal. We have contacted Data Marketplace and asked them to remove the data, and I hope this issue will be solved very soon. The market to sell and buy data is large enough for many players to participate, without stepping on each other’s toes.

    Chris Hathaway
    CEO & Founder, AggData LLC

  • http://fabricly.com Ari

    I could see this making market research a lot less painful. Being able to request the data you need is great.

  • Jason

    The problem is that the data you buy as an ibank is guaranteed by the company selling it, their entire reputation is riding on providing the top quality data…that’s why it costs so much money.

    Here you have no idea whether the data you get is good or complete garbage.

    It’s not that good for sellers either…data sellers trust their customers, here nothing is stopping some random guy from buying your data set and then reselling it for half the price…or better yet giving it away for free on their blog

  • John

    Cool.

    “$10,000 in on year”

    should be

    “$10,000 in one year”

  • DDavis

    Ever read Snow Crash? This sounds like the start of the CIC Library. Walk around gathering data, tag it, upload it, hope someone buys it, get paid.

  • http://www.facebook.com/profile.php?id=627826 Jonathan Marcus

    Sounds like a great idea.

  • http://my.flashh.in mpchekuri

    Technology has developed to make ourselves more comfortable. In future it may also make it free to request data sets.

  • SarahW

    Hey! This is another good use for the iMacros web scraping software we use at work.

    Some people use iStockphoto to sell their images, and I can now use this website to sell our “used” datasets. I will give this a try ;)

  • http://shouldget.com david

    @john – yeah – there are a number of typos in this piece. strange.

    “An finding the data, and then putting the data ..”

  • Lion

    I wonder who is in this “database of 200,000 data aggregators, programmers, and consultants”

    no transparency = no credibility

  • HTL

    Wow. This is an amazing idea, and seems well executed so far. I like the buyer-driven concept, like a help wanted ad. Takes minimal effort from the data requester and it ensures that there’s a financial incentive for people to post data. Looks like that will help them get past the initial traction problem that every multi billion dollar marketplace must get past.

  • http://www.datamarketplace.com Steve DeWald

    Chris,

    Our list of Walmart store locations was collected by scraping walmart.com, and as far as I can gather is different than the list you’re selling. The resource was collected on a different date, has a different number of records and less complete information about each store. Regardless, it’s public information so I’m not sure it’s unethical for us to also sell it. I’d be happy to talk with you about it if you want to contact me directly (steve@datamarketplace.com).

    You guys have an interesting business. It’s funny, I expect we’ll be dealing with a lot of the same IP issues as you in trying to protect our customers’ data.

  • Max

    I saw the site (AggData) long ago. It seems that Data Marketplace is just a clone of AggData. It is a shame that TC doesn’t even mention AggData while reporting on a clone of AggData. (I just can’t believe that they even copied the same data (Walmart) from you guys.)

    I feel like TC gives a lot of press to YC companies regardless of quality. Is it because Michael is also an investor of most YC companies?

  • Max

    It seems to me that the only major difference between AggData and Data Marketplace is that Data Marketplace allows third parties to collect and sell data while the AggData team does it all itself. Other than that Data Marketplace is pretty much a clone.

    My advice to AggData:
    1) You guys have a head start. Allow third parties to collect and sell data. Share requests with them. Be a platform. Think Big.

    2) Build a system to evaluate providers to maintain data quality.

    My advice to Data Marketplace:
    It seems like you have a talented team and while your idea is very similar to AggData, you seem to have a bigger and clearer vision. Even though you are a latecomer, you may be able to implement it much better.

    Above all, you have Paul Graham and Michael Arrington and their connections backing you.

  • http://www.cathedralpartners.com/blog Peter

    Congrats to Matt Hodan and his colleague whom I haven’t yet met.

    There are two big issues for any new or fledgling marketplace: liquidity and quality of inventory.

    If you don’t have liquidity, you can’t drive the network effects. If you have liquidity (lots of data, in this case) but you don’t have quality inventory, then you still have a problem. No professional will pay to buy bad data, incomplete data, hard to format data, overly raw data.

    Solve for liquidity by bringing large anchor tenants into the marketplace as quickly as possible. Solve for quality by having strict controls over how data must be formatted and presented in the market. Bad data or crappy data should be flagged and removed promptly.

    Marketplaces, especially B2B marketplaces, don’t work without a quality-based framework governing them.

    Good luck to you!

  • http://www.aggdata.com Chris Hathaway

    Hey Steve,

    I have no problem with you guys collecting and selling public information, even if it is the same data as ours. It just seemed to be so much of a coincidence that you had a list collected only one day different than ours, with only one more location than ours (which could be just the header row), that we figured you simply purchased our list, took out a few of the columns, then re-sold it on your site. If this is not what you did, then I apologize for jumping to conclusions. I’ll send you an email directly so we can further discuss.

    Thanks,
    Chris

  • Yong

    I really think it’s a good start, but as you can see in the comments by aggdata, IPR, privacy, copyright, and verification/validation of the datasets need to be solved before the idea of data marketplace could really take off.

  • http://etacts.com Evan

    This company has a large market and a talented founding team. I look forward to watching them grow.

  • http://timetric.com/ Andrew Walkingshaw

    Good luck to DataMarketplace – it looks very interesting! We’re working on a related challenge: building services around all the interesting time series data out there. We’re going further by building tools you can use, though; it’s not quite the same market.

    Check out what we’re doing – http://timetric.com/, and our portfolio analysis tool at http://finance.timetric.com/portfolios/ for those of you into stocks and shares :)

  • tio

    Many startups are taking the “I am gonna build a platform…” route – because it is easy! a big idea, some php code, hook up with paypal and you are there! But just like the other commenter said, you need liquidity and quality, which are difficult to get, because you don’t own the “product”, you only have a “platform”…..how will you take care of copyright? can people buy and re-sell data?…

    wait… you are selling too – the “wal-mart location data” in “available data sets” is owned by you!! That is good, try to get the market started….but, of all the billions billions of data online you can scrape and collect from the web, you only have wal-mart location data? you must have done your research – store location data is popular based on AggData.com….not impressed by the effort.

  • PJ

    Obviously there is demand there – the challenge is on the supplier side.
    The world is full of data – the really valuable data is usually owned privately by labs, corporations, universities, exchanges etc. That data is hard to make available for sale, or the data owners have their own channels.
    How about publicly available data, especially online data? The problem is that it is low value and will be worth even less when sellers crowd in, since everyone has access to it (I can write my own program to scrape the web or just buy and re-sell; there is no way to say my wal-mart location data is YOURS).

  • http://www.aggdata.com Chris Hathaway

    Just a clarification… we never claim that the data is “ours” or that we somehow own it. However, when someone purchases a list from AggData, they have to agree to “Terms of Use”, which clearly states the customer may not simply purchase and resell our data. This would be a breach of contract, not theft or a copyright violation. We rely heavily on the fact the data is public and not owned, otherwise the companies themselves, like Walmart, could lay claim to the information.

  • scottk

    Not sure about the long term success…they will be competing with :
    - Factual (Free)
    - Google (Free)
    - Wolfram Alpha (Free)

    IMHO, unless there is a process/technology to add much higher-level synthesis on top of the data, rather than just the data themselves, these types of companies (AggData, DataMarketplace, DataMarket.Net) won’t aggregate enough demand and/or won’t create enough value to become companies of meaningful size.

  • MarketNinja

    Agree. Trying to get people to pay for someone else to aggregate & reformat publicly available data in spreadsheet (or even adding visualization on top) is NOT a big business.

    There is tons of public data and accessing them will become easier & easier as Google, Bing, and even Wolfram Alpha dial in their algorithms. I think Factual might have chance aggregating enough demand & engagement since it’s open and FREE. Hate to compete with too many technology titans (who got cash & offer them for free) to win big.

    Also, I would be very skeptical about quality signals also. How does DataMarketplace decide which of the 200K people are best suited to do the job? That’s a very tough task and if the most available and lowest bidder wins, you will likely get what you pay for. Not clear how they try to maintain or enforce quality of contributors.

    Finally, the incentive system doesn’t seem strong enough for really high-quality contributors to contribute. If I’ve done market research for 5~10 years and know that 1,000 others will look at a request paying $50 for data work that takes even 30min, I won’t bother….expected value is just too low.

  • Josh

    I’ve been trying to upload material to the site but have had technical problems for the last week. Is anybody else having this problem?
