Facebook is a data company. Today’s news about its new search features proves it. So did last week’s news about the company testing what it can charge people to send Mark Zuckerberg a message. And tomorrow at Open Compute Project’s Open Compute Summit, Facebook will again show why becoming the world’s largest data broker depends on the success of its massive data-center buildout.
The Open Compute Summit is Facebook’s creation, really. The company open-sourced its data center designs in 2011 and, together with others in the data center world, formed the Open Compute Project, an organization that promotes open architectures for data center design. All kinds of companies joined in the fun. Intel, Rackspace and Arista Networks will be among the dozens of enterprise companies showing up tomorrow at the Santa Clara Convention Center.
Facebook had to open up to be a data broker, and over the long term, I think that openness gives it an advantage over companies that try to invent the data center on their own.
Hacking and using open source is also part of the Facebook culture. Remember, this is Facebook, where developers are kings. And developers love open source. You see it in the way they build out new features. For example, the company has resources dedicated to Apache HBase, an open-source distributed database written in Java.
With the Open Compute Project, hardware gets a lot of emphasis. At the summit tomorrow, there will be a data center hardware hackathon, which is unusual. At TechCrunch Disrupt, we gave some developers access to Raspberry Pi computers to do hardware hacks. App hackathons are going mainstream. But a hardware hackathon for a data center is pretty new and promises to be a major geek-out fest. Here’s an example of the kind of hackathon project that the conference organizers say is a model for what we will see tomorrow:
Use low-power sensors for temperature information across a data center. Use the Zigbee wireless protocol and aggregate the heat data across the data center. This has the benefit of not requiring any additional wiring or interfaces.
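To make that project idea concrete, here is a minimal Python sketch of the aggregation step only. It assumes the temperature readings have already arrived over the wireless network and been decoded; it does not talk to any Zigbee radio. The names `Reading`, `aggregate_by_zone` and `hot_zones`, the zone labels and the 30-degree threshold are all hypothetical and are not taken from the organizers' description.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Reading:
    """One decoded temperature report (all field names are hypothetical)."""
    sensor_id: str   # assumed identifier for a low-power sensor
    zone: str        # assumed label for an area of the data center
    celsius: float   # reported temperature in degrees Celsius


def aggregate_by_zone(readings):
    """Group readings by zone and return {zone: (min, mean, max)}."""
    by_zone = defaultdict(list)
    for reading in readings:
        by_zone[reading.zone].append(reading.celsius)
    return {
        zone: (min(temps), mean(temps), max(temps))
        for zone, temps in by_zone.items()
    }


def hot_zones(summary, threshold_c):
    """Return the sorted zones whose peak temperature exceeds threshold_c."""
    return sorted(
        zone for zone, (_, _, peak) in summary.items() if peak > threshold_c
    )


if __name__ == "__main__":
    # Made-up sample data, standing in for frames received from sensors.
    sample = [
        Reading("s1", "row-a", 24.0),
        Reading("s2", "row-a", 26.0),
        Reading("s3", "row-b", 31.5),
    ]
    summary = aggregate_by_zone(sample)
    print(summary)
    print(hot_zones(summary, threshold_c=30.0))
```

A real project would replace the sample list with readings pulled from whatever gateway collects the sensor traffic, and would pick a threshold that matches the facility's own cooling targets.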
Ahhh, heating, cooling, sensors and protocols: This is the stuff of data-center geeks. This event is about hacking the data center, getting more out of it at ever-lower cost so Facebook, and really any company, can use the data to make money.
Facebook and the other companies at the Open Compute Summit know the data-center world needs new designs and new thinking. Otherwise, it gets too expensive and too slow for a data service to be viable.
Take photos, for example. Facebook users store 300 million photos per day. That’s what Facebook Vice President of Infrastructure Engineering Jay Parikh told the Structure Europe conference in Amsterdam last fall. Facebook itself never deletes any of those photos. Users can delete their own photos if they wish, but Facebook states in its contract that the company will store a user’s photos without ever deleting them. So how does it store them all? We may get an answer to that tomorrow.
The Zuckerberg email story is fun on the face of it but fascinating when considered in a different context. The data gets processed and served deep in the infrastructure. Algorithms sit on top, setting the pricing on the products coming out of the digital factory.
The main point is that Facebook, like Amazon Web Services, is treating the data center like a computer. It is programmable. It uses software to orchestrate the consumption and delivery of data up the software stack. The algorithms treat the data like it’s a special sauce that delivers the fine-grained services, such as determining how much to charge someone to contact Mark Zuckerberg.