We covered BackBlaze’s cloud-based backup system way back in 2008, when $5 for unlimited storage must have sounded like a Christmas present. Since then the business has matured somewhat, but the one thing they nailed early, and which has perhaps only become more important, is scaling the hardware. With cloud backup and media services firing on all cylinders, data storage space is more valuable than ever, and providing the terabytes and petabytes of space is increasingly important to emerging companies.
In 2009 BackBlaze was kind enough to let everyone in on the top-secret design specs for their “Storage Pod,” the custom server unit that they claimed made their prices possible (CEO Gleb Budman explains in this Ignite talk). And now they’re doing it again. Want to build your own ultra-low-cost storage solution for far less than the likes of Dell and HP? They’ve done the legwork and provided a part-by-part breakdown.
Many items from the previous build are no longer available, which makes things difficult both for them and for aspiring cloud lovers. The new parts will likely die out in two years or so as well, but if anything the primary cost (the drives) will go down and the capacity will go up, while the other parts (motherboard, RAM, etc.) will remain sufficient.
The basic layout is a micro-ATX server motherboard with the usual array of ports and slots. There are six SATA ports but they’re not used for the storage itself. Instead, BackBlaze put in three PCI Express SATA cards with four ports apiece, then connected three ports on each card to multipliers with five ports each: nine multipliers in all, which in the end add up to 45 hard drives in one case. There are two 760-watt PSUs, 4GB of RAM, and one 160GB drive for the system itself. I thought that last bit extravagant and said so, but BackBlaze’s Gleb Budman assured me that’s pretty much the cheapest drive you can get new right now (~$40). It all goes inside a fire engine red custom case made by ProtoCase, which is the single most expensive component at $350.
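If you want to sanity-check those numbers, here is a quick back-of-the-envelope sketch in Python. It uses only figures from this post (45 drives, five ports per multiplier, and the 135TB per-pod capacity). The function and constant names are mine, purely for illustration, and not anything BackBlaze publishes.

```python
# Back-of-the-envelope check of the pod layout.
# All figures come from the article; the names are illustrative assumptions.

DRIVES_PER_POD = 45        # hard drives in one case
PORTS_PER_MULTIPLIER = 5   # drives each port multiplier can attach
POD_CAPACITY_TB = 135      # total capacity quoted for one pod


def multipliers_needed(drives: int, ports_per_multiplier: int) -> int:
    """Smallest number of port multipliers that can attach `drives` drives."""
    # Ceiling division without importing math.
    return -(-drives // ports_per_multiplier)


def capacity_per_drive_tb(pod_capacity_tb: float, drives: int) -> float:
    """Per-drive capacity implied by the pod total, assuming identical drives."""
    return pod_capacity_tb / drives


if __name__ == "__main__":
    print(multipliers_needed(DRIVES_PER_POD, PORTS_PER_MULTIPLIER))
    print(capacity_per_drive_tb(POD_CAPACITY_TB, DRIVES_PER_POD))
```

In other words, 45 drives at five per multiplier means nine multipliers, and 135TB spread over 45 identical drives implies 3TB apiece.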
An unavoidable consequence of packing this much into one box is that a single unit failure takes more data with it, but that’s something that can be controlled by redundancy at a huge scale, and certainly there are algorithms and top-secret software that make avoiding data loss like that a snap. I guess if a SATA controller were to go rogue (or something, I don’t know), even 135TB of lost data is manageable if you’ve taken the correct precautions. And the far more likely failures here (the drives) don’t seem to be at any additional risk.
The benefit of having lower cost hardware is obvious, but I think there’s something more to it than saving money. Being the master of your domain counts for something: BackBlaze is in control of their hardware in a way many companies aren’t, and they aren’t beholden to, say, Amazon or Dell for support or maintenance. Plus every part is easily replaceable and they designed the system (it’s really quite straightforward, not to say easy) so they know it top to bottom. Maybe it’s that forward-thinking leanness that almost got them bought?
You can read far more technical details (such as file system changes and cluster stats) at BackBlaze’s blog. I build my own systems as well, and while I don’t have eight grand lying around to replicate their baby, I do like their style.