Amazon Web Services suffered a major outage this morning, affecting the thousands of Websites that rely on its storage (S3) and cloud computing (EC2) services. The outage started at around 4:30 AM PT, and reports soon started coming in across the Web, email, and Twitter. Startups including Twitter, SmugMug, 37Signals, and AdaptiveBlue, for instance, use Amazon’s S3 storage service to store data for their Websites (Twitter only uses S3 for file hosting, not its main messaging application). The major difficulties seem to have been fixed, but some issues persist.
This could just be growing pains for Amazon Web Services, as more startups and other companies come to rely on it for their Web-scale computing infrastructure. But even if the outage only lasted a couple of hours, it is unacceptable. Nobody is going to trust their business to cloud computing unless it is more reliable than the data-center computing that is the current norm. So many Websites now rely on Amazon’s S3 storage service and, increasingly, on its EC2 compute cloud as well, that an outage takes down a lot of sites, or at least some of their functionality. Cloud computing needs to be 99.999 percent reliable if Amazon and others want it to become more widely adopted.
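To put the 99.999 percent figure in perspective, here is a rough back-of-the-envelope sketch. It assumes a 365-day year and, purely for illustration, treats a single two-hour outage as the only downtime in that year; neither assumption comes from Amazon, and the function names are made up for this example.

```python
# Back-of-the-envelope availability arithmetic (illustrative only).
HOURS_PER_YEAR = 365 * 24  # assumption: a non-leap year


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Minutes of downtime per year permitted at a given availability."""
    return HOURS_PER_YEAR * 60 * (1 - availability_percent / 100)


def availability_percent(downtime_hours: float) -> float:
    """Yearly availability if total downtime is `downtime_hours`."""
    return 100 * (1 - downtime_hours / HOURS_PER_YEAR)


# Downtime budget implied by "five nines".
five_nines_budget = allowed_downtime_minutes(99.999)

# Assumption: one 2-hour outage and no other downtime all year.
two_hour_outage = availability_percent(2.0)

print(f"99.999% allows about {five_nines_budget:.2f} minutes of downtime per year")
print(f"A single 2-hour outage caps the year at about {two_hour_outage:.3f}%")
```

Under those assumptions, five nines leaves a budget of a little over five minutes a year, while one two-hour outage alone puts the year at roughly 99.977 percent.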
Update: A response from Amazon PR:
For one of our services, the Amazon Simple Storage Service, one of our three geographic locations was unreachable for approximately two hours and was back to operating at over 99% of normal performance before 7 a.m. pst. We’ve been operating this service for two years and we’re proud of our uptime track record. Any amount of downtime is unacceptable and we won’t be satisfied until it’s perfect. We’ve been communicating with our customers all morning via our support forums and will be providing additional information as soon as we have it.