What started as a small issue affecting some instances of Amazon’s Elastic Compute Cloud (EC2) in Northern Virginia became a full-blown AWS outage in that region. Major services, such as Reddit, Foursquare, Minecraft and Heroku, are down. GitHub, imgur, Pocket, HipChat, Coursera and others are affected.
Reddit experienced hiccups at first when only the Elastic Block Store (EBS) was affected, but then seemed to go back to normal for most people. It is now down again.
Many other services are down as the issue spreads to other parts of Amazon Web Services. Amazon’s RDS database instances and Elastic Beanstalk are down in Northern Virginia. Minecraft, Pinterest, Foursquare and Airbnb are down. Finally, AWS’s ElastiCache and CloudWatch are experiencing delays and connectivity issues.
Heroku currently reports elevated error rates for both the production service and the development service. Many smaller services are affected by the outage.
As always, users and developers are wondering why half of the Internet goes down every time an AWS data center experiences issues.
On the AWS status page, Amazon first reported issues affecting only EBS volumes:
10:38 AM PDT We are currently investigating degraded performance for a small number of EBS volumes in a single Availability Zone in the US-EAST-1 Region.
11:11 AM PDT We can confirm degraded performance for a small number of EBS volumes in a single Availability Zone in the US-EAST-1 Region. Instances using affected EBS volumes will also experience degraded performance.
11:26 AM PDT We are currently experiencing degraded performance for EBS volumes in a single Availability Zone in the US-EAST-1 Region. New launches for EBS backed instances are failing and instances using affected EBS volumes will experience degraded performance.
Then the Relational Database Service went down as well:
11:03 AM PDT We are currently experiencing connectivity issues and degraded performance for a small number of RDS DB Instances in a single Availability Zone in the US-EAST-1 Region.
11:45 AM PDT A number of Amazon RDS DB Instances in a single Availability Zone in the US-EAST-1 Region are experiencing connectivity issues or degraded performance. New instance create requests in the affected Availability Zone are experiencing elevated latencies. We are investigating the root cause.
Elastic Beanstalk experienced similar issues:
11:06 AM PDT We are currently experiencing elevated API failures and delays launching, updating and deleting Elastic Beanstalk environments in the US-East-1 Region.
11:45 AM PDT We are continuing to see delays launching, updating and deleting Elastic Beanstalk environments in the US-East-1 Region.
Update: Operations are now slowly recovering, and everything should return to normal.
"The site is down right now. It appears to be a network-related issue. We are investigating." (reddit status, @redditstatus, October 22, 2012)
"Hi Pinners, we are currently experiencing site issues and working hard to resolve this as soon as possible. Thanks for your patience!" (Pinterest, @Pinterest, October 22, 2012)
"You may have noticed that some of your favorite sites are down, including Foursquare. We're hoping things will be back to normal soon!" (Foursquare Support, @4sqSupport, October 22, 2012)
"Apologies. Our site is having a case of the Mondays... We'll Airbrb as soon as possible." (Airbnb, @Airbnb, October 22, 2012)
"The Heroku API has been transitioned from maintenance to read-only mode. App unidling has been restored. via @herokustatus" (Heroku, @heroku, October 22, 2012)