They outline two lines of defense in preparation for a big wave of Twitter usage around the Steve Jobs/Apple WWDC Keynote:
- “We’ve moved much of the load off our database by utilizing more memcache, employing more read-slave servers, and by fixing some bugs for improved efficiency” (until now they had just three database servers, possibly the ones shown in the image above), and
- “In the event that our estimates and preparations fail…We have isolated and created on/off switches for many Twitter features.”
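For readers wondering what those two defenses look like in practice, here is a rough Python sketch. It is not Twitter's code: the class names, the `follower_stats` feature name, and the in-memory dictionary standing in for memcache are all assumptions made for illustration. It shows a cache-aside read, where the database is only hit on a cache miss, alongside a simple on/off switch that drops an optional feature under load.

```python
from typing import Callable, Dict


class FeatureSwitches:
    """In-memory on/off switches. Feature names here are hypothetical."""

    def __init__(self, defaults: Dict[str, bool]) -> None:
        self._flags = dict(defaults)

    def is_on(self, name: str) -> bool:
        # Unknown features count as off, the safer default under heavy load.
        return self._flags.get(name, False)

    def turn_off(self, name: str) -> None:
        self._flags[name] = False

    def turn_on(self, name: str) -> None:
        self._flags[name] = True


class CachedReader:
    """Cache-aside reads: check the cache first, fall back to the database."""

    def __init__(self, load_from_db: Callable[[str], str]) -> None:
        self._load_from_db = load_from_db
        self._cache: Dict[str, str] = {}  # assumed stand-in for memcache
        self.db_reads = 0  # counts how often the database is actually hit

    def get(self, key: str) -> str:
        if key in self._cache:
            return self._cache[key]
        # Cache miss: read from the database and remember the result.
        self.db_reads += 1
        value = self._load_from_db(key)
        self._cache[key] = value
        return value


def render_profile(user: str, switches: FeatureSwitches, reader: CachedReader) -> str:
    """Build a page, skipping the optional feature when its switch is off."""
    page = reader.get(f"timeline:{user}")
    if switches.is_on("follower_stats"):
        page += " | " + reader.get(f"followers:{user}")
    return page
```

In this sketch, calling `switches.turn_off("follower_stats")` removes that feature's reads entirely, while repeat requests for the same page are served from the cache without touching the database.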
Experts I’ve spoken with say these are reasonable precautions to take, although they question why more slave servers weren’t set up in the past (“it takes ten minutes,” said one anonymous source). But as a Twitter user, I’m glad to see they’re preparing for the surge.
The blog post itself may be a mistake, though: if Twitter does go down, people will know that the team is unable to keep control even when they promise things will go right. I would have kept quiet about the changes and then written a postmortem if everything was smooth sailing.
The smartest thing Twitter could have done would be to hire former Chief Architect Blaine Cook back as a consultant to keep an eye on things for the day (he seems to be the only person who can actually keep his crazy architecture live). But from what we’ve heard, that hasn’t happened.
Above all, Twitter has to avoid going down. Once that happens, it will take them hours to get the service live again, and by then the keynote will be over. If they can keep load down to a point where nothing fails, they may win the day. Expect silence on our end if they do, and a merciless blog post if they fail.