How Do Top Android Developers QA Test Their Apps?

A couple of weeks ago I ran this post showing how one Hong Kong developer, Animoca, tests its Android games. The company, which has had more than 70 million downloads, tests every one of its apps on about 400 different devices. The photo above is from its headquarters and is just a taste of all the Android phones and tablets it uses.

Needless to say, that post pissed Android supporters off. Some commenters said it intimidated would-be developers, who might get scared off by Android fragmentation and the perception that you have to support hundreds of devices, screen sizes, densities and OS versions.

So, I asked around to see how other mobile game developers do quality assurance testing for Android. This is what I got:

Red Robot Labs: (Backed by Benchmark Capital. Veteran founding team from EA, Playdom and Crowdstar. More than 3.5 million downloads. They currently have the #27 top-grossing game in the Google Play store.)

Red Robot uses about 12 devices in-house and has a quality assurance team of two people. They then use a U.K.-based company called Testology to get further coverage with 35 handsets.

“I applied a common sense filter,” says co-founder Pete Hawley, who hails from EA and has more than 15 years’ experience in the gaming industry. He goes by an 80/20 rule, trying to identify a small number of devices that will cover the largest number of users. They start with the basic data from Google that shows the overall distribution of Android versions, screen sizes and densities. Then they look at their analytics to find which devices are most widely used by their players. Finally, they’ll look at player requests and support tickets.

He says it’s good to be selective about which devices to support, especially with all sorts of lower-end handsets coming in from Asia.

“Saying no to players with small, poor, outdated phones or old OSs is important too,” he says. “Overall, I’d say the process of staying on top of all the handsets, carriers, OS’s and carriers wasn’t as hard as I expected. It’s not a great deal of work to keep the 80 percent well-covered.”
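As a rough illustration of that 80/20 filter, here is a minimal sketch. It is not Red Robot's actual tooling: the device names, the percentage shares and the `pick_devices` helper are all made up for the example. It simply walks down a list of devices from most to least used and stops once the running total reaches the coverage target.

```python
def pick_devices(usage_share, target=80):
    """Pick the most-used devices until their combined share reaches target.

    usage_share maps a device name to its share of players, in whole percent.
    """
    chosen = []
    covered = 0
    ranked = sorted(usage_share.items(), key=lambda kv: kv[1], reverse=True)
    for device, share in ranked:
        if covered >= target:
            break
        chosen.append(device)
        covered += share
    return chosen, covered


# Hypothetical analytics data: percent of players per device.
usage_share = {
    "phone_a": 30,
    "phone_b": 25,
    "tablet_a": 15,
    "phone_c": 12,
    "phone_d": 10,
    "phone_e": 8,
}

devices, coverage = pick_devices(usage_share)
print(devices, coverage)
```

With these invented numbers, four of the six devices are enough to pass the 80 percent mark. A real list would also fold in the player requests and support tickets Hawley mentions.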

Here’s a snapshot of how Red Robot’s device distribution looked last fall. (It’s a very fragmented pie!)

Pocket Gems:
(Backed by Sequoia Capital, Redpoint Ventures. More than 70 million downloads. Newer to Android, but they had two of the top 10 grossing iOS games for all of last year according to Apple’s iTunes Rewind. #35 top-grossing game in Google Play.)

So Pocket Gems’ QA testing is actually run by a former Air Force colonel(!) named Ray Vizzone. They use a little more than 40 devices evaluated in a matrix they explain in the video below. They make sure they include both tablets and phones, and both high-resolution and low-resolution devices. They also make sure to include all five major graphics processing unit (GPU) families: Adreno, PowerVR, Tegra, Mali and Vivante.
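To make the matrix idea concrete, here is a small sketch of a coverage check. It is an assumption about how such a check could look, not Pocket Gems' actual matrix (that is in their video). The `coverage_gaps` helper and the four-device pool are hypothetical; only the three axes and the GPU names come from the description above.

```python
# Values each axis of the device pool should cover, per the description above.
REQUIRED = {
    "form": {"phone", "tablet"},
    "resolution": {"high", "low"},
    "gpu": {"Adreno", "PowerVR", "Tegra", "Mali", "Vivante"},
}


def coverage_gaps(devices):
    """Return, per axis, the required values that no device in the pool has."""
    gaps = {}
    for axis, required in REQUIRED.items():
        seen = {device[axis] for device in devices}
        missing = required - seen
        if missing:
            gaps[axis] = missing
    return gaps


# Hypothetical test pool with placeholder device names.
pool = [
    {"name": "device_1", "form": "phone", "resolution": "high", "gpu": "Adreno"},
    {"name": "device_2", "form": "tablet", "resolution": "high", "gpu": "Tegra"},
    {"name": "device_3", "form": "phone", "resolution": "low", "gpu": "PowerVR"},
    {"name": "device_4", "form": "phone", "resolution": "low", "gpu": "Mali"},
]

gaps = coverage_gaps(pool)
print(gaps)
```

This made-up pool covers both form factors and both resolution classes but has no Vivante device, so the check reports that one gap.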

Their QA process is designed to be hyper-speedy because the gaming industry has changed in some fundamental ways over the last few years. As Zynga has shown in social gaming, today’s mobile games are more like services than finished products you pick up off the shelf. So they require constant updates with fresh content every few days.

For the San Francisco-based startup, quality assurance testing is a 24-7 process that involves teams both in the U.S. and abroad. After the U.S. team designs and performs tests during the day, they hand their work to an offshore team that has the exact same 40 or so Android devices. This team does extra compatibility testing overnight and files all of the bugs into a defect tracking system, and those reports go back to the U.S. team in the morning.

Pocket Gems tests all features in three phases: 1) new feature testing, 2) integration testing and 3) release candidate testing. Even as developers design new features for their games, Pocket Gems’ QA teams are already at work designing tests for them so they can be checked the moment they’re ready. Once those features are stabilized, they’re integrated into the games and tested a second time.

“As the bugs are found and fixed during integration testing, the product managers and test leads begin their risk assessment as to when to freeze the code base in preparation for shipping,” co-founder Harlan Crystal explains. “Once this decision is made, a full regression test pass is started.”

That final pass involves a full suite of tests that examine memory, performance and device compatibility. “If we don’t find any new or critical bugs during this RC test pass, we bless the bits and ship it!” he says.

Storm8: (More than 300 million downloads. Totally bootstrapped. Four games in Android’s top-grossing 50. Founders are early Facebook alums.)

Storm8 uses between 30 and 50 devices, which they divide into groups of high-end, mid-range and low-end devices. They intentionally buy devices for each category. After they launch games, they have the apps send different KPIs (key performance indicators) back to the company’s servers.

“This way, we can tell if we need to further fine-tune a certain class of devices, or even specific devices, to squeeze the last bit of performance from the devices,” says chief executive Perry Tam.
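Here is a minimal sketch of what that kind of per-class analysis could look like on the server side. Storm8 doesn't say which KPIs it collects or how it processes them, so everything here is assumed for illustration: the frame-rate metric, the `average_by_tier` helper, the 30 fps threshold and the sample reports.

```python
from collections import defaultdict
from statistics import mean


def average_by_tier(reports):
    """Group reported frame rates by device tier and average each group."""
    by_tier = defaultdict(list)
    for report in reports:
        by_tier[report["tier"]].append(report["fps"])
    return {tier: mean(values) for tier, values in by_tier.items()}


# Hypothetical KPI reports sent back by the app.
reports = [
    {"device": "model_a", "tier": "high", "fps": 58},
    {"device": "model_b", "tier": "high", "fps": 60},
    {"device": "model_c", "tier": "mid", "fps": 45},
    {"device": "model_d", "tier": "mid", "fps": 41},
    {"device": "model_e", "tier": "low", "fps": 22},
    {"device": "model_f", "tier": "low", "fps": 26},
]

MIN_FPS = 30  # assumed threshold below which a tier needs tuning

averages = average_by_tier(reports)
needs_tuning = sorted(tier for tier, fps in averages.items() if fps < MIN_FPS)
print(averages, needs_tuning)
```

In this invented data set only the low-end tier falls under the threshold. Grouping by device model instead of tier would give the "specific devices" view Tam describes.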

Animoca: (More than 70 million downloads. Backed by IDG-Accel and Intel Capital.)

After the original post ran, Animoca ran a longer piece explaining why it does quality assurance testing with so many devices. The main reason is that the company has a huge user base in mainland China and other parts of Asia, where there is a plethora of lower-end and non-compatible Android devices (meaning phones that are based on the OS but aren’t certified to run Google applications or the official Android app store).

“If we had taken the approach that 90 percent compatibility is good enough, we’d be lacking support for 7 million of [our] downloads,” the company explains. “Several millions of consumers would have had a bad experience as a result of our decision, and our app revenues would probably be short by around 10 percent.”

Keep in mind that Animoca is not exactly a young company. It’s a mobile gaming-centric arm of a more than 10-year-old company called Outblaze that has focused on digital media and apps for years. So they have lots of experience in doing compatibility and quality assurance testing.

The company’s chief executive Yat Siu feels that their comprehensiveness is a part of why they perform decently on the platform, with “double-digit” millions of dollars in revenue per year from Android. Animoca doesn’t have any games in the top-grossing 50 right now in the U.S., but they make up for it with high rankings in Asian markets and in the sheer number of apps they publish per year.

Conclusion: If this still freaks you out, just remember that it was way worse in the days of feature phones. (At least, that’s what Rovio’s Peter Vesterbacka tells us. Rovio says compared to the J2ME/Brew era, Android is actually easy! They had to make more than 50 games before they created uber-hit Angry Birds.)

Just as a reminder of how hard it was then, here are two slides from JAMDAT’s original IPO slide deck in 2005. JAMDAT’s purchase by Electronic Arts for $680 million was the seminal mobile gaming acquisition of the feature phone era. The company had to spend five years building relationships with more than 90 carriers in about 40 countries, and it was standard to support about 400 devices.

So while Android fragmentation seems like a headache, your dad’s mobile app maker was trudging seven miles uphill in the snow QA testing with 400 different phones and dealing with business development people from a hundred carriers.

It’s also easier now that specialty shops like Testology, which Red Robot uses, and uTest handle mobile QA testing. That said, the very biggest developers still want to do most everything in-house.