The Kaggle data science community is competing to improve airport security with AI

Going through airport security is a universally painful experience. And despite screening that is slow and invasive, the TSA doesn’t have a great record at catching threats. With the help of the Kaggle data science community, the Department of Homeland Security (DHS) is hosting an online competition to build machine learning-powered tools that can augment agents, ideally making the entire system simultaneously more accurate and efficient.

Kaggle, acquired by Google earlier this year, regularly hosts online competitions where data scientists compete for money by developing novel approaches to complex machine learning problems. Today’s competition to improve threat recognition algorithms will be Kaggle’s third launch this year featuring more than a million dollars in prize money.

With a top prize of $500,000 and a total of $1.5 million at stake, competitors will have to accurately predict the location of threat objects on the body. The TSA is making its data set of images available to competitors so they can train on images of people carrying weapons. Importantly, these will be staged images created by the TSA rather than real-world examples — a necessary move to ensure privacy.

“The outcome of the competition will be a good indicator for how well we can expect such systems to work,” Reza Zadeh, founder and CEO of computer vision startup Matroid, told me. “At the very least, we should have such a system augmenting current security guards to ensure they don’t miss dangerous items.”

Of course, the problem the TSA faces isn’t just a machine learning issue. Expensive physical machines are complicated to upgrade, and none feature the kinds of sophisticated GPUs found in modern data centers. Thankfully, Google, Facebook and others are heavily investing in lighter versions of machine learning frameworks, optimized to run locally, at the edge (without internet).

This means some submissions to this competition could wind up in use on actual scanning machines; it’s just a matter of training beforehand and optimizing for the constrained conditions. The DHS has promised to work closely with the winners to explore potential real-world applications.

“This is a really hard problem, machines do not have crazy GPUs,” Anthony Goldbloom, Kaggle’s creator, told me in an interview. “But one thing that gets lost is that doing inference doesn’t necessarily need such heavy compute.”

Another concern that Kaggle and the TSA had to account for was the risk of bias influencing the automated threat detection process, a potential nightmare for travelers who could be inappropriately singled out based on arbitrary factors. To mitigate this, the TSA put special effort into creating the data set of images that will ultimately be used to train the detectors.

“The TSA did a nice job in setting this up,” Goldbloom emphasized. “They recruited volunteers but made sure that they had a decent amount of diversity so models don’t fail on a certain type of person.”

Google plans to make Google Cloud Platform (GCP) available to competitors in the near future. And though Google owns Kaggle, it is thankfully not forcing people to use TensorFlow, its own open-source framework. You can check out additional details here; the competition will draw to a close in December.