Exposed GOP database demonstrates the risks of data-hungry political campaigns

Since November, the Trump campaign’s possibly brilliant, maybe-just-lucky data strategy has launched a thousand thinkpieces. Now we’ve got a more intimate look at that data than we were ever meant to have.

As discovered by Chris Vickery, a cyber risk analyst at UpGuard, and reported by Gizmodo, an analytics firm hired by the Republican National Committee left the data of 198 million U.S. voters sitting out in the open on a public server. The trove of more than a terabyte, owned by Deep Root Analytics, included personally identifiable information like birth dates, home addresses and phone numbers, as well as demographic info like ethnicity and religion.

UpGuard’s blog explains how the firm came across the unprotected data:

“In the early evening of June 12th, UpGuard Cyber Risk Analyst Chris Vickery discovered an open cloud repository while searching for misconfigured data sources on behalf of the Cyber Risk Team, a research unit of UpGuard devoted to finding, securing, and raising public awareness of such exposures. The data repository, an Amazon Web Services S3 bucket, lacked any protection against access. As such, anyone with an internet connection could have accessed the Republican data operation used to power Donald Trump’s presidential victory, simply by navigating to a six-character Amazon subdomain: ‘dra-dw’.”

In 2016, Deep Root earned more than $900,000 from the RNC for campaign-year data and analysis on potential voters. The unprotected Deep Root database also contained data from other organizations with RNC contracts, including Americans for Prosperity and the Data Trust, both well-funded conservative groups with massive data troves.

It is not fully clear whether anyone made off with the exposed data during the 12 days it sat out in the open, but Deep Root doesn’t seem to think so. In a security statement, the company admitted to its big data self-own:

“Deep Root Analytics has become aware that a number of files within our online storage system were accessed without our knowledge…

We are conducting an internal review and have retained cyber security firm Stroz Friedberg to conduct a thorough investigation. Through this process, which is currently underway, we have learned that access was gained through a recent change in access settings since June 1. We accept full responsibility, will continue with our investigation, and based on the information we have gathered thus far, we do not believe that our systems have been hacked.”

Deep Root’s open data stash notably included raw text scraped from Reddit, drawn from the now-banned subreddit r/fatpeoplehate (a forum popular with Trump’s r/The_Donald Reddit base), some Spanish-language subreddits and at least one about mountain biking. Where that data fit into the GOP’s strategy remains unclear, but it shows that social sites well beyond real-identity-obsessed Facebook have evolved into rich sources for political campaigns seeking to understand and predict voter behavior.

As valuable as this kind of dataset might be, Deep Root’s carelessness shows that when the race has come and gone, keeping all of that aggregate data safe must not be quite as lucrative as scooping it all up in the first place.