Hacker Scrapes Thousands Of Public Phone Numbers Using Facebook Graph Search

A hacker has exploited Facebook’s Graph Search to compile a database matching thousands of phone numbers to Facebook users. Both parties agree that all the information was left public by users (even if the users themselves may still not realize it). But Facebook sent him a cease-and-desist letter after he continued to scrape data and argued to Facebook that the availability of the information invades users’ privacy.

Brandon Copley, a mobile developer in Dallas, Texas, searched and downloaded 2.5 million entries of phone numbers from the social network. He says many of these entries are empty, as they either aren’t active numbers or aren’t connected to a Facebook user with public settings; however, he notes that thousands of entries do match a phone number with the name of a Facebook user.

A Facebook representative tells me that this is a feature of Graph Search and that these users have their contact information set to “public.”

“Your privacy settings govern who can find you with search using the contact info you have provided, such as your email address and phone number,” the Facebook representative says. “You can modify these settings at any time from the Privacy Settings page.”

Copley confirms that these users have their contact information set to public, but argues that this is still a security issue.

“Facebook is denying its users the right to privacy by allowing our phone numbers to be publicly searchable as the default setting,” Copley tells me. “This means that anyone with my number knows my Facebook contact information. I may have not told my future employer about my Facebook account, but if I called them on my cell phone they can now know how to find me on Facebook.”

On Friday afternoon, Facebook admitted to a major security flaw in its Download Your Information tool that exposed the email addresses and phone numbers of 6 million users. While the two are similar in nature, Facebook says that flaw is unrelated to Copley’s hack, which the company says is not a security flaw.

On March 5, Copley reported a tip to Facebook security, writing, “There is a security invulnerability that allows someone to essentially create a database of phone numbers and Facebook users.”

“Personally, I used this to catch a criminal–someone was selling stolen goods on Craigslist, and I had their number, and used this to find who that person was on Facebook and from there reported them to the police,” he continued in the March 5 email.

A member of Facebook’s security team wrote back, in an email Copley shared with us, “I agree with you personally. We do have anti-scraping protections (rate-limiting, bad ip blocks, etc) but it comes down to people controlling their privacy, we can make the privacy tools available and we can encourage them to use them but we could never just switch their privacy settings for them. So there is not much more we can do.”

Copley says Facebook told him the supposed security flaw was a feature of Graph Search.

“I then went on to gather the <2.5 million entry database at this point to show them how a ‘feature’ like this is a security flaw,” he tells me.

Copley says he used the access tokens from his developer account and the Facebook Search API to perform thousands of searches per day for phone numbers. When he began running up against the rate limit of his developer account, he found a way to use the API token of an app that isn’t rate-limited and performed millions of searches.

In March and early April, Copley’s Facebook account was banned several times.

“I have no idea why you are getting banned or what is going on. I don’t think there is any reason it would be happening because of the research you were doing here though,” the same member of Facebook’s security team wrote to Copley in early April.

On April 26, Facebook’s lawyers sent Copley a cease-and-desist letter, stating, “you are unlawfully acquiring Facebook user data. It appears that you are accessing Facebook through automated means and stealing Facebook access tokens in order to scrape data from Facebook’s site without permission.”

Facebook’s lawyers demanded that Copley send them: “1. Copies of all scripts or methods that you use or have used to scrape Facebook user data, along with a brief description of how each method works; 2. All information regarding the individuals that you shared or discussed Facebook token jacking scripts with, including any identifying information such as name, email address, forum screen names; 3. A complete description of what you shared with the individuals listed in No. 2; 4. Access to all Facebook user data that you scraped from the Facebook site; and 5. The names of individuals that you disclosed any portion of the scraped Facebook data to.”

Copley says that Facebook lawyers mentioned Andrew Auernheimer’s case in conversations with him.

In 2010, Auernheimer discovered a security flaw in AT&T’s iPad user database, which let him reveal contact information for 114,000 iPad 3G users. Auernheimer showed this to a writer at Gawker Media; he is now serving a 41-month jail sentence.

The obvious difference between these two instances is that Auernheimer revealed information that was supposed to be private but was publicly accessible; the information that Copley scraped is all set to public.

Copley also says he has been looking at other ways to search Facebook for phone numbers and now believes he has found an even faster way to connect Facebook users and phone numbers than through the search API.

Facebook wants to have it both ways. It creates interfaces that often encourage users to share more data publicly, which lets them do things like search for each other using only a phone number. But it also wants to retain a sense of privacy — and control over users — so it fights anyone else who tries to access the data it helps make public.

At this point, it is unclear if Facebook will pursue litigation against Copley. He appears to be determined to press Facebook on this privacy issue and show the world how widely accessible users’ public contact information is.