Meta sues for scraping Facebook and Instagram data

Facebook’s parent company Meta has announced that it’s suing the U.S. subsidiary of a Chinese tech company, accusing it of offering data-scraping services for Facebook and Instagram.

The social networking giant also revealed that it’s suing an individual, who the company alleges set up automated Instagram accounts to scrape data from some 350,000 Instagram users.

Both cases have been filed in the U.S. District Court for the Northern District of California.

Facebook versus scrapers

While Meta and other internet companies are no strangers to fighting web-scraping, a practice that involves using automated tools to gather data en masse from websites, the timing of these latest cases is particularly notable. They come less than three months after a U.S. court reaffirmed an earlier ruling that web-scraping is legal, the culmination of a long-standing legal battle between Microsoft-owned LinkedIn and a data science company called hiQ Labs, which scraped personal information from LinkedIn to help its customers predict employee attrition.

While the outcome was celebrated by many across industries, including archivists, researchers and journalists who rely on scraping publicly available data, it also dealt a serious blow to those with legitimate privacy and security concerns about how people’s data can be harnessed without their permission. In this particular case, the court ruled that scraping publicly accessible information does not contravene the Computer Fraud and Abuse Act (CFAA), a cybersecurity law that governs computer hacking in the U.S.

Not to be deterred, Meta is now pursuing similar legal action against a company called Octopus Data, the U.S. offshoot of a “Chinese national high-tech enterprise.” The parent company’s website says that it’s called “Shenzhen Vision Information Technology Co.,” and it claims to have launched its core product in 2016.

In addition, Meta confirmed that it’s filing a suit against a Turkey-based individual going by the name of Ekrem Ateş, who allegedly published scraped Instagram data to their own websites, or so-called “clone sites.”

Rather than targeting the entities under the auspices of the CFAA, Meta is pursuing matters via the Digital Millennium Copyright Act (DMCA), which is more concerned with copyright and intellectual property (IP) infringements than hacking. In this regard, Meta’s court filing specifically points to Section 3 of its terms of service, which states:

You own the intellectual property rights (things such as copyright or trademarks) in any such content that you create and share on Facebook and other Meta Company Products you use. Nothing in these Terms takes away the rights you have to your own content. You are free to share your content with anyone else, wherever you want.

Elsewhere, Facebook’s terms also state that:

You will not collect users’ content or information, or otherwise access Facebook, using automated means (such as harvesting bots, robots, spiders or scrapers) without our prior permission.

According to Meta, Octopus charges its customers a fee to access a software product called Octoparse to launch scraping attacks, or customers can pay Octopus to scrape websites directly. For the software to work, customers must give it access to their accounts, which allows it to glean data that’s normally only available to logged-in users, including Facebook friends, email addresses, birth dates, phone numbers and Instagram followers, among other engagement data.

It’s also worth noting that Octoparse is not limited to Meta’s properties, either, with services offered across numerous sites including Twitter, YouTube, Amazon, LinkedIn and more.

“Our lawsuit alleges that Octopus has violated our Terms of Service and the Digital Millennium Copyright Act, by engaging in unauthorized and automated scraping and attempting to conceal their scraping and avoid being detected and blocked from Facebook and Instagram,” Jessica Romero, Meta’s director of platform enforcement and litigation, wrote in a blog post.

These latest cases come shortly after Meta emerged mostly victorious from another data-scraping case it filed some two years ago against an Israeli company called BrandTotal, which offered a browser extension that collected data from Facebook users. The judge in that case sided with Meta in its claim that BrandTotal breached the Facebook terms of use, and also issued a summary judgment that BrandTotal violated the CFAA or California’s CDAFA (Comprehensive Computer Data Access and Fraud Act) by accessing password-protected pages using fake user accounts.

Web-scraping is pretty much as old as the web itself, and it’s not something that will be going away any time soon. However, by targeting some of the worst offenders — both at a corporate and individual level — Meta wants to deter others from following suit.