Web scraping is legal, US appeals court reaffirms

Good news for archivists, academics, researchers and journalists: Scraping publicly accessible data is legal, according to a U.S. appeals court ruling.

The landmark ruling by the U.S. Ninth Circuit of Appeals is the latest in a long-running legal battle brought by LinkedIn aimed at stopping a rival company from web scraping personal information from users’ public profiles. The case reached the U.S. Supreme Court last year but was sent back to the Ninth Circuit for the original appeals court to re-review the case.

In its second ruling on Monday, the Ninth Circuit reaffirmed its original decision and found that scraping data that is publicly accessible on the internet is not a violation of the Computer Fraud and Abuse Act, or CFAA, which governs what constitutes computer hacking under U.S. law.

The Ninth Circuit’s decision is a major win for archivists, academics, researchers and journalists who use tools to mass collect, or scrape, information that is publicly accessible on the internet. Without a ruling in place, long-running projects to archive websites no longer online and using publicly accessible data for academic and research studies have been left in legal limbo.

But there have been egregious cases of web scraping that have sparked privacy and security concerns. Facial recognition startup Clearview AI claims to have scraped billions of social media profile photos, prompting several tech giants to file lawsuits against the startup. Several companies, including Facebook, Instagram, Parler, Venmo and Clubhouse have all had users’ data scraped over the years.

The case before the Ninth Circuit was originally brought by LinkedIn against Hiq Labs, a company that uses public data to analyze employee attrition. LinkedIn said Hiq’s mass web scraping of LinkedIn user profiles was against its terms of service, amounted to hacking and was therefore a violation of the CFAA. LinkedIn first lost the case against Hiq in 2019 after the Ninth Circuit found that the CFAA does not bar anyone from scraping data that’s publicly accessible.

On its second pass of the case, the Ninth Circuit said it relied on a Supreme Court decision last June, during which the U.S. top court took its first look at the decades-old CFAA. In its ruling, the Supreme Court narrowed what constitutes a violation of the CFAA as those who gain unauthorized access to a computer system — rather than a broader interpretation of exceeding existing authorization, which the court argued could have attached criminal penalties to “a breathtaking amount of commonplace computer activity.” Using a “gate-up, gate-down” analogy, the Supreme Court said that when a computer or website’s gates are up — and therefore information is publicly accessible — no authorization is required.

The Ninth Circuit, in referencing the Supreme Court’s “gate-up, gate-down” analogy, ruled that “the concept of ‘without authorization’ does not apply to public websites.”

“We’re disappointed in the court’s decision. This is a preliminary ruling and the case is far from over,” said LinkedIn spokesperson Greg Snapper in a statement. “We will continue to fight to protect our members’ ability to control the information they make available on LinkedIn. When your data is taken without permission and used in ways you haven’t agreed to, that’s not okay. On LinkedIn, our members trust us with their information, which is why we prohibit unauthorized scraping on our platform.”