False news spreads faster than truth online thanks to human nature

Can't blame bots for this one

The rapidity with which falsity travels has been proverbial for centuries: “Falsehood flies, and the Truth comes limping after it,” wrote Swift in 1710. Yet empirical verification of this common wisdom has been scarce — to our chagrin these past few years as lies in seven-league boots outpace a hobbled truth on platforms seemingly bespoke for this lopsided race.

A comprehensive new study from MIT looks at a decade of tweets and finds not only that the truth is slower to spread, but that neither bots nor the natural network effects of social media are an excuse: we’re doing it to ourselves.

The study, published today in Science, looked at the trajectories of more than 100,000 news stories, independently verified or proven false, as they spread (or failed to) on Twitter. The conclusion, as summarized in the abstract: “Falsehood diffused farther, faster, deeper, and more broadly than the truth in all categories of information.”

Image: Bryce Durbin/TechCrunch

But read on before you blame Russia, non-chronological feeds, the election or any other easy out. The reason false news (a deliberate choice in nomenclature to keep it separate from the politically charged “fake news”) spreads so fast is a very human one.

“We have a very strong conclusion that the spread of falsity is outpacing the truth because human beings are more likely to retweet false than true news,” explained Sinan Aral, co-author of the paper.

“Obviously we didn’t get inside the heads of the people deciding to retweet or consume this information,” he cautioned. “We’re really just scratching the surface of this. There’s been very little empirical large scale evidence one way or the other about how false news spreads online, and we need a lot more of it.”

Still, the results are robust and fairly straightforward: people just seem to spread false news faster.

It’s an unsatisfying answer, in a way, because people aren’t an algorithm or pricing model we can update, or a news outlet we can ignore. There’s no clear solution, the authors agreed — but that’s no reason why we shouldn’t look for one.

A decade of tweets

The study, which co-author Soroush Vosoughi pointed out was underway well before the current furor about fake news, worked like this.

The researchers took millions of tweets from 2006 to 2017 and sorted through them, finding any that related to one of 126,000 news stories that had been evaluated by at least one of six fact-checking organizations: Snopes, PolitiFact, FactCheck.org, Truth or Fiction, Hoax Slayer and About.com.

They then looked at how those news stories were posted and retweeted using a series of measures, such as total tweets and retweets, time to reach a threshold of engagement, reach from the originating account and so on.

These patterns form “cascades” with different profiles: for instance, a fast-spreading rumor that’s quickly snuffed out would have high breadth but little depth, and low virality.
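
To make those measures concrete, here is a minimal Python sketch of how size, depth and maximum breadth can be computed for a single cascade. This is an illustration under our own simplifying assumptions, not the authors' pipeline: the cascade is reduced to parent-child retweet pairs, and the study's richer measures (structural virality, timing) are omitted.

```python
from collections import defaultdict

def cascade_metrics(edges, root):
    """Compute size, depth and max breadth for one retweet cascade.

    `edges` is a list of (parent_user, child_user) retweet pairs and
    `root` is the original poster -- a simplified stand-in for the
    richer cascade data the study derived from Twitter's archive.
    """
    children = defaultdict(list)
    for parent, child in edges:
        children[parent].append(child)

    # Breadth-first walk from the root, counting how many users
    # sit at each depth (retweet "hops" from the original tweet).
    per_depth = defaultdict(int)
    frontier, depth = [root], 0
    while frontier:
        per_depth[depth] = len(frontier)
        frontier = [c for node in frontier for c in children[node]]
        depth += 1

    size = sum(per_depth.values())          # total users reached
    max_depth = max(per_depth)              # longest retweet chain
    max_breadth = max(per_depth.values())   # widest single level
    return size, max_depth, max_breadth

# A shallow, wide cascade: high breadth, little depth.
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "e")]
print(cascade_metrics(edges, "a"))  # (5, 2, 3)
```
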

The team compared the qualities of cascades from false news stories and true ones, and found that, with very few exceptions, false ones reached more people, sooner, and spread further.

And we’re not talking a few percentage points here. Some key quotes:

  • Whereas the truth rarely diffused to more than 1000 people, the top 1% of false-news cascades routinely diffused to between 1000 and 100,000 people.
  • It took the truth about six times as long as falsehood to reach 1500 people.
  • Falsehood also diffused significantly more broadly and was retweeted by more unique users than the truth at every cascade depth.
  • False political news also diffused deeper more quickly and reached more than 20,000 people nearly three times faster than all other types of false news reached 10,000 people.

In every way that mattered, false reports moved faster and reached more people, usually by multiples or orders of magnitude.

Objection!

Before we go on to the reasons why and the researchers’ suggestions for remedies and future research, we should address some potential objections.

Maybe it’s just bots? Nope. The researchers ran bot-detection algorithms and carefully removed all obvious bots, studying their patterns separately, then testing the data with and without them present. The patterns remained. “We did find that bots do spread false news at a slightly higher rate than true news, but the results still stood. Bots don’t explain the difference,” said Vosoughi.
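
To see what that robustness check looks like in practice, here's a hedged Python sketch. The `bot_score` field and the 0.5 cutoff are stand-ins of our own invention; the study used an established bot-detection algorithm and its own criteria.

```python
def mean_cascade_size(cascades):
    return sum(c["size"] for c in cascades) / len(cascades)

def compare(cascades, bot_threshold=None):
    """Mean cascade size for false vs. true stories, optionally
    dropping cascades started by accounts that look like bots."""
    if bot_threshold is not None:
        cascades = [c for c in cascades if c["bot_score"] < bot_threshold]
    false_c = [c for c in cascades if not c["veracity"]]
    true_c = [c for c in cascades if c["veracity"]]
    return mean_cascade_size(false_c), mean_cascade_size(true_c)

cascades = [
    {"size": 900,  "veracity": False, "bot_score": 0.1},
    {"size": 1200, "veracity": False, "bot_score": 0.9},  # likely bot
    {"size": 150,  "veracity": True,  "bot_score": 0.1},
]
print(compare(cascades))                     # bots included
print(compare(cascades, bot_threshold=0.5))  # bots removed
```

The point of the check is the last two lines: if the false-versus-true gap survives the filter, as it did in the study, bots aren't driving it.
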

“Our results are contrary to some of the hype recently about how important bots are to the process,” Aral said. “Not to say they aren’t important, but our research shows they aren’t the main driver.”

Maybe the fact-checking sites are just biased? No fact checker can be completely without bias, but these sites agreed on the veracity of stories more than 95 percent of the time. A systematic bias across half a dozen sites obsessed with objectivity and documentation begins to verge on conspiracy theory. Not convinced?

“We were very conscious of the potential for selection bias from starting with the fact checking organizations,” Aral said. “So we created a second set of 13,000 stories that were fact checked independently — all new stories. We ran that data and found very similar results.”

Three MIT undergraduates independently verified the 13,000-story data set, agreeing on veracity more than 90 percent of the time.

Maybe false news spreaders just have large, established networks? Quite the contrary. As the paper reads:

One might suspect that structural elements of the network or individual characteristics of the users involved in the cascades explain why falsity travels with greater velocity than the truth. Perhaps those who spread falsity “followed” more people, had more followers, tweeted more often, were more often “verified” users, or had been on Twitter longer. But when we compared users involved in true and false rumor cascades, we found that the opposite was true in every case.

In fact, people spreading false news…

  • had fewer followers
  • followed fewer people
  • tweeted less often
  • were verified less often
  • had joined later

“Falsehood diffused farther and faster than the truth despite these differences, not because of them,” the researchers write.

So why does false news spread quicker?

On this count the researchers can only speculate, although their speculation is of the justified, data-backed sort. Fortunately, while the large-scale spread of false news online is a new and relatively unstudied phenomenon, sociology and psychology have plenty to say about how and why people share news.

“There’s actually extensive study in human communications in why certain news spreads faster, not just a common sense understanding of it,” explained Deb Roy, the third co-author of the paper. “It’s well understood that there’s a bias to our sharing negative over positive news, and also a bias to sharing surprising over unsurprising news.”

If people are more likely to spread news that’s novel (which is “almost definitional,” Roy said) and also news that’s negative (the “if it bleeds, it leads” phenomenon), then all that remains to be seen is whether false news is more novel and more negative than true news.

Photo: SuperStock/Getty Images

The researchers analyzed a subset of users and their histories to compare the novelty of false versus true rumor tweets. They found that indeed, “false rumors were significantly more novel than the truth across all novelty metrics.”
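
How do you measure novelty? The study compared the content of a rumor tweet against the tweets its recipients had recently been exposed to, using topic models and information-theoretic distance metrics. Here is a much cruder stand-in to illustrate the idea, swapping in a simple bag-of-words cosine distance of our own choosing:

```python
import math
from collections import Counter

def cosine_distance(text_a, text_b):
    """1 - cosine similarity over bag-of-words counts: a crude
    stand-in for the topic-model-based novelty metrics in the paper."""
    a, b = Counter(text_a.lower().split()), Counter(text_b.lower().split())
    dot = sum(a[w] * b[w] for w in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return 1 - dot / norm if norm else 1.0

# Novelty of a rumor tweet relative to what the user recently saw.
history = "the senate passed the budget bill after a long debate"
rumor = "shark spotted swimming down a flooded city highway"
print(cosine_distance(rumor, history))  # near 1: highly novel
```
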

Looking at the word choices in replies and the emotions associated with them, the researchers then found that false rumors drew replies expressing surprise and disgust, while replies to true stories expressed sadness, anticipation, joy and trust.
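
That kind of analysis works by mapping reply words onto emotion categories with a lexicon. A minimal sketch of the idea, with a toy hand-built word list standing in for the full emotion lexicon the study used:

```python
from collections import Counter

# A toy emotion lexicon, purely illustrative; the study mapped
# reply words onto emotions using a large curated lexicon.
LEXICON = {
    "shocking": "surprise", "unbelievable": "surprise",
    "gross": "disgust", "awful": "disgust",
    "sad": "sadness", "hope": "anticipation",
}

def emotion_profile(replies):
    """Count emotion-category hits across a set of reply texts."""
    counts = Counter()
    for reply in replies:
        for word in reply.lower().split():
            if word in LEXICON:
                counts[LEXICON[word]] += 1
    return counts

replies = ["unbelievable if true", "this is gross", "shocking stuff"]
print(emotion_profile(replies))
# Counter({'surprise': 2, 'disgust': 1})
```
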

The implications seem clear, though they can only be confirmed through further experimentation. So far the researchers have established that false news propagates faster, and that false news is more novel and more negative than true news. A follow-up experiment will have to show that false news propagates faster because it is more novel and negative.

What can we do about it?

If humans are responsible for the spread of false news, what hope do we have? Well, don't lose hope: this is an old problem, and people have been dealing with it for centuries, as Swift showed us. Just maybe not at this scale.

“Putting millions — or, across platforms, billions — of people in a position to play an active, real-time role in news distribution is new,” said Roy. “There’s a lot more science to be done to understand networked human behavior and how that intersects with communicating news and information.”

Roy said he liked to frame the question as one of health. And in fact Jack Dorsey just last week used the same metaphor during a lengthy tweetstorm — citing Roy’s nonprofit company Cortico as the source for it.

Roy and others are working on building what he called health indicators for a system like Twitter, but obviously also for other online systems — Facebook, Instagram, forums, you name it. But he was quick to point out that those platforms are just part of what you might call a holistic online health approach.

For instance, Aral pointed out issues on the economic side: “The social media advertising system creates incentives for spreading false news, because advertisers are rewarded for eyeballs.” Cutting false news means making less money, a choice few companies would make.

“There’s a short-term profit hit from stopping news from spreading online,” Aral admitted. “But there’s also a long-term sustainability issue. If the platform becomes a wasteland of false news and unhealthy conversations, people may lose interest altogether. I think Facebook and Twitter have a true long-term profit maximizing incentive.”

But if the problem is with people as well as algorithms and ad rates, what can be done?

“What you want is for people to pause and reflect on what they’re doing, but boy is that hard, as every behavioral economist knows,” said Roy. But what if you make it easy and ubiquitous?

“When you go to the grocery store,” Aral said, “the food is extensively labeled. How it’s produced, where it came from, does it have nuts in it, etc. But when it comes to information we don’t have any of that. Does this source tend to produce false information or not? Does this news outlet require 3 independent sources or just one? How many people contributed to the story? We don’t have any of that information about the news, only the news as it’s presented to us.”

He mentioned that Vosoughi (who modestly or absent-mindedly neglected to mention it on our separate call) had designed an algorithm that could give a good indication of the truthfulness of stories before they spread on Twitter. Why don’t companies like Facebook and Google do something like this with all their data, their experts in machine learning and language, their comprehensive histories of sites and stories, activity and engagement?

There’s a lot of talk, but action seems harder to come by. Roy, for his part, cautioned against looking for a magic bullet from the likes of Twitter or Facebook.

“There’s a lot of focus on the platforms,” he said. “The platform is super important, but there’s also the content producers, advertisers, influencers and then of course there’s the people! The kind of policy changes or interventions, or tools, that allow for regulation or change for each of those is going to look different, because they all have different roles.”

“That’s good,” he noted, “because it’ll keep researchers like us humming along for a long time.”

So will the data set, which the researchers are releasing (with Twitter’s consent) for anyone to experiment on or verify the current results. Expect further work in this area soon.