Facebook will change algorithm to demote "borderline content" that almost violates policies

Facebook will change its News Feed algorithm to demote content that comes close to violating its policies prohibiting misinformation, hate speech, violence, bullying, and clickbait, so it’s seen by fewer people even if it’s highly engaging. The change could massively reduce the reach of incendiary political groups, fake news peddlers, and more of the worst stuff on Facebook. It allows the company to hide what it doesn’t want on the network without having to take, and then defend, a hard stance that the content breaks the rules.

In a 5,000-word letter published today, Mark Zuckerberg explained that there’s a “basic incentive problem” in that “when left unchecked, people will engage disproportionately with more sensationalist and provocative content. Our research suggests that no matter where we draw the lines for what is allowed, as a piece of content gets close to that line, people will engage with it more on average — even when they tell us afterwards they don’t like the content.”

Without intervention, the engagement with borderline content looks like the graph above, increasing as it gets closer to the policy line. So Facebook is intervening, artificially suppressing the News Feed distribution of this kind of content so engagement looks like the graph below.

[Update: While Zuckerberg refers to the change in the past tense in one case, Facebook tells me borderline content demotion is only in effect in limited instances. The company will continue to repurpose the AI technology it uses to proactively take down policy-violating content so that it can also find and demote content that approaches the limits of those policies.]
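To make the idea of reshaping the distribution curve concrete, here is a minimal illustrative sketch. It is not Facebook's ranking system, whose details the company has not published. Every name and number in it (`Post`, `borderline_score`, `DEMOTION_START`, `MIN_MULTIPLIER`, the linear ramp) is a hypothetical assumption chosen only to show how a penalty that grows as content nears the policy line can outweigh that content's higher raw engagement.

```python
from dataclasses import dataclass
from typing import List

# All constants below are illustrative assumptions, not published Facebook values.
POLICY_LINE = 1.0      # content scoring at or above this is treated as violating
DEMOTION_START = 0.6   # assumed score at which demotion begins
MIN_MULTIPLIER = 0.1   # assumed distribution floor just short of the policy line


@dataclass
class Post:
    post_id: str
    engagement_score: float  # hypothetical predicted engagement (higher = more engaging)
    borderline_score: float  # hypothetical classifier output in [0, 1]


def demotion_multiplier(borderline_score: float) -> float:
    """Return the fraction of normal distribution a post keeps."""
    if borderline_score >= POLICY_LINE:
        return 0.0  # violating content gets no distribution at all
    if borderline_score <= DEMOTION_START:
        return 1.0  # clearly acceptable content is not penalized
    # Linear ramp from 1.0 at DEMOTION_START down to MIN_MULTIPLIER at the line.
    progress = (borderline_score - DEMOTION_START) / (POLICY_LINE - DEMOTION_START)
    return 1.0 - progress * (1.0 - MIN_MULTIPLIER)


def rank_feed(posts: List[Post]) -> List[Post]:
    """Drop violating posts, then order the rest by demoted engagement."""
    allowed = [p for p in posts if p.borderline_score < POLICY_LINE]
    return sorted(
        allowed,
        key=lambda p: p.engagement_score * demotion_multiplier(p.borderline_score),
        reverse=True,
    )


if __name__ == "__main__":
    feed = rank_feed([
        Post("benign", engagement_score=5.0, borderline_score=0.1),
        Post("borderline", engagement_score=10.0, borderline_score=0.9),
        Post("violating", engagement_score=20.0, borderline_score=1.0),
    ])
    for post in feed:
        print(post.post_id)
```

In this toy setup, the borderline post has twice the raw engagement of the benign one but still ranks below it, and the violating post is dropped entirely. A real system would presumably tune the shape of the curve per content category rather than use a single linear ramp.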

Facebook will apply penalties to borderline content not just in the News Feed but across all of its content, including Groups and Pages themselves, to ensure it doesn’t radicalize people by recommending they join communities that are highly engaging only because they toe the policy line. “Divisive groups and pages can still fuel polarization,” Zuckerberg notes.

However, users who purposefully want to view borderline content will be given the chance to opt in. Zuckerberg writes that “For those who want to make these decisions themselves, we believe they should have that choice since this content doesn’t violate our standards.” For example, Facebook might create flexible standards for types of content like nudity where cultural norms vary, such as how some countries ban women from exposing much skin in photographs while others allow nudity on network television. It may be some time until these opt-ins are available, though, as Zuckerberg says Facebook must first train its AI to reliably detect content that either crosses the line or purposefully approaches it.

Facebook had previously changed the algorithm to demote clickbait. Starting in 2014 it downranked links that people clicked on but quickly bounced from without going back to Like the post on Facebook. By 2016, it was analyzing headlines for common clickbait phrases, and this year it banned clickbait rings for inauthentic behavior. But now it’s giving the demotion treatment to other types of sensational content. That could mean posts with violence that stop short of showing physical injury, or lewd images with genitalia barely covered, or posts that suggest people should commit violence for a cause without directly telling them to.

Facebook could end up exposed to criticism, especially from fringe political groups who rely on borderline content to whip up their bases and spread their messages. But with polarization and sensationalism rampant and tearing apart society, Facebook has settled on a position: it may try to uphold freedom of speech, but users are not entitled to amplification of that speech.

Below is Zuckerberg’s full written statement on borderline content:

One of the biggest issues social networks face is that, when left unchecked, people will engage disproportionately with more sensationalist and provocative content. This is not a new phenomenon. It is widespread on cable news today and has been a staple of tabloids for more than a century. At scale it can undermine the quality of public discourse and lead to polarization. In our case, it can also degrade the quality of our services.

[ Graph showing line with growing engagement leading up to the policy line, then blocked ]

Our research suggests that no matter where we draw the lines for what is allowed, as a piece of content gets close to that line, people will engage with it more on average — even when they tell us afterwards they don’t like the content.

This is a basic incentive problem that we can address by penalizing borderline content so it gets less distribution and engagement. By making the distribution curve look like the graph below where distribution declines as content gets more sensational, people are disincentivized from creating provocative content that is as close to the line as possible.

[ Graph showing line with declining engagement leading up to the policy line, then blocked ]

This process for adjusting this curve is similar to what I described above for proactively identifying harmful content, but is now focused on identifying borderline content instead. We train AI systems to detect borderline content so we can distribute that content less.

The category we’re most focused on is click-bait and misinformation. People consistently tell us these types of content make our services worse — even though they engage with them. As I mentioned above, the most effective way to stop the spread of misinformation is to remove the fake accounts that generate it. The next most effective strategy is reducing its distribution and virality. (I wrote about these approaches in more detail in my note on [Preparing for Elections].)

Interestingly, our research has found that this natural pattern of borderline content getting more engagement applies not only to news but to almost every category of content. For example, photos close to the line of nudity, like with revealing clothing or sexually suggestive positions, got more engagement on average before we changed the distribution curve to discourage this. The same goes for posts that don’t come within our definition of hate speech but are still offensive.

This pattern may apply to the groups people join and pages they follow as well. This is especially important to address because while social networks in general expose people to more diverse views, and while groups in general encourage inclusion and acceptance, divisive groups and pages can still fuel polarization. To manage this, we need to apply these distribution changes not only to feed ranking but to all of our recommendation systems for things you should join.

One common reaction is that rather than reducing distribution, we should simply move the line defining what is acceptable. In some cases this is worth considering, but it’s important to remember that won’t address the underlying incentive problem, which is often the bigger issue. This engagement pattern seems to exist no matter where we draw the lines, so we need to change this incentive and not just remove content.

I believe these efforts on the underlying incentives in our systems are some of the most important work we’re doing across the company. We’ve made significant progress in the last year, but we still have a lot of work ahead.

By fixing this incentive problem in our services, we believe it’ll create a virtuous cycle: by reducing sensationalism of all forms, we’ll create a healthier, less polarized discourse where more people feel safe participating.