Meta’s new AI deepfake playbook: More labels, fewer takedowns

Meta has announced changes to its rules on AI-generated content and manipulated media following criticism from its Oversight Board. Starting next month, the company said, it will label a wider range of such content, including by applying a “Made with AI” badge to deepfakes. Additional contextual information may be shown when content has been manipulated in other ways that pose a high risk of deceiving the public on an important issue.

The move could lead to the social networking giant labeling more pieces of content that have the potential to be misleading — important in a year of many elections taking place around the world. However, for deepfakes, Meta is only going to apply labels where the content in question has “industry standard AI image indicators,” or where the uploader has disclosed it’s AI-generated content.

AI-generated content that falls outside those bounds will, presumably, escape unlabeled. 

The policy change is also likely to lead to more AI-generated content and manipulated media remaining on Meta’s platforms, since the company is shifting toward an approach focused on “providing transparency and additional context” as the “better way to address this content,” rather than removing manipulated media, given the associated risks to free speech.

So, for AI-generated or otherwise manipulated media on Meta platforms like Facebook and Instagram, the playbook appears to be: more labels, fewer takedowns.

Meta said it will stop removing content solely on the basis of its current manipulated video policy in July, adding in a blog post published Friday that: “This timeline gives people time to understand the self-disclosure process before we stop removing the smaller subset of manipulated media.”

The change of approach may be intended to respond to rising legal demands on Meta around content moderation and systemic risk, such as the European Union’s Digital Services Act. Since last August, the EU law has applied a set of rules to Meta’s two main social networks that require the company to walk a fine line between purging illegal content, mitigating systemic risks and protecting free speech. The bloc is also applying extra pressure on platforms ahead of elections to the European Parliament this June, including urging tech giants to watermark deepfakes where technically feasible.

The upcoming U.S. presidential election in November is also likely on Meta’s mind.

Oversight Board criticism

Meta’s advisory board, which the tech giant funds but permits to run at arm’s length, reviews a tiny percentage of its content moderation decisions but can also make policy recommendations. Meta is not bound to accept the board’s suggestions, but in this instance it has agreed to amend its approach.

In a blog post published Friday, Monika Bickert, Meta’s VP of content policy, said the company is amending its policies on AI-generated content and manipulated media based on the board’s feedback. “We agree with the Oversight Board’s argument that our existing approach is too narrow since it only covers videos that are created or altered by AI to make a person appear to say something they didn’t say,” she wrote.

Back in February, the Oversight Board urged Meta to rethink its approach to AI-generated content after taking on the case of a doctored video of President Biden that had been edited to imply a sexual motive to a platonic kiss he gave his granddaughter.

While the board agreed with Meta’s decision to leave the specific content up, it attacked the company’s policy on manipulated media as “incoherent,” pointing out, for example, that it only applies to video created through AI, letting other fake content (such as more basically doctored video or audio) off the hook.

Meta appears to have taken the critical feedback on board.

“In the last four years, and particularly in the last year, people have developed other kinds of realistic AI-generated content like audio and photos, and this technology is quickly evolving,” Bickert wrote. “As the Board noted, it’s equally important to address manipulation that shows a person doing something they didn’t do.

“The Board also argued that we unnecessarily risk restricting freedom of expression when we remove manipulated media that does not otherwise violate our Community Standards. It recommended a ‘less restrictive’ approach to manipulated media like labels with context.”

Earlier this year, Meta announced it was working with others in the industry on developing common technical standards for identifying AI content, including video and audio. It’s leaning on that effort to expand labeling of synthetic media now.

“Our ‘Made with AI’ labels on AI-generated video, audio and images will be based on our detection of industry-shared signals of AI images or people self-disclosing that they’re uploading AI-generated content,” said Bickert, noting the company already applies “Imagined with AI” labels to photorealistic images created using its own Meta AI feature.

The expanded policy will cover “a broader range of content in addition to the manipulated content that the Oversight Board recommended labeling,” per Bickert.

“If we determine that digitally-created or altered images, video or audio create a particularly high risk of materially deceiving the public on a matter of importance, we may add a more prominent label so people have more information and context,” she wrote. “This overall approach gives people more information about the content so they can better assess it and so they will have context if they see the same content elsewhere.”

Meta said it won’t remove manipulated content — whether AI-based or otherwise doctored — unless it violates other policies (such as voter interference, bullying and harassment, violence and incitement, or other Community Standards issues). Instead, as noted above, it may add “informational labels and context” in certain scenarios of high public interest.

Meta’s blog post highlights a network of nearly 100 independent fact-checkers, which it says it’s engaged with to help identify risks related to manipulated content.

These external entities will continue to review false and misleading AI-generated content, per Meta. When they rate content as “False or Altered,” Meta said it will respond by applying algorithm changes that reduce the content’s reach — meaning stuff will appear lower in feeds so fewer people see it, in addition to Meta slapping an overlay label with additional information for those eyeballs that do land on it.

These third-party fact-checkers look set to face an increasing workload, both because synthetic content is proliferating on the back of the boom in generative AI tools and because more of it looks set to remain on Meta’s platforms as a result of this policy shift.