Twitter relaunches test that asks users to revise harmful replies

Twitter is running a new test that will ask users to pause and think before they tweet. According to the company’s announcement, when Twitter detects what appears to be a potentially harmful or offensive reply to someone else’s tweet, it will prompt you to consider revising your text instead of tweeting.

Users whose tweets are flagged in this way will see a pop-up message appear on their screen, which asks, “Want to review this before Tweeting?” There are then three buttons to choose from: one to tweet the reply anyway, an Edit button (this is as close as we’ll get, apparently) and a Delete button to discard the tweet entirely. There is also a small link to report if the system got things wrong.

This is not the first time Twitter has run a test like this.

In May 2020 and again in August 2020, Twitter ran variations on this same experiment. In those cases, the text on the pop-up screen was largely the same, but the three buttons were laid out differently and were less colorful.

The earlier tests ran on Android, iOS and web, but this current iteration is only on iOS in English, for the time being.

At the time of the initial test, Twitter explained that its systems were able to detect harmful language based on the kind of language used in other tweets that had previously been reported.

Small built-in nudges like these have been shown to have an impact.

For example, when Twitter began prompting users to read the article linked in a tweet before retweeting it, the company found that users would open the articles 40% more often than without the nudge. Twitter has also built similar experiments to try to slow down the pace of online conversation on its platform, by doing things like discouraging retweets without commentary or slowing down “Likes” on tweets containing misinformation.

Other social networks use small nudges like this, too, to influence their users’ behavior. Instagram back in 2019 launched a feature that would flag potentially offensive comments before they were posted, and later expanded this to captions. TikTok more recently launched a banner that would ask users if they were sure they wanted to share a video that contains “unverified content.”

It’s unclear why Twitter hasn’t simply rolled out the pop-up to combat online abuse — still a serious issue on its platform — and then iterated on the design and style of the message box, as needed.

Compared with the much larger engineering and design efforts the company has had underway — including its newer Stories feature known as Fleets and a Clubhouse rival called Spaces — a box asking users to pause and think seems like something that could have graduated to a full product by now.

Twitter, however, says it chose to pause the earlier experiment because it needed to make improvements.

“We paused the experiment once we realized the prompt was inconsistent and that we needed to be more mindful of how we prompted potentially harmful Tweets. This led to more work being done around our health model while making it easier for people to share feedback if we get it wrong,” a Twitter spokesperson explained.

“We also made some changes to this reply prompt to improve how we evaluate potentially offensive language – like insults, strong language, or hateful remarks – and are offering more context for why someone may have been prompted. As we were testing last summer, we also began looking at relationships between Tweet authors and people who reply to avoid prompting replies that were jokes or banter between friends,” they added.