OpenAI built a text generator so good, it’s considered too dangerous to release

A storm is brewing over a new language model built by non-profit artificial intelligence research company OpenAI, which says the model is so good at generating convincing, well-written text that it’s worried about potential abuse.

That’s angered some in the community, who have accused the company of reneging on a promise not to close off its research.

OpenAI said its new natural language model, GPT-2, was trained to predict the next word in a sample of 40 gigabytes of internet text. The result is a system that generates text that “adapts to the style and content of the conditioning text,” allowing the user to “generate realistic and coherent continuations about a topic of their choosing.” The model vastly improves on the first version, producing longer text with greater coherence.
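To make "predicting the next word" concrete, here is a toy sketch in Python. It is not OpenAI's code and bears no resemblance to GPT-2's scale or design: GPT-2 is a large neural network, while this example simply counts which word most often follows another in a tiny, made-up training sentence. The function names and the sample corpus are hypothetical, chosen only for illustration.

```python
from collections import Counter, defaultdict


def train_bigram_model(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    following = defaultdict(Counter)
    for current_word, next_word in zip(words, words[1:]):
        following[current_word][next_word] += 1
    return following


def predict_next_word(model, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    followers = model.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


def continue_text(model, prompt, max_words=5):
    """Extend the prompt one predicted word at a time (greedy continuation)."""
    words = prompt.lower().split()
    if not words:
        return ""
    for _ in range(max_words):
        next_word = predict_next_word(model, words[-1])
        if next_word is None:
            break  # no known follower, so stop generating
        words.append(next_word)
    return " ".join(words)


# Hypothetical training text, standing in for the 40 gigabytes GPT-2 saw.
corpus = "the cat sat on the mat and the cat slept on the sofa"
model = train_bigram_model(corpus)
print(continue_text(model, "sat", max_words=2))
```

Even this crude counter continues a prompt in the vocabulary of whatever it was trained on. A model trained on far more text, looking at far more context than a single word, does the same thing with much greater fluency.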

But for every good application of the system, such as bots capable of better dialog and better speech recognition, the non-profit found several malicious ones, like generating fake news, impersonating people, or automating abusive or spam comments on social media.

To wit: when GPT-2 was tasked with writing a response to the prompt, “Recycling is good for the world, no, you could not be more wrong,” the machine spat back:

“Recycling is NOT good for the world. It is bad for the environment, it is bad for our health, and it is bad for our economy. I’m not kidding. Recycling is not good for the environment. It is destructive to the earth and it is a major contributor to global warming. Recycling is not good for our health. It contributes to obesity and diseases like heart disease and cancer. Recycling is bad for our economy. It increases the cost of a product, and in turn, the price of everything that is made with that product. Recycling is not good for our nation. We pay a tremendous price for the privilege of having the world’s most advanced and efficient recycling system. Recycling is a huge, colossal waste of time, energy, money, and resources.”

No wonder OpenAI was worried about releasing it.

For that reason, OpenAI said, it’s only releasing a smaller version of the language model, citing its charter, which noted that the organization expects that “safety and security concerns will reduce our traditional publishing in the future.” The organization admitted that it wasn’t sure of the decision, adding that “we believe that the AI community will eventually need to tackle the issue of publication norms in a thoughtful way in certain research areas.”

Not everyone took that well. OpenAI’s tweet announcing GPT-2 was met with anger and frustration, with critics accusing the company of “closing off” its research and doing the “opposite of open,” seizing on the company’s name.

Others were more forgiving, calling the move a “new bar for ethics” for thinking ahead of possible abuses.

Jack Clark, policy director at OpenAI, said the organization’s priority is “not enabling malicious or abusive uses of the technology,” calling it a “very tough balancing act for us.”

Elon Musk, one of the initial funders of OpenAI, was roped into the controversy, confirming in a tweet that he has not been involved with the company “for over a year,” and that he and the company parted “on good terms.”

OpenAI said it hasn’t settled on a final decision about GPT-2’s release, and that it will revisit the question in six months. In the meantime, the company said that governments “should consider expanding or commencing initiatives to more systematically monitor the societal impact and diffusion of AI technologies, and to measure the progression in the capabilities of such systems.”

Just this week, President Trump signed an executive order on artificial intelligence. It comes months after the U.S. intelligence community warned that artificial intelligence was one of the many “emerging threats” to U.S. national security, along with quantum computing and autonomous unmanned vehicles.