Technological breakthroughs in the U.S. never fail to inspire challengers, followers and opportunists in China. It’s ChatGPT’s turn to capture the imagination of the world’s largest internet population. On WeChat, ChatGPT’s “trending index,” an indicator of a keyword’s popularity on the social network, rose 155-fold over the past 30 days. It’s fascinating to watch how OpenAI’s powerful language model sparks great interest among the country’s tech giants, startups and ordinary people, not least because it offers a lens to understand the state of the AI race between the two superpowers.
Unlike many other major Western internet platforms, the ChatGPT site isn’t blocked in China yet. But load times are slow, and users can’t sign up with Chinese phone numbers, which adds an extra barrier to using it.
China wants its own intelligent natural-language systems for reasons ranging from language to politics, as we wrote before. The burden to create an indigenous ChatGPT inevitably falls on the country’s tech giants. After debuting a Stable Diffusion-style art generator, Baidu recently announced that it has been working on Ernie Bot, which is slated to launch in March. E-commerce powerhouse Alibaba subsequently revealed that its answer to ChatGPT is undergoing internal testing. As I write this, WeChat’s parent and video game giant Tencent said it also has “relevant research” ongoing.
Baidu and Alibaba stocks surged briefly on their chatbot announcements. But one knows froth is forming when investors start rallying around smaller AI players that are far from being able to develop powerful language models. Chinese audio-to-text service provider iFlytek saw its share price jump 17% this year despite having no ChatGPT-style product in its lineup.
Obscure companies are also piggybacking on the AI frenzy. Scores of ChatGPT knock-offs began cropping up on WeChat last week, charging users for sending prompts. The social network seems to have taken notice. At the time of writing, it has hidden enterprise accounts (think Facebook Pages) with misleading names that contain “ChatGPT” from its search results, presumably to protect users from these dubious copycats.
I tested one account, blatantly named “ChatGPT,” out of curiosity. Unfortunately, I had to spend $4.50 to ask more than 20 questions, but I got it to ponder who it was. To sum up, this bootleg ChatGPT “has nothing to do with OpenAI”; its developer is “a British company called Cofoundry Limited”; its founders are a list of seemingly AI-generated white-person names; and ChatGPT was “developed by GPT-2 Technology Ltd.”
Unsurprisingly, the ChatGPT knock-off wouldn’t answer anything that is considered politically sensitive in China. It did converse with satisfying fluency, which suggests it might have used the OpenAI API and added a censorship layer.
Some enthusiasts are using OpenAI’s technology for motives less obviously tied to monetary gain. A developer made a tool that can condense chat history into digestible bullet points, which comes in handy when one wakes up to find 100 unread group messages on WeChat.
We wrote previously that censorship is perhaps one of the biggest challenges for the future of natural-language systems in China. If Baidu and Alibaba had trained their models solely on Weibo, Baidu and WeChat data, which have censorship baked in, their bots would know what’s off limits. But Baidu reportedly gleans data from both inside and outside the Great Firewall, China’s elaborate censorship mechanism.
That makes perfect sense because there is far more English-language source material, such as academic research, to draw on. It’s then the job of Baidu’s deep learning scientists to ensure its chatbot is censorship-aware. Will Ernie Bot and its local rivals be adroit enough to spot the ever-changing puns and emoticons used for skirting censorship?
State control shapes not only the rules of AI applications but also how many resources go toward certain types of AI research. This is made evident in the State of AI report 2022 from AI investors Nathan Benaich and Ian Hogarth. The report shows that while China was quickly catching up to the U.S. in the number of AI papers published, its AI research skewed more heavily toward surveillance-related tasks such as “autonomy, object detection, tracking, scene understanding, action and speaker recognition.”
Looking forward, resource-rich tech giants like Baidu, Alibaba and Tencent are undoubtedly the closest to creating a capable ChatGPT equivalent for China. Over the long haul, questions remain about China’s ability to secure advanced chipsets for training large AI models as the U.S. cuts off supplies.
And let’s not forget Xiaoice, the once-hyped chatbot from Microsoft China that was spun out in 2020 to focus on “localized innovation.” At the time, Xiaoice said it would continue to license technologies from Microsoft. If that agreement stays unchanged, does it mean Xiaoice could potentially access GPT features, given Microsoft’s close relationship with OpenAI? This might sound far-fetched, but it’s surely worth watching how AI technologies flow across countries’ borders.