Key Points:
- China’s CAC reviews AI models to ensure they align with “core socialist values.”
- The review includes major tech companies and small startups, focusing on responses to politically sensitive topics.
- Companies must filter training data and build databases of sensitive words and phrases, which complicates model development.
- Regulations and U.S. sanctions have hindered Chinese firms’ ability to launch ChatGPT-like services despite China’s dominance in generative AI patents.
According to a report by the Financial Times, AI companies in China are undergoing a rigorous review by the Cyberspace Administration of China (CAC) to ensure their large language models (LLMs) align with “core socialist values.” The review covers a broad range of companies, from tech giants like ByteDance and Alibaba to smaller startups.
The CAC, China’s chief internet regulator, is testing these AI models for their responses to various questions, particularly those related to politically sensitive topics and Chinese President Xi Jinping. In addition to response testing, the review also examines the models’ training data and safety processes.
An anonymous source at a Hangzhou-based AI company shared that their model failed the initial round of testing for unclear reasons and passed only after several months of adjustments, underscoring how difficult the CAC’s requirements are to meet.
These efforts highlight Beijing’s strategy of competing in the global AI race while maintaining strict control over technology development and ensuring it adheres to China’s stringent internet censorship policies. China was among the first countries to establish rules for generative AI, mandating that AI services align with socialist values and avoid producing “illegal” content.
Implementing these censorship policies involves “security filtering,” which is complicated because many Chinese LLMs are trained on substantial amounts of English-language content. Engineers and industry insiders explained that this filtering process includes removing “problematic information” from training data and creating a database of sensitive words and phrases.
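To make the filtering step concrete, the sketch below shows one plausible, greatly simplified approach: dropping training documents that match entries in a sensitive-phrase database. The phrase list, function names, and matching logic are all illustrative assumptions; actual pipelines are proprietary and, per the report, far more elaborate.

```python
# Hypothetical sketch of keyword-based training-data filtering.
# The phrase database and matching logic are illustrative
# assumptions, not a documented production pipeline.

import re
from typing import Iterable, Iterator

# Placeholder database of sensitive words and phrases.
SENSITIVE_PHRASES = {"example_banned_phrase", "another_banned_term"}

# Precompile one pattern that matches any listed phrase, case-insensitively.
_PATTERN = re.compile(
    "|".join(re.escape(p) for p in SENSITIVE_PHRASES),
    re.IGNORECASE,
)

def filter_training_corpus(documents: Iterable[str]) -> Iterator[str]:
    """Yield only documents that contain no flagged phrase."""
    for doc in documents:
        if not _PATTERN.search(doc):
            yield doc

# Usage: clean_docs = list(filter_training_corpus(raw_documents))
```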
Due to these regulations, popular Chinese chatbots often refuse to answer questions on sensitive subjects, such as the 1989 Tiananmen Square protests. However, during CAC testing, there are limits on how many questions an LLM can decline outright. Therefore, the models must be capable of generating “politically correct answers” to sensitive questions.
An AI expert working on a chatbot in China mentioned that it is nearly impossible to prevent LLMs from generating all potentially harmful content. Instead, developers build an additional layer into the system that replaces problematic answers in real time.
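The sketch below illustrates what such a real-time replacement layer might look like in its simplest form: the model’s raw output is passed through a safety check, and a canned response is substituted if it is flagged. The classifier, fallback text, and interfaces here are assumptions for illustration, not a documented production design.

```python
# Hypothetical sketch of a real-time output-replacement layer.
# The safety check, fallback answer, and model interface are
# illustrative assumptions.

from typing import Callable

def is_problematic(text: str) -> bool:
    """Stand-in safety check; real systems would combine phrase
    matching with a trained classifier."""
    blocked_terms = {"example_banned_phrase"}  # placeholder database
    return any(term in text.lower() for term in blocked_terms)

FALLBACK_ANSWER = "I can't discuss that topic. Is there anything else I can help with?"

def guarded_generate(model: Callable[[str], str], prompt: str) -> str:
    """Generate a reply, substituting a canned answer if the
    raw model output is flagged."""
    raw = model(prompt)
    return FALLBACK_ANSWER if is_problematic(raw) else raw
```

Wrapping generation this way means the underlying model never has to be perfectly filtered; the outer layer intercepts and replaces problematic answers before they reach the user.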
The combined impact of these regulations and U.S. sanctions, which have restricted access to chips for training LLMs, has posed significant challenges for Chinese firms trying to launch ChatGPT-like services. Nonetheless, China leads globally in generative AI patents.