Before the Law
How I built a database of 400,000 Chinese state media articles and what I found about the disappearance of my own people
In the winter of 2020, I sat in a political re-education class in the news building. For thirty days.
The teacher explained to us that Chinese is a modern language. It contains precise scientific vocabulary. You can express the laws of physics and chemistry in Chinese. Mongolian, he said, is a backward language. It can only describe cattle, horses, and grasslands. Therefore, we should learn Chinese.
This was absurd. We grew up in China. We all spoke Chinese fluently. The protests that swept Southern Mongolia that September were never about refusing to learn Mandarin. They were about keeping the right to study math, history, and science in our mother tongue. In classrooms where Mongolian had been the language of instruction for decades.
I sat quietly. I understood: the language I had spoken, studied, and thought in since childhood was not going to survive this.
Five years later, I built a tool to measure what had happened.
PropagandaScope is a database of over 400,000 articles from 20 Chinese provincial party newspapers. It tracks how political keywords move from the center to the provinces. How often they appear. How fast they spread. Which regions amplify them the most.
When I searched for “蒙古族,” the Chinese term for “Mongolians,” I expected to find it diminished. I did not expect what the data showed.
Out of more than 400,000 articles, exactly one mentioned the term. One. That single article only used it because it reported on a conference named “Mongolian Literature.” The term appeared in the title.
The word for my people has been removed from the vocabulary of Chinese state media.
What replaced it is a geographic label: “北疆文化,” Northern Frontier Culture. You are no longer Mongolian. You are from the northern frontier.
The idea for this project did not come from nowhere. In 2026, I co-authored an article with Professor James Leibold from La Trobe University for Made in China Journal. We discussed using map APIs to document how Mongolian-language schools were being renamed across Southern Mongolia. The technical approach did not work. But the question stayed with me. The evidence of erasure was sitting in plain sight. In Chinese government data. In public newspaper archives. In open digital records. Someone just had to build the tool to extract it.
When I started using Claude Code, I realized I could.
I am not a software engineer. I built PropagandaScope in three weeks. One person talking to a machine.
The first thing I wanted to understand was a phrase: “铸牢中华民族共同体意识.” Forging a strong sense of community for the Chinese nation. Xi Jinping introduced it in 2014. By 2017 it was in the Party Charter. By 2021 it was the guiding principle of all ethnic work. On March 12, 2026, it became law.
I searched for it in PropagandaScope. The result confirmed what I had suspected but could not prove. State media had been saturating minority regions with this phrase for years. Xinjiang Daily used it at 17.4 times the rate of People’s Daily. Tibet Daily, 12.1 times. Ningxia Daily, 10.3 times.
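For readers who want to see how a figure like "17.4 times the rate" can be derived, here is a minimal sketch. It assumes that "rate" means the share of a newspaper's articles containing the phrase at least once; the essay does not spell out the exact definition, so PropagandaScope may compute it differently. The names `PaperCounts`, `usage_rate`, and `amplification` are hypothetical, and the counts are placeholders, not figures from the database.

```python
from dataclasses import dataclass


@dataclass
class PaperCounts:
    """Article counts for one newspaper (hypothetical structure)."""
    name: str
    articles_total: int        # all articles collected from this paper
    articles_with_phrase: int  # articles containing the phrase at least once


def usage_rate(paper: PaperCounts) -> float:
    """Share of a paper's articles that contain the phrase."""
    if paper.articles_total <= 0:
        raise ValueError(f"{paper.name}: no articles collected")
    return paper.articles_with_phrase / paper.articles_total


def amplification(paper: PaperCounts, baseline: PaperCounts) -> float:
    """How many times more often `paper` uses the phrase than `baseline`.

    Cross-multiplies the counts instead of dividing two rates, which
    avoids a small floating-point error in the result.
    """
    if baseline.articles_with_phrase <= 0 or paper.articles_total <= 0:
        raise ValueError("ratio is undefined for these counts")
    return (paper.articles_with_phrase * baseline.articles_total) / (
        paper.articles_total * baseline.articles_with_phrase
    )


# Placeholder counts for illustration only, NOT data from PropagandaScope.
central = PaperCounts("central paper", articles_total=1000, articles_with_phrase=10)
regional = PaperCounts("regional paper", articles_total=1000, articles_with_phrase=30)

ratio = amplification(regional, central)
print(f"{regional.name} uses the phrase at {ratio:.1f}x the rate of {central.name}")
```

Dividing by each paper's own article total matters because the newspapers publish different volumes. A raw count would mostly measure how much a paper prints, not how heavily it leans on the phrase.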
The places with the highest ethnic minority populations received the heaviest propaganda. The places China calls “autonomous” had the least autonomy over their own newspapers.
By the time the NPC voted on March 12, there was nothing left to campaign for. The campaign was already complete. The law did not start anything. It put a stamp on what was already done.
One detail from building this project will stay with me.
When I created the China map for the site, the boundary data came from Alibaba Cloud. A state-affiliated platform. I did not think twice about it. It was the default source.
The data encoded China’s official territorial claims. The nine-dash line. Taiwan as a province. When I removed those claims, the map refused to render. The entire country disappeared. In Alibaba’s logic, a China without the nine-dash line and Taiwan is not OK.
I spent an hour debugging before realizing the problem was not my code. It was the data source. I replaced it with Natural Earth, a public domain dataset used by Western media and academia.
A small fix. But it made something clear. China’s data infrastructure runs deep enough that even a tool built to analyze its propaganda was, by default, built on its own cartographic claims.
PropagandaScope is now live at propagandascope.org. The database updates daily. The scrapers run every night, pulling new articles from 20 provincial party newspapers.
You can search any Chinese political phrase and see how it moves through the system. You can read our investigation on the Ethnic Unity and Progress Promotion Law. You can search “蒙古族” and see the silence for yourself.
I built this because I sat in that classroom in 2020 and could not say what I knew. This is what I can say now.



What we can’t prevent, we can still bear witness to, and look forward to reversing.
Thank you for your effort to bring cultural erasure to light.