By Li Yuan
Some of the most critical work in advancing China’s technology goals takes place in a former cement factory in the middle of the country’s heartland, far from the aspiring Silicon Valleys of Beijing and Shenzhen. An idled concrete mixer still stands in the middle of the courtyard. Boxes of melamine dinnerware are stacked in a warehouse next door.
Inside, Hou Xiameng runs a company that helps artificial intelligence make sense of the world. Two dozen young people go through photos and videos, labeling just about everything they see. That is a car. That is a traffic light. That is bread, that is milk, that is chocolate. That is what it looks like when a person walks. “I used to think the machines are geniuses,” Hou, 24, said. “Now I know we’re the reason for their genius.”
In China, long the world’s factory floor, a new generation of low-wage workers is assembling the foundations of the future. Startups in smaller, cheaper cities have sprung up to apply labels to China’s huge trove of images and surveillance footage. If China is the Saudi Arabia of data, as one expert says, these businesses are the refineries, turning raw data into the fuel that can power China’s AI ambitions.
Conventional wisdom says that China and the United States are competing for AI supremacy and that China has certain advantages. The Chinese government broadly supports AI companies, financially and politically. Chinese startups made up one-third of the global computer vision market in 2017, surpassing the United States. Chinese academic papers are cited more often in research papers. In a key policy announcement last year, China’s government said that it expected the country to become the world leader in artificial intelligence by 2030.
Most importantly, this thinking goes, the Chinese government and companies enjoy access to mountains of data, thanks to weak privacy laws and enforcement. Beyond what Facebook, Google and Amazon have amassed, Chinese internet companies can get more because people there so heavily use their mobile phones to shop, pay for meals and buy movie tickets.
Still, many of those claims are iffy. Chinese papers and patents can be suspect. Government money may go to waste. It is not clear that the AI race is a zero sum game, in which the winner gets the spoils. Data is useless unless somebody can parse and catalog it.
But the ability to tag that data may be China’s true AI strength, the only one that the United States may not be able to match. In China, this new industry offers a glimpse of a future that the government has long promised: an economy built on technology rather than manufacturing. “We’re the construction workers in the digital world. Our job is to lay one brick after another,” said Yi Yake, co-founder of a data labeling factory in Jiaxian, a city in central Henan province. “But we play an important role in AI. Without us, they can’t build the skyscrapers.”
While AI engines are superfast learners and good at tackling complex calculations, they lack cognitive abilities that even the average 5-year-old possesses. Small children know that a furry brown cocker spaniel and a black Great Dane are both dogs. They can tell a Ford pickup from a Volkswagen Beetle, and yet they know both are cars. AI has to be taught. It must digest vast amounts of tagged photos and videos before it realizes that a black cat and a white cat are both cats. This is where the data factories and their workers come in.
Taggers helped AInnovation, a Beijing-based AI company, fix its automated cashier system for a Chinese bakery chain. Users could put their pastry under a scanner and pay for it without help from a human. But nearly one-third of the time, the system had trouble telling muffins from doughnuts or pork buns thanks to store lighting and human movement, which made images more complex. Working with photos from the store’s interior, the taggers got the accuracy up to 99 percent, said Liang Rui, an AInnovation project manager.
“All the artificial intelligence is built on human labor,” Liang said. AInnovation has fewer than 30 taggers, but a surge in labeling startups has made it easy to farm out the work. Once, Liang needed to get about 20,000 photos in a supermarket labeled in three days. Colleagues got it done with the help of data factories for only a couple thousand dollars. “We’re the assembly lines 10 years ago,” said Yi, the co-founder of the data factory in Henan.
The data factories are popping up far from the biggest cities, often in relatively remote areas where both labor and office space are cheap. Many of the data factory workers are the kinds of people who once worked on assembly lines and construction sites in those big cities. But that work is drying up, wage growth has slowed and many Chinese people prefer to live closer to home.
Yi, 36, was out of a job and trying to get other ventures going with elementary school classmates when someone mentioned AI tagging. After online searches, he decided it was not super technical but needed cheap labor, something Henan has in abundance.
In March, Yi and his friends set up Ruijin Technology, which rents offices the size of two professional basketball courts in an industrial park for $21,000 a year. The space was previously the event space of the park’s Communist Party committee, so the ceiling lights are covered with red hammers and sickles. Ruijin, which means smart gold, now employs 300 workers but plans to expand to 1,000 after the Chinese New Year holiday, when many migrant workers come home.
Unlike workers and businesses around the world, Yi is not worried that AI will take his job. “The machines aren’t smart enough to teach themselves yet,” he said.