“Ghost Cities”: Hunting Chinese Ghosts With Big Data And Social Intelligence

social intelligence - big data

Some recent studies have illustrated China’s rapid development with unprecedented data: the inhabited areas have grown from 8,842 to 41,768 km² in about twenty years. It is estimated that the amount of building materials used between 2010 and 2013 equals that used by the US construction industry in the whole of the twentieth century.

China’s race toward future power is not without consequences: the territory is dotted with “ghost cities”, housing complexes born of construction speculation that now stand uninhabited or under-used. These “ghost towns” are elusive and controversial objects of study. Hard to define and difficult to identify, they pose a problem for the government, which cannot locate them and then convert them into new production hubs.

Several attempts to map these almost deserted areas have failed. Neither the government nor its ministries have ever released official data on them, and several independent studies have produced conflicting classifications. A recent paper by the Institute of Remote Sensing and Geographic Information Systems (Peking University), conducted in collaboration with Baidu’s Big Data Lab, promises to radically change the situation with Big Data and Social Intelligence.

The spread of geolocated devices (LADs or GPSs) generates, on a daily basis, a large volume of published content, location data, and paths that can be segmented by time period. Hence the idea of using Baidu, the main Chinese web search engine, and its geolocated Big Data as tools to build a previously unavailable electronic archive. Based on the actions of online users, the researchers created a database of digital activity that correlates with housing density in specific geographic areas and periods of the year.

How? Smartphones and their anonymous datasets reveal where users are, showing up as concentrations of traffic and mobile connections. The system captures both local dynamics (i.e. seasonal variations in the residential cycle, or recurrent traffic trajectories such as home-work paths) and demographic dynamics, cross-referencing web data with internal migration flows.

After mapping the country’s built-up areas and setting an average number of users per square meter (starting from the average of a standard residential area), the method flags an area as a “ghost city” whenever the number of active online users per unit of space stays far below that average for a prolonged time. The algorithm also distinguishes these areas from holiday resorts, which naturally empty out in low season but record a “cyclical presence” throughout the year.
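The classification rule described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the researchers’ actual implementation: the function name `classify_area`, the three labels, and the thresholds (`low_ratio`, `ghost_fraction`, `seasonal_fraction`) are all assumptions chosen for the example, since the article does not state how much “much less than average” is or how long “a prolonged time” lasts.

```python
def classify_area(daily_density, reference_density,
                  low_ratio=0.5, ghost_fraction=0.9, seasonal_fraction=0.5):
    """Label a built-up area from its daily active-user density.

    daily_density     -- active users per unit of space, one value per day
    reference_density -- average density of a standard residential area
    low_ratio         -- ASSUMPTION: a day is "low" below this share of the reference
    ghost_fraction    -- ASSUMPTION: share of low days that counts as "prolonged"
    seasonal_fraction -- ASSUMPTION: share of low days typical of a resort
    """
    if not daily_density:
        raise ValueError("daily_density must not be empty")

    threshold = low_ratio * reference_density
    low_days = sum(1 for d in daily_density if d < threshold)
    low_share = low_days / len(daily_density)

    if low_share >= ghost_fraction:
        # Almost never populated: candidate ghost city.
        return "ghost"
    if low_share >= seasonal_fraction:
        # Mostly empty, but with a recurring busy period: resort-like pattern.
        return "seasonal"
    return "inhabited"


# Toy, made-up series (360 days each), for illustration only.
reference = 1.0
areas = {
    "empty district": [0.1] * 360,
    "seaside resort": [0.1] * 270 + [1.2] * 90,
    "ordinary town": [1.0] * 360,
}
for name, series in areas.items():
    print(name, "->", classify_area(series, reference))
```

Counting the share of low-density days, rather than comparing a single yearly average, is what lets the sketch keep a resort with a busy high season apart from an area that is empty all year.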


Demographic variation

Comparison of active users in the cities of Kangbashi and Rushan (a holiday destination). During national holidays, the number of active users decreases in Kangbashi (blue line) and increases in Rushan (green line).


The result is illustrated in an interactive map containing 20 cities among those with the most uninhabited areas (the actual rankings have not yet been made public, to avoid abuse by new speculators). This is a first step for the government in its census and redevelopment plan for new urban areas.

Baidu ghost city

The 20 cities identified by the researchers: on the left, the geolocation of anomalies in digital data traffic; on the right, the processed model applied to the satellite map, with the ghost areas detected.

Big Data’s solution is convincing: the evidence about the human environment comes from the network’s own data, with no need to actively interview subjects. Datasets are not questions, but they hold many answers: when properly analyzed, they can reveal trends, describe behaviors, draw maps, and even capture ghosts. This is the “magic” of Social Intelligence.

Filippo Tansini

Analyst specializing in politics, finance, insurance, and pharmaceuticals. An expert in the Italian language, he obtained a degree in Modern Philology, working on textual analysis and criticism. A PhD student at the Department of Theatre and New Media at La Sapienza in Rome, he worked as a data analyst for an international company, following projects on social reputation, product campaigns, and landscape analyses for multinational companies. He has been a web reputation manager and data analyst at Cultur-e since 2015. He works on social media analysis services, monitoring of brands and top managers, marketing campaigns, and crisis management. Languages: Italian and English.