How researchers use digital archives on Chinese social media

In April, the National Library of China announced that it will archive all public posts on Weibo for non-commercial use, as part of its project to preserve Internet information. More than 200 billion text posts and 50 billion pictures are expected to be stored. The library believes that this archive program can have a profound influence on digital-heritage retention. “It’s not only the content, but the emotions matter, known as affective computing and also the social networking reflected in these Weibo posts is important,” commented the Chinese scholar Zhou Kui.

Weibo is a microblogging application similar to Twitter. Launched by Sina Corporation on August 14, 2009, it lets people share, disseminate and obtain information based on user relationships. Weibo is now one of the top two social media platforms in China. As of Q3 2018, the app had nearly 450 million users (compared to Twitter’s 300 million), and its features enable the study of emotional states and responses to the topics being discussed or spread across the web.

Before the National Library of China launched this nationwide project, academics and research groups had already made progress on archiving Weibo. Here I will introduce several interesting cases.

1. Archiving the comments may help prevent suicides – a rich ‘digital mine’ in Zoufan’s last post

Zoufan posted her last words on Weibo on 18 March 2012. She was suffering from major depressive disorder and died by suicide shortly afterwards. Since that last post, other Weibo users have gradually found her account and continued to share their emotions or stories of depression as comments. There are more than one million comments now. This caught the attention of Tingshao Zhu and his colleagues from the Chinese Institute of Psychology. Earlier this year, they started investigating this case and devising a strategy for archiving these weibos so that they could connect with patients and help prevent other suicides.

By analyzing these digital archives, the research group found that a significant number of patients with depressive disorders express their suicidal thoughts by posting anonymously. The researchers used Python to scrape and analyze the comment text and further discovered that those who experience suicidal ideation interact with others less and are more inward-looking. Specifically, the proportion of emotionally positive words is less than 5%, and the proportion of negative words is more than 80%. They rarely express thoughts about “family” and “future”, but mention “death” and “freedom” frequently.
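To make the word-proportion measurement concrete, here is a minimal sketch of a lexicon-based count. It is not the researchers' code: the two word lists, the function name `emotion_proportions`, and the choice to report shares among emotion words only are all assumptions for illustration, and real Weibo comments would first need Chinese word segmentation.

```python
from collections import Counter

# Hypothetical mini-lexicons for illustration only; not the researchers' word lists.
POSITIVE = {"hope", "happy", "love"}
NEGATIVE = {"death", "tired", "alone", "pain"}


def emotion_proportions(tokens):
    """Return (positive_share, negative_share) among the emotion words found in tokens.

    Assumption: shares are computed over emotion words only, not over all tokens.
    """
    counts = Counter()
    for tok in tokens:
        word = tok.lower()
        if word in POSITIVE:
            counts["pos"] += 1
        elif word in NEGATIVE:
            counts["neg"] += 1
    total = counts["pos"] + counts["neg"]
    if total == 0:
        # No emotion words at all: report zero for both shares.
        return 0.0, 0.0
    return counts["pos"] / total, counts["neg"] / total


# A made-up, already tokenized comment (not taken from the archive).
comment = "so tired and alone , only pain , no hope".split()
pos_share, neg_share = emotion_proportions(comment)
```

For this made-up comment, one of the four emotion words is positive and three are negative, so the shares are 0.25 and 0.75. A real study would rely on a validated sentiment lexicon rather than hand-picked words.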

Tingshao Zhu and his colleagues then built an algorithm, trained on manually tagged data from the responses to Zoufan’s post, to recognize people at high risk of suicide among the numerous updates on Weibo and to classify the severity automatically. The team aims to use this algorithm, combined with their training in psychology, to identify people at high risk of suicide and reach out with the support they need. So far they have found 4,222 users with depressive disorders and offered them further advice, which we hope will have a profound influence on treating depression in the long run.

2. Archiving the reviews and reposts of @Yutu – the parasocial interaction on new media

Yutu literally means “jade rabbit”, which refers to the pet rabbit of the Moon goddess in a Chinese myth. On Weibo, @Yutu, the official account of the Chinese moon rover, has over 730,000 followers. It continues to post updates and news of its discoveries, as well as cute cartoons about its history and general knowledge about the universe, explaining complex concepts in a visual way.

In February 2014, it briefly went quiet during the lunar night, but after recovering from some mechanical difficulties (which were actually happening to the real rover on the Moon), it posted the message: “Hi, anybody there?” “I’m the rabbit that has seen the most stars!” This post attracted more than 840,00 reviews and 151,000 likes.

The scholar Feng Xian archived the reviews and reposts and then extracted the characters relating to emotional expressions, as well as the emojis. He found that 60% of the users posted compliments about the rover’s joyful ‘personality’ and 19% of users encouraged the rabbit/rover to keep going (as if it were a real person) while the rover itself was facing technical problems on the Moon. The reposting level also indicates a high penetration of Weibo content into the targeted audience. The researcher looked at six layers of reposting on the microblog: there were 2231 direct reposts (40%) and 1780 secondary reposts (32%), and the next four layers had 735 (13%), 231 (4%), 111 (2%) and 490 (9%). This indicates that after some users repost the original, their friends keep forwarding it through their social circles, much as a virus spreads.
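The percentages quoted for the six repost layers can be recomputed from the raw counts. The short sketch below does only that arithmetic; the variable names are illustrative, and the counts are the ones given in the text.

```python
# Repost counts per layer as reported in the text (layer 1 = direct reposts).
layer_counts = [2231, 1780, 735, 231, 111, 490]

# Total number of reposts across all six layers.
total = sum(layer_counts)

# Share of each layer, rounded to the nearest whole percent.
shares = [round(100 * n / total) for n in layer_counts]

# Share of reposts that came from beyond the first (direct) layer.
beyond_direct = round(100 * (total - layer_counts[0]) / total)

for layer, (n, pct) in enumerate(zip(layer_counts, shares), start=1):
    print(f"layer {layer}: {n} reposts ({pct}%)")
print(f"beyond direct reposts: {beyond_direct}%")
```

The counts sum to 5578, and the rounded shares match the figures in the text: 40%, 32%, 13%, 4%, 2% and 9%. About 60% of all reposts came from the second layer or deeper, which is what supports the comparison to viral spread.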

The study also investigated the interaction model between the rover account and social media users, to find out how to balance the personified mood with scientific knowledge about the exploration of the universe. Specifically, instead of imparting professional knowledge in the manner of an emotionless machine, the account established an equal relationship with its audience during the virtual interaction, which helped mobilize their enthusiasm to join the discussion. In addition, @Yutu combines new stories about the space rover with the classic cultural context of the Moon in China, updating its traditional meaning and spreading its posts to a wider audience.

Many more studies archive Weibo in order to analyze user behavior, communication trends and the spread of information through the network.

For example, a group of researchers from Hong Kong collected both Weibo and Twitter archives to understand the levels and spread of Ebola misinformation in 2013-2014. The researchers wrote a script to crawl Weibo data, as an API was not available at the time. They found that only 2% of their archive samples contained misinformed treatment options, compared with perhaps 50% or more reported in other studies of the spread of misinformation about Ebola treatments in Guinea, Liberia and Nigeria during the same year.

Social media such as Weibo provide new opportunities as well as new challenges for archivists in the Internet era, since these digital archives may require different technologies and management approaches, all of which deserve our attention.


References:

BBC News. (2016). China's Jade Rabbit rover dies on Moon. [online] Available at: https://www.bbc.co.uk/news/world-asia-china-36972205 [Accessed 1 Aug. 2019].

Fung, I., Fu, K., Chan, C., Chan, B., Cheung, C., Abraham, T. and Tse, Z. (2016). Social Media's Initial Reaction to Information and Misinformation on Ebola, August 2014: Facts and Rumors. Public Health Reports, 131(3), pp.461-473.

Tech.ifeng.com. (2019). Prevent suicides by archiving weibo with the help of AI. [online] Available at: http://tech.ifeng.com/a/20190426/45564761_0.shtml [Accessed 1 Aug. 2019].

Wang, M. (2019). China's national library to archive 200 billion Sina Weibo posts. [online] News.cgtn.com. Available at: https://news.cgtn.com/news/3d3d674d79677a4e34457a6333566d54/index.html [Accessed 1 Aug. 2019].

Yang, S., Xu, J. and Ye, P. (2018). Review of Online Sentiment Visualization Techniques. [online] Manu44.magtech.com.cn. Available at: http://manu44.magtech.com.cn/Jwk_infotech_wk3/article/2018/2096-3467/2096-3467-2-5-77.shtml [Accessed 1 Aug. 2019].
