How Big Data, AI, Wearables and Crowdsourcing Help to Predict and Mitigate Epidemics

In the past two decades, we have experienced major outbreaks including SARS and MERS. A new coronavirus disease, named COVID-19, has recently emerged, causing major global disruption. Can innovative technologies help predict and mitigate the new wave of epidemics and provide appropriate early warnings, so that the general public, business community, and local governments can prepare as early as possible to minimise the damage?

Predicting outbreaks with big data and artificial intelligence

In fact, as early as 2008, Google launched the influenza trend prediction platform “Google Flu Trends” (GFT).[1] By monitoring about 40 flu-related keywords such as “cough” and “fever”, GFT could predict flu outbreaks about a week in advance. In the early days of the platform, its accuracy was as high as 97% and GFT became one of Google’s poster children. From around 2011 to 2013, however, GFT’s accuracy declined, and it either overestimated or underestimated flu activity. Wired magazine found that GFT would perform well for two to three years, then fail significantly and require substantial revision. This may be because Google’s algorithm was vulnerable to overfitting to seasonal terms unrelated to the flu, or because it did not take into account changes in search behaviour over time.[2] Google eventually stopped publishing its flu forecasts in 2015.
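
To make the idea concrete, here is a minimal sketch of the kind of relationship such a platform relies on: a straight-line fit between the share of searches containing flu-related keywords and the reported rate of flu-like illness. The weekly figures and the function names are made up for illustration, and Google's production model was more elaborate and is not reproduced here.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares fit for y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical training weeks (illustrative numbers only):
# % of searches containing flu keywords vs % of doctor visits for flu-like illness
search_share = [0.8, 1.0, 1.5, 2.1, 2.6, 3.0]
ili_rate = [1.1, 1.3, 1.9, 2.8, 3.3, 3.9]

intercept, slope = fit_line(search_share, ili_rate)


def estimate_ili(share):
    """Estimate this week's flu-like illness rate from the search share."""
    return intercept + slope * share


this_week = estimate_ili(2.4)
```

A fit like this only holds while people keep searching the way they did in the training weeks. If a news story or a change in the search interface shifts behaviour, the same line will over- or underestimate, which is the weakness described above.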

While Google gave up the development of GFT, others made new attempts. BlueDot, a startup founded in Canada in 2014 by epidemiologist Kamran Khan, is committed to using big data and artificial intelligence to predict the spread of infectious diseases and to develop early warning systems, so that governments, medical institutions and business communities worldwide can be informed of the potential risks of any emerging epidemic.

BlueDot’s algorithm scans official reports, professional forums and online news sources in 65 languages around the clock to identify trends and keywords that need attention. It tracks around 150 different diseases, which amounts to a massive volume of data. Once the machine has flagged potential signals, the human experts on the BlueDot team review the information and train the machine to recognise whether it corresponds to an actual threat. On the last day of 2019, like finding a needle in a haystack, the BlueDot team noticed two keywords, “pneumonia” and “unknown cause”, in a Chinese news report that 27 people linked to a seafood market in Wuhan in Mainland China were infected with an unknown pneumonia. Even though the team didn’t know it was going to become a big global outbreak, they did see ingredients similar to those of SARS. On the morning of 31 Dec 2019, BlueDot therefore sent its first alert to its clients. The warning was issued more than a week before the World Health Organization (WHO)’s first statement on the occurrence of an unknown pneumonia in Wuhan.[3][4]
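
The two-step process described above, machine screening followed by human review, can be sketched roughly as follows. This is only an illustration under simple assumptions: the watch list, the `Report` class and the function names are hypothetical, and BlueDot's actual system (which works across many languages) is not public in this form.

```python
from dataclasses import dataclass

# Hypothetical watch list: a report is flagged only when every term in a group appears
WATCH_GROUPS = [
    ("pneumonia", "unknown cause"),
    ("haemorrhagic fever", "cluster"),
]


@dataclass
class Report:
    source: str
    text: str


def matched_groups(report):
    """Return the keyword groups whose terms all occur in the report text."""
    text = report.text.lower()
    return [group for group in WATCH_GROUPS if all(term in text for term in group)]


def review_queue(reports):
    """Collect (report, matched groups) pairs for human analysts to review."""
    queue = []
    for report in reports:
        hits = matched_groups(report)
        if hits:
            queue.append((report, hits))
    return queue


# Made-up example inputs
reports = [
    Report("local news", "Health officials report 27 cases of pneumonia of unknown cause linked to a seafood market."),
    Report("forum", "Seasonal pneumonia vaccination campaign begins next month."),
]
flagged = review_queue(reports)
```

Requiring several terms to appear together keeps routine mentions of a common word such as “pneumonia” out of the queue, so that the human experts only see the small number of items worth a closer look.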

Fitness wearables help real-time tracking of seasonal influenza outbreaks

In January this year, the medical journal “The Lancet” published a research report by the Scripps Research Translational Institute in the United States. According to the study, fitness wearables such as Fitbit could help improve real-time tracking of seasonal influenza outbreaks. The Scripps researchers found that when people get an infection, their resting heart rate tends to increase, and their daily activities and sleep patterns change. When the scientists added aggregated Fitbit variables, such as resting heart rate and sleep data, into a model that included the Centers for Disease Control and Prevention (CDC) surveillance data for flu-like illnesses from three weeks prior, they were able to significantly improve real-time predictions at the state level.[5][6]
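
As a rough illustration of how individual wearable readings could be turned into an aggregated signal, the sketch below flags a user when this week's resting heart rate is well above that user's own baseline, then reports the share of flagged users. The threshold, the data and the function names are assumptions made for this example, not the rule used in the Scripps study.

```python
from statistics import mean, stdev


def is_elevated(baseline_rhr, current_rhr, z_threshold=1.0):
    """True if this week's resting heart rate is well above the user's own baseline."""
    mu, sigma = mean(baseline_rhr), stdev(baseline_rhr)
    if sigma == 0:
        return current_rhr > mu
    return (current_rhr - mu) / sigma > z_threshold


def elevated_share(users):
    """Fraction of users flagged as elevated this week."""
    flags = [is_elevated(u["baseline"], u["this_week"]) for u in users]
    return sum(flags) / len(flags)


# Hypothetical, anonymised weekly averages in beats per minute
users = [
    {"baseline": [58, 59, 57, 58, 60], "this_week": 66},
    {"baseline": [64, 65, 63, 64, 66], "this_week": 64},
    {"baseline": [70, 71, 69, 72, 70], "this_week": 71},
    {"baseline": [61, 60, 62, 61, 60], "this_week": 68},
]
share = elevated_share(users)
```

A weekly share like this, computed per state, is the sort of aggregated variable that could sit alongside the older official surveillance figures in a prediction model.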

Because data from the World Health Organization (WHO) and the CDC generally lag by one to three weeks, real-time data from wearable devices could be a useful tool for medical institutions and governments to identify possible outbreaks much faster.

E-commerce and online payment data valuable for outbreak prediction

Thanks to the vast amount of data acquired by online retailers, it is not difficult for them to make fairly accurate predictions of different trends and behaviour based on online shopping and electronic payment data. For example, Target, one of the biggest U.S. retailers, has come up with a very accurate “pregnancy forecast” by analysing its customers’ shopping data. Target found that many women in the early stages of pregnancy would buy unscented moisturizers and, after a few weeks, start buying healthy food items that contain magnesium, calcium, zinc, and so on. Based on an individual customer’s changing shopping habits, the retailer could “guess” whether the customer is pregnant, and personalize its marketing accordingly. Predictions that are too accurate can sometimes backfire, though: Target once sent pregnancy shopping recommendations to a teenage girl before her parents had learnt about her pregnancy.[7]

Following the same logic, given the huge popularity of online shopping in Mainland China, identifying certain trends should be an easy task for e-commerce giants such as Taobao, as well as e-payment platforms such as Alipay and WeChat Pay. For example, any surge in sales of medications such as cough medicines and paracetamol, or even Vitamin C supplements, may indicate a possible flu outbreak. The purchase and product delivery locations can also help to highlight the areas of higher concern. Of course, whether such observations can be made public is another matter.
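
A simple way to picture such surge detection is to compare each delivery region's latest weekly sales with its own recent baseline. The sketch below is only an illustration: the sales figures, region names and threshold are invented, and no real platform's method is implied.

```python
from statistics import mean, stdev


def surge_regions(weekly_sales, z_threshold=3.0):
    """Flag regions whose latest week is far above their own recent baseline.

    weekly_sales maps a region name to a list of weekly unit sales, oldest first.
    Returns a dict of region -> z-score for the flagged regions.
    """
    flagged = {}
    for region, series in weekly_sales.items():
        history, latest = series[:-1], series[-1]
        sigma = stdev(history)
        if sigma == 0:
            continue  # no variation in the baseline, so the z-score is undefined
        z = (latest - mean(history)) / sigma
        if z > z_threshold:
            flagged[region] = round(z, 1)
    return flagged


# Hypothetical weekly cough medicine orders by delivery region
sales = {
    "Region A": [120, 130, 125, 118, 127, 310],
    "Region B": [80, 85, 78, 82, 84, 88],
}
alerts = surge_regions(sales)
```

Measuring each region against its own history means a busy city is not flagged merely for being large, while a sudden jump in a normally quiet area still stands out.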

Monitoring the possible virus spread by tracking traffic data

In late January this year, the University of Southampton in the United Kingdom, by tracking air traffic data from Wuhan and other major Mainland Chinese cities, predicted that places such as Bangkok, Hong Kong, and Taiwan were amongst the most at risk from the spread of the new coronavirus. Such early warnings could prompt the respective local governments to take the necessary precautionary measures to mitigate the risk.
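
In its simplest form, this kind of ranking treats the share of travellers heading to each destination as a proxy for the relative risk of importing cases. The sketch below shows that idea with invented passenger counts and placeholder city names. It is a simplification for illustration and not the method used in the Southampton study.

```python
def relative_risk(passengers_by_destination):
    """Rank destinations by their share of outbound passengers, highest first."""
    total = sum(passengers_by_destination.values())
    shares = {dest: n / total for dest, n in passengers_by_destination.items()}
    return sorted(shares.items(), key=lambda item: item[1], reverse=True)


# Hypothetical monthly passenger counts leaving an outbreak city
passengers = {"City Y": 31000, "City X": 52000, "City Z": 17000}
ranking = relative_risk(passengers)
```

Here `ranking` lists City X first, since it receives just over half of the travellers in this made-up data set.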

Taiwan, for example, took timely and decisive action, such as conducting health checks on passengers from Wuhan in early January, and rationing surgical masks and restricting the entry of passengers with a travel history in China from early February. The Taiwan Central Command Centre for Disease Control and other agencies issued daily mobile phone alerts to citizens about the latest confirmed cases and the places the patients had visited. These timely measures have kept confirmed cases in Taiwan to a minimum, despite its originally high risk status.[8]

Baidu Chinese citizen migration map

In 2014, “Baidu Maps” launched the “citizen migration” big data visualization project (「人群遷徙」大數據可視化項目). By analysing the platform’s positioning service data, Baidu can observe the trajectory and characteristics of population movement around the Chinese New Year. When the novel coronavirus began to spread in Wuhan at the end of 2019, “Baidu Migration Big Data” unexpectedly became the platform for observing population movement since the outbreak, and therefore helped to predict the risk levels of Mainland Chinese cities.[9]

To help Chinese citizens check whether they have come into close contact with a confirmed coronavirus carrier, the Chinese Government launched the “close contact meter” app (密切接觸者測量儀). Citizens only need to enter their name and ID card number to find out whether they have taken the same flight, train or bus as a confirmed COVID-19 patient. Inevitably, there are concerns about the privacy and security of the tool. The government agency emphasizes that all query data is stored and compared in encrypted form and cannot be decrypted.
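
One common way to compare identifiers without storing them in readable form is to keep only a keyed one-way hash of each ID and match on the hashes. The sketch below illustrates that general technique. It uses hashing rather than encryption, all keys, IDs and trip names are invented, and it is not a description of how the actual app works.

```python
import hashlib
import hmac

SECRET_KEY = b"demo-key-not-for-production"  # hypothetical server-side secret


def protect(id_number):
    """Keyed one-way hash, so raw ID numbers never need to be stored."""
    return hmac.new(SECRET_KEY, id_number.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical passenger manifests, stored only as hashed IDs per trip
manifests = {
    "flight-001": {protect("ID-1001"), protect("ID-1002"), protect("ID-1003")},
    "train-042": {protect("ID-2001"), protect("ID-2002")},
}
confirmed_cases = {protect("ID-1002")}


def shared_trips(id_number):
    """Trips that this person shared with at least one other confirmed case."""
    me = protect(id_number)
    return sorted(
        trip
        for trip, riders in manifests.items()
        if me in riders and (riders & confirmed_cases) - {me}
    )
```

In this made-up data, `shared_trips("ID-1001")` returns the flight shared with the confirmed case, while a passenger on the unaffected train gets an empty list.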

Crowdsourcing ideas for prevention tactics  

Taiwan, due to its proximity to and close economic ties with Mainland China, was predicted to be one of the highest-risk locations at the beginning of the coronavirus outbreak. The early and proactive prevention measures implemented by Taiwan’s Government have successfully protected the island. As of 8 March 2020, Taiwan still had fewer than 50 confirmed cases, far fewer than locations that were expected to have lower risks.

Real-time map of local surgical mask supplies in Taiwan

The openness of Taiwan’s Government could be one of the major reasons for this success. Taiwan’s Digital Minister, the 38-year-old Audrey Tang (唐鳳), in particular, has won praise and attracted a lot of international attention for her innovative crowdsourcing strategies in fighting the coronavirus threat. A genius software programmer and a former Sunflower Movement activist, the young minister has collected community ideas for combating the virus since the start of the outbreak. The most notable measure is the launch of a real-time map of local surgical mask supplies built on Google GPS and Place API. In collaboration with Taiwanese software engineers, Tang also created a citywide alert to keep residents and tourists aware of risky locations that had been visited by passengers of the Diamond Princess cruise ship.[10] Netizens in Japan and South Korea even joked that they would like to swap their respective IT ministers for Tang.

Technology is helpful, but human decisions are still the key

Today we have many more advanced technologies to help us predict, monitor and mitigate epidemics.  If used appropriately, the technologies are effective tools in fighting the battle.

Almost two months after the COVID-19 outbreak began, how well different countries and areas are coping with the novel virus teaches us an important lesson: the best technology cannot replace sound human judgement and decisions. Taiwan, Hong Kong and Singapore were predicted to have the highest risk of major outbreaks following China. Instead of taking chances, their governments and citizens took a proactive attitude and adopted timely and decisive measures, and that is the key reason these places have been able to contain the impact of the coronavirus to a manageable level so far. With more information and tested tactics, I hope the situation in other countries can improve soon as well.


Note: This article is the updated version of my Chinese blog post published on 26 Feb 2020 in Stand News Hong Kong.

[7] Big Data: A Revolution That Will Transform How We Live, Work, and Think, by Viktor Mayer-Schönberger and Kenneth Cukier