Yale researchers have developed a new way to track the spread of COVID-19

The world has never needed a fortune teller more than right now. We all want to pick up some tarot cards or gaze into a crystal ball and predict the future of this pandemic. Unfortunately, in reality, it isn’t that simple. Thankfully, researchers from Yale University have developed a new form of scientific fortune-telling that can rapidly track COVID-19’s spread and infection rate.

Families and individuals want to know what their lives will be like two, six, or ten months from now. Before that can happen, though, medical organizations, hospitals, and entire governments are doing their best to predict the immediate spread of the coronavirus across populations and areas. Accurately forecasting COVID-19’s path can potentially save countless lives by allowing local and federal authorities to get a jump on the virus and allocate limited resources to areas in need.

Of course, like everything else with this virus, its trajectory has proven quite difficult to predict. This new approach sets itself apart from other prediction models by using any available real-time data on population flows and movement, such as cell phone usage data. These immense datasets, which would normally be too large to use, make it possible to accurately track population movements in a given area.

“This work shows that it is possible to very accurately forecast the timing, intensity, and geographic distribution of the COVID-19 outbreak based on population movement alone,” says study co-author Nicholas A. Christakis, Sterling Professor of Social and Natural Science at Yale, in a university release. “Moreover, by tracking population flows in real-time, our model can provide policymakers and epidemiologists a powerful tool to limit an epidemic’s impact and save lives.”

To test their approach, the team at Yale collaborated with Chinese scientists to track the movements of 11.5 million people traveling through Wuhan, China, between January 1 and January 24. This was achieved using cell phone geolocation data provided by a major Chinese phone provider. The beginning of 2020 was when the virus first started to spread through Wuhan, and it coincided with the run-up to the Chinese Lunar New Year, so many locals were traveling to visit friends and family all over China. According to the data, people traveled through Wuhan to 296 prefectures across 31 provinces and regions throughout China.

That information was then compared to COVID-19 cases in China by location and date, provided by the Chinese CDC.

Using all of that data, the study’s authors were able to accurately predict the frequency of COVID-19 infections in China through February 19. Researchers also correctly forecast the number of confirmed cases and the areas in China that would develop high infection rates. This is no small accomplishment: the predictions for February were made using only movement data from January. If the same approach can be used to track infection rates in the United States in, say, June, that could go a long way towards ensuring those areas are better prepared.

“If there are more confirmed cases than expected ones, there is a higher risk of community spread. If there are fewer reported cases than expected, it means that the city’s preventive measures are particularly effective or it can indicate that further investigation by the central authorities is needed to eliminate possible risks from inaccurate measurement,” explains lead study author Jayson Jia, associate professor of marketing in the Faculty of Business and Economics at the University of Hong Kong.

“What is innovative about our approach is that we use misprediction to assess the level of community risk. Our model accurately tells us how many cases we should expect given travel data. We contrast this against the confirmed cases using the logic that what cannot be explained by imported cases and primary transmissions should be community spread,” Jia adds.
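The comparison Jia describes can be illustrated with a minimal Python sketch. It is not the study's actual model: it assumes the expected case count for each prefecture has already been produced by some mobility-based forecast, and the names `Prefecture` and `classify`, the 25 percent tolerance, and the sample numbers are all hypothetical, chosen only to show the logic.

```python
from dataclasses import dataclass

# Hypothetical labels for the three outcomes described in the quotes.
HIGHER = "more cases than expected: possible community spread"
LOWER = "fewer cases than expected: effective measures or measurement issues"
IN_LINE = "in line with expectations"


@dataclass
class Prefecture:
    name: str
    expected_cases: float  # assumed output of a mobility-based forecast
    confirmed_cases: int   # officially reported count


def classify(p: Prefecture, tolerance: float = 0.25) -> str:
    """Label a prefecture by how far confirmed cases deviate from expected.

    The tolerance is an illustrative assumption, not a value from the study.
    """
    if p.expected_cases <= 0:
        raise ValueError("expected_cases must be positive")
    # Relative gap between what was reported and what the forecast expected.
    gap = (p.confirmed_cases - p.expected_cases) / p.expected_cases
    if gap > tolerance:
        return HIGHER
    if gap < -tolerance:
        return LOWER
    return IN_LINE


if __name__ == "__main__":
    # Made-up numbers for illustration only.
    samples = [
        Prefecture("A", expected_cases=100, confirmed_cases=160),
        Prefecture("B", expected_cases=100, confirmed_cases=60),
        Prefecture("C", expected_cases=100, confirmed_cases=110),
    ]
    for p in samples:
        print(p.name, "->", classify(p))
```

A relative gap is used here so that large and small prefectures are judged on the same scale; the published model may measure misprediction differently.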

China’s response and reporting on COVID-19 have come under intense scrutiny as of late, and many believe that Chinese authorities have been less than transparent. For what it’s worth, these results seem to corroborate the Chinese CDC’s infection numbers. The predictions researchers made using data provided by a third-party, independent source (the cell phone carrier) correlated almost exactly with the official infection count.

It doesn’t have to be cell phone data either. Any dataset that offers a glimpse into people’s movements, such as toll booth records or train ticket sales, is applicable within the new model’s framework.

“People spread contagious diseases when they move,” Christakis concludes. “By accurately capturing population movements over time, we can predict how a contagion will spread geographically and use data-analytic techniques to help control it before a devastating epidemic erupts or re-erupts.”

COVID-19 has spread and disrupted all of our lives by working in silence and secret. This new tactic represents an opportunity for humanity to flip the script on the virus by being one step ahead.

The full study is published in Nature.