Predicting the next pandemic before it's too late

The COVID-19 pandemic has served as a harsh reminder of our vulnerability to viruses. Scientists are now trying to turn the tables. Instead of reacting to an outbreak, they want to predict the next pandemic before it begins. The Viral Emergence Research Initiative, or Verena for short, is leading the way. This consortium combines virology with artificial intelligence and other disciplines to unravel the complex dynamics of viral transmission.

Only one to two percent mapped

The scope of the viral landscape is vast and largely unknown. Scientists estimate that we have currently mapped only one to two percent of all virus-host interactions in mammals. The rest of this network is completely invisible. This makes it nearly impossible to predict which virus will be the next to make the leap to humans.

Gigantic databases

Verena uses artificial intelligence to fill this enormous gap in our knowledge. To this end, the consortium has set up gigantic databases, such as VIRION and PHAROS. VIRION now contains data on more than 9,000 vertebrate viruses and their hosts. By analyzing this data with machine learning, researchers can identify patterns that remain invisible to the human eye. The algorithms predict which animal species are likely to carry new viruses and where these viruses can cross the species barrier. This helps to focus surveillance efforts precisely on areas where the risks are greatest.

A toolkit of algorithms

To solve these complex biological puzzles, Verena relies not on a single model but on a combination of different AI systems.

The consortium employs a so-called ensemble approach, in which multiple techniques are used together. Network science models represent viruses and their hosts as a graph, with a link for every known interaction. This helps researchers understand how viruses and hosts relate to one another within an ecosystem.
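To give a feel for how a network model can point to interactions nobody has observed yet, here is a minimal sketch in Python. It is not Verena's actual method: the virus and host names, the toy data, and the scoring rule (counting hosts a virus shares with viruses already found in a candidate host) are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical toy data: (virus, host) pairs that have already been observed.
KNOWN_LINKS = [
    ("virus_A", "bat"),
    ("virus_A", "civet"),
    ("virus_B", "bat"),
    ("virus_B", "civet"),
    ("virus_B", "human"),
    ("virus_C", "rodent"),
]


def build_graph(links):
    """Index the bipartite network in both directions."""
    hosts_of = defaultdict(set)    # virus -> hosts it is known to infect
    viruses_of = defaultdict(set)  # host -> viruses it is known to carry
    for virus, host in links:
        hosts_of[virus].add(host)
        viruses_of[host].add(virus)
    return hosts_of, viruses_of


def link_score(virus, host, hosts_of, viruses_of):
    """Count hosts that `virus` shares with each other virus known in `host`."""
    score = 0
    for other in viruses_of.get(host, set()):
        if other != virus:
            score += len(hosts_of[virus] & hosts_of[other])
    return score


def rank_missing_links(links):
    """Score every unobserved virus-host pair, highest score first."""
    hosts_of, viruses_of = build_graph(links)
    observed = set(links)
    candidates = []
    for virus in sorted(hosts_of):
        for host in sorted(viruses_of):
            if (virus, host) not in observed:
                score = link_score(virus, host, hosts_of, viruses_of)
                candidates.append((score, virus, host))
    candidates.sort(key=lambda c: (-c[0], c[1], c[2]))
    return candidates


if __name__ == "__main__":
    for score, virus, host in rank_missing_links(KNOWN_LINKS):
        print(f"{virus} -> {host}: score {score}")
```

In this toy network, the pair virus_A and human ranks first, because virus_A shares two hosts with a virus that already infects humans. Real models work on the same principle but with far richer data and more careful statistics.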

In addition, the researchers use classical machine learning methods, such as decision trees and random forests, which combine the votes of many such trees. These help sift through large amounts of ecological and climatological data and identify patterns.
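The idea behind a random forest can be sketched in a few lines of Python. This is a heavily simplified stand-in, not the consortium's code: each "tree" is a single yes-or-no threshold on one randomly chosen feature, and the features (mean temperature and host species richness), the records and the labels are invented for illustration. In practice researchers would use an established machine learning library.

```python
import random

# Hypothetical toy records:
# (mean temperature in degrees C, host species richness) -> 1 if a spillover was recorded.
DATA = [
    ((12.0, 5), 0), ((14.0, 8), 0), ((15.0, 6), 0), ((18.0, 9), 0),
    ((24.0, 30), 1), ((26.0, 35), 1), ((27.0, 28), 1), ((29.0, 40), 1),
]


def fit_stump(rows, feature):
    """Pick the threshold on one feature that misclassifies the fewest rows.

    The stump predicts 1 when the feature value is at or above the threshold.
    """
    best = None
    for x, _ in rows:
        threshold = x[feature]
        errors = sum(
            (1 if xi[feature] >= threshold else 0) != y for xi, y in rows
        )
        if best is None or errors < best[0]:
            best = (errors, threshold)
    return feature, best[1]


def fit_forest(rows, n_trees=25, seed=0):
    """Train many stumps, each on a resampled dataset and a random feature."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        sample = [rng.choice(rows) for _ in rows]  # bootstrap sample
        feature = rng.randrange(len(rows[0][0]))   # random feature choice
        forest.append(fit_stump(sample, feature))
    return forest


def predict(forest, x):
    """Return the fraction of stumps that vote for 'spillover'."""
    votes = sum(1 for feature, threshold in forest if x[feature] >= threshold)
    return votes / len(forest)


if __name__ == "__main__":
    forest = fit_forest(DATA)
    print(predict(forest, (30.0, 45)))  # warm, species-rich site
    print(predict(forest, (10.0, 3)))   # cool, species-poor site
```

Because every stump sees a slightly different slice of the data, no single quirk in the records dominates the outcome. The averaged vote is what makes the forest more robust than any one tree.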

Deep learning models are used for more detailed predictions at the molecular level. Tools from structural biology, such as AlphaFold, also play a role. These allow scientists to model how viral proteins interact with host cells, down to nearly the atomic scale.

The power of open science

A key pillar of Verena is the belief that science should be accessible. Much of the software and many of the databases it has developed are therefore available as open source on platforms such as GitHub. This open approach is essential for the reliability and reproducibility of the research. Although the initiative is largely funded by the U.S. National Science Foundation with a grant of $12.5 million, it operates as a global network. This philosophy aligns closely with European ambitions for so-called FAIR data. FAIR stands for findability, accessibility, interoperability, and reusability. European researchers actively use Verena’s infrastructure.

Misusing AI: a double-edged sword

Yet this far-reaching openness also carries significant risks. There is a constant tension between open science and security. The same powerful AI models used to prevent pandemics could, in theory, be misused. Malicious actors could use this technology to design synthetic viruses. Think of viruses with increased transmissibility or resistance. This dual-use dilemma compels the consortium to exercise extreme caution. To manage this risk, Verena follows a specific policy. Ecological and epidemiological trends are widely shared. However, access to extremely sensitive genetic sequences of dangerous pathogens is strictly regulated.

From computer model to global policy

Verena’s predictions are not meant to gather dust in academic libraries. They serve a practical purpose. The results are used to support and guide concrete policy. Although the models are not directly used for immediate political decisions, such as closing national borders, they do form the scientific basis for long-term choices. For example, research into the risks of the global trade in wild animals has provided crucial ammunition for international discussions on market regulation. In addition, national health agencies use Verena’s data to set priorities. It helps them determine which viruses are most urgent for active surveillance and the development of new vaccines.

A look to the future

The coming years are crucial for the project. Current funding from the National Science Foundation runs through 2027. Until then, the consortium aims to build on its record of more than a hundred publications and sixty trained researchers.