A new study has found that advanced artificial intelligence models now outperform PhD-level virologists at solving complex lab-based problems, prompting both excitement and concern among experts in public health and biosecurity.
Conducted by researchers from the Center for AI Safety, MIT Media Lab, Brazil's Federal University of ABC (UFABC), and the nonprofit SecureBio, the study tested the ability of both human experts and AI systems to troubleshoot highly complex virology lab scenarios. The findings were shared exclusively with TIME.
The AI models tested, including OpenAI's o3 and Google's Gemini 2.5 Pro, significantly outperformed the human experts: o3 scored 43.8% and Gemini 2.5 Pro 37.6% on difficult practical lab questions, while the human virologists averaged just 22.1% even in their own fields of expertise.
“These results make me a little nervous,” said Seth Donoughe, a SecureBio scientist and co-author of the study. “For the first time, virtually anyone could access an expert-level guide on how to carry out sensitive biological procedures — with no oversight.”
The questions posed in the study mirrored real-world lab challenges that aren't typically answered in academic literature or online. They included troubleshooting virus culturing protocols, the kind of knowledge often gained only through years of hands-on lab experience.
The potential upside of this capability is significant. AI could speed up disease detection and vaccine development and improve diagnostics worldwide. “These models could help scientists in lower-resourced countries conduct meaningful research on local diseases,” said Dr. Tom Inglesby of the Johns Hopkins Center for Health Security.
However, the same tools could also be misused. Experts warn that powerful AI systems could assist malicious actors in developing biological weapons without the training typically required to work in high-security labs.
In response to the study, several AI companies took preliminary steps to address the risks. Elon Musk’s xAI published a risk management framework in February. OpenAI said it implemented new biosecurity safeguards in its latest models and ran a thousand-hour red-teaming campaign to test them. Anthropic noted the study’s results in its model system cards, while Google declined to comment.
Researchers are now calling for stronger regulation. “Self-regulation alone isn’t enough,” said Inglesby. “We need governments to step in with policies to evaluate AI systems before they’re released — especially if they could be misused to cause pandemics.”
As AI becomes more adept in scientific domains, the challenge ahead lies in balancing innovation with responsibility: ensuring that tools designed to save lives don’t inadvertently put them at risk.