Five lessons from helping run an AI-Biology RCT
Evidence-based AI policy is important but hard. We need more in-depth studies – which often don’t fit into commercial release cycles.
In this post, I share five lessons for evidence-based AI policy from running this RCT.
- Rigorous long-term studies don’t fit hectic commercial release schedules. We need an additional pipeline of RCTs to validate benchmarks, ideally run every six months.
- The main barrier to more RCTs is talent – especially excellent ops – not cost. We should pool efforts to build dedicated teams. I hope Active Site can find great hires.
- Many critical threat models will require RCTs that are substantially harder to design and execute. We should begin piloting new study designs for expert uplift now.
- RCTs are informative but have their own caveats. We don’t yet know if RCTs over- or underestimate ‘real-world’ AI-biology effects – and future studies should dig into this.
- AI firms should develop safeguards before RCTs reveal an urgent need to deploy them. Gaining experience now is valuable, as is deciding in advance which results should trigger deployment.
