Five lessons from helping run an AI-Biology RCT
Evidence-based AI policy is important but hard. We need more in-depth studies – which often don’t fit into commercial release cycles.
In this post, I share five lessons for evidence-based AI policy from running this RCT.
- Rigorous long-term studies don’t fit hectic commercial release schedules. We need an additional pipeline of RCTs to validate benchmarks, ideally run every six months.
- The main barrier to more RCTs is talent – especially excellent ops – not cost. We should pool efforts to build dedicated teams. I hope Active Site can find great hires.
- Many critical threat models will require RCTs that are substantially harder to design and execute. We should begin piloting new study designs for expert uplift now.
- RCTs are informative but have their own caveats. We don’t yet know if RCTs over- or underestimate ‘real-world’ AI-biology effects – and future studies should dig into this.
- AI firms should develop safeguards before RCTs reveal an urgent need to deploy them. Gaining experience now is valuable, as is deciding in advance which results should trigger deployment.
