Five lessons from having helped run an AI-Biology RCT

Evidence-based AI policy is important but hard. We need more in-depth studies, which often don't fit into commercial release cycles.

In this post, I share five lessons for evidence-based AI policy from helping run an AI-biology RCT.

  1. Rigorous long-term studies don’t fit hectic commercial release schedules. We need an additional pipeline of RCTs to validate benchmarks, ideally run every six months.
  2. The main barrier to more RCTs is talent, especially excellent operations staff, not cost. We should pool efforts to build dedicated teams. I hope Active Site can find great hires.
  3. Many critical threat models will require RCTs that will be substantially harder to design and execute. We should begin piloting new study designs for expert uplift now.
  4. RCTs are informative but have their own caveats. We don't yet know whether RCTs over- or underestimate 'real-world' AI-biology effects, and future studies should dig into this.
  5. AI firms should develop safeguards before RCTs reveal an urgent need to deploy them. Gaining deployment experience now is valuable, as is deciding in advance which results would trigger deployment.

