This web app is a tool for exploring the dynamics of risky AI competition: the safety-performance tradeoff. Will AI safety breakthroughs always lead to safer AI systems?

Before long, we may be capable of creating AI systems with high levels of general intelligence. Such capable systems may pose considerable risks when misaligned. Yet suppose that, thanks to incredible work by the AI safety community, we make a breakthrough that enables us to align many (but not all) AI systems, making them act according to our values at a relatively low performance cost. We communicate this research to all relevant companies and governments, so everyone knows how to make their AI systems safer. Fast forward several years, and we may find that, despite these efforts, the AI systems companies deploy are just as risky as they were before the breakthrough.

Why did this happen? One way to examine this question is through the safety-performance tradeoff model (SPT model) of AI competition, displayed in this interactive web app.
December 12, 2022
Robert Trager, Paolo Bova, Allan Dafoe, Nicholas Emery and Eoghan Stafford