Information Hazards in Races for Advanced Artificial Intelligence

We study how the information environment affects races to implement a powerful new technology such as advanced artificial intelligence. In particular, we analyse a model in which a potentially unsafe technology may cause a “disaster” that affects all actors, and in which actors that implement the technology face a tradeoff between the safety of the technology and their performance in the race. Combining analytic and computational approaches, we solve for the perfect Bayesian equilibria under three scenarios regarding information about capabilities: unknown, private, and public. First, we show that more decisive races, in which small leads in performance produce larger probabilities of victory, are weakly more dangerous under most parameter values. Second, we show that revealing information about the capabilities of rivals has two opposing effects on disaster risk. The benefit is that actors may discover that they are sufficiently far apart in capability and therefore compete less. The cost is that actors may discover that they are close in capability and thus engage in a dangerous race to the bottom, cutting corners on safety to win. As a result, the information hazard result of Armstrong et al. [2016], namely that the public information scenario is more dangerous than the private information scenario, holds only under high decisiveness. As decisiveness decreases, the benefit dominates the cost, so that public knowledge of capabilities is welfare-improving. Third, in all information scenarios, we find that the larger the impact of the eventual loser on safety relative to that of the eventual winner, the more dangerous the race is, owing to a moral hazard effect.
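To make the notion of decisiveness concrete, the following is a minimal sketch that assumes a ratio-form contest success function. The abstract does not state the functional form used in the model, so the function `win_probability`, its parameters, and the example scores are all hypothetical and serve only to illustrate how a fixed small lead can translate into a larger probability of victory as decisiveness grows.

```python
def win_probability(score_a: float, score_b: float, decisiveness: float) -> float:
    """Probability that actor A wins the race.

    ASSUMPTION: a ratio-form contest success function,
    p_A = a^d / (a^d + b^d). This form is chosen for illustration
    and is not taken from the paper.
    """
    if score_a <= 0 or score_b <= 0:
        raise ValueError("performance scores must be positive")
    if decisiveness < 0:
        raise ValueError("decisiveness must be non-negative")
    a = score_a ** decisiveness
    b = score_b ** decisiveness
    return a / (a + b)


if __name__ == "__main__":
    # Hypothetical example: actor A holds a 10% performance lead.
    lead = (1.1, 1.0)
    for d in (1, 5, 20):
        # The same lead yields a higher win probability as d increases.
        print(f"decisiveness={d:>2}: P(A wins)={win_probability(*lead, d):.3f}")
```

Under this assumed form, equal scores always give a win probability of one half, while any positive lead is amplified as the decisiveness parameter increases, which is the sense in which a more decisive race rewards small leads more strongly.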
