Inference Scaling and AI Governance

The shift from scaling up the compute used to pre-train AI systems (pre-training compute) to scaling up the compute used to run them (inference compute) may have profound effects on AI governance. The nature of these effects depends crucially on whether this new inference compute will primarily be used to improve model performance during external deployment or as part of a more complex training programme within the lab. Rapid scaling of inference-at-deployment would somewhat lower the importance of open-weight models (and of securing the weights of closed models), reduce the impact of the first human-level models, change the business model for frontier AI, reduce the need for power-intensive data centres, and potentially undermine AI governance measures that rely on training-compute thresholds. Rapid scaling of inference-during-training would have more ambiguous effects, ranging from a revitalisation of pre-training scaling to a form of recursive self-improvement via iterated distillation and amplification.
