Joe Carlsmith Audio

Is Power-Seeking AI an Existential Risk?

January 24, 2023 Joe Carlsmith

Audio version of my report on existential risk from power-seeking AI. Text here: https://arxiv.org/pdf/2206.13353.pdf. Narration by Type III Audio.

Abstract
1 Introduction
1.1 Preliminaries
1.2 Backdrop
1.2.1 Intelligence
1.2.2 Agency
1.2.3 Playing with fire
1.2.4 Power
2 Timelines
2.1 Three key properties
2.1.1 Advanced capabilities
2.1.2 Agentic planning
2.1.3 Strategic awareness
2.2 Likelihood by 2070
3 Incentives
3.1 Usefulness
3.2 Available techniques
3.3 Byproducts of sophistication
4 Alignment
4.1 Definitions and clarifications
4.2 Power-seeking
4.3 The challenge of practical PS-alignment
4.3.1 Controlling objectives
4.3.1.1 Problems with proxies
4.3.1.2 Problems with search
4.3.1.3 Myopia
4.3.2 Controlling capabilities
4.3.2.1 Specialization
4.3.2.2 Preventing problematic improvements
4.3.2.3 Scaling
4.3.3 Controlling circumstances
4.4 Unusual difficulties
4.4.1 Barriers to understanding
4.4.2 Adversarial dynamics
4.4.3 Stakes of error
5 Deployment
5.1 Timing of problems
5.2 Decisions
Figure: assessment of the expected value of deployment
5.3 Key risk factors
5.3.1 Externalities and competition
5.3.2 Number of relevant actors
5.3.3 Bottlenecks on usefulness
5.3.4 Deception
5.4 Overall risk of problematic deployment
6 Correction
6.1 Take-off
6.2 Warning shots
6.3 Competition for power
6.4 Corrective feedback loops
6.5 Sharing power
7 Catastrophe
8 Probabilities
Acknowledgments