Joe Carlsmith Audio
Audio versions of essays by Joe Carlsmith. Philosophy, futurism, and other topics. Text versions at joecarlsmith.com.
Does scheming lead to adequate future empowerment? (Section 2.3.1.2 of "Scheming AIs")
Joe Carlsmith
This is section 2.3.1.2 of my report “Scheming AIs: Will AIs fake alignment during training in order to get power?”
Text of the report here: https://arxiv.org/abs/2311.08379
Summary of the report here: https://joecarlsmith.com/2023/11/15/new-report-scheming-ais-will-ais-fake-alignment-during-training-in-order-to-get-power
Audio summary here: https://joecarlsmithaudio.buzzsprout.com/2034731/13969977-introduction-and-summary-of-scheming-ais-will-ais-fake-alignment-during-training-in-order-to-get-power
2.3.1.2.2 Even if the model’s values survive this generation of training, will they survive long