
LessWrong (Curated & Popular) "(My understanding of) What Everyone in Technical Alignment is Doing and Why" by Thomas Larsen & Eli Lifland
Sep 11, 2022
Chapters
Introduction
00:00 • 2min
A Brief Introduction to Technical Alignment
02:04 • 3min
Aligned AI: Problem, Current Approach, Summary, and Scale
04:48 • 2min
Scalable Oversight
07:14 • 3min
Aligning AI
09:51 • 2min
The Problem of Learning a Classifier
11:48 • 1min
Alignment Research Center - Eliciting Latent Knowledge
13:13 • 4min
Evaluating LM Power
16:59 • 3min
Anthropic Fine-Tuned a Language Model
19:34 • 2min
Neural Networks and Interpretability
21:30 • 2min
A Paradigm for Interpretability?
23:45 • 1min
Brain-Like AGI Safety
25:15 • 4min
AGI Alignment Strategy
29:14 • 2min
CIRL Is a Wrong-Way Reduction
30:54 • 2min
Conjecture: Conjectures and Epistemology
33:17 • 3min
Refine: An Incubator for Decorrelated Alignment Research
36:04 • 4min
How to Solve the Alignment Problem?
39:37 • 2min
Goal Misgeneralization Is Distinct From Mesa-Optimization
41:12 • 2min
DeepMind and Alignment
43:20 • 2min
An Alignment Solution for Deep Learning
44:54 • 2min
How to Oversee Logical Reasoning
46:39 • 3min
Is Chain-of-Thought Deception a Problem by Default?
49:44 • 2min
Scalable Alignment Research
52:00 • 4min
Deception and Inner Alignment
56:09 • 2min
The Infinite Memory Constraints of Infra-Bayesianism
57:56 • 2min
Infra-Bayesian Physicalism (IBP)
01:00:22 • 2min
Is There a Future for Infra-Bayesianism?
01:02:35 • 2min
OpenAI: The Core Difficulties of Alignment
01:05:00 • 2min
Ought vs. OpenAI on Alignment
01:07:09 • 2min
Process-Based Systems Are More Aligned Than End-to-End Training
01:09:08 • 4min
The Oversight Team, Episode and Policy, and Change in Parameters
01:12:47 • 2min
Applied Alignment
01:14:35 • 2min
The Selection Theorems
01:16:14 • 2min
The Ontology Identification Problem
01:18:03 • 2min
Learning Abstraction in Neural Networks
01:19:36 • 3min
Shard Theory, Truthful AI, and Owen Cotton-Barratt
01:23:03 • 2min
Existential and Strategy Alignment Research Groups
01:24:50 • 2min
How Easy Is It to Automate?
01:26:41 • 2min
How to Align Powerful AI
01:28:37 • 2min
AI-Assisted Pivotal Act
01:30:48 • 2min
The Technical Alignment Tax
01:32:39 • 2min
