LessWrong (Curated & Popular)

"(My understanding of) What Everyone in Technical Alignment is Doing and Why" by Thomas Larsen & Eli Lifland

Sep 11, 2022
Chapters
1
Introduction
00:00 • 2min
2
A Brief Introduction to Technical Alignment
02:04 • 3min
3
Aligned AI: The Problem, Current Approach, Summary, and Scale
04:48 • 2min
4
Scalable Oversight
07:14 • 3min
5
Aligning AI
09:51 • 2min
6
The Problem of Learning a Classifier
11:48 • 1min
7
Alignment Research Center - Eliciting Latent Knowledge
13:13 • 4min
8
Evaluating LM Power
16:59 • 3min
9
Anthropic Fine-Tuned a Language Model
19:34 • 2min
10
Neural Networks and Interpretability
21:30 • 2min
11
A Paradigm for Interpretability?
23:45 • 1min
12
Brain-Like AGI Safety
25:15 • 4min
13
AGI Alignment Strategy
29:14 • 2min
14
CIRL Is a Wrong-Way Reduction
30:54 • 2min
15
Conjecture, Conjectures, and Epistemology
33:17 • 3min
16
Refine Is an Incubator for Decorrelated Alignment Research
36:04 • 4min
17
How to Solve the Alignment Problem?
39:37 • 2min
18
Goal Misgeneralization Is Distinct From Mesa-Optimization
41:12 • 2min
19
DeepMind and Alignment
43:20 • 2min
20
A Linear Alignment Solution for Deep Learning
44:54 • 2min
21
How to Oversee Logical Reasoning
46:39 • 3min
22
Is Chain-of-Thought Deception a Problem by Default?
49:44 • 2min
23
Scalable Alignment Research
52:00 • 4min
24
Deception and Inner Alignment
56:09 • 2min
25
The Infinite Memory Constraints of Infra-Bayesianism
57:56 • 2min
26
Infra-Bayesian Physicalism (IBP)
01:00:22 • 2min
27
Is There a Future for Infra-Bayesianism?
01:02:35 • 2min
28
OpenAI - The Core Difficulties of Alignment
01:05:00 • 2min
29
Ought vs OpenAI Alignment
01:07:09 • 2min
30
Process-Based Systems Are More Aligned Than End-to-End Training
01:09:08 • 4min
31
The Oversight Team, Episode and Policy, and Change in Parameters
01:12:47 • 2min
32
Applied Alignment
01:14:35 • 2min
33
The Selection Theorems
01:16:14 • 2min
34
The Ontology Identifier Problem
01:18:03 • 2min
35
Learning Abstraction in Neural Networks
01:19:36 • 3min
36
Shard Theory, Truthful AI, and Owen Cotton-Barratt
01:23:03 • 2min
37
Existential and Strategy Alignment Research Groups
01:24:50 • 2min
38
How Easy Is It to Automate?
01:26:41 • 2min
39
How to Align Powerful AI
01:28:37 • 2min
40
AI Assisting a Pivotal Act
01:30:48 • 2min
41
The Technical Alignment Tax
01:32:39 • 2min