Deep Papers

ChatGPT and InstructGPT: Aligning Language Models to Human Intention

Jan 18, 2023
Chapters
1. Introduction (00:00 • 2 min)
2. ML Observability and Alignment - Part 1 of 3 (01:35 • 2 min)
3. The Main Problems With Large Language Models (03:42 • 2 min)
4. Was the Term "Alignment" Coined by OpenAI? (05:50 • 2 min)
5. The InstructGPT Paper: What Is It? (07:22 • 2 min)
6. How Do You Train a Reward Model? (09:50 • 2 min)
7. How to Train a Reward Model to Do a Good Job (11:36 • 3 min)
8. Do You See Other Major Applications Just Skipping the First Step? (14:10 • 2 min)
9. How Did You Come Up With the Idea of InstructGPT? (16:07 • 2 min)
10. The Nature of Prompt Engineering Will Change Over Time (18:35 • 2 min)
11. Are There Other Major Benefits of RLHF? (20:35 • 4 min)
12. OpenAI (24:55 • 2 min)
13. Is the Jury Still Out on Large Language Models? (26:43 • 2 min)
14. The Next Generation of Language Models Is Going to Be Really, Really Powerful (28:30 • 2 min)
15. Is There a Future for Machine Learning? (30:45 • 3 min)
16. How Important Is the Reward Model? (33:45 • 2 min)
17. Is RLHF the Best Way to Fine-Tune Language Models? (35:33 • 2 min)
18. Is There a Way to Evaluate a Powerful Model? (37:10 • 2 min)
19. Long-Term Alignment Research (38:43 • 2 min)
20. Is There Anything You Can Point to That's Not Really Great? (40:26 • 2 min)
21. Is That Part of the Actual Training, or Is That the Fine-Tuning? (42:36 • 3 min)
22. I'd Like to Give ChatGPT and InstructGPT a Try (45:50 • 2 min)