

ChatGPT and InstructGPT: Aligning Language Models to Human Intention
Jan 18, 2023
Chapters
Introduction
00:00 • 2min
ML Observability and Alignment - Part 1 of 3
01:35 • 2min
The Main Problems With Large Language Models
03:42 • 2min
Was the Term Alignment Coined by OpenAI?
05:50 • 2min
The InstructGPT Paper - What Is It?
07:22 • 2min
How Do You Train a Reward Model?
09:50 • 2min
How to Train a Reward Model That Does a Good Job
11:36 • 3min
Do You See Other Major Applications Just Skipping the First Step?
14:10 • 2min
How Did You Come Up With the Idea for InstructGPT?
16:07 • 2min
The Nature of Prompt Engineering Will Change Over Time
18:35 • 2min
Are There Other Major Benefits of RLHF?
20:35 • 4min
OpenAI
24:55 • 2min
Is the Jury Still Out on Large Language Models?
26:43 • 2min
The Next Generation of Language Models Is Going to Be Really, Really Powerful
28:30 • 2min
Is There a Future for Machine Learning?
30:45 • 3min
How Important Is the Reward Model?
33:45 • 2min
Is RLHF the Best Way to Fine-Tune Language Models?
35:33 • 2min
Is There a Way to Evaluate a Powerful Model?
37:10 • 2min
Long-Term Alignment Research
38:43 • 2min
Is There Anything You Can Point to That's Not Really Great?
40:26 • 2min
Is That Part of the Actual Training, or Part of the Fine-Tuning?
42:36 • 3min
I'd Like to Give ChatGPT and InstructGPT a Try
45:50 • 2min