Programming Throwdown

172: Transformers and Large Language Models

Mar 11, 2024
Exploring transformers and large language models, covering latent variables, encoders, decoders, attention layers, and the history of RNNs. Also discussing the vanishing gradient problem that LSTMs were designed to mitigate, self-supervised learning, and direct preference optimization (DPO).
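Since attention layers are central to the episode, here is a minimal NumPy sketch of standard scaled dot-product attention as described in "Attention Is All You Need" (Vaswani et al., 2017). This is not code from the episode; the function name, shapes, and toy inputs are illustrative.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Compute attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V.

    Q: (seq_len_q, d_k) queries
    K: (seq_len_k, d_k) keys
    V: (seq_len_k, d_v) values
    """
    d_k = Q.shape[-1]
    # Similarity of every query with every key, scaled so softmax
    # gradients stay well-behaved as d_k grows.
    scores = Q @ K.T / np.sqrt(d_k)
    # Row-wise softmax turns scores into attention weights.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Each output row is a weighted average of the value vectors.
    return weights @ V

# Toy self-attention: 3 tokens with 4-dimensional embeddings.
rng = np.random.default_rng(0)
x = rng.normal(size=(3, 4))
out = scaled_dot_product_attention(x, x, x)
print(out.shape)  # (3, 4)
```

In a full transformer this operation runs across multiple heads with learned projections of the input, but the weighted-average core shown here is the same.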