Latent Space: The AI Engineer Podcast

Information Theory for Language Models: Jack Morris

Jul 2, 2025
In this engaging discussion, Jack Morris, a PhD student at Cornell Tech, unpacks the relationship between information theory and large language models. He shares insights into how efficiently models like GPT-3 represent the data they are trained on. The conversation dives into embedding inversion, the ability to reconstruct text from its embeddings, and its implications for model alignment and security. Jack also explores the potential of emerging programming languages like Mojo, which aim to pair Python-like ergonomics with systems-level performance for AI research.
AI Snips

Learn Distributed Training Online

  • Grad students rarely receive formal instruction in multi-node distributed training.
  • Learn distributed training skills through online communities such as the PyTorch and DeepSpeed Discords (a minimal code sketch follows this list).
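For readers unfamiliar with what distributed training involves in practice, here is a minimal sketch of data-parallel training with PyTorch DistributedDataParallel, launched via torchrun; the model, data, and hyperparameters are illustrative assumptions, not material from the episode.

```python
# Minimal DistributedDataParallel (DDP) sketch.
# Assumes launch via: torchrun --nproc_per_node=<num_gpus> train.py
# The linear model and random batches are placeholders for illustration only.
import os

import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP


def main():
    # torchrun sets RANK, LOCAL_RANK, and WORLD_SIZE in the environment.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    model = torch.nn.Linear(512, 512).cuda(local_rank)   # placeholder model
    model = DDP(model, device_ids=[local_rank])
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

    for step in range(100):
        x = torch.randn(32, 512, device=local_rank)       # placeholder batch
        loss = model(x).pow(2).mean()
        loss.backward()        # DDP all-reduces gradients across ranks here
        optimizer.step()
        optimizer.zero_grad()

    dist.destroy_process_group()


if __name__ == "__main__":
    main()
```

Scaling the same script to multiple nodes mainly changes the launch command (node count and rendezvous arguments to torchrun) rather than the training loop, which is the kind of operational detail those community Discords tend to cover.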

Adopt Mojo for Fast Experimentation

  • Embrace new programming languages like Mojo for faster experimentation.
  • Efficiency enables quick experimentation on limited compute budgets.

Usable Information in Deep Learning

  • Traditional information theory doesn't capture how useful information actually is to a deep learning model.
  • Defining usable information under computational constraints better explains model behavior and learning (a formal sketch follows this list).
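One formalization consistent with this idea is usable (V-)information, where the observer is restricted to a predictor family $\mathcal{V}$, for example models trainable within a fixed compute budget. The following is a sketch of that definition, not a formula quoted from the episode:

$$H_{\mathcal{V}}(Y \mid X) = \inf_{f \in \mathcal{V}} \, \mathbb{E}\!\left[-\log f[X](Y)\right], \qquad I_{\mathcal{V}}(X \to Y) = H_{\mathcal{V}}(Y \mid \varnothing) - H_{\mathcal{V}}(Y \mid X)$$

If $\mathcal{V}$ contains all predictors, $I_{\mathcal{V}}$ recovers Shannon mutual information; with a restricted $\mathcal{V}$, information that is present but computationally hard to extract (for example, encrypted text) contributes little usable information, which matches how deep learning models actually behave.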