Information Theory for Language Models: Jack Morris

Latent Space: The AI Engineer Podcast

Exploring Gemma 3n and AI Model Dynamics

This chapter discusses the launch of the Gemma 3n language model, focusing on how it merges multiple modalities and how modular adapters extend its functionality. It examines the potential of smaller, parameter-efficient models to reach high performance through innovative training and architecture, along with the complexities of how language models store and use information. The chapter also highlights recent research on model architecture, the impact of the Morris constant on performance, and the challenges of recovering training data from model weights.
