ICLR 2024 — Best Papers & Talks (ImageGen, Vision, Transformers, State Space Models) ft. Durk Kingma, Christian Szegedy, Ilya Sutskever

Latent Space: The AI Engineer Podcast

Efficiency and Performance in Generative Models: Exploring FastGen and LLaMA

This chapter explores the performance and efficiency of instruction-fine-tuned LLaMA models and of FastGen, a method that uses adaptive KV caching. It presents experimental results on memory trade-offs and model sizes, and discusses future research directions for optimizing LLaMA models.
