
Punching Cards

On Large Language Models - Season 2, Episode 1

Jan 9, 2025
Sepp Hochreiter, a pioneer behind the LSTM model, and Alan Akbik, an expert in NLP, dive into the world of large language models. They discuss the evolution of language models and the potential of XLSTMs in AI coding. The conversation covers the advantages of LSTMs versus transformers and introduces XLSTMs, emphasizing their role in generative AI. They also touch on the cultural barriers to applying AI innovations in Europe, and the balance needed between AI use and traditional academic practices.
39:03

Podcast summary created with Snipd AI

Quick takeaways

  • The extended LSTM (XLSTM) model improves memory efficiency and processing of longer sequences, aiming to regain relevance in AI applications.
  • Regulatory hurdles surrounding AI technologies in Europe pose challenges for the deployment of XLSTM, highlighting the need for balance between innovation and legislation.

Deep dives

The Return of LSTM: Sepp Hochreiter's Extended LSTM (XLSTM) Model

The long short-term memory (LSTM) model, invented by Sepp Hochreiter in the 1990s, revolutionized language processing applications, including Siri and Google Translate. However, the emergence of transformers, particularly the generative pre-trained transformer (GPT) architecture, largely overshadowed LSTM, because transformers can process sequence data in a far more parallelizable manner. Hochreiter has now introduced the extended LSTM (XLSTM), which aims to retain the functional advantages of LSTM while addressing the scalability issues faced during training and application. The new model offers improvements in memory efficiency, allowing it to handle longer sequences more effectively and potentially regain relevance in modern AI applications.
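The parallelism contrast above can be sketched with a toy example. This is not the actual XLSTM or GPT architecture (the weights, gating, and attention here are deliberately simplified illustrations): an LSTM-style recurrence must step through time because each hidden state depends on the previous one, while attention-style mixing computes every output position independently over the whole sequence.

```python
import math

T, d = 6, 2  # toy sequence length and feature size
x = [[0.1 * (t + 1), 0.05 * t] for t in range(T)]  # made-up input sequence

# Recurrent (LSTM-like) processing: h_t depends on h_{t-1},
# so the T steps must run one after another (sequential over time).
h = [0.0] * d
hs = []
for t in range(T):
    h = [math.tanh(x[t][i] + 0.5 * h[i]) for i in range(d)]  # toy update rule
    hs.append(h)

# Attention-like mixing: each output is a weighted sum over ALL
# positions, and every row can be computed independently (parallel).
def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

attn_out = []
for q in x:  # each iteration is independent of the others
    scores = [math.exp(dot(q, k) / math.sqrt(d)) for k in x]
    z = sum(scores)
    weights = [s / z for s in scores]
    attn_out.append(
        [sum(weights[j] * x[j][i] for j in range(T)) for i in range(d)]
    )

print(len(hs), len(attn_out))  # → 6 6
```

The sequential loop in the first half is the bottleneck that made classic LSTMs slow to train on modern parallel hardware; the second half shows why attention maps so well onto it.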
