Accelerating transformers with mixture of experts attention
Discussion of a paper on reducing the compute and memory requirements of language models with mixture-of-experts attention, including how costs scale with sequence length and why the approach matters for open-source models.
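The summary above mentions mixture-of-experts attention only at a high level. As a rough illustration of the general idea, and not the specific method of the paper discussed in the episode, the hypothetical NumPy sketch below routes each token through only its highest-scoring expert value projection, so per-token projection compute stays roughly constant as the number of experts grows. The attention-score matrix itself remains quadratic in sequence length, which is where the sequence-length cost pressure noted in the summary comes from. All names, shapes, and the routing scheme here are invented for illustration.

# Hypothetical, minimal sketch of the general mixture-of-experts idea applied
# to attention: a learned router picks each token's best expert value
# projection, so only a fraction of the projection work runs per token.
# Not the exact method from the paper discussed in the episode.
import numpy as np

rng = np.random.default_rng(0)

seq_len, d_model = 8, 16      # toy sequence length and model width
num_experts, top_k = 4, 1     # route each token to its single best expert

x = rng.standard_normal((seq_len, d_model))

# Shared query/key projections; expert-specific value projections.
W_q = rng.standard_normal((d_model, d_model)) / np.sqrt(d_model)
W_k = rng.standard_normal((d_model, d_model)) / np.sqrt(d_model)
W_v = rng.standard_normal((num_experts, d_model, d_model)) / np.sqrt(d_model)
W_router = rng.standard_normal((d_model, num_experts))

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

# Router scores each token and keeps only its top-k experts.
router_logits = x @ W_router                              # (seq_len, num_experts)
chosen = np.argsort(router_logits, axis=-1)[:, -top_k:]   # expert ids per token
gates = softmax(router_logits, axis=-1)

# Compute values only through each token's chosen experts (sparse compute).
v = np.zeros((seq_len, d_model))
for t in range(seq_len):
    for e in chosen[t]:
        v[t] += gates[t, e] * (x[t] @ W_v[e])

# Standard scaled dot-product attention over the routed values; this part
# still costs O(seq_len^2), independent of the expert routing.
q, k = x @ W_q, x @ W_k
attn = softmax(q @ k.T / np.sqrt(d_model), axis=-1)       # (seq_len, seq_len)
out = attn @ v
print(out.shape)  # (8, 16)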
Our 148th episode with a summary and discussion of last week's big AI news!
Read our text newsletter and comment on the podcast at https://lastweekin.ai/
Email us your questions and feedback at contact@lastweekin.ai
Timestamps + links: