ThursdAI - The top AI news from the past week

💯 ThursdAI - 100th episode 🎉 - Meta Llama 4, Google tons of updates, ChatGPT memory, WandB MCP manifesto & more AI news

Apr 10, 2025
Celebrate a milestone with a deep dive into the AI landscape! Discover the buzz around Meta's Llama 4 launch and the avalanche of announcements from Google Next. Hear about advancements in open source AI, including transformative tools like Git MCP. Explore the performance race among AI models and how collaborative efforts are shaping the future. The hosts also highlight exciting projects aimed at enhancing transparency for developers. Plus, enjoy reflections on a year and a half of incredible AI progress!
ANECDOTE

Local Llama 4 Outperforms Hosted Version

  • Wolfram ran Llama 4 Scout locally with low precision.
  • Surprisingly, it outperformed the supposedly full-precision hosted version on Together AI.
INSIGHT

Llama 4 Architecture Creates Inference Challenges

  • Llama 4's architecture shift to Mixture of Experts (MoE) poses inference challenges.
  • This explains discrepancies between benchmarks and Meta's reported results, impacting local and hosted performance.
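
For readers unfamiliar with Mixture of Experts, the routing idea can be sketched in a few lines of plain Python. This is only a toy illustration of generic top-k routing, not Llama 4's actual implementation: the expert functions, the router logits, and the names `route_top_k` and `moe_forward` are all made up for this example. It shows why serving is awkward: every expert has to be loaded, yet each token only uses the few that the router picks.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(router_logits, k):
    """Return [(expert_index, weight)] for the k highest-scoring experts.

    Weights are renormalized so the chosen experts' weights sum to 1.
    """
    probs = softmax(router_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]

def moe_forward(x, experts, router_logits, k=1):
    """Run only the selected experts and mix their outputs by router weight."""
    chosen = route_top_k(router_logits, k)
    return sum(w * experts[i](x) for i, w in chosen)

# Hypothetical setup: four toy "experts" that just scale their input.
experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]
logits = [0.1, 2.0, -1.0, 0.5]  # made-up router scores for one token

chosen = route_top_k(logits, k=2)       # experts 1 and 3 are selected
single = moe_forward(1.0, experts, logits, k=1)  # only expert 1 runs
```

All four experts must sit in memory here, but with `k=1` only one of them does any work for this token. Real inference engines have to handle that imbalance across many tokens and GPUs.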
ADVICE

Use vLLM for Multi-GPU Inference

  • Use vLLM for efficient multi-GPU inference of large language models (LLMs).
  • Its tensor parallelism speeds up inference, especially for commercial hosting, where it is the dominant choice.
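
As a rough sketch of what this advice looks like in practice, vLLM's Python API takes a `tensor_parallel_size` argument that shards the model across GPUs. The model ID, GPU count, and the helper names `build_engine_kwargs` and `run_demo` below are assumptions for illustration and were not mentioned in the episode. Running the demo requires vLLM installed and a machine with that many GPUs.

```python
def build_engine_kwargs(model, num_gpus):
    """Build keyword arguments for vllm.LLM (hypothetical helper).

    tensor_parallel_size tells vLLM how many GPUs to shard each layer across.
    """
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    return {"model": model, "tensor_parallel_size": num_gpus}

def run_demo():
    """Generate one completion; needs vLLM installed and enough GPUs."""
    # Imported here so the helper above can be used without vLLM present.
    from vllm import LLM, SamplingParams

    # Assumed model ID and GPU count; substitute whatever you actually serve.
    kwargs = build_engine_kwargs("meta-llama/Llama-4-Scout-17B-16E-Instruct", 4)
    llm = LLM(**kwargs)

    params = SamplingParams(temperature=0.7, max_tokens=64)
    outputs = llm.generate(["What is tensor parallelism?"], params)
    for out in outputs:
        print(out.outputs[0].text)
```

Call `run_demo()` on a multi-GPU host to try it. The GPU count generally needs to divide the model's attention head count evenly, so powers of two are the usual choice.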