The InfoQ Podcast

Meryem Arik on LLM Deployment, State-of-the-art RAG Apps, and Inference Architecture Stack

Jun 10, 2024
Meryem Arik, Co-founder/CEO at TitanML, talks about the latest trends in generative AI and Large Language Model (LLM) technologies. She discusses LLM deployment, state-of-the-art Retrieval Augmented Generation (RAG) apps, and the inference architecture stack for LLM applications. The conversation also touches on advancements in LLM technology, industry adoption, tips for LLM deployment, and the importance of AI regulation.
ANECDOTE

Meryem Arik's Founding Story

  • Meryem Arik's background includes theoretical physics, philosophy, and enterprise experience.
  • She co-founded TitanML to bridge research advancements and enterprise infrastructure for AI deployment.
INSIGHT

Enormous LLM Progress and Potential

  • LLM technology has progressed astronomically from GPT-2 to multimodal models.
  • Even without further innovation, existing LLMs can unlock a decade of enterprise applications.
INSIGHT

Small Models and Multimodality Rise

  • Smaller LLMs like Llama 3 (8B parameters) match larger models like GPT-3.5.
  • Frontier models are advancing multimodality, enabling audio-to-audio conversations without text intermediates.