Meryem Arik on LLM Deployment, State-of-the-art RAG Apps, and Inference Architecture Stack
Jun 10, 2024
Meryem Arik, Co-founder/CEO at TitanML, talks about the latest trends in generative AI and Large Language Model (LLM) technologies. She discusses LLM Deployment, state-of-the-art Retrieval Augmented Generation (RAG) apps, and the inference architecture stack for LLM applications. The conversation also touches on advancements in LLM technology, industry adoption, tips for LLM deployment, and the importance of AI regulation.
Deploying generative AI in regulated industries typically calls for on-prem or VPC solutions.
Rapid innovation in Gen AI includes smaller models matching GPT-3.5 performance and GPT-4o's native multimodal abilities.
Deep dives
Generative AI Solutions in Regulated Industries
Meryem Arik, Co-founder and CEO of TitanML, discusses deploying generative AI in regulated industries, focusing on the practicalities of LLM deployment in those environments. TitanML helps regulated enterprises deploy generative AI solutions within their own infrastructure, emphasizing on-prem or VPC deployment to overcome the infrastructure challenges these organizations face.
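As an illustration of the on-prem/VPC pattern discussed here, the sketch below shows an application talking to a self-hosted, OpenAI-compatible inference endpoint inside the company's own network rather than a public API. This is a minimal sketch, not TitanML's product API: the endpoint URL and model name are placeholders, and it assumes an OpenAI-compatible inference server (such as vLLM or similar) is already running inside the VPC.

```python
# Minimal sketch: calling a self-hosted, OpenAI-compatible LLM endpoint inside a VPC.
# Assumptions: an inference server (e.g. vLLM or similar) is already running at
# INTERNAL_ENDPOINT and exposes the OpenAI-compatible /v1 API; the URL and model
# name below are placeholders, not TitanML-specific values.
from openai import OpenAI

INTERNAL_ENDPOINT = "http://llm.internal.example.com/v1"  # traffic stays inside the VPC
MODEL_NAME = "meta-llama/Meta-Llama-3-8B-Instruct"        # whatever model the server hosts

client = OpenAI(base_url=INTERNAL_ENDPOINT, api_key="internal-placeholder-key")

response = client.chat.completions.create(
    model=MODEL_NAME,
    messages=[
        {"role": "system", "content": "You are an assistant for a regulated enterprise."},
        {"role": "user", "content": "Summarize our data-retention policy in two sentences."},
    ],
    temperature=0.2,
)
print(response.choices[0].message.content)
```

Because the application only depends on the OpenAI-compatible interface, the same client code works whether the model runs on-prem, in a VPC, or against a hosted API.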
Rapid Advancements in Gen AI and LLMs
Recent developments in generative AI, with Google Gemini updates and OpenAI's GPT-4o (Omni) model, showcase rapid innovation in Gen AI and LLM technologies. Meryem highlights the significant progress made within a few years, from GPT-2 models writing poetry to advanced real-time AI capabilities such as screen monitoring and audio feedback.
Trends in Model Size and Multimodality
Anticipated trends include improved capabilities from smaller models: an 8-billion-parameter LLM can now match the performance of GPT-3.5, demonstrating how far compact models have come. Moreover, emerging models like GPT-4o exhibit native multimodal abilities, enabling audio-to-audio conversations without textual intermediaries and pointing to exciting developments in multimodal AI.
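To ground the point about compact models, here is a minimal, hedged sketch of running a roughly 8-billion-parameter open-weights model locally with Hugging Face Transformers. The model name is only an example of that size class, not a specific model recommended in the episode.

```python
# Minimal sketch: running a compact (~8B-parameter) open-weights model locally.
# Assumptions: the model name is just an example of this size class; the machine
# has a GPU (or enough RAM) for the weights; `transformers`, `torch`, and
# `accelerate` are installed and the model weights are accessible.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="meta-llama/Meta-Llama-3-8B-Instruct",  # placeholder ~8B model choice
    device_map="auto",
    torch_dtype="auto",
)

prompt = "Explain retrieval augmented generation in one paragraph."
result = generator(prompt, max_new_tokens=120, do_sample=False)
print(result[0]["generated_text"])
```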
Self-hosted LLM Deployments for Enterprise Scale
Discussing the advantages and challenges of self-hosting LLMs, Meryem explores how self-hosting can pay off not only for large enterprises but also for mid-market businesses and scale-ups. She emphasizes the performance benefits of self-hosted models, which enable faster responses and a broader choice of models, presenting a cost-effective path to better-performing AI applications.
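One reason self-hosting can improve throughput and cost is batched inference on your own hardware. The sketch below uses vLLM's offline batch API as one example of a self-hosted serving engine; this is an illustrative assumption, not necessarily the stack discussed in the episode.

```python
# Minimal sketch: batched generation on self-hosted hardware with vLLM
# (one example of a self-hosted inference engine, chosen here for illustration).
# Requires `vllm` installed and a GPU; the model name is a placeholder.
from vllm import LLM, SamplingParams

llm = LLM(model="meta-llama/Meta-Llama-3-8B-Instruct")
params = SamplingParams(temperature=0.2, max_tokens=128)

prompts = [
    "Draft a one-line summary of our incident-response runbook.",
    "List three risks of sending customer data to a third-party API.",
]

# Batching many prompts on dedicated hardware is where much of the
# throughput and cost advantage of self-hosting comes from.
for output in llm.generate(prompts, params):
    print(output.prompt, "->", output.outputs[0].text.strip())
```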
In this podcast, Meryem Arik, Co-founder/CEO at TitanML, discusses the innovations in Generative AI and Large Language Model (LLM) technologies, including the current state of large language models, LLM deployment, state-of-the-art Retrieval Augmented Generation (RAG) apps, and the inference architecture stack for LLM applications.
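Since the episode covers state-of-the-art RAG apps, here is a minimal, hedged sketch of the basic retrieve-then-generate loop: embed documents, retrieve the most similar ones for a query, and build an augmented prompt. The embedding model, example documents, and in-memory index are illustrative assumptions, not recommendations from the episode.

```python
# Minimal RAG sketch: embed documents, retrieve the closest ones for a query,
# and build an augmented prompt. The embedding model and in-memory "index" are
# illustrative assumptions; production RAG apps typically add a vector database
# and a reranker on top of this basic loop.
import numpy as np
from sentence_transformers import SentenceTransformer

documents = [
    "Model deployments run entirely inside the customer's VPC.",
    "Quarterly audits require all model prompts to be logged.",
    "GPU quotas are managed by the platform engineering team.",
]

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # small, widely used embedding model
doc_vectors = embedder.encode(documents, normalize_embeddings=True)

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k documents whose embeddings are most similar to the query."""
    query_vector = embedder.encode([query], normalize_embeddings=True)[0]
    scores = doc_vectors @ query_vector  # cosine similarity (vectors are normalized)
    top = np.argsort(scores)[::-1][:k]
    return [documents[i] for i in top]

query = "Where do our model deployments run?"
context = "\n".join(retrieve(query))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}\nAnswer:"
print(prompt)  # this prompt would then be sent to the LLM, e.g. via the client sketch above
```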
Read a transcript of this interview: https://bit.ly/3X5ZVPu
Subscribe to the Software Architects’ Newsletter for your monthly guide to the essential news and experience from industry peers on emerging patterns and technologies:
www.infoq.com/software-architects-newsletter
Upcoming Events:
InfoQ Dev Summit Boston (June 24-25, 2024)
Actionable insights on today’s critical dev priorities.
devsummit.infoq.com/conference/boston2024
InfoQ Dev Summit Munich (Sept 26-27, 2024)
Practical learnings from senior software practitioners navigating Generative AI, security, modern web applications, and more.
devsummit.infoq.com/conference/munich2024
QCon San Francisco (November 18-22, 2024)
Get practical inspiration and best practices on emerging software trends directly from senior software developers at early adopter companies.
qconsf.com/
QCon London (April 7-9, 2025)
Discover new ideas and insights from senior practitioners driving change and innovation in software development.
qconlondon.com/
The InfoQ Podcasts:
Weekly inspiration to drive innovation and build great teams from senior software leaders. Listen to all our podcasts and read interview transcripts:
- The InfoQ Podcast www.infoq.com/podcasts/
- Engineering Culture Podcast by InfoQ www.infoq.com/podcasts/#engineering_culture
- Generally AI
Follow InfoQ:
- Mastodon: techhub.social/@infoq
- Twitter: twitter.com/InfoQ
- LinkedIn: www.linkedin.com/company/infoq
- Facebook: bit.ly/2jmlyG8
- Instagram: @infoqdotcom
- Youtube: www.youtube.com/infoq
Write for InfoQ:
Learn and share the changes and innovations in professional software development.
- Join a community of experts.
- Increase your visibility.
- Grow your career.
www.infoq.com/write-for-infoq