Meryem Arik on LLM Deployment, State-of-the-art RAG Apps, and Inference Architecture Stack
Jun 10, 2024
Meryem Arik, Co-founder/CEO at TitanML, talks about the latest trends in generative AI and Large Language Model (LLM) technologies. She discusses LLM Deployment, state-of-the-art Retrieval Augmented Generation (RAG) apps, and the inference architecture stack for LLM applications. The conversation also touches on advancements in LLM technology, industry adoption, tips for LLM deployment, and the importance of AI regulation.
Deploying generative AI in regulated industries typically calls for on-prem or VPC solutions.
Rapid innovation in Gen AI includes smaller models matching GPT-3.5 performance and GPT-4o's native multimodal abilities.
Deep dives
Generative AI Solutions in Regulated Industries
Meryem Arik, Co-founder and CEO of TitanML, discusses deploying generative AI in regulated industries, focusing on the practicalities of LLM deployment in those environments. TitanML helps regulated enterprises deploy generative AI solutions within their own infrastructure, emphasizing on-prem or VPC deployment to overcome the infrastructure challenges these organizations face.
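As an illustration of the on-prem/VPC pattern discussed here, the sketch below shows an application talking to a self-hosted, OpenAI-compatible inference endpoint inside the company's own network rather than a public API. This is a minimal sketch, not TitanML's product API: the endpoint URL and model name are placeholders, and it assumes an OpenAI-compatible inference server (such as vLLM or similar) is already running inside the VPC.

```python
# Minimal sketch: calling a self-hosted, OpenAI-compatible LLM endpoint inside a VPC.
# Assumptions: an inference server (e.g. vLLM or similar) is already running at
# INTERNAL_ENDPOINT and exposes the OpenAI-compatible /v1 API; the URL and model
# name below are placeholders, not TitanML-specific values.
from openai import OpenAI

INTERNAL_ENDPOINT = "http://llm.internal.example.com/v1"  # traffic stays inside the VPC
MODEL_NAME = "meta-llama/Meta-Llama-3-8B-Instruct"        # whatever model the server hosts

client = OpenAI(base_url=INTERNAL_ENDPOINT, api_key="internal-placeholder-key")

response = client.chat.completions.create(
    model=MODEL_NAME,
    messages=[
        {"role": "system", "content": "You are an assistant for a regulated enterprise."},
        {"role": "user", "content": "Summarize our data-retention policy in two sentences."},
    ],
    temperature=0.2,
)
print(response.choices[0].message.content)
```

Because the application only depends on the OpenAI-compatible interface, the same client code works whether the model runs on-prem, in a VPC, or against a hosted API.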
Rapid Advancements in Gen AI and LLMs
Recent developments in generative AI, with Google Gemini updates and OpenAI's GPT-4o (Omni) model, showcase rapid innovation in Gen AI and LLM technologies. Meryem highlights the significant progress made within a few years, from GPT-2 models writing poetry to advanced real-time AI capabilities such as screen monitoring and audio feedback.
Trends in Model Size and Multimodality
Anticipated trends include improved capabilities from smaller models: an 8-billion-parameter LLM can now match the performance of GPT-3.5, demonstrating how far compact models have come. Moreover, emerging models like GPT-4o exhibit native multimodal abilities, enabling audio-to-audio conversations without textual intermediaries and pointing to exciting developments in multimodal AI.
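To ground the point about compact models, here is a minimal, hedged sketch of running a roughly 8-billion-parameter open-weights model locally with Hugging Face Transformers. The model name is only an example of that size class, not a specific model recommended in the episode.

```python
# Minimal sketch: running a compact (~8B-parameter) open-weights model locally.
# Assumptions: the model name is just an example of this size class; the machine
# has a GPU (or enough RAM) for the weights; `transformers`, `torch`, and
# `accelerate` are installed and the model weights are accessible.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="meta-llama/Meta-Llama-3-8B-Instruct",  # placeholder ~8B model choice
    device_map="auto",
    torch_dtype="auto",
)

prompt = "Explain retrieval augmented generation in one paragraph."
result = generator(prompt, max_new_tokens=120, do_sample=False)
print(result[0]["generated_text"])
```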
Self-hosted LLM Deployments for Enterprise Scale
Discussing the advantages and challenges of self-hosting LLMs, Meryem explores how self-hosting can pay off not only for large enterprises but also for mid-market businesses and scale-ups. She emphasizes the performance benefits of self-hosted models, which enable faster responses and a broader choice of models, presenting a cost-effective path to better-performing AI applications.
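One reason self-hosting can improve throughput and cost is batched inference on your own hardware. The sketch below uses vLLM's offline batch API as one example of a self-hosted serving engine; this is an illustrative assumption, not necessarily the stack discussed in the episode.

```python
# Minimal sketch: batched generation on self-hosted hardware with vLLM
# (one example of a self-hosted inference engine, chosen here for illustration).
# Requires `vllm` installed and a GPU; the model name is a placeholder.
from vllm import LLM, SamplingParams

llm = LLM(model="meta-llama/Meta-Llama-3-8B-Instruct")
params = SamplingParams(temperature=0.2, max_tokens=128)

prompts = [
    "Draft a one-line summary of our incident-response runbook.",
    "List three risks of sending customer data to a third-party API.",
]

# Batching many prompts on dedicated hardware is where much of the
# throughput and cost advantage of self-hosting comes from.
for output in llm.generate(prompts, params):
    print(output.prompt, "->", output.outputs[0].text.strip())
```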
In this podcast, Meryem Arik, Co-founder/CEO at TitanML, discusses the innovations in Generative AI and Large Language Model (LLM) technologies, including the current state of large language models, LLM deployment, state-of-the-art Retrieval Augmented Generation (RAG) apps, and the inference architecture stack for LLM applications.
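Since the episode covers state-of-the-art RAG apps, here is a minimal, hedged sketch of the basic retrieve-then-generate loop: embed documents, retrieve the most similar ones for a query, and build an augmented prompt. The embedding model, example documents, and in-memory index are illustrative assumptions, not recommendations from the episode.

```python
# Minimal RAG sketch: embed documents, retrieve the closest ones for a query,
# and build an augmented prompt. The embedding model and in-memory "index" are
# illustrative assumptions; production RAG apps typically add a vector database
# and a reranker on top of this basic loop.
import numpy as np
from sentence_transformers import SentenceTransformer

documents = [
    "Model deployments run entirely inside the customer's VPC.",
    "Quarterly audits require all model prompts to be logged.",
    "GPU quotas are managed by the platform engineering team.",
]

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # small, widely used embedding model
doc_vectors = embedder.encode(documents, normalize_embeddings=True)

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k documents whose embeddings are most similar to the query."""
    query_vector = embedder.encode([query], normalize_embeddings=True)[0]
    scores = doc_vectors @ query_vector  # cosine similarity (vectors are normalized)
    top = np.argsort(scores)[::-1][:k]
    return [documents[i] for i in top]

query = "Where do our model deployments run?"
context = "\n".join(retrieve(query))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}\nAnswer:"
print(prompt)  # this prompt would then be sent to the LLM, e.g. via the client sketch above
```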
Read a transcript of this interview: https://bit.ly/3X5ZVPu
Subscribe to the Software Architects’ Newsletter for your monthly guide to the essential news and experience from industry peers on emerging patterns and technologies:
www.infoq.com/software-architects-newsletter
Upcoming Events:
InfoQ Dev Summit Boston (June 24-25, 2024)
Actionable insights on today’s critical dev priorities.
devsummit.infoq.com/conference/boston2024
InfoQ Dev Summit Munich (Sept 26-27, 2024)
Practical learnings from senior software practitioners navigating Generative AI, security, modern web applications, and more.
devsummit.infoq.com/conference/munich2024
QCon San Francisco (November 18-22, 2024)
Get practical inspiration and best practices on emerging software trends directly from senior software developers at early adopter companies.
qconsf.com/
QCon London (April 7-9, 2025)
Discover new ideas and insights from senior practitioners driving change and innovation in software development.
qconlondon.com/
The InfoQ Podcasts:
Weekly inspiration to drive innovation and build great teams from senior software leaders. Listen to all our podcasts and read interview transcripts:
- The InfoQ Podcast www.infoq.com/podcasts/
- Engineering Culture Podcast by InfoQ www.infoq.com/podcasts/#engineering_culture
- Generally AI
Follow InfoQ:
- Mastodon: techhub.social/@infoq
- Twitter: twitter.com/InfoQ
- LinkedIn: www.linkedin.com/company/infoq
- Facebook: bit.ly/2jmlyG8
- Instagram: @infoqdotcom
- Youtube: www.youtube.com/infoq
Write for InfoQ:
Learn and share the changes and innovations in professional software development.
- Join a community of experts.
- Increase your visibility.
- Grow your career.
www.infoq.com/write-for-infoq