Deep dives
The Rise of RAGs: Optimizing and Differentiating
RAG (Retrieval-Augmented Generation) models are gaining popularity in the LLM (Large Language Model) space because they improve output quality and simplify deployment. One key optimization method discussed is the chunking and pre-processing of data for better retrieval results: the type and size of data chunks vary with the target application, such as conversational chatbots or blog-writing assistance. Evaluation tools and guardrails are also emphasized as necessary for reliable, trustworthy outputs. Multi-modal RAG, combining images and text, holds great potential but brings its own challenges, including compounded hallucinations. Overall, RAG offers a powerful approach to natural language processing, but careful consideration is needed to optimize and differentiate its usage.
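As a rough illustration of the chunking step described above, here is a minimal Python sketch; the window size, overlap, and function name are illustrative assumptions, not values recommended in the episode:

```python
# Minimal sketch of fixed-size chunking with overlap for RAG ingestion.
# The 512-character window and 64-character overlap are illustrative
# defaults (assumptions), not values from the episode.
def chunk_text(text: str, chunk_size: int = 512, overlap: int = 64) -> list[str]:
    """Split text into overlapping fixed-size windows."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

# A chatbot might use smaller chunks; a blog-writing assistant, larger ones.
sample = "Retrieval-Augmented Generation grounds LLM answers in retrieved data. " * 40
chunks = chunk_text(sample)
print(len(chunks), "chunks")
```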
Understanding Embeddings and the Role of Vector Databases
Embedding models are essential for converting data into numerical representations, enabling effective analysis and processing. In the RAG stack, these embeddings are stored in vector databases, which act as semantic stores supporting efficient mathematical operations such as similarity search. Selecting an embedding model may require training on domain-specific data for better accuracy, and taking the output of a model's second-to-last layer is one way to capture the semantic meaning of the input. Large Language Models (LLMs) such as GPT also play an important role in the RAG stack, but it is crucial to differentiate embedding models from LLMs, as they serve distinct purposes. Overall, understanding embeddings and making good use of vector databases are key to a successful RAG implementation.
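To make the second-to-last-layer idea concrete, here is a hedged sketch using Hugging Face transformers; the model name (bert-base-uncased) and mean pooling are assumptions for illustration, not choices prescribed in the episode:

```python
# Sketch: extract the second-to-last hidden layer of an encoder model
# and mean-pool it into a sentence embedding. Model choice and pooling
# strategy are illustrative assumptions.
import torch
from transformers import AutoModel, AutoTokenizer

model_name = "bert-base-uncased"  # assumption: any encoder model could be swapped in
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name, output_hidden_states=True)

inputs = tokenizer("RAG retrieves context before generating an answer.",
                   return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# hidden_states is a tuple of per-layer activations; [-1] is the final
# layer, so [-2] is the second-to-last layer discussed above.
second_to_last = outputs.hidden_states[-2]        # (batch, tokens, hidden_dim)
embedding = second_to_last.mean(dim=1).squeeze()  # mean-pool over tokens
print(embedding.shape)                            # torch.Size([768]) for BERT-base
```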
Considerations and Trade-offs in RAG Optimization
RAG optimization involves various considerations and trade-offs. Pre-processing, including chunking and splitting, affects the performance and relevance of RAG models; chunk size and context overlap depend on the specific application and how users interact with it. Storage and compute requirements must be balanced against the desired quality and latency. Evaluating RAG outputs, especially in multi-modal applications, helps catch hallucinations, and while multi-modal RAG offers exciting possibilities, compounded hallucinations remain a challenge. The episode also highlights the need for prompt engineering and convenient access to prompt customization for a better user experience. Overall, optimizing RAG means trading off compute, storage, latency, and quality, and each requires careful consideration.
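Prompt engineering over retrieved context can be as simple as a template. The sketch below is a hedged example; the wording and the top_k default are illustrative assumptions, and raising top_k shows the trade-off above, since more chunks mean more tokens, cost, and latency:

```python
# Hedged sketch of assembling a RAG prompt from retrieved chunks.
# Template wording and the top_k default are illustrative assumptions.
def build_prompt(question: str, retrieved_chunks: list[str], top_k: int = 3) -> str:
    # Including more chunks can raise answer quality but also increases
    # token count, cost, and latency: the trade-off discussed above.
    context = "\n\n".join(retrieved_chunks[:top_k])
    return (
        "Answer the question using only the context below. "
        "If the context is insufficient, say you don't know.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )

prompt = build_prompt(
    "What does chunk overlap do?",
    ["Overlap preserves context across chunk boundaries.",
     "Chunk size depends on the application."],
)
print(prompt)
```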
The Versatility and Limitations of RAGs
RAG is versatile across applications, excelling in use cases like chatbots, document search, and Q&A systems, but it has limits. It is not well suited to structured data tasks, such as user tracking or mapping specific attributes, and for some use cases, like fashion AI, plain similarity search suffices and a full RAG pipeline is unnecessary. The challenges of video search and video-to-video matching highlight the computational complexity and the risk of compounded hallucinations. The ongoing development of multi-modal RAG opens up new possibilities, but further research and advancements are required. As the RAG landscape continues to evolve, determining the most appropriate use cases will be essential for successful implementation.
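For cases like the fashion example, where similarity search alone suffices, no generation step is needed. This sketch uses sentence-transformers and cosine similarity; the model name and catalog items are illustrative assumptions:

```python
# Sketch: plain similarity search with no LLM generation step, for use
# cases where retrieval alone suffices. Model and data are illustrative.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumption: any embedding model works
catalog = ["red summer dress", "black leather jacket", "linen beach shirt"]
catalog_vecs = model.encode(catalog, normalize_embeddings=True)

query_vecs = model.encode(["lightweight warm-weather top"], normalize_embeddings=True)
scores = catalog_vecs @ query_vecs[0]   # cosine similarity on unit vectors
print(catalog[int(np.argmax(scores))])  # nearest catalog item; no LLM involved
```

At production scale, the catalog vectors would live in a vector database such as Milvus rather than an in-memory array; the retrieval logic stays the same.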
Yujian works as a Developer Advocate at Zilliz, where he develops and writes proof-of-concept tutorials for large language model applications. He also gives talks on vector databases, LLM apps, semantic search, and tangential spaces.
MLOps podcast #206 with Yujian Tang, Developer Advocate at Zilliz, "RAG Has Been Oversimplified," brought to us by our Premium Brand Partner, Zilliz
// Abstract
In the world of development, Retrieval Augmented Generation (RAG) has often been oversimplified. Despite the industry's push, the practical application of RAG reveals complexities beyond its apparent simplicity. This talk delves into the nuanced challenges and considerations developers encounter when working with RAG, providing a candid exploration of the intricacies often overlooked in the broader narrative.
// Bio
Yujian Tang is a Developer Advocate at Zilliz. He has a background as a software engineer working on AutoML at Amazon. Yujian studied Computer Science, Statistics, and Neuroscience with research papers published to conferences including IEEE Big Data. He enjoys drinking bubble tea, spending time with family, and being near water.
// MLOps Jobs board
https://mlops.pallet.xyz/jobs
// MLOps Swag/Merch
https://mlops-community.myshopify.com/
// Related Links
Website: zilliz.com
--------------- ✌️Connect With Us ✌️ -------------
Join our slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Catch all episodes, blogs, newsletters, and more: https://mlops.community/
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with Yujian on LinkedIn: linkedin.com/in/yujiantang
Timestamps:
[00:00] Yujian's preferred coffee
[00:17] Takeaways
[02:42] Please like, share, and subscribe to our MLOps channels!
[02:55] The hero of the LLM space
[05:42] Embeddings into Vector databases
[09:15] What is large and what is small LLM consensus
[10:10] QA Bot behind the scenes
[13:59] Fun fact: getting more context
[17:05] RAGs eliminate the ability of LLMs to hallucinate
[18:50] Critical part of the RAG stack
[19:57] Building citations
[20:48] Difference between context and relevance
[26:11] Missing prompt tooling
[27:46] Similarity search
[29:54] RAG Optimization
[33:03] Interacting with LLMs and tradeoffs
[35:22] RAGs not suited for
[39:33] Fashion App
[42:43] Multimodal RAGs vs LLM RAGs
[44:18] Multimodal use cases
[46:50] Video citations
[47:31] Wrap up