Sam Partee on Retrieval Augmented Generation (RAG)

Jan 29, 2024

Sam Partee, principal engineer at Redis, discusses Redis' vector database offering, different approaches to embeddings, enhancing language models with search components, and the use of hybrid search in Redis. The conversation also explores potential applications of retrieval augmented generation (RAG) and the challenges of running large language models on-prem.

ANECDOTE

Sam Partee's Redis Vector Journey

Sam Partee shared his journey integrating Redis with vector databases and embedding frameworks like LangChain.
He learned many best practices through direct customer work and open source development.

ADVICE

Choose Vector DB by Use Case

Choose a vector database based on your use case: Redis, for example, suits real-time workloads where vectors change frequently.
For large static datasets, cheaper solutions like FAISS on S3 may be more practical.

ADVICE

Best Use Cases for Redis Vectors

Redis vectors excel at chat conversation memory, semantic caching, and live recommendation systems that need low latency.
Avoid using Redis as a vector database for static, low-QPS use cases, where it can be cost-ineffective.

Live from the venue of the QCon San Francisco Conference, we are talking with Sam Partee, principal engineer at Redis. In this podcast, Sam shares his insights on Redis' vector database offering, different approaches to embeddings, how to enhance large language models by adding a search component for retrieval augmented generation, and the use of hybrid search in Redis.

Read a transcript of this interview: https://bit.ly/3HzToDL

Subscribe to the Software Architects’ Newsletter for your monthly guide to the essential news and experience from industry peers on emerging patterns and technologies: https://www.infoq.com/software-architects-newsletter

Upcoming Events:
- QCon London (April 8-10, 2024) Discover new ideas and insights from senior practitioners driving change and innovation in software development. https://qconlondon.com/
- InfoQ Dev Summit Boston (June 24-25, 2024) Actionable insights on today’s critical dev priorities. https://devsummit.infoq.com/
- QCon San Francisco (November 18-22, 2024) Get practical inspiration and best practices on emerging software trends directly from senior software developers at early adopter companies. https://qconsf.com/

The InfoQ Podcasts: Weekly inspiration to drive innovation and build great teams from senior software leaders. Listen to all our podcasts and read interview transcripts:
- The InfoQ Podcast https://www.infoq.com/podcasts/
- Engineering Culture Podcast by InfoQ https://www.infoq.com/podcasts/#engineering_culture
- Generally AI Podcast https://www.infoq.com/generally-ai-podcast/

Follow InfoQ:
- Mastodon: https://techhub.social/@infoq
- Twitter: twitter.com/InfoQ
- LinkedIn: www.linkedin.com/company/infoq
- Facebook: bit.ly/2jmlyG8
- Instagram: @infoqdotcom
- Youtube: www.youtube.com/infoq

Write for InfoQ: Learn and share the changes and innovations in professional software development.
- Join a community of experts.
- Increase your visibility.
- Grow your career.
https://www.infoq.com/write-for-infoq