RAG Quality Starts with Data Quality // Adam Kamor // #262

Sep 20, 2024

Adam Kamor, co-founder of Tonic, discusses creating mock data while preserving data privacy. He explains why high-quality data matters for Retrieval-Augmented Generation (RAG) systems and covers challenges such as data documentation and chunking, strategies for managing sensitive information, and maintaining accuracy in retrieval. The conversation also touches on building effective data pipelines and the role of database tools in today’s AI landscape.

ANECDOTE

Naming Tonic

Adam Kamor discusses the challenge of naming Tonic.ai and its products.
The difficulty in finding available domain names led to the name "Tonic."

INSIGHT

Data Quality in RAG

Data quality is paramount for effective RAG systems, especially with private data.
Tonic Textual helps create high-quality data pipelines for RAG systems.

INSIGHT

NER Models for RAG

Tonic Textual uses NER models to identify sensitive and interesting entities in text.
It enhances RAG systems by addressing sensitive information and optimizing data chunks.
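
As a rough illustration of the idea in this snip, the sketch below redacts sensitive entities before splitting text into chunks for a RAG index. It does not use Tonic Textual or any real NER model: the regex patterns stand in for an entity recognizer, and the names `PATTERNS`, `find_entities`, `redact`, and `chunk_words` are hypothetical, chosen only for this example.

```python
import re

# Assumption: regex patterns stand in for an NER model here.
# A real pipeline would get entity spans from a trained recognizer.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "PHONE": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
}


def find_entities(text):
    """Return (start, end, label) spans sorted by start position."""
    spans = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            spans.append((match.start(), match.end(), label))
    return sorted(spans)


def redact(text):
    """Replace each detected entity with a placeholder such as [EMAIL]."""
    pieces = []
    cursor = 0
    for start, end, label in find_entities(text):
        if start < cursor:
            continue  # skip spans that overlap one already redacted
        pieces.append(text[cursor:start])
        pieces.append(f"[{label}]")
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces)


def chunk_words(text, max_words=50):
    """Split text into chunks of at most max_words whitespace-separated words."""
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]


def prepare_chunks(text, max_words=50):
    """Redact first, then chunk, so no chunk carries raw sensitive values."""
    return chunk_words(redact(text), max_words=max_words)
```

Redacting before chunking means each placeholder stays a single token, so a chunk boundary cannot split an entity in half. Fixed-size word chunks are only the simplest option; the episode does not specify a chunking strategy.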
