MLOps.community

Retrieval Augmented Generation

May 17, 2024
Syed Asad, an innovator and AI engineer, discusses Retrieval Augmented Generation (RAG), semantic vector search, and how vector databases are reshaping data landscapes. Topics include the complexities of AI model deployment, AI evaluation frameworks, challenges in getting client approval, and struggles with data ingestion in AI environments.
ANECDOTE

CSV Data Challenge

  • Syed Asad faced a production issue with a large CSV file (133 MB) for a RAG pipeline.
  • Various embedding models and vector databases failed to process the data efficiently.
ADVICE

Alternative to Embeddings

  • Consider data complexity and topic repetition when choosing embedding models.
  • For simpler data, alternative approaches such as the Parquet format and LlamaIndex may be more efficient.
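
As a rough illustration of the idea that simple tabular data may not need embeddings at all, the sketch below answers a question with a plain structured query. It does not use Parquet or LlamaIndex, which the episode names; it swaps in Python's standard `csv` and `sqlite3` modules so that it runs without extra dependencies. The sample data, the table name `items`, and both helper functions are hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical sample standing in for a large, simply structured CSV file.
SAMPLE_CSV = """product,category,price
kettle,kitchen,25.0
toaster,kitchen,30.0
lamp,lighting,15.5
"""


def load_csv_into_sqlite(csv_text: str) -> sqlite3.Connection:
    """Load CSV text into an in-memory SQLite table named 'items'."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    conn = sqlite3.connect(":memory:")
    columns = ", ".join(f'"{name}"' for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f"CREATE TABLE items ({columns})")
    # Skip blank lines so every inserted row matches the header width.
    rows = (row for row in reader if row)
    conn.executemany(f"INSERT INTO items VALUES ({placeholders})", rows)
    conn.commit()
    return conn


def find_by_category(conn: sqlite3.Connection, category: str) -> list:
    """Return (product, price) pairs for one category via an exact-match query."""
    cursor = conn.execute(
        "SELECT product, price FROM items WHERE category = ? ORDER BY product",
        (category,),
    )
    return [(product, float(price)) for product, price in cursor]


if __name__ == "__main__":
    connection = load_csv_into_sqlite(SAMPLE_CSV)
    print(find_by_category(connection, "kitchen"))
```

An exact-match lookup like this involves no embedding model or vector index. Whether it is a good fit depends on how structured the data is and how free-form the expected questions are.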
ADVICE

Inference Layer Exploration

  • Explore and test different inference solutions for production.
  • Consider factors like cost, performance, and logging capabilities when choosing a solution.
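
One way to compare candidate inference solutions on the factors listed above is a small harness that times each call, logs it, and estimates cost. The sketch below is a minimal version under stated assumptions: the two backends are hypothetical stand-in functions, not real services, and `cost_per_call` is a made-up flat price that would need to be replaced with a provider's actual pricing model.

```python
import logging
import statistics
import time
from dataclasses import dataclass
from typing import Callable, List

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("inference_benchmark")


@dataclass
class BenchmarkResult:
    name: str
    median_latency_s: float
    estimated_cost: float


def benchmark(
    name: str,
    infer: Callable[[str], str],
    prompts: List[str],
    cost_per_call: float,
) -> BenchmarkResult:
    """Time each call to `infer`, log it, and summarize latency and cost."""
    latencies = []
    for prompt in prompts:
        start = time.perf_counter()
        infer(prompt)
        elapsed = time.perf_counter() - start
        latencies.append(elapsed)
        logger.info("%s handled %r in %.4fs", name, prompt, elapsed)
    return BenchmarkResult(
        name=name,
        median_latency_s=statistics.median(latencies),
        estimated_cost=cost_per_call * len(prompts),
    )


# Hypothetical stand-ins for real inference backends.
def fake_backend_a(prompt: str) -> str:
    return prompt.upper()


def fake_backend_b(prompt: str) -> str:
    return prompt[::-1]


if __name__ == "__main__":
    test_prompts = ["summarize the report", "list the top products"]
    for backend_name, backend, price in [
        ("backend_a", fake_backend_a, 0.002),
        ("backend_b", fake_backend_b, 0.001),
    ]:
        print(benchmark(backend_name, backend, test_prompts, price))
```

In practice the prompts would be drawn from real traffic, and the comparison would also cover answer quality, which this sketch does not measure.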