How AI Is Built

#045 RAG As Two Things - Prompt Engineering and Search

Mar 6, 2025
In this discussion, John Berryman, who moved from aerospace engineering into search and machine learning, explores the dual nature of retrieval-augmented generation (RAG). He argues that search and prompt engineering should be treated as separate problems so that each can be optimized on its own. Berryman shares prompting strategies built on familiar document structures, discusses testing with human evaluations, and covers how to manage token limits. He also explains the differences between chat and completion models and highlights practical techniques for building AI applications and workflows.
01:02:44

Podcast summary created with Snipd AI

Quick takeaways

  • RAG combines retrieval and generation; treating them as separate components makes each one easier to optimize.
  • Effective prompt engineering uses familiar structures and correct formatting, so that the prompt resembles the LLM's training data and the model responds more reliably.

Deep dives

Separation of Retrieval and Generation

Retrieval and generation are two distinct components of a RAG system. Treating them separately lets you optimize retrieval first, and only then work on how the retrieved information is presented to the model. Better retrieval means better context is selected for the model, which improves the system as a whole. The separation also helps with diagnosis: when an answer is wrong, you can check whether the search returned the wrong documents or the prompt presented the right documents poorly.
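
To make the separation concrete, here is a minimal sketch, not taken from the episode, that keeps the two stages in separate functions so each can be inspected on its own. Everything in it is an illustrative assumption: the `Doc` type, the `retrieve` and `build_prompt` names, the toy word-overlap scoring (a stand-in for a real search engine), and the Markdown-style prompt layout.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    doc_id: str
    text: str


def retrieve(query: str, corpus: list[Doc], k: int = 2) -> list[Doc]:
    """Search stage: rank documents by word overlap with the query (toy scoring)."""
    query_terms = set(query.lower().split())
    scored = [
        (len(query_terms & set(doc.text.lower().split())), doc) for doc in corpus
    ]
    # Drop documents with no shared terms, then sort best first.
    matches = [pair for pair in scored if pair[0] > 0]
    ranked = sorted(matches, key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in ranked[:k]]


def build_prompt(question: str, docs: list[Doc]) -> str:
    """Prompt stage: lay out the retrieved text in a familiar, document-like form."""
    sections = "\n\n".join(f"## Source {d.doc_id}\n{d.text}" for d in docs)
    return f"# Context\n\n{sections}\n\n# Question\n\n{question}\n\n# Answer\n\n"


# Hypothetical three-document corpus, for illustration only.
corpus = [
    Doc("a", "Chat models expect a list of role tagged messages"),
    Doc("b", "Completion models continue a single block of text"),
    Doc("c", "Token limits cap how much context fits in a prompt"),
]

question = "how do completion models continue text"

# Stage 1 can be checked without calling any model.
hits = retrieve(question, corpus)
print([d.doc_id for d in hits])

# Stage 2 can be checked by reading the prompt it produces.
prompt = build_prompt(question, hits)
print(prompt)
```

Because `retrieve` returns plain documents and `build_prompt` returns a plain string, a bad answer can be traced by looking at each output in turn: first whether the right documents came back, then whether the prompt presents them clearly.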
