

Fine-tuning vs RAG (Practical AI #238)
Sep 6, 2023
Demetrios, from the MLOps Community, joins the podcast to discuss fine-tuning vs. retrieval augmented generation. They also talk about OpenAI Enterprise, the MLOps Community LLM survey results, and the orchestration and evaluation of generative AI workloads.
Fine-tuning Misconceptions
- Fine-tuning LLMs is often conflated with fine-tuning diffusion models like Stable Diffusion, but the two workflows differ substantially.
- Retrieval augmented generation (RAG) better serves use cases like customizing responses based on company emails, without costly GPU fine-tuning.
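The RAG pattern mentioned above can be sketched in a few lines: retrieve the most relevant documents for a query, then splice them into the prompt instead of fine-tuning the model. This is a toy illustration only; the bag-of-words "embedding", the example emails, and the function names are assumptions, and a real system would use a learned embedding model and an actual LLM call.

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; stands in for a learned embedding model.
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    # Rank documents by similarity to the query and keep the top k.
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    # Splice retrieved context into the prompt sent to the LLM.
    context = "\n".join(retrieve(query, docs))
    return f"Answer using the context below.\nContext:\n{context}\nQuestion: {query}"

# Hypothetical "company emails" corpus for illustration.
emails = [
    "The quarterly revenue report is due Friday.",
    "Server maintenance is scheduled for Saturday night.",
]
print(build_prompt("When is the revenue report due?", emails))
```

The key point from the episode is that only the prompt changes per query; the model's weights stay frozen, so no GPU training run is needed.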
Vector Databases as Stack Hero
- Vector databases are the central component of the evolving LLM stack, powering tasks like semantic search.
- Developer SDKs and monitoring tools build on top of vector databases to form the orchestration layer of generative AI systems.
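The vector-database role described above boils down to a simple interface: store (id, vector) pairs and return nearest neighbors by similarity. A minimal in-memory sketch, assuming cosine similarity and hypothetical ids; production vector databases use approximate-nearest-neighbor indexes to scale far beyond brute force.

```python
import math

class TinyVectorStore:
    # Minimal in-memory stand-in for a vector database.
    def __init__(self) -> None:
        self.items: list[tuple[str, list[float]]] = []

    def add(self, doc_id: str, vector: list[float]) -> None:
        self.items.append((doc_id, vector))

    def query(self, vector: list[float], k: int = 1) -> list[str]:
        # Brute-force cosine-similarity search over all stored vectors.
        def cos(a: list[float], b: list[float]) -> float:
            dot = sum(x * y for x, y in zip(a, b))
            na = math.sqrt(sum(x * x for x in a))
            nb = math.sqrt(sum(y * y for y in b))
            return dot / (na * nb) if na and nb else 0.0

        ranked = sorted(self.items, key=lambda it: cos(vector, it[1]), reverse=True)
        return [doc_id for doc_id, _ in ranked[:k]]

store = TinyVectorStore()
store.add("doc-a", [1.0, 0.0])
store.add("doc-b", [0.0, 1.0])
print(store.query([0.9, 0.1]))  # nearest stored vector by cosine similarity
```

The orchestration layer the episode describes (SDKs, monitoring) wraps exactly this kind of add/query interface.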
LLM Benchmarks Mislead Use Cases
- LLM benchmarks can mislead: the model atop a leaderboard isn't guaranteed to be the best fit for a specific use case.
- Evaluations must weigh use-case specifics such as latency, toxicity, and required capabilities, not just raw benchmark scores.