
Shahul Es
Co-founder of Ragas, an open-source library for evaluating LLM applications. Experienced in natural language models and applied AI research.
Top 3 podcasts with Shahul Es
Ranked by the Snipd community

10 snips
Aug 29, 2024 • 42min
Metrics Driven Development
Shahul Es, Co-founder of Ragas, discusses innovative approaches to evaluating LLM applications. He emphasizes the significance of Metrics Driven Development to systematically measure and enhance performance. The conversation contrasts assessing LLM applications with evaluating models, highlighting the need for tailored metrics and synthetic test data. Shahul shares insights on creating clear standards for better enterprise adoption, ensuring responsible and high-quality AI solutions. Tune in for an engaging deep dive into AI's evolving landscape!

8 snips
Oct 6, 2023 • 51min
All About Evaluating LLM Applications // Shahul Es // #179
Shahul Es, creator of the Ragas project and an evaluation expert, discusses open-source model evaluation, including debugging, troubleshooting, and benchmark challenges. The conversation highlights the importance of custom data distributions and fine-tuning for better model performance, and explores the difficulties of evaluating LLM applications and the need for reliable leaderboards. It also covers the security aspects of language models and the significance of data preparation and filtering. Finally, the episode contrasts fine-tuning with retrieval-augmented generation and points to resources for evaluating LLM applications.

4 snips
Nov 20, 2023 • 50min
RAGAS with Jithin James, Shahul Es, and Erika Cardenas - Weaviate Podcast #77!
Join Jithin James and Shahul Es, co-founders of RAGAS, a pioneering framework for evaluating retrieval-augmented generation, along with Erika Cardenas, a developer advocate at Weaviate. They delve into the RAGAS score, which uses LLMs to compute generation and retrieval metrics, streamlining the evaluation process. The trio discusses optimizing RAG applications through various tuning strategies, along with the potential of future directions such as fine-tuning smaller models and improving automated systems for smarter, more efficient retrieval.