Generating Training Data with Large Language Models w/ Special Guest Marzieh Fadaee

Neural Search Talks — Zeta Alpha

Exploring Query Intent and Synthetic Data Generation

This chapter focuses on the importance of query intent in generating effective training data for language models, using the ArguAna counterargument corpus as its main example. It discusses several approaches to generating synthetic data, comparing their effectiveness and their implications for model evaluation. The chapter also highlights how filtering generated queries improves training outcomes, and it addresses the challenges of using generated queries.
