How AI Is Built

#22 Nils Reimers on the Limits of Embeddings, Out-of-Domain Data, Long Context, Finetuning (and How We're Fixing It) | Search

Sep 19, 2024
Join Nils Reimers, a prominent researcher in dense embeddings and the driving force behind foundational search models at Cohere. He dives into the intriguing limitations of text embeddings, such as their struggles with long documents and out-of-domain data. Reimers shares insights on the necessity of fine-tuning to adapt models effectively. He also discusses innovative approaches like re-ranking to enhance search relevance, and the bright future of embeddings as new research avenues are explored. Don't miss this deep dive into the cutting-edge of AI!
46:06

Podcast summary created with Snipd AI

Quick takeaways

  • Text embeddings struggle with out-of-domain data and long documents, making fine-tuning essential for enhanced effectiveness in specific contexts.
  • Building on existing embedding models and adding a second, re-ranking stage after the initial retrieval improves search accuracy.
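
The two-stage retrieval idea mentioned above can be sketched roughly as follows. This is a minimal illustration, not the method described in the episode: `embed` is a toy bag-of-words stand-in for a real embedding model, `rerank_score` is a toy stand-in for a re-ranking model that reads the query and document together, and the documents are made up.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector (assumption)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def first_stage(query, docs, k):
    """Stage 1: cheap retrieval. Rank all documents by embedding similarity, keep the top k."""
    q = embed(query)
    ranked = sorted(range(len(docs)), key=lambda i: cosine(q, embed(docs[i])), reverse=True)
    return ranked[:k]


def bigrams(text):
    """Set of adjacent word pairs in the text."""
    words = text.lower().split()
    return set(zip(words, words[1:]))


def rerank_score(query, doc):
    """Toy stand-in for a re-ranker: counts word pairs shared by query and document (assumption)."""
    return len(bigrams(query) & bigrams(doc))


def two_stage_search(query, docs, k):
    """Stage 2: re-order only the k candidates from stage 1 with the costlier scorer."""
    candidates = first_stage(query, docs, k)
    return sorted(candidates, key=lambda i: rerank_score(query, docs[i]), reverse=True)


# Hypothetical documents, for illustration only.
docs = [
    "fine tuning embeddings on new domains helps retrieval",
    "fine guitar tuning embeddings",
    "long documents are hard to embed",
]
query = "fine tuning embeddings"

stage_one = first_stage(query, docs, k=2)
final = two_stage_search(query, docs, k=2)
```

In this toy setup the first stage prefers the short document that merely shares words with the query, while the second stage promotes the document that matches the query's phrasing. The design point is that the expensive scorer runs on only `k` candidates rather than on the whole collection.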

Deep dives

The Evolution of Embeddings

The discussion traces the development of embeddings since the introduction of BERT, starting from its application to argument mining and text clustering. Initially, pairwise classification was used to assess how similar two arguments were, but because every pair of texts needs its own model call, the approach did not scale, which led to the exploration of embeddings for more efficient clustering. This shift enabled more effective semantic text processing and retrieval, and pointed towards a unified approach in which the same notion of text similarity supports both clustering and finding relevant answers to a user's query. The discussion also covers models such as InferSent and techniques like contrastive learning, which improved the way embeddings are generated and assessed.
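
To make the scalability point concrete, here is a small sketch under stated assumptions. `pairwise_classifier_calls` simply counts how many text pairs a pairwise classifier would have to score, and `cluster_by_threshold` is a hypothetical greedy clustering over precomputed embedding vectors. The vectors and the threshold are made up, and none of this is taken from the episode.

```python
import math


def pairwise_classifier_calls(n):
    """Number of model calls needed to score every pair among n texts."""
    return n * (n - 1) // 2


def cosine(u, v):
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def cluster_by_threshold(vectors, threshold):
    """Greedy clustering: add each vector to the first cluster whose first member is similar enough."""
    clusters = []
    for idx, vec in enumerate(vectors):
        for cluster in clusters:
            if cosine(vectors[cluster[0]], vec) >= threshold:
                cluster.append(idx)
                break
        else:
            # No existing cluster matched, so start a new one.
            clusters.append([idx])
    return clusters


# Hypothetical 2-d embeddings; a real encoder would be called once per text to produce these.
vectors = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.0, 0.9]]
clusters = cluster_by_threshold(vectors, threshold=0.8)

calls_for_10k_texts = pairwise_classifier_calls(10_000)
```

With embeddings, the model runs once per text (10,000 calls for 10,000 texts) and the remaining comparisons are cheap vector operations, whereas scoring every pair with a classifier would take roughly 50 million model calls.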
