Connor Shorten - Research Scientist, Weaviate - ChatGPT, LLMs, Form vs Meaning

Vector Podcast

CHAPTER

The Future of Mass Language Modeling

Twitter recently published a paper on indexing using ColBERT, which is a token-level representation approach built on what they call late interaction. They would put each tweet's doc ID at the end, and as a new searcher comes in searching tweets, it would read backwards from the beginning of ... So basically, what they did is encode the temporal nature of tweets to keep results fresh. The majority of users will only use minutia or SPLADE vectors, but if you have 5,000 direct messages scrolling through in a day, they will take half an hour to search for each one. I was just thinking, 10 years ago at Berlin Buzzwords there ...
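
The idea of putting each new doc ID at the end and reading backwards can be illustrated with a toy inverted index. This is only a minimal sketch under stated assumptions, not Twitter's actual implementation: the class name `AppendOnlyIndex`, its methods, and the whitespace tokenizer are all hypothetical and chosen for illustration. Doc IDs are assigned in arrival order, so scanning a posting list from its tail returns the freshest matches first without any sorting.

```python
from collections import defaultdict
from itertools import islice


class AppendOnlyIndex:
    """Toy inverted index (hypothetical) whose posting lists are read newest-first."""

    def __init__(self):
        # term -> list of doc IDs, in the order the documents arrived
        self.postings = defaultdict(list)
        self.next_doc_id = 0

    def add(self, text):
        """Index a new document; its doc ID is appended at the end of each posting list."""
        doc_id = self.next_doc_id
        self.next_doc_id += 1
        # Naive whitespace tokenization; set() avoids duplicate postings per document.
        for term in set(text.lower().split()):
            self.postings[term].append(doc_id)
        return doc_id

    def search(self, term, limit=10):
        """Walk the posting list backwards so the most recent documents come first."""
        posting_list = self.postings.get(term.lower(), [])
        return list(islice(reversed(posting_list), limit))


index = AppendOnlyIndex()
index.add("vector search is fun")
index.add("Search tweets fast")
index.add("fresh search results")
print(index.search("search", limit=2))  # newest two matching doc IDs
```

Because recency is encoded in the position within the posting list, a searcher can stop as soon as it has collected enough hits, which is what keeps the freshest documents cheap to retrieve in this sketch.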
