The Future of Mass Language Modeling
Twitter has recently published a paper on indexing using ColBERT, which is a token-level representation approach they call late interaction. They put the tweet's doc ID at the end, and as a new search for tweets comes in they read backwards, so basically they encoded the temporal nature of tweets to keep results fresh. The majority of users will only use minutia or SPLADE vectors, but if you have 5,000 direct messages scrolling through a day, each search will take half an hour. I was just thinking, ten years ago at Berlin Buzzwords there...
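The late-interaction scoring mentioned here can be made concrete with a small sketch. ColBERT keeps one embedding per token and scores a document by taking, for each query token, the maximum similarity over all document tokens (MaxSim), then summing those maxima. The snippet below is a minimal illustration, not Twitter's actual implementation: the function name `maxsim_score`, the random toy embeddings, and the assumption that doc IDs are assigned in arrival order (so iterating in descending ID order visits the freshest tweets first, loosely mirroring the temporal trick described above) are all hypothetical.

```python
import numpy as np

def maxsim_score(query_embs: np.ndarray, doc_embs: np.ndarray) -> float:
    """ColBERT-style late-interaction score.

    query_embs: (num_query_tokens, dim) L2-normalised token embeddings
    doc_embs:   (num_doc_tokens, dim)   L2-normalised token embeddings
    Each query token takes its max cosine similarity over all document
    tokens; the per-token maxima are summed into one relevance score.
    """
    sim = query_embs @ doc_embs.T           # (q_tokens, d_tokens) cosine similarities
    return float(sim.max(axis=1).sum())     # MaxSim per query token, then sum

# Toy usage with random vectors; a real system would use a trained encoder.
rng = np.random.default_rng(0)
q = rng.normal(size=(4, 128))
q /= np.linalg.norm(q, axis=1, keepdims=True)

# Documents keyed by doc ID; assume IDs are assigned in arrival order.
docs = {i: rng.normal(size=(30, 128)) for i in range(5)}
for i in docs:
    docs[i] /= np.linalg.norm(docs[i], axis=1, keepdims=True)

# Traverse newest-first so the freshest documents are scored first.
for doc_id in sorted(docs, reverse=True):
    print(doc_id, round(maxsim_score(q, docs[doc_id]), 3))
```

The key design point is that late interaction defers the query-document interaction to this cheap MaxSim step over precomputed token embeddings, which is why scoring many candidates stays tractable compared with running a full cross-encoder per document.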