The embedding is lossless, and it all just goes into this global model that sits in the middle of the whole thing. The architecture itself doesn't care what these bytes represent, right? So I got that. What is a little less clear to me is what the patches are doing on the input side. If I understood correctly, and maybe I didn't, the embedding was lossless. But then I was thinking: well, if it's lossless, what is the meaning or the function of those patches, as opposed to just saying, you know, here are all the bytes that are inputs, feed them directly into the global model?
