RT1 Robotics Transformer - Transformers and Tokenization Architecture
I think transformers have kind of come to symbolize this grand unification, perhaps, across different modalities and all that that enables. So there's been quite a bit of work that shows how tokenization and transformers can allow you to more or less transparently manipulate multiple different modalities in the same fashion. The concept behind the RT1 Robotics Transformer is that the modalities these robots have to deal with, which are natural language instructions, image observations, and the action commands that go to the arm, can all be treated as tokens that you can flatten into a stream and process with a transformer. But we're actually working to include other modalities, including goals specified in ways other than
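To make the "flatten everything into one token stream" idea concrete, here is a minimal Python sketch. It is not the RT1 implementation: the vocabulary sizes, the ID offsets, the toy word and patch tokenizers, the 256-bin action discretization, and every function name are assumptions made up for illustration. It only shows how three modalities can end up as integers in one sequence that a transformer could consume.

```python
from typing import List, Sequence

# Hypothetical vocabulary layout: each modality gets its own ID range,
# so a token's integer value also tells you which modality it came from.
TEXT_VOCAB = 1000
IMAGE_VOCAB = 512
ACTION_BINS = 256

TEXT_OFFSET = 0
IMAGE_OFFSET = TEXT_OFFSET + TEXT_VOCAB
ACTION_OFFSET = IMAGE_OFFSET + IMAGE_VOCAB


def tokenize_text(instruction: str) -> List[int]:
    """Toy word-level tokenizer: map each word deterministically into the text range."""
    words = instruction.lower().split()
    return [TEXT_OFFSET + sum(map(ord, w)) % TEXT_VOCAB for w in words]


def tokenize_image(pixels: Sequence[Sequence[int]], patch: int = 2) -> List[int]:
    """Toy image tokenizer: one token per patch, from the patch's mean intensity (0-255)."""
    height, width = len(pixels), len(pixels[0])
    tokens = []
    for r in range(0, height, patch):
        for c in range(0, width, patch):
            block = [
                pixels[i][j]
                for i in range(r, min(r + patch, height))
                for j in range(c, min(c + patch, width))
            ]
            mean = sum(block) / len(block)
            tokens.append(IMAGE_OFFSET + int(mean / 256 * IMAGE_VOCAB))
    return tokens


def tokenize_action(action: Sequence[float], low: float = -1.0, high: float = 1.0) -> List[int]:
    """Discretize each continuous action dimension into one of ACTION_BINS uniform bins."""
    tokens = []
    for x in action:
        x = min(max(x, low), high)  # clip to the assumed command range
        bin_index = min(int((x - low) / (high - low) * ACTION_BINS), ACTION_BINS - 1)
        tokens.append(ACTION_OFFSET + bin_index)
    return tokens


def flatten_stream(instruction: str, pixels: Sequence[Sequence[int]], action: Sequence[float]) -> List[int]:
    """Concatenate all modalities into a single flat sequence of token IDs."""
    return tokenize_text(instruction) + tokenize_image(pixels) + tokenize_action(action)


if __name__ == "__main__":
    image = [[0, 0, 255, 255], [0, 0, 255, 255]]  # 2x4 grayscale toy image
    stream = flatten_stream("pick up the can", image, [0.0, -1.0, 1.0])
    print(len(stream), stream)
```

Giving each modality a disjoint ID range is one simple design choice: the downstream model sees a single integer sequence and a single embedding table, which is what lets the same transformer handle all three inputs in the same fashion.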