In this conversation, we cover 6 papers in detail. They are:
- RT-2 – which shows how internet-scale vision-language models allow robots to understand and manipulate objects they've never seen in training
- RT-X – a collaboration with academic labs across the country that demonstrates how a single model can be trained to control a diverse range of robot embodiments, achieving performance that often surpasses specialist models trained on individual robots.
- RT-Trajectory – a project that shows how robots can learn new skills, in context, from a single human demonstration, represented as a simple line drawing
- AutoRT – a system that scales human oversight of robots, even in unseen environments, by using large language models and a "robot constitution" to power first-line ethical and safety checks on robot behavior.
- Learning to Learn Faster – an approach that enables robots to learn more efficiently from human verbal feedback.
- PIVOT – another project that shows how vision-language models can guide robots, with no special fine-tuning required.
While progress in robotics still trails the advances in language and vision, challenges remain before robotics models will have the scale of data and/or the sample efficiency needed for reliable general-purpose capabilities, and the study of robot safety and alignment is still in its infancy. Ultimately, I see this rapid-fire series of papers as strong evidence that the same core architectures and scaling techniques that have worked so well in other contexts will succeed in robotics as well.
The work being done at Google DeepMind Robotics is pushing the boundaries of what's possible, investment in a new generation of robotics startups is heating up, and the pace of progress shows no signs of slowing down.
As always, if you're finding value in the show, please take a moment to share it with friends. This one would be perfect for anyone who has ever daydreamed of having a robot that could fold their laundry or pick up their kids' toys.
And especially as we are just building the new feed, a review on Apple Podcasts, Spotify, or a comment on YouTube would be much appreciated.
Now, here's my conversation with Keerthana Gopalakrishnan and Ted Xiao of Google DeepMind Robotics.