Google DeepMind's Chatbot-Powered Robot Is Part of a Bigger Revolution
Jul 16, 2024
Robotics researchers are using large language models to improve the problem-solving skills of physical machines, such as Google DeepMind's Gemini-powered robot, which guides people around an office. Combining language models with real-world training is driving a broader shift in robotics.
Large language models like Gemini enhance robots' problem-solving abilities in physical environments.
Startups that combine language models with real-world training aim to transform the robotics industry by giving robots general problem-solving skills.
Deep dives
Google DeepMind's Gemini large language model enhances robotic capabilities
Google DeepMind's Gemini large language model lets robots understand commands and navigate physical spaces with up to 90% reliability. Because the model can process both video and text, it can help a robot work out where in an environment a requested location is. Combined with algorithms that generate specific actions, Gemini allows the robot to carry out tasks based on instructions and visual cues, showing what language models can do in real-world settings. Researchers see promise in extending Gemini to other types of robots to improve their problem-solving abilities.
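The split described above, with a language model choosing where to go and a separate algorithm generating the actions, can be pictured with a small sketch. This is an illustration under stated assumptions, not DeepMind's implementation: the place names, captions, and the functions `pick_goal` and `plan_route` are all hypothetical, and a simple word-overlap score stands in for the language model.

```python
from collections import deque

# Hypothetical office "tour": each place has a caption standing in for
# what a vision language model might perceive in a walkthrough video.
TOUR = {
    "lobby": "front desk and entrance doors",
    "hallway": "long corridor with meeting rooms",
    "kitchen": "coffee machine and fridge",
    "whiteboard_room": "room with a large whiteboard for drawing",
}

# Hypothetical connectivity between places (assumed, not from the article).
EDGES = {
    "lobby": ["hallway"],
    "hallway": ["lobby", "kitchen", "whiteboard_room"],
    "kitchen": ["hallway"],
    "whiteboard_room": ["hallway"],
}


def pick_goal(instruction: str) -> str:
    """Stand-in for the language model: pick the place whose caption
    shares the most words with the instruction."""
    words = set(instruction.lower().split())
    return max(TOUR, key=lambda place: len(words & set(TOUR[place].split())))


def plan_route(start: str, goal: str) -> list[str]:
    """Stand-in for the action-generating step: breadth-first search
    for the shortest sequence of places from start to goal."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in EDGES[path[-1]]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return []  # no route found


goal = pick_goal("where can I get coffee")
route = plan_route("lobby", goal)
print(goal, route)
```

In a real system the word-overlap step would be replaced by a model that reads video and text, and the route would be turned into motor commands; the sketch only shows how the two stages hand off to each other.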
Investors support startups merging AI advancements with robotics for problem-solving abilities
Investors are backing startups like Physical Intelligence and Skild AI, which combine large language models with real-world training to improve robots' problem-solving skills. These ventures aim to give robots general problem-solving capabilities by building on vision language models trained on images, video, and text. With significant funding behind them, the startups are applying recent AI advances to move the robotics industry beyond traditional mapping and command-based navigation toward more intuitive, perceptive interaction.
1. Exploring the Use of Large Language Models in Robotics