Categories: Gadgets360

Google DeepMind Is Integrating Gemini 1.5 Pro in Robots That Can Navigate Real-World Environments

Google DeepMind shared new advancements made in the field of robotics and vision language models (VLMs) on Thursday. The artificial intelligence (AI) research division of the tech giant has been working with advanced vision models to develop new capabilities in robots. In a new study, DeepMind highlighted that using Gemini 1.5 Pro and its long context window has now enabled the division to make breakthroughs in navigation and real-world understanding of its robots. Earlier this year, Nvidia also unveiled new AI technology that powers advanced capabilities in humanoid robots.

Google DeepMind Uses Gemini AI to Improve Robots

In a post on X (formerly known as Twitter), Google DeepMind revealed that it has been training its robots using Gemini 1.5 Pro’s 2 million token context window. Context windows can be understood as the window of knowledge visible to an AI model, using which it processes tangential information around the queried topic.

For instance, if a user asks an AI model about “most popular ice cream flavours”, the AI model will check the keyword ice cream and flavours to find information to that question. If this information window is too small, then the AI will only be able to respond with the names of different ice cream flavours. However, if it is larger, the AI will also be able to see the number of articles about each ice cream flavour to find which has been mentioned the most and deduce the “popularity factor”.

DeepMind is taking advantage of this long context window to train its robots in real-world environments. The division aims to see if the robot can remember the details of an environment and assist users when asked about the environment with contextual or vague terms. In a video shared on Instagram, the AI division showcased that a robot was able to guide a user to a whiteboard when he asked it for a place where he could draw.

“Powered with 1.5 Pro’s 1 million token context length, our robots can use human instructions, video tours, and common sense reasoning to successfully find their way around a space,” Google DeepMind stated in a post.

In a study published on arXiv (a non-peer-reviewed online journal), DeepMind explained the technology behind the breakthrough. In addition to Gemini, it is also using its own Robotic Transformer 2 (RT-2) model. It is a vision-language-action (VLA) model that learns from both web and robotics data. It utilises computer vision to process real-world environments and use that information to create datasets. This dataset can later be processed by the generative AI to break down contextual commands and produce desired outcomes.

At present, Google DeepMind is using this architecture to train its robots on a broad category known as Multimodal Instruction Navigation (MIN) which includes environment exploration and instruction-guided navigation. If the demonstration shared by the division is legitimate, this technology might further advance robotics.

Recent Posts

Beyoncé’s NFL Christmas Halftime Show Now Streaming on Netflix: Everything You Need to Know

Beyoncé's much-anticipated halftime performance, part of Netflix's NFL Christmas Gameday event, is set to release…

10 months ago

Scientists Predict Under Sea Volcano Eruption Near Oregon Coast in 2025

An undersea volcano situated roughly 470 kilometers off Oregon's coastline, Axial Seamount, is showing signs…

10 months ago

Organic Molecules in Space: A Key to Understanding Life’s Cosmic Origins

As researchers delve into the cosmos, organic molecules—the building blocks of life—emerge as a recurring…

10 months ago

The Secret of the Shiledars OTT Release Date Announced: What You Need to Know

Director Aditya Sarpotdar, following his successful venture "Munjya," has announced the release of his treasure…

10 months ago

Anne Hathaway’s Mothers’ Instinct Now Streaming on Lionsgate Play

The psychological thriller Mothers' Instinct, featuring Anne Hathaway, Jessica Chastain, and Kelly Carmichael, delves into…

10 months ago

All We Imagine As Light OTT Release Date: When and Where to Watch it Online?

Payal Kapadia's award-winning film, All We Imagine As Light, will soon be available for streaming,…

10 months ago