Embodied AI steps forward with DeepMind’s SIMA 2 research preview
Gemini integration roughly doubles the agent's task performance over SIMA 1 and strengthens its reasoning.
Google DeepMind has released a research preview of SIMA 2, an upgraded generalist agent that draws on Gemini’s language and reasoning strengths. The system moves beyond simple instruction following, aiming to understand user intent and interact more effectively with its environment.
SIMA 1 relied on game data to learn basic tasks across diverse 3D worlds but struggled with complex actions. DeepMind says SIMA 2 represents a step change, completing harder objectives in unfamiliar settings and adapting its behaviour through experience without heavy human supervision.
The agent is powered by the Gemini 2.5 Flash-Lite model and built around the idea of embodied intelligence, where an AI acts through a body and responds to its surroundings. Researchers say this approach supports a deeper understanding of context, goals, and the consequences of actions.
Demos show SIMA 2 describing landscapes, identifying objects, and choosing relevant tasks in titles such as No Man’s Sky. It also explains its reasoning, interprets visual clues, follows instructions given as emojis, and navigates photorealistic worlds generated by Genie, DeepMind’s own environment model.
Self-improvement comes from Gemini models that create new tasks and score attempts, enabling SIMA 2 to refine its abilities through trial and error. DeepMind sees these advances as groundwork for future general-purpose robots, though the team has not shared timelines for wider deployment.
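DeepMind has not published implementation details for this loop, but in outline it resembles a propose-attempt-score cycle: one model generates a task, the agent tries it, and another model grades the attempt, with high-scoring trajectories kept for further training. The sketch below is a toy illustration under that assumption; every function name, the scoring threshold, and the "policy" structure are hypothetical stand-ins, not DeepMind's actual system.

```python
import random

# Toy sketch of a self-improvement loop of the kind DeepMind describes:
# a task-setter proposes objectives, the agent attempts them, and a
# reward model scores the attempts. All names and logic are illustrative.

def propose_task() -> str:
    """Stand-in for a Gemini-based task setter generating a new objective."""
    return random.choice(["collect wood", "find water", "build shelter"])

def attempt_task(task: str, policy: dict) -> str:
    """Stand-in for the agent acting in the environment; returns a trajectory."""
    return f"trajectory for '{task}' (skill={policy.get(task, 0)})"

def score_attempt(trajectory: str) -> float:
    """Stand-in for a Gemini-based reward model scoring the attempt."""
    return random.random()

def self_improvement_loop(iterations: int = 5) -> None:
    policy = {}        # toy stand-in for the agent's learned behaviour
    experience = []    # high-scoring trajectories kept as training data
    for step in range(iterations):
        task = propose_task()
        trajectory = attempt_task(task, policy)
        reward = score_attempt(trajectory)
        if reward > 0.5:  # hypothetical threshold: keep only good attempts
            experience.append((task, trajectory, reward))
            policy[task] = policy.get(task, 0) + 1  # "train" on the success
        print(f"step {step}: {task!r} scored {reward:.2f}")
    print(f"retained {len(experience)} trajectories for further training")

if __name__ == "__main__":
    self_improvement_loop()
```

The point of the sketch is the division of labour: neither the task generation nor the scoring requires a human in the loop, which is what lets the agent refine itself through trial and error.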
