Meta’s V-JEPA 2 teaches AI to think, plan, and act in 3D space

Meta’s new open-source AI model builds internal world simulations to boost robotic perception and real-time interaction.


Meta has released V-JEPA 2, an open-source AI model designed to understand and predict real-world environments in 3D. Described as a ‘world model’, it enables machines to simulate physical spaces—offering a breakthrough for robotics, self-driving cars, and intelligent assistants.

Unlike traditional AI that relies on labelled data, V-JEPA 2 learns from unlabelled video clips, building an internal simulation of how the world works. As a result, AI can reason, plan, and act more like humans.
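In schematic terms, the JEPA idea is to predict what comes next in a learned representation space rather than in raw pixels. The toy sketch below illustrates that objective only; the encoder, predictor, and all dimensions are invented for illustration and bear no relation to Meta's actual code:

```python
import numpy as np

rng = np.random.default_rng(0)

def encoder(frame, W):
    # Toy encoder: project a flattened "frame" into a small latent space.
    return np.tanh(W @ frame)

def predictor(context_latent, P):
    # Toy predictor: map the context latent to a predicted future latent.
    return np.tanh(P @ context_latent)

# Illustrative dimensions (not V-JEPA 2's): 16-value frames, 4-d latents.
frame_dim, latent_dim = 16, 4
W = rng.normal(size=(latent_dim, frame_dim))
P = rng.normal(size=(latent_dim, latent_dim))

context_frame = rng.normal(size=frame_dim)  # what the model has seen
future_frame = rng.normal(size=frame_dim)   # what it must anticipate

# JEPA-style objective: match the *latent* of the future frame,
# not its pixels -- no labels are needed, only the video itself.
predicted = predictor(encoder(context_frame, W), P)
target = encoder(future_frame, W)
loss = np.mean((predicted - target) ** 2)
```

Training then adjusts the encoder and predictor to drive this latent-prediction loss down across millions of clips, which is how the model builds its internal simulation without human annotation.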

Based on Meta’s JEPA architecture and containing 1.2 billion parameters, the model improves significantly on action prediction and environmental modelling compared to its predecessor.

Meta says this approach mirrors how humans intuitively understand cause and effect—like predicting a ball’s motion or avoiding people in a crowd. V-JEPA 2 helps AI agents develop this same intuition, making them more adaptive in dynamic, unfamiliar situations.

Meta’s Chief AI Scientist Yann LeCun describes world models as ‘abstract digital twins of reality’—vital for machines to understand and predict what comes next. This effort aligns with Meta’s broader push into AI, including a planned $14 billion investment in Scale AI for data labelling.

V-JEPA 2 joins a growing wave of interest in world models. Google DeepMind is building its own, called Genie, while AI researcher Fei-Fei Li recently raised $230 million for her startup World Labs, which is focused on similar goals.

Meta believes V-JEPA 2 brings us closer to machines that can learn, adapt, and operate in the physical world with far greater autonomy and intelligence.

Would you like to learn more about AI, tech and digital diplomacy? If so, ask our Diplo chatbot!