NVIDIA Unveils Cosmos3: The First Fully Open-Source Physical AI Foundation Model

Robotics | 01.Jun.2026 11:08 | 2 min read

NVIDIA has released Cosmos3, a fully open-source, multimodal physical AI model built on a hybrid Transformer architecture. NVIDIA says the model shortens training and evaluation cycles for embodied AI systems from months to days. The launch is accompanied by the NVIDIA Cosmos Coalition, an alliance intended to accelerate real-world AI agent development.

A New Era for Physical AI

NVIDIA has officially unveiled Cosmos3, positioning it as the world’s first fully open-source, fully multimodal foundation model dedicated to physical artificial intelligence. Designed to bridge the gap between digital simulation and real-world deployment, the model aims to compress the traditionally lengthy training and evaluation cycles for embodied AI from several months down to just a few days.

Hybrid Transformer Architecture and Multimodal Capabilities

At the core of Cosmos3 is a hybrid Transformer architecture that integrates reasoning and generative capabilities. The model first analyzes object interactions, motion dynamics, and spatiotemporal relationships, and then generates video and predicts action trajectories. According to NVIDIA, Cosmos3 was trained on billions of text, image, video, audio, and motion trajectory samples, which lets it understand and generate content across these modalities. The company describes its physical simulation accuracy as industry-leading.
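The analyze-first, generate-second ordering described above can be illustrated with a deliberately tiny sketch. This is not Cosmos3 code and does not use any NVIDIA API: the names `SceneAnalysis`, `analyze_motion`, and `predict_trajectory` are hypothetical, and the "model" is a plain constant-velocity estimate standing in for learned reasoning and generation.

```python
from dataclasses import dataclass


@dataclass
class SceneAnalysis:
    """Hypothetical output of the reasoning stage for one tracked object."""
    position: tuple[float, float]  # last observed (x, y)
    velocity: tuple[float, float]  # estimated (vx, vy)


def analyze_motion(track: list[tuple[float, float]], dt: float) -> SceneAnalysis:
    """Reasoning stage (toy): estimate velocity from the last two observations."""
    (x0, y0), (x1, y1) = track[-2], track[-1]
    return SceneAnalysis(position=(x1, y1),
                         velocity=((x1 - x0) / dt, (y1 - y0) / dt))


def predict_trajectory(analysis: SceneAnalysis, steps: int,
                       dt: float) -> list[tuple[float, float]]:
    """Generation stage (toy): roll the analyzed state forward at constant velocity."""
    x, y = analysis.position
    vx, vy = analysis.velocity
    return [(x + vx * dt * k, y + vy * dt * k) for k in range(1, steps + 1)]


# Assumed example data: three observed positions, one time unit apart.
observed = [(0.0, 0.0), (1.0, 0.5), (2.0, 1.0)]
analysis = analyze_motion(observed, dt=1.0)            # stage 1: reason about motion
future = predict_trajectory(analysis, steps=3, dt=1.0)  # stage 2: generate trajectory
print(future)  # [(3.0, 1.5), (4.0, 2.0), (5.0, 2.5)]
```

The point of the sketch is only the data flow: the generation step consumes an explicit analysis of the scene rather than raw observations. How Cosmos3 actually couples the two stages is not detailed in the announcement.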

Independent evaluations across major physical AI benchmarks, including Artificial Analysis, Physics-IQ, and RoboLab, have ranked Cosmos3 at the top among open-source models. Its architecture directly addresses long-standing industry bottlenecks, particularly the difficulty of generalizing in real-world scenarios due to limited data and fragmented simulation frameworks.

Model Variants for Diverse Development Needs

To accommodate different stages of AI development, NVIDIA is rolling out a tiered model lineup:

  • Cosmos3Super: Optimized for high-precision post-training (fine-tuning) in robotics and autonomous driving. Currently available.
  • Cosmos3Nano: Engineered for ultra-fast, high-quality video parsing and action inference within seconds. Currently available.
  • Cosmos3Edge: Designed for real-time inference on edge devices. Scheduled for upcoming release.

The NVIDIA Cosmos Coalition

Alongside the model launch, NVIDIA announced the formation of the NVIDIA Cosmos Coalition. This industry alliance brings together leading world model developers and AI researchers, including Agile Robots, Black Forest Labs, Generalist, LTX, Runway, and Skild AI. The coalition aims to standardize development practices, share resources, and accelerate the deployment of physical AI across industries.

NVIDIA CEO Jensen Huang emphasized that the convergence of multimodal reasoning and advanced world models marks a transformative milestone for physical AI. By open-sourcing these cutting-edge tools, NVIDIA intends to empower global developers to build next-generation intelligent agents capable of perceiving, reasoning, and acting reliably in complex physical environments.