NVIDIA has released SANA-WM, an open-source world model with 2.6 billion parameters. The model generates up to one minute of 720p video from a single image and a camera trajectory, and it runs on a single GPU. The release signals a move toward more accessible, yet capable, video-generation tools.
Key Technical Advancements
SANA-WM distinguishes itself by replacing many of the standard attention blocks with a frame-wise Gated DeltaNet (GDN). Unlike the token-wise GDN used in language models, the frame-wise variant processes an entire latent frame at each recurrent step, which is central to the model's efficiency.
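The idea can be illustrated with a toy sketch of the gated delta rule applied frame-wise. This is not NVIDIA's implementation: the function names, gate values, and shapes are all illustrative, and the per-token inner loop is just a simple serialization of the frame update.

```python
import numpy as np

def gated_delta_update(S, k, v, alpha, beta):
    # One gated-delta-rule write: decay the state with gate alpha,
    # erase the old value stored along key k, then write the new
    # key/value association. S is a (d_k, d_v) matrix-valued memory.
    S = alpha * (S - beta * np.outer(k, k @ S))  # gated decay + erase
    return S + beta * np.outer(k, v)             # write new association

def framewise_gdn(q, k, v, alpha=0.95, beta=0.5):
    # q, k: (frames, tokens, d_k); v: (frames, tokens, d_v).
    # The recurrence runs over *frames* rather than tokens: every token
    # in a frame reads the state left by all previous frames, then the
    # whole frame is folded into the state in one recurrent step.
    d_k, d_v = k.shape[-1], v.shape[-1]
    S = np.zeros((d_k, d_v))
    outputs = []
    for q_f, k_f, v_f in zip(q, k, v):
        outputs.append(q_f @ S)          # all frame tokens read one shared state
        for k_t, v_t in zip(k_f, v_f):   # fold the frame into the state
            S = gated_delta_update(S, k_t, v_t, alpha, beta)
    return np.stack(outputs)             # (frames, tokens, d_v)
```

Because the recurrent state advances once per frame instead of once per token, the cost of generating another frame stays flat as the video grows, which is what makes minute-long rollouts on a single GPU plausible.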
The training regimen for SANA-WM involved a multi-stage process:
The primary diffusion transformer (DiT) was trained with a four-stage progressive schedule taking roughly 15 days; in total, training consumed approximately 18.5 days on 64 H100 GPUs.
The dataset comprised 212,975 public video clips.
Early stages adapted pre-trained SANA-Video models to the frame-wise GDN structure on shorter clips.
Stage 3, which took around 8 days, extended training to sequences of up to 961 frames (60 seconds) and incorporated Dual-Branch Camera Control.
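The staged progression above can be summarized as a simple schedule structure. Only the figures stated in the article are real (64 H100s, ~18.5 days total, ~15 days for the four DiT stages, ~8 days for Stage 3 at 961 frames); the grouping and field names are placeholders.

```python
# Illustrative summary of the reported SANA-WM training schedule.
# Unstated per-stage durations are left as None rather than guessed.
schedule = {
    "hardware": "64x H100",
    "dataset_clips": 212_975,
    "stages": [
        {"name": "early_stages",
         "desc": "adapt pre-trained SANA-Video to frame-wise GDN on shorter clips",
         "days": None},
        {"name": "stage_3",
         "desc": "extend to 961 frames (60 s); add Dual-Branch Camera Control",
         "days": 8},
    ],
    "dit_total_days": 15,
    "total_days": 18.5,
}
```

One takeaway from the breakdown is that the long-video stage alone accounts for over half of the DiT training time, underlining how costly minute-scale sequence lengths are even with an efficient recurrent backbone.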
Performance and Efficiency Claims
NVIDIA reports substantial efficiency gains: paired with a refiner, SANA-WM reportedly achieves 36 times the throughput of models such as LingBot-World while maintaining comparable visual-quality scores on VBench, and it does so with lower computational requirements.
Open-Source Availability and Context
SANA-WM's open-source release makes the technology available for broader experimentation and application development. The project is part of a larger family of SANA models developed by NVIDIA, including those focused on high-resolution image and video synthesis, and the code and project details are accessible via repositories on GitHub and Hugging Face.
Background on SANA Models
The SANA suite, developed by NVIDIA, emphasizes efficient generation of high-resolution content. Previous iterations, such as SANA-Video and LongSANA, explored efficient video generation using techniques like Block Linear Attention. The broader SANA project aims for high-quality image and video synthesis that can be deployed even on modest hardware, such as a laptop GPU, with the apparent goal of lowering the cost of content creation.