LongLive Pipeline
LongLive is a streaming pipeline and autoregressive video diffusion model from NVIDIA, MIT, HKUST, HKU, and THU. The model is trained using Self-Forcing on Wan2.1 1.3B, with modifications to support smoother prompt switching and improved quality over longer time periods while maintaining fast generation.
At a Glance
| | |
| --- | --- |
| Base Model | Wan2.1 1.3B |
| Estimated VRAM | ~20 GB |
| Training | Self-Forcing |
| LoRA Support | 1.3B LoRAs |
| VACE Support | Yes |
| T2V / V2V | Yes / Yes |
Examples
The following examples include timeline JSON files with the prompts used so you can try them yourself.
Panda
Factory
Resolution
Generation is faster at smaller resolutions, which results in smoother video. Visual quality is better at 832x480, the resolution the model was trained on, but you may need a more powerful GPU to reach a high FPS at that size (the ~20 FPS reported in the paper is on an H100).
Seed
The seed parameter in the UI can be used to reproduce generations. If you like the result for a certain seed value and sequence of prompts, you can re-use that seed later with the same prompts to reproduce it.
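Conceptually, a fixed seed pins the initial latent noise that sampling starts from. A minimal PyTorch sketch of the idea (not the pipeline's actual code; the latent shape is purely illustrative):

```python
import torch

def initial_noise(seed: int, shape=(1, 16, 60, 104)) -> torch.Tensor:
    # Seeding a dedicated generator makes the starting noise deterministic,
    # which is what lets the same seed + prompts reproduce a generation.
    generator = torch.Generator(device="cpu").manual_seed(seed)
    return torch.randn(shape, generator=generator)

# Same seed -> identical starting noise -> identical output for the same prompts and settings.
assert torch.equal(initial_noise(42), initial_noise(42))
```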
Prompting
The original project repo contains additional tips for prompting.
Subject and Background/Setting Anchors
The model works better if each prompt includes a subject (who/what) and a background/setting (where). If you want continuity in the next scene, keep referencing the same subject and/or background/setting.
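For example (these prompts are illustrative, not taken from the original repo; note how the second prompt repeats the subject and setting of the first):

Prompt 1: A lighthouse keeper in a yellow raincoat walks along a rocky shoreline at dusk.
Prompt 2: The lighthouse keeper in the yellow raincoat climbs the lighthouse stairs as waves crash on the rocky shoreline.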
Offline Generation
A test script can be used for offline generation. If the model weights are not downloaded yet, download them before running the script. The generated video is saved as an output.mp4 file in the longlive directory.
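A minimal sketch of what an offline run might look like, assuming the weights are hosted on Hugging Face and the test script accepts a checkpoint directory; the repo id, script name, and flag below are hypothetical placeholders, not the project's actual CLI:

```python
import subprocess
from huggingface_hub import snapshot_download

# Hypothetical identifiers -- check the project docs for the real ones.
WEIGHTS_REPO = "example-org/longlive-1.3b"  # placeholder repo id, not the actual one
TEST_SCRIPT = "test_longlive.py"            # placeholder script name, not the actual one

# Download the model weights locally.
weights_dir = snapshot_download(repo_id=WEIGHTS_REPO)

# Run the offline test script, which writes output.mp4 into the longlive directory.
subprocess.run(
    ["python", TEST_SCRIPT, "--checkpoint_dir", weights_dir],
    check=True,
)
```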
See Also
Other Pipelines
- StreamDiffusion V2: real-time streaming from the original StreamDiffusion creators
- Krea Realtime: 14B model for the highest quality generation
- RewardForcing: reward-matched training for improved output quality
- MemFlow: memory bank for long-context consistency