
Krea Realtime Video Pipeline

Krea Realtime Video is a streaming pipeline and an autoregressive video diffusion model from Krea. The model is trained using Self-Forcing on Wan2.1 14B.

At a Glance

  • Base Model: Wan2.1 14B
  • Estimated VRAM: ~32GB (40GB+ recommended)
  • Training: Self-Forcing
  • LoRA Support: 14B LoRAs
  • VACE Support: Yes
  • T2V / V2V: Yes / Limited*
*Regular V2V (using input video for latent initialization) has known quality issues. Use VACE V2V (visual conditioning with input video) for better results.

Examples

The following examples include timeline JSON files with the prompts used so you can try them yourself. A GPU with >40GB VRAM (e.g. H100, RTX 6000 Pro) is recommended for these examples since they use a higher resolution.

Flower Bloom

Abstract Shape

A GPU with >=32GB VRAM (e.g. RTX 5090) is recommended for the following examples, which have lower VRAM requirements due to their lower resolution.

Flower Bloom (Low Resolution)


Acceleration

The pipeline uses different attention kernels to accelerate inference depending on the hardware used:
  • SageAttention 2 is used on all GPUs except Hopper GPUs (e.g. H100). If you run into video quality issues (which some users have reported while using SageAttention), you can restart the server with DISABLE_SAGEATTENTION=1 to fall back to Flash Attention 2, as shown below.
  • Flash Attention 2 is the fallback when SageAttention 2 is disabled.
  • Flash Attention 3 is used on Hopper GPUs (e.g. H100).
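
For example, to restart the server with SageAttention 2 disabled:
# Fall back to Flash Attention 2 by disabling SageAttention 2
DISABLE_SAGEATTENTION=1 uv run daydream-scope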

Resolution

Generation is faster at smaller resolutions, resulting in smoother video. Visual quality is better at higher resolutions (e.g. 832x480 and larger), but you may need a more powerful GPU to maintain a high FPS.

Seed

The seed parameter in the UI can be used to reproduce generations: if you like the result for a certain seed value and sequence of prompts, re-use that seed later with the same prompts to reproduce it.

Prompting

Subject and Background/Setting Anchors

The model works better if you include a subject (who/what) and a background/setting (where) in each prompt. If you want continuity in the next scene, continue referencing the same subject and/or background/setting. For example:
"A 3D animated scene. A **panda** walks along a path towards the camera in a park on a spring day."

"A 3D animated scene. A **panda** halts along a path in a park on a spring day."
Cinematic Long Takes

The model works better with long, cinematic takes and less well with rapid shot-by-shot transitions or fast cuts.

Long, Detailed Prompts

The model works better with long, detailed prompts. A helpful technique for extending prompts is to take a base prompt and ask an LLM chatbot (e.g. ChatGPT, Claude, Gemini) to write a more detailed version. If your base prompt is:
"A cartoon dog jumping and then running."
Then, the extended prompt could be:
"A cartoon dog with big expressive eyes and floppy ears suddenly leaps into the frame, tail wagging, and then sprints joyfully toward the camera. Its oversized paws pound playfully on the ground, tongue hanging out in excitement. The animation style is colorful, smooth, and bouncy, with exaggerated motion to emphasize energy and fun. The background blurs slightly with speed lines, giving a lively, comic-style effect as if the dog is about to jump right into the viewer."

Offline Generation

A test script can be used for offline generation. If the model weights have not been downloaded yet, download them first:
# Run from scope directory
uv run download_models --pipeline krea-realtime-video
Then:
# Run from scope directory
uv run -m scope.core.pipelines.krea_realtime_video.test
This will create an output.mp4 file in the krea_realtime_video directory.
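
If you are unsure where that directory lives in your checkout, one way to locate the file (assuming a POSIX shell):
# Locate the generated video; the exact path depends on your checkout layout
find . -name output.mp4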

See Also

Other Pipelines

  • StreamDiffusion V2: real-time streaming from the original StreamDiffusion creators
  • LongLive: smooth prompt transitions and extended generation from NVIDIA
  • RewardForcing: reward-matched training for improved output quality
  • MemFlow: memory bank for long-context consistency