Overview

This guide walks you through sending video input to our StreamDiffusion pipeline. You will learn how to adjust parameters to create a variety of visual effects, use live streaming and audio interactivity features, generate real-time visuals, and view the resulting output video.

API Auth

The Daydream API is currently in closed beta. To request access, please fill out this form, and the Daydream team will be in touch within 48 hours. In the meantime, we invite you to join the conversation and meet fellow builders in the Daydream Discord server.
API key usage is currently subsidized for a limited time; we will provide an update on pricing in the future.
The API uses Bearer auth. Include an Authorization header on every request:
Authorization: Bearer <YOUR_API_KEY>
Keep your API keys secure. Do not commit them to source control or share them publicly.

API Documentation

The full API reference is available in the sidebar and contains more in-depth descriptions of all parameters.

Creating Your First App

Building on top of our StreamDiffusion pipeline consists of three parts:
  1. Creating a Stream object
  2. Sending in video and playing the output
  3. Setting StreamDiffusion parameters

1. Create a Stream

First, we need to create a ‘Stream’ object. This will provide us with an input URL (to send in video) and an output URL (to play back the modified video).
All examples use Bash so you can copy/paste into a terminal.

Create Stream Request

# This can currently be hardcoded, but in the future will let you switch
# between different types of AI video manipulation
PIPELINE_ID="pip_qpUgXycjWF6YMeSL"

# Change this to the API key you generated
DAYDREAM_API_KEY="<YOUR_API_KEY>"

curl -X POST \
  "https://api.daydream.live/v1/streams" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ${DAYDREAM_API_KEY}" \
  -d "{\"pipeline_id\":\"${PIPELINE_ID}\"}"

Create Stream Response

{
  "id": "str_gERGnGZE4331XBxW",
  "output_playback_id": "0d1crgzijlcsxpw4",
  "whip_url": "https://ai.livepeer.com/live/video-to-video/stk_abc123/whip"
}
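
If you're scripting the setup, you can capture this response and pull out the fields used in the next steps. A minimal sketch, assuming jq is installed; the variable names (RESPONSE, STREAM_ID, WHIP_URL, PLAYBACK_ID) are just for illustration:

# Create the stream and keep the JSON response
RESPONSE=$(curl -s -X POST \
  "https://api.daydream.live/v1/streams" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ${DAYDREAM_API_KEY}" \
  -d "{\"pipeline_id\":\"${PIPELINE_ID}\"}")

# Extract the values you'll need below
STREAM_ID=$(echo "$RESPONSE" | jq -r '.id')
WHIP_URL=$(echo "$RESPONSE" | jq -r '.whip_url')
PLAYBACK_ID=$(echo "$RESPONSE" | jq -r '.output_playback_id')

echo "Stream ID:   $STREAM_ID"
echo "WHIP ingest: $WHIP_URL"
echo "Playback:    https://lvpr.tv/?v=$PLAYBACK_ID"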

2. Send video and play the output

Now we’ll start sending in video and view the processed output.
  1. Install OBS.
  2. Copy the whip_url from the Create Stream response.
  3. In OBS → Settings → Stream: choose WHIP as the Service and paste the whip_url as the Server. Leave the Bearer Token blank and save the settings.
  4. Under the Sources section, add a video source for the stream (e.g., Video Capture Device).
  5. Under the Controls section, select Start Streaming to start the stream.
  6. Copy the output_playback_id from the Create Stream response and open: https://lvpr.tv/?v=<your output_playback_id>
  7. You should now see your output video playing.
Prefer to use your own player?
  1. Fetch playback URLs from Livepeer Studio’s Playback endpoint: curl "https://livepeer.studio/api/playback/<your playback id>"
  2. Choose either the HLS or WebRTC endpoint.
  3. Configure your chosen player with that URL.
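
For example, assuming PLAYBACK_ID holds your output_playback_id (as in the earlier sketch), you can inspect the available playback sources; the exact response shape is defined by Livepeer Studio's playback endpoint:

# Fetch playback info for the output stream and pretty-print it
curl -s "https://livepeer.studio/api/playback/${PLAYBACK_ID}" | jq .

# The response lists playback sources (HLS and/or WebRTC). Copy the URL for the
# protocol your player supports and pass it to your player's configuration.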

3. Set StreamDiffusion parameters

Send a POST to https://api.daydream.live/beta/streams/<YOUR_STREAM_ID>/prompts with the full JSON body below to control the effect.
It’s currently required to send the full body on every update. A PATCH endpoint for partial updates is coming soon.
{
  "model_id": "streamdiffusion",
  "pipeline": "live-video-to-video",
  "params": {
    "model_id": "stabilityai/sd-turbo",
    "prompt": "superman",
    "prompt_interpolation_method": "slerp",
    "normalize_prompt_weights": true,
    "normalize_seed_weights": true,
    "negative_prompt": "blurry, low quality, flat, 2d",
    "num_inference_steps": 50,
    "seed": 42,
    "t_index_list": [0, 8, 17],
    "controlnets": [
      {
        "conditioning_scale": 0,
        "control_guidance_end": 1,
        "control_guidance_start": 0,
        "enabled": true,
        "model_id": "thibaud/controlnet-sd21-openpose-diffusers",
        "preprocessor": "pose_tensorrt",
        "preprocessor_params": {}
      },
      {
        "conditioning_scale": 0,
        "control_guidance_end": 1,
        "control_guidance_start": 0,
        "enabled": true,
        "model_id": "thibaud/controlnet-sd21-hed-diffusers",
        "preprocessor": "soft_edge",
        "preprocessor_params": {}
      },
      {
        "conditioning_scale": 0,
        "control_guidance_end": 1,
        "control_guidance_start": 0,
        "enabled": true,
        "model_id": "thibaud/controlnet-sd21-canny-diffusers",
        "preprocessor": "canny",
        "preprocessor_params": {"high_threshold": 200, "low_threshold": 100}
      },
      {
        "conditioning_scale": 0,
        "control_guidance_end": 1,
        "control_guidance_start": 0,
        "enabled": true,
        "model_id": "thibaud/controlnet-sd21-depth-diffusers",
        "preprocessor": "depth_tensorrt",
        "preprocessor_params": {}
      },
      {
        "conditioning_scale": 0,
        "control_guidance_end": 1,
        "control_guidance_start": 0,
        "enabled": true,
        "model_id": "thibaud/controlnet-sd21-color-diffusers",
        "preprocessor": "passthrough",
        "preprocessor_params": {}
      }
    ]
  }
}
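
Assuming you save the body above to a file named params.json (a filename chosen for this example) and still have STREAM_ID and DAYDREAM_API_KEY set from the earlier steps, the update call looks like this:

# Send the full parameter body to the prompts endpoint for your stream
curl -X POST \
  "https://api.daydream.live/beta/streams/${STREAM_ID}/prompts" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ${DAYDREAM_API_KEY}" \
  -d @params.json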

Parameter change examples

Experiment with any of the parameters. A few useful ones:
  • prompt: Guides the model toward a desired visual style or subject.
  • negative_prompt: Tells the model what not to produce (e.g., discourages low-quality, flat, blurry results).
  • num_inference_steps: Higher values improve quality at the cost of speed/FPS.
  • seed: Ensures reproducibility across runs. Change it to introduce variation.
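
For example, assuming the full body is saved as params.json (as above), you could tweak a few of these values with jq and re-send the whole document; the parameter values here are purely illustrative:

# Patch a few parameters in the saved body; the jq paths follow the layout shown above
jq '.params.prompt = "a watercolor painting of a city at night"
    | .params.num_inference_steps = 30
    | .params.seed = 7' params.json > params_new.json

# Re-send the full body with the same POST call shown earlier
curl -X POST \
  "https://api.daydream.live/beta/streams/${STREAM_ID}/prompts" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ${DAYDREAM_API_KEY}" \
  -d @params_new.json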

ControlNets

ControlNets guide image generation by providing extra structural inputs (poses, edges, depth maps, colors) that help the model interpret the input video. They impact performance differently, so experiment with which ones to enable for your use case.
Don’t toggle the enabled field to turn ControlNets on/off, as it currently triggers a pipeline reload. Set conditioning_scale to 0 to effectively disable a ControlNet, or raise it above 0 to enable its influence.
To enable a ControlNet, increase its conditioning_scale:
{
  "conditioning_scale": 0.22,
  "control_guidance_end": 1,
  "control_guidance_start": 0,
  "enabled": true,
  "model_id": "thibaud/controlnet-sd21-openpose-diffusers",
  "preprocessor": "pose_tensorrt",
  "preprocessor_params": {}
}
Available ControlNets in the example body:
  • Pose Estimation: Body and hand pose tracking to maintain human poses in the output.
  • Edge Detection (HED): Soft edge detection preserving smooth edges and contours.
  • Canny Edge Detection: Sharp edge preservation with crisp outlines and details.
  • Depth Estimation: Preserves spatial depth and 3D structure of objects and faces.
  • Color Preservation: Color composition passthrough to maintain palette and composition.
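
Because the full body must be sent on every update, a convenient pattern is to edit the relevant ControlNet entry in your saved body and re-post it. A sketch assuming params.json from above; the array index follows the order of the example body (0 = pose, 1 = HED, 2 = canny, 3 = depth, 4 = color), and 0.4 is just an illustrative strength:

# Give the depth ControlNet (index 3 in the example body) a moderate influence
jq '.params.controlnets[3].conditioning_scale = 0.4' params.json > params_new.json

# Re-send the full body to apply the change, using the same POST call shown earlier
curl -X POST \
  "https://api.daydream.live/beta/streams/${STREAM_ID}/prompts" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ${DAYDREAM_API_KEY}" \
  -d @params_new.json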