Dream Engines

model.predict

Run one rollout against a ModelHandle. Returns a Rollout with lazy frame decode + cost / wall metadata.

Signature

PYTHON
def predict(
    self,
    *,
    start_frame: np.ndarray | PIL.Image.Image | bytes | str | Path | None = None,
    actions: np.ndarray | list | bytes | str | Path | None = None,
    # Phase 0 byte-level surface (kept for power users)
    frame_bytes: bytes | None = None,
    actions_bytes: bytes | None = None,
    frame_path: str | Path | None = None,
    actions_path: str | Path | None = None,
    num_steps: int | None = None,   # override diffusion step count
    guidance: float | None = None,  # override classifier-free guidance
    seed: int = 0,                  # deterministic seed
) -> Rollout

Pass either start_frame + actions (recommended) or the byte-level args. Mixing the two raises dream.InputValidationError.

Inputs — start_frame

  • np.ndarray (H, W, 3) uint8: encoded to PNG, sent as multipart
  • np.ndarray (3, H, W) uint8: auto-transposed to (H, W, 3)
  • np.ndarray float in [0, 1]: scaled to uint8, then encoded
  • PIL.Image.Image: re-encoded to PNG
  • bytes: passed through (assumes PNG/JPEG)
  • str / Path: read from disk as bytes

Validation runs at the SDK boundary — wrong shape, wrong dtype, or missing path raises dream.InputValidationError before the request hits the network.
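The ndarray rows of the table above can be sketched as a small normalization helper. This is illustrative only — the names (normalize_start_frame, the stand-in InputValidationError) are not the SDK's internals:

```python
import numpy as np


class InputValidationError(ValueError):
    """Stand-in for dream.InputValidationError."""


def normalize_start_frame(frame) -> np.ndarray:
    """Coerce an ndarray start_frame to (H, W, 3) uint8, as the table describes."""
    arr = np.asarray(frame)
    # (3, H, W) channel-first -> (H, W, 3) channel-last
    if arr.ndim == 3 and arr.shape[0] == 3 and arr.shape[2] != 3:
        arr = arr.transpose(1, 2, 0)
    # float in [0, 1] -> uint8
    if np.issubdtype(arr.dtype, np.floating):
        if arr.min() < 0 or arr.max() > 1:
            raise InputValidationError("float frame must lie in [0, 1]")
        arr = (arr * 255).round().astype(np.uint8)
    # everything must land here as (H, W, 3) uint8
    if arr.dtype != np.uint8 or arr.ndim != 3 or arr.shape[2] != 3:
        raise InputValidationError(
            f"expected (H, W, 3) uint8, got shape {arr.shape} dtype {arr.dtype}"
        )
    return arr
```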

Inputs — actions

Required shape: (T, action_dim) float32, where action_dim matches the model spec (model.action_dim; 384 for GR-1) and T is a multiple of model.chunk_size (12 for GR-1). The canonical length for GR-1-class specs is 48 frames = 4 chunks (see Frames, chunks, fps).

  • np.ndarray (T, action_dim): cast to float32, saved as .npy
  • np.ndarray (T, action_dim) float64: silently downcast to float32
  • nested list / tuple: converted via np.asarray
  • bytes: passed through (assumes .npy blob)
  • str / Path: read from disk

action_dim mismatch raises dream.InputValidationError with a specific message ("model expects 384, got array with shape (49, 100)").
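A hedged sketch of the actions-side checks described above (stand-in error class and defaults; the real validation lives inside the SDK):

```python
import numpy as np


class InputValidationError(ValueError):
    """Stand-in for dream.InputValidationError."""


def validate_actions(actions, action_dim: int = 384, chunk_size: int = 12) -> np.ndarray:
    """Coerce actions to (T, action_dim) float32 and check T % chunk_size."""
    # nested lists/tuples and float64 arrays both land here as float32
    arr = np.asarray(actions, dtype=np.float32)
    if arr.ndim != 2 or arr.shape[1] != action_dim:
        raise InputValidationError(
            f"model expects {action_dim}, got array with shape {arr.shape}"
        )
    if arr.shape[0] % chunk_size != 0:
        raise InputValidationError(
            f"T={arr.shape[0]} is not a multiple of chunk_size={chunk_size}"
        )
    return arr
```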

The result — Rollout

PYTHON
@dataclass
class Rollout:
    mp4_bytes: bytes              # raw mp4 — always populated
    request_id: str               # server-assigned UUID
    engine_wall_ms: float         # server-side Engine.predict() wall
    cost_usd: float               # frames × tier_price
    customer_id: str              # Stripe customer
    psnr_db: float | None = None  # populated when score=True (future)
    ssim: float | None = None
    lpips: float | None = None

Convenience accessors:

PYTHON
rollout.cost_usd        # dollars billed for this rollout
rollout.wall_s          # engine_wall_ms / 1000
rollout.frames          # (T, H, W, 3) uint8 numpy ndarray (decoded lazily)
rollout.save("out.mp4") # writes mp4_bytes to disk; returns Path

rollout.frames triggers a decode via mediapy on first access and caches the array. If the [decode] extra isn't installed, frames returns None; mp4_bytes is always available.
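The decode-on-first-access behavior can be sketched as below. The decoder hook and the temp-file round-trip through mediapy are illustrative assumptions, not the SDK's internals:

```python
import tempfile

import numpy as np


def _mediapy_decode(mp4_bytes: bytes):
    """Default decoder: use mediapy if the [decode] extra is installed, else None."""
    try:
        import mediapy  # provided by the [decode] extra
    except ImportError:
        return None  # frames unavailable; mp4_bytes always is
    # mediapy reads from a path, so round-trip the bytes through a temp file
    with tempfile.NamedTemporaryFile(suffix=".mp4") as f:
        f.write(mp4_bytes)
        f.flush()
        return np.asarray(mediapy.read_video(f.name))  # (T, H, W, 3) uint8


class LazyRollout:
    """Decode on first access, then cache — the way rollout.frames behaves."""

    def __init__(self, mp4_bytes: bytes, decoder=_mediapy_decode):
        self.mp4_bytes = mp4_bytes
        self._decoder = decoder
        self._decoded = False
        self._frames = None

    @property
    def frames(self):
        if not self._decoded:        # decode happens on the first access only
            self._frames = self._decoder(self.mp4_bytes)
            self._decoded = True     # cache even a None result
        return self._frames
```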

Examples

Numpy in, mp4 out

PYTHON
import numpy as np
import dream
client = dream.Client()
model = client.models.get("dreamdojo-2b-gr1")
start_frame = np.zeros((480, 640, 3), dtype=np.uint8) # your real start
actions = np.load("teleop.npy") # (48, 384)
rollout = model.predict(start_frame=start_frame, actions=actions)
rollout.save("out.mp4")

Override diffusion steps

PYTHON
fast = model.predict(
    start_frame=img, actions=actions,
    num_steps=20,  # default 35 for DreamDojo; halving roughly halves wall at a small quality cost
)

From file paths

PYTHON
rollout = model.predict(
    start_frame="/tmp/start.png",
    actions="/tmp/teleop.npy",
)

Cost vs wall

The two clocks are independent:

  • Server-side wall (rollout.wall_s) ≈ 2.6 s for DreamDojo on H100 (warm).
  • Client-side wall = server wall + transit (~1 s) + cold-start (0 if warm, ~75 s if cold).

You're billed on the server's frame count, not on either wall.
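As a back-of-the-envelope sketch of the two clocks and the bill (tier_price_usd here is a hypothetical per-frame rate, not a published price):

```python
def estimate_client_wall_s(engine_wall_s: float, transit_s: float = 1.0,
                           warm: bool = True, cold_start_s: float = 75.0) -> float:
    """Client-side wall as described above: engine wall + transit + cold-start."""
    return engine_wall_s + transit_s + (0.0 if warm else cold_start_s)


def estimate_cost_usd(server_frame_count: int, tier_price_usd: float) -> float:
    """Billing is server frame count times per-frame tier price, not wall time."""
    return server_frame_count * tier_price_usd
```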

Errors

  • dream.InputValidationError — wrong shape, dtype, missing path, conflicting args.
  • dream.ModelNotActiveError — handle slug isn't the server's active spec.
  • dream.AuthError, dream.RateLimitError, dream.ModelNotFoundError, dream.ServerError — network-side. See Errors & retries.
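One way to handle these around a predict call, sketched with stand-in exception classes and an injected callable so the example stays self-contained — only rate limits are worth retrying, while validation errors are client-side bugs:

```python
import time


class InputValidationError(ValueError):
    """Stand-in for dream.InputValidationError."""


class RateLimitError(RuntimeError):
    """Stand-in for dream.RateLimitError."""


def predict_with_retry(predict, max_attempts: int = 3, base_delay_s: float = 1.0):
    """Retry rate-limited calls with exponential backoff; never retry bad input."""
    for attempt in range(max_attempts):
        try:
            return predict()  # e.g. lambda: model.predict(start_frame=..., actions=...)
        except InputValidationError:
            raise  # client-side bug: fix the inputs, retrying won't help
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise
            time.sleep(base_delay_s * 2 ** attempt)
```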