Dream Engines

Run your first model

Roll out a video world model on Dream Engine in five minutes. Install the SDK, drop in your API key, and stream rollouts back from a hosted DreamDojo checkpoint on an H100 — no Docker, no GPU setup, no Python service to wire up.

This tutorial calls DreamDojo 2B · GR-1, the action-conditioned world model post-trained on Fourier GR-1 humanoid teleop data. The model is already deployed on Modal H100s; you just send a start frame plus an action sequence and stream the resulting mp4 back.

Set up your environment

You need Python ≥3.10 and a Dream Engines account with an API key.

Install the SDK

BASH
pip install dream-engine

To decode the returned mp4 into numpy frames (lazy, triggered on first access to rollout.frames; see the example after your first rollout below), install the [decode] extra:

BASH
pip install "dream-engine[decode]"

Authenticate

Generate an API key from Settings → API keys, then export it:

BASH
export DREAM_API_KEY="dre_..."

The SDK reads DREAM_API_KEY from the environment automatically. Prefer to pass the key explicitly?

PYTHON
import dream

client = dream.Client(api_key="dre_...")
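
Either way, a quick sanity check before the first call can save a confusing auth error. A minimal sketch; it only assumes real keys share the dre_ prefix of the placeholder above:

PYTHON
import os

# Fail fast if the key was never exported or got mangled in a shell profile.
key = os.environ.get("DREAM_API_KEY", "")
assert key.startswith("dre_"), "DREAM_API_KEY is missing or malformed"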

New accounts include free credits. Each rollout in this tutorial runs on a warm H100 — typical wall time is ~3 s at roughly $0.02 per call.

Pick a model

Dream Engine ships hosted, post-trained DreamDojo checkpoints for several embodiments. Each spec in the catalog pins an embodiment, a resolution, and a rollout length:

Spec             | Embodiment            | Resolution | Frames      | Use it for
dreamdojo-2b-gr1 | Fourier GR-1 humanoid | 480 × 640  | 48 @ 10 fps | Bimanual manipulation, household tasks
dreamdojo-2b-g1  | Unitree G1 humanoid   | 480 × 640  | 48 @ 10 fps | Locomotion + whole-body control
dreamdojo-2b-yam | Bimanual YAM          | 480 × 640  | 48 @ 10 fps | Tabletop bimanual, fine grasping

Pick any spec by passing its ID to client.models.get(...).
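
For example, to target the Unitree G1 instead of the GR-1 spec this tutorial uses:

PYTHON
import dream

client = dream.Client()
model = client.models.get("dreamdojo-2b-g1")  # Unitree G1 humanoid spec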

Run a rollout

PYTHON
import dream

client = dream.Client()
model = client.models.get("dreamdojo-2b-gr1")

rollout = model.predict(
    start_frame="start.png",
    actions="actions.npy",  # shape (48, 384) float32
)

print(f"cost: ${rollout.cost_usd}, wall: {rollout.wall_s:.2f}s")
rollout.save("rollout.mp4")

That's it. The mp4 contains 48 frames at 480 × 640, 10 fps.

You should see something like:

cost: $0.0192, wall: 2.81s
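
If you installed the [decode] extra, the same rollout object also exposes the frames as a numpy array. A minimal sketch continuing from the rollout above; the exact shape and dtype are inferred from the spec table, not documented guarantees:

PYTHON
import numpy as np

# Requires the [decode] extra. The mp4 is parsed on first access to
# rollout.frames, so this line is where the decode cost lands.
frames = rollout.frames
print(frames.shape, frames.dtype)  # expecting (48, 480, 640, 3) uint8
np.save("frames.npy", np.asarray(frames))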

Don't have a frame and action sequence handy? The SDK ships a deterministic synthetic example for smoke tests:

PYTHON
import dream

img, actions = dream.examples.dreamdojo_grasp()
# img: (480, 640, 3) uint8 — gradient + crosshair
# actions: (48, 384) float32 — low-amplitude sinusoid

rollout = dream.Client().models.get("dreamdojo-2b-gr1").predict(
    start_frame=img, actions=actions,
)
rollout.save("synthetic.mp4")

The synthetic actions are out-of-distribution — the rollout will look nonsensical — but the wire shape is exactly what the engine expects, which is enough to confirm your install and credentials are working.
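
If you want that check to live in CI, the same call wraps neatly in a test. A sketch using only names shown above; the nonzero-cost and nonzero-wall assertions are assumptions about sensible values, not documented invariants:

PYTHON
import dream

def test_dream_engine_smoke():
    img, actions = dream.examples.dreamdojo_grasp()
    rollout = dream.Client().models.get("dreamdojo-2b-gr1").predict(
        start_frame=img, actions=actions,
    )
    # A successful round trip should report a real charge and wall time.
    assert rollout.cost_usd > 0
    assert rollout.wall_s > 0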

What just happened

With a single predict call, you ran a production video world model end to end, without standing up an endpoint of your own. Here's what Dream Engine did:

  1. models.get(slug) hit GET /v1/models/{slug} to confirm the spec exists and is active on the engine version your account is pinned to.
  2. model.predict(...) posted the start frame (PNG) and action sequence (npy) as multipart to /v1/predict. The server ran DreamDojo's chunked rectified-flow rollout on a warm H100 — bf16, 35 rectified-flow steps, 49 latent frames.
  3. The response came back as a raw mp4. The SDK parsed X-DreamEngine-Estimated-Charge-USD and X-DreamEngine-Engine-Wall-Ms into rollout.cost_usd and rollout.wall_s.
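
For the curious, here is roughly what that exchange looks like without the SDK. This is a sketch, not a supported recipe: the base URL, bearer-auth scheme, and multipart field names are assumptions, while the raw-mp4 body and the two billing headers are the ones listed above.

PYTHON
import os

import requests

BASE = "https://api.example.com/v1"  # hypothetical base URL
headers = {"Authorization": f"Bearer {os.environ['DREAM_API_KEY']}"}  # assumed scheme

with open("start.png", "rb") as frame, open("actions.npy", "rb") as acts:
    resp = requests.post(
        f"{BASE}/predict",
        headers=headers,
        data={"model": "dreamdojo-2b-gr1"},  # field name assumed
        files={"start_frame": frame, "actions": acts},  # field names assumed
    )
resp.raise_for_status()

# The body is the raw mp4; billing metadata rides in the response headers.
with open("rollout.mp4", "wb") as f:
    f.write(resp.content)
print(resp.headers.get("X-DreamEngine-Estimated-Charge-USD"))
print(resp.headers.get("X-DreamEngine-Engine-Wall-Ms"))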

No model.py, no Docker, no inference-server config. The same pattern works for every DreamDojo spec — swap the slug in models.get(...) and you're calling a different embodiment.
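
Concretely, sweeping the whole catalog is one loop. A sketch that reuses the synthetic GR-1 inputs from earlier; whether every spec accepts the same (48, 384) action layout is an assumption, so substitute per-embodiment inputs in practice:

PYTHON
import dream

client = dream.Client()
img, actions = dream.examples.dreamdojo_grasp()

# Same predict call for each embodiment; only the slug changes.
for slug in ["dreamdojo-2b-gr1", "dreamdojo-2b-g1", "dreamdojo-2b-yam"]:
    rollout = client.models.get(slug).predict(start_frame=img, actions=actions)
    rollout.save(f"{slug}.mp4")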

Next steps

  • Authentication — how API keys and rate limits actually work.
  • predict reference — every argument, including start_frame=np.ndarray and PIL.Image paths.
  • Visual MPC — score K candidate rollouts in one server roundtrip.
  • Bulk inference — one client.predict_many(...) call across a whole dataset.