Hanzo Studio
Visual AI engine for production workflows. Build image generation, video synthesis, 3D rendering, and audio processing pipelines with a drag-and-drop node editor -- then scale to GPU when you need real throughput.
Available at studio.hanzo.ai or self-hosted via Docker. Built on ComfyUI with full node compatibility.
docker pull ghcr.io/hanzoai/studio:latest
docker run -p 8188:8188 ghcr.io/hanzoai/studio:latest \
python main.py --listen 0.0.0.0 --cpu
# Open http://localhost:8188
Why Hanzo Studio?
- Visual Node Editor -- Wire together models, preprocessors, and outputs in a graph-based canvas. Every node is composable.
- GPU on Demand -- Start on CPU for prototyping. Switch to T4, A100, or H100 with one API call when you're ready for production.
- Distributed Workers -- Route GPU-heavy prompts to dedicated worker machines. The coordinator handles queuing, routing, and result aggregation.
- Multi-Tenant Isolation -- Each organization gets isolated storage, compute profiles, and usage tracking. IAM authentication via hanzo.id.
- Usage Billing -- Metered billing per prompt execution via Hanzo Commerce. Per-second pricing, minimum 1 credit per run.
- Full ComfyUI Compatibility -- Every custom node, model, and workflow from the ComfyUI ecosystem works out of the box, and every graph serializes to a JSON prompt you can submit over the API (see the example below). No vendor lock-in.
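A minimal prompt submission, assuming the ComfyUI-style prompt format (node IDs mapping to class_type and inputs). The single-node graph and checkpoint filename here are illustrative only:
curl -X POST https://studio.hanzo.ai/api/prompt \
  -H "Authorization: Bearer $HANZO_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": {
      "1": {
        "class_type": "CheckpointLoaderSimple",
        "inputs": {"ckpt_name": "sd_xl_base_1.0.safetensors"}
      }
    }
  }'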
Compute Profiles
Each organization gets a compute profile that determines how prompts are executed. Switch profiles at any time via the API.
| Profile | Device | GPU | VRAM | Best For |
|---|---|---|---|---|
| cpu | CPU | -- | -- | Prototyping, lightweight workflows |
| gpu-basic | CUDA | T4 | 16 GB | Image generation, standard diffusion models |
| gpu-pro | CUDA | A100 | 80 GB | Large models, video generation, 3D |
| gpu-max | CUDA | H100 | 80 GB | Maximum throughput, batch processing |
| custom | CUDA | Any | Any | Bring your own instance type |
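For the custom profile, you supply your own instance type when updating the config. A hedged sketch; the instance_type field and its value are assumptions, since only active_profile, gpu_type, and auto_provision appear in the documented examples:
# The instance_type field is an assumption, not a documented parameter.
curl -X PUT https://studio.hanzo.ai/api/compute/config \
  -H "Authorization: Bearer $HANZO_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "active_profile": "custom",
    "instance_type": "a2-highgpu-1g",
    "auto_provision": true
  }'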
Switch to GPU
curl -X PUT https://studio.hanzo.ai/api/compute/config \
-H "Authorization: Bearer $HANZO_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"active_profile": "gpu-basic",
"gpu_type": "t4",
"auto_provision": true
}'
When auto_provision is enabled, Studio automatically launches a GPU VM through Visor on the first GPU prompt. The worker registers itself within 60 seconds and begins accepting work.
Check the current configuration at any time:
curl https://studio.hanzo.ai/api/compute/config \
-H "Authorization: Bearer $HANZO_TOKEN"GPU Workers
Studio supports distributed execution via worker mode. A coordinator (the main Studio instance) routes prompts to GPU workers based on the active compute profile.
How It Works
Client ──prompt──▸ Coordinator (studio.hanzo.ai)
│
├── CPU prompt ──▸ local execution
├── GPU prompt ──▸ Worker 1 (T4, 16GB)
└── GPU prompt ──▸ Worker 2 (A100, 80GB)
- Client submits a prompt with an optional device_preference
- Coordinator checks the org's compute profile
- If GPU is requested, the prompt is forwarded to an available GPU worker via HTTP
- Worker executes locally and returns the result
- Coordinator forwards the result back to the client WebSocket; pending work is visible on the queue endpoint, as shown below
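To see what the coordinator is holding at any moment, query the queue (documented in the API reference below):
curl https://studio.hanzo.ai/api/queue \
  -H "Authorization: Bearer $HANZO_TOKEN"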
Launch a GPU Worker
curl -X POST https://studio.hanzo.ai/api/compute/provision \
-H "Authorization: Bearer $HANZO_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"gpu_type": "t4",
"worker_id": "my-gpu-worker"
}'
Workers auto-register with the coordinator via heartbeats every 30 seconds. Stale workers (no heartbeat for 90s) are removed from the pool.
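The registration endpoint is internal, but replaying a heartbeat by hand can help when debugging a worker that never appears in the pool. A sketch; the payload field names are assumptions:
# Payload fields below are assumptions; only the endpoint itself is documented.
curl -X POST https://studio.hanzo.ai/api/compute/workers/register \
  -H "Authorization: Bearer $HANZO_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "worker_id": "gpu-worker-1",
    "device": "cuda",
    "gpu_type": "t4",
    "vram_gb": 16
  }'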
Self-Hosted Workers
Run a Studio instance as a headless GPU worker that registers with your coordinator:
docker run --gpus all -p 8188:8188 \
ghcr.io/hanzoai/studio:latest \
python main.py \
--listen 0.0.0.0 \
--worker-mode \
--coordinator-url http://your-coordinator:8188 \
--worker-id gpu-worker-1
List Workers
curl https://studio.hanzo.ai/api/compute/workers \
-H "Authorization: Bearer $HANZO_TOKEN"Returns status, device info, VRAM, and last heartbeat for each registered worker.
Prompt Routing
Control where prompts execute with the device_preference field:
curl -X POST https://studio.hanzo.ai/api/prompt \
-H "Authorization: Bearer $HANZO_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"prompt": { ... },
"device_preference": "gpu"
}'
| Preference | Behavior |
|---|---|
| auto (default) | GPU if enabled in profile and a worker is available, else CPU |
| cpu | Always execute locally on CPU |
| gpu | Route to a GPU worker; returns 503 if none is available |
When auto_provision is enabled and no GPU worker is available, Studio returns 202 Accepted and begins provisioning. Poll the workers endpoint for status.
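That handshake is easy to script. A minimal sketch, assuming workflow.json holds a graph exported from the editor and that registered workers appear with a worker_id field (an assumption):
# Submit; 202 means Studio is provisioning a GPU worker via Visor.
STATUS=$(curl -s -o /dev/null -w '%{http_code}' \
  -X POST https://studio.hanzo.ai/api/prompt \
  -H "Authorization: Bearer $HANZO_TOKEN" \
  -H "Content-Type: application/json" \
  -d @workflow.json)

if [ "$STATUS" = "202" ]; then
  # Poll the worker pool until a worker registers, then resubmit.
  until curl -s https://studio.hanzo.ai/api/compute/workers \
      -H "Authorization: Bearer $HANZO_TOKEN" | grep -q '"worker_id"'; do
    sleep 10
  done
  curl -X POST https://studio.hanzo.ai/api/prompt \
    -H "Authorization: Bearer $HANZO_TOKEN" \
    -H "Content-Type: application/json" \
    -d @workflow.json
fi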
API Reference
| Method | Endpoint | Description |
|---|---|---|
| POST | /api/prompt | Submit a workflow prompt |
| GET | /api/queue | Queue status and pending jobs |
| GET | /api/jobs | List jobs with filtering |
| GET | /api/system_stats | Device info, memory, VRAM |
| GET | /api/compute/config | Get org compute profile |
| PUT | /api/compute/config | Update compute profile |
| GET | /api/compute/workers | List registered workers |
| POST | /api/compute/workers/register | Worker heartbeat (internal) |
| POST | /api/compute/provision | Launch GPU VM via Visor |
| DELETE | /api/compute/provision | Terminate GPU VM |
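The only endpoint not shown earlier is the teardown call. A sketch; the request body is an assumption (the provision call above identifies workers by worker_id, so the terminate call presumably does too):
# The worker_id body field is an assumption.
curl -X DELETE https://studio.hanzo.ai/api/compute/provision \
  -H "Authorization: Bearer $HANZO_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"worker_id": "my-gpu-worker"}'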
Self-Hosted Deployment
CPU Only
docker run -d --name studio \
-p 8188:8188 \
-v studio-data:/app/user \
ghcr.io/hanzoai/studio:latest \
python main.py --listen 0.0.0.0 --cpu
GPU (NVIDIA)
docker run -d --name studio --gpus all \
-p 8188:8188 \
-v studio-data:/app/user \
ghcr.io/hanzoai/studio:latest \
python main.py --listen 0.0.0.0
With IAM Auth + Billing
services:
studio:
image: ghcr.io/hanzoai/studio:latest
ports:
- "8188:8188"
environment:
STUDIO_ENABLE_IAM_AUTH: "true"
STUDIO_IAM_URL: "https://hanzo.id"
STUDIO_NO_LOCALHOST_BYPASS: "true"
STUDIO_MULTI_TENANT: "true"
STUDIO_ORG_ID: "my-org"
STUDIO_ENABLE_BILLING: "true"
STUDIO_COMMERCE_URL: "http://commerce.hanzo.svc:8001"
STUDIO_COMMERCE_TOKEN: "${COMMERCE_TOKEN}"
volumes:
- studio-data:/app/user
command: ["python", "main.py", "--listen", "0.0.0.0", "--cpu"]
volumes:
studio-data:
Environment Variables
| Variable | Default | Description |
|---|---|---|
| STUDIO_ENABLE_IAM_AUTH | false | Enable IAM authentication via hanzo.id |
| STUDIO_IAM_URL | https://hanzo.id | IAM server URL |
| STUDIO_NO_LOCALHOST_BYPASS | false | Require auth even for localhost |
| STUDIO_MULTI_TENANT | false | Per-org storage isolation |
| STUDIO_ORG_ID | -- | Default organization ID |
| STUDIO_ENABLE_BILLING | false | Enable Commerce usage billing |
| STUDIO_COMMERCE_URL | http://commerce.hanzo.svc:8001 | Commerce API endpoint |
| STUDIO_RATE_LIMIT_RPM | 60 | Max prompts per minute per org |
| STUDIO_ENABLE_METRICS | false | Enable Prometheus /metrics endpoint |
| STUDIO_WORKER_MODE | false | Run as headless GPU worker |
| COORDINATOR_URL | -- | Coordinator URL (worker mode only) |
| WORKER_ID | -- | Unique worker identifier |
| VISOR_URL | https://visor.hanzo.ai | Visor API for GPU provisioning |
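For example, enabling metrics on a local CPU instance and scraping the endpoint (the -e flag mirrors the compose environment above; /metrics is the path named in the table):
docker run -d --name studio -p 8188:8188 \
  -e STUDIO_ENABLE_METRICS=true \
  ghcr.io/hanzoai/studio:latest \
  python main.py --listen 0.0.0.0 --cpu

curl http://localhost:8188/metrics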
Get Started
- Sign in at studio.hanzo.ai with your Hanzo account
- Build a workflow in the visual node editor
- Submit prompts -- runs on CPU by default
- When you need GPU, call PUT /api/compute/config to switch profiles
- Studio provisions GPU workers automatically and routes your prompts (see the sketch below)
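Steps 3-5 as one hedged sketch, assuming workflow.json is a graph exported from the editor:
# 3. Submit a prompt -- executes on CPU under the default profile
curl -X POST https://studio.hanzo.ai/api/prompt \
  -H "Authorization: Bearer $HANZO_TOKEN" \
  -H "Content-Type: application/json" \
  -d @workflow.json

# 4. Switch the org to a T4 profile with auto-provisioning
curl -X PUT https://studio.hanzo.ai/api/compute/config \
  -H "Authorization: Bearer $HANZO_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"active_profile": "gpu-basic", "gpu_type": "t4", "auto_provision": true}'

# 5. Subsequent prompts route to the GPU worker automatically
curl -X POST https://studio.hanzo.ai/api/prompt \
  -H "Authorization: Bearer $HANZO_TOKEN" \
  -H "Content-Type: application/json" \
  -d @workflow.json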