Hanzo Studio
Visual AI engine for production workflows. Build image generation, video synthesis, 3D rendering, and audio processing pipelines with a drag-and-drop node editor -- then scale to GPU when you need real throughput.
Available at studio.hanzo.ai or self-hosted via Docker. Built on ComfyUI with full node compatibility.
docker pull ghcr.io/hanzoai/studio:latest
docker run -p 8188:8188 ghcr.io/hanzoai/studio:latest \
python main.py --listen 0.0.0.0 --cpu
# Open http://localhost:8188
Why Hanzo Studio?
- Visual Node Editor -- Wire together models, preprocessors, and outputs in a graph-based canvas. Every node is composable.
- GPU on Demand -- Start on CPU for prototyping. Switch to T4, A100, or H100 with one API call when you're ready for production.
- Distributed Workers -- Route GPU-heavy prompts to dedicated worker machines. The coordinator handles queuing, routing, and result aggregation.
- Multi-Tenant Isolation -- Each organization gets isolated storage, compute profiles, and usage tracking. IAM authentication via hanzo.id.
- Usage Billing -- Metered billing per prompt execution via Hanzo Commerce. Per-second pricing, minimum 1 credit per run.
- Full ComfyUI Compatibility -- Every custom node, model, and workflow from the ComfyUI ecosystem works out of the box, and every graph serializes to a JSON prompt you can submit over the API (see the example below). No vendor lock-in.
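A minimal prompt submission, assuming the ComfyUI-style prompt format (node IDs mapping to class_type and inputs). The single-node graph and checkpoint filename here are illustrative only:
curl -X POST https://studio.hanzo.ai/api/prompt \
  -H "Authorization: Bearer $HANZO_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": {
      "1": {
        "class_type": "CheckpointLoaderSimple",
        "inputs": {"ckpt_name": "sd_xl_base_1.0.safetensors"}
      }
    }
  }'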
Compute Profiles
Each organization gets a compute profile that determines how prompts are executed. Switch profiles at any time via the API.
| Profile | Device | GPU | VRAM | Best For |
|---|---|---|---|---|
| cpu | CPU | -- | -- | Prototyping, lightweight workflows |
| gpu-basic | CUDA | T4 | 16 GB | Image generation, standard diffusion models |
| gpu-pro | CUDA | A100 | 80 GB | Large models, video generation, 3D |
| gpu-max | CUDA | H100 | 80 GB | Maximum throughput, batch processing |
| custom | CUDA | Any | Any | Bring your own instance type |
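For the custom profile, you supply your own instance type when updating the config. A hedged sketch; the instance_type field and its value are assumptions, since only active_profile, gpu_type, and auto_provision appear in the documented examples:
# The instance_type field is an assumption, not a documented parameter.
curl -X PUT https://studio.hanzo.ai/api/compute/config \
  -H "Authorization: Bearer $HANZO_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "active_profile": "custom",
    "instance_type": "a2-highgpu-1g",
    "auto_provision": true
  }'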
Switch to GPU
curl -X PUT https://studio.hanzo.ai/api/compute/config \
-H "Authorization: Bearer $HANZO_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"active_profile": "gpu-basic",
"gpu_type": "t4",
"auto_provision": true
}'
When auto_provision is enabled, Studio automatically launches a GPU VM through Visor on the first GPU prompt. The worker registers itself within 60 seconds and begins accepting work.
Check the current configuration at any time:
curl https://studio.hanzo.ai/api/compute/config \
-H "Authorization: Bearer $HANZO_TOKEN"GPU Workers
Studio supports distributed execution via worker mode. A coordinator (the main Studio instance) routes prompts to GPU workers based on the active compute profile.
How It Works
Client ──prompt──▸ Coordinator (studio.hanzo.ai)
│
├── CPU prompt ──▸ local execution
├── GPU prompt ──▸ Worker 1 (T4, 16GB)
└── GPU prompt ──▸ Worker 2 (A100, 80GB)
- Client submits a prompt with an optional device_preference
- Coordinator checks the org's compute profile
- If GPU is requested, the prompt is forwarded to an available GPU worker via HTTP
- Worker executes locally and returns the result
- Coordinator forwards the result back to the client WebSocket; pending work is visible on the queue endpoint, as shown below
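To see what the coordinator is holding at any moment, query the queue (documented in the API reference below):
curl https://studio.hanzo.ai/api/queue \
  -H "Authorization: Bearer $HANZO_TOKEN"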
Launch a GPU Worker
curl -X POST https://studio.hanzo.ai/api/compute/provision \
-H "Authorization: Bearer $HANZO_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"gpu_type": "t4",
"worker_id": "my-gpu-worker"
}'
Workers auto-register with the coordinator via heartbeats every 30 seconds. Stale workers (no heartbeat for 90s) are removed from the pool.
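The registration endpoint is internal, but replaying a heartbeat by hand can help when debugging a worker that never appears in the pool. A sketch; the payload field names are assumptions:
# Payload fields below are assumptions; only the endpoint itself is documented.
curl -X POST https://studio.hanzo.ai/api/compute/workers/register \
  -H "Authorization: Bearer $HANZO_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "worker_id": "gpu-worker-1",
    "device": "cuda",
    "gpu_type": "t4",
    "vram_gb": 16
  }'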
Self-Hosted Workers
Run a Studio instance as a headless GPU worker that registers with your coordinator:
docker run --gpus all -p 8188:8188 \
ghcr.io/hanzoai/studio:latest \
python main.py \
--listen 0.0.0.0 \
--worker-mode \
--coordinator-url http://your-coordinator:8188 \
--worker-id gpu-worker-1
List Workers
curl https://studio.hanzo.ai/api/compute/workers \
-H "Authorization: Bearer $HANZO_TOKEN"Returns status, device info, VRAM, and last heartbeat for each registered worker.
Prompt Routing
Control where prompts execute with the device_preference field:
curl -X POST https://studio.hanzo.ai/api/prompt \
-H "Authorization: Bearer $HANZO_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"prompt": { ... },
"device_preference": "gpu"
}'
| Preference | Behavior |
|---|---|
| auto (default) | GPU if enabled in profile and a worker is available, else CPU |
| cpu | Always execute locally on CPU |
| gpu | Route to a GPU worker; returns 503 if none is available |
When auto_provision is enabled and no GPU worker is available, Studio returns 202 Accepted and begins provisioning. Poll the workers endpoint for status.
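That handshake is easy to script. A minimal sketch, assuming workflow.json holds a graph exported from the editor and that registered workers appear with a worker_id field (an assumption):
# Submit; 202 means Studio is provisioning a GPU worker via Visor.
STATUS=$(curl -s -o /dev/null -w '%{http_code}' \
  -X POST https://studio.hanzo.ai/api/prompt \
  -H "Authorization: Bearer $HANZO_TOKEN" \
  -H "Content-Type: application/json" \
  -d @workflow.json)

if [ "$STATUS" = "202" ]; then
  # Poll the worker pool until a worker registers, then resubmit.
  until curl -s https://studio.hanzo.ai/api/compute/workers \
      -H "Authorization: Bearer $HANZO_TOKEN" | grep -q '"worker_id"'; do
    sleep 10
  done
  curl -X POST https://studio.hanzo.ai/api/prompt \
    -H "Authorization: Bearer $HANZO_TOKEN" \
    -H "Content-Type: application/json" \
    -d @workflow.json
fi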
API Reference
| Method | Endpoint | Description |
|---|---|---|
| POST | /api/prompt | Submit a workflow prompt |
| GET | /api/queue | Queue status and pending jobs |
| GET | /api/jobs | List jobs with filtering |
| GET | /api/system_stats | Device info, memory, VRAM |
| GET | /api/compute/config | Get org compute profile |
| PUT | /api/compute/config | Update compute profile |
| GET | /api/compute/workers | List registered workers |
| POST | /api/compute/workers/register | Worker heartbeat (internal) |
| POST | /api/compute/provision | Launch GPU VM via Visor |
| DELETE | /api/compute/provision | Terminate GPU VM |
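The only endpoint not shown earlier is the teardown call. A sketch; the request body is an assumption (the provision call above identifies workers by worker_id, so the terminate call presumably does too):
# The worker_id body field is an assumption.
curl -X DELETE https://studio.hanzo.ai/api/compute/provision \
  -H "Authorization: Bearer $HANZO_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"worker_id": "my-gpu-worker"}'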
Self-Hosted Deployment
CPU Only
docker run -d --name studio \
-p 8188:8188 \
-v studio-data:/app/user \
ghcr.io/hanzoai/studio:latest \
python main.py --listen 0.0.0.0 --cpu
GPU (NVIDIA)
docker run -d --name studio --gpus all \
-p 8188:8188 \
-v studio-data:/app/user \
ghcr.io/hanzoai/studio:latest \
python main.py --listen 0.0.0.0
With IAM Auth + Billing
services:
studio:
image: ghcr.io/hanzoai/studio:latest
ports:
- "8188:8188"
environment:
STUDIO_ENABLE_IAM_AUTH: "true"
STUDIO_IAM_URL: "https://hanzo.id"
STUDIO_NO_LOCALHOST_BYPASS: "true"
STUDIO_MULTI_TENANT: "true"
STUDIO_ORG_ID: "my-org"
STUDIO_ENABLE_BILLING: "true"
STUDIO_COMMERCE_URL: "http://commerce.hanzo.svc:8001"
STUDIO_COMMERCE_TOKEN: "${COMMERCE_TOKEN}"
volumes:
- studio-data:/app/user
command: ["python", "main.py", "--listen", "0.0.0.0", "--cpu"]
volumes:
studio-data:
Environment Variables
| Variable | Default | Description |
|---|---|---|
| STUDIO_ENABLE_IAM_AUTH | false | Enable IAM authentication via hanzo.id |
| STUDIO_IAM_URL | https://hanzo.id | IAM server URL |
| STUDIO_NO_LOCALHOST_BYPASS | false | Require auth even for localhost |
| STUDIO_MULTI_TENANT | false | Per-org storage isolation |
| STUDIO_ORG_ID | -- | Default organization ID |
| STUDIO_ENABLE_BILLING | false | Enable Commerce usage billing |
| STUDIO_COMMERCE_URL | http://commerce.hanzo.svc:8001 | Commerce API endpoint |
| STUDIO_RATE_LIMIT_RPM | 60 | Max prompts per minute per org |
| STUDIO_ENABLE_METRICS | false | Enable Prometheus /metrics endpoint |
| STUDIO_WORKER_MODE | false | Run as headless GPU worker |
| COORDINATOR_URL | -- | Coordinator URL (worker mode only) |
| WORKER_ID | -- | Unique worker identifier |
| VISOR_URL | https://visor.hanzo.ai | Visor API for GPU provisioning |
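For example, enabling metrics on a local CPU instance and scraping the endpoint (the -e flag mirrors the compose environment above; /metrics is the path named in the table):
docker run -d --name studio -p 8188:8188 \
  -e STUDIO_ENABLE_METRICS=true \
  ghcr.io/hanzoai/studio:latest \
  python main.py --listen 0.0.0.0 --cpu

curl http://localhost:8188/metrics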
Get Started
- Sign in at studio.hanzo.ai with your Hanzo account
- Build a workflow in the visual node editor
- Submit prompts -- runs on CPU by default
- When you need GPU, call PUT /api/compute/config to switch profiles
- Studio provisions GPU workers automatically and routes your prompts (see the sketch below)
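Steps 3-5 as one hedged sketch, assuming workflow.json is a graph exported from the editor:
# 3. Submit a prompt -- executes on CPU under the default profile
curl -X POST https://studio.hanzo.ai/api/prompt \
  -H "Authorization: Bearer $HANZO_TOKEN" \
  -H "Content-Type: application/json" \
  -d @workflow.json

# 4. Switch the org to a T4 profile with auto-provisioning
curl -X PUT https://studio.hanzo.ai/api/compute/config \
  -H "Authorization: Bearer $HANZO_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"active_profile": "gpu-basic", "gpu_type": "t4", "auto_provision": true}'

# 5. Subsequent prompts route to the GPU worker automatically
curl -X POST https://studio.hanzo.ai/api/prompt \
  -H "Authorization: Bearer $HANZO_TOKEN" \
  -H "Content-Type: application/json" \
  -d @workflow.json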