Hanzo Visor
VM and container runtime for AI workloads with GPU passthrough, live migration, snapshot/restore, and Kubernetes CRI integration.
Hanzo Visor is a lightweight VM and container runtime purpose-built for AI infrastructure. It manages cloud machines, GPU-accelerated execution environments, remote desktop sessions, and environment provisioning. Visor integrates with Kubernetes as a CRI (Container Runtime Interface) provider and exposes a REST API for programmatic machine lifecycle management.
Features
- Lightweight VM Management: Minimal hypervisor layer with sub-second cold start for microVMs
- Container Orchestration: OCI-compatible container runtime with native K8s CRI integration
- GPU Passthrough: Direct NVIDIA/AMD GPU passthrough via VFIO for training and inference
- Isolated Execution: Hardware-level isolation between tenants using KVM/microVM boundaries
- Snapshot and Restore: Full VM state capture and instant restore for checkpointing
- Live Migration: Zero-downtime VM migration across nodes for maintenance and rebalancing
- Resource Quotas: Per-tenant CPU, memory, GPU, and disk quotas with hard enforcement
- Remote Sessions: RDP, SSH, and Telnet access via integrated Guacamole gateway
- Environment Templates: Pre-built images for PyTorch, JAX, TensorRT, Candle, and more
Endpoints
| Environment | URL |
|---|---|
| API | https://api.hanzo.ai/v1/visor/* |
| Gateway route | api.hanzo.ai → visor:19000 |
| Dashboard | https://console.hanzo.ai/visor |
| Remote Desktop | https://visor.hanzo.ai/guacamole |
Architecture
```
API Requests
      |
      v
+---------------+
| Hanzo Gateway |
|   /infra/*    |
+-------+-------+
        |
        v
+-----------------------+
|     Visor Control     |
|     Plane (19000)     |
+-----+--------+--------+
      |        |
      v        v
+--------------+  +--------------+
|  VM Engine   |  |  Container   |
|  (KVM/QEMU)  |  |   Runtime    |
+------+-------+  +------+-------+
       |                 |
   +---+---+         +---+---+
   v       v         v       v
+------+ +------+ +------+ +------+
| VM 1 | | VM 2 | |Ctr 1 | |Ctr 2 |
| GPU  | | CPU  | | GPU  | | CPU  |
+------+ +------+ +------+ +------+
      \               /
       v             v
+-------------------------------------+
|        GPU Pool (VFIO/SR-IOV)       |
|   NVIDIA A100 | H100 | AMD MI300X   |
+-------------------------------------+
```

Quick Start
List Machines
```bash
curl -H "Authorization: Bearer $HANZO_API_KEY" \
  https://api.hanzo.ai/v1/visor/machines
```

Create a VM
```bash
curl -X POST https://api.hanzo.ai/v1/visor/machines \
  -H "Authorization: Bearer $HANZO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "training-node-01",
    "template": "pytorch-cuda12",
    "resources": {
      "vcpus": 8, "memory_gb": 32, "disk_gb": 200,
      "gpus": 1, "gpu_type": "nvidia-a100"
    }
  }'
```

Snapshot and Restore
```bash
# Snapshot
curl -X POST https://api.hanzo.ai/v1/visor/machines/vm-abc123/snapshot \
  -H "Authorization: Bearer $HANZO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"name": "checkpoint-epoch-50"}'

# Restore to a new machine
curl -X POST https://api.hanzo.ai/v1/visor/machines \
  -H "Authorization: Bearer $HANZO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"name": "restored-node", "snapshot_id": "snap-def456"}'
```

Live Migrate
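The migrate call takes an explicit `target_node`, so callers typically choose the destination themselves. A toy Python sketch of a least-loaded placement choice — the node inventory shape here is hypothetical, not part of the Visor API:

```python
# Pick a migration target: the candidate node with the most free GPUs
# that can still host the VM. Node inventory shape is hypothetical.
def pick_target(nodes, gpus_needed):
    candidates = [n for n in nodes if n["free_gpus"] >= gpus_needed]
    if not candidates:
        raise RuntimeError("no node can host this VM")
    return max(candidates, key=lambda n: n["free_gpus"])["name"]

nodes = [
    {"name": "node-gpu-02", "free_gpus": 0},
    {"name": "node-gpu-04", "free_gpus": 3},
    {"name": "node-gpu-07", "free_gpus": 1},
]
print(pick_target(nodes, gpus_needed=1))  # → node-gpu-04
```

The chosen name then goes into the `target_node` field of the migrate request.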
```bash
curl -X POST https://api.hanzo.ai/v1/visor/machines/vm-abc123/migrate \
  -H "Authorization: Bearer $HANZO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"target_node": "node-gpu-04", "strategy": "live"}'
```

Remote Session
```bash
curl -X POST https://api.hanzo.ai/v1/visor/machines/vm-abc123/session \
  -H "Authorization: Bearer $HANZO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"protocol": "ssh", "user": "hanzo"}'
# Returns: {"session_id": "sess-xyz", "url": "https://visor.hanzo.ai/guacamole/#/client/sess-xyz"}
```

Environment Templates
| Template | Frameworks | CUDA | Disk |
|---|---|---|---|
| pytorch-cuda12 | PyTorch 2.6, torchvision | CUDA 12.6 | 100GB |
| jax-cuda12 | JAX 0.5, Flax | CUDA 12.6 | 100GB |
| inference-trt | TensorRT, Triton Server | CUDA 12.6 | 80GB |
| jupyter-gpu | JupyterLab, PyTorch, JAX | CUDA 12.6 | 120GB |
| candle-rust | Rust toolchain, Candle | CUDA 12.6 | 60GB |
| base-cpu | Python 3.12, Node 22 | -- | 40GB |
| base-gpu | Python 3.12, nvidia-smi | CUDA 12.6 | 60GB |
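Template names must match the table exactly, and the disk request has to be at least as large as the image. A small client-side validation sketch — the table above encoded as plain data; the helper name is our own, not an SDK function:

```python
# Templates from the table above, keyed by name, with image disk size in GB.
TEMPLATES = {
    "pytorch-cuda12": 100, "jax-cuda12": 100, "inference-trt": 80,
    "jupyter-gpu": 120, "candle-rust": 60, "base-cpu": 40, "base-gpu": 60,
}

def validate_request(template, disk_gb):
    """Reject unknown templates and disks smaller than the image itself."""
    if template not in TEMPLATES:
        raise ValueError(f"unknown template {template!r}")
    if disk_gb < TEMPLATES[template]:
        raise ValueError(f"{template} needs at least {TEMPLATES[template]}GB of disk")

validate_request("pytorch-cuda12", disk_gb=200)  # passes silently
```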
Configuration
GPU Passthrough
Visor uses VFIO for direct GPU assignment. IOMMU must be enabled on host nodes:
```bash
dmesg | grep -i iommu    # Verify IOMMU is enabled

curl -H "Authorization: Bearer $HANZO_API_KEY" \
  https://api.hanzo.ai/v1/visor/gpus    # List available GPUs
```

Resource Quotas
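Quotas are enforced server-side, but the check itself is simple arithmetic. A sketch of the admission logic — field names follow the quota payload in this section; the usage-tracking shape is assumed:

```python
def fits_quota(quota, usage, request):
    """True if the requested machine stays within the org's hard limits."""
    return (
        usage["vcpus"] + request["vcpus"] <= quota["max_vcpus"]
        and usage["memory_gb"] + request["memory_gb"] <= quota["max_memory_gb"]
        and usage["gpus"] + request["gpus"] <= quota["max_gpus"]
        and usage["machines"] + 1 <= quota["max_machines"]
    )

quota = {"max_vcpus": 256, "max_memory_gb": 1024, "max_gpus": 16, "max_machines": 50}
usage = {"vcpus": 240, "memory_gb": 512, "gpus": 15, "machines": 12}

print(fits_quota(quota, usage, {"vcpus": 8, "memory_gb": 32, "gpus": 1}))   # True
print(fits_quota(quota, usage, {"vcpus": 32, "memory_gb": 32, "gpus": 1}))  # False: vCPU limit
```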
```bash
curl -X PUT https://api.hanzo.ai/v1/visor/quotas/org-hanzo \
  -H "Authorization: Bearer $HANZO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"max_vcpus": 256, "max_memory_gb": 1024, "max_gpus": 16, "max_machines": 50}'
```

Kubernetes CRI Integration
Register Visor as a CRI runtime for K8s-scheduled microVMs:
```yaml
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: visor
handler: visor
scheduling:
  nodeSelector:
    runtime: visor
overhead:
  podFixed:
    memory: "64Mi"
    cpu: "50m"
```

Deploy a GPU pod with Visor isolation:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: training-job
spec:
  runtimeClassName: visor
  containers:
    - name: trainer
      image: ghcr.io/hanzoai/pytorch-cuda12:latest
      resources:
        limits:
          nvidia.com/gpu: "1"
          memory: "32Gi"
          cpu: "8"
```

Machine Lifecycle
| State | Description |
|---|---|
| provisioning | VM is being created and configured |
| running | VM is active and accepting connections |
| stopped | VM is halted, resources reserved |
| migrating | Live-migrating to another node |
| snapshotting | VM state is being captured |
| restoring | Restoring from a snapshot |
| terminated | Destroyed, resources released |
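Several of these states are transient, so clients usually poll until a machine settles. A sketch of that loop using the states from the table — the fetch callback stands in for an authenticated GET on the machine, and the retry cadence is our choice, not a documented default:

```python
import time

# States from the lifecycle table that a machine can settle into.
TERMINAL = {"running", "stopped", "terminated"}

def wait_for_settled(fetch_state, interval=0.0, max_polls=60):
    """Poll fetch_state() until the machine leaves a transient state."""
    for _ in range(max_polls):
        state = fetch_state()
        if state in TERMINAL:
            return state
        time.sleep(interval)  # transient: provisioning, migrating, ...
    raise TimeoutError("machine did not settle")

# Stub fetcher standing in for a GET on /v1/visor/machines/{id}.
states = iter(["provisioning", "provisioning", "running"])
print(wait_for_settled(lambda: next(states)))  # → running
```

In real use, pass a callback that fetches the machine's `state` field and a non-zero `interval` (a few seconds) between polls.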
Environment Variables
```bash
VISOR_API_PORT=19000                # Control plane port
VISOR_DATA_DIR=/var/lib/visor       # VM and container data
VISOR_GPU_DRIVER=nvidia             # nvidia | amd
VISOR_IOMMU_ENABLED=true
VISOR_SNAPSHOT_BACKEND=s3           # s3 | local
VISOR_S3_ENDPOINT=https://s3.hanzo.space
VISOR_S3_BUCKET=visor-snapshots
GUACAMOLE_URL=https://visor.hanzo.ai/guacamole
```

SDK Usage
Python
```python
from hanzoai import Hanzo

client = Hanzo(api_key="your-key")

machine = client.visor.machines.create(
    name="inference-server",
    template="inference-trt",
    resources={"vcpus": 4, "memory_gb": 16, "gpus": 1, "gpu_type": "nvidia-a100"},
)

snapshot = client.visor.machines.snapshot(machine.id, name="model-v2-loaded")
```

TypeScript
```typescript
import Hanzo from '@hanzo/ai'

const client = new Hanzo({ apiKey: 'your-key' })

const machine = await client.visor.machines.create({
  name: 'inference-server',
  template: 'inference-trt',
  resources: { vcpus: 4, memory_gb: 16, gpus: 1, gpu_type: 'nvidia-a100' },
})
```

Related Services
- Hanzo Registry: OCI-compliant container and model artifact registry with vulnerability scanning, image signing, multi-arch builds, and pull-through caching.
- Hanzo Engine: High-performance LLM inference engine — blazing-fast Rust-based serving with Metal/CUDA acceleration, quantization, vision, audio, and MCP tools.