
Hanzo Visor

VM and container runtime for AI workloads with GPU passthrough, live migration, snapshot/restore, and Kubernetes CRI integration.

Hanzo Visor is a lightweight VM and container runtime purpose-built for AI infrastructure. It manages cloud machines, GPU-accelerated execution environments, remote desktop sessions, and environment provisioning. Visor integrates with Kubernetes as a CRI (Container Runtime Interface) provider and exposes a REST API for programmatic machine lifecycle management.

Features

  • Lightweight VM Management: Minimal hypervisor layer with sub-second cold start for microVMs
  • Container Orchestration: OCI-compatible container runtime with native K8s CRI integration
  • GPU Passthrough: Direct NVIDIA/AMD GPU passthrough via VFIO for training and inference
  • Isolated Execution: Hardware-level isolation between tenants using KVM/microVM boundaries
  • Snapshot and Restore: Full VM state capture and instant restore for checkpointing
  • Live Migration: Zero-downtime VM migration across nodes for maintenance and rebalancing
  • Resource Quotas: Per-tenant CPU, memory, GPU, and disk quotas with hard enforcement
  • Remote Sessions: RDP, SSH, and Telnet access via integrated Guacamole gateway
  • Environment Templates: Pre-built images for PyTorch, JAX, TensorRT, Candle, and more

Endpoints

| Environment | URL |
|---|---|
| API | https://api.hanzo.ai/v1/visor/* |
| Gateway route | api.hanzo.ai → visor:19000 |
| Dashboard | https://console.hanzo.ai/visor |
| Remote Desktop | https://visor.hanzo.ai/guacamole |

Architecture

                     API Requests
                          |
                          v
                  +---------------+
                  | Hanzo Gateway |
                  | /v1/visor/*   |
                  +-------+-------+
                          |
                          v
              +-----------------------+
              |     Visor Control     |
              |     Plane (19000)     |
              +-----+--------+--------+
                    |        |
          +---------+        +---------+
          v                            v
  +--------------+            +--------------+
  |   VM Engine  |            |  Container   |
  |  (KVM/QEMU)  |            |   Runtime    |
  +------+-------+            +------+-------+
         |                           |
    +----+----+                 +----+----+
    v         v                 v         v
 +------+ +------+          +------+ +------+
 | VM 1 | | VM 2 |          |Ctr 1 | |Ctr 2 |
 | GPU  | | CPU  |          | GPU  | | CPU  |
 +------+ +------+          +------+ +------+
         \                   /
          v                 v
 +-------------------------------------+
 |        GPU Pool (VFIO/SR-IOV)       |
 |  NVIDIA A100 | H100 | AMD MI300X    |
 +-------------------------------------+

Quick Start

List Machines

curl -H "Authorization: Bearer $HANZO_API_KEY" \
  https://api.hanzo.ai/v1/visor/machines

Create a VM

curl -X POST https://api.hanzo.ai/v1/visor/machines \
  -H "Authorization: Bearer $HANZO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "training-node-01",
    "template": "pytorch-cuda12",
    "resources": {
      "vcpus": 8, "memory_gb": 32, "disk_gb": 200,
      "gpus": 1, "gpu_type": "nvidia-a100"
    }
  }'
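If you script machine creation, it can help to validate the request body client-side before POSTing. A minimal sketch: the field names come from the curl example above, but the validation rules here are illustrative, not Visor's actual server-side checks.

```python
import json

def build_machine_spec(name, template, vcpus, memory_gb, disk_gb,
                       gpus=0, gpu_type=None):
    """Build the JSON body for POST /v1/visor/machines.

    Field names mirror the curl example; validation is illustrative.
    """
    if vcpus < 1 or memory_gb < 1 or disk_gb < 1:
        raise ValueError("vcpus, memory_gb, and disk_gb must be positive")
    if gpus > 0 and gpu_type is None:
        raise ValueError("gpu_type is required when requesting GPUs")
    resources = {"vcpus": vcpus, "memory_gb": memory_gb,
                 "disk_gb": disk_gb, "gpus": gpus}
    if gpu_type:
        resources["gpu_type"] = gpu_type
    return json.dumps({"name": name, "template": template,
                       "resources": resources})

body = build_machine_spec("training-node-01", "pytorch-cuda12",
                          vcpus=8, memory_gb=32, disk_gb=200,
                          gpus=1, gpu_type="nvidia-a100")
```

Catching a missing `gpu_type` locally gives a clearer error than a rejected API call.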

Snapshot and Restore

# Snapshot
curl -X POST https://api.hanzo.ai/v1/visor/machines/vm-abc123/snapshot \
  -H "Authorization: Bearer $HANZO_API_KEY" \
  -d '{"name": "checkpoint-epoch-50"}'

# Restore to new machine
curl -X POST https://api.hanzo.ai/v1/visor/machines \
  -H "Authorization: Bearer $HANZO_API_KEY" \
  -d '{"name": "restored-node", "snapshot_id": "snap-def456"}'

Live Migrate

curl -X POST https://api.hanzo.ai/v1/visor/machines/vm-abc123/migrate \
  -H "Authorization: Bearer $HANZO_API_KEY" \
  -d '{"target_node": "node-gpu-04", "strategy": "live"}'

Remote Session

curl -X POST https://api.hanzo.ai/v1/visor/machines/vm-abc123/session \
  -H "Authorization: Bearer $HANZO_API_KEY" \
  -d '{"protocol": "ssh", "user": "hanzo"}'
# Returns: {"session_id": "sess-xyz", "url": "https://visor.hanzo.ai/guacamole/#/client/sess-xyz"}

Environment Templates

| Template | Frameworks | GPU | Disk |
|---|---|---|---|
| pytorch-cuda12 | PyTorch 2.6, torchvision | CUDA 12.6 | 100GB |
| jax-cuda12 | JAX 0.5, Flax | CUDA 12.6 | 100GB |
| inference-trt | TensorRT, Triton Server | CUDA 12.6 | 80GB |
| jupyter-gpu | JupyterLab, PyTorch, JAX | CUDA 12.6 | 120GB |
| candle-rust | Rust toolchain, Candle | CUDA 12.6 | 60GB |
| base-cpu | Python 3.12, Node 22 | -- | 40GB |
| base-gpu | Python 3.12, nvidia-smi | CUDA 12.6 | 60GB |
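When provisioning programmatically, a small lookup over the table above avoids typos in template names and GPU mismatches before an API call. The data is copied from the table; `validate_template` itself is a hypothetical helper, not part of the SDK:

```python
# GPU support and disk size per template, from the table above.
TEMPLATES = {
    "pytorch-cuda12": {"gpu": True, "disk_gb": 100},
    "jax-cuda12": {"gpu": True, "disk_gb": 100},
    "inference-trt": {"gpu": True, "disk_gb": 80},
    "jupyter-gpu": {"gpu": True, "disk_gb": 120},
    "candle-rust": {"gpu": True, "disk_gb": 60},
    "base-cpu": {"gpu": False, "disk_gb": 40},
    "base-gpu": {"gpu": True, "disk_gb": 60},
}

def validate_template(name, need_gpu):
    """Reject unknown templates and GPU mismatches client-side."""
    if name not in TEMPLATES:
        raise KeyError(f"unknown template: {name}")
    if need_gpu and not TEMPLATES[name]["gpu"]:
        raise ValueError(f"{name} has no GPU support")
    return TEMPLATES[name]
```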

Configuration

GPU Passthrough

Visor uses VFIO for direct GPU assignment, which requires IOMMU support enabled on host nodes (VT-d or AMD-Vi in firmware, plus intel_iommu=on or amd_iommu=on on the kernel command line):

dmesg | grep -i iommu          # Verify IOMMU
curl -H "Authorization: Bearer $HANZO_API_KEY" \
  https://api.hanzo.ai/v1/visor/gpus   # List available GPUs

Resource Quotas

curl -X PUT https://api.hanzo.ai/v1/visor/quotas/org-hanzo \
  -H "Authorization: Bearer $HANZO_API_KEY" \
  -d '{"max_vcpus": 256, "max_memory_gb": 1024, "max_gpus": 16, "max_machines": 50}'
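Quotas are enforced server-side, but a client-side pre-check gives clearer errors before submitting a request. A sketch using the quota keys from the request above (max_vcpus, max_memory_gb, max_gpus, max_machines); the usage-accounting shape is illustrative:

```python
def check_quota(quota, usage, request):
    """Return the quota violations a new machine request would cause.

    `quota` uses the keys from PUT /v1/visor/quotas/{org}; `usage` holds
    current per-org totals; `request` holds the machine's resources.
    """
    checks = [
        ("max_vcpus", usage["vcpus"] + request["vcpus"]),
        ("max_memory_gb", usage["memory_gb"] + request["memory_gb"]),
        ("max_gpus", usage["gpus"] + request.get("gpus", 0)),
        ("max_machines", usage["machines"] + 1),
    ]
    return [f"{key}: need {needed}, limit {quota[key]}"
            for key, needed in checks if needed > quota[key]]
```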

Kubernetes CRI Integration

Register Visor as a CRI runtime for K8s-scheduled microVMs:

apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: visor
handler: visor
scheduling:
  nodeSelector:
    runtime: visor
overhead:
  podFixed:
    memory: "64Mi"
    cpu: "50m"

Deploy a GPU pod with Visor isolation:

apiVersion: v1
kind: Pod
metadata:
  name: training-job
spec:
  runtimeClassName: visor
  containers:
    - name: trainer
      image: ghcr.io/hanzoai/pytorch-cuda12:latest
      resources:
        limits:
          nvidia.com/gpu: "1"
          memory: "32Gi"
          cpu: "8"

Machine Lifecycle

| State | Description |
|---|---|
| provisioning | VM is being created and configured |
| running | VM is active and accepting connections |
| stopped | VM is halted; resources remain reserved |
| migrating | Live-migrating to another node |
| snapshotting | VM state is being captured |
| restoring | Restoring from a snapshot |
| terminated | Destroyed; resources released |
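The states above imply a transition graph, and encoding it client-side can catch invalid operations (e.g. migrating a stopped machine) before an API round-trip. The exact edges below are an assumption inferred from the table, not a published spec:

```python
# Assumed legal transitions between lifecycle states (inferred, not official).
TRANSITIONS = {
    "provisioning": {"running", "terminated"},
    "running": {"stopped", "migrating", "snapshotting", "terminated"},
    "stopped": {"running", "terminated"},
    "migrating": {"running", "terminated"},
    "snapshotting": {"running"},
    "restoring": {"running", "terminated"},
    "terminated": set(),  # terminal: resources released
}

def can_transition(current, target):
    return target in TRANSITIONS.get(current, set())
```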

Environment Variables

VISOR_API_PORT=19000             # Control plane port
VISOR_DATA_DIR=/var/lib/visor    # VM and container data
VISOR_GPU_DRIVER=nvidia          # nvidia | amd
VISOR_IOMMU_ENABLED=true
VISOR_SNAPSHOT_BACKEND=s3        # s3 | local
VISOR_S3_ENDPOINT=https://s3.hanzo.space
VISOR_S3_BUCKET=visor-snapshots
GUACAMOLE_URL=https://visor.hanzo.ai/guacamole

SDK Usage

Python

from hanzoai import Hanzo

client = Hanzo(api_key="your-key")

machine = client.visor.machines.create(
    name="inference-server",
    template="inference-trt",
    resources={"vcpus": 4, "memory_gb": 16, "gpus": 1, "gpu_type": "nvidia-a100"}
)

snapshot = client.visor.machines.snapshot(machine.id, name="model-v2-loaded")

TypeScript

import Hanzo from '@hanzo/ai'

const client = new Hanzo({ apiKey: 'your-key' })

const machine = await client.visor.machines.create({
  name: 'inference-server',
  template: 'inference-trt',
  resources: { vcpus: 4, memory_gb: 16, gpus: 1, gpu_type: 'nvidia-a100' }
})
