Hanzo Models

Model registry and catalog for AI inference across 376+ models from 58+ providers

Hanzo Models is the centralized model registry and catalog for all AI models available through the Hanzo platform. It provides model discovery, capability metadata, pricing information, and version management.

Production: https://models.hanzo.ai
API: https://api.hanzo.ai/v1/models
Pricing: https://pricing.hanzo.ai/v1/models

Overview

The model registry indexes 376+ models from 58+ providers, including 32 first-party Zen models. Every model served through api.hanzo.ai is registered here with standardized metadata.

Model Categories

| Category | Count | Description |
| --- | --- | --- |
| Zen (First-Party) | 32 | Hanzo-trained models running on Hanzo Engine |
| OpenAI | 40+ | GPT-4o, GPT-5 nano/mini, o1, o3 series |
| Anthropic | 15+ | Claude Haiku, Sonnet, Opus series |
| Meta | 20+ | Llama 3.x, Llama 4 series |
| Google | 15+ | Gemini 2.x, 3.x series |
| Mistral | 10+ | Mistral, Mixtral series |
| Other Providers | 200+ | Cohere, AI21, DeepSeek, Qwen, etc. |

Model ID Format

Third-party models follow the {provider}-{model-name} naming convention. First-party Zen models omit the provider segment and start with zen:

zen4-pro              # Zen 4 Pro (first-party)
zen4-mini             # Zen 4 Mini (first-party)
openai-gpt-5-nano     # OpenAI GPT-5 Nano
anthropic-claude-sonnet-4-5  # Anthropic Claude Sonnet 4.5
meta-llama-4-maverick-17b    # Meta Llama 4 Maverick 17B
google-gemini-2.5-flash      # Google Gemini 2.5 Flash
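
The naming convention can be unpacked mechanically. Below is a minimal sketch, assuming the provider is everything before the first hyphen and that Zen IDs are recognized by their `zen` prefix. The helper name `parseModelId` and the `hanzo` provider label for Zen models (borrowed from the `owned_by` field in the sample response) are illustrative, not part of the Hanzo API.

```typescript
// Hypothetical helper: split a model ID into provider and model name.
interface ParsedModelId {
  provider: string;
  name: string;
}

function parseModelId(id: string): ParsedModelId {
  // Assumption: Zen IDs such as "zen4-pro" carry no provider segment,
  // so the whole ID is the model name and the provider is "hanzo".
  if (id.indexOf("zen") === 0) {
    return { provider: "hanzo", name: id };
  }
  // Model names contain hyphens themselves, so split on the first one only.
  const dash = id.indexOf("-");
  if (dash <= 0 || dash === id.length - 1) {
    throw new Error(`Not a {provider}-{model-name} ID: ${id}`);
  }
  return { provider: id.slice(0, dash), name: id.slice(dash + 1) };
}
```

For example, `parseModelId("meta-llama-4-maverick-17b")` yields provider `meta` and name `llama-4-maverick-17b`.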

API

List Models

# OpenAI-compatible model listing
curl https://api.hanzo.ai/v1/models \
  -H "Authorization: Bearer $HANZO_API_KEY"

Response:

{
  "object": "list",
  "data": [
    {
      "id": "zen4-pro",
      "object": "model",
      "created": 1740000000,
      "owned_by": "hanzo",
      "capabilities": {
        "chat": true,
        "completion": true,
        "embedding": false,
        "image": false,
        "audio": false
      },
      "context_window": 128000,
      "max_output_tokens": 32768,
      "pricing": {
        "input": 0.000003,
        "output": 0.000015,
        "unit": "per_token"
      }
    }
  ]
}
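
For typed clients, the sample response can be described with a few interfaces. This is a sketch that mirrors only the fields shown above; the real API may return more fields or omit some, and the helper `idsWithContext` is a hypothetical name used for illustration.

```typescript
// Types mirror the sample response; treat them as a sketch, not a schema.
interface ModelCapabilities {
  chat: boolean;
  completion: boolean;
  embedding: boolean;
  image: boolean;
  audio: boolean;
}

interface ModelPricing {
  input: number;  // price per input token
  output: number; // price per output token
  unit: string;   // "per_token" in the sample
}

interface ModelEntry {
  id: string;
  object: string;
  created: number; // assumed to be a Unix timestamp in seconds
  owned_by: string;
  capabilities: ModelCapabilities;
  context_window: number;
  max_output_tokens: number;
  pricing: ModelPricing;
}

interface ModelList {
  object: string;
  data: ModelEntry[];
}

// Hypothetical helper: parse a response body and return the IDs of models
// whose context window is at least `minContext` tokens.
function idsWithContext(body: string, minContext: number): string[] {
  const list = JSON.parse(body) as ModelList;
  return list.data
    .filter((m) => m.context_window >= minContext)
    .map((m) => m.id);
}
```

The cast after `JSON.parse` performs no runtime validation, so production code would likely want a schema check before trusting the shape.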

Get Model Details

curl https://models.hanzo.ai/v1/models/zen4-pro

Model Capabilities

Each model exposes its capabilities:

| Capability | Description |
| --- | --- |
| chat | Chat completions (multi-turn conversation) |
| completion | Text completions (single prompt) |
| embedding | Text embeddings (vector representations) |
| image | Image generation or understanding |
| audio | Speech-to-text or text-to-speech |
| vision | Image/video understanding |
| tools | Function calling / tool use |
| json_mode | Structured JSON output |
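
Capability flags make it straightforward to narrow the catalog to models that can do a given job. The sketch below assumes capabilities arrive as a name-to-boolean map, as in the List Models sample, and treats a missing key as unsupported. `ModelInfo` and `supportsAll` are hypothetical names.

```typescript
// Capability names from the table above.
type Capability =
  | "chat" | "completion" | "embedding" | "image"
  | "audio" | "vision" | "tools" | "json_mode";

// Assumption: a missing key means the capability is not supported.
interface ModelInfo {
  id: string;
  capabilities: { [name: string]: boolean | undefined };
}

// Hypothetical helper: keep models that support every required capability.
function supportsAll(models: ModelInfo[], required: Capability[]): ModelInfo[] {
  return models.filter((m) =>
    required.every((cap) => m.capabilities[cap] === true)
  );
}
```

Calling `supportsAll(models, ["chat", "tools"])` keeps only the models that report both flags as `true`.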

Zen Models

Zen models are Hanzo's first-party models trained and served on Hanzo Engine. They are optimized for the Hanzo platform with native tool use, MCP integration, and low-latency inference.

| Model | Parameters | Context | Specialty |
| --- | --- | --- | --- |
| zen4-pro | 405B | 128K | General purpose, reasoning |
| zen4-mini | 70B | 128K | Fast, cost-effective |
| zen4-nano | 8B | 32K | Edge deployment, low latency |
| zen4-code | 70B | 128K | Code generation and analysis |
| zen4-vision | 70B | 128K | Multimodal (text + image) |

Zen models are built on the Qwen3+ architecture and fine-tuned on Hanzo's proprietary datasets.

Provider Routing

When you send a request to api.hanzo.ai, the Gateway routes to the appropriate provider based on the model ID prefix:

Client -> api.hanzo.ai -> Gateway -> Provider
                                      ├── zen*         -> Hanzo Engine
                                      ├── openai-*     -> OpenAI / DO AI
                                      ├── anthropic-*  -> Anthropic / DO AI
                                      ├── meta-*       -> Meta / DO AI
                                      ├── google-*     -> Google / DO AI
                                      └── mistral-*    -> Mistral / DO AI
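
Prefix-based routing of this kind can be sketched as an ordered lookup table. This is an illustration of the idea, not the Gateway's actual implementation; `ROUTES` and `routeFor` are hypothetical names, and the backend labels are copied from the diagram.

```typescript
// Ordered prefix table. Zen IDs in this document look like "zen4-pro",
// so the sketch matches on "zen" rather than "zen-".
const ROUTES: Array<[string, string]> = [
  ["zen", "Hanzo Engine"],
  ["openai-", "OpenAI / DO AI"],
  ["anthropic-", "Anthropic / DO AI"],
  ["meta-", "Meta / DO AI"],
  ["google-", "Google / DO AI"],
  ["mistral-", "Mistral / DO AI"],
];

// Hypothetical helper: return the backend for a model ID, or undefined when
// no prefix matches (the diagram does not say how other providers are routed).
function routeFor(modelId: string): string | undefined {
  for (const [prefix, backend] of ROUTES) {
    if (modelId.indexOf(prefix) === 0) {
      return backend;
    }
  }
  return undefined;
}
```

Keeping the table ordered means a more specific prefix can be placed ahead of a general one if the set of providers grows.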

Pricing

Model pricing is available through the Pricing API:

curl https://pricing.hanzo.ai/v1/models

Prices are denominated in USD per token. Input and output tokens are priced separately. Prices sync from upstream providers every 6 hours.
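
Because input and output tokens are priced separately, a request's cost is the sum of two products. A minimal sketch, assuming the pricing object has the `input` and `output` fields shown in the List Models sample; `TokenPricing` and `estimateCostUsd` are hypothetical names.

```typescript
interface TokenPricing {
  input: number;  // USD per input token
  output: number; // USD per output token
}

// Hypothetical helper: estimated cost of one request in USD.
function estimateCostUsd(
  pricing: TokenPricing,
  inputTokens: number,
  outputTokens: number
): number {
  if (inputTokens < 0 || outputTokens < 0) {
    throw new RangeError("Token counts must be non-negative");
  }
  return inputTokens * pricing.input + outputTokens * pricing.output;
}

// Prices taken from the zen4-pro entry in the List Models sample response.
const zen4ProPricing: TokenPricing = { input: 0.000003, output: 0.000015 };
const exampleCost = estimateCostUsd(zen4ProPricing, 1000, 500);
```

With those sample prices, 1,000 input tokens and 500 output tokens come to about $0.0105 (0.003 for input plus 0.0075 for output).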

Model Selection

By Use Case

| Use Case | Recommended Model | Why |
| --- | --- | --- |
| General chat | zen4-pro | Best quality/cost ratio |
| Fast responses | zen4-mini | Low latency, good quality |
| Code generation | zen4-code | Optimized for code tasks |
| Image understanding | zen4-vision | Multimodal support |
| Embeddings | text-embedding-3-small | High quality, low cost |
| Budget | zen4-nano | Lowest cost per token |
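
In application code, these recommendations could be kept in one place as a typed lookup so that call sites never hard-code model IDs. The sketch below copies the table verbatim; `UseCase`, `RECOMMENDED`, and `recommendedModel` are hypothetical names.

```typescript
// Use cases and model IDs copied from the table above.
type UseCase =
  | "General chat" | "Fast responses" | "Code generation"
  | "Image understanding" | "Embeddings" | "Budget";

const RECOMMENDED: Record<UseCase, string> = {
  "General chat": "zen4-pro",
  "Fast responses": "zen4-mini",
  "Code generation": "zen4-code",
  "Image understanding": "zen4-vision",
  "Embeddings": "text-embedding-3-small",
  "Budget": "zen4-nano",
};

// Hypothetical helper: look up the recommended model ID for a use case.
function recommendedModel(useCase: UseCase): string {
  return RECOMMENDED[useCase];
}
```

Typing the keys as a union means a misspelled use case fails at compile time instead of silently returning `undefined`.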

By Provider Preference

If you need a specific provider (compliance, licensing, etc.), use the provider prefix:

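// Assumption: the OpenAI Node SDK pointed at Hanzo's OpenAI-compatible API
import OpenAI from 'openai'

const client = new OpenAI({ apiKey: process.env.HANZO_API_KEY, baseURL: 'https://api.hanzo.ai/v1' })

// Use Anthropic specifically
const response = await client.chat.completions.create({
  model: 'anthropic-claude-sonnet-4-5-20250514',
  messages: [{ role: 'user', content: 'Hello' }]
})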
