Hanzo Models
Model registry and catalog for AI inference across 376+ models from 58+ providers
Hanzo Models is the centralized model registry and catalog for all AI models available through the Hanzo platform. It provides model discovery, capability metadata, pricing information, and version management.
Production: https://models.hanzo.ai
API: https://api.hanzo.ai/v1/models
Pricing: https://pricing.hanzo.ai/v1/models
Overview
The model registry indexes 376+ models from 58+ providers, including 32 first-party Zen models. Every model served through api.hanzo.ai is registered here with standardized metadata.
Model Categories
| Category | Count | Description |
|---|---|---|
| Zen (First-Party) | 32 | Hanzo-trained models running on Hanzo Engine |
| OpenAI | 40+ | GPT-4o, GPT-5 nano/mini, o1, o3 series |
| Anthropic | 15+ | Claude Haiku, Sonnet, Opus series |
| Meta | 20+ | Llama 3.x, Llama 4 series |
| Google | 15+ | Gemini 2.x, 3.x series |
| Mistral | 10+ | Mistral, Mixtral series |
| Other Providers | 200+ | Cohere, AI21, DeepSeek, Qwen, etc. |
Model ID Format
Third-party models follow the {provider}-{model-name} naming convention. First-party Zen models use the Zen family name as the prefix (for example, zen4-pro):
zen4-pro # Zen 4 Pro (first-party)
zen4-mini # Zen 4 Mini (first-party)
openai-gpt-5-nano # OpenAI GPT-5 Nano
anthropic-claude-sonnet-4-5 # Anthropic Claude Sonnet 4.5
meta-llama-4-maverick-17b # Meta Llama 4 Maverick 17B
google-gemini-2.5-flash # Google Gemini 2.5 Flash
API
List Models
# OpenAI-compatible model listing
curl https://api.hanzo.ai/v1/models \
  -H "Authorization: Bearer $HANZO_API_KEY"
Response:
{
  "object": "list",
  "data": [
    {
      "id": "zen4-pro",
      "object": "model",
      "created": 1740000000,
      "owned_by": "hanzo",
      "capabilities": {
        "chat": true,
        "completion": true,
        "embedding": false,
        "image": false,
        "audio": false
      },
      "context_window": 128000,
      "max_output_tokens": 32768,
      "pricing": {
        "input": 0.000003,
        "output": 0.000015,
        "unit": "per_token"
      }
    }
  ]
}
Get Model Details
curl https://models.hanzo.ai/v1/models/zen4-pro
Model Capabilities
Each model exposes its capabilities:
| Capability | Description |
|---|---|
| chat | Chat completions (multi-turn conversation) |
| completion | Text completions (single prompt) |
| embedding | Text embeddings (vector representations) |
| image | Image generation or understanding |
| audio | Speech-to-text or text-to-speech |
| vision | Image/video understanding |
| tools | Function calling / tool use |
| json_mode | Structured JSON output |
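The capability flags can be used to narrow down candidate models on the client side. The following is a minimal TypeScript sketch, assuming each entry in the list response carries a capabilities object like the zen4-pro sample. The ModelEntry type, the modelsWithCapabilities helper, and the example-embedder entry are hypothetical names introduced for illustration and are not part of the Hanzo API.

```typescript
// Subset of one entry in the "data" array of the list response.
// Capability flags that are absent from an entry (for example "tools")
// are treated as unsupported.
interface ModelEntry {
  id: string;
  capabilities: Record<string, boolean>;
}

// Return the IDs of models that report every requested capability as true.
function modelsWithCapabilities(
  models: ModelEntry[],
  required: string[],
): string[] {
  return models
    .filter((m) => required.every((cap) => m.capabilities[cap] === true))
    .map((m) => m.id);
}

// Sample data: the first entry mirrors the documented zen4-pro response;
// the second is a made-up entry used only to show the filter at work.
const sampleModels: ModelEntry[] = [
  {
    id: "zen4-pro",
    capabilities: {
      chat: true,
      completion: true,
      embedding: false,
      image: false,
      audio: false,
    },
  },
  { id: "example-embedder", capabilities: { chat: false, embedding: true } },
];

// Logs an array containing only "zen4-pro".
console.log(modelsWithCapabilities(sampleModels, ["chat", "completion"]));
// Logs an array containing only "example-embedder".
console.log(modelsWithCapabilities(sampleModels, ["embedding"]));
```

Treating a missing flag as false keeps the filter conservative: a model is only selected when it explicitly advertises the capability.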
Zen Models
Zen models are Hanzo's first-party models trained and served on Hanzo Engine. They are optimized for the Hanzo platform with native tool use, MCP integration, and low-latency inference.
| Model | Parameters | Context | Specialty |
|---|---|---|---|
| zen4-pro | 405B | 128K | General purpose, reasoning |
| zen4-mini | 70B | 128K | Fast, cost-effective |
| zen4-nano | 8B | 32K | Edge deployment, low latency |
| zen4-code | 70B | 128K | Code generation and analysis |
| zen4-vision | 70B | 128K | Multimodal (text + image) |
Zen models are built on the Qwen3+ architecture and fine-tuned on Hanzo's proprietary datasets.
Provider Routing
When you send a request to api.hanzo.ai, the Gateway routes to the appropriate provider based on the model ID prefix:
Client -> api.hanzo.ai -> Gateway -> Provider
├── zen* -> Hanzo Engine
├── openai-* -> OpenAI / DO AI
├── anthropic-* -> Anthropic / DO AI
├── meta-* -> Meta / DO AI
├── google-* -> Google / DO AI
└── mistral-* -> Mistral / DO AI
Pricing
Model pricing is available through the Pricing API:
curl https://pricing.hanzo.ai/v1/models
Prices are denominated in USD per token. Input and output tokens are priced separately. Prices sync from upstream providers every 6 hours.
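Because input and output tokens are priced separately, the cost of a request is the sum of two products. The sketch below works through that arithmetic in TypeScript, assuming a pricing object shaped like the one in the zen4-pro list response (input and output rates in USD per token). The TokenPricing type and estimateCostUsd helper are hypothetical names for illustration, and the token counts are arbitrary.

```typescript
// Per-token rates in USD, mirroring the "pricing" object in the list response.
interface TokenPricing {
  input: number; // USD per input token
  output: number; // USD per output token
}

// Cost of one request: input and output tokens are billed at separate rates.
function estimateCostUsd(
  pricing: TokenPricing,
  inputTokens: number,
  outputTokens: number,
): number {
  return inputTokens * pricing.input + outputTokens * pricing.output;
}

// Rates copied from the zen4-pro entry in the sample list response.
const zen4ProPricing: TokenPricing = { input: 0.000003, output: 0.000015 };

// 1,000 input tokens and 500 output tokens:
// 1000 * 0.000003 + 500 * 0.000015 = 0.003 + 0.0075 = 0.0105 USD
const exampleCostUsd = estimateCostUsd(zen4ProPricing, 1000, 500);
console.log(exampleCostUsd.toFixed(4)); // prints "0.0105"
```

Since prices sync from upstream providers periodically, a real client would likely re-fetch the rates rather than hard-code them as done here.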
Model Selection
By Use Case
| Use Case | Recommended Model | Why |
|---|---|---|
| General chat | zen4-pro | Best quality/cost ratio |
| Fast responses | zen4-mini | Low latency, good quality |
| Code generation | zen4-code | Optimized for code tasks |
| Image understanding | zen4-vision | Multimodal support |
| Embeddings | text-embedding-3-small | High quality, low cost |
| Budget | zen4-nano | Lowest cost per token |
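The recommendations in this table can be captured as a small lookup so that application code refers to a use case instead of a hard-coded model ID. This is only a sketch: the UseCase keys and the pickModel helper are hypothetical names chosen for illustration, while the model IDs are taken from the table as written.

```typescript
// Hypothetical use-case keys, one per row of the recommendation table.
type UseCase =
  | "general-chat"
  | "fast-responses"
  | "code-generation"
  | "image-understanding"
  | "embeddings"
  | "budget";

// Model IDs copied from the "Recommended Model" column.
const recommendedModel: Record<UseCase, string> = {
  "general-chat": "zen4-pro",
  "fast-responses": "zen4-mini",
  "code-generation": "zen4-code",
  "image-understanding": "zen4-vision",
  "embeddings": "text-embedding-3-small",
  "budget": "zen4-nano",
};

// Look up the recommended model ID for a use case.
function pickModel(useCase: UseCase): string {
  return recommendedModel[useCase];
}

console.log(pickModel("code-generation")); // prints "zen4-code"
```

Typing the map as Record<UseCase, string> makes the compiler flag any use case that is added to the union without a corresponding model.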
By Provider Preference
If you need a specific provider (compliance, licensing, etc.), use the provider prefix:
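// Use Anthropic specifically.
// `client` is assumed to be an already-configured OpenAI-compatible SDK client pointed at api.hanzo.ai.
const response = await client.chat.completions.create({
  model: 'anthropic-claude-sonnet-4-5-20250514',
  messages: [{ role: 'user', content: 'Hello' }]
})
Related Services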
API gateway that routes model requests to providers
GPU inference engine serving Zen models
Real-time pricing for all models
Cloud platform with model management