
Models

All models available in Hanzo Chat — 14 Zen models and 100+ third-party from OpenAI, Anthropic, Google, Meta, Mistral, Together, and Groq.

Hanzo Chat provides access to 14 first-party Zen models and 100+ third-party models through the Hanzo LLM Gateway.

Zen Models (First-Party)

All 14 Zen models are available by default. These are Hanzo's own frontier models, built on the MoDE (Mixture of Distilled Experts) architecture.

| Model | Type | Context | Input $/MTok | Output $/MTok |
|---|---|---|---|---|
| zen4 | Flagship | 128K | $3.00 | $9.60 |
| zen4-ultra | Max Reasoning | 128K | $3.00 | $9.60 |
| zen4-pro | High Capability | 128K | $2.70 | $2.70 |
| zen4-max | Large Documents | 1M | $3.60 | $3.60 |
| zen4-mini | Fast & Efficient | 128K | $0.60 | $0.60 |
| zen4-thinking | Chain-of-Thought | 128K | $2.70 | $2.70 |
| zen4-coder | Code Generation | 128K | $3.60 | $3.60 |
| zen4-coder-pro | Premium Code | 128K | $4.50 | $4.50 |
| zen4-coder-flash | Fast Code | 128K | $1.50 | $1.50 |
| zen3-omni | Multimodal | 128K | $1.80 | $6.60 |
| zen3-vl | Vision-Language | 32K | $0.45 | $1.80 |
| zen3-nano | Edge | 32K | $0.30 | $0.30 |
| zen3-guard | Content Safety | 8K | $0.30 | $0.30 |
| zen3-embedding | Embeddings | 8K | $0.39 | -- |
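Per-request cost follows directly from the table: tokens are billed per million. A minimal sketch of the arithmetic, using prices from the table above (`estimate_cost` is an illustrative helper, not part of any Hanzo SDK):

```python
# Illustrative cost arithmetic using the per-MTok prices above.
# PRICES maps model -> (input $/MTok, output $/MTok); subset only.
PRICES = {
    "zen4": (3.00, 9.60),
    "zen4-mini": (0.60, 0.60),
    "zen3-nano": (0.30, 0.30),
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request, billed per million tokens."""
    in_price, out_price = PRICES[model]
    return input_tokens / 1e6 * in_price + output_tokens / 1e6 * out_price

# 10K prompt tokens + 2K completion tokens on zen4:
# 0.010 * $3.00 + 0.002 * $9.60 = $0.0492
print(round(estimate_cost("zen4", 10_000, 2_000), 4))  # 0.0492
```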

Model Recommendations

| Use Case | Recommended Model |
|---|---|
| General chat | zen4 or zen4-pro |
| Code generation | zen4-coder or zen4-coder-pro |
| Quick responses | zen4-mini |
| Image analysis | zen3-omni or zen3-vl |
| Deep reasoning | zen4-thinking or zen4-ultra |
| Long documents | zen4-max (1M context) |
| Budget-friendly | zen3-nano ($0.30/MTok) |
| Content moderation | zen3-guard |
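The recommendations above translate naturally into a default routing table in client code. A hypothetical sketch (the use-case keys and helper name are illustrative, not a Hanzo API):

```python
# Hypothetical routing table mirroring the recommendations above.
DEFAULT_MODEL = {
    "chat": "zen4",
    "code": "zen4-coder",
    "quick": "zen4-mini",
    "vision": "zen3-omni",
    "reasoning": "zen4-thinking",
    "long-context": "zen4-max",
    "budget": "zen3-nano",
    "moderation": "zen3-guard",
}

def pick_model(use_case: str) -> str:
    # Fall back to the flagship for unrecognized use cases.
    return DEFAULT_MODEL.get(use_case, "zen4")

print(pick_model("code"))     # zen4-coder
print(pick_model("unknown"))  # zen4
```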

Third-Party Models

100+ models from all major providers are available through the Hanzo Gateway:

OpenAI

| Model | Type |
|---|---|
| gpt-4o | Flagship multimodal |
| gpt-4o-mini | Fast, affordable |
| o3 | Advanced reasoning |
| o4-mini | Fast reasoning |

Anthropic

| Model | Type |
|---|---|
| claude-opus-4 | Most capable |
| claude-sonnet-4.5 | Best balance |
| claude-haiku-4.5 | Fast, efficient |

Google

| Model | Type |
|---|---|
| gemini-2.5-pro | Advanced reasoning |
| gemini-2.5-flash | Fast multimodal |

Meta

| Model | Type |
|---|---|
| llama-4-maverick | 400B MoE |
| llama-4-scout | 109B MoE |
| llama-3.3-70b | Open weights |

Mistral

| Model | Type |
|---|---|
| mistral-large | Flagship |
| codestral | Code-specialized |
| mistral-small | Efficient |

Together AI

50+ open models, including Llama, Mixtral, Qwen, and DeepSeek, served with fast inference.

Groq

Ultra-fast inference for Llama, Mixtral, and Gemma models.
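Every provider is reached through the same gateway endpoint; the provider is implied by the model name. A sketch of the request shape, assuming the gateway accepts the OpenAI-compatible chat-completions schema suggested by the `/v1` base URL in the configuration below (payload construction only; no request is sent):

```python
import json

# Build an OpenAI-style chat-completions payload. Assumption: the
# Hanzo Gateway accepts this schema at https://api.hanzo.ai/v1.
def chat_payload(model: str, prompt: str) -> dict:
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

# The same shape works for first-party and third-party models alike:
for model in ("zen4", "gpt-4o", "claude-sonnet-4.5", "llama-4-maverick"):
    print(json.dumps(chat_payload(model, "Hello")))
```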

Configuring Models

Via chat.yaml

```yaml
endpoints:
  hanzo:
    baseURL: "https://api.hanzo.ai/v1"
    apiKey: "${HANZO_API_KEY}"
    models:
      default:
        - zen4
        - zen4-coder
        - zen4-mini
      fetch: true  # Auto-discover additional models

  openAI:
    apiKey: "${OPENAI_API_KEY}"
    models:
      default:
        - gpt-4o
        - o3
```
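The `${HANZO_API_KEY}`-style values are environment-variable placeholders, resolved from the process environment when the config is loaded. A minimal sketch of that expansion using only the standard library (the actual config loader is Hanzo Chat's and is not shown here):

```python
import os

# Simulate the ${VAR} placeholders used in chat.yaml.
os.environ["HANZO_API_KEY"] = "sk-test-123"  # illustrative value

raw = "${HANZO_API_KEY}"
resolved = os.path.expandvars(raw)
print(resolved)  # sk-test-123
```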

Auto-Discovery

Set fetch: true on any endpoint to automatically discover available models from the API. This is useful when new models are added — they appear without config changes.
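With `fetch: true`, the client queries the endpoint's model list (conventionally `GET {baseURL}/models` in OpenAI-compatible APIs) and merges the results with the configured `default` list. A sketch of that merge; the response body here is illustrative, not captured from the live API:

```python
# Illustrative /v1/models response body (OpenAI-style "list" object).
fetched = {
    "object": "list",
    "data": [{"id": "zen4"}, {"id": "zen4-coder"}, {"id": "zen4-new"}],
}

defaults = ["zen4", "zen4-coder", "zen4-mini"]

# Merge: keep configured defaults, then append newly discovered models.
discovered = [m["id"] for m in fetched["data"]]
models = defaults + [m for m in discovered if m not in defaults]
print(models)  # ['zen4', 'zen4-coder', 'zen4-mini', 'zen4-new']
```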

Custom Display Names

```yaml
endpoints:
  hanzo:
    modelDisplayLabel: "Zen"
    iconURL: "https://cdn.hanzo.ai/img/logo-white.svg"
```

Pricing

  • $5 free credit on every new account (expires in 30 days)
  • Prepaid billing — add credits at console.hanzo.ai
  • $1 minimum balance required for API calls
  • No surprise bills — service stops when credits are depleted

See the full pricing table at zenlm.org/pricing.
