Hanzo Skills Reference

Hanzo Python SDK

Stainless-generated Python client library for the Hanzo AI API with full type safety, async support, streaming, auto-pagination, and auto-retry across 187 endpoints.

Overview

The Hanzo Python SDK is generated from the Hanzo AI OpenAPI spec via Stainless. It covers all 187 API endpoints across 51 resource namespaces, provides both synchronous and asynchronous clients, and presents an API surface identical to the JS and Go SDKs.

Why Hanzo Python SDK?

  • Full type safety: Generated from OpenAPI spec via Stainless
  • OpenAI compatible: Same interface patterns, easy migration
  • Async native: Both sync and async clients
  • Streaming: First-class SSE streaming with iterators
  • Auto-retry: Configurable retry with exponential backoff
  • Pagination: Auto-pagination helpers for list endpoints

OSS Base

Generated by Stainless. Repo: hanzoai/python-sdk.

When to use

  • Python applications calling Hanzo API
  • FastAPI, Django, or Flask backends
  • ML/AI pipelines needing LLM access
  • Replacing OpenAI Python SDK with Hanzo
  • Jupyter notebooks and data science workflows

Hard requirements

  1. Python >=3.12
  2. HANZO_API_KEY for authentication

Quick reference

Item          Value
Package       hanzoai (PyPI)
Version       2.2.0
Repo          github.com/hanzoai/python-sdk
Generated by  Stainless
License       BSD-3-Clause
Python        >=3.12
Base URL      https://api.hanzo.ai

Installation

# pip
pip install hanzoai

# uv (preferred)
uv add hanzoai

# poetry
poetry add hanzoai

One-file quickstart

Chat Completion

from hanzoai import Hanzo

client = Hanzo(
    api_key="your-api-key",  # defaults to HANZO_API_KEY env var
)

response = client.chat.completions.create(
    model="zen-70b",
    messages=[{"role": "user", "content": "Hello, Hanzo!"}],
)
print(response.choices[0].message.content)

Streaming

stream = client.chat.completions.create(
    model="zen-70b",
    messages=[{"role": "user", "content": "Write a poem about code"}],
    stream=True,
)

for chunk in stream:
    content = chunk.choices[0].delta.content
    if content:
        print(content, end="")

Async Client

import asyncio
from hanzoai import AsyncHanzo

client = AsyncHanzo()

async def main():
    response = await client.chat.completions.create(
        model="zen-70b",
        messages=[{"role": "user", "content": "Hello!"}],
    )
    print(response.choices[0].message.content)

asyncio.run(main())
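Because the async client returns awaitables, several completions can run concurrently on one event loop with asyncio.gather. A minimal sketch (the helper names and prompts are illustrative, not part of the SDK):

```python
import asyncio

async def ask(client, prompt: str) -> str:
    # One chat completion; awaiting yields control while the request is in flight
    resp = await client.chat.completions.create(
        model="zen-70b",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

async def ask_many(client, prompts: list[str]) -> list[str]:
    # gather() runs all requests concurrently and preserves input order
    return await asyncio.gather(*(ask(client, p) for p in prompts))

# Usage (requires HANZO_API_KEY):
# from hanzoai import AsyncHanzo
# answers = asyncio.run(ask_many(AsyncHanzo(), ["Hello!", "What is Hanzo?"]))
```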

Async Streaming

async def stream_response():
    stream = await client.chat.completions.create(
        model="zen-70b",
        messages=[{"role": "user", "content": "Write a story"}],
        stream=True,
    )

    async for chunk in stream:
        content = chunk.choices[0].delta.content
        if content:
            print(content, end="")

Embeddings

embedding = client.embeddings.create(
    model="zen-embedding",
    input="Hello world",
)
print(len(embedding.data[0].embedding))  # dimension count
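Embeddings are typically compared with cosine similarity for search and retrieval. A minimal sketch (the helper is local code, not an SDK method; the batched endpoint usage is shown in comments):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    # Standard cosine similarity: dot product over the product of norms
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Usage with the embeddings endpoint (requires HANZO_API_KEY):
# resp = client.embeddings.create(model="zen-embedding", input=["query", "document"])
# score = cosine_similarity(resp.data[0].embedding, resp.data[1].embedding)
```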

Function Calling / Tools

response = client.chat.completions.create(
    model="zen-70b",
    messages=[{"role": "user", "content": "What's the weather in Tokyo?"}],
    tools=[{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather for a location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {"type": "string", "description": "City name"},
                },
                "required": ["location"],
            },
        },
    }],
)

tool_calls = response.choices[0].message.tool_calls
if tool_calls:  # check before indexing — tool_calls is None when no tool was chosen
    tool_call = tool_calls[0]
    print(tool_call.function.name)       # "get_weather"
    print(tool_call.function.arguments)  # '{"location":"Tokyo"}'
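To complete the round trip, each tool call is executed locally and its result is sent back as a role="tool" message before a follow-up completion. A sketch, assuming the message shape shown above (the get_weather implementation is hypothetical):

```python
import json

def get_weather(location: str) -> str:
    # Hypothetical local implementation of the advertised tool
    return f"18°C and clear in {location}"

def tool_replies(message) -> list[dict]:
    # Turn each tool call on a model message into a role="tool" reply
    # suitable for appending to `messages` before the follow-up call
    replies = []
    for call in message.tool_calls or []:
        args = json.loads(call.function.arguments)
        replies.append({
            "role": "tool",
            "tool_call_id": call.id,
            "content": get_weather(**args),
        })
    return replies

# Follow-up completion (sketch):
# messages.append(response.choices[0].message)
# messages.extend(tool_replies(response.choices[0].message))
# final = client.chat.completions.create(model="zen-70b", messages=messages)
```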

Vision (Multimodal)

response = client.chat.completions.create(
    model="zen-vision",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What's in this image?"},
            {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}},
        ],
    }],
)

API Surface

The SDK provides 187 endpoints across 51 resource namespaces, generated from the OpenAPI spec via Stainless.

Core Resources

Resource                          Method  Description
client.chat.completions.create()  POST    Chat completion (streaming optional)
client.completions.create()       POST    Text completion (legacy)
client.embeddings.create()        POST    Generate embeddings
client.models.list()              GET     List available models
client.models.retrieve(id)        GET     Get model details

File Management

Method                     Description
client.files.create()      Upload a file
client.files.retrieve(id)  Get file metadata
client.files.list()        List uploaded files
client.files.delete(id)    Delete a file
client.files.content(id)   Get file content
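Uploads take a binary file handle. A minimal sketch, assuming the `purpose` parameter follows the OpenAI-style convention (the helper name is illustrative):

```python
from pathlib import Path

def upload_dataset(client, path: str, purpose: str = "fine-tune"):
    # Open the file in binary mode; the SDK handles multipart encoding
    with Path(path).open("rb") as f:
        return client.files.create(file=f, purpose=purpose)

# Usage (requires HANZO_API_KEY):
# uploaded = upload_dataset(client, "training.jsonl")
# print(uploaded.id)
```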

Fine-Tuning

Method                                   Description
client.fine_tuning.jobs.create()         Create fine-tuning job
client.fine_tuning.jobs.retrieve(id)     Get job status
client.fine_tuning.jobs.list()           List all jobs
client.fine_tuning.jobs.cancel(id)       Cancel a job
client.fine_tuning.jobs.list_events(id)  List job events

Images

Method                    Description
client.images.generate()  Generate images
client.images.edit()      Edit images

Audio

Method                                Description
client.audio.transcriptions.create()  Speech-to-text
client.audio.translations.create()    Audio translation
client.audio.speech.create()          Text-to-speech

Additional Resource Namespaces

The SDK also includes namespaces for: assistants, threads, runs, batches, vector stores, model management, key management, team management, organization, budget, guardrails, credentials, and more (51 total).

Configuration

client = Hanzo(
    api_key="your-key",                    # Required (or HANZO_API_KEY env)
    base_url="https://api.hanzo.ai",       # Default
    timeout=60.0,                          # Request timeout (seconds)
    max_retries=2,                         # Auto-retry count
    default_headers={"X-Custom": "val"},   # Extra headers
    default_query={"version": "2"},        # Extra query params
)

Environment Variables

HANZO_API_KEY=your-api-key          # Required
HANZO_BASE_URL=https://...          # Override base URL
HANZO_LOG=debug                     # Enable debug logging

OpenAI Drop-In Replacement

import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.hanzo.ai/v1",
    api_key=os.environ["HANZO_API_KEY"],
)
# Everything works — same API shape
response = client.chat.completions.create(
    model="zen-70b",
    messages=[{"role": "user", "content": "Hello"}],
)

Framework Integration

FastAPI

from fastapi import FastAPI
from fastapi.responses import StreamingResponse
from hanzoai import AsyncHanzo

app = FastAPI()
client = AsyncHanzo()  # async client, so streaming never blocks the event loop

@app.post("/api/chat")
async def chat(messages: list[dict]):
    stream = await client.chat.completions.create(
        model="zen-70b",
        messages=messages,
        stream=True,
    )

    async def generate():
        async for chunk in stream:
            content = chunk.choices[0].delta.content
            if content:
                yield f"data: {content}\n\n"

    return StreamingResponse(generate(), media_type="text/event-stream")

LangChain

import os

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    base_url="https://api.hanzo.ai/v1",
    api_key=os.environ["HANZO_API_KEY"],
    model="zen-70b",
)

response = llm.invoke("Hello, Hanzo!")

Error Handling

from hanzoai import Hanzo, APIError, RateLimitError, APIConnectionError

client = Hanzo()

try:
    response = client.chat.completions.create(...)
except RateLimitError as e:
    print(f"Rate limited, retry after: {e.response.headers.get('retry-after')}")
except APIError as e:
    print(f"API error: {e.status_code} {e.message}")
except APIConnectionError as e:
    print(f"Connection error: {e}")
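The client already retries automatically (see max_retries), but for long-running jobs an application-level retry with jittered exponential backoff can be layered on top. A sketch; the helper names are illustrative and not part of the SDK:

```python
import random
import time

def backoff_delay(attempt: int, base: float = 0.5, cap: float = 8.0) -> float:
    # Exponential backoff with full jitter: uniform in [0, min(cap, base * 2^attempt)]
    return random.uniform(0.0, min(cap, base * (2 ** attempt)))

def with_retries(make_request, is_retryable, max_attempts: int = 5):
    # Retry `make_request` while `is_retryable(exc)` is true, sleeping between tries
    for attempt in range(max_attempts):
        try:
            return make_request()
        except Exception as exc:
            if attempt == max_attempts - 1 or not is_retryable(exc):
                raise
            time.sleep(backoff_delay(attempt))

# Usage (sketch):
# with_retries(
#     lambda: client.chat.completions.create(model="zen-70b", messages=[...]),
#     lambda exc: isinstance(exc, RateLimitError),
# )
```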

Pagination

# Auto-pagination
for model in client.models.list():
    print(model.id)

# Manual pagination
page = client.models.list()
print(page.data)
if page.has_next_page():
    next_page = page.get_next_page()
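The has_next_page()/get_next_page() pair generalizes to a small page walker when you need manual control but still want every item. A sketch, assuming those method names and a .data attribute as shown above:

```python
def iter_all(first_page):
    # Yield every item, following pages via has_next_page()/get_next_page()
    page = first_page
    while True:
        yield from page.data
        if not page.has_next_page():
            return
        page = page.get_next_page()

# Usage (sketch):
# ids = [m.id for m in iter_all(client.models.list())]
```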

Package Structure

The SDK contains 55 packages in the pkg/ directory, organized by resource namespace:

hanzoai/
├── __init__.py           # Main client export
├── _client.py            # Hanzo and AsyncHanzo clients
├── _streaming.py         # Stream and AsyncStream
├── _response.py          # Response types
├── types/                # All Pydantic response models
│   ├── chat/
│   ├── fine_tuning/
│   ├── audio/
│   └── ...
├── resources/            # Resource namespace implementations
│   ├── chat/
│   │   └── completions.py
│   ├── embeddings.py
│   ├── models.py
│   ├── files.py
│   └── ...
└── pkg/                  # 55 sub-packages

Runtime Support

Runtime        Support  Notes
CPython 3.12+  Full     Primary target
PyPy 3.12+     Full     Compatible
Jupyter        Full     Sync client in notebooks

Related

  • hanzo/hanzo-chat.md - API reference and model catalog
  • hanzo/js-sdk.md - JavaScript equivalent
  • hanzo/go-sdk.md - Go equivalent
  • hanzo/rust-sdk.md - Rust equivalent (infrastructure SDK)
