Hanzo Analytics

Hanzo Analytics provides dashboards and APIs for usage, latency, cost, and reliability metrics across the Hanzo platform. It aggregates data from all services to give operators and developers real-time visibility.

Features

Usage Metrics: Requests, tokens, bandwidth, and throughput per service

Cost Tracking: Per-project budgets, spend breakdowns, and forecasting

Operational Health: Error rates, latency percentiles, and saturation signals

Alerting: Threshold-based alerts via webhooks, Slack, or PagerDuty

Export: CSV/JSON exports and streaming to BI pipelines

Dashboard

The Analytics dashboard is available at console.hanzo.ai:

┌─────────────────────────────────────────────────────────────┐
│  Hanzo Analytics                         Last 24h ▾  ⟳     │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│  Requests    Tokens          Cost         Latency (p99)     │
│  ━━━━━━━━    ━━━━━━━━━━      ━━━━━━━━     ━━━━━━━━━━━━━     │
│  1.2M/day    847M tokens     $2,341/day   312ms             │
│  ↑ 15%       ↑ 22%           ↑ 18%        ↓ 8%              │
│                                                              │
│  ┌─────────────────────────────────────────────────────┐    │
│  │  Request Volume                                      │    │
│  │  ▁▂▃▄▅▆▇█▇▆▅▄▃▂▁▁▂▃▄▅▆▇▇▆▅▄                      │    │
│  │  00:00    06:00    12:00    18:00    24:00           │    │
│  └─────────────────────────────────────────────────────┘    │
│                                                              │
│  Top Models           │  Cost by Project                    │
│  ─────────────────    │  ─────────────────                  │
│  claude-sonnet  42%   │  platform    $1,205                 │
│  gpt-4o         28%   │  bot         $672                   │
│  llama-3.1      15%   │  commerce    $284                   │
│  gemini-pro      8%   │  internal    $180                   │
│  other           7%   │                                     │
│                                                              │
└─────────────────────────────────────────────────────────────┘

Metrics API

Query usage and cost metrics programmatically:

# Get a usage summary for a 7-day date range
curl -X POST https://cloud.hanzo.ai/v1/analytics/usage \
  -H "Authorization: Bearer $HANZO_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "start_date": "2025-06-01",
    "end_date": "2025-06-07",
    "group_by": ["model", "project"],
    "metrics": ["requests", "tokens", "cost"]
  }'

Response:

{
  "data": [
    {
      "model": "claude-sonnet-4-5-20250929",
      "project": "platform",
      "requests": 142500,
      "input_tokens": 285000000,
      "output_tokens": 71250000,
      "cost": 854.25
    }
  ],
  "summary": {
    "total_requests": 1200000,
    "total_tokens": 847000000,
    "total_cost": 2341.50
  }
}
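
As a rough illustration of working with this response, the sketch below derives total tokens, a blended cost per million tokens, and each row's share of total cost. The field names come from the sample response above; the helper name `summarize_rows` and the derived fields are assumptions made for this example, not part of the API.

```python
import json

# Sample payload copied from the response shown above.
RESPONSE = """
{
  "data": [
    {
      "model": "claude-sonnet-4-5-20250929",
      "project": "platform",
      "requests": 142500,
      "input_tokens": 285000000,
      "output_tokens": 71250000,
      "cost": 854.25
    }
  ],
  "summary": {
    "total_requests": 1200000,
    "total_tokens": 847000000,
    "total_cost": 2341.50
  }
}
"""


def summarize_rows(payload: dict) -> list[dict]:
    """Derive per-row totals from a usage response (hypothetical helper)."""
    total_cost = payload["summary"]["total_cost"]
    rows = []
    for row in payload["data"]:
        tokens = row["input_tokens"] + row["output_tokens"]
        rows.append({
            "model": row["model"],
            "project": row["project"],
            "total_tokens": tokens,
            # Blended rate across input and output tokens.
            "cost_per_million_tokens": row["cost"] / tokens * 1_000_000 if tokens else 0.0,
            # Fraction of the summary's total cost attributed to this row.
            "cost_share": row["cost"] / total_cost if total_cost else 0.0,
        })
    return rows


for item in summarize_rows(json.loads(RESPONSE)):
    print(item)
```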

Cost Tracking

Budgets

Set per-project spending limits with automatic enforcement:

# Set a monthly budget
curl -X POST https://cloud.hanzo.ai/v1/analytics/budgets \
  -H "Authorization: Bearer $HANZO_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "project_id": "proj_abc123",
    "monthly_limit": 5000.00,
    "alert_thresholds": [0.50, 0.75, 0.90],
    "action_on_exceed": "soft_limit"
  }'
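
To make the threshold fields concrete, here is a minimal sketch of how `alert_thresholds` might be evaluated against current spend. It assumes each threshold is a fraction of `monthly_limit` and that a threshold counts as crossed once spend reaches it; the function names are hypothetical, and the service's actual enforcement logic (including what `soft_limit` does) is not described in this document.

```python
def crossed_thresholds(spend: float, monthly_limit: float,
                       thresholds: list[float]) -> list[float]:
    """Return the thresholds (fractions of the limit) that spend has reached."""
    if monthly_limit <= 0:
        raise ValueError("monthly_limit must be positive")
    used = spend / monthly_limit
    return [t for t in sorted(thresholds) if used >= t]


def budget_status(spend: float, monthly_limit: float,
                  thresholds: list[float]) -> dict:
    """Summarize budget usage for one project (illustrative only)."""
    used = spend / monthly_limit
    return {
        "used_fraction": used,
        "alerts": crossed_thresholds(spend, monthly_limit, thresholds),
        "exceeded": used >= 1.0,
    }


# $3,900 spent against the $5,000 limit from the request above.
print(budget_status(3900.00, 5000.00, [0.50, 0.75, 0.90]))
```

With these inputs, 78% of the budget is used, so the 0.50 and 0.75 thresholds are reported and the 0.90 threshold is not.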

Cost Breakdown

Dimension	Description
By model	Cost per LLM model (input/output tokens)
By project	Cost per project/team
By API key	Cost per individual API key
By endpoint	Cost per service endpoint
By time	Hourly, daily, weekly, monthly aggregation

Operational Health

Monitor service health with real-time metrics:

Metric	Description	Alert Threshold
`error_rate`	Percentage of 4xx/5xx responses	> 1%
`latency_p50`	Median response time	> 200ms
`latency_p99`	99th percentile response time	> 2000ms
`saturation`	Queue depth / capacity ratio	> 0.8
`availability`	Successful requests / total requests	< 99.9%
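
The following sketch shows one way these metrics could be computed from raw request samples and checked against the thresholds in the table. The thresholds are copied from the table, with percentages written as fractions. The nearest-rank percentile method, the sample data, and the helper names are assumptions for illustration, and the platform may compute its percentiles differently.

```python
import math

# Thresholds from the table above: (comparison, limit).
THRESHOLDS = {
    "error_rate": ("gt", 0.01),
    "latency_p50": ("gt", 200.0),    # milliseconds
    "latency_p99": ("gt", 2000.0),   # milliseconds
    "saturation": ("gt", 0.8),
    "availability": ("lt", 0.999),
}


def percentile(values: list[float], pct: float) -> float:
    """Nearest-rank percentile of a non-empty list."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def breaches(metrics: dict[str, float]) -> list[str]:
    """Return the names of metrics that violate their threshold."""
    out = []
    for name, value in metrics.items():
        op, limit = THRESHOLDS[name]
        if (op == "gt" and value > limit) or (op == "lt" and value < limit):
            out.append(name)
    return out


# Made-up samples: (HTTP status, latency in ms).
samples = [(200, 120.0), (200, 150.0), (200, 180.0), (500, 210.0), (200, 2500.0)]
latencies = [ms for _, ms in samples]
metrics = {
    "error_rate": sum(1 for status, _ in samples if status >= 400) / len(samples),
    "latency_p50": percentile(latencies, 50),
    "latency_p99": percentile(latencies, 99),
}
print(metrics, breaches(metrics))
```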

Alerting

Configure alerts that trigger on metric thresholds:

# Create an alert rule
curl -X POST https://cloud.hanzo.ai/v1/analytics/alerts \
  -H "Authorization: Bearer $HANZO_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "High error rate",
    "metric": "error_rate",
    "condition": "gt",
    "threshold": 0.01,
    "window": "5m",
    "channels": [
      {"type": "slack", "webhook": "https://hooks.slack.com/..."},
      {"type": "pagerduty", "routing_key": "..."}
    ]
  }'
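
A rule like the one above can be read as "fire when the metric exceeds the threshold over the window". The sketch below is one possible interpretation, assuming the rule compares the mean of the samples inside the window against the threshold. The aggregation method, the `lt` condition, the supported window units, and the function names are all assumptions; only `gt` and `5m` appear in the example request.

```python
from datetime import datetime, timedelta

# Assumed window suffixes; only "m" appears in the example above.
UNITS = {"s": "seconds", "m": "minutes", "h": "hours"}


def parse_window(window: str) -> timedelta:
    """Convert a string such as '5m' into a timedelta."""
    return timedelta(**{UNITS[window[-1]]: int(window[:-1])})


def rule_fires(rule: dict, samples: list[tuple[datetime, float]],
               now: datetime) -> bool:
    """Evaluate a rule against (timestamp, value) samples (illustrative)."""
    cutoff = now - parse_window(rule["window"])
    recent = [value for ts, value in samples if cutoff <= ts <= now]
    if not recent:
        return False  # no data in the window: do not fire
    mean = sum(recent) / len(recent)
    if rule["condition"] == "gt":
        return mean > rule["threshold"]
    if rule["condition"] == "lt":
        return mean < rule["threshold"]
    raise ValueError(f"unsupported condition: {rule['condition']}")


rule = {"metric": "error_rate", "condition": "gt", "threshold": 0.01, "window": "5m"}
now = datetime(2025, 6, 1, 12, 0)
samples = [
    (now - timedelta(minutes=10), 0.50),  # outside the window, ignored
    (now - timedelta(minutes=4), 0.02),
    (now - timedelta(minutes=1), 0.01),
]
print(rule_fires(rule, samples, now))
```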

Data Export

Export analytics data for external BI tools:

# Export as CSV
curl "https://cloud.hanzo.ai/v1/analytics/export?format=csv&range=30d" \
  -H "Authorization: Bearer $HANZO_TOKEN" \
  -o analytics-export.csv

# Stream to webhook (real-time)
curl -X POST https://cloud.hanzo.ai/v1/analytics/streams \
  -H "Authorization: Bearer $HANZO_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "destination": "https://your-bi-tool.com/ingest",
    "events": ["request.completed", "cost.updated"],
    "format": "json"
  }'
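
Once a CSV export is downloaded, it can be aggregated with the Python standard library before loading it into a BI tool. The sketch below totals cost by a chosen column. The export's real column layout is not documented here, so the column names (`date`, `project`, `model`, `requests`, `cost`) and the sample rows are purely illustrative assumptions.

```python
import csv
import io
from collections import defaultdict

# Illustrative data only; the actual export columns may differ.
SAMPLE_CSV = """date,project,model,requests,cost
2025-06-01,platform,claude-sonnet,1000,12.50
2025-06-01,bot,gpt-4o,400,6.25
2025-06-02,platform,claude-sonnet,1200,15.00
"""


def cost_by(column: str, csv_text: str) -> dict[str, float]:
    """Sum the assumed 'cost' column grouped by another column."""
    totals: dict[str, float] = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row[column]] += float(row["cost"])
    return dict(totals)


print(cost_by("project", SAMPLE_CSV))
print(cost_by("date", SAMPLE_CSV))
```

For a real export, the same function could read the text of `analytics-export.csv` once its header row is known.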

SDK Integration

Python

from hanzoai import Hanzo

client = Hanzo(api_key="your-key")

# Get usage for current billing period
usage = client.analytics.usage(
    group_by=["model"],
    metrics=["requests", "tokens", "cost"]
)

for row in usage.data:
    print(f"{row.model}: {row.requests} requests, ${row.cost:.2f}")

TypeScript

import Hanzo from '@hanzo/ai'

const client = new Hanzo({ apiKey: 'your-key' })

const usage = await client.analytics.usage({
  groupBy: ['model', 'project'],
  metrics: ['requests', 'tokens', 'cost'],
  range: '7d'
})

usage.data.forEach(row => {
  console.log(`${row.model}: ${row.requests} requests, $${row.cost.toFixed(2)}`)
})

Platform management and billing

LLM Gateway — primary source of usage data

Runtime telemetry and performance metrics

Billing and payment integration
