Quick Start

Go Live in 3 Steps

Stand up Hyperion locally, issue your first API call, and verify cache headers in under five minutes. Keep this page open while you run commands.

01
Boot Gateway
Clone, configure, and run containers.
02
Send Request
Use SDK or REST on /v1/chat/completions.
03
Verify Cache
Confirm the X-Cache-Status header and faster repeat latency.
Step 1

Boot Hyperion

Launch the stack with Docker Compose. This starts the gateway, Redis, Postgres, and supporting services.

Terminal
git clone https://github.com/hyperion-hq/hyperion.git
cd hyperion
cp .env.example .env
docker compose up -d --build
.env
# Required for admin APIs
ADMIN_API_KEY=change_me

# Provider key (example)
OPENAI_API_KEY=your_provider_key

# Data stores
REDIS_URL=redis://redis:6379
DATABASE_URL=postgres://postgres:postgres@postgres:5432/hyperion?sslmode=disable
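Before booting, it is worth sanity-checking your .env values. The sketch below is not part of Hyperion; it simply parses plain KEY=value lines (no quoting or multi-line values) and flags keys still set to the placeholder values shipped in .env.example:

```python
# Minimal .env sanity check: a sketch, assuming simple KEY=value lines
# with no quoting, export prefixes, or multi-line values.
def parse_env(text: str) -> dict:
    """Parse KEY=value lines, skipping blanks and # comments."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first "=" only, so values like
        # "postgres://...?sslmode=disable" stay intact.
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env

def placeholders(env: dict) -> list:
    """Return keys still set to obvious placeholder values."""
    suspicious = {"change_me", "your_provider_key", ""}
    return [k for k, v in env.items() if v in suspicious]

if __name__ == "__main__":
    with open(".env") as f:
        env = parse_env(f.read())
    print("Unset or placeholder keys:", placeholders(env))
```

Run it from the repo root after editing .env; an empty list means every required key has a real value.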
Step 2

Send Your First Request

Call Hyperion with your API key. Keep your existing OpenAI-compatible request structure.

from hyperion import HyperionClient

client = HyperionClient(
    base_url="http://localhost:8080/v1",
    api_key="sk_live_your_hyperion_key"
)

response = client.chat.completions.create(
    model="openai/gpt-4.1-mini",
    messages=[{"role": "user", "content": "Write a one-line haiku about speed."}]
)

# OpenAI-compatible response shape
print(response.choices[0].message.content)
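If you would rather call the REST endpoint directly than use the SDK, the same request can be made with only the standard library. This is a sketch, assuming the gateway listens on localhost:8080 and accepts a Bearer API key, per the SDK example above:

```python
import json
import urllib.request

def build_chat_request(base_url: str, api_key: str, model: str, prompt: str):
    """Build an OpenAI-compatible POST to {base_url}/chat/completions."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

if __name__ == "__main__":
    req = build_chat_request(
        "http://localhost:8080/v1",
        "sk_live_your_hyperion_key",
        "openai/gpt-4.1-mini",
        "Write a one-line haiku about speed.",
    )
    with urllib.request.urlopen(req) as resp:
        # X-Cache-Status is checked in Step 3.
        print(resp.status, resp.headers.get("X-Cache-Status"))
        print(json.load(resp)["choices"][0]["message"]["content"])
```

Keeping the request builder separate from the send makes it easy to reuse the same payload when you repeat the call in Step 3.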
Step 3

Validate Setup

Success Checklist
Response status is 200 OK.
X-Cache-Status header appears in responses.
A second identical call returns faster, served by the L1 exact-match cache.
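The checklist above can be scripted. A sketch, assuming X-Cache-Status reports MISS on the first call and HIT on an identical repeat; the helpers themselves are generic:

```python
import time

def time_call(fn):
    """Run fn once and return (elapsed_seconds, result)."""
    start = time.perf_counter()
    result = fn()
    return time.perf_counter() - start, result

def cache_verdict(status_first: str, status_second: str,
                  t_first: float, t_second: float) -> str:
    """Summarize whether the second identical call hit the cache."""
    hit = status_second.upper() == "HIT"
    faster = t_second < t_first
    if hit and faster:
        return "cache working: HIT and faster repeat"
    if hit:
        return "cache HIT, but no latency win yet"
    return "no cache hit: check X-Cache-Status on both calls"
```

To use it, wrap the same chat completion call from Step 2 in `time_call` twice, read X-Cache-Status from each response, and pass both statuses and timings to `cache_verdict`.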
Next
Architecture

Understand routing, caching, and gateway request flow.

Explore
Caching Deep Dive

Exact and semantic strategies, TTL, and hit-ratio behavior.

Last updated: Feb 22, 2026