Predictable.
ROI-Driven.

Microsecond gateway latency. Up to 40% lower LLM costs. Zero-downtime failover — built for teams that can't afford to slow down.

Free

Perfect for side projects and learning

$0

Core Quotas

10K requests
1 user seat
Cloud
Unified API endpoint
Automatic failover
2 days log retention
Redis exact-match caching
Global budget alerts (Email)
Community support
Project and team analysis

Most Popular

Starter

For small teams and early-stage startups

$30/mo

Core Quotas

30K + overage requests
5 user seats
Cloud
Everything in Free
Semantic Caching (Qdrant L2)
RBAC (2 Roles)
Health Monitoring
Per-user budget & Auto-cutoff
30 days log retention
Email support (48h)

Business

Advanced caching and compliance

$99/mo

Core Quotas

100K + overage requests
50 user seats
Cloud
Everything in Starter
3-Layer Caching (L1 + L2 + S3)
Multi-key rotation & Load balancing
SSO (Google, GitHub)
Jaeger tracing & CSV export
24h support response
Project and team analysis

Enterprise

Complete control and dedicated support

Custom

Core Quotas

Unlimited requests
Unlimited user seats
Self-hosted / Cloud
Everything in Business
VPC / Private cloud deployment
SAML SSO & Custom RBAC
Snowflake / BigQuery export
Dedicated Slack channel
99.9% Uptime SLA
PII Sanitization (Redaction)
Project and team analysis

Open Source

Community Edition

Open source core for self-hosting

Unified API + All providers
Automatic failover
Full caching (L1 Redis + L2 Qdrant)
Role-Based Access Control (RBAC)
Full dashboard & Analytics
30 days log retention
Smart model routing
Project and team analysis

Price

Free Forever

License

AGPL-3.0

Feature Comparison

A technical breakdown of capabilities across all tiers.

Capability | Free | Starter | Business | Enterprise

Usage & Limits
Monthly Requests | 10K | 30K + Overage | 100K + Overage | Unlimited
User Seats | 1 | 5 | 50 | Unlimited
Log Retention | 2 Days | 30 Days | 90 Days | Custom
API Keys per User | 10 | 20 | 50 | Unlimited

Caching Infrastructure
Exact Match Caching (Redis)
Semantic Caching (Qdrant)
Long-term Archive (S3)
Cache TTL | 1 Hour | 1 Hour | 6 Hours | Custom

Cost Controls
Global Budget Alerts
Per-User Cutoffs
Budget Intervals | Monthly | Monthly | D/W/M | Custom
Custom Model Pricing

Provider Management
Automatic Failover
Health Monitoring
Multi-Key Rotation
Load Balancing

Security & Support
SSO | - | - | Google/GitHub | SAML + All
RBAC | - | 2 Roles | 4 Roles | Custom
SLA | - | - | 99.5% | 99.9%
Dedicated Support | - | - | - | Slack + Video

Frequently Asked Questions

Everything you need to know about Hyperion pricing and policies.

What counts as a 'request'?

Each API call to the gateway counts as one request, including calls served from cache. Cached responses are counted so that your usage totals reflect overall throughput and savings.

How does overage billing work?

On Starter and Business plans, extra requests are billed at $8 per 100,000 additional requests, up to a maximum of 3 million requests per month.
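As a rough illustration of the overage rule above, here is a minimal Python sketch that estimates a pre-tax monthly bill. The helper name `estimate_monthly_cost` is hypothetical, and two details are assumptions rather than stated policy: that overage is prorated per request (not billed in whole 100,000-request blocks), and that the 3 million maximum applies to total monthly requests.

```python
from decimal import Decimal

# Base price and included requests, taken from the plan cards on this page.
PLANS = {
    "starter": {"base_usd": Decimal("30"), "included": 30_000},
    "business": {"base_usd": Decimal("99"), "included": 100_000},
}
OVERAGE_USD_PER_100K = Decimal("8")
# Assumption: the 3 million limit is on total monthly requests.
MONTHLY_REQUEST_CAP = 3_000_000


def estimate_monthly_cost(plan: str, requests: int) -> Decimal:
    """Estimate the pre-tax monthly cost in USD (hypothetical helper)."""
    if requests < 0:
        raise ValueError("requests must be non-negative")
    if requests > MONTHLY_REQUEST_CAP:
        raise ValueError("request volume exceeds the 3 million monthly maximum")
    tier = PLANS[plan]
    extra = max(0, requests - tier["included"])
    # Assumption: overage is prorated per request, not rounded up to 100K blocks.
    overage = OVERAGE_USD_PER_100K * extra / Decimal(100_000)
    return (tier["base_usd"] + overage).quantize(Decimal("0.01"))


if __name__ == "__main__":
    for plan in PLANS:
        print(plan, estimate_monthly_cost(plan, 250_000))
```

Under these assumptions, 250,000 requests on Starter would come to $30 + 2.2 × $8 = $47.60 before tax, and the same volume on Business to $99 + 1.5 × $8 = $111.00.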

Can I pay in local currency?

Not yet. We accept payments from customers worldwide, but all charges are billed in USD. Support for more currencies is planned.

What is the difference between Community and Enterprise self-hosting?

The two differ in both feature set and support. Community Edition (AGPL-3.0) includes core L1 (Redis) and L2 (Qdrant) caching, but it lacks SSO and budget cutoffs. Enterprise self-hosting includes the full proprietary feature set, SOC2/PII compliance, and 24/7 dedicated engineering support for mission-critical production loads.

All prices EXCLUDE applicable taxes (e.g. 18% GST for India).
Annual savings are calculated as 2 months free compared with monthly rates.
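
The two footnotes above amount to simple arithmetic, sketched here in Python. The function names are hypothetical, and the sketch assumes that "2 months free" means an annual plan costs ten times the monthly price and that tax is a flat percentage added to the listed price.

```python
from decimal import Decimal


def annual_price(monthly_usd: Decimal) -> Decimal:
    """Annual price, assuming 12 months of service for the price of 10."""
    return monthly_usd * 10


def with_tax(amount_usd: Decimal, tax_rate: Decimal) -> Decimal:
    """Add a flat tax rate (e.g. Decimal('0.18') for 18% GST) to a listed price."""
    return (amount_usd * (1 + tax_rate)).quantize(Decimal("0.01"))


if __name__ == "__main__":
    starter = Decimal("30")
    print(annual_price(starter))                 # annual Starter price before tax
    print(with_tax(starter, Decimal("0.18")))    # monthly Starter price with 18% GST
```

On these assumptions, Starter would be $300 per year before tax, and a $30 monthly charge with 18% GST would be $35.40.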