Predictable. ROI-Driven.

Microsecond gateway latency. Up to 40% lower LLM costs.
Zero-downtime failover — built for teams that can't afford to slow down.

Free

Perfect for side projects and learning

Core Quotas

10K requests

1 user seat

Cloud

Unified API endpoint

Automatic failover

2 days log retention

Redis exact-match caching

Global budget alerts (Email)

Community support

Project and team analysis

Starter

For small teams and early-stage startups

$30/mo

Core Quotas

30K requests + overage

5 user seats

Cloud

Everything in Free

Semantic Caching (Qdrant L2)

RBAC (2 Roles)

Health Monitoring

Per-user budget & Auto-cutoff

30 days log retention

Email support (48h)

Business

Advanced caching and compliance

$99/mo

Core Quotas

100K requests + overage

50 user seats

Cloud

Everything in Starter

3-Layer Caching (L1 + L2 + S3)

Multi-key rotation & Load balancing

SSO (Google, GitHub)

Jaeger tracing & CSV export

24h support response

Project and team analysis

Enterprise

Complete control and dedicated support

Custom

Core Quotas

Unlimited requests

Unlimited user seats

Self-hosted / Cloud

Everything in Business

VPC / Private cloud deployment

SAML SSO & Custom RBAC

Snowflake / BigQuery export

Dedicated Slack channel

99.9% Uptime SLA

PII Sanitization (Redaction)

Project and team analysis

Open Source

Community Edition

Open source core for self-hosting

Unified API + All providers

Automatic failover

Full caching

Role-Based Access Control (RBAC)

Full dashboard & Analytics

30 days log retention

Smart model routing

Project and team analysis

Price

Free Forever

License

AGPL-3.0

Feature Comparison

A technical breakdown of capabilities across all tiers.

Capability	Free	Starter	Business	Enterprise
Usage & Limits
Monthly Requests	10K	30K + Overage	100K + Overage	Unlimited
User Seats	1	5	50	Unlimited
Log Retention	2 Days	30 Days	90 Days	Custom
API Keys per User	10	20	50	Unlimited
Caching Infrastructure
Exact Match Caching (Redis)	Yes	Yes	Yes	Yes
Semantic Caching (Qdrant)	-	Yes	Yes	Yes
Long-term Archive (S3)	-	-	Yes	Yes
Cache TTL	1 Hour	1 Hour	6 Hours	Custom
Cost Controls
Global Budget Alerts	Yes	Yes	Yes	Yes
Per-User Cutoffs	-	Yes	Yes	Yes
Budget Intervals	Monthly	Monthly	D/W/M	Custom
Custom Model Pricing
Provider Management
Automatic Failover	Yes	Yes	Yes	Yes
Health Monitoring	-	Yes	Yes	Yes
Multi-Key Rotation	-	-	Yes	Yes
Load Balancing	-	-	Yes	Yes
Security & Support
SSO	-	-	Google/GitHub	SAML + All
RBAC	-	2 Roles	4 Roles	Custom
SLA	-	-	99.5%	99.9%
Dedicated Support	-	-	-	Slack + Video
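
To make the "Exact Match Caching" and "Cache TTL" rows concrete, here is a minimal sketch of how an exact-match cache with a time-to-live could behave. It is an illustration only, not Hyperion's implementation: the in-memory dictionary stands in for Redis, and the names `cache_key` and `ExactMatchCache` and the choice of hashing the model plus messages are all assumptions.

```python
import hashlib
import json
import time
from typing import Callable, Dict, List, Optional, Tuple


def cache_key(model: str, messages: List[dict]) -> str:
    """Hash the full request so only byte-identical requests share a key."""
    payload = json.dumps({"model": model, "messages": messages}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class ExactMatchCache:
    """Hypothetical in-memory stand-in for a Redis-style cache with a TTL."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl_seconds = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get(self, key: str) -> Optional[str]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired: drop the entry and report a miss
            return None
        return value

    def set(self, key: str, value: str) -> None:
        self._entries[key] = (self._clock() + self.ttl_seconds, value)
```

With `ttl_seconds=3600` (the 1 Hour TTL listed for Free and Starter), a repeated identical prompt would be served from the cache for up to an hour, while a prompt that differs by even one character would miss.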

Frequently Asked Questions

Everything you need to know about Hyperion pricing and policies.

What counts as a 'request'?

Each API call to the gateway counts as 1 request, including calls served from cache. Cached responses are billed so that your request count reflects total throughput and cache savings.

How does overage billing work?

On Starter and Business plans, extra requests are billed at $8 per 100,000 additional requests, up to a maximum of 3 million requests per month.
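
As a rough illustration of the overage arithmetic, the sketch below estimates a monthly bill before taxes. The plan prices, included requests, $8 per 100,000 rate, and 3 million limit come from this page. Two points are assumptions: that overage is pro-rated rather than rounded up to whole 100,000-request blocks, and that the 3 million maximum applies to total monthly requests. `Plan` and `monthly_cost` are hypothetical names.

```python
from dataclasses import dataclass

OVERAGE_USD_PER_100K = 8.0        # overage rate stated in the FAQ
MONTHLY_REQUEST_CAP = 3_000_000   # assumed to cap total monthly requests


@dataclass(frozen=True)
class Plan:
    name: str
    base_usd: float
    included_requests: int


STARTER = Plan("Starter", 30.0, 30_000)
BUSINESS = Plan("Business", 99.0, 100_000)


def monthly_cost(plan: Plan, requests: int) -> float:
    """Estimate one month's bill in USD, before taxes."""
    if requests < 0:
        raise ValueError("requests must be non-negative")
    if requests > MONTHLY_REQUEST_CAP:
        raise ValueError("above the 3 million monthly maximum")
    extra = max(0, requests - plan.included_requests)
    # Assumption: overage is pro-rated, not billed in full 100K blocks.
    return plan.base_usd + extra * OVERAGE_USD_PER_100K / 100_000
```

Under these assumptions, 130,000 requests on Starter would come to $30 + $8 = $38 before taxes, since 100,000 of those requests are overage.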

Can I pay in local currency?

Not yet. We accept payments worldwide in USD, and support for more currencies is on the way.

What is the difference between Community and Enterprise self-hosting?

There is a massive gap in reliability and feature set. While Community Edition (AGPL-3.0) includes core L1 (Redis) and L2 (Qdrant) caching, it lacks Semantic Caching, RBAC, SSO, and Budget Cutoffs. Enterprise self-hosting includes the full proprietary feature set, SOC2/PII compliance, and 24/7 dedicated engineering support for mission-critical production loads.

All prices EXCLUDE applicable taxes (e.g. 18% GST for India).
Annual savings are calculated as 2 months free compared with monthly rates.
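
The two notes above can be worked through with a short sketch. It assumes that "2 months free" means an annual plan costs 10 times the monthly price, and it uses the 18% GST figure quoted for India as the example tax rate. The function names are hypothetical, and actual invoices may round differently.

```python
def annual_price(monthly_usd: float) -> float:
    """Annual price, assuming 12 months are billed at the cost of 10."""
    return monthly_usd * 10


def with_tax(amount_usd: float, tax_rate: float) -> float:
    """Add tax on top of a tax-exclusive price, rounded to cents."""
    return round(amount_usd * (1 + tax_rate), 2)


starter_annual = annual_price(30.0)          # Starter at $30/mo
business_annual = annual_price(99.0)         # Business at $99/mo
starter_with_gst = with_tax(starter_annual, 0.18)
```

On this reading, Starter would be $300 per year and Business $990 per year before taxes, and a Starter annual plan with 18% GST would total $354.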
