Built for AI products

Track AI usage.
Bill it accurately.
Without rebuilding your stack.

UsageBox meters tokens, GPU minutes, agent runtime, and tool calls per customer. Open-source storage engine. Connects to Stripe or your own invoicing.

Idempotent ingestion
Immutable audit trail
Open-source storage engine
POST /v1/events
curl -X POST https://api.usagebox.com/v1/events \
  -H "Authorization: Bearer $UBX_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "event_id": "req_8x42jk",
    "account_id": "acme-co",
    "meter": "llm_tokens_in",
    "model": "claude-4.5-sonnet",
    "quantity": 12450,
    "timestamp": "2026-05-16T18:42:11Z"
  }'

# Idempotent. Retries are safe.
# Rolls up hourly. Invoiced via Stripe.

Why generic billing breaks on AI workloads

Stripe Billing, Chargebee, Recurly: built for SaaS subscriptions in the 2010s. AI usage is a different shape of problem.

Volume breaks generic billing

Stripe's metered usage assumes thousands of events per customer per month. AI products send millions per day. Generic tools throttle, fail silently, or charge you per event.

Attribution needs a graph

An agent run is N tool calls + M LLM calls + K memory ops. Each has its own cost. Generic billing tools can't roll those into a single billable unit without engineering work you keep redoing.

Attacks show up in the bill

A user can manipulate prompts to trigger expensive generations. Without per-user spend ceilings and real-time anomaly detection, the first sign of an attack is your AWS or OpenAI invoice next month.

Built for the way AI products actually meter

Six primitives. Each one designed for the AI billing patterns generic tools fight you on.

Token Metering

Meter LLM input + output tokens per request, per agent, per tenant. Idempotent ingestion handles retries without double-billing.
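A minimal sketch of what idempotent ingestion means in practice, assuming deduplication is keyed on event_id as in the request above. MeterStore is a hypothetical in-memory stand-in, not the UsageBox or usagedb implementation.

```python
from dataclasses import dataclass, field


@dataclass
class MeterStore:
    """In-memory stand-in for an idempotent event store (illustrative only)."""
    seen: set = field(default_factory=set)      # event_ids already accepted
    totals: dict = field(default_factory=dict)  # (account_id, meter) -> quantity

    def ingest(self, event: dict) -> bool:
        """Count the event once; return False for a duplicate delivery."""
        if event["event_id"] in self.seen:
            return False  # a retry of an event that was already counted
        self.seen.add(event["event_id"])
        key = (event["account_id"], event["meter"])
        self.totals[key] = self.totals.get(key, 0) + event["quantity"]
        return True


store = MeterStore()
event = {
    "event_id": "req_8x42jk",
    "account_id": "acme-co",
    "meter": "llm_tokens_in",
    "quantity": 12450,
}
first = store.ingest(event)  # accepted and counted
retry = store.ingest(event)  # same event_id, ignored
```

Because the client chooses the event_id, a timed-out request can be resent as-is and the total stays the same.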

GPU Minute Pricing

Bill inference time, fine-tuning runtime, or per-job GPU usage. Catalog-driven pricing rules; no code changes to update rates.
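One way to picture catalog-driven pricing: rates live in data, and the pricing function only looks them up. The meter names and rates below are made up for illustration and are not UsageBox's catalog format.

```python
from decimal import Decimal

# Hypothetical rate catalog: meter name -> USD per GPU minute.
CATALOG = {
    "gpu_minutes_a100": Decimal("0.05"),
    "gpu_minutes_h100": Decimal("0.12"),
}


def price(meter: str, quantity: int, catalog: dict = CATALOG) -> Decimal:
    """Look up the rate for a meter and return the charge, rounded to cents."""
    rate = catalog[meter]  # an unknown meter raises KeyError on purpose
    return (rate * quantity).quantize(Decimal("0.01"))


charge = price("gpu_minutes_h100", 90)

# Changing a rate is a data change; price() itself is untouched.
new_catalog = {**CATALOG, "gpu_minutes_h100": Decimal("0.10")}
new_charge = price("gpu_minutes_h100", 90, new_catalog)
```

Decimal is used instead of float so that rates such as 0.12 stay exact when multiplied and rounded.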

Per-Agent Cost Attribution

Track cost for each agent run across N tool calls + M LLM calls + memory ops. Roll up to tenant invoices automatically.
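A rough sketch of the rollup described above, assuming every cost event carries a run_id and a tenant. The field names and costs are hypothetical.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical cost events emitted during two agent runs.
events = [
    {"tenant": "acme-co", "run_id": "run_1", "kind": "llm_call", "cost": Decimal("0.0310")},
    {"tenant": "acme-co", "run_id": "run_1", "kind": "tool_call", "cost": Decimal("0.0020")},
    {"tenant": "acme-co", "run_id": "run_1", "kind": "memory_op", "cost": Decimal("0.0005")},
    {"tenant": "acme-co", "run_id": "run_2", "kind": "llm_call", "cost": Decimal("0.0150")},
]


def attribute(events):
    """Sum cost per agent run and per tenant in a single pass."""
    per_run = defaultdict(Decimal)     # (tenant, run_id) -> total cost
    per_tenant = defaultdict(Decimal)  # tenant -> total cost
    for e in events:
        per_run[(e["tenant"], e["run_id"])] += e["cost"]
        per_tenant[e["tenant"]] += e["cost"]
    return dict(per_run), dict(per_tenant)


per_run, per_tenant = attribute(events)
```

The per-run total is the single billable unit; the per-tenant total is what would land on the invoice.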

Real-Time Anomaly Detection

Cost-amplification attacks land in your usage data first. Per-user spend ceilings, alerting on token surges, kill-switches.
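A per-user spend ceiling with a kill-switch could look roughly like this. SpendGuard is a hypothetical in-memory class; a real deployment would also need time windows, persistence, and alerting.

```python
from decimal import Decimal


class SpendGuard:
    """Per-user spend ceiling with a kill-switch (illustrative, in-memory)."""

    def __init__(self, ceiling: Decimal):
        self.ceiling = ceiling
        self.spent = {}       # user_id -> Decimal spent in the current window
        self.blocked = set()  # users whose kill-switch has tripped

    def allowed(self, user_id: str) -> bool:
        """Check before serving a request."""
        return user_id not in self.blocked

    def record(self, user_id: str, cost: Decimal) -> bool:
        """Add a cost; trip the switch once the ceiling is exceeded."""
        total = self.spent.get(user_id, Decimal("0")) + cost
        self.spent[user_id] = total
        if total > self.ceiling:
            self.blocked.add(user_id)
        return self.allowed(user_id)


guard = SpendGuard(ceiling=Decimal("1.00"))
ok_first = guard.record("user_7", Decimal("0.60"))   # 0.60 spent, under the ceiling
ok_second = guard.record("user_7", Decimal("0.60"))  # 1.20 spent, switch trips
```

The request that crosses the ceiling is still recorded, since its cost was already incurred; later requests are refused by the allowed() check.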

Hourly Rollups

Raw events feed hourly aggregates. Invoice generation is O(1) in raw event count per account: it reads a bounded set of hourly rows instead of scanning millions of events.
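To make the rollup idea concrete, here is a small sketch that buckets raw events by hour and totals a billing period from the buckets alone. The event shape follows the request example; the functions are hypothetical and not usagedb's API.

```python
from collections import defaultdict
from datetime import datetime, timezone


def hour_bucket(ts: str) -> str:
    """Truncate an ISO-8601 UTC timestamp to the start of its hour."""
    dt = datetime.fromisoformat(ts.replace("Z", "+00:00"))
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:00:00Z")


def rollup(events):
    """Aggregate raw events into (account, meter, hour) -> quantity."""
    totals = defaultdict(int)
    for e in events:
        key = (e["account_id"], e["meter"], hour_bucket(e["timestamp"]))
        totals[key] += e["quantity"]
    return dict(totals)


raw = [
    {"account_id": "acme-co", "meter": "llm_tokens_in", "quantity": 12450,
     "timestamp": "2026-05-16T18:42:11Z"},
    {"account_id": "acme-co", "meter": "llm_tokens_in", "quantity": 550,
     "timestamp": "2026-05-16T18:59:59Z"},
    {"account_id": "acme-co", "meter": "llm_tokens_in", "quantity": 1000,
     "timestamp": "2026-05-16T19:00:00Z"},
]
hourly = rollup(raw)

# Invoicing reads only the hourly rows, never the raw events.
period_total = sum(q for (acct, meter, _), q in hourly.items()
                   if acct == "acme-co" and meter == "llm_tokens_in")
```

A 30-day month has at most 720 hourly rows per account and meter, however many raw events were ingested.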

Stripe + Manual Invoicing

Plug into Stripe for self-serve. Or generate finance-ready invoices for enterprise contracts. Same metering pipeline.

The storage engine is open source

usagedb is the Rust storage engine UsageBox runs on. Append-only, idempotent, immutable raw event audit trail, hourly rollups for invoice queries. Apache 2.0 on GitHub.

Read the code that produces every invoice line. Fork it. Self-host the ingestion layer while still using UsageBox for the platform side. The right answer to “is your billing math correct” is “read the code yourself.”

pbudzik/usagedb

Or read the architecture overview in our usagedb article, then go deep with the 10-part engine internals series: ingest, dedupe, columnar segments, rollups, the query engine, and how it is tested.

Notes on AI billing

Practical writing on metering patterns, AI cost attribution, and what we learn from production billing systems.

Adyen Just Bought Orb for $335M: The Metering Layer Is Being Absorbed Into Payments (2026)

On June 11, 2026, Adyen agreed to acquire usage-based billing platform Orb (used by Vercel, Replit, Supabase, Glean) for $335M, expected to close ~July 1 alongside Talon.One. The pitch: unify billing and payments so merchants link pricing to payment performance and fraud risk; PYMNTS framed it as Adyen tackling complex AI pricing. The signal for teams choosing how to meter and bill AI usage: metering is now strategic infrastructure, the standalone metering category is consolidating into payments giants, and that reshapes build-vs-buy. "Buy" now carries acquisition risk, owning the metering core got more defensible, and portability is the load-bearing requirement. How to map vendor concentration before the deal closes.

Read →

The AI Usage Meter Is Now a Management Instrument: Every Token Your Team Spends Is a Tracked, Attributable Signal (2026)

When GitHub moved every Copilot plan to usage-based token billing on June 1, 2026, the lasting change was not the price - it was that the meter became a management instrument. Once usage is metered per request, per model, and per user, it becomes observable: who spends, on which workflows, how efficiently. A YouTube breakdown put it bluntly - "Every Token You Type Is Now a Penny Your Boss Tracks." The same per-person meter can be pointed for the team (a shared instrument panel that funds what works) or on the team (a surveillance leaderboard that drives the "pay the same, get anxiety for free" backlash). The metering tech is identical; the direction you point it is the decision that matters. Why this is the same pattern that turned AWS billing into FinOps, and why a meter for the team has to be real-time and attributable or it is just a slower invoice.

Read →

Claude Fable 5 Lasted 72 Hours: The Government Pulled It, and the Refunds Are Messy

Claude Fable 5 launched June 9 and was pulled worldwide on June 12 by a US Commerce export-control order (national security) barring foreign-national access, so Anthropic disabled Fable 5 and the Mythos 5 class for everyone. Live ~72 hours. Refunds opened (desktop-only, disputed). The reported trigger: a rival (WSJ named Amazon) showed Commerce a safety bypass; Anthropic disputes it. The buyer lesson: model availability is now a regulatory risk you must price and engineer for, with router fallbacks, eval suites, per-model metering, and refund-ready billing.

Read →

Start metering AI usage in 5 minutes

Free tier. No credit card. Open-source storage engine.

Get started free →