Built for AI products

Track AI usage.
Bill it accurately.
Without rebuilding your stack.

UsageBox meters tokens, GPU minutes, agent runtime, and tool calls per customer. Open-source storage engine. Connects to Stripe or your own invoicing.

Idempotent ingestion
Immutable audit trail
Open-source storage engine
POST /v1/events
curl -X POST https://api.usagebox.com/v1/events \
  -H "Authorization: Bearer $UBX_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "event_id": "req_8x42jk",
    "account_id": "acme-co",
    "meter": "llm_tokens_in",
    "model": "claude-4.5-sonnet",
    "quantity": 12450,
    "timestamp": "2026-05-16T18:42:11Z"
  }'

# Idempotent. Retries are safe.
# Rolls up hourly. Invoiced via Stripe.
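
For teams that send events from application code rather than a shell, the same request might look like the sketch below. The endpoint and field names are copied from the curl example; `build_event_request` is a hypothetical helper (not a UsageBox SDK call), and the JSON `Content-Type` header is an assumption.

```python
import json
import urllib.request

# Endpoint taken from the curl example above.
API_URL = "https://api.usagebox.com/v1/events"


def build_event_request(api_key: str, event: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request for one usage event."""
    body = json.dumps(event).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",  # assumed, not confirmed by the docs
        },
    )


event = {
    "event_id": "req_8x42jk",  # reuse the same ID on every retry of this request
    "account_id": "acme-co",
    "meter": "llm_tokens_in",
    "model": "claude-4.5-sonnet",
    "quantity": 12450,
    "timestamp": "2026-05-16T18:42:11Z",
}
request = build_event_request("test-key", event)
# Sending is left to the caller, e.g. urllib.request.urlopen(request)
```

Keeping `event_id` stable across retries is what lets the server treat a resend as the same event.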

Why generic billing breaks on AI workloads

Stripe Billing, Chargebee, Recurly: built for SaaS subscriptions in the 2010s. AI usage is a different shape of problem.

Volume breaks generic billing

Stripe's metered usage assumes thousands of events per customer per month. AI products send millions per day. Generic tools throttle, fail silently, or charge you per event.

Attribution needs a graph

An agent run is N tool calls + M LLM calls + K memory ops. Each has its own cost. Generic billing tools can't roll those into a single billable unit without engineering work you keep redoing.

Attacks show up in the bill

A user can manipulate prompts to trigger expensive generations. Without per-user spend ceilings and real-time anomaly detection, the first sign of an attack is your AWS or OpenAI invoice next month.

Built for the way AI products actually meter

Six primitives. Each one designed for the AI billing patterns generic tools fight you on.

Token Metering

Meter LLM input + output tokens per request, per agent, per tenant. Idempotent ingestion handles retries without double-billing.
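
To make "idempotent ingestion" concrete, here is a minimal in-memory sketch of the idea: an event is accepted once per `event_id`, so a retried request does not add to the billed total. `IdempotentLedger` is a hypothetical name for illustration and says nothing about how UsageBox or usagedb implement deduplication.

```python
class IdempotentLedger:
    """Hypothetical in-memory ledger that accepts each event_id at most once."""

    def __init__(self) -> None:
        self._events = {}  # event_id -> event

    def ingest(self, event: dict) -> bool:
        """Store the event; return False if this event_id was already seen."""
        event_id = event["event_id"]
        if event_id in self._events:
            return False  # retry of a known event: ignore it, do not double-bill
        self._events[event_id] = event
        return True

    def total(self, account_id: str, meter: str) -> int:
        """Sum the quantity of all accepted events for one account and meter."""
        return sum(
            e["quantity"]
            for e in self._events.values()
            if e["account_id"] == account_id and e["meter"] == meter
        )


ledger = IdempotentLedger()
event = {
    "event_id": "req_8x42jk",
    "account_id": "acme-co",
    "meter": "llm_tokens_in",
    "quantity": 12450,
}
first = ledger.ingest(event)   # accepted
retry = ledger.ingest(event)   # same event_id, so it is ignored
billed = ledger.total("acme-co", "llm_tokens_in")
```

After the retry, `billed` is still the quantity of the single original event.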

GPU Minute Pricing

Bill inference time, fine-tuning runtime, or per-job GPU usage. Catalog-driven pricing rules; no code changes to update rates.
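
One way to picture catalog-driven pricing: rates live in data, and the pricing function only looks them up. The sketch below is an assumption-laden illustration; the meter names and per-minute rates are invented, and `price_usage` is a hypothetical helper.

```python
from decimal import Decimal

# Hypothetical price catalog. Changing a rate means editing this data,
# not the pricing code. Meter names and rates are made up for illustration.
PRICE_CATALOG = {
    "gpu_minutes_a100": Decimal("0.06"),
    "gpu_minutes_h100": Decimal("0.12"),
}


def price_usage(meter: str, quantity: Decimal, catalog: dict) -> Decimal:
    """Return the charge for `quantity` units of `meter`, rounded to cents."""
    rate = catalog[meter]  # an unknown meter raises KeyError on purpose
    return (rate * quantity).quantize(Decimal("0.01"))


charge = price_usage("gpu_minutes_h100", Decimal("90"), PRICE_CATALOG)

# A rate update is a catalog change; price_usage itself is untouched.
updated_catalog = {**PRICE_CATALOG, "gpu_minutes_h100": Decimal("0.10")}
new_charge = price_usage("gpu_minutes_h100", Decimal("90"), updated_catalog)
```

`Decimal` is used instead of floats so that money amounts do not pick up binary rounding error.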

Per-Agent Cost Attribution

Track cost for each agent run across N tool calls + M LLM calls + K memory ops. Roll up to tenant invoices automatically.
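
As a rough sketch of that rollup, assume every cost record is tagged with a tenant and a run ID. The record shape, field names, and costs below are hypothetical.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical cost records: one row per LLM call, tool call, or memory op.
records = [
    {"tenant": "acme-co", "run_id": "run_1", "kind": "llm_call", "cost": Decimal("0.042")},
    {"tenant": "acme-co", "run_id": "run_1", "kind": "tool_call", "cost": Decimal("0.003")},
    {"tenant": "acme-co", "run_id": "run_1", "kind": "memory_op", "cost": Decimal("0.001")},
    {"tenant": "acme-co", "run_id": "run_2", "kind": "llm_call", "cost": Decimal("0.020")},
]


def rollup(records):
    """Sum costs per (tenant, run) and per tenant in a single pass."""
    per_run = defaultdict(Decimal)     # Decimal() is zero
    per_tenant = defaultdict(Decimal)
    for r in records:
        per_run[(r["tenant"], r["run_id"])] += r["cost"]
        per_tenant[r["tenant"]] += r["cost"]
    return dict(per_run), dict(per_tenant)


per_run, per_tenant = rollup(records)
```

Here `per_run` gives the single billable figure for each agent run, and `per_tenant` is what would feed the tenant's invoice.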

Real-Time Anomaly Detection

Cost-amplification attacks land in your usage data first. Per-user spend ceilings, alerting on token surges, kill-switches.
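
A per-user spend ceiling with a kill-switch can be sketched in a few lines. `SpendGuard` is a hypothetical class, the ceiling is an invented figure, and resetting the spend window is left out for brevity.

```python
from decimal import Decimal


class SpendGuard:
    """Hypothetical per-user spend ceiling with a kill-switch."""

    def __init__(self, ceiling: Decimal) -> None:
        self.ceiling = ceiling
        self.spent = {}       # user_id -> spend so far in the current window
        self.blocked = set()  # users whose kill-switch has tripped

    def record(self, user_id: str, cost: Decimal) -> bool:
        """Add `cost` to the user's spend; return True if they may continue."""
        if user_id in self.blocked:
            return False
        total = self.spent.get(user_id, Decimal("0")) + cost
        self.spent[user_id] = total
        if total > self.ceiling:
            self.blocked.add(user_id)  # trip the kill-switch
            return False
        return True


guard = SpendGuard(ceiling=Decimal("5.00"))
ok_first = guard.record("user_1", Decimal("3.00"))    # under the ceiling
ok_second = guard.record("user_1", Decimal("2.50"))   # 5.50 exceeds 5.00
ok_third = guard.record("user_1", Decimal("0.01"))    # already blocked
ok_other = guard.record("user_2", Decimal("1.00"))    # other users unaffected
```

The check runs at ingestion time, which is the point: the ceiling trips while the expensive requests are happening rather than when the provider invoice arrives.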

Hourly Rollups

Raw events feed hourly aggregates. Invoice generation is O(1) per account, not a scan over millions of rows.
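
The rollup step can be pictured as bucketing raw events by account, meter, and hour, then answering invoice queries from the buckets. This is an illustrative sketch only; the function names are hypothetical and it does not describe usagedb's actual storage layout.

```python
from collections import defaultdict
from datetime import datetime


def hour_bucket(timestamp: str) -> str:
    """Truncate a UTC timestamp such as 2026-05-16T18:42:11Z to its hour."""
    parsed = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.strftime("%Y-%m-%dT%H:00:00Z")


def rollup_hourly(events):
    """Aggregate raw events into (account, meter, hour) -> total quantity."""
    totals = defaultdict(int)
    for e in events:
        key = (e["account_id"], e["meter"], hour_bucket(e["timestamp"]))
        totals[key] += e["quantity"]
    return dict(totals)


def invoice_quantity(totals, account_id, meter):
    """Sum hourly aggregates for one account and meter; raw events are not read."""
    return sum(
        q for (acct, m, _hour), q in totals.items()
        if acct == account_id and m == meter
    )


events = [
    {"account_id": "acme-co", "meter": "llm_tokens_in", "quantity": 12450, "timestamp": "2026-05-16T18:42:11Z"},
    {"account_id": "acme-co", "meter": "llm_tokens_in", "quantity": 550, "timestamp": "2026-05-16T18:59:59Z"},
    {"account_id": "acme-co", "meter": "llm_tokens_in", "quantity": 1000, "timestamp": "2026-05-16T19:00:00Z"},
]
hourly = rollup_hourly(events)
invoiced = invoice_quantity(hourly, "acme-co", "llm_tokens_in")
```

Three raw events collapse into two hourly rows here; at production volume the same step turns millions of rows into at most one row per account, meter, and hour.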

Stripe + Manual Invoicing

Plug into Stripe for self-serve. Or generate finance-ready invoices for enterprise contracts. Same metering pipeline.

The storage engine is open source

usagedb is the Rust storage engine UsageBox runs on. It is append-only and idempotent, with an immutable raw-event audit trail and hourly rollups for invoice queries. Apache 2.0 on GitHub.

Read the code that produces every invoice line. Fork it. Self-host the ingestion layer while still using UsageBox for the platform side. The right answer to “is your billing math correct” is “read the code yourself.”

pbudzik/usagedb

Or read the architecture overview in our usagedb article, then go deep with the 10-part engine internals series: ingest, dedupe, columnar segments, rollups, the query engine, and how it is tested.

Notes on AI billing

Practical writing on metering patterns, AI cost attribution, and what we learn from production billing systems.

Stripe Billing's 0.7% Fee, Explained: What It Buys, Where the Breakeven Breaks, and the Four Exits

The fee every founder discovers at scale: Stripe Billing charges 0.7% of billing volume on top of payment processing, and it applies to subscriptions paid on AND off Stripe. The full anatomy: what the fee includes (dunning, Smart Retries, 100M meter events/month, portal, quotes), the 1,000 events/sec ceiling that drove the $1B Metronome acquisition, worked breakeven math ($70/month at $10K MRR vs $84K/year at $1M MRR), the pay-monthly tiers at 0.67%, and the four exits ranked by disruption: negotiate, unbundle the meter, replace the billing layer, build.

Read →

Tokenmaxxing: Microsoft Says AI Costs More Than Its People, Amazon Killed Its Usage Leaderboard, and the Adoption Era Just Ended

Three weeks ended the adoption-at-all-costs era: Microsoft's internal reports show AI agents costing more than human employees for many tasks (and it canceled most Claude Code licenses), Amazon scrapped its KiroRank AI leaderboard after employees began "tokenmaxxing" (running pointless agent tasks to climb rankings on the company's dime), Sam Altman conceded token costs are "an issue," and the Linux Foundation launched the Tokenomics Foundation with Microsoft, Google Cloud, IBM, and JPMorganChase behind it. Why usage was always the wrong metric, the Goodhart's-law-at-compute-prices mechanics, and the three numbers (cost per task, value per task, the ratio's trend) that replace the leaderboard.

Read →

Fable 5 Is Eating Your Claude Plan: The 2x Burn, the June 23 Cliff, and the Usage-Credit Math

Claude Fable 5 is free on Pro/Max/Team plans June 9-22, 2026, but counts roughly DOUBLE the usage of Opus toward your limits; Max 20x users report burning 2% of their allowance per minute. On June 23 it leaves plan limits entirely and bills against prepaid usage credits at API rates ($10/$50 per MTok, $2,000/day redemption cap). What counts toward limits, the five-hour reset arithmetic, the June 23 decision tree (drop to Opus, buy credits, or move to the API), and six moves that stretch a plan through the squeeze.

Read →

Start metering AI usage in 5 minutes

Free tier. No credit card. Open-source storage engine.

Get started free →