Built for AI products

Track AI usage.
Bill it accurately.
Without rebuilding your stack.

UsageBox meters tokens, GPU minutes, agent runtime, and tool calls per customer. Open-source storage engine. Connects to Stripe or your own invoicing.

Idempotent ingestion
Immutable audit trail
Open-source storage engine
POST /v1/events
curl -X POST https://api.usagebox.com/v1/events \
  -H "Authorization: Bearer $UBX_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "event_id": "req_8x42jk",
    "account_id": "acme-co",
    "meter": "llm_tokens_in",
    "model": "claude-4.5-sonnet",
    "quantity": 12450,
    "timestamp": "2026-05-16T18:42:11Z"
  }'

# Idempotent. Retries are safe.
# Rolls up hourly. Invoiced via Stripe.

Why generic billing breaks on AI workloads

Stripe Billing, Chargebee, Recurly: built for SaaS subscriptions in the 2010s. AI usage is a different shape of problem.

Volume breaks generic billing

Stripe's metered usage assumes thousands of events per customer per month. AI products send millions per day. Generic tools throttle, fail silently, or charge you per event.

Attribution needs a graph

An agent run is N tool calls + M LLM calls + K memory ops. Each has a different unit cost. Generic billing tools can't roll those into a single billable unit without engineering work you keep redoing.

Attacks show up in the bill

A user can manipulate prompts to trigger expensive generations. Without per-user spend ceilings and real-time anomaly detection, the first sign of an attack is your AWS or OpenAI invoice next month.

Built for the way AI products actually meter

Six primitives. Each one designed for the AI billing patterns generic tools fight you on.

Token Metering

Meter LLM input + output tokens per request, per agent, per tenant. Idempotent ingestion handles retries without double-billing.
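
As a rough illustration of why retries do not double-bill, here is a minimal in-memory sketch. It assumes the event_id field from the request example serves as the idempotency key; the UsageEvent and IdempotentIngest names are hypothetical and are not part of UsageBox or usagedb.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageEvent:
    event_id: str      # assumed to act as the idempotency key
    account_id: str
    meter: str
    quantity: int


class IdempotentIngest:
    """Hypothetical in-memory stand-in for an ingestion layer that dedupes on event_id."""

    def __init__(self) -> None:
        self._seen: set[str] = set()
        self.totals: dict[tuple[str, str], int] = {}

    def ingest(self, event: UsageEvent) -> bool:
        """Count the event once; return False when it is a duplicate delivery."""
        if event.event_id in self._seen:
            return False
        self._seen.add(event.event_id)
        key = (event.account_id, event.meter)
        self.totals[key] = self.totals.get(key, 0) + event.quantity
        return True


store = IdempotentIngest()
event = UsageEvent("req_8x42jk", "acme-co", "llm_tokens_in", 12450)
first = store.ingest(event)   # first delivery is counted
retry = store.ingest(event)   # same event_id, e.g. a client retry after a timeout
```

The retry is acknowledged but ignored, so the account total stays at 12450 no matter how many times the client resends.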

GPU Minute Pricing

Bill inference time, fine-tuning runtime, or per-job GPU usage. Catalog-driven pricing rules; no code changes to update rates.
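
One way to picture catalog-driven pricing is rates stored as data and looked up at pricing time. The sketch below is illustrative only: the meter names and per-minute rates are made up, and the real catalog format may look quite different.

```python
from decimal import Decimal

# Hypothetical catalog: meter name -> assumed price per GPU minute, in dollars.
CATALOG = {
    "gpu_minutes_a100": Decimal("0.06"),
    "gpu_minutes_h100": Decimal("0.12"),
}


def price(meter: str, quantity: Decimal, catalog: dict[str, Decimal]) -> Decimal:
    """Look up the rate in the catalog and return the charge rounded to cents."""
    return (catalog[meter] * quantity).quantize(Decimal("0.01"))


job_cost = price("gpu_minutes_h100", Decimal("90"), CATALOG)

# A rate change is a catalog edit; the pricing function itself is untouched.
updated_catalog = {**CATALOG, "gpu_minutes_h100": Decimal("0.10")}
new_cost = price("gpu_minutes_h100", Decimal("90"), updated_catalog)
```

Decimal is used rather than float so that rates like 0.10 are represented exactly before rounding to cents.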

Per-Agent Cost Attribution

Track cost for each agent run across N tool calls + M LLM calls + memory ops. Roll up to tenant invoices automatically.
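
The rollup described above can be sketched as two aggregation passes: events to runs, then runs to tenants. This assumes every event is tagged with a run_id and tenant at write time; the field names and unit rates here are hypothetical.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical unit rates in dollars; real rates would come from a pricing catalog.
RATES = {
    "llm_tokens": Decimal("0.000003"),
    "tool_call": Decimal("0.002"),
    "memory_op": Decimal("0.0001"),
}

EVENTS = [
    {"tenant": "acme-co", "run_id": "run_1", "kind": "llm_tokens", "quantity": 12000},
    {"tenant": "acme-co", "run_id": "run_1", "kind": "tool_call", "quantity": 3},
    {"tenant": "acme-co", "run_id": "run_1", "kind": "memory_op", "quantity": 10},
    {"tenant": "acme-co", "run_id": "run_2", "kind": "llm_tokens", "quantity": 4000},
]


def cost_by_run(events):
    """Sum the cost of every event that shares a (tenant, run_id) pair."""
    totals = defaultdict(Decimal)
    for e in events:
        totals[(e["tenant"], e["run_id"])] += RATES[e["kind"]] * e["quantity"]
    return dict(totals)


def cost_by_tenant(run_costs):
    """Roll per-run costs up to one total per tenant."""
    totals = defaultdict(Decimal)
    for (tenant, _run_id), cost in run_costs.items():
        totals[tenant] += cost
    return dict(totals)


run_costs = cost_by_run(EVENTS)
tenant_costs = cost_by_tenant(run_costs)
```

Because each run keeps its own subtotal, the same data answers both "what did this agent run cost" and "what does this tenant owe".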

Real-Time Anomaly Detection

Cost-amplification attacks land in your usage data first. Per-user spend ceilings, alerting on token surges, kill-switches.
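
A spend ceiling and a token-surge check can be sketched in a few lines. This is a simplified illustration, not the detection logic UsageBox ships: SpendGuard, is_surge, and the 5x threshold are all assumptions made for the example.

```python
from decimal import Decimal


class SpendGuard:
    """Hypothetical per-user ceiling: reject any request that would exceed it."""

    def __init__(self, ceiling: Decimal) -> None:
        self.ceiling = ceiling
        self.spent: dict[str, Decimal] = {}

    def allow(self, user_id: str, cost: Decimal) -> bool:
        """Record the spend and return True only if the user stays within the ceiling."""
        projected = self.spent.get(user_id, Decimal("0")) + cost
        if projected > self.ceiling:
            return False  # acts as the kill-switch for this user
        self.spent[user_id] = projected
        return True


def is_surge(hourly_history: list[int], current: int, factor: float = 5.0) -> bool:
    """Flag the current hour if it exceeds `factor` times the trailing mean."""
    if not hourly_history:
        return False  # no baseline yet, so nothing to compare against
    baseline = sum(hourly_history) / len(hourly_history)
    return current > factor * baseline


guard = SpendGuard(ceiling=Decimal("10"))
ok = guard.allow("user_1", Decimal("6"))        # within the ceiling
blocked = guard.allow("user_1", Decimal("5"))   # would reach 11, so it is rejected
surge = is_surge([1000, 1200, 800], current=20000)
```

A trailing-mean threshold is the simplest possible baseline; a production detector would likely account for time-of-day patterns and per-tenant norms.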

Hourly Rollups

Raw events feed hourly aggregates. Invoice generation reads a bounded number of hourly buckets per account instead of scanning millions of raw rows.
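
To make the rollup idea concrete, here is a minimal sketch that truncates each event timestamp to its UTC hour and sums quantities per (account, meter, hour). It reuses the field names from the request example; the function names are hypothetical, and the in-memory dict stands in for whatever indexed storage the real engine uses.

```python
from collections import defaultdict
from datetime import datetime, timezone


def hour_bucket(ts: str) -> datetime:
    """Truncate an ISO 8601 timestamp to the start of its UTC hour."""
    parsed = datetime.fromisoformat(ts.replace("Z", "+00:00")).astimezone(timezone.utc)
    return parsed.replace(minute=0, second=0, microsecond=0)


def rollup(events):
    """Aggregate raw events into hourly totals keyed by (account, meter, hour)."""
    buckets = defaultdict(int)
    for e in events:
        key = (e["account_id"], e["meter"], hour_bucket(e["timestamp"]))
        buckets[key] += e["quantity"]
    return dict(buckets)


def invoice_quantity(buckets, account_id, meter, start, end):
    """Sum the hourly buckets in [start, end); raw events are never touched."""
    return sum(
        qty
        for (acct, m, hour), qty in buckets.items()
        if acct == account_id and m == meter and start <= hour < end
    )


events = [
    {"account_id": "acme-co", "meter": "llm_tokens_in", "quantity": 12450, "timestamp": "2026-05-16T18:42:11Z"},
    {"account_id": "acme-co", "meter": "llm_tokens_in", "quantity": 550, "timestamp": "2026-05-16T18:59:59Z"},
    {"account_id": "acme-co", "meter": "llm_tokens_in", "quantity": 1000, "timestamp": "2026-05-16T19:00:00Z"},
]
buckets = rollup(events)
may_total = invoice_quantity(
    buckets, "acme-co", "llm_tokens_in",
    datetime(2026, 5, 1, tzinfo=timezone.utc),
    datetime(2026, 6, 1, tzinfo=timezone.utc),
)
```

A 31-day month has at most 744 hourly buckets per meter, so the invoice query stays small regardless of how many raw events were ingested.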

Stripe + Manual Invoicing

Plug into Stripe for self-serve. Or generate finance-ready invoices for enterprise contracts. Same metering pipeline.

The storage engine is open source

usagedb is the Rust storage engine UsageBox runs on. It is append-only and idempotent, keeps an immutable audit trail of raw events, and maintains hourly rollups for invoice queries. Apache 2.0 on GitHub.

Read the code that produces every invoice line. Fork it. Self-host the ingestion layer while still using UsageBox for the platform side. The right answer to “is your billing math correct” is “read the code yourself.”

pbudzik/usagedb

Or read the architecture overview in our usagedb article, then go deep with the 10-part engine internals series: ingest, dedupe, columnar segments, rollups, the query engine, and how it is tested.

Notes on AI billing

Practical writing on metering patterns, AI cost attribution, and what we learn from production billing systems.

The $81,000 Meme Game: How One Slash Employee's Claude Bill Became the Face of Enterprise AI Bill Shock

Slash, a $1.4B fintech, told employees to lean into AI coding. Its head of strategic verticals took the memo seriously and burned $81,267 in Claude tokens in one week building "Brainrot Shooter," a Skibidi Toilet meme game. The story went viral on June 23, and after the coverage the game pulled ~6,900 players in 48 hours, so finance reclassified the incident as a strategic initiative. It is the perfect specimen of 2026's defining billing event: the shock bill has moved upmarket, from leaked API keys to unmetered internal seats. Same month: Uber burned its annual AI budget in four months, Microsoft canceled internal Claude Code licenses, Amazon killed its token leaderboard, and one Axios-reported client spent half a billion dollars in a single month on uncapped Claude licenses. The teardown: how an agent loop turns one seat into billions of tokens, and the four controls (per-seat gateway budgets, live meters, anomaly alerts, write-time attribution) that turn an $81K week into a $500 week plus a Slack message.

Read →

GPT-5.6 Pricing: Luna at $1/$6 Is the Real Story - and the "Quiet Tier-Up" to Price In (2026)

GPT-5.6's tier pricing is out and the naming is now official: Sol at $5/$30 per 1M tokens (same list price as GPT-5.5), Terra at $2.50/$15, and Luna at $1/$6 - a new cheap production bracket with no direct predecessor. The community verdict: Luna is the significant one, because the workhorse tier is where the volume lives. The skeptics' receipt-backed counter: GPT-5.5's output price had already doubled from $15 to $30, so "Sol holds the line" may just mean the next frontier bracket quietly steps to $60 while being marketed as "2.5x cheaper than Pro." This breaks down all three tiers against prior anchors and DeepSeek V4 Flash (still 7-20x cheaper than Luna on list), the caching economics, the gated-preview asterisk (~20 vetted partners, US-only), and the defensive posture that works whether or not the ratchet theory is true: track blended cost per task across generations, ignore vendor-framed comparisons, and keep a benchmarked fallback in a router.

Read →

Usage-Based Billing After the Buyouts: Stripe Owns Metronome, Adyen Owns Orb, Salesforce Took m3ter - Who to Pick Now (2026)

In under six months every major pure-play usage-billing specialist was acquired: Metronome by Stripe (reported ~$1B), Orb by Adyen ($335M, closing around July 1), and m3ter by Salesforce. The standalone metering category no longer exists - what exists is billing owned by your payment processor, billing owned by a CRM, open-source independents (Lago the most visible), and the build-on-your-own-meter path. That changes the buying question from "which platform is best?" to "whose roadmap do I want my revenue infrastructure on, and how expensive is my exit?" This is the decision matrix: who should pick what by situation (Stripe-native, Adyen enterprise, Salesforce RevOps, processor-neutral, subscription-first, AI per-token economics), why processor ownership matters (roadmap gravity, multi-processor leverage, renewal drift) and when it does not, plus the four questions to ask any billing vendor now - starting with continuous raw-event export, because if leaving requires re-instrumenting your product, you are not a customer, you are collateral.

Read →

Start metering AI usage in 5 minutes

Free tier. No credit card. Open-source storage engine.

Get started free →