Built for AI products

Track AI usage.
Bill it accurately.
Without rebuilding your stack.

Name: UsageBox
Rating: 4.8 (50 reviews)
Author: UsageBox

UsageBox meters tokens, GPU minutes, agent runtime, and tool calls per customer. Open-source storage engine. Connects to Stripe or your own invoicing.

Start free View docs See the source on GitHub →

Idempotent

Ingestion

Immutable

Audit trail

Open source

Storage engine

POST /v1/events
curl -X POST https://api.usagebox.com/v1/events \
  -H "Authorization: Bearer $UBX_KEY" \
  -d '{
    "event_id": "req_8x42jk",
    "account_id": "acme-co",
    "meter": "llm_tokens_in",
    "model": "claude-4.5-sonnet",
    "quantity": 12450,
    "timestamp": "2026-05-16T18:42:11Z"
  }'
 
# Idempotent. Retries are safe.
# Rolls up hourly. Invoiced via Stripe.

Why generic billing breaks on AI workloads

Stripe Billing, Chargebee, Recurly: built for SaaS subscriptions in the 2010s. AI usage is a different shape of problem.

Volume breaks generic billing

Stripe's metered usage assumes thousands of events per customer per month. AI products send millions per day. Generic tools throttle, fail silently, or charge you per-event.

Attribution needs a graph

An agent run is N tool calls + M LLM calls + K memory ops. Each costs different amounts. Generic billing tools can't roll those into a single billable unit without engineering work you keep redoing.

Attacks show up in the bill

A user can manipulate prompts to trigger expensive generations. Without per-user spend ceilings and real-time anomaly detection, the first sign of an attack is your AWS or OpenAI invoice next month.

Built for the way AI products actually meter

Six primitives. Each one designed for the AI billing patterns generic tools fight you on.

Token Metering

Meter LLM input + output tokens per request, per agent, per tenant. Idempotent ingestion handles retries without double-billing.

GPU Minute Pricing

Bill inference time, fine-tuning runtime, or per-job GPU usage. Catalog-driven pricing rules; no code changes to update rates.

Per-Agent Cost Attribution

Track cost for each agent run across N tool calls + M LLM calls + memory ops. Roll up to tenant invoices automatically.

Real-Time Anomaly Detection

Cost-amplification attacks land in your usage data first. Per-user spend ceilings, alerting on token surges, kill-switches.

Hourly Rollups

Raw events feed hourly aggregates. Invoice generation is O(1) per account, not a scan over millions of rows.

Stripe + Manual Invoicing

Plug into Stripe for self-serve. Or generate finance-ready invoices for enterprise contracts. Same metering pipeline.

The storage engine is open source

usagedb is the Rust storage engine UsageBox runs on. Append-only, idempotent, immutable raw event audit trail, hourly rollups for invoice queries. Apache 2.0 on GitHub.

Read the code that produces every invoice line. Fork it. Self-host the ingestion layer while still using UsageBox for the platform side. The right answer to “is your billing math correct” is “read the code yourself.”

pbudzik/usagedb

Or read the architecture overview in our usagedb article, then go deep with the 10-part engine internals series: ingest, dedupe, columnar segments, rollups, the query engine, and how it is tested.

Notes on AI billing

Practical writing on metering patterns, AI cost attribution, and what we learn from production billing systems.

Browse all articles →

July 30, 2026 · 7 min read

Free AI Credits Can Quietly Switch On Metered Billing

On a subscription plan the rate limit is a spend cap you never configured: hit it and you stop. Usage credits remove that wall and replace it with a meter, and it takes one toggle. Claude Pro users reported a $100 promotional credit leaving usage credits enabled with no limit, after which overage applied to every model past their plan allowance. The report is contested, which is exactly why you should read your usage settings instead of assuming.

Read →

July 24, 2026 · 7 min read

The AI-Wrapper Margin: How to Find Out What That $29/Month Tool Actually Pays Per Token (2026)

A screenshot recipe went around this month for reverse-engineering what a $29/month AI tool actually pays: find the model it runs, open the provider's per-token page, do the math. For most single-purpose wrappers the raw inference cost of a typical user is cents to low single-digit dollars - which is why $29-$99 tiers exist. The five-minute estimate, the two honest caveats (wrappers get batch/caching/volume discounts you do not, and light users subsidize the whales), what it means if you buy AI tools (heavy use is where the flat price stops being a deal), and what it means if you build them (a flat price is a leveraged bet on your usage distribution that inverts silently - meter per-user cost or run it blind).

Read →

July 24, 2026 · 7 min read

Retry Storms: How One Bad Hour Rebills More LLM Tokens Than a Month of Servers (2026)

Why did one day of AI cost more than a month of servers? A retry storm. When a model call times out or 5xx's, retry logic fires again, and each attempt that reaches the provider is a fresh, fully-billed generation - even the ones your app throws away. A slow provider hour plus aggressive retries plus concurrency multiplies spend by the retry count, with no fixed ceiling the way a server bill has. The five controls that cap it (hard retry limit, exponential backoff with full jitter, circuit breaker, retry only retry-safe statuses, idempotency keys), why 429s are a trap, and why the meter has to count attempts - not just successes - so the storm shows up the same day instead of on the invoice.

Read →

Start metering AI usage in 5 minutes

Free tier. No credit card. Open-source storage engine.

Get started free →

Track AI usage. Bill it accurately.Without rebuilding your stack.

Why generic billing breaks on AI workloads

Volume breaks generic billing

Attribution needs a graph

Attacks show up in the bill

Built for the way AI products actually meter

Token Metering

GPU Minute Pricing

Per-Agent Cost Attribution

Real-Time Anomaly Detection

Hourly Rollups

Stripe + Manual Invoicing

The storage engine is open source

Notes on AI billing

Free AI Credits Can Quietly Switch On Metered Billing

The AI-Wrapper Margin: How to Find Out What That $29/Month Tool Actually Pays Per Token (2026)

Retry Storms: How One Bad Hour Rebills More LLM Tokens Than a Month of Servers (2026)

Start metering AI usage in 5 minutes

Track AI usage.
Bill it accurately.
Without rebuilding your stack.