Name: UsageBox
Author: UsageBox

This is a hands-on kata, not a think-piece. The goal: turn your usage meter from a number that feeds an invoice into a management instrument you can interrogate - cost sliced by customer, by model, by feature, by region - using nothing but the real metering API. By the end you will be able to answer "which feature is burning the most Opus tokens?" and "what is my margin on this account?" without a data warehouse, a nightly ETL, or a new table of your own.

The one idea that makes all of this work: you can only group later on what you recorded now. If you want per-feature cost in three weeks, you have to attach a feature dimension to every event today. So this kata starts at ingest, where the leverage is. If you have already worked through Kata #1 and have idempotent ingest running, you are ready.

Step 1: attach dimensions at ingest (the move that pays off later)

Every UsageEvent carries an optional dimensions object of up to 16 keys. These are free-form labels you stamp on the event - customer, feature, region, agent, plan tier, anything you might want to group or filter by later. They cost nothing at write time and they are the only thing that makes the rest of this kata possible. Attach them liberally:

curl -X POST https://api.usagebox.com/v1/usage/batch \
  -H "Authorization: Bearer $USAGEBOX_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "events": [{
      "event_id": "evt_2026-06-16_acct42_0042",
      "account_id": "acct_42",
      "meter_id": "claude_tokens",
      "model_id": "claude-opus-4-8",
      "unit": "tokens",
      "quantity": 18450,
      "timestamp": "2026-06-16T10:14:05Z",
      "dimensions": {
        "customer": "northwind",
        "feature": "doc_summary",
        "region": "us-east",
        "agent": "summarizer-v3"
      }
    }]
  }'

{ "accepted": 1, "duplicates": 0, "conflicts": 0, "rejected": 0 }

The rule of thumb: dimensions are cheap to add and impossible to backfill. An event with no feature label can never be grouped by feature, no matter how clever your later query is. So stamp the labels you might plausibly want before you need them.
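A minimal sketch of stamping dimensions in the same code path that emits the event, assuming a Python client. The helper name build_event and the client-side key check are illustrative and not part of the UsageBox API; only the field names and the 16-key limit come from the request above.

```python
from datetime import datetime, timezone

MAX_DIMENSION_KEYS = 16  # limit stated in this kata


def build_event(event_id, account_id, model_id, quantity, dimensions, timestamp=None):
    """Return a UsageEvent dict shaped like the batch request above.

    build_event is a hypothetical helper, not a UsageBox SDK call.
    """
    if len(dimensions) > MAX_DIMENSION_KEYS:
        raise ValueError(
            f"{len(dimensions)} dimension keys; the limit is {MAX_DIMENSION_KEYS}"
        )
    # Normalize to UTC so the trailing "Z" is always truthful.
    ts = (timestamp or datetime.now(timezone.utc)).astimezone(timezone.utc)
    return {
        "event_id": event_id,
        "account_id": account_id,
        "meter_id": "claude_tokens",
        "model_id": model_id,
        "unit": "tokens",
        "quantity": quantity,
        "timestamp": ts.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "dimensions": dict(dimensions),  # copy so later mutation cannot leak in
    }


event = build_event(
    "evt_2026-06-16_acct42_0042", "acct_42", "claude-opus-4-8", 18450,
    {"customer": "northwind", "feature": "doc_summary",
     "region": "us-east", "agent": "summarizer-v3"},
    timestamp=datetime(2026, 6, 16, 10, 14, 5, tzinfo=timezone.utc),
)
```

Failing fast on the key count is a design choice for this sketch: it surfaces a blown budget in your own code rather than relying on how the server reports it.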

Step 2: per-model breakdown

The first slice everyone wants: where is the model spend going? Group the account's usage by model_id and you see Opus, Sonnet, and Haiku side by side:

curl "https://api.usagebox.com/v1/accounts/acct_42/usage?from=2026-06-01T00:00:00Z&to=2026-07-01T00:00:00Z&group_by=model_id" \
  -H "Authorization: Bearer $USAGEBOX_KEY"

{
  "source": "rollup",
  "groups": [
    { "model_id": "claude-opus-4-8", "sum": 9120400, "count": 412 },
    { "model_id": "claude-sonnet-4-5", "sum": 7401200, "count": 503 },
    { "model_id": "claude-haiku-4-5", "sum": 1891300, "count": 128 }
  ]
}

Note "source": "rollup": completed hours are pre-aggregated, so this read is cheap and never contends with live ingest. The open hour falls back to raw automatically, so the breakdown is still correct, not just fast. Opus is about half the tokens (49.5%) on roughly two-fifths of the events (412 of 1,043), which makes the average Opus event about one and a half times the size of a Sonnet or Haiku event - that is the kind of asymmetry the meter exists to surface.
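To read the asymmetry off the response rather than eyeballing it, a short sketch like the following can turn the groups into shares. It assumes that count is the number of events in each group; the function name shares is illustrative.

```python
# Groups copied from the Step 2 response above.
groups = [
    {"model_id": "claude-opus-4-8", "sum": 9120400, "count": 412},
    {"model_id": "claude-sonnet-4-5", "sum": 7401200, "count": 503},
    {"model_id": "claude-haiku-4-5", "sum": 1891300, "count": 128},
]


def shares(groups, key):
    """Return {label: (token_share, event_share, tokens_per_event)}."""
    total_sum = sum(g["sum"] for g in groups)
    total_count = sum(g["count"] for g in groups)
    return {
        g[key]: (g["sum"] / total_sum, g["count"] / total_count, g["sum"] / g["count"])
        for g in groups
    }


for label, (tok, ev, avg) in shares(groups, "model_id").items():
    print(f"{label:20s} {tok:6.1%} of tokens  {ev:6.1%} of events  {avg:9.0f} tokens/event")
```

On these numbers Opus comes out at about 49.5% of tokens and 39.5% of events. The same helper works for any single-key response, since key is just the name of the grouped field.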

Step 3: per-feature breakdown

Now group by a dimension key you stamped in Step 1. Any dimension is a valid group_by target, so feature works exactly like model_id did:

curl "https://api.usagebox.com/v1/accounts/acct_42/usage?from=2026-06-01T00:00:00Z&to=2026-07-01T00:00:00Z&group_by=feature" \
  -H "Authorization: Bearer $USAGEBOX_KEY"

{
  "source": "rollup",
  "groups": [
    { "feature": "doc_summary", "sum": 11240000, "count": 604 },
    { "feature": "chat", "sum": 5180900, "count": 389 },
    { "feature": "extract", "sum": 1992000, "count": 50 }
  ]
}

Same call shape, different lens. doc_summary is about 61% of this account's tokens. If that feature is on a flat-rate plan, you have just found a margin leak - and you found it without building a feature-cost report, because the dimension was already on every event.
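Because only the group_by value changes between Steps 2 and 3, the URL can be built by one small helper. This is a sketch that assumes nothing beyond the endpoint and query parameters shown in the curl examples; usage_url is a hypothetical name.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.usagebox.com/v1"


def usage_url(account_id, start, end, group_by):
    """Build the grouped-usage URL used in Steps 2 and 3."""
    # safe=":" leaves the colons in the timestamps unescaped,
    # so the result matches the curl examples character for character.
    query = urlencode({"from": start, "to": end, "group_by": group_by}, safe=":")
    return f"{BASE_URL}/accounts/{account_id}/usage?{query}"


for key in ("model_id", "feature", "region"):
    print(usage_url("acct_42", "2026-06-01T00:00:00Z", "2026-07-01T00:00:00Z", key))
```

Pass the resulting URL to whatever HTTP client you already use, with the same Authorization header as in the curl calls.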

Step 4: multi-key cross-tabs with the JSON query API

Single-key grouping answers "which model?" and "which feature?" separately. The interesting questions are the cross-tabs: cost per feature per model. The JSON query API takes multiple group_by keys, optional filters to narrow the slice, and the metrics you want back (sum, count):

curl -X POST https://api.usagebox.com/v1/query/json \
  -H "Authorization: Bearer $USAGEBOX_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "account_id": "acct_42",
    "from": "2026-06-01T00:00:00Z",
    "to": "2026-07-01T00:00:00Z",
    "group_by": ["feature", "model_id"],
    "filters": { "region": "us-east" },
    "metrics": ["sum", "count"]
  }'

{
  "source": "rollup",
  "groups": [
    { "feature": "doc_summary", "model_id": "claude-opus-4-8",  "sum": 8800100, "count": 360 },
    { "feature": "doc_summary", "model_id": "claude-haiku-4-5", "sum":  640200, "count":  44 },
    { "feature": "chat",        "model_id": "claude-sonnet-4-5","sum": 4901000, "count": 301 },
    { "feature": "extract",     "model_id": "claude-opus-4-8",  "sum":  980500, "count":  22 }
  ]
}

The filters object narrowed the whole query to region: us-east before grouping. Now the picture sharpens: doc_summary is expensive specifically because it runs on Opus. Routing that one feature to Haiku where quality allows is a concrete, costed decision - not a hunch.
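The flat groups list is easier to reason about once it is pivoted into a feature-by-model table. The following sketch does that client-side on the response above; pivot is an illustrative helper, not something the API provides.

```python
from collections import defaultdict

# Groups copied from the Step 4 response above (region: us-east only).
groups = [
    {"feature": "doc_summary", "model_id": "claude-opus-4-8", "sum": 8800100, "count": 360},
    {"feature": "doc_summary", "model_id": "claude-haiku-4-5", "sum": 640200, "count": 44},
    {"feature": "chat", "model_id": "claude-sonnet-4-5", "sum": 4901000, "count": 301},
    {"feature": "extract", "model_id": "claude-opus-4-8", "sum": 980500, "count": 22},
]


def pivot(groups, row_key, col_key, metric="sum"):
    """Turn flat groups into {row: {col: value}}; absent pairs are simply missing."""
    table = defaultdict(dict)
    for g in groups:
        table[g[row_key]][g[col_key]] = g[metric]
    return dict(table)


table = pivot(groups, "feature", "model_id")
doc = table["doc_summary"]
opus_share = doc["claude-opus-4-8"] / sum(doc.values())
print(f"Opus share of doc_summary tokens: {opus_share:.1%}")
```

On these figures Opus carries about 93% of doc_summary's tokens in us-east, which is the number behind the routing decision.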

Step 5: ad-hoc SQL for the one-off slice

Sometimes you want an answer once and never again, and standing up a named query for it is overkill. The SQL endpoint runs a read-only query straight over the event store:

curl -X POST https://api.usagebox.com/v1/query/sql \
  -H "Authorization: Bearer $USAGEBOX_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "SELECT model_id, SUM(quantity) AS tokens FROM usage_events WHERE account_id='\''acct_42'\'' AND timestamp >= '\''2026-06-01'\'' AND timestamp < '\''2026-07-01'\'' GROUP BY model_id ORDER BY tokens DESC"
  }'

{
  "rows": [
    { "model_id": "claude-opus-4-8",  "tokens": 9120400 },
    { "model_id": "claude-sonnet-4-5", "tokens": 7401200 },
    { "model_id": "claude-haiku-4-5", "tokens": 1891300 }
  ]
}

Use this for the throwaway question - "did any account use the deprecated model after June 10?" - that does not deserve a permanent endpoint. The JSON query API is the better choice for anything you will run more than once, because it stays on the fast rollup path; reserve SQL for exploration.
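Building the request body in a script avoids the shell quoting in the curl example. The sketch below composes the "deprecated model" question; the table and column names are taken from the query above, while the model id is a placeholder and the allow-list check is an assumption made because this kata does not say whether the endpoint supports bind parameters.

```python
import json
import re

# Conservative allow-list for values interpolated into the SQL text.
SAFE_LITERAL = re.compile(r"[A-Za-z0-9._:-]+")


def deprecated_model_body(model_id, since):
    """Build the JSON body for the throwaway 'who still uses this model?' query."""
    for value in (model_id, since):
        if not SAFE_LITERAL.fullmatch(value):
            raise ValueError(f"unsafe SQL literal: {value!r}")
    query = (
        "SELECT account_id, SUM(quantity) AS tokens FROM usage_events "
        f"WHERE model_id='{model_id}' AND timestamp >= '{since}' "
        "GROUP BY account_id ORDER BY tokens DESC"
    )
    return json.dumps({"query": query})


# "example-deprecated-model" is a placeholder, not a real model id.
print(deprecated_model_body("example-deprecated-model", "2026-06-10"))
```

POST the returned string to /v1/query/sql with the same headers as the curl call.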

Step 6: from slice to decision

The slices are only worth the trouble if they change what you do. Two moves close the loop. First, find the most expensive thing - re-run Step 4 grouped by customer instead of feature, sort by sum, and the customer at the top is where your cost concentrates. Second, compute margin: for each customer, margin is what you charge them minus what their slice actually cost you. The meter gives you the cost side at the granularity of a single feature on a single model; you already know the price side. The gap is your real per-customer profitability, and it is frequently nothing like what the headline revenue suggests.
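The margin arithmetic can be sketched in a few lines. Every number below is invented for illustration: the per-token cost rates, the revenue figures, the customer "contoso", and the group rows are assumptions, shaped like a response grouped by customer and model_id.

```python
# All figures are made up; substitute your own vendor rates,
# invoice totals, and query results.
COST_PER_MILLION_TOKENS = {  # hypothetical cost in USD
    "claude-opus-4-8": 30.0,
    "claude-haiku-4-5": 2.0,
}
REVENUE = {"northwind": 400.0, "contoso": 150.0}  # hypothetical monthly charges, USD

groups = [  # shaped like a group_by ["customer", "model_id"] response
    {"customer": "northwind", "model_id": "claude-opus-4-8", "sum": 8_000_000},
    {"customer": "northwind", "model_id": "claude-haiku-4-5", "sum": 5_000_000},
    {"customer": "contoso", "model_id": "claude-opus-4-8", "sum": 6_000_000},
]


def margins(groups, rates, revenue):
    """Return [(customer, revenue, cost, margin)] sorted by margin, lowest first."""
    cost = {}
    for g in groups:
        usd = g["sum"] / 1_000_000 * rates[g["model_id"]]
        cost[g["customer"]] = cost.get(g["customer"], 0.0) + usd
    rows = [(c, revenue[c], cost[c], revenue[c] - cost[c]) for c in cost]
    return sorted(rows, key=lambda r: r[3])


for customer, rev, cost, margin in margins(groups, COST_PER_MILLION_TOKENS, REVENUE):
    print(f"{customer:10s} revenue={rev:8.2f} cost={cost:8.2f} margin={margin:8.2f}")
```

With these invented inputs the smaller account is the unprofitable one: it pays less but runs entirely on the expensive model. For real money, consider decimal.Decimal instead of floats.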

This is the whole point of treating the meter as more than a billing pipe, a theme covered in depth in the article on the meter as a management instrument: the same event store that produces invoices also tells you which customer is unprofitable, which feature to re-route to a cheaper model, and which region is quietly your biggest line item.

Production notes before you ship it

Stamp dimensions at the source. Add them in the same code path that emits the event, not in a later enrichment step. A dimension you forgot to attach is gone for that event forever - there is no backfill.
Sixteen keys is the budget. Spend it on things you will group or filter by: customer, feature, region, agent, plan tier. Do not burn slots on high-cardinality junk like request ids - the immutable event already has event_id for that.
Rollup vs raw. Grouped reads default to source=rollup for speed. Pass source=raw when you want a guaranteed full scan; expect it to be slower and to touch every event in the window.
JSON query for repeats, SQL for one-offs. The JSON query API stays on the rollup path and is the right tool for anything dashboarded. Ad-hoc SQL is for exploration you will not run twice.
Keep dimension values stable. If "northwind" becomes "Northwind Inc." mid-month, you get two groups. Normalize values before they hit ingest.
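One way to keep values stable is to normalize them in the emitting code path, before the event is built. The following is a sketch under assumptions: the alias table is hypothetical, and lowercasing everything is a policy choice you may not want for every key.

```python
import re

# Hypothetical alias table: map known variants to one canonical value.
ALIASES = {"northwind inc.": "northwind", "northwind-inc": "northwind"}


def normalize_value(value):
    """Lowercase, trim, collapse inner whitespace, then apply known aliases."""
    cleaned = re.sub(r"\s+", " ", value.strip().lower())
    return ALIASES.get(cleaned, cleaned)


def normalize_dimensions(dimensions):
    """Return a new dict with every dimension value normalized."""
    return {key: normalize_value(str(val)) for key, val in dimensions.items()}


print(normalize_dimensions({"customer": " Northwind  Inc. ", "region": "US-East"}))
```

Running the normalizer at the source means a rename upstream shows up as one alias entry rather than as a second group in every report.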

Kata variations to try

Margin leaderboard. Group by customer, join the result against your price list in a tiny script, and sort by margin. The bottom of that list is your churn-or-reprice shortlist.
Model-mix drift. Run the Step 2 breakdown for two consecutive months and watch the Opus share. A rising expensive-model fraction is a cost trend you want to catch early.
Region cost map. Group by region to see where compute concentrates - useful when you are deciding where to negotiate capacity or place inference.
Three-key cross-tab. Push group_by to ["customer", "feature", "model_id"] for the full picture of who is doing what on which model. The deepest slice is often the most actionable.
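As a starting point for the model-mix drift variation, the sketch below compares the Opus share of tokens across two months. The June groups are the Step 2 response; the July groups are invented for illustration.

```python
OPUS = "claude-opus-4-8"


def model_share(groups, model_id):
    """Fraction of total tokens attributed to one model."""
    total = sum(g["sum"] for g in groups)
    mine = sum(g["sum"] for g in groups if g["model_id"] == model_id)
    return mine / total if total else 0.0


june = [  # from the Step 2 response
    {"model_id": "claude-opus-4-8", "sum": 9120400},
    {"model_id": "claude-sonnet-4-5", "sum": 7401200},
    {"model_id": "claude-haiku-4-5", "sum": 1891300},
]
july = [  # hypothetical numbers for the following month
    {"model_id": "claude-opus-4-8", "sum": 12_000_000},
    {"model_id": "claude-sonnet-4-5", "sum": 6_000_000},
    {"model_id": "claude-haiku-4-5", "sum": 2_000_000},
]

drift = model_share(july, OPUS) - model_share(june, OPUS)
print(f"Opus share moved by {drift * 100:+.1f} percentage points")
```

Tracking the difference in share, rather than raw token counts, keeps the signal meaningful even when overall volume grows.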

Kata FAQ

Can I add a dimension after I have already been sending events? Yes for new events, but old events stay un-labeled. You can group by the new key going forward; the historical events without it simply will not appear in those groups. There is no retroactive tagging - this is why you attach dimensions liberally up front.

How many dimension keys can one event carry? Up to 16. That is plenty for customer, feature, region, agent, plan tier and a few more. Spend the budget on keys you will actually group or filter by.

What is the difference between a dimension and model_id or meter_id? Practically none when you group - any of them is a valid group_by key. model_id and meter_id are first-class fields with their own meaning; dimensions are the free-form labels you define. You can group by, and cross-tab across, all of them together.

Should I dashboard with the JSON query API or with SQL? JSON query. It accepts multiple group keys, filters and metrics, and it stays on the cheap rollup path so a dashboard refresh does not turn into a full scan. Keep SQL for the one-off question you will not ask twice.

What you just avoided building

In six steps you went from a flat usage total to per-model, per-feature, per-region and per-customer cost, multi-key cross-tabs, ad-hoc SQL exploration, and a margin-per-customer figure - all from the same event store that already produces your invoices. Built in-house, that is a dimensional schema, a star or wide-table design, an ingest path that stamps and indexes labels, a query layer that stays fast under high cardinality, and a rollup-vs-raw consistency story you have to maintain. That is an analytics warehouse bolted onto a billing system, two hard projects instead of one. Here it is the same meter, queried a different way.

Keep reading: Kata #1 - meter usage to an invoice line, Kata #2 - live spend caps on real-time usage, Kata #3 - reconcile a vendor bill against your meter, plus the meter as a management instrument and the usage-based billing guide.

Key Topics

•usagebox kata
•dimensions
•cost allocation
•per-customer cost
•per-model cost
•unit economics
•metering API
•analytics
•2026


UsageBox Kata #4: Per-Customer, Per-Model Cost with Dimensions