usageDb's Query Engine: Segment Pruning, a Strict SQL Subset, and Provenance

How usageDb reads usage data back: two query paths, a deliberately strict SQL subset that refuses ambiguous billing queries, segment pruning from per-segment metadata, half-open time ranges, and explain/verify provenance endpoints.

9 min read

usageDbdatabase internalsRustquery engineSQLsegment pruningusage-based billingprovenance

The short version: usageDb's query engine is intentionally narrow. It exposes two paths, a structured JSON query and a deliberately strict SQL subset, and it refuses queries that a general SQL database would happily run. SUM only accepts quantity, COUNT only accepts *, range operators distinguish < from <=, and OR / HAVING / aliases / SELECT * are rejected with an explicit error rather than silently mapped to something plausible. The executor prunes most segments before opening a single file, uses half-open [from, to) time ranges so adjacent months never double-count, and ships an explain endpoint that names exactly which segments produced each invoice line.

This is Part 7 of the usageDb internals series, a walk through the open-source Rust storage engine (github.com/pbudzik/usagedb) that backs UsageBox. Earlier parts built the durable ingest path, the columnar segments, and the hourly rollups. This part is about reading that data back. The guiding idea: a billing query engine should do less than a general SQL database on purpose, because the worst failure mode here is not a slow query, it is a query that silently returns the wrong number and that number becomes an invoice.

Series navigation: ← Part 6: Hourly rollups and the watermark · Part 8: Compaction

Two query paths

There are two ways to ask usageDb a question, and they compile down to the same internal QueryPlan (src/query/plan.rs). The first is a structured JSON body at POST /v1/query/json with explicit fields: source, account_id, from, to, group_by, filters, and metrics. No parsing ambiguity, easy to generate from code.

POST /v1/query/json
{
  "source": "usage_rollup_hourly",
  "account_id": "acc_8f1c",
  "from": "2026-05-01T00:00:00Z",
  "to":   "2026-06-01T00:00:00Z",
  "group_by": ["meter_id", "model_id"],
  "filters": { "product_id": ["ai_gateway"] },
  "metrics": { "quantity": "sum", "count": "count" }
}

The second path is POST /v1/query/sql, which takes a SQL string and runs it through the strict subset parser. It targets the same two logical tables, usage_events (raw) and usage_rollup_hourly (rollups), and supports a bounded grammar: a projection of group columns and aggregates, an AND-only WHERE, and a GROUP BY.

POST /v1/query/sql
{
  "query": "SELECT meter_id, SUM(quantity) FROM usage_events
            WHERE account_id = 'acc_8f1c'
              AND timestamp_ms >= 1746057600000
              AND timestamp_ms <  1748736000000
            GROUP BY meter_id"
}

Why the SQL parser is strict on purpose

The parser lives in src/query/sql.rs. An earlier permissive version did what a lot of homegrown SQL layers do: it mapped SUM(any_column) to SUM(quantity), treated < and <= as interchangeable, quietly dropped OR / HAVING / aliases, and tolerated AST nodes it did not actually implement. Every one of those shortcuts is wrong-answer territory when the answer is a bill. The current parser inverts the default: anything it does not fully understand is rejected with a precise message.

Concretely, the parser refuses:

  • SUM of anything but quantity. SUM(tokens) returns the error "SUM only supports the quantity column", instead of guessing what you meant.
  • COUNT of anything but *. Column-counting is not implemented, so COUNT(meter_id) errors rather than silently behaving like COUNT(*).
  • OR in WHERE. The executor only models a conjunction of equalities and ranges. A silently-mishandled OR could match nothing, so it is rejected explicitly.
  • SELECT *, column aliases (AS), HAVING, DISTINCT, JOIN, ORDER BY, LIMIT, CTEs, set operations. Each gets its own targeted error string.

The range handling is the subtle part. The parser distinguishes the four comparison operators and normalizes every bound into a half-open [from_ms, to_ms) window, converting inclusive bounds to exclusive by adjusting by one millisecond:

match op {
    BinaryOperator::Gt   => plan.from_ms = plan.from_ms.max(v.saturating_add(1)),
    BinaryOperator::GtEq => plan.from_ms = plan.from_ms.max(v),
    BinaryOperator::Lt   => plan.to_ms   = plan.to_ms.min(v),
    BinaryOperator::LtEq => plan.to_ms   = plan.to_ms.min(v.saturating_add(1)),
    _ => unreachable!(),
}

The philosophy is that an explicit error costs an engineer a few minutes to fix their query, while a wrong number can cost a customer an over-charge and the vendor its credibility. Strictness is a billing safety feature, not pedantry.

How the executor runs a plan

The executor (src/query/executor.rs) collects events from immutable raw segments plus a snapshot of the live memtable, applies the plan's filters, groups by the requested keys, and computes the metrics. Supported group keys are the column names (account_id, product_id, meter_id, model_id, source, unit), the time buckets hour_start_ms and day, or any dimension key from the event's variable-axis map. You can also filter or group by kind to isolate corrections and retractions for forensics. Supported metrics are sum (of quantity, accumulated in i128) and count.

Segment pruning before any file is opened

The expensive part of a query is reading segment files, so the executor decides which segments could possibly contribute before opening any of them. It does this from SegmentMeta in the manifest, which carries a min/max timestamp, a bucket label, account-ID min/max bounds, and per-segment sets of product_ids / meter_ids / model_ids. The pruning predicate segment_overlaps_plan combines all of them:

// time range (half-open): segment in range iff some ts could be in [from, to)
if !(s.min_timestamp_ms < to_ms && s.max_timestamp_ms >= from_ms) { return false; }

// bucket from account_id: blake3(account_id) % bucket_count
if let Some(account) = &plan.account_id {
    let target = bucket_for_account(&AccountId(account.clone()), bucket_count);
    if s.bucket != target { return false; }
    // account name within this segment's [min, max] range
    if account < &min.0 || account > &max.0 { return false; }
}
// product / meter / model filters via the segment's ID sets
if !filter_intersects(&plan.filters, "product_id", &s.product_ids, ..) { return false; }

The bucket field is the workhorse. Segments are partitioned by bucket = blake3(account_id) % bucket_count at flush time, so a single-account monthly query touches only the one bucket that account hashes into, plus the timestamp and account-name bounds prune most segments inside even that bucket. For a database holding thousands of accounts, this turns a full scan into a handful of file opens. The same bucket check is applied to rollup segments. Pruning is segment-level only today; there is no block-level metadata for finer skipping inside a segment yet.

Half-open time ranges, everywhere

Every query path uses half-open [from, to) semantics: the from bound is included, the to bound is excluded. This is enforced both in the pruning predicate and in the per-event check inside matches_plan:

if e.timestamp_ms < plan.from_ms || e.timestamp_ms >= plan.to_ms {
    return false;
}

This is a correctness detail worth dwelling on. Consider two adjacent monthly queries, May as [May 1, Jun 1) and June as [Jun 1, Jul 1). An event landing at exactly midnight on June 1 belongs to June and June only. If the convention were inclusive on both ends, that boundary event would be counted in both months and the customer would be billed for it twice. Half-open intervals make month boundaries tile perfectly with no gaps and no overlaps, and because the rule is applied identically across the JSON path, the SQL path, and the account-usage GET, there is no path where it can drift.

Rollup source vs raw source

The source field selects the read strategy, and it ties directly back to Part 6. With source = usage_rollup_hourly (the default for the account-usage GET) the executor reads pre-aggregated hourly rollup rows for the sealed part of the range, that is up to the rollup watermark, then falls back to a raw scan only for the open-period tail above the watermark. This is the fast path for monthly totals: most of the month is already summed into compact rollup rows. With source = usage_events (or ?source=raw on the GET) the executor forces a full raw scan over the time range. Both feed the same aggregator, so SUM(quantity) is identical regardless of source.

One documented caveat: COUNT means different things per source. Over rollups each rollup row counts as one, not as the number of underlying events it aggregates. For an exact event count you must query the raw source. The README calls this out as a known difference rather than papering over it.

Provenance: explain and verify

The two endpoints that make the engine auditable are GET /v1/accounts/{id}/explain and GET /v1/accounts/{id}/verify (both in src/api/http_server.rs). They exist because when a dashboard total and an invoice total disagree, you need to find out why fast.

explain returns the breakdown for an account over a range, grouped by (product_id, meter_id, model_id, source, unit) so each row is a distinct invoice line, plus the corrections and retractions in the range as raw rows, plus the segment provenance: the rollup watermark, the IDs of every rollup and raw segment that overlaps the range (already bucket-filtered to match the executor's pruning), and for each rollup segment the input_segment_ids of the raw segments that fed it. That last piece means you can name exactly which raw segments produced a given rollup line, which is what lets an invoice snapshot reference a watermark plus a concrete source-segment set rather than a number with no paper trail.

verify computes the same SUM(quantity) two ways, once through the rollup path and once through a pure raw scan, and reports both totals plus drift = raw_total - rollup_total and a matches boolean. On a fully sealed period (to <= watermark_ms) the invariant is zero drift. A non-zero result points at a real problem: a rollup bug, a late event that landed below the watermark, or a missing rollup segment, any of which the operator-facing rebuild_rollups can repair by rewinding the watermark and refilling from raw events. There is also an inspect-segment admin command and a verify-period CLI subcommand for the same checks off-server.

The point of doing less

None of this engine would impress a SQL benchmark. It does not join, it does not sort, it does not do arbitrary aggregates. That narrowness is the feature. Because the scope is small and the schema is fixed, usageDb can prune aggressively from cheap per-segment metadata, refuse ambiguous queries that could silently produce a wrong bill, and explain exactly where each number came from. For billing, an engine you can fully reason about beats a general one you cannot.


usageDb is the open-source Rust storage engine behind UsageBox. Browse the code at github.com/pbudzik/usagedb; the query engine described here lives under src/query/.

Key Topics

  • usageDb
  • database internals
  • Rust
  • query engine
  • SQL
  • segment pruning
  • usage-based billing
  • provenance

Related Articles

Explore more articles on similar topics to deepen your understanding of usage-based billing.

Why We Built usageDb: A Purpose-Built Rust Database for AI Usage and Billing

usageDb is an open-source Rust storage engine for AI usage metering and billing. Part 1 of a 10-part internals series: t...

8 min readRead more

Inside usageDb's Ingest Path: WAL, Memtable, and the Durability Contract

How usageDb turns an acknowledged usage event into a durable, billable fact: the three-phase ingest critical section, th...

9 min readRead more

Idempotent Metering in usageDb: Dedupe, Conflicts, and At-Least-Once Collectors

How usageDb guarantees each billable event is counted exactly once: stable event_ids, blake3 128-bit payload hashing, th...

9 min readRead more

Explore More Articles

Discover our complete collection of usage-based billing guides and implementation patterns.

View all articles