The short version: usageDb is a small, embedded, append-only usage database written in Rust for AI billing workloads (token, credit, and tool-call metering). It exists because four billing invariants (idempotency, immutability, cheap account-month totals, and recoverable writes) are awkward and expensive to enforce on a general SQL database or a time-series engine. This article is the hub for a ten-part series that walks the engine module by module, from the ingest path down to the property tests.
If you have ever built metering for a usage-based product, you know the failure modes are not the ones a generic database is designed to prevent. A collector retries a batch and you bill twice. A row gets silently updated and an invoice line stops matching its audit trail. A late correction lands after a period closed and the number on the PDF drifts. usageDb is the open-source Rust storage engine behind UsageBox, and it is built around making those failure modes structurally impossible rather than operationally discouraged. For the bird's-eye picture, start with the usageDb overview; this piece goes one level deeper into why the engine is shaped the way it is.
usageDb is a documented MVP scaffold. The ingest path is durable end to end, the query path is functional, and a handful of spec items are still stubbed. Where something is planned rather than shipped, this series says so. Everything else is grounded in the source you can read on GitHub.
Four billing invariants a generic database fights you on
The invariants below are not nice-to-haves. They are the difference between a billing system you can defend in front of a customer and one you cannot. Each one is cheap when the engine is built for it and expensive when it is bolted on.
1. Idempotency
Every event carries a stable event_id. A retry with the same id and the same payload is a duplicate and is silently absorbed. The same id with a different payload is a conflict, surfaced in the ingest response because it almost always means a buggy collector. In a general SQL database you reach for a unique constraint and an ON CONFLICT clause, but that only catches the key collision; it does not distinguish "harmless retry" from "your upstream just sent me contradictory data for the same id". usageDb classifies both, and the dedupe check happens against a hot in-memory cache before anything touches durable state. The cache uses a 128-bit blake3 hash with a 7-day TTL, and crucially it is rebuilt on restart by replaying the WAL and scanning recent raw segments, so retries that span a process restart are still caught. Part 3 covers the dedupe machinery.
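The duplicate-versus-conflict distinction can be sketched in a few lines. This is a minimal illustration, not the engine's code: the DedupeCache type, its classify method, and the Classification enum are hypothetical names, and the fingerprint uses the standard library's DefaultHasher as a stand-in for the blake3 hash the engine uses, only to keep the sketch dependency-free.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Outcome of checking one event against the hot dedupe cache.
#[derive(Debug, PartialEq, Eq)]
pub enum Classification {
    /// First time this event_id is seen inside the TTL window.
    New,
    /// Same event_id, same payload: a harmless retry.
    Duplicate,
    /// Same event_id, different payload: almost certainly a collector bug.
    Conflict,
}

/// Hypothetical cache: event_id -> (payload fingerprint, first-seen time in ms).
pub struct DedupeCache {
    ttl_ms: i64,
    seen: HashMap<String, (u64, i64)>,
}

impl DedupeCache {
    pub fn new(ttl_ms: i64) -> Self {
        Self { ttl_ms, seen: HashMap::new() }
    }

    /// Stand-in fingerprint (assumption): the real engine hashes with blake3.
    fn fingerprint(payload: &[u8]) -> u64 {
        let mut hasher = DefaultHasher::new();
        payload.hash(&mut hasher);
        hasher.finish()
    }

    /// Classifies an event and records it when it is new or its entry has expired.
    pub fn classify(&mut self, event_id: &str, payload: &[u8], now_ms: i64) -> Classification {
        let fp = Self::fingerprint(payload);
        match self.seen.get(event_id).copied() {
            Some((prev_fp, seen_at)) if now_ms - seen_at <= self.ttl_ms => {
                if prev_fp == fp {
                    Classification::Duplicate
                } else {
                    Classification::Conflict
                }
            }
            _ => {
                self.seen.insert(event_id.to_string(), (fp, now_ms));
                Classification::New
            }
        }
    }
}
```

A unique constraint could only ever report the last two cases as one; keeping the payload fingerprint next to the id is what lets the ingest response tell them apart.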
2. Immutability
Raw segments are written once and never modified. This is the audit trail that backs every invoice line: when a customer asks "where did this 4.2 million tokens come from", you can scan the exact events that produced it, byte-identical to when they were ingested. A general OLTP database treats rows as mutable by default, so preserving an immutable history means triggers, audit tables, or event sourcing layered on top. Here, immutability is the storage model itself. Corrections and retractions are not UPDATEs; they are new events of kind Correction or Retraction that net against the original. The columnar on-disk format that makes those immutable segments compact is the subject of Part 4.
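To make "net against the original" concrete, here is a small sketch under a stated assumption: corrections and retractions carry signed quantities (a correction is a delta, a retraction is the negated original). The source does not spell out the exact netting rule, so treat the Event struct and net_for function below as hypothetical illustrations rather than the engine's API.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EventKind {
    Usage,
    Correction,
    Retraction,
}

/// Trimmed-down event with only the fields needed for netting.
#[derive(Debug, Clone)]
pub struct Event {
    pub event_id: String,
    pub kind: EventKind,
    /// Points a correction or retraction back at the event it adjusts.
    pub correction_ref: Option<String>,
    /// Signed quantity (assumption): adjustments are appended as deltas.
    pub quantity: i128,
}

/// Convenience constructor for the example.
pub fn event(id: &str, kind: EventKind, correction_ref: Option<&str>, quantity: i128) -> Event {
    Event {
        event_id: id.to_string(),
        kind,
        correction_ref: correction_ref.map(str::to_string),
        quantity,
    }
}

/// Net billable quantity for one original event: the original itself plus
/// every later event that references it. Nothing in the log is modified.
pub fn net_for(log: &[Event], original_id: &str) -> i128 {
    log.iter()
        .filter(|e| {
            e.event_id == original_id || e.correction_ref.as_deref() == Some(original_id)
        })
        .map(|e| e.quantity)
        .sum()
}
```

The original 4.2 million token event stays in the log byte for byte; a correction of -200,000 is simply a second row, and the invoice line is the sum of both.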
3. Cheap account-month totals
The single most common query in billing is "total usage for this account this month", and it has to be fast even when the account has emitted tens of millions of events. usageDb maintains hourly rollups as the fast path and keeps raw scans as the correctness fallback. The account-usage endpoint defaults to source=rollup for snappy monthly totals and accepts source=raw to force a full scan when you need to verify the rollup. A time-series database gives you fast aggregates too, but it typically does so by discarding or downsampling raw points, which is exactly what you cannot do when each point is a billable fact. usageDb keeps both: the rollup for speed, the raw events for truth. Part 6 explains the rollup builder and the watermark that keeps the two consistent.
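The rollup-versus-raw relationship can be sketched as follows. This is an illustrative model only: RawEvent, Rollup, build_rollup, and the two total functions are hypothetical names, and the real rollup builder (Part 6) also tracks a watermark that this sketch omits.

```rust
use std::collections::BTreeMap;

pub const HOUR_MS: i64 = 3_600_000;

/// Trimmed-down raw event for the example.
pub struct RawEvent {
    pub account_id: String,
    pub timestamp_ms: i64,
    pub quantity: i128,
}

/// (account, start of hour in ms) -> summed quantity.
pub type Rollup = BTreeMap<(String, i64), i128>;

/// Floors a millisecond timestamp to the start of its hour.
pub fn hour_bucket(timestamp_ms: i64) -> i64 {
    timestamp_ms.div_euclid(HOUR_MS) * HOUR_MS
}

/// Fast-path structure: one summed row per account per hour.
pub fn build_rollup(events: &[RawEvent]) -> Rollup {
    let mut rollup = Rollup::new();
    for e in events {
        let key = (e.account_id.clone(), hour_bucket(e.timestamp_ms));
        *rollup.entry(key).or_insert(0) += e.quantity;
    }
    rollup
}

/// Analogue of source=rollup: sums hourly rows in [start_ms, end_ms).
/// Matches the raw scan when the window is hour-aligned, as a month is in UTC.
pub fn total_from_rollup(rollup: &Rollup, account: &str, start_ms: i64, end_ms: i64) -> i128 {
    rollup
        .iter()
        .filter(|(key, _)| key.0 == account && key.1 >= start_ms && key.1 < end_ms)
        .map(|(_, qty)| *qty)
        .sum()
}

/// Analogue of source=raw: the correctness fallback that scans every event.
pub fn total_from_raw(events: &[RawEvent], account: &str, start_ms: i64, end_ms: i64) -> i128 {
    events
        .iter()
        .filter(|e| {
            e.account_id == account && e.timestamp_ms >= start_ms && e.timestamp_ms < end_ms
        })
        .map(|e| e.quantity)
        .sum()
}
```

The rollup query touches at most one row per hour per account, while the raw query touches every event; comparing the two results is the verification that source=raw exists for.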
4. Recoverable writes
Every acknowledged event is durable in the write-ahead log or in a committed segment. There is no window where the engine has told a client "accepted" but could lose the event on a crash. The default durability mode flushes and fsyncs the WAL before acking. The contract is explicit and tested, which is the whole point: billing cannot tolerate "probably durable". Part 2 dissects the ingest path and the durability contract in detail.
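The ordering that the durability contract depends on (write, fsync, then acknowledge) looks roughly like this with the standard library alone. The Wal type, the length-prefixed record framing, and append_batch are assumptions made for the sketch; the engine's actual WAL format is covered in Part 2.

```rust
use std::fs::{File, OpenOptions};
use std::io::{self, Write};
use std::path::Path;

/// Hypothetical single-file WAL, append-only.
pub struct Wal {
    file: File,
}

impl Wal {
    pub fn open(path: &Path) -> io::Result<Self> {
        let file = OpenOptions::new().create(true).append(true).open(path)?;
        Ok(Self { file })
    }

    /// Appends every record in the batch, then fsyncs once.
    /// Returning Ok is the "ack": by then the bytes have been handed to
    /// stable storage, so the caller may tell the client "accepted".
    pub fn append_batch(&mut self, records: &[&[u8]]) -> io::Result<usize> {
        for record in records {
            // Length prefix (illustrative framing, not the real on-disk format).
            let len = record.len() as u32;
            self.file.write_all(&len.to_le_bytes())?;
            self.file.write_all(record)?;
        }
        // sync_all flushes file data and metadata to disk before we return.
        self.file.sync_all()?;
        Ok(records.len())
    }
}
```

If any write or the sync fails, the error propagates and no acknowledgement is produced, which is the property the paragraph above describes: there is no code path that acks first and persists later.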
The end-to-end architecture
The write path is a straight line with a few well-defined commit points. Read it top to bottom:
    clients
       |
       v
    HTTP ingest -> hot dedupe -> WAL (numbered files, fsynced)
                                  |
                                  v
                              memtable
                                  |
                 (size threshold) v
                          raw segment writer
                                  |
                                  v
                     manifest.json (atomic rename)
                                  |
                     WAL files <= sealed_id deleted

    queries: scan raw segments (timestamp-pruned via SegmentMeta)
             + memtable snapshot, filter / group / aggregate.
An incoming batch hits the HTTP ingest endpoint, is validated and classified against the hot dedupe cache, then appended to a numbered WAL file that is fsynced before the batch is acknowledged. Accepted events land in an in-memory memtable. When the memtable crosses its size threshold, a background flusher seals the active WAL file, writes the buffered events out as an immutable columnar raw segment, and records that segment in the manifest via an atomic temp-plus-rename commit. Once the manifest names the new segment, every WAL file at or below the sealed id is deleted, because its contents now live durably in the segment.
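The temp-plus-rename commit and the WAL cleanup rule can be sketched with the standard library. The helper names save_atomically and wal_files_to_delete are hypothetical, and the directory fsync is a common practice on Unix that this sketch assumes rather than something the source confirms the engine does.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;

/// Writes `contents` to `dir/name` so that readers see either the old file
/// or the complete new one, never a half-written mix.
pub fn save_atomically(dir: &Path, name: &str, contents: &[u8]) -> io::Result<()> {
    let tmp = dir.join(format!("{name}.tmp"));
    let dst = dir.join(name);
    {
        // 1. Write the full new contents to a temporary file and fsync it.
        let mut f = File::create(&tmp)?;
        f.write_all(contents)?;
        f.sync_all()?;
    }
    // 2. Rename over the destination; on POSIX filesystems this replaces
    //    the target atomically.
    fs::rename(&tmp, &dst)?;
    // 3. Assumption: also fsync the directory so the rename itself survives
    //    a crash. Skipped on non-Unix targets, where opening a directory as
    //    a File generally fails.
    if cfg!(unix) {
        File::open(dir)?.sync_all()?;
    }
    Ok(())
}

/// Once the manifest names the new segment, every WAL file whose id is at
/// or below the sealed id is redundant and can be deleted.
pub fn wal_files_to_delete(wal_ids: &[u64], sealed_id: u64) -> Vec<u64> {
    wal_ids.iter().copied().filter(|id| *id <= sealed_id).collect()
}
```

The order matters: deleting WAL files before the rename lands would open exactly the loss window the durability contract rules out, so cleanup only runs after the commit returns successfully.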
Reads are deliberately simple. A query scans the relevant raw segments plus a snapshot of the memtable, then filters, groups, and aggregates. Segments are pruned before their files are even opened, using the timestamp range, the account-id range, and the per-segment sets of product, meter, and model ids carried in SegmentMeta. The manifest is the source of truth for which segments exist. The engine commits it through a copy-on-write helper, commit_manifest, which clones the current manifest, mutates the clone, saves it to disk, and only then publishes it in memory, so a failed save never leaves in-memory state ahead of disk. The crash-recovery story around that manifest is Part 5.
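The clone, mutate, save, publish sequence is easy to model. The name commit_manifest comes from the engine, but everything else here is an assumption for illustration: the Manifest fields, the ManifestStore wrapper, and the signature that takes the mutation and the save step as closures so a failing disk can be simulated.

```rust
use std::sync::{Arc, Mutex};

/// Trimmed-down manifest: just the list of committed segment names.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct Manifest {
    pub segments: Vec<String>,
}

/// Hypothetical holder for the published, in-memory manifest.
pub struct ManifestStore {
    current: Mutex<Arc<Manifest>>,
}

impl ManifestStore {
    pub fn new(initial: Manifest) -> Self {
        Self { current: Mutex::new(Arc::new(initial)) }
    }

    /// Cheap read-only view for queries; never blocks on disk I/O.
    pub fn snapshot(&self) -> Arc<Manifest> {
        let guard = self.current.lock().unwrap();
        Arc::clone(&*guard)
    }

    /// Copy-on-write commit: clone, mutate the clone, save, then publish.
    /// If `save` fails, the published manifest is left untouched.
    pub fn commit_manifest<M, S>(&self, mutate: M, save: S) -> Result<(), String>
    where
        M: FnOnce(&mut Manifest),
        S: FnOnce(&Manifest) -> Result<(), String>,
    {
        let mut guard = self.current.lock().unwrap();
        let mut next = (**guard).clone(); // 1. clone the current manifest
        mutate(&mut next);                // 2. mutate only the clone
        save(&next)?;                     // 3. persist; bail out on failure
        *guard = Arc::new(next);          // 4. publish in memory last
        Ok(())
    }
}
```

Because step 4 is unreachable when step 3 returns an error, the in-memory view can lag behind disk for a moment but can never run ahead of it, which is the direction that is safe after a crash.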
The module map
The codebase is split by responsibility, and each module gets its own part later in this series. Use this as a navigation table.
| Module | Role | Covered in |
|---|---|---|
| src/api | Axum HTTP server, request/response types | Parts 2 and 7 |
| src/ingest | WAL, memtable, hot dedupe, flusher worker | Parts 2 and 3 |
| src/storage | Manifest, segment writer/reader, encoding helpers | Parts 4 and 5 |
| src/query | SQL subset parser, plan, executor | Part 7 |
| src/rollup | Hourly rollup builder and background scheduler | Part 6 |
| src/compact | Compaction planner, worker, background scheduler | Part 8 |
| src/runtime | Config, app state, startup recovery | Parts 2 and 5 |
| src/model | Event schema, IDs, dimensions | This part |
Two modules sit slightly outside that table but matter for the later parts: src/period drives the open-to-closed period lifecycle and frozen snapshots (Part 9), and the tests/ directory holds the property tests and the deterministic simulation harness (Part 10). The runtime wires everything together in src/main.rs: it acquires an exclusive process lock on the data directory, runs startup recovery, opens the WAL, and spawns the flusher, rollup, and compaction workers around a shared AppState.
The UsageEvent data model
Everything in the engine is an append of one type: UsageEvent, defined in src/model/event.rs. It is worth reading the struct, because its field grouping is the entire conceptual model.
    pub enum EventKind { Usage, Correction, Retraction }

    pub struct UsageEvent {
        // identity
        pub event_id: EventId,
        pub kind: EventKind,
        pub correction_ref: Option<CorrectionRef>,
        // subject
        pub account_id: AccountId,
        pub subscription_id: Option<SubscriptionId>,
        // categorization
        pub product_id: ProductId,
        pub meter_id: MeterId,
        pub model_id: Option<ModelId>,
        pub source: SourceId,
        // measurement
        pub timestamp_ms: i64,
        pub quantity: i128,
        pub unit: Unit,
        // variable axis
        pub dimensions: SmallDimensions,
        // provenance
        pub ingested_at_ms: i64,
    }
Read it as six groups. Identity is event_id, the kind (a plain Usage, or a Correction or Retraction), and an optional correction_ref that points a correction back at the original event it adjusts. Subject is the account_id being billed plus an optional subscription_id. Categorization is how the line is rolled up: product_id, meter_id, an optional model_id for per-model accounting, and a source tag. Measurement is the actual fact: a millisecond timestamp_ms, a quantity typed as i128 so large token counts never overflow and corrections can go negative, and a free-form unit string.
The variable axis is dimensions, a SmallDimensions newtype wrapping a BTreeMap<String, String>. The BTreeMap choice is not incidental: an ordered map gives a deterministic key order, which is what lets the engine canonicalize dimensions for stable hashing during dedupe. Finally ingested_at_ms records provenance, the server-side time the event was accepted, separate from the event's own timestamp_ms. Note that the IDs are all newtype wrappers (EventId(pub String) and friends in src/model/ids.rs) rather than bare strings, so the type system stops you from passing an account_id where a meter_id belongs. The same file holds bucket_for_account, which hashes an account into one of a fixed number of buckets with blake3 so the assignment is stable across Rust versions and builds; that bucketing is how segments are physically partitioned for cheap per-account pruning.
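The two ideas in this paragraph, canonical dimension bytes and stable account bucketing, can be sketched together. The type names mirror the article, but the bodies are assumptions: the length-prefixed encoding is one reasonable canonical form, BUCKETS = 16 is an invented value, and FNV-1a stands in for blake3 purely so the example needs no external crate.

```rust
use std::collections::BTreeMap;

/// Newtype IDs: the compiler rejects an AccountId where a MeterId is expected.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct AccountId(pub String);

#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct MeterId(pub String);

#[derive(Debug, Clone, Default, PartialEq, Eq)]
pub struct SmallDimensions(pub BTreeMap<String, String>);

impl SmallDimensions {
    /// Length-prefixed keys and values in BTreeMap (sorted key) order, so the
    /// bytes depend only on the contents, never on insertion order.
    pub fn canonical_bytes(&self) -> Vec<u8> {
        let mut out = Vec::new();
        for (key, value) in &self.0 {
            for part in [key, value] {
                out.extend_from_slice(&(part.len() as u32).to_le_bytes());
                out.extend_from_slice(part.as_bytes());
            }
        }
        out
    }
}

/// FNV-1a (64-bit). Stand-in for blake3 (assumption): like blake3, it is a
/// fixed algorithm whose output does not change across Rust versions,
/// unlike the standard library's default hasher, which makes no such promise.
fn fnv1a64(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325;
    for b in bytes {
        hash ^= u64::from(*b);
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3);
    }
    hash
}

/// Hypothetical bucket count; the real value lives in the engine's source.
pub const BUCKETS: u64 = 16;

/// Maps an account to a fixed bucket, the same one on every build and run.
pub fn bucket_for_account(account: &AccountId) -> u64 {
    fnv1a64(account.0.as_bytes()) % BUCKETS
}
```

The length prefixes are what keep the encoding unambiguous: without them, the pairs ("ab", "c") and ("a", "bc") would serialize to the same bytes and hash identically.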
Where this series goes next
The rest of the series follows the data. Part 2 takes an ingest batch from the HTTP boundary through the WAL and the durability contract. Part 3 is the dedupe path. Parts 4 and 5 cover how events become immutable columnar segments and how the manifest commits and recovers them. From there it is the read and maintenance side: rollups, the query engine, compaction, the period lifecycle with frozen snapshots, and finally the property and simulation testing that checks the invariants hold under random crash schedules.
usageDb internals: the full series
- Why a purpose-built usage database
- The ingest path and durability contract
- Idempotency and deduplication
- The columnar segment format
- The manifest and crash recovery
- Hourly rollups and the watermark
- The query engine
- Compaction
- Period lifecycle and frozen snapshots
- Property tests and simulation testing
usageDb is open source and developed alongside UsageBox. Read the code at github.com/pbudzik/usagedb and follow the rest of this series to see how each invariant is enforced in practice.