How much does Claude Fable 5 cost?

Claude Fable 5 costs $10 per million input tokens and $50 per million output tokens on the Claude API, double Claude Opus 4.8 ($5/$25). Prompt cache hits cost $1/MTok (90% off input), the Batch API halves prices to $5/$25, and the 1M-token context window is billed at standard per-token rates with no long-context premium. Claude Mythos 5, the limited-availability sibling, carries identical pricing.

What does a full 1M-context request actually cost on Fable 5?

About $10 in input tokens alone (1M tokens at $10/MTok) before any output. With prompt caching, the picture changes dramatically: cache the corpus once (~$12.50/MTok write) and subsequent requests read it at $1/MTok, cutting a ten-turn session over the same context from $100+ to roughly $21.50.

Is Fable 5 cheaper than Opus 4.8 Fast Mode?

They now cost the same: $10/$50 per million tokens. Anthropic cut Opus 4.8 Fast Mode 3x from the $30/$150 it cost on Opus 4.6/4.7. The choice at that price point is capability ceiling (Fable 5) versus output speed (Fast Mode); standard Opus 4.8 at $5/$25 remains the better default when neither extreme is required.

What is the 35% tokenizer issue?

Models on Anthropic's new tokenizer (Opus 4.7 and later, including Fable 5) may use up to 35% more tokens to represent the same text than older models. That inflates effective per-document costs beyond what list-price comparisons show, which is why measured cost-per-task on your own traffic is the only reliable basis for migration decisions.

Should I switch my default model to Fable 5?

No. Route to it per-task. The durable pattern is a routing ladder: Haiku 4.5 ($1/$5) for volume work, Sonnet 4.6 ($3/$15) for standard production, Opus 4.8 ($5/$25) for heavy reasoning, and Fable 5 ($10/$50) only where the frontier ceiling or the 1M window genuinely pays for itself, with prompt caching and batch processing applied wherever the workload shape allows.

Claude Fable 5 Pricing: The Real Cost of 1M Context (and the 35% Tokenizer Tax)

Author: UsageBox

TL;DR (June 2026): Claude Fable 5 launched at $10 input / $50 output per million tokens, double Opus 4.8's $5/$25, with a 1M-token context window billed at standard rates and 128K max output. The three numbers that actually decide your bill: a full 1M-context request costs $10 in input alone before a single output token; cache hits cost $1/MTok (90% off), which makes prompt caching the difference between viable and ruinous at long context; and the Opus 4.7+ tokenizer can use up to 35% more tokens for the same text, a quiet multiplier most comparisons ignore. Meanwhile Opus 4.8's Fast Mode just dropped 3x to $10/$50, exactly Fable 5's standard price, which makes the model choice genuinely interesting.

Anthropic shipped Claude Fable 5 this week alongside a limited-availability research sibling, Claude Mythos 5, and the launch thread on r/ClaudeAI cleared four thousand upvotes in a day. The second-most-upvoted reaction was not about capability. It was titled "Claude dropped Fable 5 and the API pricing genuinely shocked me," and a subscriber thread reported "burning 2% a minute" of their plan allowance running it. Both reactions are rational, and both are incomplete. Here is the full pricing picture, verified against Anthropic's published rate card, and the math that decides when Fable 5 is worth double Opus.

The rate card, verified

Model	Input /MTok	Output /MTok	Cache hit /MTok	Batch in / out /MTok	Notes
Claude Fable 5	$10	$50	$1	$5 / $25	1M context, 128K max output
Claude Mythos 5	$10	$50	$1	$5 / $25	Limited availability
Claude Opus 4.8	$5	$25	$0.50	$2.50 / $12.50	1M context
Opus 4.8 Fast Mode	$10	$50	multipliers stack	n/a	Cut from $30/$150 on 4.6/4.7
Claude Sonnet 4.6	$3	$15	$0.30	$1.50 / $7.50	1M context
Claude Haiku 4.5	$1	$5	$0.10	$0.50 / $2.50	The workhorse tier

Three structural facts sit behind the table. First, long context carries no premium: Anthropic bills a 900K-token request at the same per-token rate as a 9K one, so the 1M window is a capability, not a surcharge. Second, the Batch API halves everything for asynchronous work, putting batched Fable 5 at Opus 4.8's interactive price. Third, the cache columns matter more at this tier than any before it, which deserves its own section.
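For readers who want to check the table's arithmetic, here is a minimal Python sketch of the list-price math. The model keys are our own shorthand labels, not API model identifiers, and the function deliberately ignores caching; treat it as an illustration of the rate card above, not a billing implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rate:
    input_per_mtok: float   # USD per million input tokens
    output_per_mtok: float  # USD per million output tokens


# Standard (interactive) list prices copied from the table above.
# The keys are hypothetical labels chosen for this sketch.
RATES = {
    "fable-5": Rate(10.0, 50.0),
    "opus-4.8": Rate(5.0, 25.0),
    "sonnet-4.6": Rate(3.0, 15.0),
    "haiku-4.5": Rate(1.0, 5.0),
}

BATCH_DISCOUNT = 0.5  # the Batch API halves both input and output prices


def request_cost(model: str, input_tokens: int, output_tokens: int,
                 batch: bool = False) -> float:
    """List-price cost in USD of one uncached request."""
    rate = RATES[model]
    cost = (input_tokens * rate.input_per_mtok
            + output_tokens * rate.output_per_mtok) / 1_000_000
    return cost * BATCH_DISCOUNT if batch else cost


# No long-context premium: the per-token rate is the same at 9K and 900K.
small = request_cost("fable-5", 9_000, 0)
large = request_cost("fable-5", 900_000, 0)

# Batched Fable 5 lands on Opus 4.8's interactive price.
batched_fable = request_cost("fable-5", 100_000, 10_000, batch=True)
interactive_opus = request_cost("opus-4.8", 100_000, 10_000)
print(f"${small:.2f} vs ${large:.2f}; batched Fable ${batched_fable:.2f}, Opus ${interactive_opus:.2f}")
```

The 900K request costs exactly 100 times the 9K one, and the batched Fable 5 figure matches interactive Opus 4.8 for the same token counts.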

The 1M-context math nobody does before the first invoice

The headline feature is the trap if you treat it casually. Load the full window and the arithmetic is brutal:

One full-context call: 1M input tokens × $10/MTok = $10 before the model writes a word. Add a long 50K-token answer and you are at $12.50 for one request.
A conversation that keeps the window loaded: ten turns over the same million-token corpus, naively resent each time, is $100+ in input alone.
An agent loop at full context: twenty tool-call iterations against a loaded window, resent uncached each time, is 20 × $10 = $200 in input per task. Through the cost-per-task lens we keep banging on about, that is real money, not a rounding error.

Now the counter-math, because the rate card contains its own antidote. A cache hit costs $1/MTok, ten percent of base input. Cache the million-token corpus once (a 5-minute cache write at $12.50/MTok, 1.25x base input) and every subsequent turn that lands inside the cache lifetime reads it at $1/MTok: the ten-turn conversation drops from $100+ to roughly $12.50 + 9 × $1 ≈ $21.50, about a 4.7x reduction for one architectural decision. At Fable 5 prices, prompt caching stops being an optimization and becomes the load-bearing wall: if your workload re-reads big context and you are not caching, you are paying nearly five times what the rate card requires, by choice. (Our caching cost guide covers the mechanics across providers.)
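The ten-turn comparison can be sketched in a few lines of Python. This is a simplification under stated assumptions: every later turn arrives while the cache entry is still live, and the new tokens added per turn and all output tokens are ignored. The function name and parameters are ours, for illustration only.

```python
BASE_INPUT = 10.00   # USD per MTok, Fable 5 standard input
CACHE_WRITE = 12.50  # USD per MTok, 5-minute cache write
CACHE_READ = 1.00    # USD per MTok, cache hit


def session_input_cost(context_mtok: float, turns: int, cached: bool) -> float:
    """Input-side cost in USD of re-reading the same context on every turn.

    Assumes (for this sketch) that every turn after the first is a cache hit.
    """
    if turns < 1:
        raise ValueError("turns must be at least 1")
    if not cached:
        # The whole context is resent at the base input rate each turn.
        return context_mtok * BASE_INPUT * turns
    # First turn writes the cache; each later turn reads it.
    return context_mtok * (CACHE_WRITE + CACHE_READ * (turns - 1))


naive = session_input_cost(1.0, 10, cached=False)
cached = session_input_cost(1.0, 10, cached=True)
print(f"naive ${naive:.2f}, cached ${cached:.2f}, ratio {naive / cached:.2f}x")
# prints: naive $100.00, cached $21.50, ratio 4.65x
```

Under the same assumptions the cache pays for itself by the second turn: two cached turns cost $13.50 against $20 uncached. A single-turn workload is the one case where the write premium is pure overhead.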

The quiet multiplier: the 35% tokenizer

Buried in Anthropic's pricing notes is the line most comparisons skip: models on the new tokenizer (Opus 4.7 and later, which includes Fable 5) "may use up to 35% more tokens for the same fixed text." The new tokenizer buys real capability gains, but it means a price-per-token comparison against older models or other providers understates the difference: the same document costs more tokens to read on the new tokenizer than it did on the old one. When you model a migration from, say, Sonnet 4.6 to Fable 5, the honest multiplier is not just $3→$10 on input; it is up to $3→$13.50 in effective terms on tokenizer-inflated text. Per-task measurement on your own traffic, not list-price arithmetic, is the only way to know your real number, which is the entire thesis of cost-per-task benchmarking.
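A rough sketch of that effective-price arithmetic, in Python. The 35% figure is the stated upper bound, not a measurement, so the second helper shows how you might derive your own inflation factor from token counts observed on the same documents under each tokenizer. Both function names are hypothetical, and the sketch assumes Sonnet 4.6 is on the older tokenizer, as the "Opus 4.7 and later" wording implies.

```python
TOKENIZER_INFLATION_MAX = 0.35  # "up to 35% more tokens for the same fixed text"


def effective_price(list_price_per_mtok: float,
                    inflation: float = TOKENIZER_INFLATION_MAX) -> float:
    """USD per million old-tokenizer tokens' worth of text on the new tokenizer."""
    return list_price_per_mtok * (1 + inflation)


def measured_inflation(old_tokens: int, new_tokens: int) -> float:
    """Inflation factor from token counts of the same text under each tokenizer."""
    return new_tokens / old_tokens - 1


SONNET_INPUT = 3.00   # USD per MTok, assumed to be on the older tokenizer
FABLE_INPUT = 10.00   # USD per MTok, new tokenizer

worst_case = effective_price(FABLE_INPUT)       # about 13.50
list_multiplier = FABLE_INPUT / SONNET_INPUT    # about 3.3x on paper
real_multiplier = worst_case / SONNET_INPUT     # about 4.5x at the upper bound
print(f"list {list_multiplier:.1f}x, worst case {real_multiplier:.1f}x")
```

If your own documents tokenize to, say, 12% more tokens rather than 35%, pass that measured value as `inflation` and the migration math changes accordingly.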

Fable 5 vs Opus 4.8 Fast Mode: the same price, two different products

The most interesting pricing event in the launch is not Fable 5's number; it is that Anthropic simultaneously cut Opus 4.8's Fast Mode from $30/$150 to $10/$50, identical to Fable 5 standard. That creates a real decision at the $10/$50 price point:

Pick Fable 5 when the task needs the frontier: hardest reasoning, the 1M window used in anger, the capability ceiling. The trade-off is standard output speed instead of Fast Mode's.
Pick Opus 4.8 Fast Mode when the bottleneck is wall-clock: interactive agents, user-facing latency, throughput-bound pipelines that Opus-class quality already satisfies.
Pick neither by default. Standard Opus 4.8 at half the price remains the right default for heavy work, and Haiku 4.5 at a tenth remains the right default for everything that does not need heavy work. Frontier models are for frontier tasks; the routing discipline does not change because the ceiling moved.
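The three rules above amount to a small routing function. Here is one way it might look in Python; the task flags and the returned labels are hypothetical names invented for this sketch, and deciding whether a task truly "needs the frontier" is the hard part that no code snippet settles for you.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    # All three flags are illustrative; set them from your own task metadata.
    needs_frontier: bool = False     # hardest reasoning, capability ceiling
    heavy_reasoning: bool = False    # Opus-class quality is required
    latency_sensitive: bool = False  # wall-clock is the bottleneck


def route(task: Task) -> str:
    """Return a model label for the task, cheapest adequate tier first."""
    if task.needs_frontier:
        return "fable-5"            # $10/$50, standard output speed
    if task.heavy_reasoning:
        if task.latency_sensitive:
            return "opus-4.8-fast"  # $10/$50, faster output
        return "opus-4.8"           # $5/$25, the default for heavy work
    return "haiku-4.5"              # $1/$5, everything else


print(route(Task()))
print(route(Task(heavy_reasoning=True)))
print(route(Task(heavy_reasoning=True, latency_sensitive=True)))
print(route(Task(needs_frontier=True)))
```

Note that the function never returns a $10/$50 model unless a flag explicitly asks for it, which is the "pick neither by default" rule expressed as control flow.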

What the subscriber panic is actually about

The "burning 2% a minute" thread is worth decoding because it previews everyone's future. Claude subscription plans meter Fable 5 usage against plan allowances, and a frontier model with a huge context drains an allowance visibly faster than its predecessors, the in-plan equivalent of the API math above. The reaction pattern is the same one we documented in the Tokenpocalypse: when consumption becomes visible, people discover what their usage actually costs, and the discovery feels like a price increase even when it is just a meter. Fable 5 did not make AI expensive; it made expensive AI legible. (The full subscription-side mechanics, the 2x plan weighting, the June 23 cliff, and the usage-credit rules, are in our usage-limits breakdown; what Mythos 5 actually is and who gets it is in the Mythos access explainer; and the pricing-discipline context behind all of it is the $965B IPO filing.)

The practical playbook for week one

Do not flip the default model. Route to Fable 5 per-task, behind an explicit decision, and keep Opus/Sonnet/Haiku handling everything they already handle well.
Cache before you scale. Any Fable 5 workload that re-reads context belongs behind prompt caching from day one; at $1 cache hits the payback is immediate.
Batch what can wait. $5/$25 batched is Opus-interactive money for frontier output on overnight workloads: evals, document processing, bulk generation.
Meter per task, per model, from the first call. The tokenizer change plus the context appetite means list-price projections will be wrong in both directions; only measured cost-per-task on your own traffic tells you whether Fable 5 earns its 2x. That measurement layer is, as ever, the thing that survives every launch week.
Watch your effective context size. The window fits a million tokens; nothing obliges you to fill it. Trimmed, structured context at 100K costs a tenth as much as a lazy full-window dump and often yields better quality.
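As a starting point for the metering step, here is a minimal in-memory sketch in Python. It assumes you can read input and output token counts from the usage data your API responses report, prices everything at standard list rates, and ignores cache and batch discounts; the class, its methods, and the model labels are all hypothetical.

```python
from collections import defaultdict

# (input, output) USD per MTok at standard list prices; labels are our own.
PRICES = {
    "fable-5": (10.0, 50.0),
    "opus-4.8": (5.0, 25.0),
    "sonnet-4.6": (3.0, 15.0),
    "haiku-4.5": (1.0, 5.0),
}


class TaskMeter:
    """Accumulates list-price spend per (task, model) pair."""

    def __init__(self) -> None:
        self._totals: dict[tuple[str, str], float] = defaultdict(float)

    def record(self, task_id: str, model: str,
               input_tokens: int, output_tokens: int) -> float:
        """Add one call's cost to its task and return that cost in USD."""
        in_price, out_price = PRICES[model]
        cost = (input_tokens * in_price + output_tokens * out_price) / 1_000_000
        self._totals[(task_id, model)] += cost
        return cost

    def cost_per_task(self, model: str) -> float:
        """Mean USD per task across all tasks that used the given model."""
        costs = [c for (_, m), c in self._totals.items() if m == model]
        return sum(costs) / len(costs) if costs else 0.0


meter = TaskMeter()
meter.record("task-a", "fable-5", 100_000, 2_000)  # two calls in one task
meter.record("task-a", "fable-5", 100_000, 2_000)
meter.record("task-b", "fable-5", 100_000, 2_000)  # one call in another
print(f"Fable 5 mean cost per task: ${meter.cost_per_task('fable-5'):.2f}")
# prints: Fable 5 mean cost per task: $1.65
```

Grouping by task instead of by call is the point: an agent loop that makes twenty calls shows up as one expensive task, which is the number a routing decision actually needs.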

The honest take

$10/$50 for Fable 5 is aggressive but coherent: it prices the frontier above the workhorse line, holds the long-context surcharge at zero, and hands you a 90%-off caching lever that rewards exactly the engineering discipline serious teams already have. The shock in the launch threads is less about the number and more about the era it confirms: model launches are now pricing events, capability and cost land as one announcement, and the teams that thrive are the ones whose meters were running before the model dropped.
