Since June 1, 2026, every paid GitHub Copilot request that is not a code completion is priced in AI Credits. The unit is simple. The thing nobody can answer off the top of their head is the only thing that matters for budgeting: what does one credit actually buy, and how many credits does a real request burn?
This is the reference page for that. One unit, the conversion math, a per-model rate table, and worked examples for the three request shapes that make up almost all real usage. If you came here from a billing dashboard wondering why one engineer burned 4,000 credits in an afternoon, the worked examples are the part you want.
The unit: 1 credit = $0.01
A GitHub AI Credit is denominated at one credit = one US cent. That is the whole unit. It means a plan's included credit allotment maps directly to dollars: Pro at $10 per month includes 1,000 credits, Pro+ at $39 includes 3,900 credits, Business at $19 per seat includes 1,900 credits. Run past the included allotment and paid plans buy more credits at the same one-cent rate.
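The plan arithmetic is small enough to check directly. What follows is a minimal sketch, assuming only the one-cent unit and the plan prices quoted in this section; the function names are illustrative and not part of any GitHub API.

```python
def included_credits(plan_price_usd: float) -> int:
    """Credits implied by a plan price at 1 credit = $0.01."""
    return round(plan_price_usd * 100)


def overage_usd(credits_used: float, credits_included: int) -> float:
    """Dollar cost of credits consumed beyond the included allotment."""
    return max(0.0, credits_used - credits_included) / 100


if __name__ == "__main__":
    for plan, price in {"Pro": 10, "Pro+": 39, "Business (per seat)": 19}.items():
        print(f"{plan}: {included_credits(price)} credits included")
    # Hypothetical usage figure, only to show the overage calculation.
    print(f"Pro user at 1,250 credits owes ${overage_usd(1250, 1000):.2f} extra")
```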
The reason credits exist at all, rather than just billing dollars directly, is that a single request consumes a fractional, model-dependent amount. Counting in cents-as-credits lets GitHub debit sub-cent amounts per request and round at the aggregate instead of per-call. So the credit is a billing convenience. The cost driver underneath it is tokens.
How a request converts to credits
Every billable request (Copilot Chat, Edits, Agent Mode, anything that is not tab-completion) is metered the same way:
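credits = (input_tokens × input_rate_per_1k + output_tokens × output_rate_per_1k + cached_tokens × cached_rate_per_1k) / 1000 × 100, where each rate is the model's price in US dollars per 1,000 tokens and the final × 100 converts dollars to credits at 1 credit = $0.01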
Three things fall out of that formula immediately, and each one is a lever you control:
- Output tokens dominate. Generation is the expensive half on every frontier model. A request that returns a 600-line file costs far more than the question that prompted it.
- Context is not free. Cached and re-sent context tokens are counted. A long agent conversation re-sends its history on every turn, so the per-turn cost climbs as the session grows.
- The model rate swings the result by an order of magnitude. A base model and a frontier model answering the identical question can differ 10x or more in credits, and even the step from mid-tier to frontier is typically a several-fold jump, because the per-token rates differ that much.
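The conversion can be sketched in a few lines. This is an illustration under stated assumptions, not GitHub's metering code: it assumes separate per-1k rates for input, output, and cached tokens, and the `MID_TIER` rates are hypothetical placeholders rather than a published rate card.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelRate:
    """USD per 1,000 tokens. Values used below are hypothetical placeholders."""
    input_per_1k: float
    output_per_1k: float
    cached_per_1k: float


def request_credits(input_tokens: int, output_tokens: int,
                    cached_tokens: int, rate: ModelRate) -> float:
    """Convert one request's token counts to credits (1 credit = $0.01)."""
    usd = (
        input_tokens * rate.input_per_1k
        + output_tokens * rate.output_per_1k
        + cached_tokens * rate.cached_per_1k
    ) / 1000
    return usd * 100


# Hypothetical rates, NOT a live quote from any provider or from GitHub.
MID_TIER = ModelRate(input_per_1k=0.01, output_per_1k=0.03, cached_per_1k=0.001)

if __name__ == "__main__":
    # Same 3,000 total tokens, split two ways: the output-heavy request costs more.
    input_heavy = request_credits(2500, 500, 0, MID_TIER)
    output_heavy = request_credits(500, 2500, 0, MID_TIER)
    print(f"input-heavy:  {input_heavy:.2f} credits")
    print(f"output-heavy: {output_heavy:.2f} credits")
```

Swapping in the rates from your org's billing model list is the only change needed to turn the sketch into a rough estimator.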
The per-model rate, in shape
GitHub converts each model's published per-token API price into a credit rate. The exact multipliers move whenever a provider changes list price, so treat the table below as the shape of the rate card rather than a frozen quote. Pull the live numbers from your org's billing model list before you build a budget on them.
| Model tier | Typical use | Relative credit rate | Cost character |
|---|---|---|---|
| Base / included model | Default chat, simple edits | 1x (baseline) | Often bundled at the lowest rate; the safe default |
| Mid-tier (e.g. a Sonnet-class or GPT mid model) | Most day-to-day chat and refactors | ~3-5x baseline | The sweet spot for quality per credit |
| Frontier (e.g. an Opus-class or top GPT model) | Hard reasoning, large multi-file work | ~10-20x baseline | Worth it for the 10% of tasks that need it, expensive for the other 90% |
The single biggest budgeting mistake teams make is leaving every developer defaulted to a frontier model for tasks a mid-tier model would answer identically. That one setting is usually the difference between a flat bill and a surprising one.
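To get a feel for how much that default matters, the sketch below computes the average per-request cost as the frontier share changes. It assumes the midpoints of the table's ranges (4x for mid-tier, 15x for frontier) and identical token counts on both tiers; both are simplifying assumptions, and the live multipliers may differ.

```python
# Relative credit rates: midpoints of the ranges in the table above.
# Illustrative assumptions, not live multipliers.
MID_TIER_MULTIPLIER = 4.0     # table: ~3-5x baseline
FRONTIER_MULTIPLIER = 15.0    # table: ~10-20x baseline


def relative_cost(frontier_share: float) -> float:
    """Average per-request cost, in baseline units, when `frontier_share`
    of requests use a frontier model and the rest use mid-tier."""
    if not 0.0 <= frontier_share <= 1.0:
        raise ValueError("frontier_share must be between 0 and 1")
    return (frontier_share * FRONTIER_MULTIPLIER
            + (1.0 - frontier_share) * MID_TIER_MULTIPLIER)


if __name__ == "__main__":
    all_frontier = relative_cost(1.0)
    for share in (1.0, 0.5, 0.1):
        cost = relative_cost(share)
        saving = 1 - cost / all_frontier
        print(f"frontier share {share:.0%}: {cost:.1f}x baseline, "
              f"{saving:.0%} below all-frontier")
```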
Worked example 1: a chat question
A developer asks "why is this function returning undefined?" and pastes a 40-line function. The request sends roughly 600 input tokens (question plus code plus system prompt) and gets back a 250-token explanation.
- On a mid-tier model: about 850 metered tokens at a mid-tier rate lands near 1 to 2 credits. One or two cents.
- On a frontier model: the same 850 tokens at a frontier rate is closer to 4 to 8 credits. Still small in isolation.
A single chat is cheap either way. The lesson here is not the absolute number; it is the roughly 4x multiplier for an answer that would have been identical on the cheaper model.
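The quoted ranges also imply a blended rate you can compare against your own rate card. The sketch below back-derives credits per 1,000 metered tokens from the figures in this example; it uses only numbers stated above, and the helper name is illustrative.

```python
def implied_credits_per_1k(credits: float, metered_tokens: int) -> float:
    """Blended credits per 1,000 tokens implied by a quoted request cost."""
    return credits / metered_tokens * 1000


METERED_TOKENS = 600 + 250  # input + output from the example above

# (low, high) credit estimates quoted in the example
MID_TIER_CREDITS = (1, 2)
FRONTIER_CREDITS = (4, 8)

if __name__ == "__main__":
    for label, (low, high) in (("mid-tier", MID_TIER_CREDITS),
                               ("frontier", FRONTIER_CREDITS)):
        low_rate = implied_credits_per_1k(low, METERED_TOKENS)
        high_rate = implied_credits_per_1k(high, METERED_TOKENS)
        print(f"{label}: {low_rate:.2f} to {high_rate:.2f} credits per 1k tokens")
    print("frontier / mid-tier multiplier:",
          FRONTIER_CREDITS[0] / MID_TIER_CREDITS[0])
```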
Worked example 2: a 200-line refactor via Edits
Now the developer asks Copilot Edits to refactor a 200-line module. Input is the file plus instructions, roughly 3,000 tokens. Output is the rewritten file, roughly 2,800 tokens. Call it 5,800 metered tokens.
- Mid-tier: roughly 8 to 15 credits. Eight to fifteen cents for a real refactor. Cheap.
- Frontier: roughly 50 to 100 credits. Half a dollar to a dollar for one edit.
This is where the included allotment starts to feel finite. At 50 to 100 credits each, ten frontier-model refactors a day is 500 to 1,000 credits, so a Pro user with 1,000 monthly credits exhausts the envelope in one to two days.
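The runway arithmetic is worth keeping as a helper, since it is the calculation you will repeat for every plan and workflow. A minimal sketch using the per-refactor ranges quoted above; the function name is illustrative.

```python
def days_until_exhausted(monthly_credits: int, requests_per_day: int,
                         credits_per_request: float) -> float:
    """Days a credit allotment lasts at a steady daily request rate."""
    return monthly_credits / (requests_per_day * credits_per_request)


if __name__ == "__main__":
    pro_credits = 1000
    refactors_per_day = 10
    # (low, high) credits per refactor, from the worked example above
    for label, (low, high) in (("mid-tier", (8, 15)), ("frontier", (50, 100))):
        longest = days_until_exhausted(pro_credits, refactors_per_day, low)
        shortest = days_until_exhausted(pro_credits, refactors_per_day, high)
        print(f"{label}: allotment lasts {shortest:.1f} to {longest:.1f} days")
```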
Worked example 3: an Agent Mode loop
Agent Mode is the one that surprises people. It is not one request. It is a loop: read files, plan, edit, run, observe, repeat, re-sending the growing conversation each turn. A medium agent task ("add validation to this endpoint and write tests") might run 8 to 15 turns, each re-sending an expanding context.
- Mid-tier, 10 turns, growing context: easily 150 to 400 credits. Roughly $1.50 to $4.00 for one agent task.
- Frontier, same task: 800 to 2,000+ credits. Two to five tasks like this in an afternoon burn through a 3,900-credit Pro+ allotment.
If your dashboard shows a developer who spent thousands of credits in a day, the cause is almost always frontier-model Agent Mode on broad scopes. That is the request shape to put a budget around first.
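The reason the loop costs so much more than a single request is that the re-sent context grows every turn, so total input tokens grow roughly quadratically with turn count. The sketch below simulates that under hypothetical inputs: an 8,000-token starting context, 3,500 tokens added per turn, 1,000 output tokens per turn, and placeholder USD-per-1k rates chosen only for illustration. It ignores caching.

```python
def agent_loop_credits(turns: int, initial_context: int, growth_per_turn: int,
                       output_per_turn: int, input_per_1k: float,
                       output_per_1k: float) -> list[float]:
    """Credits charged on each turn when the full context is re-sent every turn."""
    per_turn = []
    context = initial_context
    for _ in range(turns):
        usd = (context * input_per_1k + output_per_turn * output_per_1k) / 1000
        per_turn.append(usd * 100)   # 1 credit = $0.01
        context += growth_per_turn   # history plus tool output accumulates
    return per_turn


if __name__ == "__main__":
    # Hypothetical (input, output) USD per 1k tokens; not a published rate card.
    tiers = {"mid-tier": (0.01, 0.03), "frontier": (0.03, 0.15)}
    for label, (in_rate, out_rate) in tiers.items():
        costs = agent_loop_credits(10, 8000, 3500, 1000, in_rate, out_rate)
        print(f"{label}: first turn {costs[0]:.1f}, last turn {costs[-1]:.1f}, "
              f"total {sum(costs):.1f} credits")
```

With these placeholder inputs the last turn costs several times the first, which is why capping turn count or scope tends to matter more than trimming any single prompt.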
What stays free
Code completions and Next Edit Suggestions do not touch credits. They are unlimited on every paid plan. This is the load-bearing detail of the whole model: if your team is 90% tab-completion and 10% chat, the credit system is nearly invisible to you. The teams that feel the change are the ones who shifted their workflow into Chat and Agent Mode, which is exactly the cohort getting the most value, and exactly the cohort that needs visibility.
How cached tokens cut the bill
The underlying model providers all offer prompt caching, and a cache hit is metered at a steep discount rather than the full input rate. In practice this means focused, repetitive sessions cost less per turn than scattered ones, because the stable prefix of the conversation keeps hitting cache. It is a real lever: keeping an agent session scoped and on-topic does not just produce better answers, it lowers the per-turn credit burn. We walk through the provider-level mechanics in the prompt caching cost guide.
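A rough sketch of the effect on a single turn is below. The cache discount (cache hits billed at 10% of the input rate), the hit ratio, and the per-1k rates are all hypothetical; actual discounts vary by provider and are not stated in this article.

```python
def turn_credits(context_tokens: int, cache_hit_ratio: float, output_tokens: int,
                 input_per_1k: float, output_per_1k: float,
                 cached_fraction_of_input_rate: float) -> float:
    """Credits for one turn when part of the context is served from cache."""
    cached = context_tokens * cache_hit_ratio
    fresh = context_tokens - cached
    usd = (
        fresh * input_per_1k
        + cached * input_per_1k * cached_fraction_of_input_rate
        + output_tokens * output_per_1k
    ) / 1000
    return usd * 100  # 1 credit = $0.01


if __name__ == "__main__":
    # Hypothetical turn: 20,000 context tokens, 1,000 output tokens,
    # placeholder rates of $0.01 (input) and $0.03 (output) per 1k tokens.
    scattered = turn_credits(20_000, 0.0, 1_000, 0.01, 0.03, 0.1)
    focused = turn_credits(20_000, 0.9, 1_000, 0.01, 0.03, 0.1)
    print(f"no cache hits:  {scattered:.1f} credits")
    print(f"90% cache hits: {focused:.1f} credits")
```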
How to read your own credit burn
Three numbers tell you everything about whether the new model will hurt:
- Completions-to-chat ratio. The higher the completion share, the less credits matter. If you do not know it, you cannot budget.
- Frontier-model share of billable requests. This is the dial. Every point you can move from frontier to mid-tier without losing answer quality is a near-linear cost reduction.
- Agent Mode credits per developer per week. This is where the outliers live. A team-wide average hides the two or three people generating most of the burn.
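If you have per-request records from a metering layer, the three numbers reduce to a short aggregation. The sketch below assumes a hypothetical `UsageRecord` schema and made-up sample rows covering one week; it reports the completion share of all requests as a stand-in for the completions-to-chat ratio.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageRecord:
    """Hypothetical per-request log row; not a GitHub export format."""
    developer: str
    kind: str        # "completion", "chat", "edits", or "agent"
    model_tier: str  # "base", "mid", or "frontier"
    credits: float   # 0 for completions, which are not billed


def summarize(records: list[UsageRecord]) -> tuple[float, float, dict[str, float]]:
    """Return (completion share, frontier share of billable, agent credits per dev)."""
    billable = [r for r in records if r.kind != "completion"]
    completion_share = (len(records) - len(billable)) / len(records) if records else 0.0
    frontier_share = (
        sum(r.model_tier == "frontier" for r in billable) / len(billable)
        if billable else 0.0
    )
    agent_credits: dict[str, float] = defaultdict(float)
    for r in billable:
        if r.kind == "agent":
            agent_credits[r.developer] += r.credits
    return completion_share, frontier_share, dict(agent_credits)


# Made-up sample data for one week, for illustration only.
SAMPLE = [
    UsageRecord("ana", "completion", "base", 0),
    UsageRecord("ana", "completion", "base", 0),
    UsageRecord("ana", "chat", "mid", 1.5),
    UsageRecord("ben", "completion", "base", 0),
    UsageRecord("ben", "agent", "frontier", 900),
    UsageRecord("ben", "agent", "frontier", 1200),
    UsageRecord("ana", "agent", "mid", 250),
    UsageRecord("ben", "edits", "mid", 12),
]

if __name__ == "__main__":
    completion_share, frontier_share, agent_by_dev = summarize(SAMPLE)
    print(f"completion share of requests: {completion_share:.0%}")
    print(f"frontier share of billable:   {frontier_share:.0%}")
    for dev, credits in sorted(agent_by_dev.items(), key=lambda kv: -kv[1]):
        print(f"agent credits this week, {dev}: {credits:.0f}")
```

Sorting the per-developer totals is the point of the exercise: the outliers show up at the top instead of disappearing into a team average.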
GitHub's native billing page gives you the org aggregate. It does not give you the per-developer, per-request-shape breakdown that tells you which developer and which workflow to coach. That gap is the reason teams bolt a metering layer on top, so the visibility reaches the engineer doing the spending and not just the admin reading the invoice.
The honest take
The credit unit is not the hard part. One credit is one cent, and the included allotment maps to dollars cleanly. The hard part is that the cost of a request varies by more than 100x depending on model choice and request shape, and almost nobody knows their own distribution across that range. The teams that win under credit billing are not the ones who use Copilot less. They are the ones who can see, per developer and per request type, where the credits go, and who put a budget around the one shape (frontier-model agent loops) that produces the outliers.
Related reading
- GitHub Copilot Moves to Usage-Based Billing: the full cutover and what changed June 1, tier by tier
- How to Set a Copilot Spending Cap: putting a budget around the request shape that produces the outliers
- Copilot Business vs Enterprise Billing: the org-buyer decision under credits
- Prompt Caching Cost Optimization: why focused sessions cost less per turn
- Cap AI Coding Costs Per Engineer: the FinOps operating manual once you can see the burn