TL;DR — the honest answer
Across 66 real Claude Code sessions from one heavy user (this is a reference set, not a census), the median session cost ~$4.08 over ~29 model turns; the middle half ran $2.10–$8.31 and the top tenth passed $11.42. The headline is the split between typical and total:
- In the median session, re-sent (cached) context is only ~24% of spend. Most sessions are short and cheap, and output/new context carry the bill.
- Pooled across all 66 sessions, re-sent context is 60% of total dollars — because a few long, long-context sessions dominate the total, and in those, re-sent context is the bill.
So “Claude Code is expensive” is really “a few long sessions are expensive, and in those, re-sent context is the bill.” To compare your own, run npx @wartzar-bee/tokenscope --share.
How much does a Claude Code session cost?
The median Claude Code session in this reference set cost about $4.08. Half of all sessions fell between roughly $2.10 (p25) and $8.31 (p75); the cheapest tenth were under $1.63 and the most expensive tenth were over $11.42. The distribution is right-skewed — a long tail of pricier sessions pulls the total up well above the typical session.
Why is Claude Code so expensive? (typical vs. total)
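The single most useful thing in this data is the gap between the typical session and the total bill. Pooled across all 66 sessions, the spend split is 60% re-sent context (cache-read), 25% new cached context (cache-write), 15% output, and ~0% fresh input.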
Here is the apparent contradiction, stated plainly: in the median session, re-sent cached context is only ~24% of spend, yet pooled it is 60%. Both numbers are real. They differ because they weight differently:
- Per-session (the median) weights every session equally. Most sessions are short — ~29 turns, peaking around 45k tokens of context. In a short session the model hasn't re-sent the context many times yet, so output and newly-written context are a bigger relative slice. Re-sent context is a minority: ~24%.
- Pooled weights every dollar equally. The total is dominated by a handful of long, long-context sessions (peaks up to 999,541 tokens, 67+ turns). In those, the same big context is re-sent turn after turn, so re-sent context balloons: 46% of spend at the p90 of the per-session distribution, and more in the extremes. Because those sessions hold most of the dollars, they drag the pooled re-sent share up to 60%.
The practical takeaway: use /compact and fresh sessions on the long ones.
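The two weightings described above can be reproduced with a few lines of arithmetic. This is a minimal sketch on made-up data: the five sessions, their dollar amounts, and the function names are all hypothetical and are not taken from the reference set or from tokenscope.

```python
from statistics import median

# Hypothetical sessions: (total cost in USD, re-sent context cost in USD).
# Four short, cheap sessions and one long, expensive one.
sessions = [
    (2.00, 0.40),
    (3.00, 0.60),
    (4.00, 1.00),
    (5.00, 1.50),
    (86.00, 64.00),
]

def per_session_median_share(sessions):
    """Weight every session equally: median of each session's own share."""
    return median(resent / total for total, resent in sessions)

def pooled_share(sessions):
    """Weight every dollar equally: total re-sent dollars over total dollars."""
    total_cost = sum(total for total, _ in sessions)
    total_resent = sum(resent for _, resent in sessions)
    return total_resent / total_cost

typical = per_session_median_share(sessions)
pooled = pooled_share(sessions)
print(f"median per-session share: {typical:.1%}")  # 25.0%
print(f"pooled share:             {pooled:.1%}")   # 67.5%
```

In this toy set the single long session holds 86 of the 100 total dollars, so its own share pulls the pooled figure far above the median, which is the same mechanism behind the 24% versus 60% gap on this page.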
What % of Claude Code cost is re-sent context?
It depends entirely on whether you measure a typical session or the total bill. Per session, re-sent (cached) context as a share of spend is distributed as follows: 14% at p10, 19% at p25, 24% at the median, 31% at p75, and 46% at p90.
So the answer to “what percent of Claude Code cost is re-sent context?” is: about 24% in a typical session, but 60% of total dollars. If someone quotes a single number, ask which one they mean.
Cache efficiency & peak context
Two more metrics tokenscope reports per session. Cache efficiency is how much of your re-sent context hit the cheap cache-read path rather than full input price — higher is better. Peak context is the largest the context window grew during the session.
The pattern echoes the cost story: the median session peaks around 44,904 tokens, but the heaviest session in the set peaked at 999,541 tokens and the average peak across all sessions was 251,371 tokens — the long sessions reach far higher, which is exactly why they cost the most. Pooled cache efficiency (98%) is higher than the median session's (83%) because the big long sessions that dominate spend also keep the cache warm.
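To see why a pooled efficiency can sit above the median session's, here is a rough sketch. It assumes cache efficiency is cache-read tokens divided by cache-read plus full-price input tokens; that formula is an assumption for illustration and may not match how tokenscope computes the metric. The token counts are hypothetical.

```python
from statistics import median

# Hypothetical per-session counts: (cache_read_tokens, full_price_input_tokens).
sessions = [
    (80_000, 20_000),      # short session, 80% efficient
    (90_000, 10_000),      # short session, 90% efficient
    (70_000, 30_000),      # short session, 70% efficient
    (9_900_000, 100_000),  # long session, 99% efficient
]

def efficiency(cache_read, full_price):
    """Assumed definition: share of re-sent input tokens served from cache."""
    return cache_read / (cache_read + full_price)

# Per-session view: every session counts once.
median_eff = median(efficiency(r, f) for r, f in sessions)

# Pooled view: every token counts once, so the long session dominates.
pooled_eff = sum(r for r, _ in sessions) / sum(r + f for r, f in sessions)

print(f"median session efficiency: {median_eff:.1%}")  # 85.0%
print(f"pooled efficiency:         {pooled_eff:.1%}")  # 98.4%
```

The long session contributes almost all of the tokens, so its high hit rate sets the pooled number, while the median only reflects the short sessions around the middle.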
How many turns is a typical session?
A “model turn” is one request/response round-trip. The median session ran 29 turns; the middle half was 18–39, and the top tenth ran past 67 turns. Pooled across all 66 sessions there were 4,339 turns — and the turn count tracks cost closely, because more turns means the accumulated context gets re-sent more times.
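The link between turn count and re-sent context can be illustrated with a toy model. It assumes the context grows by a fixed number of tokens per turn and that the whole prior context is re-sent on every turn, with no compaction; the growth figure of 1,500 tokens per turn is hypothetical. The turn counts are the p10, median, and p90 values from this page.

```python
def resent_tokens(turns: int, growth_per_turn: int) -> int:
    """Total tokens re-sent over a session in a toy model where the context
    grows by a fixed amount each turn and the whole prior context is re-sent
    on every turn."""
    context = 0
    total_resent = 0
    for _ in range(turns):
        total_resent += context      # the prior context goes out again
        context += growth_per_turn   # this turn's new material joins the context
    return total_resent

for turns in (15, 29, 67):
    print(turns, resent_tokens(turns, 1_500))
# 15 157500
# 29 609000
# 67 3316500
```

Under these assumptions re-sent tokens grow roughly with the square of the turn count: going from 29 to 67 turns is about 2.3 times the turns but about 5.4 times the re-sent tokens. Real sessions grow unevenly, so treat this as the shape of the effect rather than a prediction.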
All percentile tables
The complete per-session distribution, so you can quote or check any figure. p50 is the median. All from 66 real sessions (single user), measured 2026-05-29 with tokenscope's default pricing.
Per-session percentiles
| Metric | p10 | p25 | p50 (median) | p75 | p90 |
|---|---|---|---|---|---|
| Cost per session (USD) | $1.63 | $2.10 | $4.08 | $8.31 | $11.42 |
| Re-sent (cached) context as % of spend | 14% | 19% | 24% | 31% | 46% |
| Cache efficiency % | 69% | 77% | 83% | 87% | 94% |
| Peak context (tokens) | 24,153 | 28,466 | 44,904 | 69,741 | 86,124 |
| Model turns per session | 15 | 18 | 29 | 39 | 67 |
Pooled aggregate (all 66 sessions)
| Pooled metric | Value |
|---|---|
| Total cost | $2,650.90 |
| Total model turns | 4,339 |
| Spend share — re-sent context (cache-read) | 60% |
| Spend share — new cached context (cache-write) | 25% |
| Spend share — output | 15% |
| Spend share — fresh input | ~0% |
| Pooled cache efficiency | 98% |
| Peak context (heaviest session) | 999,541 tokens |
| Average peak context | 251,371 tokens |
| Sessions in set | n = 66 (single user) |
Reference set, not a census. Pooled spend shares are dollar-weighted across all sessions; per-session percentiles weight every session equally. The two views differ because cost concentrates in a few long sessions — that's the whole point of this page.
How do I compare my own session?
Run one command. tokenscope reads your Claude Code logs locally (read-only, nothing uploaded) and prints the exact metrics on this page — your cost, your re-sent-context share, your cache efficiency, your peak context, your turn count — so you can see which percentile your sessions land in.
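Once you have a session cost in hand, placing it against this page's table is a simple lookup. The sketch below hard-codes the cost percentiles from the per-session table; the function name and band labels are illustrative choices, not tokenscope output.

```python
from bisect import bisect_right

# Per-session cost percentiles from this page's table (USD).
COST_PERCENTILES = [(10, 1.63), (25, 2.10), (50, 4.08), (75, 8.31), (90, 11.42)]

def cost_band(cost_usd: float) -> str:
    """Return the percentile band of the reference set a session cost falls into."""
    cutoffs = [value for _, value in COST_PERCENTILES]
    labels = [p for p, _ in COST_PERCENTILES]
    i = bisect_right(cutoffs, cost_usd)  # number of cutoffs at or below the cost
    if i == 0:
        return f"below p{labels[0]}"
    if i == len(labels):
        return f"above p{labels[-1]}"
    return f"p{labels[i - 1]}-p{labels[i]}"

print(cost_band(3.00))   # p25-p50
print(cost_band(1.00))   # below p10
print(cost_band(20.00))  # above p90
```

Because the reference set is a single user, a band here says where a session sits relative to that user's sessions, not relative to Claude Code users in general.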
Benchmark your own sessions
npx @wartzar-bee/tokenscope --share
The --share flag also produces a self-contained card (aggregate numbers only: no file paths, no prompt or response content) that links back to this benchmark (tokenscope.pages.dev/benchmark).
Read-only, local, no upload, no telemetry. MIT. Not affiliated with Anthropic.
Methodology
How these numbers were produced, in full:
- Source. 66 real Claude Code sessions, measured on 2026-05-29 with the tokenscope CLI, which reads Claude Code's local session logs (~/.claude/projects/**/*.jsonl) read-only and attributes cost by multiplying the token counts in those logs by documented Anthropic prices, including cache multipliers (write 1.25×/2×, read 0.1× of input).
- Filter. Sessions were included only if cost > 0 and they had at least 3 model turns. This drops trivial or aborted runs that would otherwise skew the low percentiles. After filtering, n = 66.
- Pricing. tokenscope's default pricing was used (the documented Anthropic defaults; overridable in .tokenscope.json). Prices vary by tier and over time, so verify current pricing for your own account.
- Single user. Every one of the 66 sessions comes from one heavy user (tokenscope's own development). This is a reference set, not a census or a representative survey of all Claude Code users.
- Two views. Per-session percentiles weight every session equally (the median session). Pooled aggregates weight every dollar equally (the total bill). They differ on purpose — that gap is the finding.
- It will grow. We plan to re-measure this set and grow it as people opt in to share their aggregate, anonymous numbers via --share. Until then, treat it as one honest data point with enough sessions to show a distribution.
- Charts are hand-coded SVG. Every figure is deterministic SVG drawn in markup: no chart library, no image generation, no external assets. Bar lengths are computed directly from the percentile values above.
- No trackers. This page loads no scripts, no fonts, no analytics, and sets no cookies. You can read it offline.
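The cost attribution and the inclusion filter described above can be sketched as follows. The cache multipliers are the ones stated in the methodology; the per-million-token prices, the field names, and the split of cache writes into two tiers are assumptions made for illustration, and none of this is tokenscope's actual code or log schema.

```python
from dataclasses import dataclass

# Cache multipliers from the methodology, relative to the input price.
CACHE_WRITE_SHORT = 1.25  # the 1.25x write tier
CACHE_WRITE_LONG = 2.0    # the 2x write tier
CACHE_READ = 0.1

# Illustrative prices in USD per million tokens (assumed; check your price sheet).
INPUT_USD_PER_MTOK = 3.00
OUTPUT_USD_PER_MTOK = 15.00

@dataclass
class Session:
    """Hypothetical per-session token totals; field names are illustrative."""
    turns: int
    input_tokens: int = 0
    cache_write_short_tokens: int = 0
    cache_write_long_tokens: int = 0
    cache_read_tokens: int = 0
    output_tokens: int = 0

def session_cost(s: Session) -> float:
    """Multiply token counts by prices, applying the cache multipliers."""
    input_equivalent = (
        s.input_tokens
        + s.cache_write_short_tokens * CACHE_WRITE_SHORT
        + s.cache_write_long_tokens * CACHE_WRITE_LONG
        + s.cache_read_tokens * CACHE_READ
    )
    dollars_times_million = (
        input_equivalent * INPUT_USD_PER_MTOK
        + s.output_tokens * OUTPUT_USD_PER_MTOK
    )
    return dollars_times_million / 1_000_000

def included(s: Session) -> bool:
    """The stated filter: cost > 0 and at least 3 model turns."""
    return session_cost(s) > 0 and s.turns >= 3

example = Session(
    turns=29,
    input_tokens=1_000,
    cache_write_short_tokens=200_000,
    cache_read_tokens=2_000_000,
    output_tokens=40_000,
)
print(f"${session_cost(example):.3f}", included(example))
```

With these assumed prices the example session comes to about $1.95. Swapping in your own price sheet only requires changing the two price constants.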
Limitations (read this)
- Single user, not a population. All 66 sessions are from one heavy user on one set of projects. A lighter user, a different language/stack, or a different working style would shift every percentile. Do not read these as “the average Claude Code user.”
- Self-measurement. The user being measured is also the tool's author. We have not adjusted, trimmed, or curated the numbers — they are tokenscope's raw output — but a single self-measured source is inherently narrow.
- One day, one price sheet. Measured on 2026-05-29 with default pricing. Anthropic prices and cache multipliers change; a re-measurement at different rates would move the dollar figures even on the same sessions.
- n = 66 is enough for a shape, not a guarantee. 66 sessions show a distribution and a clear skew, but the tail (p90 and beyond) rests on a small number of long sessions, so the extreme percentiles are the least stable.
- The mechanic generalizes; the numbers don't. “Cost concentrates in a few long sessions, and in those re-sent context dominates” is structural and should hold broadly. The exact $4.08 median and 24%/60% split are this reference set's actual figures and yours will differ.
FAQ
How much does a Claude Code session cost?
In this single-user reference set of 66 real sessions, the median session cost about $4.08. The middle half ran from ~$2.10 (p25) to ~$8.31 (p75); the cheapest tenth were under ~$1.63 and the most expensive tenth were over ~$11.42. These are sessions filtered to cost > 0 and at least 3 model turns, all from one heavy user — a reference set, not a population average.
Why is Claude Code so expensive?
Cost concentrates in a few long sessions, and in those, re-sent context is the bill. In the typical (median) session, re-sent cached context is only ~24% of spend, but pooled across all sessions it is 60%, because a handful of long, long-context sessions dominate the total. So “Claude Code is expensive” is really “a few long sessions are expensive, and in those, re-sent context is the bill.”
What percentage of Claude Code cost is re-sent context?
About 24% in a typical session (p25 19%, p75 31%, p90 46%), but 60% of total dollars when pooled across all sessions (new cached context 25%, output 15%, fresh input ~0%). The gap is the whole story: re-sent context is a minority of most sessions but the majority of total spend.
Is this a benchmark of all Claude Code users?
No. It is a single-user reference set: all 66 sessions come from one heavy user (tokenscope's own development), measured 2026-05-29 with default pricing, filtered to cost > 0 and ≥ 3 turns. It is not a census, not a representative survey, and not an average across users. We plan to re-measure and grow it as people opt in to share aggregate, anonymous numbers.
How do I compare my own Claude Code session to this?
Run npx @wartzar-bee/tokenscope --share. tokenscope reads your logs locally (read-only, nothing uploaded) and prints your cost, re-sent-context share, cache efficiency, peak context, and turn count (the exact metrics on this page), so you can see which percentile you land in.
What is a good cache efficiency?
The median session here ran at ~83% (p25 77%, p75 87%, p90 94%). Higher is generally better — it means more of your re-sent context hits the cheap cache-read path. Pooled it is 98%, because the long sessions that dominate spend keep the cache warm. Cache efficiency tells you how cheap your re-sends are, not how much you are re-sending.
How big does the context window get?
Median peak context was ~44,904 tokens (p25 ~28,466, p75 ~69,741, p90 ~86,124). Pooled, the heaviest session peaked at 999,541 tokens and the average peak was 251,371 tokens — the long sessions reach far higher, which is why they cost the most.
Should I trust this single-user benchmark?
Trust the shape, verify the numbers against your own. The structural finding (cost concentrates in a few long sessions; re-sent context dominates those) should hold broadly. The specific percentiles are one heavy user's actual sessions and would shift for a different user, project mix, or price sheet. The numbers are not fabricated or adjusted — run tokenscope on your own logs to see where you land.