Question 1

How much does a Claude Code session cost?

Accepted Answer

In this single-user reference set of 66 real sessions, the median session cost about $4.08. The middle half ran from about $2.10 (p25) to $8.31 (p75); the cheapest tenth were under ~$1.63 and the most expensive tenth were over ~$11.42. These are sessions filtered to cost > 0 and at least 3 model turns, all from one heavy user, so treat them as a reference set, not a population average. Your sessions will differ with how long you run and how big your context grows.

Question 2

Why is Claude Code so expensive?

Accepted Answer

Cost concentrates in a few long sessions, and in those sessions re-sent context dominates the bill. The model is stateless, so the whole conversation context is re-sent on every turn. With prompt caching, that re-send is billed at the discounted cache-read rate, but it is still paid on every turn, on the entire accumulated context. In the typical (median) session, re-sent cached context accounts for only about 24% of spend, but pooled across all sessions it accounts for 60%, because a handful of long, large-context sessions dominate the total. So 'Claude Code is expensive' really means 'a few long sessions are expensive, and in those, re-sent context is the bill.'
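To see why long sessions dominate, here is a minimal toy model of per-turn re-send. It assumes every turn adds the same number of new tokens, that all earlier context is re-read from cache on each turn, and that cache reads and cache writes cost 0.1x and 1.25x the base input price. Those multipliers, the function names, and the fixed tokens-per-turn figure are illustrative assumptions, not tokenscope's default pricing or its method. Output tokens are ignored, so the shares it prints will not match the benchmark figures; only the trend matters.

```python
# Illustrative price multipliers relative to the base input-token price.
# These are assumptions for the sketch, not tokenscope's default pricing.
CACHE_READ_RATE = 0.1    # re-sent context that hits the cache
CACHE_WRITE_RATE = 1.25  # new context written to the cache


def resent_tokens(turns: int, tokens_added_per_turn: int) -> int:
    """Total tokens re-sent over a session: turn t re-sends the t-1 earlier turns."""
    return sum((t - 1) * tokens_added_per_turn for t in range(1, turns + 1))


def resent_share(turns: int, tokens_added_per_turn: int) -> float:
    """Re-sent context as a fraction of context cost (output is ignored)."""
    read_cost = CACHE_READ_RATE * resent_tokens(turns, tokens_added_per_turn)
    write_cost = CACHE_WRITE_RATE * turns * tokens_added_per_turn
    return read_cost / (read_cost + write_cost)


# Hypothetical session lengths, each adding 2,000 new tokens per turn.
for turns in (10, 30, 100, 300):
    print(turns, f"{resent_share(turns, 2_000):.0%}")
```

New context grows linearly with the number of turns, while re-sent context grows roughly with the square of it, so the re-sent share keeps climbing as a session gets longer even though each re-sent token is cheap.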

Question 3

What percentage of Claude Code cost is re-sent context?

Accepted Answer

It depends on whether you weight by session or by dollar. In a typical (median) session it is about 24% of spend (p25 19%, p75 31%, p90 46%). But pooled across all 66 sessions — i.e. weighted by dollar — it is 60% of total spend, with new cached context (cache write) 25%, output 15%, and fresh input roughly 0%. The gap is the whole story: re-sent context is a minority of most sessions but the majority of total dollars, because the expensive long sessions are the ones where it dominates.
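The difference between the two weightings can be sketched with a handful of made-up sessions. The numbers below are hypothetical and chosen only to show the effect; they are not drawn from the 66-session set, and the function names are illustrative.

```python
from statistics import median

# Hypothetical sessions: (total cost in USD, re-sent-context cost in USD).
SESSIONS = [
    (2.00, 0.40),
    (4.00, 1.00),
    (5.00, 1.25),
    (8.00, 2.40),
    (400.00, 280.00),  # one long session that dominates total spend
]


def median_share(sessions):
    """Session-weighted view: every session counts once."""
    return median(resent / cost for cost, resent in sessions)


def pooled_share(sessions):
    """Dollar-weighted view: every dollar counts once."""
    total_cost = sum(cost for cost, _ in sessions)
    total_resent = sum(resent for _, resent in sessions)
    return total_resent / total_cost


print(f"median per-session share: {median_share(SESSIONS):.0%}")
print(f"pooled share:             {pooled_share(SESSIONS):.0%}")
```

Four of the five toy sessions have a re-sent share of 30% or less, yet the pooled share is far higher because the single large session contributes most of the dollars.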

Question 4

Is this a benchmark of all Claude Code users?

Accepted Answer

No. This is a single-user reference set: all 66 sessions come from one heavy user (tokenscope's own development), measured on 2026-05-29 with tokenscope's default pricing, filtered to cost > 0 and at least 3 model turns. It is not a census, not a representative survey, and not an average across users. It is an honest reference set that shows the shape of real sessions and gives you numbers to compare your own against. We plan to re-measure and grow it as people opt in to share their (aggregate, anonymous) numbers.

Question 5

How do I compare my own Claude Code session to this benchmark?

Accepted Answer

Run npx @wartzar-bee/tokenscope --share in your terminal. tokenscope reads your Claude Code logs locally (read-only, nothing uploaded) and prints your cost, your re-sent-context share, your cache efficiency, your peak context, and your turn count — the exact metrics tabulated on this page — so you can see which percentile your session lands in. The --share flag also produces a self-contained card (aggregate numbers only, no file paths or prompt content) that links back to this benchmark.
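Once you have your own numbers, placing them against the published percentiles is a simple lookup. The sketch below hard-codes the cost-per-session percentiles from this page and reports which band a value falls in. The function name and the band wording are illustrative; the sketch does not parse tokenscope's output, so you type your figure in by hand.

```python
from bisect import bisect_right

# Cost per session (USD) percentiles from this reference set.
COST_PERCENTILES = [(10, 1.63), (25, 2.10), (50, 4.08), (75, 8.31), (90, 11.42)]


def percentile_band(value, table):
    """Return the percentile band that `value` falls into."""
    cutoffs = [cutoff for _, cutoff in table]
    i = bisect_right(cutoffs, value)  # number of cutoffs at or below value
    if i == 0:
        return f"below p{table[0][0]}"
    if i == len(table):
        return f"at or above p{table[-1][0]}"
    return f"between p{table[i - 1][0]} and p{table[i][0]}"


print(percentile_band(5.00, COST_PERCENTILES))
```

The same function works for any other row of the percentile table if you swap in that row's five cutoffs.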

Question 6

What is a good cache efficiency for Claude Code?

Accepted Answer

In this reference set the median session ran at about 83% cache efficiency, with the middle half between 77% (p25) and 87% (p75) and the top tenth above 94%. Higher is generally better — it means more of your re-sent context is hitting the cheap cache-read path rather than being re-billed at full input price. Pooled across all sessions the figure is 98%, because the long sessions that dominate spend also keep the cache warm. Cache efficiency tells you how cheap your re-sends are; it does not tell you how much you are re-sending — for that, look at the re-sent-context share of spend.

Question 7

How big does the context window get in a real session?

Accepted Answer

In this reference set, peak context in a typical (median) session was about 44,904 tokens; the middle half ran from ~28,466 (p25) to ~69,741 (p75), and the top tenth peaked above ~86,124 tokens. Across the whole set, the heaviest session peaked at 999,541 tokens and the average peak context was 251,371 tokens. The long sessions reach far higher than the median, which is exactly why they cost the most: every token in context is re-sent on every following turn.

Question 8

Should I trust this single-user benchmark?

Accepted Answer

Trust the shape, verify the numbers against your own. The structural finding — cost concentrates in a few long sessions, and re-sent context dominates those — is a property of stateless models plus per-turn context re-send, and it will hold for most users. The specific percentiles are one heavy user's actual sessions on 2026-05-29 with tokenscope's default pricing; a lighter user, a different project mix, or different prices would shift them. The numbers are not fabricated and not adjusted; they are tokenscope's raw output. Run tokenscope on your own logs to see where you land.

Pooled metric	Value
Total cost	$2,650.90
Total model turns	4,339
Spend share — re-sent context (cache-read)	60%
Spend share — new cached context (cache-write)	25%
Spend share — output	15%
Spend share — fresh input	~0%
Pooled cache efficiency	98%
Peak context (heaviest session)	999,541 tok
Average peak context	251,371 tok
Sessions in set	n = 66 (single user)
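The pooled figures above can be turned into rough dollar amounts with a few lines of arithmetic. This is a sketch: the spend shares are published rounded to whole percentages, so the per-category dollar figures are approximate, and fresh input (~0%) is treated as zero here as a simplifying assumption. The variable names are illustrative.

```python
TOTAL_COST_USD = 2650.90
TOTAL_TURNS = 4339
SESSIONS_IN_SET = 66

# Spend shares from the pooled table (rounded to whole percentages).
SHARES = {
    "re-sent context (cache-read)": 0.60,
    "new cached context (cache-write)": 0.25,
    "output": 0.15,
    "fresh input": 0.00,  # published as ~0%; assumed zero here
}

# Approximate dollars per category.
dollars = {name: TOTAL_COST_USD * share for name, share in SHARES.items()}

mean_cost_per_session = TOTAL_COST_USD / SESSIONS_IN_SET
mean_cost_per_turn = TOTAL_COST_USD / TOTAL_TURNS

for name, amount in dollars.items():
    print(f"{name}: ~${amount:,.2f}")
print(f"mean cost per session: ${mean_cost_per_session:.2f}")
print(f"mean cost per turn:    ${mean_cost_per_turn:.2f}")
```

The mean cost per session comes out at roughly ten times the $4.08 median, which is another view of the same concentration: a few very expensive sessions pull the average far above the typical session.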

Claude Code cost benchmark: what a session really costs

Per-session percentiles

Metric	p10	p25	p50 (median)	p75	p90
Cost per session (USD)	$1.63	$2.10	$4.08	$8.31	$11.42
Re-sent (cached) context as % of spend	14%	19%	24%	31%	46%
Cache efficiency %	69%	77%	83%	87%	94%
Peak context (tokens)	24,153	28,466	44,904	69,741	86,124
Model turns per session	15	18	29	39	67