Updated 31 March 2026
Claude Token Calculator
Paste any text below for an instant token count estimate and a cost breakdown for processing it through each Claude model. No sign-up required.
Token Estimation Rules of Thumb
Claude uses a byte-pair encoding tokenizer. While exact token counts depend on the specific text, these rules give you reliable estimates for planning and budgeting. These approximations hold well for standard English prose. Code, structured data, and non-Latin scripts may tokenize differently.
- ~4 characters per token — English prose average
- ~0.75 words per token — or ~1.33 tokens per word
- ~500 tokens per page — standard A4 page of text
- ~1,333 tokens per 1,000 words — common billing benchmark
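The two prose rules above can be wrapped in a tiny estimator for quick sanity checks. This is a heuristic sketch based on the rules of thumb on this page, not Claude's actual tokenizer:

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate for English prose, averaging the
    ~4 chars/token and ~1.33 tokens/word rules of thumb."""
    by_chars = len(text) / 4              # ~4 characters per token
    by_words = len(text.split()) * 1.33   # ~1.33 tokens per word
    return round((by_chars + by_words) / 2)
```

For a 1,000-word document this lands near the ~1,333-token billing benchmark; for real billing, use the token counts returned by the API.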
How tokenization varies by content type
English prose (~4 chars/token)
Standard written English tokenizes efficiently. Common words like “the”, “and”, “is” are often single tokens. Longer or uncommon words may be split into 2-3 tokens. A 2,000-word blog post is typically around 2,700 tokens.
Source code (~3 chars/token)
Code uses more tokens per character due to brackets, operators, indentation, and short variable names. A 200-line Python file might be 2,000-3,000 tokens. JSON and XML are particularly token-heavy because of structural characters and repeated keys.
Non-Latin scripts (~2-3 chars/token)
Chinese, Japanese, Korean, Arabic, and other non-Latin scripts use more tokens per character. A single Chinese character often maps to 1-2 tokens. Plan for 2-3x the token count compared to an English translation of the same content.
Structured data (~2-3 chars/token)
JSON, XML, CSV, and similar formats have significant overhead from delimiters, keys, and whitespace. A 1 KB JSON payload might use 350-500 tokens. Minifying JSON or using compact formats can reduce token counts by 20-30%.
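The per-content-type densities above can be folded into a single lookup. The chars-per-token values below are the approximations from this page, not measured tokenizer output:

```python
# Approximate characters per token by content type (rules of thumb above).
CHARS_PER_TOKEN = {
    "prose": 4.0,       # standard English text
    "code": 3.0,        # brackets, operators, short identifiers
    "non_latin": 2.5,   # midpoint of the 2-3 chars/token range
    "structured": 2.5,  # JSON/XML/CSV delimiter and key overhead
}

def estimate_tokens_by_type(text: str, content_type: str = "prose") -> int:
    """Heuristic token estimate adjusted for content density."""
    return round(len(text) / CHARS_PER_TOKEN[content_type])

# The same 1 KB payload estimates higher as JSON than as prose:
payload = "x" * 1024
estimate_tokens_by_type(payload, "prose")       # ~256 tokens
estimate_tokens_by_type(payload, "structured")  # ~410 tokens
```

Note that the structured-data estimate (~410 tokens for 1 KB) falls inside the 350-500 range quoted above.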
Common Document Sizes
Reference table showing typical token counts and Claude API costs for common document types. Use these benchmarks to estimate costs before you start building. All costs shown are per-document for a single API call.
| Document Type | Est. Tokens | Haiku 3.5 | Sonnet 4 | Opus 4 |
|---|---|---|---|---|
| 💬Tweet / short message | 50 | <$0.0001 | $0.0002 | $0.0008 |
| 📧Email | 200 | $0.0002 | $0.0006 | $0.0030 |
| 🏷️Product description | 500 | $0.0004 | $0.0015 | $0.0075 |
| 📝Blog post (1,500 words) | 2,000 | $0.0016 | $0.0060 | $0.0300 |
| 📄Research paper | 8,000 | $0.0064 | $0.0240 | $0.1200 |
| ⚖️Legal contract | 15,000 | $0.0120 | $0.0450 | $0.2250 |
| 📚Technical manual | 50,000 | $0.0400 | $0.1500 | $0.7500 |
| 📖Full novel | 100,000 | $0.0800 | $0.3000 | $1.50 |
Costs shown are for input tokens only. Output token costs are 5x higher for all models. For a full request cost, add the output token estimate for the response you expect.
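The table values can be reproduced with the per-million-token input rates its rows imply ($0.80 for Haiku 3.5, $3 for Sonnet 4, $15 for Opus 4), with output billed at 5x. A minimal sketch, assuming those rates:

```python
# Input price per million tokens, as implied by the table rows above.
INPUT_PRICE_PER_MTOK = {"haiku-3.5": 0.80, "sonnet-4": 3.00, "opus-4": 15.00}

def input_cost(tokens: int, model: str) -> float:
    """Input-only cost in USD for a single call (as in the table)."""
    return tokens * INPUT_PRICE_PER_MTOK[model] / 1_000_000

def request_cost(input_tokens: int, output_tokens: int, model: str) -> float:
    """Full request cost: output tokens bill at 5x the input rate."""
    rate = INPUT_PRICE_PER_MTOK[model]
    return (input_tokens * rate + output_tokens * rate * 5) / 1_000_000

input_cost(2_000, "sonnet-4")           # $0.006 -- matches the blog post row
request_cost(8_000, 1_000, "sonnet-4")  # $0.024 input + $0.015 output
```

Check current pricing before budgeting; rates change and prompt caching or batch discounts alter the math.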
Practical Token Budgeting Examples
Understanding token counts matters because it directly affects your API spend. Here are three real scenarios that show how token awareness can save you money.
A support chatbot with a 1,500-token system prompt, 500-token user message, and 300-token response, running 5,000 conversations per day on Haiku 3.5: roughly $420/mo at standard rates.
With prompt caching on the system prompt: ~$252/mo (40% saving)
Processing 500 legal documents per day, each averaging 15,000 input tokens with a 1,000-token summary, on Sonnet 4 for quality: roughly $900/mo at standard rates.
With the Batch API (non-urgent): ~$450/mo (50% saving)
An automated code review tool processing 200 pull requests per day, each averaging 4,000 tokens of diff with a 2,000-token review, on Opus 4 for deep analysis: roughly $1,260/mo at standard rates.
Switching to Sonnet 4 for routine reviews: ~$252/mo (80% saving)
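All three scenarios reduce to the same arithmetic. A sketch, assuming a 30-day month and the input rates from the table above (output billed at 5x):

```python
PRICE_PER_MTOK = {"haiku-3.5": 0.80, "sonnet-4": 3.00, "opus-4": 15.00}

def monthly_cost(requests_per_day: int, input_tokens: int,
                 output_tokens: int, model: str, days: int = 30) -> float:
    """Monthly USD spend at standard (non-cached, non-batch) rates."""
    rate = PRICE_PER_MTOK[model]
    daily = requests_per_day * (input_tokens * rate
                                + output_tokens * rate * 5) / 1_000_000
    return daily * days

monthly_cost(5_000, 2_000, 300, "haiku-3.5")   # ~$420/mo chatbot baseline
monthly_cost(500, 15_000, 1_000, "sonnet-4")   # ~$900/mo legal pipeline
monthly_cost(200, 4_000, 2_000, "opus-4")      # ~$1,260/mo code review
```

Running the code-review workload through the same function with "sonnet-4" gives ~$252/mo, which is the switch described in the last scenario.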