Claude Haiku 3.5 Pricing
The fastest and most affordable Claude model. Purpose-built for high-volume workloads where speed and cost matter most.
Input
$0.80
per MTok
Output
$4.00
per MTok
Cache Read
$0.08
per MTok (90% off)
Batch Input
$0.40
per MTok (50% off)
Haiku input tokens are 3.75x cheaper than Sonnet's and 18.75x cheaper than Opus's
The Haiku Cost Advantage
Claude Haiku 3.5 is the clear choice when cost is your primary constraint. At $0.80 per million input tokens, Haiku is 3.75x cheaper than Sonnet ($3.00) and 18.75x cheaper than Opus ($15.00). Output tokens follow a similar pattern: Haiku's $4.00/MTok is 3.75x cheaper than Sonnet's $15.00 and 18.75x cheaper than Opus's $75.00.
To put that in practical terms: a request that costs $0.0135 on Sonnet (1,500 input + 600 output tokens) costs only $0.0036 on Haiku. Over 100,000 daily requests, that difference compounds to roughly $29,700/month in savings compared to Sonnet, or $191,700/month compared to Opus.
Cost per request comparison
Based on a typical request: 1,500 input tokens + 600 output tokens
Haiku 3.5
$0.0036
per request
Sonnet 4
$0.0135
3.75x more
Opus 4
$0.0675
18.75x more
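The per-request and monthly figures above follow directly from the per-MTok prices; a minimal sketch that reproduces them:

```python
# Per-million-token prices (USD) from the tables on this page.
PRICES = {
    "haiku-3.5": {"input": 0.80, "output": 4.00},
    "sonnet-4": {"input": 3.00, "output": 15.00},
    "opus-4": {"input": 15.00, "output": 75.00},
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of a single request."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

def monthly_savings(cheap: float, expensive: float,
                    requests_per_day: int = 100_000, days: int = 30) -> float:
    """Monthly saving from running every request on the cheaper model."""
    return (expensive - cheap) * requests_per_day * days

# The typical request used in the comparison: 1,500 input + 600 output tokens.
haiku = request_cost("haiku-3.5", 1_500, 600)   # 0.0036
sonnet = request_cost("sonnet-4", 1_500, 600)   # 0.0135
opus = request_cost("opus-4", 1_500, 600)       # 0.0675

print(f"vs Sonnet: ${monthly_savings(haiku, sonnet):,.0f}/month")  # $29,700
print(f"vs Opus:   ${monthly_savings(haiku, opus):,.0f}/month")    # $191,700
```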
Ideal Use Cases for Haiku
Classification & routing
Sort incoming requests, tag support tickets, categorise documents. Binary and multi-class classification is Haiku's sweet spot.
Entity extraction
Pull names, dates, addresses, and structured data from unstructured text. Haiku handles well-defined extraction schemas reliably.
Content moderation
Flag inappropriate content, detect spam, enforce community guidelines. Fast and cheap at scale.
Translation
Translate between languages for standard business content. Haiku handles common language pairs well at a fraction of Sonnet's cost.
Data formatting
Convert between formats (JSON, CSV, XML), clean and normalise data, transform schemas. Haiku follows formatting rules precisely.
Webhook processing
Parse incoming webhook payloads, extract relevant fields, trigger downstream actions. Sub-second latency keeps pipelines fast.
Haiku's Limitations: An Honest Assessment
Haiku is not a cheaper version of Sonnet. It is a fundamentally smaller model that trades capability for speed and cost. Being transparent about where Haiku falls short helps you avoid deploying it on tasks where it will underperform.
- Complex reasoning: Multi-step logic problems, mathematical proofs, and tasks requiring chains of dependent reasoning see noticeable quality drops compared to Sonnet.
- Nuanced writing: Long-form content, persuasive copy, and creative writing lack the polish and sophistication of Sonnet and Opus output.
- Ambiguous instructions: Haiku follows explicit instructions well but struggles more than Sonnet when prompts are vague or require interpretation.
- Code generation: While Haiku can generate simple code snippets, complex functions, debugging, and architectural decisions benefit significantly from Sonnet or Opus.
- Context utilisation: Although Haiku has a 200K context window, it does not leverage long context as effectively as Sonnet for tasks like document analysis.
- Edge cases: In classification and extraction, Haiku handles the 90th percentile well but may miss subtle edge cases that Sonnet catches.
The key takeaway: always test Haiku on your actual data before switching from Sonnet. Run a representative sample through both models and compare outputs. If Haiku's quality is acceptable for your use case, you will save 73% or more.
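One lightweight way to run that comparison: collect both models' outputs on the same sample and measure the agreement rate against your quality bar. The sample data below is hypothetical; for classification and extraction an exact-match check is usually sufficient, while free-form text needs a task-specific comparison.

```python
def agreement_rate(haiku_outputs: list[str], sonnet_outputs: list[str]) -> float:
    """Fraction of paired sample requests where Haiku's output matches
    Sonnet's (treating Sonnet as the reference)."""
    if len(haiku_outputs) != len(sonnet_outputs):
        raise ValueError("samples must be paired, one output per request")
    matches = sum(h.strip().lower() == s.strip().lower()
                  for h, s in zip(haiku_outputs, sonnet_outputs))
    return matches / len(haiku_outputs)

# Hypothetical ticket-routing labels from a 5-request sample.
haiku_labels = ["billing", "tech", "billing", "spam", "tech"]
sonnet_labels = ["billing", "tech", "refund", "spam", "tech"]
rate = agreement_rate(haiku_labels, sonnet_labels)  # 0.8
```

An 80% agreement rate would fall short of a 95% quality threshold, so this sample would argue for keeping Sonnet or refining the Haiku prompt before migrating.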
High-Volume Cost Modelling
What Haiku costs at scale, assuming an average request of 1,000 input + 200 output tokens.
| Volume | Haiku Standard | Haiku Batch |
|---|---|---|
| 10K calls/day | $480.00 | $240.00 |
| 100K calls/day | $4,800.00 | $2,400.00 |
| 1M calls/day | $48,000.00 | $24,000.00 |
Monthly costs based on 30 days. Batch pricing assumes all requests can tolerate the 24-hour processing window.
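The table's monthly figures can be reproduced from the per-MTok prices; a sketch, using the same request profile of 1,000 input + 200 output tokens:

```python
def monthly_haiku_cost(calls_per_day: int, in_tokens: int = 1_000,
                       out_tokens: int = 200, batch: bool = False,
                       days: int = 30) -> float:
    """Monthly Haiku 3.5 cost in USD at a given daily call volume."""
    # Batch API prices are 50% off standard ($/MTok).
    in_price, out_price = (0.40, 2.00) if batch else (0.80, 4.00)
    per_call = (in_tokens * in_price + out_tokens * out_price) / 1_000_000
    return per_call * calls_per_day * days

monthly_haiku_cost(10_000)                  # 480.0
monthly_haiku_cost(100_000)                 # 4800.0
monthly_haiku_cost(1_000_000, batch=True)   # 24000.0
```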
Haiku + Batch API: The Cheapest Claude Combo
Combining Haiku with the Batch API gives you the absolute lowest per-token cost in the Claude ecosystem: $0.40/MTok input and $2.00/MTok output. That is 7.5x cheaper than standard Sonnet and 37.5x cheaper than standard Opus for input tokens.
Haiku Batch pricing breakdown
Batch Input
$0.40
per MTok
Batch Output
$2.00
per MTok
Process 1 million classification requests (500 input + 50 output tokens each) for just $300.00
The Batch API is ideal for any Haiku workload where you do not need instant responses: nightly data processing jobs, weekly classification runs, bulk content moderation queues, and dataset labelling projects. You submit a JSONL file, and Anthropic processes all requests within a 24-hour window.
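A sketch of preparing such a file for a bulk classification run. The `custom_id`/`params` request shape follows the Message Batches API, but the exact field names and the model ID used here are assumptions to verify against the current API reference before use.

```python
import json

def build_batch_line(custom_id: str, ticket_text: str) -> str:
    """One JSONL line for a classification request (field names assumed
    from the Message Batches request shape; verify against the docs)."""
    return json.dumps({
        "custom_id": custom_id,
        "params": {
            "model": "claude-3-5-haiku-latest",  # assumed model ID
            "max_tokens": 50,
            "messages": [{
                "role": "user",
                "content": f"Classify this support ticket: {ticket_text}",
            }],
        },
    })

# Write one request per line, then submit the file as a batch.
tickets = ["Refund please", "App crashes on login"]
with open("batch_requests.jsonl", "w") as f:
    for i, ticket in enumerate(tickets):
        f.write(build_batch_line(f"ticket-{i}", ticket) + "\n")
```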
Switching from Sonnet? Here's What You Save
If you are currently running a workload on Sonnet 4 and suspect Haiku 3.5 could handle it, here is a straightforward migration approach and the savings you can expect.
| Metric | Sonnet 4 | Haiku 3.5 | Saving |
|---|---|---|---|
| Input price | $3.00/MTok | $0.80/MTok | 73% |
| Output price | $15.00/MTok | $4.00/MTok | 73% |
| 1K requests (1K in + 500 out) | $10.50 | $2.80 | 73% |
| 100K requests/month (2K in + 800 out) | $1,800.00 | $480.00 | 73% |
Migration steps
1. Identify candidate tasks: Start with classification, routing, extraction, and formatting tasks. These are Haiku's strengths.
2. Run a quality comparison: Send 100-500 representative requests to both Sonnet and Haiku. Compare outputs manually or with automated evaluation.
3. Set a quality threshold: Define what "acceptable" means for your use case. If Haiku meets the bar on 95%+ of test cases, it is a good candidate for migration.
4. Switch gradually: Route 10% of traffic to Haiku initially. Monitor quality metrics and error rates. Increase the percentage as confidence grows.
5. Keep Sonnet as fallback: For requests that fail quality checks on Haiku, fall back to Sonnet. This "try cheap first" pattern captures most of the savings.
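The gradual rollout and fallback steps can be sketched as a simple router. Here `call_haiku`, `call_sonnet`, and `passes_quality_check` are hypothetical stand-ins for your own model calls and quality checks:

```python
import random

def route_request(prompt, call_haiku, call_sonnet, passes_quality_check,
                  haiku_share: float = 0.10):
    """'Try cheap first': send a slice of traffic to Haiku and fall back
    to Sonnet when the output fails the quality check.

    Returns (result, model_used). Start haiku_share at 0.10 and raise it
    as quality metrics hold up.
    """
    if random.random() < haiku_share:
        result = call_haiku(prompt)
        if passes_quality_check(result):
            return result, "haiku"
        # Quality check failed: pay for Sonnet on this request only.
    return call_sonnet(prompt), "sonnet"
```

Routing on a random draw keeps the rollout stateless; hashing a stable request or user ID instead gives the same traffic split with consistent per-user behaviour.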
Haiku 3.5 vs GPT-4o-mini
The two leading budget-tier LLM APIs compared on price and capability.
| Dimension | Haiku 3.5 | GPT-4o-mini |
|---|---|---|
| Input price | $0.80/MTok | $0.15/MTok |
| Output price | $4.00/MTok | $0.60/MTok |
| Price advantage | - | 5.3x cheaper input, 6.7x cheaper output |
| Context window | 200K tokens | 128K tokens |
| Classification accuracy | Strong, handles edge cases well | Good, may miss subtle cases |
| Extraction reliability | Very reliable with structured schemas | Reliable for common patterns |
| Instruction following | Excellent with explicit prompts | Good, occasionally drifts |
| Speed | Very fast | Very fast |
| Batch API | 50% off ($0.40/$2.00) | 50% off ($0.075/$0.30) |
| Best for | Higher quality at moderate cost | Maximum cost savings at acceptable quality |
Bottom line: GPT-4o-mini is significantly cheaper per token. If your task is straightforward and quality requirements are moderate, 4o-mini is the more economical choice. If you need Anthropic's safety features, larger context window, or better edge-case handling, Haiku justifies its premium. Test both on your actual workload.