This site is independently operated and is not affiliated with Anthropic. Verify pricing on Anthropic's official website.
Budget

Claude Haiku 3.5 Pricing

The fastest and most affordable Claude model. Purpose-built for high-volume workloads where speed and cost matter most.

Input

$0.80

per MTok

Output

$4.00

per MTok

Cache Read

$0.08

per MTok (90% off)

Batch Input

$0.40

per MTok (50% off)

Haiku input tokens cost 3.75x less than Sonnet and 18.75x less than Opus

The Haiku Cost Advantage

Claude Haiku 3.5 is the clear choice when cost is your primary constraint. At $0.80 per million input tokens, Haiku is 3.75x cheaper than Sonnet ($3.00) and 18.75x cheaper than Opus ($15.00). Output tokens follow a similar pattern: Haiku's $4.00/MTok is 3.75x cheaper than Sonnet's $15.00 and 18.75x cheaper than Opus's $75.00.

To put that in practical terms: a request that costs $0.0135 on Sonnet (1,500 input + 600 output tokens) costs only $0.0036 on Haiku. Over 100,000 daily requests, that difference compounds to roughly $29,700/month in savings compared to Sonnet, or $133,650/month compared to Opus.

Cost per request comparison

Based on a typical request: 1,500 input tokens + 600 output tokens

Haiku 3.5

$0.0036

per request

Sonnet 4

$0.0135

3.75x more

Opus 4

$0.0675

18.75x more

Ideal Use Cases for Haiku

Classification & routing

Sort incoming requests, tag support tickets, categorise documents. Binary and multi-class classification is Haiku's sweet spot.

Entity extraction

Pull names, dates, addresses, and structured data from unstructured text. Haiku handles well-defined extraction schemas reliably.

Content moderation

Flag inappropriate content, detect spam, enforce community guidelines. Fast and cheap at scale.

Translation

Translate between languages for standard business content. Haiku handles common language pairs well at a fraction of Sonnet's cost.

Data formatting

Convert between formats (JSON, CSV, XML), clean and normalise data, transform schemas. Haiku follows formatting rules precisely.

Webhook processing

Parse incoming webhook payloads, extract relevant fields, trigger downstream actions. Sub-second latency keeps pipelines fast.

Haiku's Limitations: An Honest Assessment

Haiku is not a cheaper version of Sonnet. It is a fundamentally smaller model that trades capability for speed and cost. Being transparent about where Haiku falls short helps you avoid deploying it on tasks where it will underperform.

  • !Complex reasoning: Multi-step logic problems, mathematical proofs, and tasks requiring chains of dependent reasoning see noticeable quality drops compared to Sonnet.
  • !Nuanced writing: Long-form content, persuasive copy, and creative writing lack the polish and sophistication of Sonnet and Opus output.
  • !Ambiguous instructions: Haiku follows explicit instructions well but struggles more than Sonnet when prompts are vague or require interpretation.
  • !Code generation: While Haiku can generate simple code snippets, complex functions, debugging, and architectural decisions benefit significantly from Sonnet or Opus.
  • !Context utilisation: Although Haiku has a 200K context window, it does not leverage long context as effectively as Sonnet for tasks like document analysis.
  • !Edge cases: In classification and extraction, Haiku handles the 90th percentile well but may miss subtle edge cases that Sonnet catches.

The key takeaway: always test Haiku on your actual data before switching from Sonnet. Run a representative sample through both models and compare outputs. If Haiku's quality is acceptable for your use case, you will save 73% or more.

High-Volume Cost Modelling

What Haiku costs at scale, assuming an average request of 1,000 input + 200 output tokens.

VolumeHaiku StandardHaiku Batch
10K calls/day$480.00$240.00
100K calls/day$4,800.00$2,400.00
1M calls/day$48,000.00$24,000.00

Monthly costs based on 30 days. Batch pricing assumes all requests can tolerate the 24-hour processing window.

Haiku + Batch API: The Cheapest Claude Combo

Combining Haiku with the Batch API gives you the absolute lowest per-token cost in the Claude ecosystem: $0.40/MTok input and $2.00/MTok output. That is 7.5x cheaper than standard Sonnet and 37.5x cheaper than standard Opus for input tokens.

Haiku Batch pricing breakdown

Batch Input

$0.40

per MTok

Batch Output

$2.00

per MTok

Process 1 million classification requests (500 input + 50 output tokens each) for just $300.00

The Batch API is ideal for any Haiku workload where you do not need instant responses: nightly data processing jobs, weekly classification runs, bulk content moderation queues, and dataset labelling projects. You submit a JSONL file, and Anthropic processes all requests within a 24-hour window.

Switching from Sonnet? Here's What You Save

If you are currently running a workload on Sonnet 4 and suspect Haiku 3.5 could handle it, here is a straightforward migration approach and the savings you can expect.

MetricSonnet 4Haiku 3.5Saving
Input price$3.00/MTok$0.80/MTok73%
Output price$15.00/MTok$4.00/MTok73%
1K requests (1K in + 500 out)$10.50$2.8073%
100K requests/month (2K in + 800 out)$1,799.82$479.9573%

Migration steps

  1. 1Identify candidate tasks: Start with classification, routing, extraction, and formatting tasks. These are Haiku's strengths.
  2. 2Run a quality comparison: Send 100-500 representative requests to both Sonnet and Haiku. Compare outputs manually or with automated evaluation.
  3. 3Set a quality threshold: Define what "acceptable" means for your use case. If Haiku meets the bar on 95%+ of test cases, it is a good candidate for migration.
  4. 4Switch gradually: Route 10% of traffic to Haiku initially. Monitor quality metrics and error rates. Increase the percentage as confidence grows.
  5. 5Keep Sonnet as fallback: For requests that fail quality checks on Haiku, fall back to Sonnet. This "try cheap first" pattern captures most of the savings.

Haiku 3.5 vs GPT-4o-mini

The two leading budget-tier LLM APIs compared on price and capability.

DimensionHaiku 3.5GPT-4o-mini
Input price$0.80/MTok$0.15/MTok
Output price$4.00/MTok$0.60/MTok
Price advantage-5.3x cheaper input
Context window200K tokens128K tokens
Classification accuracyStrong, handles edge cases wellGood, may miss subtle cases
Extraction reliabilityVery reliable with structured schemasReliable for common patterns
Instruction followingExcellent with explicit promptsGood, occasionally drifts
SpeedVery fastVery fast
Batch API50% off ($0.40/$2.00)50% off ($0.075/$0.30)
Best forHigher quality at moderate costMaximum cost savings at acceptable quality

Bottom line: GPT-4o-mini is significantly cheaper per token. If your task is straightforward and quality requirements are moderate, 4o-mini is the more economical choice. If you need Anthropic's safety features, larger context window, or better edge-case handling, Haiku justifies its premium. Test both on your actual workload.

Haiku 3.5 Pricing FAQ

How much does Claude Haiku 3.5 cost per request?
A typical request with 1,000 input tokens and 200 output tokens costs $0.0016 (about 0.16 cents). At high volume (100,000 requests/day), that adds up to about $4,800/month for standard pricing or $2,400/month with the Batch API.
Is Haiku 3.5 good enough for production?
Yes, for the right tasks. Haiku excels at classification, entity extraction, content moderation, routing, data formatting, and simple Q&A. It struggles with complex multi-step reasoning, nuanced creative writing, and tasks requiring deep analysis. Always test Haiku on your specific use case before committing.
What is the cheapest way to use the Claude API?
Haiku 3.5 + Batch API is the cheapest combination: $0.40/MTok input and $2.00/MTok output. Add prompt caching for cached input at $0.08/MTok. For a workload with 70% cacheable input using batch processing, effective input cost drops to approximately $0.14/MTok.
How does Haiku 3.5 compare to GPT-4o-mini?
GPT-4o-mini ($0.15/$0.60) is cheaper per token than Haiku 3.5 ($0.80/$4.00). However, Haiku tends to handle edge cases better in classification and extraction tasks. The right choice depends on your quality requirements. If GPT-4o-mini's quality is sufficient, it is the cheaper option. If you need higher reliability, Haiku may justify the premium.
Can I switch from Sonnet to Haiku to save money?
Yes, but test first. Switching saves approximately 73% on both input and output costs. Run your current Sonnet prompts through Haiku and evaluate the output quality. For classification, routing, and extraction tasks, Haiku usually matches Sonnet. For content generation, code review, and complex analysis, quality will drop noticeably.