LLM Cost Estimator

Stop guessing your API bills. Compare monthly costs across major models.

Usage Parameters

Input Tokens / Req: 100,000

Output Tokens / Req: 20,000

Requests per Day: 100

Monthly Volume

360.0M tokens
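The monthly volume follows directly from the three parameters above. The sketch below reproduces the arithmetic, assuming a 30-day month (an inference from the 360.0M figure, not something the page states); the function name `monthly_tokens` is made up for illustration.

```python
DAYS_PER_MONTH = 30  # assumption: inferred from the 360.0M total shown above


def monthly_tokens(input_per_req, output_per_req, requests_per_day,
                   days=DAYS_PER_MONTH):
    """Return (monthly input tokens, monthly output tokens)."""
    monthly_requests = requests_per_day * days
    return input_per_req * monthly_requests, output_per_req * monthly_requests


monthly_in, monthly_out = monthly_tokens(100_000, 20_000, 100)
total_millions = (monthly_in + monthly_out) / 1_000_000
print(f"{total_millions:.1f}M tokens")
```

Keeping input and output totals separate matters because the two are billed at different rates.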

Monthly Cost Comparison

Claude 3 Haiku (Anthropic)

$150.00

$0.25/M in • $1.25/M out • Budget choice.

GPT-3.5 Turbo (OpenAI)

$240.00

$0.5/M in • $1.5/M out • Fast & cheap.

Gemini 1.5 Pro (Google)

$1,680.00

$3.5/M in • $10.5/M out • Long context.

Claude 3.5 Sonnet (Anthropic)

$1,800.00

$3/M in • $15/M out • Balanced.

GPT-4o (OpenAI)

$2,400.00

$5/M in • $15/M out • Flagship model.

GPT-4 Turbo (OpenAI)

$4,800.00

$10/M in • $30/M out • Previous flagship.

Claude 3 Opus (Anthropic)

$9,000.00

$15/M in • $75/M out • High intelligence.
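Each monthly figure above is the monthly input volume times the input rate plus the monthly output volume times the output rate. A minimal sketch of that calculation follows, using the prices exactly as listed on this page and assuming a 30-day month; `monthly_cost`, `PRICES`, and `USAGE` are illustrative names, not part of any provider's API.

```python
# (input $/M tokens, output $/M tokens), copied from the cards above
PRICES = {
    "Claude 3 Haiku": (0.25, 1.25),
    "GPT-3.5 Turbo": (0.50, 1.50),
    "Gemini 1.5 Pro": (3.50, 10.50),
    "Claude 3.5 Sonnet": (3.00, 15.00),
    "GPT-4o": (5.00, 15.00),
    "GPT-4 Turbo": (10.00, 30.00),
    "Claude 3 Opus": (15.00, 75.00),
}


def monthly_cost(price_in, price_out, input_per_req, output_per_req,
                 requests_per_day, days=30):
    """Monthly cost in dollars; `days=30` is an assumption."""
    monthly_requests = requests_per_day * days
    input_millions = input_per_req * monthly_requests / 1_000_000
    output_millions = output_per_req * monthly_requests / 1_000_000
    return input_millions * price_in + output_millions * price_out


USAGE = dict(input_per_req=100_000, output_per_req=20_000, requests_per_day=100)
costs = {model: monthly_cost(p_in, p_out, **USAGE)
         for model, (p_in, p_out) in PRICES.items()}

# Print cheapest first, matching the order of the cards above.
for model, cost in sorted(costs.items(), key=lambda item: item[1]):
    print(f"{model:<18} ${cost:>9,.2f}")
```

Changing the values in `USAGE` shows how the ranking shifts for workloads with a different input-to-output mix.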

Frequently Asked Questions

Why are output tokens more expensive?

Generating text is more compute-intensive than reading it. Input tokens can be processed in parallel in a single pass, while output tokens are generated sequentially, each one depending on all previous tokens. That is why the models listed above price output at 3-5x the input rate.
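The 3-5x range can be checked against the prices listed on this page. The sketch below computes the output-to-input price ratio per model; the figures are copied from the comparison above, and the names `PRICES` and `output_to_input_ratio` are illustrative.

```python
# (input $/M tokens, output $/M tokens), as listed in the comparison above
PRICES = {
    "Claude 3 Haiku": (0.25, 1.25),
    "GPT-3.5 Turbo": (0.50, 1.50),
    "Gemini 1.5 Pro": (3.50, 10.50),
    "Claude 3.5 Sonnet": (3.00, 15.00),
    "GPT-4o": (5.00, 15.00),
    "GPT-4 Turbo": (10.00, 30.00),
    "Claude 3 Opus": (15.00, 75.00),
}


def output_to_input_ratio(price_in, price_out):
    """How many times more an output token costs than an input token."""
    return price_out / price_in


ratios = {model: output_to_input_ratio(p_in, p_out)
          for model, (p_in, p_out) in PRICES.items()}

for model, ratio in ratios.items():
    print(f"{model:<18} output costs {ratio:.0f}x input")
```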

How do GPT-4o and Claude 3.5 Sonnet compare on cost?

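GPT-4o: $5/1M input, $15/1M output. Claude 3.5 Sonnet: $3/1M input, $15/1M output. Output pricing is identical, so the gap comes entirely from input, where Claude 3.5 Sonnet costs 40% less. For input-heavy workloads (RAG), that makes Claude noticeably cheaper: $1,800 vs $2,400 per month at the usage above. For output-heavy workloads, the two converge.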
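How much cheaper Claude 3.5 Sonnet is depends on what fraction of your tokens are input. The following sketch computes a blended price per million tokens for a given input share, using only the two price pairs quoted in this answer; `blended_price` and `sonnet_savings` are hypothetical helper names.

```python
def blended_price(price_in, price_out, input_share):
    """Dollars per million tokens when `input_share` of all tokens are input."""
    return input_share * price_in + (1 - input_share) * price_out


def sonnet_savings(input_share):
    """Fraction saved by using Claude 3.5 Sonnet instead of GPT-4o."""
    gpt4o = blended_price(5.0, 15.0, input_share)
    sonnet = blended_price(3.0, 15.0, input_share)
    return (gpt4o - sonnet) / gpt4o


# Output-heavy, balanced, and this page's example (100,000 in / 20,000 out).
for share in (0.1, 0.5, 100_000 / 120_000):
    print(f"input share {share:.0%}: "
          f"Claude 3.5 Sonnet is {sonnet_savings(share):.1%} cheaper")
```

The savings shrink toward zero as the workload becomes output-dominated, since both models charge the same output rate.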

How can I reduce API costs?

1. Use cheaper models for simple tasks (GPT-3.5 Turbo, Claude 3 Haiku).
2. Cache common responses.
3. Shorten your prompts.
4. Set a lower max_tokens limit.
5. Batch requests where possible.
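As one illustration of the caching tip, the sketch below wraps a model call in an in-memory cache keyed by a hash of the prompt, so repeated prompts are served without a second API call. Everything here is hypothetical: `ResponseCache` is an illustrative class, and `fake_model` is a stand-in for whatever client function you actually use. A production setup would likely add expiry and a shared store.

```python
import hashlib
from typing import Callable, Dict


class ResponseCache:
    """Exact-match cache: identical prompts reuse the stored response."""

    def __init__(self, call_model: Callable[[str], str]):
        self._call_model = call_model      # the real (billable) model call
        self._store: Dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    def complete(self, prompt: str) -> str:
        # Hash the trimmed prompt so the key has a fixed size.
        key = hashlib.sha256(prompt.strip().encode("utf-8")).hexdigest()
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        response = self._call_model(prompt)
        self._store[key] = response
        return response


def fake_model(prompt: str) -> str:
    """Stand-in for a real API call (assumption, for demonstration only)."""
    return f"echo: {prompt}"


cache = ResponseCache(fake_model)
cache.complete("What is your refund policy?")
cache.complete("What is your refund policy?")  # served from the cache
print(f"hits={cache.hits} misses={cache.misses}")
```

Exact-match caching only helps when prompts repeat verbatim, which is common for FAQ-style traffic and rare for free-form chat.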

Are there free LLM options?

Yes. Open-source models like Llama 3, Mistral, and Gemma can be self-hosted with no per-token fees, though you still pay for the compute to run them. Services like Groq offer generous free tiers for open models.