LLM Cost Estimator
Stop guessing API bills. Compare monthly costs across all major models.
Usage Parameters
Monthly Cost Comparison
$0.25/M in • $1.25/M out • Budget choice.
$0.50/M in • $1.50/M out • Fast & cheap.
$3.50/M in • $10.50/M out • Long context.
$3/M in • $15/M out • Balanced.
$5/M in • $15/M out • Flagship model.
$10/M in • $30/M out • Previous flagship.
$15/M in • $75/M out • High intelligence.
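Each figure in the comparison comes from the same arithmetic: monthly tokens multiplied by the per-million rate, summed over input and output. Here is a minimal Python sketch of that calculation, using the rates listed above keyed by their taglines. The function name `monthly_cost` and the workload (10,000 requests a month, 1,000 input and 500 output tokens each) are assumptions for illustration, not values from the estimator.

```python
# Rates in USD per million tokens (input, output), copied from the list above.
RATES = {
    "Budget choice": (0.25, 1.25),
    "Fast & cheap": (0.50, 1.50),
    "Long context": (3.50, 10.50),
    "Balanced": (3.00, 15.00),
    "Flagship model": (5.00, 15.00),
    "Previous flagship": (10.00, 30.00),
    "High intelligence": (15.00, 75.00),
}


def monthly_cost(requests, in_tokens, out_tokens, in_rate, out_rate):
    """Estimated monthly bill in USD for a fixed per-request token profile."""
    total_in = requests * in_tokens
    total_out = requests * out_tokens
    return (total_in * in_rate + total_out * out_rate) / 1_000_000


# Hypothetical workload: 10,000 requests/month, 1,000 tokens in, 500 out.
for label, (in_rate, out_rate) in RATES.items():
    cost = monthly_cost(10_000, 1_000, 500, in_rate, out_rate)
    print(f"{label:<18} ${cost:,.2f}")
```

With this workload the cheapest tier comes to $8.75 a month and the most expensive to $525.00, a 60x spread for identical traffic.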
Frequently Asked Questions
Why are output tokens more expensive?
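Generating text is more compute-intensive than reading it. Input tokens can be processed in parallel in a single pass, while output tokens are generated one at a time, each depending on all previous ones. Providers price this in: the models listed above charge 3 to 5 times more per output token than per input token.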
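How do GPT-4o and Claude 3.5 Sonnet compare on cost?
GPT-4o: $5/M input, $15/M output. Claude 3.5 Sonnet: $3/M input, $15/M output. Output rates are identical, so the difference comes entirely from input: Claude's input rate is 40% lower. For input-heavy workloads such as RAG, that makes Claude noticeably cheaper. For balanced workloads the gap narrows: at one million tokens each of input and output, Claude costs $18 against $20 for GPT-4o.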
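To see how the gap depends on the input/output mix, this short Python sketch computes the fraction by which Claude 3.5 Sonnet undercuts GPT-4o at the rates quoted above. The helper names and the two token mixes are hypothetical, chosen only to illustrate an input-heavy and a balanced workload.

```python
# USD per million tokens (input, output), as quoted in the answer above.
GPT_4O = (5.00, 15.00)
CLAUDE_35_SONNET = (3.00, 15.00)


def cost(in_millions, out_millions, rates):
    """Cost in USD for the given millions of input and output tokens."""
    in_rate, out_rate = rates
    return in_millions * in_rate + out_millions * out_rate


def claude_saving(in_millions, out_millions):
    """Fraction by which Claude 3.5 Sonnet undercuts GPT-4o for this mix."""
    gpt = cost(in_millions, out_millions, GPT_4O)
    claude = cost(in_millions, out_millions, CLAUDE_35_SONNET)
    return (gpt - claude) / gpt


# Hypothetical mixes: RAG-style (10M in, 1M out) and balanced (1M in, 1M out).
print(f"RAG-style: {claude_saving(10, 1):.0%}")
print(f"Balanced:  {claude_saving(1, 1):.0%}")
```

The saving approaches 40% as the workload becomes almost entirely input, and drops to zero for a workload that is all output.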
How can I reduce API costs?
1. Route simple tasks to cheaper models (GPT-3.5, Haiku). 2. Cache responses to common requests. 3. Shorten prompts. 4. Set a lower max_tokens to cap output length. 5. Batch requests where possible.
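As one illustration of tip 2, the sketch below keeps an in-memory cache keyed by a hash of the normalized prompt, so a repeated question is billed only once. Everything here is an assumption for illustration: `call_model` is a hypothetical stand-in for a real API call, and lowercasing the prompt is only safe when case does not change the meaning of a request.

```python
import hashlib

_cache = {}


def call_model(prompt, max_tokens):
    """Hypothetical stand-in for a real, billed API call."""
    call_model.calls += 1
    return f"response to: {prompt}"


call_model.calls = 0  # counts how many billed calls were made


def cached_completion(prompt, max_tokens=256):
    """Return a cached response when an equivalent prompt was seen before."""
    # Normalize case and whitespace so trivially different prompts share a key.
    normalized = " ".join(prompt.lower().split())
    key = hashlib.sha256(f"{max_tokens}:{normalized}".encode()).hexdigest()
    if key not in _cache:
        _cache[key] = call_model(prompt, max_tokens)
    return _cache[key]


cached_completion("What is your refund policy?")
cached_completion("what is your  refund policy?")
print(call_model.calls)  # prints 1: the second request was served from cache
```

A production setup would more likely use a shared store with an expiry time, so that cached answers do not go stale.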
Are there free LLM options?
Yes, with a caveat. Open-source models like Llama 3, Mistral, and Gemma are free to download and self-host, but you still pay for the compute they run on. Services like Groq offer generous free tiers for open models.