Multi-modal Token Counter

Drop an image → see how many vision tokens each model charges (GPT-5, Claude 4, Gemini 2.5/3 Pro) and the cost at your usage volume.

Supported formats: PNG, JPG, WebP, GIF, all analysed in-browser. No image? Enter the dimensions manually instead.

Detail level (OpenAI)

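OpenAI-only setting. Low = a flat 85 tokens for a single low-resolution "thumbnail": much cheaper, but fine detail is lost. High = 170 tokens per 512×512 tile, plus an 85-token base charged once per image. The other providers have no equivalent setting and always apply their full formula.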

🔒 Image stays in your browser — only dimensions used. Formulas reflect each provider's published vision-token spec.

As-is, no warranty. These apps are free under their listed license and run entirely in your browser. Use at your own risk — don't blame me if your PC catches fire, your dog runs away, or the math turns out wrong. Verify anything that actually matters. None of this is professional financial, medical, legal, or engineering advice.

© 2026 Rohit Burani · MIT · Built at gekro.com · View source ↗

Guide

What It Does

Vision-capable LLMs (GPT-5, Claude 4 family, Gemini 2.5/3 Pro) charge for images as tokens. Each provider has a different formula:

  • OpenAI (GPT-5, GPT-5 mini, GPT-4o): scales the image to fit within 2048×2048, then scales it so the short side is 768 px, then tiles the result into 512×512 blocks. High detail costs 170 tokens per tile plus a flat 85-token base per image. “Low detail” mode is a flat 85 tokens with no tiling.
  • Anthropic (Claude Opus/Sonnet/Haiku 4): tokens ≈ (width × height) / 750, capped at 1600 per image.
  • Google (Gemini 2.5/3 Pro): both dimensions ≤384 px → 258 tokens flat. Larger images are split into 768×768 tiles at 258 tokens per tile.
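
The three formulas above can be sketched in a few lines of Python. This is a minimal illustration of the rules as described in this guide, not the tool's actual source. The function names are hypothetical, the OpenAI path assumes images are only ever scaled down (never up) and that scaled sizes are rounded to whole pixels, the Anthropic path assumes the estimate is rounded up, and the Gemini path assumes a simple grid of 768×768 tiles.

```python
import math


def openai_tokens(width: int, height: int, detail: str = "high") -> int:
    """Estimate OpenAI vision tokens using the tile rule described above."""
    if detail == "low":
        return 85  # flat cost, no tiling

    # Step 1: fit within 2048x2048 (assumption: downscale only).
    scale = min(1.0, 2048 / max(width, height))
    w, h = round(width * scale), round(height * scale)

    # Step 2: bring the short side down to 768 (assumption: downscale only).
    scale = min(1.0, 768 / min(w, h))
    w, h = round(w * scale), round(h * scale)

    # Step 3: count 512x512 tiles, 170 tokens each, plus the 85-token base.
    tiles = math.ceil(w / 512) * math.ceil(h / 512)
    return 170 * tiles + 85


def anthropic_tokens(width: int, height: int) -> int:
    """Estimate Claude vision tokens: pixels / 750, capped at 1600."""
    return min(1600, math.ceil(width * height / 750))


def gemini_tokens(width: int, height: int) -> int:
    """Estimate Gemini vision tokens: 258 flat if small, else 258 per tile."""
    if width <= 384 and height <= 384:
        return 258
    # Assumption: a plain grid of 768x768 tiles covering the image.
    tiles = math.ceil(width / 768) * math.ceil(height / 768)
    return 258 * tiles
```

For a 1024×1024 image this sketch gives 765 tokens for OpenAI high detail, 1334 for Anthropic, and 1032 for Gemini.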

Drop an image. The tool reads dimensions client-side and computes tokens + cost for every model side-by-side.

When To Use This

  • OCR pipelines / document AI: at high volume, small per-image token differences between providers compound into large cost differences
  • Vision-RAG: thumbnail vs full-res tradeoff — see the cost difference
  • Side-by-side evals: same image, see which provider charges more for the same context
  • Budget projections: enter your daily image volume, get monthly/annual estimates
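
The budget projection is simple arithmetic once the tokens per image are known. A minimal sketch, assuming a hypothetical `projected_cost` helper, a 30-day month, a 365-day year, and a per-million-token input price that you supply yourself (no real provider prices are hard-coded here):

```python
def projected_cost(
    tokens_per_image: int,
    images_per_day: int,
    usd_per_million_tokens: float,
) -> dict[str, float]:
    """Project image-input spend from a daily image volume."""
    # Daily spend: total tokens per day times the price per token.
    daily = tokens_per_image * images_per_day * usd_per_million_tokens / 1_000_000
    return {
        "daily": daily,
        "monthly": daily * 30,   # assumption: 30-day month
        "annual": daily * 365,   # assumption: 365-day year
    }
```

For example, 10,000 images a day at 765 tokens each and an illustrative price of $2.00 per million input tokens works out to $15.30 per day, or $459 per 30-day month.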

Privacy

The image never leaves your browser. The tool loads it into an in-memory Image() object and reads only its natural width and height. Even the file name isn’t sent anywhere.
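
The point that only width and height are needed can be shown offline too. The sketch below is a swapped-in Python analogue, not the tool's browser code: it reads the dimensions straight from the first 24 bytes of a PNG file, so the pixel data is never touched. The helper name `png_dimensions` is hypothetical, and only PNG is handled.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_dimensions(data: bytes) -> tuple[int, int]:
    """Return (width, height) from the IHDR chunk at the start of a PNG."""
    # Layout: 8-byte signature, 4-byte chunk length, b"IHDR",
    # then width and height as big-endian unsigned 32-bit integers.
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG header")
    width, height = struct.unpack(">II", data[16:24])
    return width, height
```

Reading `open(path, "rb").read(24)` and passing the result to this function is enough to feed any of the token formulas.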

Limitations

  • Formulas reflect each provider’s published documentation as of 2026-05. Actual billed tokens can vary ±5%.
  • Multi-image messages aren’t modelled — multiply costs by image count.
  • Video / PDF tokens use different formulas not covered here.
  • “Auto” detail in OpenAI varies based on image size — this tool assumes the high-detail path for the calculation.
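
Two of these limitations are easy to work around by hand. A rough sketch, assuming a hypothetical `message_token_range` helper: it sums per-image estimates for a multi-image message and applies the ±5% variance quoted above as a band around the total.

```python
import math


def message_token_range(
    per_image_tokens: list[int],
    tolerance: float = 0.05,
) -> tuple[int, int, int]:
    """Return (low, expected, high) token totals for a multi-image message."""
    expected = sum(per_image_tokens)
    # Widen the estimate by the stated variance, rounding outward.
    low = math.floor(expected * (1 - tolerance))
    high = math.ceil(expected * (1 + tolerance))
    return low, expected, high
```

A message with two 765-token images and one 85-token thumbnail comes to 1615 expected tokens, with a band of 1534 to 1696.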
