Estimate the real cost, context growth, and latency of a multi-step agent loop, including the tokens you are re-billed for on every turn.
Prices are approximate, last checked 2026-06-19 — edit to match your provider.
Per-step breakdown
| Step | Prompt tokens | Output tokens | Step cost |
|---|---|---|---|
As-is, no warranty. These apps are free under their listed license and run entirely in your browser. Use at your own risk — don't blame me if your PC catches fire, your dog runs away, or the math turns out wrong. Verify anything that actually matters. None of this is professional financial, medical, legal, or engineering advice.
A single API call is easy to price. An agent loop is not, because at every step the model re-reads the entire conversation so far. Step 10 pays again for the system prompt, the original task, and all nine prior turns. That re-billing is why agent costs balloon, and it’s exactly what a per-call price estimate misses.
This tool simulates the loop turn by turn and adds it all up: total cost, total tokens billed, final context size, and estimated wall-clock latency.
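As a rough illustration of what such a turn-by-turn simulation can look like, here is a minimal Python sketch. It is not the tool's actual code. The function name, the parameters, the prices (3.0 and 15.0 USD per million tokens), and the latency model (a fixed per-step overhead plus output tokens divided by a decode speed) are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class StepRow:
    step: int
    prompt_tokens: int
    output_tokens: int
    cost_usd: float


def simulate_loop(steps, base_tokens, added_per_step, output_per_step,
                  input_price_per_m, output_price_per_m,
                  output_tokens_per_sec=50.0, overhead_sec=1.0):
    """Hypothetical model of an agent loop; every parameter is an assumption."""
    context = base_tokens  # system prompt + original task
    rows = []
    latency_sec = 0.0
    for step in range(1, steps + 1):
        prompt_tokens = context  # the whole conversation so far is re-sent
        cost = (prompt_tokens * input_price_per_m
                + output_per_step * output_price_per_m) / 1_000_000
        rows.append(StepRow(step, prompt_tokens, output_per_step, cost))
        # Assumed latency model: fixed overhead plus time to decode the reply.
        latency_sec += overhead_sec + output_per_step / output_tokens_per_sec
        # The reply and the tool result both join the context for the next step.
        context += output_per_step + added_per_step
    return {
        "rows": rows,
        "total_cost_usd": sum(r.cost_usd for r in rows),
        "billed_tokens": sum(r.prompt_tokens + r.output_tokens for r in rows),
        "final_context_tokens": context,
        "latency_sec": latency_sec,
    }


# Placeholder inputs; substitute your own loop shape and provider prices.
result = simulate_loop(steps=10, base_tokens=2_000, added_per_step=800,
                       output_per_step=200, input_price_per_m=3.0,
                       output_price_per_m=15.0)
for row in result["rows"]:
    print(f"{row.step:>4} {row.prompt_tokens:>8} {row.output_tokens:>6} ${row.cost_usd:.4f}")
print(result["billed_tokens"], result["final_context_tokens"])
```

The per-step rows mirror the columns of the breakdown table: step, prompt tokens, output tokens, and step cost.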
The number that surprises people is the gap between billed tokens and unique tokens. Unique tokens are everything the conversation ever contained, counted once. Billed tokens are what you actually pay for, and because the context is re-sent each step, they grow roughly with the square of the step count. A loop that produces 20,000 unique tokens can bill you for 200,000+.
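The quadratic growth can be checked with a little arithmetic. The sketch below assumes a simplified loop in which the context starts at a fixed size and grows by the same amount every step. The function names and the example numbers are illustrative, and only prompt tokens are counted as billed.

```python
def unique_tokens(steps, base_tokens, growth_per_step):
    """Everything the conversation ever contained, counted once."""
    return base_tokens + steps * growth_per_step


def billed_prompt_tokens(steps, base_tokens, growth_per_step):
    """Sum of the prompt sizes over all steps.

    Step k (1-based) sends base + (k - 1) * growth tokens, so the total is
    steps * base + growth * steps * (steps - 1) / 2, quadratic in steps.
    """
    return steps * base_tokens + growth_per_step * steps * (steps - 1) // 2


# Assumed shape: 2,000-token base, 900 new tokens per step, 20 steps.
print(unique_tokens(20, 2_000, 900))         # 20000
print(billed_prompt_tokens(20, 2_000, 900))  # 211000
```

Under these assumptions the loop bills a little over ten times its unique token count, and the ratio keeps rising as steps are added.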
Prompt caching is the main defense: the stable prefix of the conversation is charged at the (much cheaper) cached rate, and only the newly added tokens each step pay full price. The tool shows the before/after so you can see the saving in dollars.
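A minimal sketch of the before/after comparison, under loudly simplified assumptions: each step's prompt is the previous prompt plus a fixed number of new tokens, the entire previous prompt is a cache hit, the cache never expires, and cache writes carry no surcharge. Real providers differ on all of these points. The function name and the prices are placeholders, not any provider's actual rates.

```python
def input_cost_usd(steps, base_tokens, growth_per_step,
                   price_per_m, cached_price_per_m=None):
    """Input-side cost of a loop, with or without an (idealised) prompt cache."""
    total = 0.0
    previous_prompt = 0  # nothing is cached before the first call
    for step in range(steps):
        prompt = base_tokens + step * growth_per_step
        if cached_price_per_m is None:
            total += prompt * price_per_m  # every token pays full price
        else:
            cached = previous_prompt       # assumed: whole prior prompt hits the cache
            fresh = prompt - cached        # only the newly added tokens pay full price
            total += cached * cached_price_per_m + fresh * price_per_m
        previous_prompt = prompt
    return total / 1_000_000


# Placeholder prices in USD per million tokens.
before = input_cost_usd(20, 2_000, 900, price_per_m=3.0)
after = input_cost_usd(20, 2_000, 900, price_per_m=3.0, cached_price_per_m=0.3)
print(f"without cache: ${before:.4f}  with cache: ${after:.4f}")
```

Output tokens are left out here because caching applies only to the prompt side of the bill.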