gekro
GitHub LinkedIn
AI

Token Probability Visualizer

Paste a logprobs-enabled LLM response → see top-K alternative tokens at every position with their probabilities, colour-coded by uncertainty.

Paste logprobs JSON

Display options Top 5, all positions

Lower probability (more uncertain) → warmer colour.

Token stream

Hover or tap a token to see top-K alternatives →

Selected token

Tap a token to inspect

Top alternatives

Overall

Tokens
Mean probability
Median probability
Most uncertain
Sequence logprob

🔒 All parsing in your browser. Logprobs never sent anywhere.

As-is, no warranty. These apps are free under their listed license and run entirely in your browser. Use at your own risk — don't blame me if your PC catches fire, your dog runs away, or the math turns out wrong. Verify anything that actually matters. None of this is professional financial, medical, legal, or engineering advice.

© 2026 Rohit Burani · MIT · Built at gekro.com · View source ↗

Guide

What It Does

LLMs are probability distributions over tokens. At every position the model picks one token from many candidates with different likelihoods. logprobs is the request flag that exposes those probabilities. This tool turns that data into a visualization:

  • Token stream: each token in the response is colour-coded — green where the model was very confident (>85%), red where it was guessing (<15%)
  • Click any token: see the top-K alternatives the model considered, with their probabilities
  • Stats panel: mean / median per-token probability, most-uncertain token, total sequence log-probability

How To Use

  1. Make an LLM call with logprobs: true and top_logprobs: N (most providers support N up to 20)
  2. Copy the response JSON
  3. Paste it here
  4. Hover or tap tokens to inspect alternatives

When It’s Useful

  • Hallucination detection — low-confidence tokens in factual claims are red flags
  • Prompt comparison — same prompt, two different system messages — compare mean probabilities
  • Few-shot debugging — see which example influenced the model and where confidence drops
  • Evaluation — sequence log-probability is a standard scoring metric for prompt-quality comparison
  • Education — actually SEE what “the model considered” instead of just imagining it

Supported Formats

  • OpenAI: full chat-completion response with choices[0].logprobs.content
  • Raw OpenAI logprobs: just the content array
  • Anthropic: where top-tokens are exposed
  • Generic: [{ token, logprob, top_logprobs: [{token, logprob}] }]

The parser auto-detects which shape you pasted.

What’s NOT In Scope

For informational purposes only. Not financial, medical, or legal advice. You are solely responsible for how you use these tools.