What It Does

Paste a streaming log from any LLM API and this tool:

Detects the format automatically (OpenAI SSE, Anthropic content-block-delta, Bedrock event-stream, generic ndjson, or bracketed-timestamp format)
Reconstructs the chunk timeline
Replays the response at 1× (real-time), 0.5×, 2×, 5×, or instant
Reports time-to-first-token (TTFT), total stream duration, chunk count, approximate output tokens, throughput (tokens/sec), and inter-token p50 / p95 deltas
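
The reported metrics can be sketched in a few lines. This is a minimal illustration, not the tool's actual code: the `Chunk` type, the `stream_metrics` and `percentile` names, the nearest-rank percentile method, and the choice to measure both TTFT and duration from the start of the request are all assumptions made for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class Chunk:
    t_ms: float  # assumed: arrival time in ms, measured from request start
    text: str    # text carried by this chunk


def percentile(values, pct):
    """Nearest-rank percentile; returns 0.0 for an empty list."""
    if not values:
        return 0.0
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def stream_metrics(chunks):
    """Summarize a reconstructed timeline (chunks sorted by arrival time)."""
    if not chunks:
        return None
    times = [c.t_ms for c in chunks]
    # Gaps between consecutive chunks, used for the p50 / p95 figures.
    deltas = [later - earlier for earlier, later in zip(times, times[1:])]
    # Approximate tokens: whitespace-split over the joined text, so a word
    # split across two chunks is counted once (an assumption of this sketch).
    approx_tokens = len("".join(c.text for c in chunks).split())
    duration_ms = times[-1]
    return {
        "ttft_ms": times[0],
        "duration_ms": duration_ms,
        "chunk_count": len(chunks),
        "approx_tokens": approx_tokens,
        "tokens_per_sec": approx_tokens * 1000 / duration_ms if duration_ms > 0 else 0.0,
        "p50_ms": percentile(deltas, 50),
        "p95_ms": percentile(deltas, 95),
    }
```

With only a handful of chunks the p95 delta is simply the largest gap, so percentile figures are more meaningful on longer streams.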

When To Use It

“The API got slower” claims — replay your saved logs side-by-side and prove (or disprove) it with hard numbers
TTFT regressions — first-token latency hides inside the “looks fine” total stream time
Provider comparisons — same prompt, different providers, see whose streaming is actually smoother
Demo prep — record a fast streaming response, replay at 1× during a meeting

Supported Log Formats

Format	Example
OpenAI SSE	`data: {"choices":[{"delta":{"content":"hi"}}]}`
Anthropic SSE	`data: {"type":"content_block_delta","delta":{"text":"hi"}}`
Bedrock event-stream	`{"bytes":"<base64>"}` (auto-decoded best-effort)
Bracketed timestamp	`[12.456] {"delta":{"text":"hi"}}`
Generic ndjson	`{"text":"hi","ts":420}`
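
One way to tell these formats apart is to look at the line prefix and a few distinguishing keys. The sketch below is an assumption-laden illustration based only on the example lines in the table: `detect_format`, `extract_text`, and the format labels are hypothetical names, and the Bedrock branch assumes the decoded bytes are JSON shaped like one of the other payloads.

```python
import base64
import json
import re

# "[12.456] {...}": a bracketed number followed by the payload.
BRACKETED = re.compile(r"^\[(\d+(?:\.\d+)?)\]\s*(.*)$")


def detect_format(line):
    """Classify one log line using the shapes from the table above."""
    line = line.strip()
    if BRACKETED.match(line):
        return "bracketed"
    is_sse = line.startswith("data:")
    payload = line[5:].strip() if is_sse else line
    try:
        obj = json.loads(payload)
    except json.JSONDecodeError:
        return "unknown"  # e.g. "data: [DONE]" or a blank line
    if not isinstance(obj, dict):
        return "unknown"
    if is_sse:
        if "choices" in obj:
            return "openai-sse"
        if obj.get("type") == "content_block_delta":
            return "anthropic-sse"
        return "unknown"
    return "bedrock" if "bytes" in obj else "ndjson"


def _delta_text(obj):
    """Pull the text out of the payload shapes shown in the table."""
    if "choices" in obj:
        choices = obj["choices"]
        delta = choices[0].get("delta", {}) if choices else {}
        return delta.get("content") or ""
    if isinstance(obj.get("delta"), dict):
        return obj["delta"].get("text") or ""
    return obj.get("text") or ""


def extract_text(line):
    """Return the text carried by a line, or "" if none can be recovered."""
    fmt = detect_format(line)
    line = line.strip()
    if fmt == "unknown":
        return ""
    if fmt == "bracketed":
        body = BRACKETED.match(line).group(2)
    elif fmt.endswith("-sse"):
        body = line[5:]
    else:
        body = line
    try:
        obj = json.loads(body)
        if fmt == "bedrock":
            # Best-effort: assume the base64 payload decodes to JSON.
            obj = json.loads(base64.b64decode(obj["bytes"]))
    except (ValueError, TypeError):
        return ""
    return _delta_text(obj) if isinstance(obj, dict) else ""
```

Lines that match none of the shapes are labeled `unknown` and yield no text, so a stray terminator or comment line does not break the timeline.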

The parser also reads timestamps from _ts, ts, timestamp, or t fields when present. If the log contains no timestamps, it falls back to a fixed spacing of 50 ms per chunk so you can still see relative ordering.
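
The timestamp lookup and the fixed-spacing fallback might look like the following. It is a sketch under stated assumptions: `assign_timestamps` is a hypothetical name, values are treated as milliseconds, the first synthetic chunk is placed at 0 ms, and a log where only some records carry a timestamp is treated the same as a log with none.

```python
TIMESTAMP_KEYS = ("_ts", "ts", "timestamp", "t")  # checked in this order
FALLBACK_SPACING_MS = 50


def _read_timestamp(record):
    """Return the first numeric timestamp field found, or None."""
    for key in TIMESTAMP_KEYS:
        value = record.get(key)
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            return float(value)
    return None


def assign_timestamps(records):
    """Map parsed records (dicts) to arrival times in milliseconds."""
    found = [_read_timestamp(record) for record in records]
    if records and all(value is not None for value in found):
        return found
    # Assumption: with missing timestamps, space every chunk evenly instead
    # of mixing real and synthetic times.
    return [float(index * FALLBACK_SPACING_MS) for index in range(len(records))]
```

Metrics computed from the fallback spacing reflect the synthetic 50 ms gaps rather than real latency, so only the ordering is meaningful in that case.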

What’s NOT In Scope

Live capture from an API: the tool is paste-only; live capture would need a backend
Cost calculation: see Reasoning Token Cost Calculator and LLM Cost Calculator
Tokenization: see Prompt Token Counter. Token counts here are approximate (whitespace-split).

Related Tools

Reasoning Token Cost Calculator: what your stream actually costs
LLM Cost Calculator: direct-API pricing
Hyperscaler Pricing Comparison: Bedrock vs Foundry vs Vertex

Streaming Response Player
