
Chat Template Builder

Compose a system/user/assistant conversation and see the exact prompt string each model family receives - special tokens and all

Special tokens are highlighted in the formatted prompt. That exact string is what the model receives before generation. Pairs with the Tokenizer app for debugging local LLM prompt formatting.

As-is, no warranty. These apps are free under their listed license and run entirely in your browser. Use at your own risk — don't blame me if your PC catches fire, your dog runs away, or the math turns out wrong. Verify anything that actually matters. None of this is professional financial, medical, legal, or engineering advice.

© 2026 Rohit Burani · MIT · Built at gekro.com · View source ↗

Guide

What It Does

Every instruction-tuned model expects its conversation wrapped in a specific format - a set of special tokens that mark where each turn starts and ends, who is speaking, and where the model should begin generating. Get that format wrong and the model still produces text, but quality quietly degrades: it ignores the system prompt, runs past where it should stop, or treats your instructions as content to echo.

This tool makes the format visible. Compose a conversation with system, user, and assistant turns, pick a model family, and see the exact string that family’s tokenizer would receive - special tokens and all.
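As a rough illustration of what "the exact string" means, here is a minimal sketch of a ChatML renderer. The function name `render_chatml` and the message dictionaries are hypothetical, chosen for this example; they are not taken from the tool's source.

```python
def render_chatml(messages, add_generation_prompt=True):
    """Wrap each turn in <|im_start|>role ... <|im_end|> markers (ChatML)."""
    parts = []
    for message in messages:
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # Open an assistant turn so the model continues from here.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "You are terse."},
    {"role": "user", "content": "Hi"},
]
prompt = render_chatml(messages)
print(prompt)
```

The printed string contains no hidden structure: role names, newlines, and special tokens are all ordinary characters in one flat prompt.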

How to Use It

  1. Edit the message rows - pick a role (system / user / assistant) and type the content. Add or remove turns as needed.
  2. Choose a template family: ChatML, Llama 3, Mistral, Gemma, or Phi-3.
  3. Read the formatted output. Special tokens are highlighted so you can see exactly where each turn begins and ends.
  4. Leave “Add generation prompt” on to append the assistant priming tokens - the string is then ready to hand to a model for completion.
  5. Copy the raw string (without highlighting) or download it.
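The effect of the "Add generation prompt" toggle in step 4 can be sketched with the Llama 3 format. This is an illustrative reimplementation, not the tool's code: `render_llama3` is a hypothetical name, and trimming whitespace from each message is an assumption about how the published template behaves.

```python
def render_llama3(messages, add_generation_prompt=True):
    """Render turns as Llama 3 header blocks terminated by <|eot_id|>."""
    out = "<|begin_of_text|>"
    for message in messages:
        content = message["content"].strip()  # assumption: content is trimmed
        out += (
            f"<|start_header_id|>{message['role']}<|end_header_id|>\n\n"
            f"{content}<|eot_id|>"
        )
    if add_generation_prompt:
        # Assistant priming tokens: an open header with no content after it.
        out += "<|start_header_id|>assistant<|end_header_id|>\n\n"
    return out


turns = [{"role": "user", "content": "Hi"}]
closed = render_llama3(turns, add_generation_prompt=False)
primed = render_llama3(turns, add_generation_prompt=True)
print(primed[len(closed):])  # only the priming suffix differs
```

With the toggle off, the string ends at the last `<|eot_id|>`, which suits inspecting or storing a finished conversation. With it on, the string ends inside an open assistant turn, which is what a completion call needs.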

The Families

  • ChatML - introduced by OpenAI and adopted by Qwen, Yi, and many fine-tunes. Each turn is wrapped in <|im_start|>role / <|im_end|> markers.
  • Llama 3 - Llama 3, 3.1, 3.2 Instruct. <|begin_of_text|>, header blocks, <|eot_id|> turn terminators.
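  • Mistral / Mixtral - [INST] ... [/INST] instruction blocks. No dedicated system role, so the system prompt is folded into the first user turn. Official Mistral templates have handled system messages differently across releases, so check the checkpoint's own template when the exact placement matters.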
  • Gemma - <start_of_turn> / <end_of_turn> with the assistant role renamed to model. No system role - system content is folded into the first user turn.
  • Phi-3 - <|system|> / <|user|> / <|assistant|> markers with <|end|> terminators.
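The system-prompt folding described for Mistral and Gemma can be sketched as follows. Everything here is illustrative: the helper names are hypothetical, the blank-line separator between system and user text is an assumption, and the exact whitespace and the leading `<s>` / `<bos>` tokens vary by release and by whether your runtime adds them itself.

```python
def fold_system(messages):
    """Merge all system content into the first user turn (assumed separator: blank line)."""
    system_text = "\n\n".join(m["content"] for m in messages if m["role"] == "system")
    turns = [dict(m) for m in messages if m["role"] != "system"]
    if system_text:
        for turn in turns:
            if turn["role"] == "user":
                turn["content"] = f"{system_text}\n\n{turn['content']}"
                break
        else:
            raise ValueError("no user turn to fold the system prompt into")
    return turns


def render_gemma(messages, add_generation_prompt=True):
    """Gemma turns; the assistant role is renamed to 'model'."""
    out = "<bos>"
    for m in fold_system(messages):
        role = "model" if m["role"] == "assistant" else m["role"]
        out += f"<start_of_turn>{role}\n{m['content']}<end_of_turn>\n"
    if add_generation_prompt:
        out += "<start_of_turn>model\n"
    return out


def render_mistral(messages):
    """Mistral-style [INST] blocks; spacing here is one plausible variant."""
    out = "<s>"
    for m in fold_system(messages):
        if m["role"] == "user":
            out += f"[INST] {m['content']} [/INST]"
        else:
            out += f" {m['content']}</s>"
    return out


chat = [
    {"role": "system", "content": "Be brief."},
    {"role": "user", "content": "Hi"},
    {"role": "assistant", "content": "Hello."},
    {"role": "user", "content": "Bye"},
]
print(render_gemma(chat))
print(render_mistral(chat))
```

The Mistral sketch has no separate generation prompt because the model simply continues after the final `[/INST]`.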

What This Is and Isn’t

This renders the chat template - the turn-structure wrapping - and nothing else. It does not tokenize the resulting string, so the per-token breakdown lives in the companion Tokenizer. For exact behavior on a specific checkpoint, the model’s own tokenizer_config.json (its chat_template Jinja string) is always the final authority - vendors occasionally tweak these between releases. The templates here match the documented format for each family as of the last-verified date.
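To compare the tool's output against a specific checkpoint, you can pull the template string out of that checkpoint's tokenizer_config.json. The sketch below assumes the file is already on disk. `load_chat_template` is a hypothetical helper, and the handling of a list of named templates reflects a layout some configs use, so treat it as an assumption rather than a guarantee.

```python
import json
import tempfile
from pathlib import Path


def load_chat_template(config_path):
    """Return the chat_template Jinja string from a tokenizer_config.json, or None."""
    config = json.loads(Path(config_path).read_text(encoding="utf-8"))
    template = config.get("chat_template")
    if isinstance(template, list):
        # Assumed layout: [{"name": ..., "template": ...}, ...]; use "default".
        named = {entry["name"]: entry["template"] for entry in template}
        template = named.get("default")
    return template


# Demo with a stand-in config so the sketch runs without downloading a model.
sample = {"chat_template": "{% for m in messages %}{{ m['content'] }}{% endfor %}"}
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "tokenizer_config.json"
    path.write_text(json.dumps(sample), encoding="utf-8")
    demo_template = load_chat_template(path)

print(demo_template)
```

If the Hugging Face transformers library is installed, `tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)` renders the checkpoint's own template directly. Some newer checkpoints may keep the template in a separate file rather than in tokenizer_config.json, in which case this helper returns None.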

Limitations

  • Five families - the most common ones. Less common formats (Alpaca, Vicuna, Zephyr, Command R) are not included.
  • No tool/function-call formatting - this covers plain chat turns, not the tool-call block syntax some models layer on top.
  • Template, not tokenizer - token counts are approximate (characters / 4); use the Tokenizer for real counts.
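The characters / 4 estimate mentioned above amounts to something like the following sketch. Rounding up is an assumption made here; the tool may round differently.

```python
import math


def approx_token_count(text: str) -> int:
    """Rough token estimate: one token per four characters, rounded up (assumed)."""
    return math.ceil(len(text) / 4)


sample_prompt = "<|im_start|>user\nHi<|im_end|>\n"
print(approx_token_count(sample_prompt))
```

This heuristic is especially loose for chat templates, because a special token such as `<|im_start|>` is many characters long but is normally a single token for a real tokenizer.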
