
Prompt Trimmer

Trim long prompts down to a token budget — boundary-aware, with strategies for chat history, document QA, and code review.

Prompt input

Paste a prompt, mark priority paragraphs, then choose a trimming strategy.

Model:
832 tokens (est.) · 3,279 chars

Priority marking mode

Click paragraphs to mark high (keep) or low (trim first).
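One way priority marks could feed into trimming is sketched below. This is an illustration, not the tool's actual code: the `trim_by_priority` function, the "normal" level for unmarked paragraphs, the document-order tie-break, and the 4-chars-per-token estimate are all assumptions.

```python
import math


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Heuristic token estimate (assumed ratio, not a real tokenizer)."""
    return math.ceil(len(text) / chars_per_token)


def trim_by_priority(paragraphs: list[tuple[str, str]], budget: int) -> list[str]:
    """Drop "low" paragraphs first, then "normal" ones; never drop "high".

    `paragraphs` holds (text, priority) pairs. Because "high" paragraphs are
    always kept, the result can still exceed the budget.
    """
    rank = {"low": 0, "normal": 1}
    costs = [estimate_tokens(text) for text, _ in paragraphs]
    total = sum(costs)

    # Candidates for removal, lowest priority first, then document order.
    candidates = sorted(
        (i for i, (_, prio) in enumerate(paragraphs) if prio in rank),
        key=lambda i: (rank[paragraphs[i][1]], i),
    )

    dropped = set()
    for i in candidates:
        if total <= budget:
            break
        dropped.add(i)
        total -= costs[i]

    return [text for i, (text, _) in enumerate(paragraphs) if i not in dropped]
```

With four 40-character paragraphs marked high, normal, low, normal and a budget of 20 tokens, the low paragraph goes first, then the first normal one, leaving the high paragraph and the last normal one.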

Trim settings

Pick a budget, a strategy, and what to keep intact.

Template

Strategy

Drops whole paragraphs from the middle outward.
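A minimal sketch of a middle-out strategy, assuming paragraphs are separated by blank lines and tokens are estimated at roughly 4 characters each. The function names and the tie-breaking order are hypothetical; the tool's real implementation may differ.

```python
import math


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Heuristic token estimate (assumed ratio, not a real tokenizer)."""
    return math.ceil(len(text) / chars_per_token)


def trim_middle_out(prompt: str, budget: int) -> str:
    """Drop whole paragraphs, starting at the middle, until the estimate fits."""
    paragraphs = [p for p in prompt.split("\n\n") if p.strip()]
    costs = [estimate_tokens(p) for p in paragraphs]
    total = sum(costs)
    kept = [True] * len(paragraphs)

    # Visit indices by distance from the centre, so the first and last
    # paragraphs are the last to go. Ties go to the earlier paragraph.
    centre = (len(paragraphs) - 1) / 2
    order = sorted(range(len(paragraphs)), key=lambda i: (abs(i - centre), i))

    for i in order:
        if total <= budget:
            break
        kept[i] = False
        total -= costs[i]

    return "\n\n".join(p for p, keep in zip(paragraphs, kept) if keep)
```

The opening and closing paragraphs survive longest, which suits prompts that start with instructions and end with the question.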

Target token budget

Reserve for output

Subtracted from the budget so the model has room to reply.
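The arithmetic is a simple subtraction. The sketch below uses a hypothetical helper name and clamps at zero, which is an assumption about how an oversized reserve would be handled.

```python
def effective_input_budget(target_budget: int, reserve_for_output: int) -> int:
    """Tokens left for the prompt once room for the reply is set aside."""
    if target_budget < 0 or reserve_for_output < 0:
        raise ValueError("budgets must be non-negative")
    # Clamp at zero so an oversized reserve never yields a negative budget.
    return max(0, target_budget - reserve_for_output)
```

For example, a 4,000-token target with 500 tokens reserved leaves 3,500 tokens for the trimmed prompt.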

Preserve markers

Headings (#, ##)

Code blocks (```)

Lists (-, *, 1.)

Quote blocks (>)

Inline code (`)

Add trimmed indicator

Inject markers like [... 234 tokens trimmed ...] where content was removed.
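A rough sketch of how such markers might be injected, assuming trimming works on whole paragraphs and that adjacent removals merge into one marker. The `join_with_indicators` name and the 4-chars-per-token estimate are hypothetical.

```python
import math


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Heuristic token estimate (assumed ratio, not a real tokenizer)."""
    return math.ceil(len(text) / chars_per_token)


def join_with_indicators(paragraphs: list[str], kept: list[bool]) -> str:
    """Rebuild the prompt, replacing each run of dropped paragraphs with a marker."""
    out: list[str] = []
    pending = 0  # estimated tokens removed since the last kept paragraph
    for text, keep in zip(paragraphs, kept):
        if keep:
            if pending:
                out.append(f"[... {pending} tokens trimmed ...]")
                pending = 0
            out.append(text)
        else:
            pending += estimate_tokens(text)
    if pending:  # removals at the very end of the prompt
        out.append(f"[... {pending} tokens trimmed ...]")
    return "\n\n".join(out)
```

The markers themselves cost a few tokens, so a stricter version would count them against the budget as well.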

Trimmed output

Updates live. Toggle the diff to see what was removed.

Tokens: 832 → 832

Saved: 0%

Retained: 100%

Show what was removed

History

Save a snapshot to compare versions later.

Chars per token, by content type

Content type    | Chars/token | Notes
----------------|-------------|-------------------------------------
English prose   | ≈ 4         | Default for chat-style text
Code            | ≈ 3         | Denser punctuation tokenizes finer
CJK (中日韓)    | ≈ 2         | One token often spans 1–2 glyphs
Numbers / IDs   | ≈ 2.5       | Digit-heavy strings tokenize tightly
URLs            | ≈ 3         | Lots of punctuation
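The ratios above can be turned into a quick estimator. The sketch below is a planning heuristic only: the dictionary keys and function name are made up for illustration, and the caller is assumed to know the content type in advance.

```python
import math

# Approximate characters per token, taken from the table above.
CHARS_PER_TOKEN = {
    "prose": 4.0,
    "code": 3.0,
    "cjk": 2.0,
    "numbers": 2.5,
    "urls": 3.0,
}


def estimate_tokens(text: str, content_type: str = "prose") -> int:
    """Estimate tokens by dividing character count by the type's ratio."""
    ratio = CHARS_PER_TOKEN[content_type]  # raises KeyError on unknown types
    return math.ceil(len(text) / ratio)
```

Mixed content, such as prose with embedded code, would need to be split by type and summed. A real tokenizer for the target model is the only way to get exact counts.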

Common budget shapes

  • Short system prompt: 200–600 tokens.
  • RAG context window: 2k–8k per request.
  • Long-doc QA over a chapter: 8k–32k.
  • Whole-codebase context: 100k–1M tokens (long-context models such as Gemini or Claude).
  • Reserve 300–800 tokens for the model's reply.

Cost note

Trimming a prompt by 1k tokens saves ~$0.003 per call on a $3 / 1M input model — multiplied by every call you make.
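The saving is a straightforward proportion. The sketch below uses a hypothetical function name, and the $3 per 1M input tokens price is the example figure from the note above rather than any specific model's rate.

```python
def input_cost_saved(tokens_trimmed: int, price_per_million_usd: float, calls: int = 1) -> float:
    """Dollars saved on input tokens across `calls` requests."""
    per_call = tokens_trimmed * price_per_million_usd / 1_000_000
    return per_call * calls
```

At 1,000 tokens trimmed and $3 per 1M input tokens, 10,000 calls save about $30.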

Open AI cost estimator

Token counts are heuristic. Real tokenizers vary by model — use these numbers for planning, not for exact billing.