Token Budget Negotiator: Greedy Ablation Prompt Compression with Quality Gating

Prompt cost optimization is usually handled manually: remove chunks, test output, repeat. This project automates that loop with guardrails.
It treats prompts as structured sections, removes low-priority candidates one by one, and keeps each removal only if quality remains above a defined threshold.
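The loop described above can be sketched in a few lines. This is a minimal illustration, not the project's actual code: the `Section` dataclass, its `priority` field, the `greedy_ablate` function, and the `score` callable are all hypothetical names introduced here, and the scorer is assumed to return a quality value between 0 and 1.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Section:
    """Hypothetical prompt section; lower priority is tried for removal first."""
    name: str
    text: str
    priority: int


def greedy_ablate(
    sections: List[Section],
    score: Callable[[List[Section]], float],
    threshold: float,
) -> Tuple[List[Section], List[Dict]]:
    """Try removing sections one by one, keeping a removal only if quality holds."""
    kept = list(sections)
    trace: List[Dict] = []
    # Visit candidates from lowest to highest priority.
    for candidate in sorted(sections, key=lambda s: s.priority):
        trial = [s for s in kept if s is not candidate]
        quality = score(trial)
        accepted = quality >= threshold
        # Record every decision, accepted or not, so the run can be audited.
        trace.append(
            {"section": candidate.name, "quality": quality, "accepted": accepted}
        )
        if accepted:
            kept = trial
    return kept, trace
```

The approach is greedy: each decision is made once, against the prompt as it stands at that step, and is never revisited. That keeps the number of scoring calls linear in the number of sections, at the cost of possibly missing a better combination of removals.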
Why Teams Use This
You get measurable token savings without blind summarization. Every accepted removal is justified by a scoring pass, and the full ablation trace is auditable.
It is useful for production prompts that keep growing over time and need periodic budget enforcement.
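To make "measurable" and "auditable" concrete, the sketch below computes a savings fraction and serializes a trace to JSON. The function names and the report layout are assumptions made for illustration; the project's real output format may differ.

```python
import json
from typing import Dict, List


def savings_fraction(original_tokens: int, compressed_tokens: int) -> float:
    """Fraction of tokens removed, e.g. 1000 -> 650 tokens is 0.35."""
    if original_tokens <= 0:
        raise ValueError("original_tokens must be positive")
    return 1.0 - compressed_tokens / original_tokens


def render_audit(
    trace: List[Dict], original_tokens: int, compressed_tokens: int
) -> str:
    """Bundle token counts and per-step decisions into a JSON report (hypothetical layout)."""
    report = {
        "original_tokens": original_tokens,
        "compressed_tokens": compressed_tokens,
        "savings": round(savings_fraction(original_tokens, compressed_tokens), 4),
        "steps": trace,
    }
    return json.dumps(report, indent=2)
```

A report like this can be stored next to the prompt in version control, so a reviewer can see which sections were dropped and what quality score justified each removal.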
Run the Project
git clone https://github.com/dakshjain-1616/token-budget-negotiator
cd token-budget-negotiator
pip install -e .
token-budget analyze examples/prompt.yaml
token-budget negotiate examples/prompt.yaml --scorer ollama --model gemma4:latest --threshold 0.8 --min-savings 0.2 --max-savings 0.8
export OPENROUTER_API_KEY=sk-or-...
token-budget check-openrouter
python -m token_budget_negotiator.mcp_server --scorer ollama --model gemma4:latest
The same engine is available as CLI, Python package, and MCP server, which makes it practical for both local iteration and automated pipelines.
Architecture Walkthrough
The Token Budget Negotiator repository is organized around a single pipeline, so you can trace the flow from input handling to final output without guesswork. This makes onboarding easier for new contributors and helps teams debug faster when behavior changes after an update.
Practical Use Cases
If you are evaluating Token Budget Negotiator for production, start with a small real-world dataset, run the included commands end to end, and compare output quality, latency, and operational complexity against your current prompt. This gives a stronger signal than a toy demo.
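One way to run that comparison is a small harness that feeds the same cases through the original and the compressed prompt. This is a sketch under stated assumptions: `evaluate`, `run_model`, and `judge` are hypothetical names, and you would supply your own model call and quality metric.

```python
import time
from statistics import mean
from typing import Callable, Dict, List


def evaluate(
    prompt: str,
    cases: List[Dict[str, str]],
    run_model: Callable[[str, str], str],
    judge: Callable[[str, str], float],
) -> Dict[str, float]:
    """Run every case through one prompt and report mean quality and latency."""
    scores: List[float] = []
    latencies: List[float] = []
    for case in cases:
        start = time.perf_counter()
        output = run_model(prompt, case["input"])
        latencies.append(time.perf_counter() - start)
        scores.append(judge(output, case["expected"]))
    return {"mean_quality": mean(scores), "mean_latency_s": mean(latencies)}
```

Calling `evaluate` once with the original prompt and once with the compressed one, on identical cases, gives two directly comparable summaries.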
Implementation Notes
The project is useful as both a standalone tool and a reference implementation. You can copy patterns from this codebase into your own stack, especially around evaluation discipline, reproducibility, and operator visibility.