# enzu

**LLM tasks with hard budget limits.**

enzu is a Python toolkit for running LLM tasks with guaranteed spending limits. Set a token, time, or cost cap—enzu enforces it. No runaway API bills.
## Install

```bash
pip install enzu
```

## Quickstart

```bash
export OPENAI_API_KEY=sk-...
```

```python
from enzu import ask

print(ask("Explain quantum computing in one sentence."))
```

## Why budget limits?

LLM APIs charge per token. Without limits, a single bad prompt can cost $50+. enzu stops execution before you exceed your budget:
```python
from enzu import Enzu

client = Enzu()

# You ask for 500 words, but cap output at 50 tokens
result = client.run(
    "Write a 500-word essay on climate change.",
    tokens=50,  # Hard limit: stops here, guaranteed
)
# Result: partial output, no surprise bill
```

## What enzu is (and isn't)

| enzu is | enzu is not |
|---|---|
| A budget enforcement layer | A prompt library |
| Hard stops when limits hit | "Best effort" throttling |
| Works with any OpenAI-compatible API | Tied to one provider |
| ~2k lines of code | A heavyweight framework |
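The "hard stops" row is the core idea: output is cut off the moment the budget is spent, rather than trusted to taper off. A minimal sketch of that idea over a simulated token stream (illustrative only, not enzu's actual implementation):

```python
from typing import Iterable, Iterator

def hard_stop(stream: Iterable[str], max_tokens: int) -> Iterator[str]:
    """Yield tokens from a stream, stopping once the budget is exhausted."""
    used = 0
    for token in stream:
        if used >= max_tokens:
            break  # hard stop: no further tokens are consumed or billed
        used += 1
        yield token

# Simulated model output: the caller asked for a long answer...
fake_stream = (f"tok{i}" for i in range(500))

# ...but the budget cuts it off at 50 tokens, guaranteed
partial = list(hard_stop(fake_stream, max_tokens=50))
print(len(partial))  # 50
```

The point of the sketch: the cap is enforced by the consumer, so no prompt phrasing or model behavior can exceed it.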
## Budgets

Cap by tokens, seconds, or dollars:

```python
from enzu import Enzu

client = Enzu()

# Cap by tokens
result = client.run("Summarize this", data=text, tokens=200)

# Cap by time
result = client.run("Research this topic", seconds=30)

# Cap by cost (requires OpenRouter)
result = client.run("Analyze this data", cost=0.50)  # Max $0.50
```

## Outcomes

Every call returns a status you can check:
```python
from enzu import Enzu, Outcome

client = Enzu()
result = client.run("Analyze this", data=doc, tokens=100, return_report=True)

if result.outcome == Outcome.SUCCESS:
    print(result.answer)
elif result.outcome == Outcome.BUDGET_EXCEEDED:
    print("Hit the limit, partial result available")
elif result.outcome == Outcome.TIMEOUT:
    print("Took too long")
```

## Documents

Parse PDFs, Word docs, and other files:
```bash
pip install enzu[docling]
```

```python
from enzu import Enzu

client = Enzu()

# Ask questions about a PDF
result = client.run(
    "What are the key findings?",
    documents=["quarterly-report.pdf"],
    tokens=500,
)

# Multi-turn conversation with document context
session = client.session(documents=["research-paper.pdf"])
answer1 = session.run("What's the main argument?")
answer2 = session.run("What evidence supports it?")
answer3 = session.run("What are the limitations?")
```

## Large documents

Documents too large for the model's context window? enzu automatically splits them into chunks and synthesizes the answer:
```python
from enzu import Enzu

client = Enzu()

# Works even with 100k+ token documents
answer = client.run(
    "Summarize the main conclusions",
    data=open("huge-report.txt").read(),
    tokens=500,
)
```
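The split-and-synthesize step is the familiar map-reduce pattern: split the text, summarize each chunk, then combine the partial summaries. A rough sketch of that pattern (hypothetical helper names, not enzu's internals; a crude whitespace split stands in for a real tokenizer):

```python
def split_into_chunks(text: str, chunk_tokens: int) -> list[str]:
    """Split text into pieces of roughly chunk_tokens words each."""
    words = text.split()
    return [" ".join(words[i:i + chunk_tokens])
            for i in range(0, len(words), chunk_tokens)]

def summarize_large(text: str, chunk_tokens: int, model) -> str:
    """Map: summarize each chunk independently. Reduce: synthesize one answer."""
    partials = [model(f"Summarize:\n{chunk}")
                for chunk in split_into_chunks(text, chunk_tokens)]
    return model("Combine these partial summaries:\n" + "\n".join(partials))
```

Here `model` is any callable that takes a prompt and returns text, so the sketch works with any provider.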
## Async jobs

For long-running tasks, fire and poll:

```python
from enzu import Enzu, JobStatus

client = Enzu()
job = client.submit("Analyze this dataset", data=data, cost=5.0)

# Check later
job = client.status(job.job_id)
if job.status == JobStatus.COMPLETED:
    print(job.answer)
```
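The single status check above extends naturally into a poll loop. A sketch (the `wait_for` helper and string states are illustrative, not part of enzu's API; real code would compare against the `JobStatus` enum):

```python
import time

def wait_for(job_id, get_status, poll_seconds=2.0, timeout=300.0):
    """Poll get_status(job_id) until the job reaches a terminal state."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        job = get_status(job_id)
        if job.status in ("COMPLETED", "FAILED"):
            return job
        time.sleep(poll_seconds)
    raise TimeoutError(f"job {job_id} did not finish within {timeout}s")
```

The timeout guards against jobs that never reach a terminal state, in the same spirit as enzu's budget caps.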
## Server

Run enzu as a server:

```bash
pip install enzu[server]
uvicorn enzu.server:app --port 8000
```

```bash
curl http://localhost:8000/v1/run \
  -H "Content-Type: application/json" \
  -d '{"task": "Say hello", "model": "gpt-4o"}'
```

## Examples

| Folder | What's inside |
|---|---|
| `examples/basics/` | First steps, minimal code |
| `examples/concepts/` | Budget caps, error handling |
| `examples/production/` | Document Q&A, async jobs, sessions |
| `examples/advanced/` | Metrics, stress testing |
| `examples/usecases/` | Code reviewer, summarizer, data extractor |
Start here:

- `basics/python_quickstart.py` — First call
- `concepts/budget_hardstop_demo.py` — See budget enforcement in action
- `production/document_qa/pipeline.py` — PDF analysis
## Documentation

- Getting started
- Provider setup — OpenAI, OpenRouter, etc.
- HTTP API reference
- Python API reference
- Recipes & patterns
## Requirements

Python 3.9+
## Contributing

See CONTRIBUTING.md.
