# enzu

LLM tasks with hard budget limits.


enzu is a Python toolkit for running LLM tasks with guaranteed spending limits. Set a token, time, or cost cap—enzu enforces it. No runaway API bills.

## Quickstart

```bash
pip install enzu
export OPENAI_API_KEY=sk-...
```

```python
from enzu import ask

print(ask("Explain quantum computing in one sentence."))
```

## The problem enzu solves

LLM APIs charge per token. Without limits, a single bad prompt can cost $50+. enzu stops execution before you exceed your budget:

```python
from enzu import Enzu

client = Enzu()

# You ask for 500 words, but cap output at 50 tokens
result = client.run(
    "Write a 500-word essay on climate change.",
    tokens=50,  # Hard limit: stops here, guaranteed
)
# Result: partial output, no surprise bill
```
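enzu's internals aren't shown here, but the hard-stop idea can be sketched in plain Python as a wrapper that cuts off a token stream the instant the budget is spent (`enforce_token_cap` and the fake stream below are illustrative, not enzu's actual implementation):

```python
from typing import Iterable, Iterator

def enforce_token_cap(stream: Iterable[str], cap: int) -> Iterator[str]:
    """Yield tokens from `stream`, stopping hard once `cap` tokens are emitted."""
    spent = 0
    for token in stream:
        if spent >= cap:
            break  # hard stop: nothing past the budget is consumed
        spent += 1
        yield token

# A fake model stream stands in for a real API response.
fake_stream = (f"tok{i}" for i in range(1000))
capped = list(enforce_token_cap(fake_stream, cap=50))
# len(capped) == 50: the output is truncated, not the bill
```

The point of the sketch: the cap is enforced on the consumer side, so the guarantee does not depend on the model honoring a "please be brief" instruction.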

## What enzu is / isn't

| enzu is | enzu is not |
| --- | --- |
| A budget enforcement layer | A prompt library |
| Hard stops when limits hit | "Best effort" throttling |
| Works with any OpenAI-compatible API | Tied to one provider |
| ~2k lines of code | A heavyweight framework |

## Core features

### 1. Budget limits

Cap by tokens, seconds, or dollars:

```python
client = Enzu()

# Cap by tokens
result = client.run("Summarize this", data=text, tokens=200)

# Cap by time
result = client.run("Research this topic", seconds=30)

# Cap by cost (requires OpenRouter)
result = client.run("Analyze this data", cost=0.50)  # Max $0.50
```
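A dollar cap is ultimately a token cap in disguise: given a per-token output price, a cost ceiling implies a maximum token count. A sketch of that arithmetic (the $10-per-million price below is made up for illustration, and this is not enzu's actual accounting):

```python
def max_tokens_for_budget(cost_cap_usd: float, usd_per_million_tokens: float) -> int:
    """Largest whole number of output tokens affordable under the cost cap."""
    if usd_per_million_tokens <= 0:
        raise ValueError("price must be positive")
    return int(cost_cap_usd * 1_000_000 / usd_per_million_tokens)

# Hypothetical price: $10 per million output tokens (not a real rate).
print(max_tokens_for_budget(0.50, 10))  # 50000
```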

### 2. Predictable error handling

Every call returns a status you can check:

```python
from enzu import Enzu, Outcome

client = Enzu()
result = client.run("Analyze this", data=doc, tokens=100, return_report=True)

if result.outcome == Outcome.SUCCESS:
    print(result.answer)
elif result.outcome == Outcome.BUDGET_EXCEEDED:
    print("Hit the limit, partial result available")
elif result.outcome == Outcome.TIMEOUT:
    print("Took too long")
```
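Because outcomes are typed values rather than exceptions, they compose into ordinary control flow. One possible policy, sketched with stand-ins (the `Outcome` enum here mirrors enzu's for the sake of a self-contained example, and `run` is a generic callable, not enzu's API): retry with a doubled token cap after each budget hit.

```python
from enum import Enum, auto
from typing import Callable, Tuple

class Outcome(Enum):  # mirrors enzu's typed outcomes for this sketch
    SUCCESS = auto()
    BUDGET_EXCEEDED = auto()
    TIMEOUT = auto()

def run_with_escalation(run: Callable[[int], Tuple[Outcome, str]],
                        tokens: int, max_attempts: int = 3) -> Tuple[Outcome, str]:
    """Retry a budget-limited run, doubling the cap after each BUDGET_EXCEEDED."""
    outcome, answer = Outcome.BUDGET_EXCEEDED, ""
    for _ in range(max_attempts):
        outcome, answer = run(tokens)
        if outcome != Outcome.BUDGET_EXCEEDED:
            break
        tokens *= 2  # escalate the cap and try again
    return outcome, answer

# Fake runner: succeeds only once it is given at least 400 tokens.
def fake_run(tokens: int):
    return (Outcome.SUCCESS, "done") if tokens >= 400 else (Outcome.BUDGET_EXCEEDED, "partial")

result = run_with_escalation(fake_run, tokens=100)  # tries 100, 200, 400 -> SUCCESS
```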

### 3. Document analysis

Parse PDFs, Word docs, and other files:

```bash
pip install enzu[docling]
```

```python
from enzu import Enzu

client = Enzu()

# Ask questions about a PDF
result = client.run(
    "What are the key findings?",
    documents=["quarterly-report.pdf"],
    tokens=500,
)

# Multi-turn conversation with document context
session = client.session(documents=["research-paper.pdf"])
answer1 = session.run("What's the main argument?")
answer2 = session.run("What evidence supports it?")
answer3 = session.run("What are the limitations?")
```

### 4. Long document handling

Documents too large for the model's context window? enzu automatically splits them into chunks and synthesizes the answer:

```python
client = Enzu()

# Works even with 100k+ token documents
answer = client.run(
    "Summarize the main conclusions",
    data=open("huge-report.txt").read(),
    tokens=500,
)
```
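The chunk-and-synthesize strategy itself is simple to sketch: summarize each chunk, then summarize the summaries. The stub summarizer and naive fixed-size splitter below are placeholders for model calls, not enzu's actual chunking logic:

```python
from typing import Callable, List

def split_into_chunks(text: str, chunk_chars: int) -> List[str]:
    """Naive fixed-size split; a real splitter would respect sentence boundaries."""
    return [text[i:i + chunk_chars] for i in range(0, len(text), chunk_chars)]

def map_reduce_summary(text: str, summarize: Callable[[str], str],
                       chunk_chars: int = 8000) -> str:
    """Summarize each chunk, then summarize the concatenated chunk summaries."""
    partials = [summarize(chunk) for chunk in split_into_chunks(text, chunk_chars)]
    if len(partials) == 1:
        return partials[0]
    return summarize("\n".join(partials))

# Stub summarizer: keeps the first 20 characters of its input.
summary = map_reduce_summary("x" * 20_000, summarize=lambda s: s[:20], chunk_chars=8000)
```

Note that each chunk's summary fits the context window even when the full document does not, which is what makes the final synthesis pass possible.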

### 5. Background jobs

For long-running tasks, fire and poll:

```python
from enzu import Enzu, JobStatus

client = Enzu()
job = client.submit("Analyze this dataset", data=data, cost=5.0)

# Check later
job = client.status(job.job_id)
if job.status == JobStatus.COMPLETED:
    print(job.answer)
```
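A blocking wait on top of submit/status is a small poll loop. A sketch, where `get_status` stands in for a call to `client.status` (enzu may ship its own wait helper; this is not its API):

```python
import time
from typing import Callable

def wait_for_job(get_status: Callable[[], str], interval: float = 2.0,
                 deadline: float = 300.0) -> str:
    """Poll until the job leaves 'running', or raise after `deadline` seconds."""
    start = time.monotonic()
    while True:
        status = get_status()
        if status != "running":
            return status
        if time.monotonic() - start > deadline:
            raise TimeoutError("job did not finish before the deadline")
        time.sleep(interval)

# Fake status source: 'running' twice, then 'completed'.
states = iter(["running", "running", "completed"])
print(wait_for_job(lambda: next(states), interval=0.0))  # completed
```

The deadline keeps the poller itself budgeted: a stuck job cannot block the caller forever.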

## HTTP API

Run enzu as a server:

```bash
pip install enzu[server]
uvicorn enzu.server:app --port 8000

curl http://localhost:8000/v1/run \
  -H "Content-Type: application/json" \
  -d '{"task": "Say hello", "model": "gpt-4o"}'
```
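The same endpoint can be called from Python with the standard library. A sketch mirroring the curl request above (the actual network call is commented out so nothing fires without a running server; the endpoint and fields are taken from the curl example, not from a full API reference):

```python
import json
import urllib.request

payload = {"task": "Say hello", "model": "gpt-4o"}
req = urllib.request.Request(
    "http://localhost:8000/v1/run",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
# with urllib.request.urlopen(req) as resp:   # uncomment with the server running
#     print(json.load(resp))
```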

## Examples

| Folder | What's inside |
| --- | --- |
| `examples/basics/` | First steps, minimal code |
| `examples/concepts/` | Budget caps, error handling |
| `examples/production/` | Document Q&A, async jobs, sessions |
| `examples/advanced/` | Metrics, stress testing |
| `examples/usecases/` | Code reviewer, summarizer, data extractor |

Start here:

## Documentation

## Requirements

Python 3.9+

## Contributing

See CONTRIBUTING.md.
