This guide walks you through exposing a local LLM as a paid API endpoint using the Obol Stack. By the end, you'll have:
- A local Ollama model serving inference
- An x402 payment gate requiring USDC per request
- A public URL via Cloudflare tunnel
- An ERC-8004 agent registration document for discoverability
Note
--per-mtok is supported for inference pricing, but phase 1 still charges an
approximate flat request price derived as perMTok / 1000 using a fixed
1000 tok/request assumption. Exact token metering is deferred to the
follow-up x402-meter design described in
docs/plans/per-token-metering.md.
Important
The monetize subsystem is alpha software on the feat/secure-enclave-inference branch.
If you encounter an issue, please open a
GitHub issue.
SELLER (obol stack cluster)
obol sell http --> ServiceOffer CR --> Agent reconciles:
1. ModelReady (pull model in Ollama)
2. UpstreamHealthy (health-check Ollama)
3. PaymentGateReady (create x402 Middleware + pricing route)
4. RoutePublished (create HTTPRoute -> Traefik gateway)
5. Registered (ERC-8004 on-chain, optional)
6. Ready (all conditions True)
CF Quick Tunnel -----------> Traefik Gateway
https://<id>.trycloudflare.com
/services/<name>/* -> x402 -> Ollama
/.well-known/*.json -> ERC-8004 doc
/ -> obol-frontend
/rpc -> eRPC
BUYER (curl / blockrun-llm SDK)
1. GET /.well-known/agent-registration.json -> discover services
2. POST /services/<name>/v1/chat/completions -> 402 Payment Required
3. Sign EIP-712 payment + retry with header -> 200 + inference
- Docker -- Docker Engine (Linux) or Docker Desktop (macOS)
- Obol Stack -- installed via bash <(curl -s https://stack.obol.org)
- Ollama -- running on the host (ollama serve)
- Base Sepolia wallet -- with ETH for gas and USDC for testing payments
  - USDC (Base Sepolia): 0x036CbD53842c5426634e7929541eC2318f3dCF7e
  - Faucets: docs.base.org/tools/faucets
Start from a clean state:
# Initialize and start (automatically deploys obol-agent, configures LiteLLM
# with Ollama models, and starts a Cloudflare tunnel — no manual setup needed)
obol stack init
obol stack up
# Wait for all pods to be ready
obol kubectl get pods -A
Verify the key components:
| Check | Command | Expected |
|---|---|---|
| Cluster nodes | obol kubectl get nodes | 1 node Ready |
| Agent running | obol kubectl get pods -n openclaw-obol-agent | Running |
| CRD installed | obol kubectl get crd serviceoffers.obol.org | Found |
| x402 verifier | obol kubectl get pods -n x402 | 2 replicas Running |
| Traefik gateway | obol kubectl get gateway -n traefik | traefik-gateway |
| LiteLLM running | obol kubectl get pods -n llm | Running |
| Ollama reachable | curl -s http://localhost:11434/api/tags | JSON model list |
Make sure the model is available in your host Ollama:
# Pull a model (qwen3.5:9b is the default agent model)
ollama pull qwen3.5:9b
# Or a smaller model for quick testing
ollama pull qwen3:0.6b
# Verify it's available
curl -s http://localhost:11434/api/tags | python3 -m json.tool
obol stack up automatically configures LiteLLM with all available Ollama models (no manual obol model setup needed). If you pull a new model after the cluster is running, restart LiteLLM to pick it up:
obol kubectl rollout restart deployment/litellm -n llm
Note
The agent can also pull models automatically during reconciliation via the Ollama API, but pre-pulling avoids the wait when the ServiceOffer is created.
Configure the x402 verifier with your wallet and chain:
obol sell pricing \
--wallet 0x70997970C51812dc3A010C7d01b50e0d17dc79C8 \
--chain base-sepolia
This patches the x402-pricing ConfigMap in the x402 namespace. The Stakater Reloader automatically restarts the verifier pod when the config changes.
Verify:
obol kubectl get cm x402-pricing -n x402 -o yaml
obol kubectl get pods -n x402 # verifier should have a recent restart
Self-hosted facilitator -- if you're running your own x402 facilitator (see Part 3), pass the URL:
obol sell pricing \
--wallet 0x70997970C51812dc3A010C7d01b50e0d17dc79C8 \
--chain base-sepolia \
--facilitator-url http://host.k3d.internal:4040
Declare your inference service as a Kubernetes custom resource:
obol sell http my-qwen \
--wallet 0x70997970C51812dc3A010C7d01b50e0d17dc79C8 \
--chain base-sepolia \
--per-request 0.001 \
--namespace llm \
--upstream ollama \
--port 11434
If you want to price by million tokens instead of explicitly setting a flat request price, use --per-mtok. In phase 1, the verifier still enforces a derived per-request price:
obol sell http my-qwen \
--wallet 0x70997970C51812dc3A010C7d01b50e0d17dc79C8 \
--chain base-sepolia \
--per-mtok 1.25 \
--namespace llm \
--upstream ollama \
--port 11434
That stores both values in the pricing config:
- source model: perMTok = 1.25 USDC / 1M tokens
- enforced phase-1 charge: price = 0.00125 USDC / request
- approximation input: approxTokensPerRequest = 1000
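The derivation behind these numbers can be sketched in a few lines of Python. This is an illustration of the arithmetic only, not the verifier's actual code; the function names `derive_request_price` and `to_micro_units` are hypothetical, and the 6-decimal conversion assumes USDC.

```python
from decimal import Decimal

USDC_DECIMALS = 6  # USDC amounts on-chain are integers with 6 decimal places


def derive_request_price(per_mtok: Decimal, approx_tokens_per_request: int = 1000) -> Decimal:
    """Flat per-request price implied by a per-million-token rate.

    Hypothetical helper: mirrors the phase-1 rule perMTok / 1000, which
    follows from the fixed 1000 tokens/request assumption.
    """
    return per_mtok * approx_tokens_per_request / Decimal(1_000_000)


def to_micro_units(amount_usdc: Decimal) -> int:
    """Convert a USDC amount to the integer micro-units used in x402 responses."""
    return int(amount_usdc * 10**USDC_DECIMALS)


price = derive_request_price(Decimal("1.25"))
print(price, to_micro_units(price))
```

With `--per-mtok 1.25` this gives the 0.00125 USDC per request shown in the list, which corresponds to 1250 micro-units.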
The agent automatically reconciles the offer through six stages:
ModelReady [check] Agent checks /api/tags, model already cached
UpstreamHealthy [check] Agent health-checks ollama:11434
PaymentGateReady [check] Creates Middleware x402-my-qwen + adds pricing route
RoutePublished [check] Creates HTTPRoute so-my-qwen -> ollama backend
Registered -- Skipped (--register not set)
Ready [check] All required conditions True
Watch the progress:
# Check conditions (wait ~60s for agent heartbeat)
obol sell status my-qwen --namespace llm
# Verify Kubernetes resources
obol kubectl get serviceoffer my-qwen -n llm
obol kubectl get middleware -n llm # x402-my-qwen
obol kubectl get httproute -n llm # so-my-qwen
obol stack up automatically starts a Cloudflare Quick Tunnel. Get the public URL:
obol tunnel status
# -> https://<id>.trycloudflare.com
If the tunnel isn't running or you want a fresh URL:
obol tunnel restart
Test each route to confirm everything is wired correctly:
export TUNNEL_URL="https://<id>.trycloudflare.com"
# Frontend (200)
curl -s -o /dev/null -w "%{http_code}" "$TUNNEL_URL/"
# eRPC (200 + network list) — local only, not via tunnel
curl -s "http://obol.stack:8080/rpc" | jq .
# eRPC JSON-RPC call (local only — specify evm/{chainId} path)
curl -s -X POST "http://obol.stack:8080/rpc/evm/84532" \
-H "Content-Type: application/json" \
-d '{"jsonrpc":"2.0","method":"eth_chainId","params":[],"id":1}' | jq .result
# Monetized endpoint (402 -- payment required!)
curl -s -w "\nHTTP %{http_code}" -X POST \
"$TUNNEL_URL/services/my-qwen/v1/chat/completions" \
-H "Content-Type: application/json" \
-d '{"model":"qwen3:0.6b","messages":[{"role":"user","content":"Hello"}]}'
# Machine-readable service catalog (200, always available when ServiceOffers are ready)
curl -s "$TUNNEL_URL/skill.md"
# ERC-8004 registration document (200)
curl -s "$TUNNEL_URL/.well-known/agent-registration.json" | jq .
You can also verify locally (bypasses Cloudflare):
curl -s -w "\nHTTP %{http_code}" -X POST \
"http://obol.stack:8080/services/my-qwen/v1/chat/completions" \
-H "Content-Type: application/json" \
-d '{"model":"qwen3:0.6b","messages":[{"role":"user","content":"Hello"}]}'
A 402 Payment Required response confirms the x402 gate is working. The response body contains the payment requirements:
{
"x402Version": 1,
"error": "Payment required for this resource",
"accepts": [{
"scheme": "exact",
"network": "base-sepolia",
"maxAmountRequired": "1000",
"asset": "0x036CbD53842c5426634e7929541eC2318f3dCF7e",
"payTo": "0x70997970C51812dc3A010C7d01b50e0d17dc79C8",
"description": "Payment required for /services/my-qwen/v1/chat/completions",
"maxTimeoutSeconds": 300,
"extra": {"name": "USDC", "version": "2"}
}]
}
The maxAmountRequired is in USDC micro-units (6 decimals): 1000 = 0.001 USDC.
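As a rough sketch, a buyer-side script could read the 402 body and convert the amount like this. The helper name `summarize_requirements` is hypothetical, and the 6-decimal divisor is an assumption that holds for USDC but is not carried in the response itself.

```python
import json
from decimal import Decimal

# Sample body copied from the 402 response shown above.
SAMPLE_402_BODY = """
{
  "x402Version": 1,
  "error": "Payment required for this resource",
  "accepts": [{
    "scheme": "exact",
    "network": "base-sepolia",
    "maxAmountRequired": "1000",
    "asset": "0x036CbD53842c5426634e7929541eC2318f3dCF7e",
    "payTo": "0x70997970C51812dc3A010C7d01b50e0d17dc79C8",
    "maxTimeoutSeconds": 300,
    "extra": {"name": "USDC", "version": "2"}
  }]
}
"""


def summarize_requirements(body: str, decimals: int = 6) -> list[dict]:
    """Return one summary per accepted payment option.

    Hypothetical helper; `decimals` defaults to 6 because the asset is USDC.
    """
    doc = json.loads(body)
    summaries = []
    for req in doc.get("accepts", []):
        amount = Decimal(req["maxAmountRequired"]) / (Decimal(10) ** decimals)
        summaries.append({
            "network": req["network"],
            "pay_to": req["payTo"],
            "asset": req["asset"],
            "amount": amount,  # human-readable token amount
        })
    return summaries


for option in summarize_requirements(SAMPLE_402_BODY):
    print(option["network"], option["pay_to"], option["amount"])
```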
The seller-side verifier now exports Prometheus metrics on its existing Service:
obol kubectl get --raw /api/v1/namespaces/x402/services/x402-verifier:8080/proxy/metrics | head
Prometheus scrapes it through a ServiceMonitor in x402. Key verifier metrics:
- obol_x402_verifier_requests_total
- obol_x402_verifier_payment_required_total
- obol_x402_verifier_payment_verified_total
- obol_x402_verifier_payment_failed_total
- obol_x402_verifier_charged_requests_total
The seller's stack publishes an ERC-8004 agent registration document:
curl -s "$TUNNEL_URL/.well-known/agent-registration.json" | jq .
This returns a JSON document describing the agent's services, supported payment methods, and endpoints.
Send a request without payment:
curl -s -X POST "$TUNNEL_URL/services/my-qwen/v1/chat/completions" \
-H "Content-Type: application/json" \
-d '{"model":"qwen3:0.6b","messages":[{"role":"user","content":"Hello"}]}' \
-D - 2>&1 | head -30
The response is 402 Payment Required with a JSON body containing the payment requirements (recipient wallet, network, amount, and asset).
Using the blockrun-llm Python SDK:
pip install blockrun-llm

from blockrun_llm import LLMClient
import os
client = LLMClient(
private_key=os.environ["CONSUMER_PRIVATE_KEY"],
api_url=os.environ["TUNNEL_URL"]
)
# Automatically: 402 -> sign EIP-712 -> retry with payment header -> 200
response = client.chat("qwen3:0.6b", "Explain Ethereum in one sentence.")
print(f"Response: {response}")
print(f"Session cost: ${client._session_total_usd}")
The SDK handles the full x402 flow:
- Sends the request
- Receives 402 with payment requirements
- Signs an EIP-712 TransferWithAuthorization message (ERC-3009)
- Retries with the X-PAYMENT header (base64-encoded x402 envelope)
- Facilitator verifies the signature and settles USDC on-chain
- Returns the inference response
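For custom integrations, the signing step in the flow above amounts to building EIP-712 typed data for an ERC-3009 TransferWithAuthorization. The sketch below only assembles that structure from a 402 payment requirement and leaves signing to a wallet library. The function name `build_transfer_authorization` is hypothetical, and the choice of validity window (valid immediately, expiring after maxTimeoutSeconds) and random nonce is an assumption, not something this guide specifies.

```python
import secrets
import time

# One entry of "accepts" from the 402 body (values from this guide).
REQUIREMENTS = {
    "maxAmountRequired": "1000",
    "asset": "0x036CbD53842c5426634e7929541eC2318f3dCF7e",
    "payTo": "0x70997970C51812dc3A010C7d01b50e0d17dc79C8",
    "maxTimeoutSeconds": 300,
    "extra": {"name": "USDC", "version": "2"},
}


def build_transfer_authorization(payer, requirements, chain_id=84532, now=None):
    """Assemble EIP-712 typed data for ERC-3009 TransferWithAuthorization."""
    now = int(time.time()) if now is None else now
    return {
        "types": {
            "EIP712Domain": [
                {"name": "name", "type": "string"},
                {"name": "version", "type": "string"},
                {"name": "chainId", "type": "uint256"},
                {"name": "verifyingContract", "type": "address"},
            ],
            "TransferWithAuthorization": [
                {"name": "from", "type": "address"},
                {"name": "to", "type": "address"},
                {"name": "value", "type": "uint256"},
                {"name": "validAfter", "type": "uint256"},
                {"name": "validBefore", "type": "uint256"},
                {"name": "nonce", "type": "bytes32"},
            ],
        },
        "primaryType": "TransferWithAuthorization",
        "domain": {
            "name": requirements["extra"]["name"],
            "version": requirements["extra"]["version"],
            "chainId": chain_id,  # 84532 = Base Sepolia
            "verifyingContract": requirements["asset"],
        },
        "message": {
            "from": payer,
            "to": requirements["payTo"],
            "value": int(requirements["maxAmountRequired"]),
            "validAfter": 0,  # assumption: usable immediately
            "validBefore": now + requirements["maxTimeoutSeconds"],
            "nonce": "0x" + secrets.token_hex(32),  # random bytes32
        },
    }


typed_data = build_transfer_authorization(
    "0xa0Ee7A142d267C1f36714E4a8F75612F20a79720", REQUIREMENTS
)
print(typed_data["domain"])
```

The resulting dictionary is in the shape most EIP-712 signing libraries accept; the signature and authorization fields then go into the x402 envelope.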
Manual flow with curl -- for debugging or custom integrations:
# Step 1: Get payment requirements from the 402 response
curl -s -X POST "$TUNNEL_URL/services/my-qwen/v1/chat/completions" \
-H "Content-Type: application/json" \
-d '{"model":"qwen3:0.6b","messages":[{"role":"user","content":"Hello"}]}'
# Step 2: Sign the EIP-712 payment (requires SDK or custom code)
# The 402 body contains: payTo, maxAmountRequired, asset, network, extra.name, extra.version
# Sign a TransferWithAuthorization (ERC-3009) message with:
# Domain: {name: "USDC", version: "2", chainId: 84532, verifyingContract: <USDC address>}
# Step 3: Retry with payment header
curl -s -X POST "$TUNNEL_URL/services/my-qwen/v1/chat/completions" \
-H "Content-Type: application/json" \
-H "X-PAYMENT: <base64-encoded-x402-envelope>" \
-d '{"model":"qwen3:0.6b","messages":[{"role":"user","content":"Hello"}]}'
# -> 200 OK + inference response
After a successful paid request, verify the USDC transfer on-chain using Foundry's cast:
USDC=0x036CbD53842c5426634e7929541eC2318f3dCF7e
BUYER=0xa0Ee7A142d267C1f36714E4a8F75612F20a79720
PAYEE=0x70997970C51812dc3A010C7d01b50e0d17dc79C8
# Check buyer balance (should have decreased by 1000 micro-units = 0.001 USDC)
cast call "$USDC" "balanceOf(address)(uint256)" "$BUYER" --rpc-url http://localhost:8545
# Check payee balance (should have increased by 1000 micro-units)
cast call "$USDC" "balanceOf(address)(uint256)" "$PAYEE" --rpc-url http://localhost:8545
The same payment flow works through the public Cloudflare tunnel URL:
export TUNNEL_URL=$(obol tunnel status | grep -oE 'https://[a-z0-9-]+\.trycloudflare\.com')
# 402 through tunnel
curl -s -w "\nHTTP %{http_code}" -X POST \
"$TUNNEL_URL/services/my-qwen/v1/chat/completions" \
-H "Content-Type: application/json" \
-d '{"model":"qwen3:0.6b","messages":[{"role":"user","content":"Hello"}]}'
# Paid request through tunnel (with X-PAYMENT header)
curl -s -X POST "$TUNNEL_URL/services/my-qwen/v1/chat/completions" \
-H "Content-Type: application/json" \
-H "X-PAYMENT: <base64-encoded-x402-envelope>" \
-d '{"model":"qwen3:0.6b","messages":[{"role":"user","content":"Hello"}]}'
# -> 200 OK + inference response
This proves the full public path: Internet → Cloudflare → Traefik → x402 ForwardAuth → Facilitator settles USDC → 200 + inference.
The x402 facilitator verifies and settles payments on-chain. By default, the stack points at https://facilitator.x402.rs. For reliability, sovereignty, or testing, you can run your own.
- Reliability -- no dependency on a third-party service
- Sovereignty -- payments settle through your infrastructure
- Testing -- use Base Sepolia without depending on external uptime
When testing with an Anvil fork of Base Sepolia, Anvil's deterministic test accounts (0xf39F..., 0x7099..., etc.) often have contracts deployed at their addresses on the live network. Before using them for x402 payments, clear the code so they behave as EOAs:
# Clear contract code at consumer address to make it an EOA
cast rpc anvil_setCode 0xa0Ee7A142d267C1f36714E4a8F75612F20a79720 0x --rpc-url http://localhost:8545
Without this, the USDC SignatureChecker will attempt EIP-1271 contract signature verification instead of ecrecover, causing "FiatTokenV2: invalid signature" errors.
To fund the consumer with USDC on a forked chain, use anvil_setStorageAt to write the balance directly. This avoids relying on testnet faucets that may be unavailable on a local fork:
# Fund consumer with USDC (Base Sepolia USDC: 0x036CbD53842c5426634e7929541eC2318f3dCF7e)
# Storage slot for balanceOf mapping is slot 9 in FiatTokenV2
CONSUMER=0xa0Ee7A142d267C1f36714E4a8F75612F20a79720
USDC=0x036CbD53842c5426634e7929541eC2318f3dCF7e
# Compute storage slot: keccak256(abi.encode(address, uint256(9)))
SLOT=$(cast index address "$CONSUMER" 9)
# Set balance to 1000 USDC (1000 * 10^6, 6 decimals)
cast rpc anvil_setStorageAt "$USDC" "$SLOT" \
"0x000000000000000000000000000000000000000000000000000000003B9ACA00" \
--rpc-url http://localhost:8545
# Verify
cast call "$USDC" "balanceOf(address)(uint256)" "$CONSUMER" --rpc-url http://localhost:8545
The x402-rs project provides a Rust-based facilitator. Build it from source and run it on the host:
# Clone and build (the path is quoted because an unquoted & would background the command)
cd ~/Development/"R&D"
git clone https://github.com/x402-rs/x402-rs.git
cd x402-rs
cargo build --release
# Create config for Base Sepolia
# The facilitator wallet needs Base Sepolia ETH for gas when settling payments.
export FACILITATOR_PRIVATE_KEY="0x<your-funded-private-key>"
cat > config-sepolia.json << EOF
{
"port": 4040,
"host": "0.0.0.0",
"chains": {
"eip155:84532": {
"eip1559": true,
"flashblocks": false,
"signers": ["$FACILITATOR_PRIVATE_KEY"],
"rpc": [{"http": "https://sepolia.base.org", "rate_limit": 25}]
}
},
"schemes": [
{"id": "v1-eip155-exact", "chains": "eip155:*"},
{"id": "v2-eip155-exact", "chains": "eip155:*"}
]
}
EOF
# Start the facilitator
./target/release/x402-facilitator --config config-sepolia.json
Tip
For testing with Anvil, point the RPC at your local fork:
"rpc": [{"http": "http://127.0.0.1:8545", "rate_limit": 50}]
Verify it's running:
curl -s http://localhost:4040/supported | jq .
Point the x402 verifier at your self-hosted facilitator:
obol sell pricing \
--wallet 0x70997970C51812dc3A010C7d01b50e0d17dc79C8 \
--chain base-sepolia \
--facilitator-url http://host.k3d.internal:4040
The k3d cluster can reach the host via host.k3d.internal. The HTTPS exemption allowlist permits HTTP for this address.
Note
You can also set the facilitator URL via the X402_FACILITATOR_URL
environment variable.
# List offers in the llm namespace
obol sell list --namespace llm
# Detailed status with conditions
obol sell status my-qwen --namespace llm
# Cluster-wide pricing and registration status
obol sell status
Stop serving an offer without deleting it. This removes the pricing route so requests pass through without payment:
obol sell stop my-qwen --namespace llm
The CR and any ERC-8004 registration remain intact. Re-create the offer with the same name to restart.
# Delete with confirmation prompt
obol sell delete my-qwen --namespace llm
# Delete without confirmation
obol sell delete my-qwen --namespace llm --force
Deletion:
- Removes the ServiceOffer CR
- Cascades Middleware and HTTPRoute via OwnerReferences
- Removes the pricing route from the x402 verifier
- Deactivates the ERC-8004 registration (sets active=false)
Verify cleanup:
obol kubectl get so my-qwen -n llm # NotFound
obol kubectl get middleware x402-my-qwen -n llm # NotFound
obol kubectl get httproute so-my-qwen -n llm # NotFound
The x402 verifier sits in the request path as a Traefik ForwardAuth middleware:
Client
|
POST /services/my-qwen/v1/chat/completions
|
v
Traefik Gateway
|
--> ForwardAuth to x402-verifier.x402.svc:8080
| |
| +-- Match request path against pricing routes
| +-- No match? Return 200 (allow, free route)
| +-- Match + no payment header? Return 402 + requirements
| +-- Match + payment header? Verify with facilitator
| | |
| | +-- POST facilitator/verify
| | +-- Valid? Return 200 (allow)
| | +-- Invalid? Return 402
| |
| <-- 200 or 402
|
+-- 200? Proxy to upstream (Ollama)
+-- 402? Return to client with payment requirements
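The decision tree in the diagram above can be condensed into a small function. This is an illustrative sketch, not the verifier's source: the name `forward_auth_status` is hypothetical, route patterns are assumed to be shell-style globs such as /services/my-qwen/*, and the `verify` callable stands in for the POST to the facilitator.

```python
from fnmatch import fnmatchcase
from typing import Callable, Iterable, Optional


def forward_auth_status(
    path: str,
    payment_header: Optional[str],
    route_patterns: Iterable[str],
    verify: Callable[[str], bool],
) -> int:
    """Return the HTTP status the ForwardAuth check would hand back to Traefik."""
    priced = any(fnmatchcase(path, pattern) for pattern in route_patterns)
    if not priced:
        return 200  # no pricing route matches: free route, allow
    if not payment_header:
        return 402  # priced route without payment: ask the client to pay
    # Priced route with a payment header: defer to the facilitator's verdict.
    return 200 if verify(payment_header) else 402


routes = ["/services/my-qwen/*"]
always_valid = lambda header: True  # stand-in for a facilitator call

print(forward_auth_status("/rpc", None, routes, always_valid))
print(forward_auth_status("/services/my-qwen/v1/chat/completions", None, routes, always_valid))
```

Traefik proxies to the upstream only when the returned status is 200; a 402 is passed back to the client with the payment requirements.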
+------------+
| ModelReady | (pull model via Ollama API)
+-----+------+
|
+--------v---------+
| UpstreamHealthy | (health-check service)
+--------+---------+
|
+----------v-----------+
| PaymentGateReady | (create Middleware + pricing route)
+----------+-----------+
|
+---------v----------+
| RoutePublished | (create HTTPRoute)
+---------+----------+
|
+---------v----------+
| Registered | (ERC-8004, optional)
+---------+----------+
|
+-----v-----+
| Ready | (all conditions True)
+-----------+
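The readiness rule in this pipeline can be sketched as follows. The condition names come from the stages above; the helpers `is_ready` and `next_pending` are hypothetical, and treating Registered as required only when registration was requested is an assumption based on the "Skipped (--register not set)" behaviour described earlier.

```python
from typing import Optional

# Stage order taken from the reconciliation pipeline above.
STAGES = ("ModelReady", "UpstreamHealthy", "PaymentGateReady", "RoutePublished")


def required_stages(register_requested: bool) -> tuple:
    """Registered only counts when on-chain registration was requested."""
    return STAGES + (("Registered",) if register_requested else ())


def is_ready(conditions: dict, register_requested: bool = False) -> bool:
    """Ready when every required condition has status "True"."""
    return all(conditions.get(name) == "True" for name in required_stages(register_requested))


def next_pending(conditions: dict, register_requested: bool = False) -> Optional[str]:
    """First required stage that is not yet True, or None when all are done."""
    for name in required_stages(register_requested):
        if conditions.get(name) != "True":
            return name
    return None


conditions = {
    "ModelReady": "True",
    "UpstreamHealthy": "True",
    "PaymentGateReady": "False",
}
print(is_ready(conditions), next_pending(conditions))
```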
When the agent reconciles a ServiceOffer named my-qwen in namespace llm:
| Resource | API version | Namespace | Name |
|---|---|---|---|
| ServiceOffer | obol.org/v1alpha1 | llm | my-qwen |
| Middleware | traefik.io/v1alpha1 | llm | x402-my-qwen |
| HTTPRoute | gateway.networking.k8s.io/v1 | llm | so-my-qwen |
| ConfigMap patch | v1 | x402 | x402-pricing (route added) |
The Middleware and HTTPRoute have ownerReferences pointing at the ServiceOffer, so they are garbage-collected on deletion.
The x402 verifier reads its config from the x402-pricing ConfigMap:
wallet: "0x70997970C51812dc3A010C7d01b50e0d17dc79C8"
chain: "base-sepolia"
facilitatorURL: "https://facilitator.x402.rs"
verifyOnly: false
routes:
- pattern: "/services/my-qwen/*"
price: "0.001"
description: "my-qwen inference"
payTo: "0x70997970C51812dc3A010C7d01b50e0d17dc79C8"
    network: "base-sepolia"
This configuration is used by the litellm-config ConfigMap in the llm namespace, which LiteLLM reads for model_list configuration.
Per-route payTo and network override the global values, enabling multiple ServiceOffers with different wallets or chains.
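A minimal sketch of that override rule, assuming the config has already been parsed into a dictionary with the keys shown above. The function name `effective_route` and the second route (my-llama with wallet 0xSecondSellerWallet) are hypothetical illustrations.

```python
def effective_route(config: dict, route: dict) -> dict:
    """Resolve the wallet and network a route is charged with.

    Per-route payTo/network win; otherwise fall back to the global
    wallet/chain values.
    """
    return {
        "pattern": route["pattern"],
        "price": route["price"],
        "payTo": route.get("payTo") or config["wallet"],
        "network": route.get("network") or config["chain"],
    }


config = {
    "wallet": "0x70997970C51812dc3A010C7d01b50e0d17dc79C8",
    "chain": "base-sepolia",
    "routes": [
        # No overrides: inherits the global wallet and chain.
        {"pattern": "/services/my-qwen/*", "price": "0.001"},
        # Hypothetical second offer paying out to a different wallet.
        {"pattern": "/services/my-llama/*", "price": "0.002",
         "payTo": "0xSecondSellerWallet"},
    ],
}

for route in config["routes"]:
    print(effective_route(config, route))
```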
The agent reconciles on a heartbeat (~60 seconds). Check agent logs:
obol kubectl logs -n openclaw-obol-agent -l app=openclaw --tail=50
The pricing route may not have been added, or it may have been overwritten. Check the ConfigMap:
obol kubectl get cm x402-pricing -n x402 -o jsonpath='{.data.pricing\.yaml}'
Ensure a route matching your path exists in the routes list. The verifier logs its route count at startup:
obol kubectl logs -n x402 -l app=x402-verifier --tail=10
# Look for: "routes: 1" (or however many you expect)
If routes are missing, the agent may not have reconciled yet (heartbeat is ~60s). You can also re-trigger reconciliation by deleting and re-creating the ServiceOffer.
If using a self-hosted facilitator on the host, verify the k3d bridge:
obol kubectl run -n x402 curl-test --rm -it --restart=Never \
--image=curlimages/curl -- \
curl -s http://host.k3d.internal:4040/health
Verify the model is available in your host Ollama:
curl -s http://localhost:11434/api/tags | python3 -c "import sys,json; [print(m['name']) for m in json.load(sys.stdin)['models']]"
LiteLLM discovers models from the configured providers. If you pulled a model after the cluster started, you may need to restart LiteLLM:
obol kubectl rollout restart deployment/litellm -n llm
Cloudflare Quick Tunnels assign a random URL that changes on restart. Get the current URL:
obol tunnel status
The "FiatTokenV2: invalid signature" error has two common causes:
1. Contract code at buyer address -- On Anvil forks, deterministic test accounts (0xf39F..., 0x7099..., 0xa0Ee..., etc.) often have contract code at their addresses from the live chain state. The USDC SignatureChecker tries EIP-1271 contract verification instead of ecrecover. Clear the code:
cast rpc anvil_setCode <buyer-address> 0x --rpc-url http://localhost:8545
2. Wrong EIP-712 domain name -- The USDC contract on Base Sepolia uses the domain name "USDC" (not "USD Coin" like on Ethereum mainnet). Verify:
cast call 0x036CbD53842c5426634e7929541eC2318f3dCF7e "name()(string)" --rpc-url http://localhost:8545
# -> "USDC"
Ensure your EIP-712 signing code uses the correct domain: {name: "USDC", version: "2", chainId: 84532, verifyingContract: 0x036CbD53842c5426634e7929541eC2318f3dCF7e}.
See Part 3.2 for full Anvil setup details.
The x402 verifier returns 400 when the payment payload is malformed. Ensure the X-PAYMENT header contains the full x402 envelope with all required fields:
- x402Version (integer, e.g., 1)
- scheme (e.g., "exact")
- network (e.g., "base-sepolia")
- payload (the signed authorization data)
- resource (the URL path being paid for)
Missing any of these fields causes the facilitator to reject the payment before signature verification.
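When debugging a 400, it can help to decode the header locally before sending it. The sketch below assumes the header value is base64-encoded JSON, as described for the X-PAYMENT envelope, and checks for the fields listed above. The helper name `missing_envelope_fields` is hypothetical, and the check is a debugging aid rather than a copy of the verifier's validation.

```python
import base64
import json

# Required fields as listed in this guide.
REQUIRED_FIELDS = ("x402Version", "scheme", "network", "payload", "resource")


def missing_envelope_fields(header_value: str) -> list:
    """Return the required fields absent from a base64-encoded x402 envelope.

    An undecodable or non-object header is reported as missing everything.
    """
    try:
        envelope = json.loads(base64.b64decode(header_value, validate=True))
    except ValueError:  # covers bad base64, bad UTF-8, and bad JSON
        return list(REQUIRED_FIELDS)
    if not isinstance(envelope, dict):
        return list(REQUIRED_FIELDS)
    return [field for field in REQUIRED_FIELDS if field not in envelope]


# Example envelope with a placeholder payload and no "resource" field.
envelope = {
    "x402Version": 1,
    "scheme": "exact",
    "network": "base-sepolia",
    "payload": {"signature": "0x...", "authorization": {}},
}
header = base64.b64encode(json.dumps(envelope).encode()).decode()
print(missing_envelope_fields(header))
```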
If the OpenClaw agent cannot create or patch Kubernetes resources (ServiceOffers, Middlewares, HTTPRoutes), the ClusterRoleBindings may have empty subjects lists. Patch them manually:
# Patch both ClusterRoleBindings
for BINDING in openclaw-monetize-read-binding openclaw-monetize-workload-binding; do
kubectl patch clusterrolebinding "$BINDING" \
--type=json \
-p '[{"op":"add","path":"/subjects","value":[{"kind":"ServiceAccount","name":"openclaw","namespace":"openclaw-obol-agent"}]}]'
done
# Patch x402 namespace RoleBinding
kubectl patch rolebinding openclaw-x402-pricing-binding -n x402 \
--type=json \
  -p '[{"op":"add","path":"/subjects","value":[{"kind":"ServiceAccount","name":"openclaw","namespace":"openclaw-obol-agent"}]}]'
Replace openclaw-obol-agent with your actual OpenClaw namespace if different.
| Command | Description |
|---|---|
| obol sell pricing --wallet ... --chain ... | Configure x402 payment settings |
| obol sell http <name> --wallet ... --chain ... --per-request ... --upstream ... --port ... | Create a ServiceOffer |
| obol sell list | List all ServiceOffers |
| obol sell status <name> -n <ns> | Show conditions for an offer |
| obol sell stop <name> -n <ns> | Pause an offer (remove pricing route) |
| obol sell delete <name> -n <ns> | Delete an offer and cleanup |
| obol sell status | Show cluster pricing and registration |
| obol sell register --private-key-file ... | Register on ERC-8004 |
| Resource | Namespace | Purpose |
|---|---|---|
| x402-pricing ConfigMap | x402 | Pricing routes and wallet config |
| x402-secrets Secret | x402 | Wallet address |
| x402-verifier Deployment | x402 | ForwardAuth payment verifier |
| serviceoffers.obol.org CRD | (cluster) | ServiceOffer custom resource definition |
| traefik-gateway Gateway | traefik | Main ingress gateway |
| Variable | Default | Description |
|---|---|---|
| X402_WALLET | (none) | USDC recipient wallet address |
| X402_FACILITATOR_URL | (none) | Override facilitator URL |
| CONSUMER_PRIVATE_KEY | (none) | Buyer wallet key (for SDK) |