Work Unit: WU 3.3 Author: Pedro (operator) Date: 2026-03-09 Status: Implemented (engine + API + migrations) AN Score Version: 0.3
AN Score v0.2 evaluates tools on two axes:
- Execution (17 dimensions: I1-I7 infrastructure, F1-F7 interface, O1-O3 operational)
- Access Readiness (6 dimensions: A1-A6)
v0.3 adds a third axis: Agent Autonomy — measuring a tool's support for fully autonomous agent operations. This captures capabilities that don't fit neatly into execution quality or access friction, but matter enormously for the agent future:
- P1 — Payment Autonomy: Can an agent pay for this tool without human intervention?
- G1 — Governance Readiness: Does the tool support enterprise-grade agent oversight?
- W1 — Web Agent Accessibility: Can a web agent navigate the tool's dashboard/admin UI?
Payment Autonomy — From Levie's "trillions of agents" thesis + Tom's insight about agent microtransactions:
- x402 protocol: HTTP 402 Payment Required → instant stablecoin micropayments. 75M tx in last 30 days, $24M volume. Coinbase-led, Cloudflare integrated, Google A2A extension live.
- Stripe Issuing: Virtual cards via API ($0.10/card), per-merchant spending controls, Agentic Commerce Protocol (ACP) with OpenAI. Manus uses this.
- Coinbase AgentKit/Agentic Wallets: Agent-owned wallets (Feb 2026), built-in security guardrails, Base L2 stablecoins.
- Google AP2: Agent Payments Protocol, 60+ partners (Visa, Mastercard, Coinbase), Verifiable Credentials for authorization.
- Reality: Most SaaS tools require human credit card entry. This dimension measures how close a tool is to accepting agent-initiated payment.
Governance Readiness — From enterprise agent deployment patterns:
- Agents need per-agent identity, RBAC/ABAC scoped access, audit trails, kill switches.
- Standards: NIST AI RMF, ISO/IEC 42001, EU AI Act (high-risk tier, 2025 enforcement).
- Azure RBAC + Entra Agent Identity, Okta non-human identity, MintMCP compliance proxy.
- Reality: Enterprise won't deploy agents on tools without audit trails and access controls. This is the enterprise gatekeeping dimension.
Web Agent Accessibility — From AAG v0.1 (Agent Accessibility Guidelines):
- Agents interact via 6 channels: DOM/a11y tree, screenshots, CDP, HTML parsing, structured data, keyboard simulation.
- Three conformance levels: A (Parseable), AA (Navigable), AAA (Native).
- Key gaps: CAPTCHA/bot detection, dynamic content without aria labels, non-semantic HTML.
- Reality: Agents that can't use the web UI are limited to API-only. This dimension captures web-layer operability for agents that browse.
Question: Can an agent acquire and pay for this tool without human intervention?
Scoring Rubric (1-10):
| Score | Description |
|---|---|
| 9-10 | Native agent payment: Supports x402, AP2, or programmatic payment via API. Agent can go from discovery to paid access with zero human steps. Consumption billing available. |
| 7-8 | API-provisioned billing: Credit card or payment method can be added via API (Stripe Issuing virtual card accepted). Usage-based or per-seat billing programmatically manageable. |
| 5-6 | Semi-automated: Payment requires web form, but standard card fields (no CAPTCHA). Agent with web access could complete with Stripe Issuing card. Invoicing available for enterprise. |
| 3-4 | Human-gated: Payment requires interactive verification (3D Secure mandatory, phone verification, human-reviewed invoice). Agent needs human handoff for payment. |
| 1-2 | Fully manual: Custom pricing, sales-led only, no self-serve payment. "Contact sales" is the only path. |
Evidence Sources:
- Pricing page structure (self-serve vs. sales-led)
- API billing endpoints (if any)
- Payment method acceptance (x402, API card, web form, invoice)
- Trial/freemium availability (reduces payment friction)
- Consumption vs. commitment billing model
Question: Does this tool support enterprise-grade agent oversight, audit, and access control?
Scoring Rubric (1-10):
| Score | Description |
|---|---|
| 9-10 | Full agent governance: Per-agent identity/API keys, RBAC/ABAC with granular scoping, immutable audit logs, data residency controls, SOC 2/ISO 27001 certified, SCIM provisioning, webhook events for all mutations. |
| 7-8 | Strong controls: Team/org-level access control, API key scoping (read/write/admin), activity logs with timestamps, SSO/SAML, data processing agreements. Missing per-agent identity or granular RBAC. |
| 5-6 | Basic controls: API keys with some scoping, basic activity logs in dashboard, team management, but no RBAC granularity. No audit log export. Limited data residency options. |
| 3-4 | Minimal: Single API key per account, no activity logging visible to customer, no access scoping, no compliance certifications publicly listed. |
| 1-2 | None: No access controls beyond login. No audit trail. No compliance documentation. |
Evidence Sources:
- API key scoping (read-only, write, admin, custom)
- RBAC/team management in API
- Audit log availability + export
- Compliance certifications (SOC 2, ISO 27001, HIPAA, GDPR DPA)
- SCIM/SSO support
- Data residency options
- Webhook coverage for mutations
Question: Can a web-browsing agent effectively operate this tool's dashboard and admin interface?
Scoring Rubric (1-10):
| Score | Description |
|---|---|
| 9-10 | AAA: Agent-Native Web UI: Semantic HTML + ARIA throughout, keyboard-navigable, no CAPTCHA on authenticated sessions, API parity (anything in UI is also in API), llms.txt or agent-flows.json published, token-efficient DOM. |
| 7-8 | AA: Agent-Navigable: Good semantic structure, most interactive elements labeled, keyboard-accessible forms, no anti-bot measures on authenticated paths. Minor gaps (some custom components without ARIA). |
| 5-6 | A: Agent-Parseable: Standard HTML forms, some semantic structure. Dashboard usable by screenshot+click agent but fragile. Some unlabeled interactive elements. No aggressive anti-bot. |
| 3-4 | Poor: Heavy JavaScript rendering (SPA with client-side only), unlabeled buttons/inputs, some bot detection on authenticated paths, significant portions require visual interpretation. |
| 1-2 | Hostile: Aggressive CAPTCHA/bot detection, entirely canvas-rendered UI, no semantic structure, anti-automation measures on all paths. |
Evidence Sources:
- Lighthouse accessibility score (proxy)
- Semantic HTML usage (headings, landmarks, labels)
- ARIA attributes on interactive elements
- Keyboard navigability
- CAPTCHA/bot detection on authenticated sessions
- API parity (can API do everything dashboard can?)
- llms.txt or agent discovery metadata
The three new dimensions add a third aggregate axis. Implemented weight allocation:
Execution axis (I1-I7, F1-F7, O1-O3): 0.45
Access axis (A1-A6): 0.40
Autonomy axis (P1, G1, W1): 0.15
- P1 (Payment Autonomy): 0.06
- G1 (Governance Readiness): 0.05
- W1 (Web Agent Accessibility): 0.04
Aggregate formula (v0.3):
AN = (execution × 0.45) + (access × 0.40) + (autonomy × 0.15)
Where:
executionis the weighted I/F/O score (0-10)accessis the weighted A1-A6 score (0-10)autonomyis the weighted P1/G1/W1 score (0-10)
Rationale: These dimensions are forward-looking. Today, most agents don't pay autonomously. But the trajectory is clear: Stripe ACP + x402 + Coinbase AgentKit = payment autonomy within 12 months. Governance is already gating enterprise adoption. Web accessibility is the present (agents browse now). Weighting autonomy at 15% of aggregate reflects importance without overwhelming validated execution/access signal.
For each of the 50 services in artifacts/dataset-scores.json:
- P1 (Payment Autonomy): Research pricing page, billing API, payment acceptance
- G1 (Governance Readiness): Research RBAC, audit logs, compliance certs, API key scoping
- W1 (Web Agent Accessibility): Quick audit of dashboard accessibility (semantic HTML, ARIA, CAPTCHA)
Output: Updated dataset with 3 new dimension scores per service.
- Added
P1,G1,W1autonomy weighting (0.06/0.05/0.04) inscoring.py - Added
AUTONOMY_DIMENSION_WEIGHTSand three calculator functions:calculate_payment_autonomy()calculate_governance_readiness()calculate_web_accessibility()
- Updated
AN_SCORE_VERSIONto"0.3" - Rebalanced aggregate formula to
execution(0.45) + access(0.40) + autonomy(0.15) - Added Supabase migrations:
0009_autonomy_dimensions.sql0010_seed_autonomy_scores.sql
GET /v1/services/{slug}/score now includes an autonomy section:
{
"autonomy_score": 9.3,
"autonomy": {
"avg": 9.3,
"confidence": 0.9,
"dimensions": [
{"code": "P1", "name": "payment_autonomy", "score": 10.0, "rationale": "x402 / API-native payments", "confidence": 0.9},
{"code": "G1", "name": "governance_readiness", "score": 10.0, "rationale": "RBAC + audit logs", "confidence": 0.9},
{"code": "W1", "name": "web_accessibility", "score": 8.0, "rationale": "AAG AA/AAA structure", "confidence": 0.9}
]
}
}- Add autonomy axis to service detail pages
- Add new dimensions to leaderboard cards
- Update
llms.txtwith new dimension definitions - Blog post: "We Added 3 New Dimensions to the AN Score" (content + MEO)
This is the research nobody has published yet. First-mover content opportunity.
| Protocol | Owner | Status | Mechanism | Agent Support |
|---|---|---|---|---|
| x402 | Coinbase | Production (75M tx) | HTTP 402 → stablecoin | Native (designed for agents) |
| AP2 | Production (60+ partners) | Verifiable Credentials + Mandates | Native | |
| ACP | Stripe + OpenAI | Production (ChatGPT) | Shared Payment Tokens | Native |
| UCP | Production (Jan 2026) | Standardized checkout | Agent-compatible |
| Product | Owner | Mechanism | Use Case |
|---|---|---|---|
| Stripe Issuing | Stripe | Virtual cards via API | Agent purchases at any merchant |
| AgentKit | Coinbase | Wallet SDK | Agent-owned crypto wallets |
| Agentic Wallets | Coinbase | Managed wallets | Autonomous spending/earning |
| Base L2 | Coinbase | Low-fee blockchain | $0.001/tx stablecoin payments |
- 0 of 50 scored services accept x402 or AP2 natively
- ~5 of 50 have API-manageable billing (Stripe, Vercel, Supabase, Cloudflare, Resend)
- ~30 of 50 accept standard card payment via web form (automatable with Stripe Issuing)
- ~15 of 50 require sales/custom pricing or have complex payment flows
- The gap is enormous. This is why Rhumb's Access layer exists.
- Stripe: API billing, consumption model, obviously 10/10
- Vercel: API billing, team management, usage-based
- Supabase: Self-serve, API-manageable, usage-based
- Cloudflare: Workers has API billing, pay-per-use
- Resend: Self-serve, usage-based, API billing
- Most SaaS with self-serve credit card forms
- GitHub, Linear, Notion, Slack, etc.
- Enterprise-only tools (Salesforce, custom pricing)
- Tools requiring phone/human verification
- Sales-led platforms
This spec and its outputs create content for three meaning-space positions:
- "agent payment landscape" — Nobody has mapped this comprehensively. Own it.
- "AN Score dimensions" — Rhumb is the only entity defining scoring dimensions for agent-tool compatibility. Compound position.
- "agent accessibility guidelines" (AAG) — Empty meaning-space. The WCAG for AI agents. Own it.
Each dimension produces a blog post, a leaderboard filter, and a data point in every service profile. Compound content generation from a single research effort.