codertray/local-a11y-ai

ReaderHelper Core

ReaderHelper Core is an experimental TypeScript core for using on-device AI to improve web accessibility. The project is intentionally small right now: it provides shared contracts, DOM candidate discovery, prompt builders, adapter interfaces, structured patch validation, mutation application, rollback, and tests.

There is not yet a surrounding browser extension, Safari app extension, macOS app, iOS app, settings UI, packaging flow, or production host shell in this repository. The goal of open sourcing the core is to invite collaboration on the hard shared layer first, then grow toward host integrations once the behavior is reliable across real pages and real on-device models.

Why This Exists

The long-term hope is that on-device AI can make the web meaningfully more accessible without sending page content to remote services. ReaderHelper is exploring that idea through focused, reversible page improvements such as:

  • repairing missing or misleading ARIA attributes on controls
  • creating screen-reader-friendly spoken renderings of code blocks
  • creating concise spoken renderings of tables
  • laying groundwork for image alt text, to be enabled once on-device multimodal support is more widely available

Image alt text is part of the plan, but it is intentionally hidden by default today. Some platforms are beginning to expose image input, but the project is waiting until more on-device models can handle image input reliably and consistently before treating image descriptions as a normal feature.

Current Status

This repository currently contains the core engine only.

Available pieces:

  • src/core: candidate discovery, DOM snapshots, scheduling, caching, mutation application, rollback
  • src/contracts: public TypeScript contracts and JSON Schema validation for model patches
  • src/prompts: built-in prompts for ARIA repair, spoken code blocks, and spoken tables
  • src/adapters: adapter surfaces for Chrome Prompt API / Gemini Nano, Safari native Apple bridge, and remote Atlas/OpenAI-style fallback
  • tests: Vitest coverage using jsdom
  • scripts: small experiments, including a Foundation Models bionic-reader prompt harness

Not included yet:

  • Chrome extension manifest/content-script shell
  • Safari Web Extension wrapper
  • native macOS/iOS bridge app
  • production settings UI
  • release packaging
  • user-facing toggle UI

How It Works

ReaderHelper discovers candidate elements such as controls, images, tables, code blocks, and preformatted text. It builds a compact snapshot of the target and surrounding context, resolves a prompt for the candidate slot, sends the request through an adapter, validates the model response against a structured patch schema, and applies the result if confidence and policy checks pass.
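The pipeline above can be sketched roughly as follows. The type names (Snapshot, Patch, Adapter), the prompt text, and the confidence threshold are all illustrative assumptions, not the real API in src/core or src/contracts:

```typescript
// Illustrative per-candidate pipeline: snapshot -> prompt -> adapter ->
// validate -> confidence/policy gate. Names are assumptions, not the real API.
interface Snapshot { tag: string; text: string; }
interface Patch { slot: string; confidence: number; payload: Record<string, string>; }
interface Adapter { prompt(input: string): Promise<string>; }

const MIN_CONFIDENCE = 0.7; // assumed policy threshold

// Parse and structurally validate the raw model response; reject free-form text.
function validatePatch(raw: string): Patch | null {
  try {
    const parsed = JSON.parse(raw);
    if (typeof parsed.slot !== "string" || typeof parsed.confidence !== "number") return null;
    return parsed as Patch;
  } catch {
    return null;
  }
}

async function processCandidate(snapshot: Snapshot, adapter: Adapter): Promise<Patch | null> {
  const prompt = `Improve accessibility of <${snapshot.tag}>: ${snapshot.text}`;
  const raw = await adapter.prompt(prompt);
  const patch = validatePatch(raw);
  if (patch === null || patch.confidence < MIN_CONFIDENCE) return null;
  return patch; // the caller applies the mutation
}
```

Because the model response is parsed into a structured patch before anything touches the DOM, a malformed or low-confidence response simply results in no mutation.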

For semantic replacements such as tables and code blocks, the original element is not removed from the DOM. The core marks the original as hidden from assistive technology and inserts an adjacent screen-reader-only proxy containing the generated accessible text. Mutations are tracked so they can be rolled back individually or all at once when a host extension toggle turns ReaderHelper off.
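The rollback behavior can be sketched as a ledger of undo closures. The class and method names here are illustrative, not the core's actual API:

```typescript
// Illustrative mutation ledger: every applied change records how to reverse
// itself, so changes can be undone individually or all at once.
type Undo = () => void;

class MutationLedger {
  private undos: Undo[] = [];

  // Record how to reverse a mutation at the moment it is applied.
  track(undo: Undo): void {
    this.undos.push(undo);
  }

  // Reverse everything, newest first, so later mutations unwind before
  // the earlier ones they may depend on.
  rollbackAll(): void {
    while (this.undos.length > 0) {
      this.undos.pop()!();
    }
  }
}
```

For a table replacement, for example, the tracked undo would restore the original element's assistive-technology visibility and remove the inserted screen-reader-only proxy node.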

Built-In Prompt Slots

  • label-control: repairs ARIA attributes for controls when a safe fix is supported by context.
  • semantic-code-block: converts code blocks into dense spoken plain text.
  • semantic-table: converts relational, comparison, and non-relational tables into concise spoken text.
  • image-alt: reserved for image descriptions, default-off until on-device multimodal support is dependable.
  • semantic-image: reserved for future semantic image handling.
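A simplified, assumed mapping from candidate element kind to prompt slot might look like this; the real resolver in src/prompts may use richer context than a tag name:

```typescript
// Illustrative slot resolution by tag name (an assumption, not the real logic).
function resolveSlot(tag: string, imagesEnabled: boolean): string | null {
  switch (tag) {
    case "table":
      return "semantic-table";
    case "pre":
    case "code":
      return "semantic-code-block";
    case "img":
      // image-alt exists but is default-off until on-device multimodal
      // support is dependable.
      return imagesEnabled ? "image-alt" : null;
    case "button":
    case "input":
    case "select":
      return "label-control";
    default:
      return null; // not a candidate
  }
}
```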

Long content is reduced before prompting. Tables are snapshotted with headers, early rows, tail rows, and omitted-row markers. Long code and table prompts preserve both the beginning and end of the content where possible.
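The head/tail reduction can be sketched like this; the kept-row counts and the marker text are assumptions, not the core's actual defaults:

```typescript
// Illustrative head/tail truncation for table snapshots: keep the first few
// rows and the last few rows, with an explicit omitted-row marker between.
function snapshotRows(rows: string[], head = 3, tail = 2): string[] {
  if (rows.length <= head + tail) return rows; // nothing to omit
  const omitted = rows.length - head - tail;
  return [
    ...rows.slice(0, head),
    `[${omitted} rows omitted]`,
    ...rows.slice(rows.length - tail),
  ];
}
```

Keeping both ends matters because summary rows and totals often sit at the bottom of a table, and headers and representative data sit at the top.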

Installation

This package is not published yet. For local development:

npm install
npm test
npm run typecheck
npm run build

The package is currently marked private in package.json while the API is still settling.

Basic Integration Sketch

A host extension or app creates an engine with one or more adapters:

import {
  ChromeNanoAdapter,
  SafariAppleBridgeAdapter,
  createAccessibilityEngine
} from "./src/index.js";

const engine = createAccessibilityEngine({
  document,
  window,
  adapters: [
    new ChromeNanoAdapter(),
    new SafariAppleBridgeAdapter()
  ],
  policy: {
    processInitialDocument: true,
    enableImageHandling: false
  }
});

await engine.start();

To support a user-facing on/off toggle:

engine.stop();
engine.rollbackAllMutations();

To process a specific subtree after a host app decides it is safe:

await engine.processNode(document.querySelector("main")!);

Adapter Notes

The core does not own any model runtime. Hosts provide adapters.

  • ChromeNanoAdapter targets Chrome's built-in Prompt API / Gemini Nano surface when available.
  • SafariAppleBridgeAdapter expects a native bridge exposed by a Safari/Web Extension host. The JavaScript core should remain bridge-oriented rather than depending directly on Swift APIs.
  • AtlasOpenAIAdapter is a remote fallback shape and should require explicit opt-in before page content leaves the device.

Adapters report capabilities so hosts can decide what to enable. The default policy keeps image handling off.
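A hypothetical capability-reporting shape shows how a host could apply that policy; the real interface in src/adapters may differ:

```typescript
// Hypothetical capability surface; field and method names are assumptions.
interface AdapterCapabilities {
  onDevice: boolean;   // prompts run without page content leaving the device
  imageInput: boolean; // adapter can accept image prompts
}

interface ModelAdapter {
  name: string;
  capabilities(): AdapterCapabilities;
}

// On-device adapters are always eligible; remote ones only with explicit opt-in.
function selectAdapters(adapters: ModelAdapter[], remoteOptIn: boolean): ModelAdapter[] {
  return adapters.filter((a) => a.capabilities().onDevice || remoteOptIn);
}
```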

Agent Use And Repository Notes

This repository has been developed with an AI coding agent, and that workflow is intentionally documented.

  • agentattention.md explains the lightweight repo rules for future agents.
  • notes.md is append-only and records repo-specific implementation findings, vendor-doc discoveries, and decisions that matter later.
  • log.csv is append-only and records implementation edits for auditability.

If you use an agent in this repo, please keep those files useful rather than noisy. Add entries only for concrete implementation work, incidents, current vendor-doc findings, or decisions that future contributors need to understand.

Development Practices

  • Prefer structured patches over free-form model output.
  • Keep prompts small enough for on-device context windows.
  • Preserve user-visible page layout whenever possible.
  • Make every mutation reversible.
  • Treat remote model fallback as opt-in only.
  • Add tests for candidate discovery, prompt injection, validation, mutation, rollback, and adapter behavior.
  • Do not enable image handling by default until on-device image understanding is broadly reliable.

Useful commands:

npm test
npm run typecheck
npm run build

Contributing

Contributions are welcome, especially around:

  • real-world prompt failures on complex web pages
  • host extension scaffolding for Chrome or Safari
  • native Apple bridge experiments
  • on-device model capability checks
  • table and code-block accessibility quality
  • reversible mutation strategies
  • privacy-preserving image alt text once model support matures

Please keep changes focused, tested, and reversible. This project is trying to improve accessibility without making pages more fragile.

License

ReaderHelper Core is licensed under the Apache License, Version 2.0. See LICENSE.txt for details.
