kevinmichaelchen/agent-chime

agent-chime

TTS hooks for agentic workflows on macOS. The goal is simple: when your agent yields or asks for a decision, you hear a short, clear audio cue (TTS or earcon) while still seeing the full text output.

This project is designed for local-first TTS on Apple Silicon and integrates with these CLI tools:

  • claude
  • codex
  • opencode

agent-chime is a small event-driven layer that converts agent lifecycle events into audio prompts.

Why this exists

  • Avoid missing a prompt while your focus is elsewhere.
  • Keep the interaction lightweight with short speech (or earcons) instead of long narration.
  • Remain local-first for privacy and low latency.

Core ideas

  • Event hooks: detect yield and decision moments from agent tools.
  • CLI adapters: pluggable adapters per CLI tool, easy to extend later.
  • TTS broker: normalize messages into a compact, spoken prompt.
  • TTS provider: unified mlx-audio interface for local synthesis.
  • Audio renderer: plays back audio immediately with minimal delay.

TTS Provider

We use mlx-audio as a unified TTS interface. It runs locally on Apple Silicon and supports multiple models:

Model                                Speed      Use Case
mlx-community/Spark-TTS-0.5B-bf16    ~0.3x RT   Default: fast, good quality
mlx-community/Spark-TTS-0.5B-8bit    ~0.3x RT   Quantized, lower memory
mlx-community/pocket-tts             ~1.3x RT   Fastest, smallest (~1GB memory)

All models share the same API through mlx-audio's generate_audio() function.

Quick Example

from mlx_audio.tts.generate import generate_audio

generate_audio(
    text="Ready.",
    model="mlx-community/Spark-TTS-0.5B-bf16",
    file_prefix="output",
    play=True
)

Installation

uv add mlx-audio

Quick Start

Configure your CLI tool to invoke agent-chime notify on relevant events.

Claude Code

Add to ~/.claude/settings.json:

{
  "hooks": {
    "Stop": [
      {
        "hooks": [
          { "type": "command", "command": "agent-chime notify --source claude" }
        ]
      }
    ],
    "Notification": [
      {
        "hooks": [
          { "type": "command", "command": "agent-chime notify --source claude" }
        ]
      }
    ]
  }
}

See Claude hooks docs.

Codex

Add to ~/.codex/config.toml:

notify = ["agent-chime", "notify", "--source", "codex"]

See Codex config docs.

OpenCode

Create .opencode/plugin/agent-chime.js:

export const AgentChimePlugin = async ({ $ }) => ({
  event: async ({ event }) => {
    if (event.type === "session.idle")
      await $`agent-chime notify --source opencode --event AGENT_YIELD`;
    if (event.type === "session.error")
      await $`agent-chime notify --source opencode --event ERROR_RETRY`;
    if (event.type === "permission.asked")
      await $`agent-chime notify --source opencode --event DECISION_REQUIRED`;
  },
});

See OpenCode plugin events for the event list. ERROR_RETRY is only available for OpenCode via session.error.

Documents

  • requirements.md: functional and non-functional requirements
  • design.md: architecture, event model, and data flow

Scope (initial)

  • macOS only
  • English-only prompts
  • Short spoken messages (1–2 sentences max)
  • Minimal setup and config
  • Adapter system for claude, codex, opencode, with a clear path to add more

Non-goals (initial)

  • Cross-platform support
  • Long-form narration
  • UI beyond simple CLI config

Installation

# Install dependencies and create virtual environment
uv sync

# Or with dev dependencies
uv sync --all-extras

# Run the CLI
uv run agent-chime --help

Usage

CLI Commands

# Show system info and recommended model
agent-chime system-info

# Show system info as JSON
agent-chime system-info --json

# Test TTS synthesis
agent-chime test-tts
agent-chime test-tts --text "Hello world"
agent-chime test-tts --model "mlx-community/pocket-tts"

# Manage configuration
agent-chime config              # Show config path
agent-chime config --show       # Show current config
agent-chime config --init       # Create default config file
agent-chime config --validate   # Validate configuration

# Process notifications (usually called by hooks)
agent-chime notify --source claude    # Reads JSON from stdin
agent-chime notify --source codex     # Reads JSON from argv
agent-chime notify --source opencode --event AGENT_YIELD
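
As the comments above note, the claude source delivers its JSON payload on stdin while codex delivers it through argv. A minimal sketch of how an adapter might read both is shown below. `read_payload` is a hypothetical helper, and treating the last command-line argument as the Codex payload is an assumption; the README only says "argv".

```python
import io
import json
import sys
from typing import TextIO


def read_payload(source: str, argv: list[str], stdin: TextIO = sys.stdin) -> dict:
    """Return the hook payload for a source, or {} if none is available."""
    if source == "claude":
        raw = stdin.read()        # JSON arrives on stdin
    elif source == "codex":
        # Assumption: the JSON payload is the last command-line argument.
        raw = argv[-1] if argv else ""
    else:
        raw = ""                  # e.g. opencode passes --event explicitly
    try:
        data = json.loads(raw) if raw.strip() else {}
    except json.JSONDecodeError:
        return {}                 # never crash the hook on malformed input
    return data if isinstance(data, dict) else {}


# Simulate a payload arriving on stdin (the payload content is illustrative).
print(read_payload("claude", [], io.StringIO('{"hook_event_name": "Stop"}')))
```

Returning an empty dict on bad input keeps a broken payload from interrupting the agent session; the caller can still fall back to a default template.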

Configuration

Create ~/.config/agent-chime/config.json:

{
  "tts": {
    "model": null,
    "selection_mode": "auto",
    "voice": null
  },
  "volume": 0.8,
  "events": {
    "AGENT_YIELD": {
      "enabled": true,
      "mode": "tts",
      "read_summary": true,
      "template": "Ready."
    },
    "DECISION_REQUIRED": {
      "enabled": true,
      "mode": "tts",
      "read_summary": false,
      "template": "I need your input."
    },
    "ERROR_RETRY": {
      "enabled": true,
      "mode": "earcon"
    }
  }
}
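
To illustrate how per-event settings like these could be resolved, here is a minimal sketch. It assumes a partial user config is merged over built-in defaults, which this README does not confirm; `DEFAULT_EVENTS` and `event_settings` are hypothetical names.

```python
import json

# Defaults mirror the example config above (read_summary omitted for brevity).
DEFAULT_EVENTS = {
    "AGENT_YIELD": {"enabled": True, "mode": "tts", "template": "Ready."},
    "DECISION_REQUIRED": {"enabled": True, "mode": "tts", "template": "I need your input."},
    "ERROR_RETRY": {"enabled": True, "mode": "earcon"},
}


def event_settings(config: dict, event: str) -> dict:
    """Overlay the user's settings for one event on top of the defaults."""
    user = config.get("events", {}).get(event, {})
    return {**DEFAULT_EVENTS.get(event, {}), **user}


# A user config that only switches AGENT_YIELD to an earcon.
config = json.loads('{"events": {"AGENT_YIELD": {"mode": "earcon"}}}')
print(event_settings(config, "AGENT_YIELD"))
```

With this approach a user only needs to list the keys they want to change; everything else keeps its default value.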

Dynamic Model Selection

agent-chime automatically selects the best TTS model based on your system:

RAM               Recommended Model      Notes
≥4GB available    Spark-TTS-0.5B-bf16    Best quality
≥3GB available    Spark-TTS-0.5B-8bit    Quantized, smaller
<3GB available    pocket-tts             Fastest (~1GB memory)

Override with --model or set in config:

{
  "tts": {
    "model": "mlx-community/pocket-tts",
    "selection_mode": "manual"
  }
}

Status

Implemented and ready for testing.

Implementation

  • Language: Python 3.11+
  • TTS: mlx-audio (requires Apple Silicon)
  • Audio: afplay (macOS built-in)
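
Since playback goes through afplay, the renderer can be a thin wrapper around a subprocess call. The sketch below is illustrative only: `afplay_command` and `play` are hypothetical names, and passing the configured volume through afplay's `-v` flag, clamped to the 0 to 1 range, is an assumption about how the volume setting is applied.

```python
import subprocess


def afplay_command(path: str, volume: float = 0.8) -> list[str]:
    """Build the afplay invocation for an audio file."""
    # Assumption: the config volume is a 0..1 scale passed straight to -v.
    volume = max(0.0, min(1.0, volume))
    return ["afplay", "-v", str(volume), path]


def play(path: str, volume: float = 0.8) -> None:
    """Play the file and block until afplay exits (macOS only)."""
    subprocess.run(afplay_command(path, volume), check=True)


print(afplay_command("output.wav"))
```

Separating command construction from execution lets the command be checked on any platform, while `play` itself only works where afplay is installed.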
