TTS hooks for agentic workflows on macOS. The goal is simple: when your agent yields or asks for a decision, you hear a short, clear audio cue (TTS or earcon) while still seeing the full text output.
This project is designed for local-first TTS on Apple Silicon and integrates with these CLI tools:
- claude
- codex
- opencode
It provides a small event-driven layer that converts agent lifecycle events into audio prompts.
- Avoid missing a prompt while your focus is elsewhere.
- Keep the interaction lightweight with short speech (or earcons) instead of long narration.
- Remain local-first for privacy and low latency.
- Event hooks: detect yield and decision moments from agent tools.
- CLI adapters: pluggable adapters per CLI tool, easy to extend later.
- TTS broker: normalize messages into a compact, spoken prompt.
- TTS provider: unified mlx-audio interface for local synthesis.
- Audio renderer: plays back audio immediately with minimal delay.
We use mlx-audio as a unified TTS interface. It runs locally on Apple Silicon and supports multiple models:
| Model | Speed | Use Case |
|---|---|---|
| mlx-community/Spark-TTS-0.5B-bf16 | ~0.3x RT | Default: fast, good quality |
| mlx-community/Spark-TTS-0.5B-8bit | ~0.3x RT | Quantized, lower memory |
| mlx-community/pocket-tts | ~1.3x RT | Fastest, smallest (~1GB memory) |
All models share the same API through mlx-audio's generate_audio() function.
from mlx_audio.tts.generate import generate_audio

# Synthesize a short prompt with the default model and play it immediately.
generate_audio(
    text="Ready.",
    model="mlx-community/Spark-TTS-0.5B-bf16",
    file_prefix="output",
    play=True,
)

Install mlx-audio with uv:

uv add mlx-audio

Configure your CLI tool to invoke agent-chime notify on relevant events.
Add to ~/.claude/settings.json:
{
  "hooks": {
    "Stop": [
      {
        "hooks": [
          { "type": "command", "command": "agent-chime notify --source claude" }
        ]
      }
    ],
    "Notification": [
      {
        "hooks": [
          { "type": "command", "command": "agent-chime notify --source claude" }
        ]
      }
    ]
  }
}

See Claude hooks docs.
Add to ~/.codex/config.toml:
notify = ["agent-chime", "notify", "--source", "codex"]

See Codex config docs.
Create .opencode/plugin/agent-chime.js:
export const AgentChimePlugin = async ({ $ }) => ({
  event: async ({ event }) => {
    if (event.type === "session.idle")
      await $`agent-chime notify --source opencode --event AGENT_YIELD`;
    if (event.type === "session.error")
      await $`agent-chime notify --source opencode --event ERROR_RETRY`;
    if (event.type === "permission.asked")
      await $`agent-chime notify --source opencode --event DECISION_REQUIRED`;
  },
});

See OpenCode plugin events for the event list. ERROR_RETRY is only available for OpenCode via session.error.
- requirements.md: functional and non-functional requirements
- design.md: architecture, event model, and data flow
In scope:
- macOS only
- English-only prompts
- Short spoken messages (1-2 sentences max)
- Minimal setup and config
- Adapter system for claude, codex, and opencode, with a clear path to add more
Out of scope:
- Cross-platform support
- Long-form narration
- UI beyond simple CLI config
# Install dependencies and create virtual environment
uv sync
# Or with dev dependencies
uv sync --all-extras
# Run the CLI
uv run agent-chime --help

# Show system info and recommended model
agent-chime system-info
# Show system info as JSON
agent-chime system-info --json
# Test TTS synthesis
agent-chime test-tts
agent-chime test-tts --text "Hello world"
agent-chime test-tts --model "mlx-community/pocket-tts"
# Manage configuration
agent-chime config # Show config path
agent-chime config --show # Show current config
agent-chime config --init # Create default config file
agent-chime config --validate # Validate configuration
# Process notifications (usually called by hooks)
agent-chime notify --source claude # Reads JSON from stdin
agent-chime notify --source codex # Reads JSON from argv
agent-chime notify --source opencode --event AGENT_YIELD

Create ~/.config/agent-chime/config.json:
{
  "tts": {
    "model": null,
    "selection_mode": "auto",
    "voice": null
  },
  "volume": 0.8,
  "events": {
    "AGENT_YIELD": {
      "enabled": true,
      "mode": "tts",
      "read_summary": true,
      "template": "Ready."
    },
    "DECISION_REQUIRED": {
      "enabled": true,
      "mode": "tts",
      "read_summary": false,
      "template": "I need your input."
    },
    "ERROR_RETRY": {
      "enabled": true,
      "mode": "earcon"
    }
  }
}

In auto mode, agent-chime selects a TTS model based on the memory available on your system:
| RAM | Recommended Model | Notes |
|---|---|---|
| ≥4GB available | Spark-TTS-0.5B-bf16 | Best quality |
| ≥3GB available | Spark-TTS-0.5B-8bit | Quantized, smaller |
| <3GB available | pocket-tts | Fastest (~1GB memory) |
Override with --model or set in config:
{
  "tts": {
    "model": "mlx-community/pocket-tts",
    "selection_mode": "manual"
  }
}

The project is implemented and ready for testing.
- Language: Python 3.11+
- TTS: mlx-audio (requires Apple Silicon)
- Audio: afplay (macOS built-in)
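Playback through afplay can be sketched as a thin subprocess wrapper. The helper names `afplay_command` and `play` are hypothetical, and mapping the config's `volume` value directly onto afplay's `-v` flag is an assumption about how the setting is applied.

```python
import shutil
import subprocess


def afplay_command(path: str, volume: float = 0.8) -> list[str]:
    """Build the afplay argument list, clamping volume to the 0.0-1.0 range."""
    volume = max(0.0, min(1.0, volume))
    return ["afplay", "-v", f"{volume:g}", path]


def play(path: str, volume: float = 0.8) -> bool:
    """Play an audio file; return False when afplay is unavailable."""
    if shutil.which("afplay") is None:
        return False  # not on macOS, or afplay is not on PATH
    subprocess.run(afplay_command(path, volume), check=True)
    return True
```

Separating command construction from execution makes the volume handling testable on machines without afplay.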