X Space Agent

AI agents that join and talk in X/Twitter Spaces

Quick Start • Features • Structure • Architecture • Examples • Docs • Contributing

Multi-agent AI voice conversations in X/Twitter Spaces — real-time transcription, LLM responses, and voice synthesis

What is this?

X Space Agent is a TypeScript SDK that lets you build AI agents that autonomously join, listen, and speak in X/Twitter Spaces. Connect any LLM, any voice provider, and ship in minutes. No Twitter API approval needed.

import { XSpaceAgent } from 'xspace-agent'

const agent = new XSpaceAgent({
  auth: { token: process.env.X_AUTH_TOKEN!, ct0: process.env.X_CT0! },
  ai: { provider: 'openai', apiKey: process.env.OPENAI_API_KEY! },
})

agent.on('transcription', ({ text }) => console.log('Heard:', text))
agent.on('response', ({ text }) => console.log('Said:', text))

await agent.join('https://x.com/i/spaces/YOUR_SPACE_ID')

Or skip the code entirely with the CLI:

npx xspace-agent join https://x.com/i/spaces/YOUR_SPACE_ID --provider openai

Features

🎤 Multi-Provider LLM: OpenAI, Claude, Groq, or any custom API
👥 Multi-Agent Teams: run multiple personalities with turn management
🔧 Middleware Pipeline: hook into STT → LLM → TTS at any stage
💻 Zero-Code CLI: `npx xspace-agent join <url>`, no SDK needed
📊 Admin Dashboard: web UI to monitor and control live agents
🔷 TypeScript-First: full type safety, autocomplete included

Requirements

Node.js >= 18 (tested on 18, 20, 22)
pnpm >= 9 (for monorepo development) or npm/yarn for consuming the SDK
Chromium — bundled with Puppeteer, or provide your own via BROWSER_MODE=connect
X (Twitter) account — cookie-based auth (X_AUTH_TOKEN + X_CT0) or username/password
At least one AI provider key — OpenAI, Anthropic, or Groq

Quick Start

1. Install

npm install xspace-agent

2. Set environment variables

# .env
X_AUTH_TOKEN=your_x_auth_token
X_CT0=your_x_ct0_cookie
OPENAI_API_KEY=sk-...

Get X_AUTH_TOKEN and X_CT0 from your browser cookies after logging into X. Guide →
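A missing cookie or key tends to surface only when the agent tries to join, so it can help to fail fast at startup. The sketch below is not part of the SDK: `requireEnv` is a hypothetical helper that checks the three variables from the `.env` file above and reports every missing one at once.

```typescript
// Hypothetical helper, not part of xspace-agent: fail fast when a variable is missing.
type Env = Record<string, string | undefined>

const REQUIRED = ['X_AUTH_TOKEN', 'X_CT0', 'OPENAI_API_KEY'] as const
type RequiredKey = (typeof REQUIRED)[number]

function requireEnv(env: Env): Record<RequiredKey, string> {
  // Collect every variable that is unset or blank so the error lists them all.
  const missing = REQUIRED.filter((key) => {
    const value = env[key]
    return value === undefined || value.trim() === ''
  })
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(', ')}`)
  }
  const result = {} as Record<RequiredKey, string>
  for (const key of REQUIRED) result[key] = env[key] as string
  return result
}

// Usage in a Node.js script:
//   const { X_AUTH_TOKEN, X_CT0, OPENAI_API_KEY } = requireEnv(process.env)
```

Passing the environment in as an argument, rather than reading `process.env` inside the helper, keeps the check easy to test.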

3. Run

import { XSpaceAgent } from 'xspace-agent'

const agent = new XSpaceAgent({
  auth: { token: process.env.X_AUTH_TOKEN!, ct0: process.env.X_CT0! },
  ai: {
    provider: 'openai',
    apiKey: process.env.OPENAI_API_KEY!,
    model: 'gpt-4o',
    systemPrompt: 'You are a helpful AI analyst. Be concise and data-driven.',
  },
  voice: {
    sttProvider: 'deepgram',
    ttsProvider: 'elevenlabs',
    voiceId: 'rachel',
  },
})

agent.on('transcription', ({ text, speaker }) => console.log(`${speaker}: ${text}`))
agent.on('response', ({ text }) => console.log(`Agent: ${text}`))

await agent.join('https://x.com/i/spaces/YOUR_SPACE_ID')

Or skip the code entirely with the CLI:

npx xspace-agent join https://x.com/i/spaces/YOUR_SPACE_ID --provider openai

Deploy

Run with Docker:

docker run -e OPENAI_API_KEY=sk-... ghcr.io/nirholas/xspace-agent

Documentation

Full docs live in docs/. Key guides:

Guide	Description
Architecture Overview	How the system fits together
Providers	LLM, STT, and TTS provider setup
Admin Panel	Web dashboard guide
Environment Variables	All config options
Multi-Space Support	Run agents across multiple Spaces
Agent Memory & RAG	Persistent memory and retrieval
TypeScript Migration	TypeScript usage guide

Project Structure

This is a pnpm monorepo with five publishable packages, a standalone voice agent, and supporting infrastructure.

Packages (npm-published)

packages/
  core/                → xspace-agent         The main SDK. Everything needed to build an AI agent
                         ├── agent.ts            Entry point — orchestrates browser, audio, LLM, turns
                         ├── team.ts             Multi-agent coordination (multiple AIs, one Space)
                         ├── audio/              PCM capture, VAD, silence detection, WAV encoding, TTS injection
                         ├── browser/            Puppeteer lifecycle, self-healing selector engine, DOM interaction
                         ├── fsm/                Finite state machine for agent & team lifecycles
                         ├── intelligence/       Speaker ID, topic tracking, sentiment, context management
                         ├── pipeline/           Provider factories — createLLM(), createSTT(), createTTS()
                         ├── turns/              Turn coordination, decision engine, interruption handling
                         ├── plugins/            Plugin system with 6 middleware hooks (before/after stt/llm/tts)
                         ├── providers/          Multi-provider router and cost tracking
                         ├── db/                 Drizzle ORM, migrations, repositories
                         ├── auth/               X/Twitter login, token validation, OAuth, SAML
                         ├── memory/             Conversation persistence, RAG, archiving
                         ├── observability/      Structured logging (Pino), tracing, metrics
                         └── __tests__/          Unit & E2E test suites with fixtures

  server/              → @xspace/server        Express + Socket.IO admin panel
                         ├── routes/             REST API endpoints
                         ├── events/             Socket.IO real-time event handlers
                         ├── middleware/          Auth, validation, CORS, rate limiting
                         ├── schemas/            Zod request/response validation
                         ├── personalities/      Preset agent configurations
                         └── public/             Admin dashboard HTML/CSS/JS

  cli/                 → @xspace/cli           Command-line tool
                         └── commands/           init, auth, join, start, dashboard

  widget/              → @xspace/widget        Embeddable voice chat widget (UMD + ESM builds)
                         ├── connection.ts       WebSocket connection handler
                         ├── theme.ts            Theme customization
                         └── ui/                 UI components

  create-xspace-agent/ → create-xspace-agent   Project scaffolding (like create-react-app)
                         └── templates/base/     Starter project template

Application Code

agent-voice-chat/      Standalone voice chat agent — separate from the monorepo
                       ├── server.js             Express + Socket.IO server (38KB)
                       ├── openapi.json           Full REST API spec
                       ├── agents.config.json     Agent configurations
                       ├── room-manager.js        Multi-room coordination
                       ├── knowledge/             Vector embeddings & RAG data
                       ├── memory/                Persistent conversation storage
                       ├── providers/             LLM, STT, TTS implementations
                       └── tests/                 Own test suite (vitest)

src/                   Legacy monolithic server — functional via `npm run dev`, being migrated
                       ├── server/                Express server, socket handlers, routes, metrics
                       ├── browser/               Puppeteer auth, launcher, orchestrator, selectors
                       ├── audio/                 Audio stream bridge
                       └── client/                Frontend initialization & provider configs

x-spaces/             Low-level Puppeteer automation scripts (JavaScript)
                       ├── index.js               Orchestration entry point
                       ├── audio-bridge.js         Audio capture & injection via CDP
                       ├── auth.js                 Browser cookie authentication
                       └── space-ui.js             X Spaces DOM interaction & selectors

Supporting Directories

examples/              12 runnable projects — basic-join, multi-agent-debate, discord-bridge,
                       custom-provider, middleware-pipeline, express-integration, scheduled-spaces,
                       chrome-connect, with-plugins, and more. Each has its own package.json.

docs/                  43 markdown files — architecture overview, API reference (REST + WebSocket),
                       provider guides, deployment (Docker, Railway, Render, VPS), troubleshooting,
                       plugin system, configuration, and internal design specs.

personalities/         Pre-built agent personalities with system prompts & voice preferences
                       └── presets/               agent-zero, comedian, crypto-degen, educator,
                                                  interviewer, tech-analyst, and more

providers/             AI provider wrappers (JS) — Claude, Groq, OpenAI Chat, OpenAI Realtime, STT, TTS

public/                Frontend assets — admin dashboard, agent builder, voice chat UI,
                       widget demos (React, Vue), landing pages

docker/                Monitoring stack — Prometheus scrape configs + Grafana dashboards

tasks/                 14 implementation specs & roadmap items (landing page, design system,
                       docs site, onboarding flow, admin dashboard v2, auth, rate limiting, etc.)

tests/                 Top-level integration & load tests

Examples

Example	Description
basic-join	Join a Space with an AI agent in ~15 lines
transcription-logger	Listen-only — save timestamped transcripts to file
multi-agent-debate	Two AIs (Bull vs Bear) debate live with round-robin turns
multi-agent	Multiple AI agents sharing a single Space
custom-provider	Use a local LLM (Ollama) or any custom API backend
middleware-pipeline	Content filtering, language detection, safety redaction, analytics hooks
express-integration	Embed the agent in an existing Express app with admin panel
scheduled-spaces	Join Spaces on a cron schedule with auto-leave timers
discord-bridge	Control the agent from Discord — join, leave, speak, stream transcriptions
chrome-connect	Connect to an existing Chrome instance instead of launching one
with-plugins	Extend agent behavior with custom plugins
plugins	Reusable plugin modules — analytics, moderation, webhooks

cd examples/basic-join
npm install
cp .env.example .env   # fill in your API keys
npm start
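The multi-agent-debate example is described as using round-robin turns. As a rough illustration of that idea only (the SDK's turn coordinator also handles interruptions and pacing, and its API is not shown here), a round-robin scheduler can be this small. `RoundRobinTurns` is a hypothetical name.

```typescript
// Hypothetical sketch of round-robin turn-taking; not the SDK's turn coordinator.
class RoundRobinTurns {
  private index = 0
  private readonly agents: readonly string[]

  constructor(agents: readonly string[]) {
    if (agents.length === 0) throw new Error('At least one agent is required')
    this.agents = agents
  }

  // Returns the agent whose turn it is, then advances to the next one.
  next(): string {
    const agent = this.agents[this.index] as string
    this.index = (this.index + 1) % this.agents.length
    return agent
  }
}

const turns = new RoundRobinTurns(['Bull', 'Bear'])
const order = [turns.next(), turns.next(), turns.next()]
console.log(order.join(' -> ')) // Bull -> Bear -> Bull
```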

Architecture

                         X Space (live audio)
                                │
                    Puppeteer + Chrome DevTools Protocol
                                │
                    ┌───────────▼────────────┐
                    │   BrowserLifecycle      │  Auth → Join → Request Speaker → Speak
                    │   Self-healing CSS/     │  Retries selectors via CSS → text → aria
                    │   text/aria selectors   │
                    └───────────┬────────────┘
                                │  RTCPeerConnection audio hooks
                    ┌───────────▼────────────┐
                    │   AudioPipeline         │  PCM capture → VAD → silence detection
                    │                         │  → WAV encoding → TTS injection
                    └───────────┬────────────┘
                                │
          ┌─────────────────────┼─────────────────────┐
          │                     │                     │
   ┌──────▼──────┐      ┌──────▼──────┐      ┌──────▼──────┐
   │  STT        │      │  LLM        │      │  TTS        │
   │  Deepgram   │      │  OpenAI     │      │  ElevenLabs │
   │  Whisper    │      │  Claude     │      │  OpenAI TTS │
   │  (Groq/OAI) │      │  Groq       │      │  Browser    │
   └──────┬──────┘      │  Custom     │      └──────┬──────┘
          │              └──────┬──────┘             │
          │    before:stt       │    before:llm      │    before:tts
          │    after:stt        │    after:llm       │    after:tts
          │  ← middleware →     │  ← middleware →     │  ← middleware →
          │                     │                     │
   ┌──────▼─────────────────────▼─────────────────────▼──────┐
   │  Intelligence Layer                                      │
   │  Speaker ID · Topic tracking · Sentiment · Context mgmt  │
   └─────────────────────────┬───────────────────────────────┘
                             │
   ┌─────────────────────────▼───────────────────────────────┐
   │  Turn Management + FSM                                   │
   │  Decision engine · Interruption handling · Response pace  │
   │                                                          │
   │  idle → launching → authenticating → joining → listening │
   │                                          ↕               │
   │                                       speaking → leaving │
   └──────────────────────────────────────────────────────────┘
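The lifecycle at the bottom of the diagram can be read as a transition table. The sketch below encodes only the arrows that are drawn; the state names come from the diagram, while `transitions`, `canTransition`, and `transition` are hypothetical names, and the real FSM in `fsm/` may allow edges that are not shown here.

```typescript
// Sketch of the lifecycle drawn in the diagram. The transition table is an
// assumption based only on the drawn arrows and may omit real edges.
type AgentState =
  | 'idle'
  | 'launching'
  | 'authenticating'
  | 'joining'
  | 'listening'
  | 'speaking'
  | 'leaving'

const transitions: Record<AgentState, readonly AgentState[]> = {
  idle: ['launching'],
  launching: ['authenticating'],
  authenticating: ['joining'],
  joining: ['listening'],
  listening: ['speaking'],
  speaking: ['listening', 'leaving'],
  leaving: [],
}

function canTransition(from: AgentState, to: AgentState): boolean {
  return transitions[from].indexOf(to) !== -1
}

// Returns the new state, or throws if the diagram has no such arrow.
function transition(from: AgentState, to: AgentState): AgentState {
  if (!canTransition(from, to)) {
    throw new Error(`Illegal transition: ${from} -> ${to}`)
  }
  return to
}
```

Keeping the table as plain data makes illegal jumps, such as idle straight to speaking, a checked error rather than a silent bug.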

The agent connects to X Spaces via a headless Chromium browser, hooks into the WebRTC audio stream, and routes it through a fully configurable STT → LLM → TTS pipeline. Every stage supports middleware for logging, filtering, translation, content moderation, and more. The intelligence layer attributes speech to speakers, tracks topics, and manages conversation context. A finite state machine governs the full agent lifecycle.
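To make the hook idea concrete, here is a self-contained sketch of a before/after hook runner wrapped around one stage. The hook names match the diagram (`before:llm`, `after:llm`, and so on), but `HookRunner`, `use`, `run`, and `runLlmStage` are hypothetical and do not mirror the SDK's plugin API.

```typescript
// Hypothetical hook runner. Hook names follow the diagram; the API is an
// illustration and not xspace-agent's real plugin system.
type HookName =
  | 'before:stt'
  | 'after:stt'
  | 'before:llm'
  | 'after:llm'
  | 'before:tts'
  | 'after:tts'

// A middleware receives the current text and returns the (possibly modified) text.
type Middleware = (text: string) => string | Promise<string>

class HookRunner {
  private hooks = new Map<HookName, Middleware[]>()

  use(name: HookName, fn: Middleware): this {
    const list = this.hooks.get(name) ?? []
    list.push(fn)
    this.hooks.set(name, list)
    return this
  }

  // Runs every middleware registered for `name`, in registration order.
  async run(name: HookName, text: string): Promise<string> {
    let current = text
    for (const fn of this.hooks.get(name) ?? []) {
      current = await fn(current)
    }
    return current
  }
}

// Wraps a stage (here a stand-in LLM) with its before/after hooks.
async function runLlmStage(
  runner: HookRunner,
  llm: (prompt: string) => Promise<string>,
  transcript: string,
): Promise<string> {
  const prompt = await runner.run('before:llm', transcript)
  const reply = await llm(prompt)
  return runner.run('after:llm', reply)
}

const runner = new HookRunner()
  .use('before:llm', (t) => t.trim()) // normalize the transcript
  .use('after:llm', (t) => t.replace(/damn/gi, '***')) // toy content filter

// Stand-in for a real provider call.
const fakeLlm = async (prompt: string) => `You said: ${prompt}. Damn right.`
```

Calling `runLlmStage(runner, fakeLlm, '  hello  ')` trims the input before the stand-in LLM sees it and filters the reply afterwards. The same wrapping would apply to the STT and TTS stages.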

Providers

Category	Providers
LLM	OpenAI (GPT-4o), Anthropic (Claude), Groq (Llama/Mixtral), any OpenAI-compatible API
Speech-to-Text	Deepgram (streaming), OpenAI Whisper, custom
Text-to-Speech	ElevenLabs, OpenAI TTS, custom
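"Any OpenAI-compatible API" usually means a server that accepts the Chat Completions request shape at `<base URL>/chat/completions`. As an illustration of what such a backend has to accept (not the SDK's provider interface), the sketch below builds that request without sending it. `buildChatRequest`, the base URL, and the model name are assumptions.

```typescript
// Hypothetical helper showing the request an OpenAI-compatible server expects.
interface ChatMessage {
  role: 'system' | 'user' | 'assistant'
  content: string
}

interface ChatRequest {
  url: string
  method: 'POST'
  headers: Record<string, string>
  body: string
}

function buildChatRequest(
  baseUrl: string,
  apiKey: string,
  model: string,
  messages: ChatMessage[],
): ChatRequest {
  return {
    // Strip trailing slashes so the path is joined exactly once.
    url: `${baseUrl.replace(/\/+$/, '')}/chat/completions`,
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify({ model, messages }),
  }
}

// Example: a local Ollama server (base URL and model name are assumptions).
const request = buildChatRequest('http://localhost:11434/v1/', 'unused', 'llama3', [
  { role: 'system', content: 'Be concise.' },
  { role: 'user', content: 'Summarize the last speaker.' },
])
// Send with: fetch(request.url, request)
```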

CLI Reference

xspace-agent init                  # Interactive setup wizard
xspace-agent auth                  # Authenticate with X
xspace-agent join <url>            # Join a Space
xspace-agent start                 # Start agent with admin panel
xspace-agent dashboard             # Launch web dashboard only

Used By

Be the first! Open a PR to add your project.

Community

🐛 GitHub Issues — bug reports and feature requests
🗣️ GitHub Discussions — ideas and broader conversations

Contributing

We welcome contributions! See CONTRIBUTING.md for setup instructions and guidelines.

Good first contributions:

Add a new AI provider (Mistral, Cohere, Together)
Add a new TTS provider (Cartesia, PlayHT)
Build an example project
Improve documentation
