hironow/adk-stream-protocol

ADK Stream Protocol

An integration of AI SDK v6 and Google ADK that demonstrates streaming over both SSE and WebSocket.


⚠️ Development Status

This project is under active development and contains experimental features with known issues.

Current Status

✅ Stable Features

  • Gemini Direct mode (AI SDK v6 only)
  • ADK SSE streaming with tool calling
  • ADK BIDI Blocking Mode for tool approval (ADR-0009, ADR-0011)
  • ADK Mode History Sharing (chat history preserved across ADK SSE ↔ BIDI transitions)
  • Complete E2E test infrastructure (Frontend, Backend, Playwright)

🚧 Experimental Features

  • ADK BIDI (WebSocket) native tool confirmation - See known issues below

Known Issues

Critical: ADK BIDI Mode Limitations

BIDI mode (run_live()) has two significant issues:

  1. Tool Confirmation Not Working (Native ADK) 🟡

    • ADK's native require_confirmation=True does not trigger approval UI in live mode
    • Root cause: ADK FunctionTool._call_live() TODO - "tool confirmation not yet supported for live mode"
    • Status: Known ADK limitation, awaiting upstream fix
    • Workaround: Use BIDI Blocking Mode (ADR-0009, ADR-0011) or SSE mode for tools requiring confirmation
  2. Missing Text Responses After Tool Execution 🟡

    • Tools execute successfully but AI generates no explanatory text
    • Only raw JSON output shown to user
    • Status: Under investigation
    • Workaround: Use SSE mode for full tool support
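The idea behind a blocking approval workaround can be illustrated with a minimal sketch: the tool itself waits until the user decides, instead of relying on ADK's native confirmation. This is not the project's implementation (see ADR-0009 and ADR-0011 for that); `ApprovalGate`, `guarded_tool`, and the 30-second timeout are hypothetical names and values chosen for illustration.

```python
import asyncio


class ApprovalGate:
    """Holds one pending future per tool call until the user decides."""

    def __init__(self) -> None:
        self._pending: dict[str, asyncio.Future] = {}

    async def wait_for_decision(self, tool_call_id: str, timeout: float = 30.0) -> bool:
        # Register a future, then block this tool call until it is resolved.
        future = asyncio.get_running_loop().create_future()
        self._pending[tool_call_id] = future
        try:
            return await asyncio.wait_for(future, timeout)
        finally:
            self._pending.pop(tool_call_id, None)

    def resolve(self, tool_call_id: str, approved: bool) -> None:
        # Called when the frontend reports the user's approve/deny choice.
        future = self._pending.get(tool_call_id)
        if future is not None and not future.done():
            future.set_result(approved)


async def guarded_tool(gate: ApprovalGate, tool_call_id: str, amount: int) -> dict:
    """Hypothetical tool that runs only after an explicit approval."""
    if not await gate.wait_for_decision(tool_call_id):
        return {"status": "denied"}
    return {"status": "ok", "amount": amount}
```

In this sketch the WebSocket handler would call `gate.resolve(...)` when an approval message arrives; if nothing arrives within the timeout, `asyncio.wait_for` raises `TimeoutError` and the tool call fails rather than hanging.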

Recent Fixes

  • ✅ ADK Mode History Sharing - chat history preserved across ADK SSE ↔ BIDI transitions (2026-01-18)
  • ✅ BIDI Confirmation ID routing bug fixed (2025-12-19)
  • ✅ Fixed infinite loop in tool confirmation auto-send logic (2025-12-17)

🎯 Project Overview

This project demonstrates the integration between:

  • Frontend: Next.js 16 with AI SDK v6 beta
  • Backend: Google ADK with FastAPI

Three Streaming Modes

  1. Gemini Direct - Direct Gemini API via AI SDK (stable)
  2. ADK SSE - ADK backend with Server-Sent Events (stable)
  3. ADK BIDI ⚡ - ADK backend with WebSocket bidirectional streaming (stable for Blocking Mode, experimental for native confirmation)

Key Insight: All three modes use the same AI SDK v6 Data Stream Protocol format, ensuring consistent frontend behavior regardless of backend implementation.
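To make the shared wire format concrete, here is a minimal sketch of SSE framing for JSON chunks, with a `[DONE]` terminator at the end of the stream. The helper names (`to_sse`, `frame_stream`, `parse_sse`) are hypothetical, and the `text-delta` chunk shape in the usage note reflects our reading of the AI SDK stream protocol rather than anything taken from this repository, so treat it as an assumption.

```python
import json
from typing import Any, Iterable, Iterator, Optional

DONE_FRAME = "data: [DONE]\n\n"


def to_sse(chunk: dict[str, Any]) -> str:
    """Frame one JSON chunk as a Server-Sent Events message."""
    return f"data: {json.dumps(chunk, separators=(',', ':'))}\n\n"


def frame_stream(chunks: Iterable[dict[str, Any]]) -> Iterator[str]:
    """Yield one SSE frame per chunk, then the stream terminator."""
    for chunk in chunks:
        yield to_sse(chunk)
    yield DONE_FRAME


def parse_sse(frame: str) -> Optional[dict[str, Any]]:
    """Decode one frame; returns None for the [DONE] terminator."""
    payload = frame.strip().removeprefix("data: ")
    if payload == "[DONE]":
        return None
    return json.loads(payload)
```

For example, `list(frame_stream([{"type": "text-delta", "id": "t1", "delta": "Hi"}]))` yields one `data: {...}` frame followed by `data: [DONE]`. Because the frames are plain strings, the same output can be written to an HTTP response body or sent as WebSocket text messages.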


✨ Key Features

Streaming Modes

  • Gemini Direct: Built-in AI SDK v6 streaming support
  • ADK SSE: Token-by-token streaming via Server-Sent Events
  • ADK BIDI: Bidirectional WebSocket streaming for voice agents

Multimodal Capabilities

  • Text I/O: Token-by-token streaming with AI SDK v6
  • Image Input/Output: PNG, JPEG, WebP via data-image custom events
  • Audio Input: Microphone recording (16kHz PCM) with CMD key push-to-talk
  • Audio Output: PCM streaming (24kHz) with WAV playback
  • Audio Transcription: Input and output speech-to-text with native-audio models
  • Tool Calling: ADK integration with user approval flow (SSE mode)
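As an illustration of the audio output path, raw PCM can be wrapped in a WAV container with Python's standard `wave` module. This is a sketch, not the project's code: the README states only the 24 kHz sample rate, so 16-bit mono samples are an assumption here, and `pcm_to_wav` is a hypothetical helper name.

```python
import io
import wave


def pcm_to_wav(
    pcm: bytes,
    sample_rate: int = 24_000,  # output rate stated in the README
    channels: int = 1,          # assumption: mono
    sample_width: int = 2,      # assumption: 16-bit samples
) -> bytes:
    """Wrap raw PCM bytes in a WAV header so standard players can decode them."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    return buffer.getvalue()
```

Passing `sample_rate=16_000` gives the equivalent wrapper for recorded microphone input.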

Architecture Highlights

  • StreamProtocolConverter: Converts ADK events to AI SDK v6 Data Stream Protocol
  • SSE format over WebSocket: Backend sends SSE format via WebSocket for BIDI mode
  • Frontend Transparency: Same useChat hook works across all three modes
  • Custom Transport: WebSocketChatTransport for AI SDK v6 WebSocket support
  • Tool Approval Flow: Frontend-delegated execution with AI SDK v6 approval APIs
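The converter's role can be sketched as a small state machine that turns backend events into stream chunks. The code below is illustrative only: `FakeAdkEvent` is a hypothetical stand-in for a real ADK event (which carries many more fields), `SketchConverter` is not the project's `StreamProtocolConverter`, and the chunk type names are assumptions based on our reading of the AI SDK stream protocol.

```python
from dataclasses import dataclass
from typing import Any, Iterator, Optional


@dataclass
class FakeAdkEvent:
    """Hypothetical, simplified stand-in for an ADK event."""
    text: Optional[str] = None
    turn_complete: bool = False


class SketchConverter:
    """Opens a text part lazily and closes it when the turn completes."""

    def __init__(self, message_id: str) -> None:
        self.message_id = message_id
        self._started = False
        self._text_open = False

    def convert(self, event: FakeAdkEvent) -> Iterator[dict[str, Any]]:
        if not self._started:
            # Emit the message start exactly once per stream.
            self._started = True
            yield {"type": "start", "messageId": self.message_id}
        if event.text:
            if not self._text_open:
                self._text_open = True
                yield {"type": "text-start", "id": "text-1"}
            yield {"type": "text-delta", "id": "text-1", "delta": event.text}
        if event.turn_complete:
            if self._text_open:
                self._text_open = False
                yield {"type": "text-end", "id": "text-1"}
            yield {"type": "finish"}
```

Keeping the converter independent of the transport is what would let one code path serve both modes: the resulting chunks can be serialized identically whether they travel in an SSE response or over a WebSocket.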

🛠️ Tech Stack

Frontend:

  • Next.js 16 (App Router)
  • React 19
  • AI SDK v6 beta (ai, @ai-sdk/react, @ai-sdk/google)
  • TypeScript 5.7

Backend:

  • Python 3.13
  • Google ADK >=1.20.0
  • FastAPI >=0.115.0
  • Pydantic v2

Development Tools:

  • bun (Node.js packages)
  • uv (Python packages)
  • just (task automation)

🚀 Quick Start

Prerequisites

  • Python 3.13+
  • Node.js 18+
  • bun, uv, just

Installation

# Install all dependencies
just install

# Or manually:
uv sync
bun install

Environment Setup

Copy the example file:

cp .env.example .env.local

Edit .env.local:

For Gemini Direct:

GOOGLE_GENERATIVE_AI_API_KEY=your_api_key_here
BACKEND_MODE=gemini
NEXT_PUBLIC_BACKEND_MODE=gemini

For ADK SSE/BIDI:

GOOGLE_API_KEY=your_api_key_here
BACKEND_MODE=adk-sse
NEXT_PUBLIC_BACKEND_MODE=adk-sse
ADK_BACKEND_URL=http://localhost:8000
NEXT_PUBLIC_ADK_BACKEND_URL=http://localhost:8000
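A fail-fast check on these variables can catch a missing backend URL before the app starts. The sketch below is hypothetical and not part of the project: `BackendConfig` and `load_backend_config` are illustrative names, the `gemini` default is an assumption, and the `adk-` prefix test is inferred from the `adk-sse` value shown above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class BackendConfig:
    mode: str
    adk_backend_url: Optional[str]


def load_backend_config(env: Mapping[str, str] = os.environ) -> BackendConfig:
    """Read the backend mode and require a backend URL for ADK modes."""
    mode = env.get("BACKEND_MODE", "gemini")  # assumption: default to Gemini Direct
    url = env.get("ADK_BACKEND_URL")
    if mode.startswith("adk-") and not url:
        raise ValueError("ADK_BACKEND_URL must be set when BACKEND_MODE is an ADK mode")
    return BackendConfig(mode=mode, adk_backend_url=url)
```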

Running

Gemini Direct (frontend only):

bun dev

ADK SSE/BIDI (backend + frontend):

# Run both concurrently:
just dev

# Or separately:
just server  # Backend on :8000
bun dev     # Frontend on :3000

For all available commands:

just --list

🧪 Testing

Python Backend Tests:

just test-python
# Expected: ~200 passed (unit + integration + e2e)

TypeScript Frontend Tests:

bun test:lib
# Expected: ~565 passed (unit + integration + e2e)

Playwright E2E Tests:

just test-e2e-clean  # Recommended: clean server restart
just test-e2e-ui     # Interactive UI mode

Code Quality:

just format  # Format code
just lint    # Run linters
just check   # Run type checks

📚 Documentation

Complete documentation is available in the docs/ directory:

Quick Start

Architecture & Specs

  • Architecture Overview - Complete system architecture

    • AudioWorklet PCM Streaming
    • Tool Approval Flow (Frontend Delegation Pattern)
    • Per-Connection State Management
    • Multimodal Support Architecture
  • Protocol Implementation - ADK ↔ AI SDK v6 protocol

    • Event/Part field mapping
    • Implementation status
    • Custom extensions (data-pcm, data-image, etc.)

Backend (Python)

Frontend (TypeScript)

Testing

  • Testing Strategy - Overall test architecture (pytest, Vitest, Playwright)
  • E2E Testing Guide - Complete E2E testing documentation
    • Backend E2E (pytest golden files)
    • Frontend E2E (Vitest browser tests)
    • Fixtures management
    • Chunk Logger debugging
  • Coverage Audit - Test coverage verification

Architecture Decision Records

  • ADR-0001 - Per-Connection State Management
  • ADR-0002 - Tool Approval Architecture
  • ADR-0003 - SSE vs BIDI Confirmation Protocol
  • ADR-0004 - Multi-Tool Response Timing
  • ADR-0005 - Frontend Execute Pattern and [DONE] Timing
  • ADR-0006 - sendAutomaticallyWhen Decision Logic Order
  • ADR-0007 - Approval Value Independence
  • ADR-0008 - SSE Mode Pattern A Only
  • ADR-0009 - BIDI Blocking Mode
  • ADR-0010 - BIDI Confirmation Chunk Generation
  • ADR-0011 - BIDI Approval Deadlock Fix (finish-step Injection)

Additional Resources

  • Experiments - Research notes, protocol investigations, multimodal experiments

🔬 Experiments & Research

All experiment notes and architectural investigations are documented in experiments/:

  • Bidirectional protocol investigations
  • Multimodal support (images, audio, video)
  • Tool approval flow implementations
  • Test coverage investigations
  • ADK field mapping completeness

See experiments/README.md for the complete experiment index and results.


📄 License

MIT License. See LICENSE file for details.



Last Updated: 2026-01-18

About

ADK with Vercel AI SDK UI stream protocol - Data Stream Protocol [uses Server-Sent Events (SSE) format]
