NIKI is an AI-powered autonomous photo booth system built with Python, featuring conversational AI guidance, real-time camera capture, photo processing, and text-to-speech interaction. The system uses OpenAI's GPT-4o-mini via Azure OpenAI with function calling to orchestrate photo sessions through a state-driven workflow.
- Conversational AI: Powered by Azure OpenAI GPT-4o-mini for natural user interactions
- Real-time Camera Capture: WebRTC-based camera integration for live photo taking
- Photo Processing: Automatic cropping, resizing, and border addition using PIL
- Text-to-Speech: Google Text-to-Speech integration for voice guidance
- Multi-UI Modes: Kiosk, user, and admin interfaces for different use cases
- State-Driven Workflow: Event-based state management with Server-Sent Events (SSE)
- Real-time Synchronization: Live UI updates across all connected clients
The application follows a strict conversational flow driven by AI tool calls:
1. `detect_presence`
2. `get_info_for_engagement`
3. `text_to_speech_with_emotions`
4. `wait_for_user_engagement`
5. `capture_photos`
6. `wait_for_user_choose_photo`
7. `print_photo`
8. `show_goodbye_screen_and_wait`
9. Loop back to presence detection
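The loop above can be sketched as a simple dispatcher that walks the tool sequence and returns to presence detection. This is an illustrative sketch only; the real orchestration lives in `AIloop()` in `niki_ai.py`, and `run_session`/`handlers` are hypothetical names:

```python
# Tool names taken from the flow above; the dispatcher itself is a sketch.
FLOW = [
    "detect_presence",
    "get_info_for_engagement",
    "text_to_speech_with_emotions",
    "wait_for_user_engagement",
    "capture_photos",
    "wait_for_user_choose_photo",
    "print_photo",
    "show_goodbye_screen_and_wait",
]

def run_session(handlers, max_loops=1):
    """Walk the flow max_loops times, invoking a handler per tool call.

    handlers maps tool name -> zero-argument callable; missing tools
    are no-ops. Returns the ordered log of tool calls for inspection.
    """
    log = []
    for _ in range(max_loops):
        for tool in FLOW:
            handlers.get(tool, lambda: None)()
            log.append(tool)
    return log
```

After `show_goodbye_screen_and_wait`, the outer loop simply starts over at `detect_presence`, which is the "loop back" step in the flow.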
- `/niki`: Kiosk mode (emoji/text displays for public use)
- `/user`: User interface (conversation history and engagement buttons)
- `/admin`: Admin interface (full conversation table, interrupt controls, manual photo capture)
- `main.py`: Main application, UI modes, API endpoints, state management
- `niki_ai.py`: OpenAI integration, tool definitions, conversation flow
- `camera.py` + `camera.js`: WebRTC camera capture component
- `photos.py`: Image processing pipeline
- `tts.py`: Text-to-speech generation and playback
- `shared_state.py`: Reactive state management
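The cropping, resizing, and border steps that `photos.py` performs with PIL could look like the following sketch. The function name, target size, and border width are illustrative assumptions, not values taken from the repository:

```python
from PIL import Image, ImageOps

def process_photo(img, size=(600, 400), border=20, border_color="white"):
    """Center-crop to the target aspect ratio, resize, then add a border.

    A sketch of a PIL crop/resize/border pipeline; the real photos.py
    parameters are not documented in this README.
    """
    target_ratio = size[0] / size[1]
    w, h = img.size
    if w / h > target_ratio:
        # Too wide: trim equal amounts from left and right.
        new_w = int(h * target_ratio)
        left = (w - new_w) // 2
        img = img.crop((left, 0, left + new_w, h))
    else:
        # Too tall: trim equal amounts from top and bottom.
        new_h = int(w / target_ratio)
        top = (h - new_h) // 2
        img = img.crop((0, top, w, top + new_h))
    img = img.resize(size)
    return ImageOps.expand(img, border=border, fill=border_color)
```

`ImageOps.expand` grows the canvas by `border` pixels on every side, so a 600x400 photo with a 20-pixel border comes out 640x440.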
- Python 3.8+
- Azure OpenAI API access
- WebRTC-compatible browser
1. Clone the repository:

   ```bash
   git clone https://github.com/evnchn/NIKI.git
   cd NIKI
   ```

2. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```

3. Set environment variables (create a `.env` file):

   ```
   NIKI_API_KEY=your_azure_openai_api_key
   NIKI_USER_PASSWORD=your_user_password
   STORAGE_SECRET=your_storage_secret
   ```

4. Run the application:

   ```bash
   python main.py
   ```

   Access the application at http://localhost:11011
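The `.env` step above amounts to loading `KEY=value` pairs into the process environment. A minimal stdlib sketch of that (illustrative only; the project may well use a library such as python-dotenv instead, and `load_env_file` is a hypothetical helper name):

```python
import os

def load_env_file(path=".env"):
    """Load KEY=value lines from a .env-style file into os.environ.

    Skips blank lines and comments; existing environment variables win
    (setdefault), so a deployed value is never overwritten.
    """
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())
```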
- Kiosk Mode (`/niki`): Public-facing interface with emoji/text displays
- User Mode (`/user`): Interactive interface with conversation history
- Admin Mode (`/admin`): Full control panel with manual overrides
- `/api/state/sse`: Server-Sent Events for real-time state updates
- `/api/save_photo`: Save captured photos
- `/api/user_input`: Handle user interactions
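On the wire, each update from `/api/state/sse` is a text/event-stream message: an `event:` line, a `data:` line with the JSON payload, and a blank line as terminator. A small formatting helper makes the framing concrete (the function is illustrative; the event name matches the `state_update` event used elsewhere in this README):

```python
import json

def format_sse(state, event="state_update"):
    """Serialize a state dict into the SSE wire format browsers parse.

    EventSource clients dispatch on the event name and receive the
    data line as the message payload.
    """
    return f"event: {event}\ndata: {json.dumps(state)}\n\n"
```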
```bash
# Install dependencies
pip install -r requirements.txt

# Set environment variables (.env file)
NIKI_API_KEY=your_api_key
NIKI_USER_PASSWORD=your_password
STORAGE_SECRET=your_secret

# Run the application
python main.py

# Access at http://localhost:11011
```

- SSE testing: `python test_sse.py`
- AI conversation debugging: check `.debug.json` after each interaction
- Photo processing: use `debug_image.py` for image manipulation testing
- Admin mode provides full conversation inspection and manual controls
- Linting/Formatting: `ruff` with pre-commit hooks
- Configuration: `.ruff.toml` with custom rules (allows long lines, global statements)
- Import sorting: enabled with `isort` integration
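A `.ruff.toml` matching those rules might look like the following. This is a sketch of plausible settings, not the repository's actual file; the specific rule codes are assumptions:

```toml
# Illustrative .ruff.toml; check the repository for the real configuration.
line-length = 200

[lint]
select = ["E", "F", "I", "PL"]  # "I" gives isort-style import sorting
ignore = [
    "E501",    # allow long lines
    "PLW0603", # allow global statements
]
```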
1. Define the tool schema in the `tools` list in `niki_ai.py`
2. Add handling logic in `AIloop()` for blocking vs. non-blocking execution
3. Implement response handling in `handle_user_input()`
4. Update UI mappings in `main.py` if needed
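For step 1, a tool entry follows the OpenAI function-calling schema. The tool below (`set_led_color`) is a hypothetical example for illustration, not a tool from the repository:

```python
# Hypothetical entry for the tools list in niki_ai.py, in the
# OpenAI function-calling format. Name and parameters are invented.
new_tool = {
    "type": "function",
    "function": {
        "name": "set_led_color",
        "description": "Change the booth's LED ring color.",
        "parameters": {
            "type": "object",
            "properties": {
                "color": {
                    "type": "string",
                    "description": "Hex color string, e.g. #ff8800",
                },
            },
            "required": ["color"],
        },
    },
}
```

When the model emits a call to this tool, `AIloop()` would receive the tool name and JSON arguments, and steps 2 and 3 above decide whether execution blocks the conversation.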
```python
# camera.py - Python wrapper
class camera(Element, component="camera.js"):
    def capture(self):
        self.run_method("capture")
```

```javascript
// camera.js - Vue component with WebRTC
export default {
  template: `<video ref="video" autoplay playsinline muted></video>`,
  mounted() {
    navigator.mediaDevices.getUserMedia({ video: true })
      .then(stream => { this.$refs.video.srcObject = stream; });
  }
}
```

```python
# SSE endpoint yields state changes
async def api_state_yielder(request: Request):
    past_state = None
    while True:
        state = get_state()
        if state != past_state:
            yield {"event": "state_update", "data": json.dumps(state)}
            past_state = state  # remember what was sent, so only changes are emitted
        await asyncio.sleep(0.1)
```

- `main.py`: Main application, UI modes, API endpoints, state management
- `niki_ai.py`: OpenAI integration, tool definitions, conversation flow
- `camera.py` + `camera.js`: WebRTC camera capture component
- `photos.py`: Image processing pipeline
- `tts.py`: Text-to-speech generation and playback
- `niki_utils.py`: UI helpers, button mappings, text processing
- `shared_state.py`: Reactive state management
- `FLOW.md`: High-level workflow documentation
- `NIKI_SCREEN_ELEMENTS.md`: UI/UX specifications
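The "reactive state management" in `shared_state.py` can be pictured as an observer pattern: writes notify every subscriber, which is how UI clients stay in sync. The class below is a minimal sketch of that idea; the actual API in `shared_state.py` may differ:

```python
# Minimal observer-style shared state (illustrative; names are assumptions).
class SharedState:
    def __init__(self):
        self._state = {}
        self._subscribers = []

    def subscribe(self, callback):
        """Register callback(key, value) to fire on every state change."""
        self._subscribers.append(callback)

    def set(self, key, value):
        """Update a key and notify subscribers only if the value changed."""
        if self._state.get(key) != value:
            self._state[key] = value
            for cb in self._subscribers:
                cb(key, value)

    def get(self, key, default=None):
        return self._state.get(key, default)
```

The change-detection in `set` mirrors the `state != past_state` check in the SSE yielder above: clients only hear about transitions, not every poll.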
- External APIs: Azure OpenAI (conversation), Google TTS
- Hardware: Camera (WebRTC), Printer (simulated via admin buttons)
- File System: `user_photos/`, `chosen_photos/`, `tts/`, `assets/`
- Web Standards: SSE for real-time updates, WebRTC for camera access
- Camera not working: Ensure WebRTC permissions are granted in the browser
- TTS not playing: Check internet connection for Google TTS API
- AI responses slow: Verify Azure OpenAI API key and network connectivity
- State not syncing: Check SSE connection and browser console for errors
Use admin mode (`/admin`) for full conversation inspection and manual controls. Check `.debug.json` for AI interaction logs.
- Follow the established code quality standards (ruff, isort)
- Test all changes in admin mode first
- Update documentation for any new features
- Ensure backward compatibility with existing workflows
[Add your license information here]