nipil/openai-compatible-chat

CLI Chatbot (OpenAI-compatible)

Description

Simple command-line and web chatbot written in Rust.

It is compatible with

  • any OpenAI-compatible provider (public or private)
  • any model supporting the /v1/chat/completions endpoint

About the code

  • the majority of the code was first written by Claude.ai, and I used ChatGPT for various things

    • I spent about 10% of the time beforehand, drafting a prompt with the design of the product
    • about 75% of the "generate working code" job was done in 10% of the time by Claude
    • I spent 80% of the time ... doing 25% of the "quality work" I enjoy (learning by reworking)
  • current state

    • reworked the whole thing to learn about each component and technology
    • cleaned, refactored and improved the parts until I was satisfied with their shape
    • added error management, which was entirely missing (because I did not request it at first)

Features

  • interactive model selection
  • streaming responses
  • disposable history (no storage at all)
  • token display (exact or estimated, cache efficiency on info log level)
  • model filtering (regex) via configuration
  • error handling (forbidden model, context overflow, plus all other unhappy paths)

Sample rendering

Web interface

sample

CLI in dark mode

sample

Installation

Binaries are automatically generated at each release.

📦 Download a binary

  1. Go to the releases page

  2. Download the archive for your system:

    • 🪟 Windows: openai-compatible-chat-x86_64-pc-windows-msvc.zip (MSVC)
    • 🐧 Linux: openai-compatible-chat-x86_64-unknown-linux-musl.zip (static MUSL)
    • 🍎 macOS: openai-compatible-chat-aarch64-apple-darwin.zip (Apple Silicon)
    • 🍎 macOS: openai-compatible-chat-x86_64-apple-darwin.zip (Intel)
    • 🍎 macOS: openai-compatible-chat-macos.zip (universal)
  3. Extract the ZIP archive

  4. On Linux/macOS: make the binary executable (using chmod +x)

  5. Run the included executable

Update

Simply download the latest version from the releases page and replace the old binary.

Configuration

Use the config set-key command to set your API key (this also generates a default configuration automatically).

Use the config set-url command to set a different base URL for API calls if you use another OpenAI-compatible provider.

Use the config show command to see your configuration.

{
  "api_key": "YOUR_API_KEY",
  "base_url": "https://api.openai.com/v1",
  "exclude_model_name_regex": [
    "chat-latest"
  ],
  "default_system_prompt": ""
}

Use the model-info update command to automatically fetch the most up-to-date model info.

Use the model-info show command to see the current metadata (pipe it to jq for prettier output!)

{
  ...
  "gpt-3.5-turbo": {
    "context_window": 16384,
    "description": "Fast, cost-effective chat model",
    "family": "gpt-3.5",
    "release": "2023-03-01",
    "type": "chat"
  },
  ...
}

A proxy should be detected automatically from your http_proxy and https_proxy environment variables.
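To check which proxy would apply to a given base URL, the usual convention can be sketched as follows. This is only an illustration of the convention, not the project's own code: `proxy_for` is a hypothetical helper, and the actual lookup is presumably done by the HTTP client library.

```rust
/// Hypothetical helper (not part of the project): returns the proxy that the
/// conventional environment variables would select for `url`.
/// `lookup` abstracts the environment so the function stays easy to test.
pub fn proxy_for(url: &str, lookup: impl Fn(&str) -> Option<String>) -> Option<String> {
    // By convention, https_proxy applies to https:// URLs, http_proxy to the rest.
    let var = if url.starts_with("https://") {
        "https_proxy"
    } else {
        "http_proxy"
    };
    // An empty value is treated as "no proxy configured".
    lookup(var).filter(|value| !value.is_empty())
}
```

With the real environment, it could be called as `proxy_for("https://api.openai.com/v1", |name| std::env::var(name).ok())`.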

Usage

Common parameters

Usage: openai-compatible-chat.exe [OPTIONS] <COMMAND>

Commands:
  config
  model-info
  cli
  web
  help        Print this message or the help of the given subcommand(s)

Options:
  -t, --api-timeout-sec <API_TIMEOUT_SEC>
  -c, --config-file <CONFIG_FILE>
  -i, --info-file <INFO_FILE>
  -m, --model-lock <MODEL_LOCK>
      --log-file <LOG_FILE>
  -h, --help                               Print help
  -V, --version                            Print version

You can bypass the model selection menu with:

--model-lock gpt-4o cli

Behavior:

  • verifies that the model exists in the list retrieved via the API
  • applies filters (regex + compatible types)
  • if valid → starts the conversation directly
  • otherwise → error message
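The behavior above can be sketched in a few lines of Rust. This is a minimal illustration, not the project's actual implementation: `resolve_model_lock` is a hypothetical name, the "compatible types" filter is left out, and plain substring matching stands in for the regex filters to keep the sketch dependency-free.

```rust
/// Hypothetical sketch of the `--model-lock` check (not the project's code).
/// `excluded` uses substring matching as a stand-in for the regex patterns
/// configured in `exclude_model_name_regex`.
pub fn resolve_model_lock(
    requested: &str,
    api_models: &[&str],
    excluded: &[&str],
) -> Result<String, String> {
    // 1. The model must exist in the list retrieved via the API.
    if !api_models.iter().any(|m| *m == requested) {
        return Err(format!("model '{requested}' was not returned by the API"));
    }
    // 2. The model must survive the configured filters.
    if let Some(pattern) = excluded.iter().find(|p| requested.contains(**p)) {
        return Err(format!("model '{requested}' is excluded by filter '{pattern}'"));
    }
    // 3. Valid: the conversation can start directly with this model.
    Ok(requested.to_string())
}
```

Returning a `Result` keeps the two outcomes explicit: `Ok` carries the model to start the conversation with, `Err` carries the error message to display.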

CLI mode

Usage: openai-compatible-chat.exe cli [OPTIONS]

Options:
      --theme <THEME>            [default: dark]
      --refresh-ms <REFRESH_MS>  [default: 100]
  -h, --help                     Print help

A light theme is available, but it might not render well in your terminal: terminals are designed to be dark, so many "fixes" are applied along the way (by the window manager and other layers) to "make it work".

Web mode

Usage: openai-compatible-chat.exe web --port <PORT>

Options:
  -p, --port <PORT>  Port to listen on
  -b, --bind <BIND_ADDR>  Address to bind to [default: localhost]
  -h, --help         Print help

Then open http://localhost:PORT in your browser.

Service

Once you are happy with the configuration, you can set it to start automatically in web mode.

To do so, install it as a service:

... service [--user] install --port N [--bind addr]
... service [--user] start
... service [--user] restart
... service [--user] stop
... service [--user] uninstall

IMPORTANT

  • the service runs from the location where the install command was executed
  • you can install it
    • in "admin" mode (the default on Linux)
    • in user mode with --user (Linux only, with systemd and similar service managers; runs as the current user)

Dev Workflow

Proxy: if needed, set the VSCode setting rust-analyzer.cargo.extraEnv:

"rust-analyzer.cargo.extraEnv": {
  "ALL_PROXY": "http://YOUR_PROXY_IP:3128"
}

Install prerequisites:

# required for the rust-analyzer VSCode extension
rustup component add rust-src

# wasm toolchain
rustup target add wasm32-unknown-unknown

# adds the code formatter (only) from nightly
# IMPORTANT: builds are done using stable !
rustup toolchain install nightly
rustup component add rustfmt --toolchain nightly
# remove everything but rustc, so that `rustup update` does not show an error
rustup component remove cargo clippy rust-docs rust-std --toolchain nightly

# tool for hot-building/reloading wasm and static files
cargo install trunk

# tool for hot-building/reloading native code
cargo install watchexec-cli

Dependency management

# show duplicated (often transitively pulled) versions
cargo tree -d --depth 1

# show unused dependencies
cargo +nightly udeps

# unlike "cargo update", shows versions beyond semver
cargo outdated

Model info

The JSON files in ai_model_info must be kept up to date, as they are used as metadata to filter which models to use for each function.

However, this data is not officially and centrally available, and must be periodically updated to add new models (returned by the API) using public data.

The recommended workflow to update model info:

  • use Claude.ai since it has internet access and does the job
  • for a model info file that needs updating:
    • extract all incomplete models (those with null fields) from the JSON file
    • get the list of model ids retrieved from the API for which you have no info (see logs)
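The first extraction step above (finding entries with null fields) can be sketched as follows. This is a hedged illustration only: the field names follow the sample shown by model-info show, while the Rust types, the `ModelInfo` struct and the `incomplete_ids` function are assumptions, and JSON parsing is left out.

```rust
/// Hypothetical mirror of one metadata entry; a JSON null maps to `None`.
#[derive(Debug, Clone, Default)]
pub struct ModelInfo {
    pub context_window: Option<u32>,
    pub description: Option<String>,
    pub family: Option<String>,
    pub release: Option<String>,
    /// The JSON key is "type", which is a reserved word in Rust.
    pub kind: Option<String>,
}

impl ModelInfo {
    /// An entry is incomplete as soon as one of its fields is still null.
    pub fn is_incomplete(&self) -> bool {
        self.context_window.is_none()
            || self.description.is_none()
            || self.family.is_none()
            || self.release.is_none()
            || self.kind.is_none()
    }
}

/// Returns the sorted ids of the entries that still need to be filled in.
pub fn incomplete_ids(models: &[(String, ModelInfo)]) -> Vec<&str> {
    let mut ids: Vec<&str> = models
        .iter()
        .filter(|(_, info)| info.is_incomplete())
        .map(|(id, _)| id.as_str())
        .collect();
    ids.sort_unstable();
    ids
}
```

The resulting list of ids is what would be extracted into the batch of incomplete JSON.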

Paste the batch of incomplete JSON, then submit the prompt below with your list of missing models:

Can you please update my attached incomplete json model metadata compilation
WITH ACCURATE DATA (no hallucinating !!) from up-to-date sources,
for all AI model id listed below, which i just got from the AI provider API

    ```
    gpt-5.4-nano-2026-03-17
    gpt-5.4-mini-2026-03-17
    ```

Wait for the result, then paste it back into the original JSON file.

Then run the command below to pretty-print the JSON, which makes commits produce a clean diff that can be reviewed to track changes:

cargo run -p ai_model_info ai_model_info

Review the changes made, and verify they are "consistent".

Copy Claude's summary of actions (use the "copy" button to get it in markdown format!)

Commit, making sure to archive Claude's explanation in the commit message.

Debug

Start the backend (use the port from the backend key in the [[proxy]] section of wasm/Trunk.toml):

watchexec --clear --quiet --restart --debounce 1s --stop-signal SIGTERM --ignore "wasm/**" --exts rs cargo run -p native -- web --port 3000

Hot-build and reload Rust/WASM code and serve static files:

cd wasm
watchexec --clear --quiet --restart --debounce 1s --stop-signal SIGTERM --watch "../portable" --exts rs trunk serve

Hot-build documentation if needed:

watchexec --clear --quiet --restart --debounce 10s --stop-signal SIGTERM --watch Cargo.lock cargo doc --locked

Visit http://localhost:8080 in your browser (as stated below, the favicon.svg will not show)

Release

cd wasm && trunk build --release

IMPORTANT: when the embed feature is enabled, the static files are re-embedded each time the module file that embeds them is touched.

NOTE: the favicon.svg is not managed by trunk. It must be copied manually to the dist folder before building native, otherwise it will not be available.

cd ..
cargo build --release

Architecture

Components and links

That's the entire stack:

  • Axum + async-openai on the back
  • Leptos + Trunk on the front

No database, no auth middleware, no extra complexity.

For local use only, though you could "open" it, if you do not mind sharing your API key.

Backend: Axum

A lightweight, modern Rust web framework built on Tokio, with excellent support for streaming responses via SSE (Server-Sent Events). It serves two purposes here: proxying requests to the OpenAI-compatible provider (keeping your API key server-side), and serving the compiled WASM frontend as static files.

Frontend: Leptos

A good fit for a simple reactive SPA in Rust/WASM right now. It has a clean component model, handles async and reactive state elegantly, and its compiled output is very small. No router is needed: only core reactivity and the component system are used.

Build tooling: Trunk

The standard tool for building and bundling Rust WASM frontends. It handles WASM compilation, asset pipeline, and dev server with hot-reload out of the box. Zero config for a simple project like this.

Application flow

The streaming flow: Leptos frontend sends a fetch request → Axum backend forwards it to OpenAI with streaming enabled → Axum streams tokens back as SSE → Leptos reads the SSE stream and appends tokens to the UI reactively.
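The last step of this flow (reading SSE and appending tokens) can be sketched without any framework. This is a minimal illustration under stated assumptions: `sse_payloads` and `append_tokens` are hypothetical names, each event is assumed to carry a raw text token in its data field, and the "[DONE]" end marker mirrors the OpenAI streaming convention, which the backend may or may not relay.

```rust
/// Extracts the `data` payload of each complete SSE event in `stream`.
/// Events end with a blank line; several `data:` lines in one event are
/// joined with '\n'. Other fields (event:, id:, comments) are ignored.
pub fn sse_payloads(stream: &str) -> Vec<String> {
    let mut payloads = Vec::new();
    let mut current: Vec<&str> = Vec::new();
    for line in stream.lines() {
        if line.is_empty() {
            // A blank line dispatches the event accumulated so far.
            if !current.is_empty() {
                payloads.push(current.join("\n"));
                current.clear();
            }
        } else if let Some(rest) = line.strip_prefix("data:") {
            // A single leading space after the colon is not part of the payload.
            current.push(rest.strip_prefix(' ').unwrap_or(rest));
        }
    }
    payloads
}

/// Appends every token to `transcript`; returns true once the assumed
/// "[DONE]" end marker has been seen.
pub fn append_tokens(transcript: &mut String, stream: &str) -> bool {
    for payload in sse_payloads(stream) {
        if payload == "[DONE]" {
            return true;
        }
        transcript.push_str(&payload);
    }
    false
}
```

In the real frontend, the transcript would live in a reactive signal so that each appended token triggers a UI update; a plain `String` stands in for it here.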

What's next?

There is surely something fun to do!
