Simple command-line and web chatbot written in Rust.
It is compatible with:
- any OpenAI-compatible provider (public or private)
- any model supporting the /v1/chat/completions endpoint
About the code
- the majority of the code was first written by Claude.ai, and I used ChatGPT for various things
  - I spent about 10% of the time beforehand, drafting a prompt with the design of the product
  - about 75% of the "generate working code" job was done in 10% of the time by Claude
  - I spent 80% of the time ... doing 25% of the "quality work" I enjoy (learn by reworking)
- current state
  - reworked the whole thing to learn about each component and technology
  - cleaned, refactored and improved the parts until I was satisfied with their shape
  - added error management, which was entirely missing (because I did not request it at first)
- interactive model selection
- streaming responses
- disposable history (no storage at all)
- token display (exact or estimated, cache efficiency on info log level)
- model filtering (regex) via configuration
- error handling (forbidden model, context overflow, plus the other unhappy paths)
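The estimated token count and the context overflow check listed above could be approximated along these lines. This is only a minimal sketch, not the project's actual code: the `estimate_tokens` and `fits_context` names are hypothetical, and the ratio of four characters per token is an assumed rule of thumb for English text. A real tokenizer gives different numbers.

```rust
/// Rough token estimate.
/// ASSUMPTION: about 4 characters per token (a rule of thumb, not the
/// project's real counting method).
fn estimate_tokens(text: &str) -> usize {
    let chars = text.chars().count();
    // Round up so that any non-empty text counts as at least one token.
    (chars + 3) / 4
}

/// Returns true when the estimated prompt size, plus the room reserved for
/// the reply, fits inside the model's context window.
fn fits_context(prompt: &str, reserved_for_reply: usize, context_window: usize) -> bool {
    estimate_tokens(prompt) + reserved_for_reply <= context_window
}
```

For example, `fits_context(prompt, 1024, 16384)` checks a prompt against the 16384-token window shown for gpt-3.5-turbo in the model info sample, keeping 1024 tokens free for the answer.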
Web interface
CLI in dark mode
Binaries are automatically generated at each release.
- Go to the releases page
- Download the archive for your system:
  - Windows: openai-compatible-chat-x86_64-pc-windows-msvc.zip (MSVC)
  - Linux: openai-compatible-chat-x86_64-unknown-linux-musl.zip (static MUSL)
  - macOS: openai-compatible-chat-aarch64-apple-darwin.zip (Apple Silicon)
  - macOS: openai-compatible-chat-x86_64-apple-darwin.zip (Intel)
  - macOS: openai-compatible-chat-macos.zip (universal)
- Extract the ZIP archive
- On Linux/macOS: make the binary executable (using chmod +x)
- Run the included executable
  - after setting up a configuration file
  - and taking a look at the usage
To update, simply download the latest version from the releases page and replace the old binary.
Use the config set-key command to set your API key (this also generates a default configuration automatically).
Use the config set-url command to set another base URL for API calls if you use another OpenAI-compatible provider.
Use the config show command to see your configuration.
{
  "api_key": "YOUR_API_KEY",
  "base_url": "https://api.openai.com/v1",
  "exclude_model_name_regex": [
    "chat-latest"
  ],
  "default_system_prompt": ""
}

Use the model-info update command to automatically fetch the most up-to-date model info.
Use the model-info show command to see the current metadata (pipe it to jq for prettier output!)
{
  ...
  "gpt-3.5-turbo": {
    "context_window": 16384,
    "description": "Fast, cost-effective chat model",
    "family": "gpt-3.5",
    "release": "2023-03-01",
    "type": "chat"
  },
  ...
}

The proxy should be detected automatically from your http_proxy and https_proxy environment variables.
Common parameters
Usage: openai-compatible-chat.exe [OPTIONS] <COMMAND>

Commands:
  config
  model-info
  cli
  web
  help        Print this message or the help of the given subcommand(s)

Options:
  -t, --api-timeout-sec <API_TIMEOUT_SEC>
  -c, --config-file <CONFIG_FILE>
  -i, --info-file <INFO_FILE>
  -m, --model-lock <MODEL_LOCK>
      --log-file <LOG_FILE>
  -h, --help                               Print help
  -V, --version                            Print version
You can bypass the model selection menu with:
--model-lock gpt-4o cli

Behavior:
- verifies that the model exists in the list retrieved via the API
- applies filters (regex + compatible types)
- if valid → starts the conversation directly
- otherwise → error message
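The checks listed above can be pictured with the following sketch. It is illustrative only: `resolve_model_lock` and `LockError` are hypothetical names, and exclusion patterns are matched as plain substrings to keep the example dependency-free, whereas the tool itself applies regular expressions and also filters on compatible model types.

```rust
/// Reasons a `--model-lock` value can be rejected (hypothetical type).
#[derive(Debug, PartialEq)]
enum LockError {
    /// The model id was not returned by the API.
    NotFound(String),
    /// The model id exists but is rejected by an exclusion pattern.
    Filtered(String),
}

/// Validates a locked model id.
///
/// `available` stands for the ids returned by the API, `excluded` for the
/// entries of `exclude_model_name_regex`.
/// ASSUMPTION: patterns are treated as plain substrings here, not as regexes.
fn resolve_model_lock(
    requested: &str,
    available: &[&str],
    excluded: &[&str],
) -> Result<String, LockError> {
    // Step 1: the model must exist in the list retrieved via the API.
    if !available.iter().any(|id| *id == requested) {
        return Err(LockError::NotFound(requested.to_string()));
    }
    // Step 2: the model must not match any exclusion pattern.
    if excluded.iter().any(|pat| requested.contains(*pat)) {
        return Err(LockError::Filtered(requested.to_string()));
    }
    // Valid: the conversation can start directly with this model.
    Ok(requested.to_string())
}
```

Returning a `Result` keeps the two failure cases distinct, so the caller can print a precise error message instead of falling back silently to the selection menu.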
Usage: openai-compatible-chat.exe cli [OPTIONS]

Options:
      --theme <THEME>            [default: dark]
      --refresh-ms <REFRESH_MS>  [default: 100]
  -h, --help                     Print help
A light theme is available but might render poorly in your terminal:
terminals are designed to be dark, so the window manager and other layers
apply their own "fixes" to make things work, and these can interfere with it.
Usage: openai-compatible-chat.exe web --port <PORT>

Options:
  -p, --port <PORT>       Port to listen on
  -b, --bind <BIND_ADDR>  Address to bind to [default: localhost]
  -h, --help              Print help
Then open http://localhost:PORT in your browser.
Once you are happy with the configuration, you can set it to autostart in web mode.
You can install it as a service:
... service [--user] install --port N [--bind addr]
... service [--user] start
... service [--user] restart
... service [--user] stop
... service [--user] uninstall

IMPORTANT
- it will run from the place where it was executed
- you can install it
  - in "admin" mode (by default, Linux)
  - in user mode (Linux only, with systemd and friends; uses the running user)

IMPORTANT
- does not seem to work yet on Windows (see What's next?)
Proxy: if needed, set the VSCode setting rust-analyzer.cargo.extraEnv:
"rust-analyzer.cargo.extraEnv": {
  "ALL_PROXY": "http://YOUR_PROXY_IP:3128"
}

Install prerequisites:
# required for the rust-analyzer VSCode extension
rustup component add rust-src
# wasm toolchain
rustup target add wasm32-unknown-unknown
# adds the code formatter (only) from nightly
# IMPORTANT: builds are done using stable !
rustup toolchain install nightly
rustup component add rustfmt --toolchain nightly
# remove everything but rustc, so that `rustup update` does not show an error
rustup component remove cargo clippy rust-docs rust-std --toolchain nightly
# tool for hot-building/reloading wasm and static files
cargo install trunk
# tool for hot-building/reloading native code
cargo install watchexec-cli

Dependency management
# show duplicated (often transitively pulled) versions
cargo tree -d --depth 1
# show unused dependencies
cargo +nightly udeps
# unlike "cargo update", shows versions beyond semver
cargo outdated

The JSON files in ai_model_info must be kept up to date, as they are used as metadata to filter which models to use for each function.
However, this data is not officially and centrally available, and must be periodically updated to add new models (returned by the API) using public data.
The recommended workflow to update model info:
- use Claude.ai since it has internet access and does the job
- for a model info file that needs updating:
  - extract all incomplete models (those with null fields) from the JSON file
  - get the list of model ids retrieved from the API for which you have no info (see logs)
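To make "incomplete" concrete, here is a hedged sketch of how such entries could be modelled and listed. The field names follow the gpt-3.5-turbo sample entry shown earlier. The `ModelInfo` struct, the `kind` field (standing for the JSON key "type"), the assumption that every field may be null, and the `incomplete_ids` helper are illustrative and not taken from the ai_model_info crate.

```rust
/// Metadata for one model, mirroring the fields of the sample entry.
/// A `None` field corresponds to a `null` in the JSON file.
/// ASSUMPTION: every field is nullable; the real schema may differ.
#[derive(Debug, Clone, PartialEq)]
struct ModelInfo {
    context_window: Option<u32>,
    description: Option<String>,
    family: Option<String>,
    release: Option<String>,
    /// Stands for the JSON key "type", which is a reserved word in Rust.
    kind: Option<String>,
}

impl ModelInfo {
    /// An entry is incomplete as soon as one field is missing.
    fn is_incomplete(&self) -> bool {
        self.context_window.is_none()
            || self.description.is_none()
            || self.family.is_none()
            || self.release.is_none()
            || self.kind.is_none()
    }
}

/// Lists the ids of the entries that still need to be filled in.
fn incomplete_ids<'a>(models: &[(&'a str, ModelInfo)]) -> Vec<&'a str> {
    models
        .iter()
        .filter(|(_, info)| info.is_incomplete())
        .map(|(id, _)| *id)
        .collect()
}
```

The resulting list is the batch to paste into the prompt, next to the model ids that the API returned and that have no entry at all.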
Paste the batch of incomplete JSON, then submit the prompt below with your list of missing models:
Can you please update my attached incomplete json model metadata compilation
WITH ACCURATE DATA (no hallucinating !!) from up-to-date sources,
for all AI model id listed below, which i just got from the AI provider API
```
gpt-5.4-nano-2026-03-17
gpt-5.4-mini-2026-03-17
```
Wait for the result, then paste it back into the original JSON file.
Then run the command below to pretty-print the JSON, which makes commits produce a clean diff that can be reviewed to track changes:
cargo run -p ai_model_info ai_model_info

Review the changes made, and verify they are "consistent".
Copy Claude's summary of actions (use the "copy" button to get it in markdown format!)
Commit, making sure to archive Claude's explanation in the commit message.
Start the backend (use the port from key backend in section [[proxy]] of wasm/Trunk.toml):
watchexec --clear --quiet --restart --debounce 1s --stop-signal SIGTERM --ignore "wasm/**" --exts rs cargo run -p native -- web --port 3000

Hot-build and reload Rust/WASM code and serve static files:
cd wasm
watchexec --clear --quiet --restart --debounce 1s --stop-signal SIGTERM --watch "../portable" --exts rs trunk serve

Hot-build documentation if needed:
watchexec --clear --quiet --restart --debounce 10s --stop-signal SIGTERM --watch Cargo.lock cargo doc --locked

Visit http://localhost:8080 in your browser (as stated below, favicon.svg will not show).
cd wasm && trunk build --release

IMPORTANT: when the embed feature is enabled, the static files are re-embedded each time the module file where they are embedded is touched.
NOTE: favicon.svg is not managed by trunk; it should be manually copied to the dist folder before building native, otherwise it will not be available.
cd ..
cargo build --release

That's the entire stack:
- Axum + async-openai on the back
- Leptos + Trunk on the front.
No database, no auth middleware, no extra complexity.
For local use only, though you could "open" it, if you do not mind sharing your API key.
Axum: the simplest, most modern Rust web framework. Lightweight, built on Tokio, and with excellent support for streaming responses via SSE (Server-Sent Events). It serves two purposes: proxying requests to OpenAI (keeping your API key server-side), and serving the compiled WASM frontend as static files.
Leptos: the best choice for a simple reactive SPA in Rust/WASM right now. It has a clean component model, handles async and reactive state elegantly, and its compiled output is very small. No router needed: only core reactivity and the component system are used.
Trunk: the standard tool for building and bundling Rust WASM frontends. It handles WASM compilation, asset pipeline, and dev server with hot-reload out of the box. Zero config for a simple project like this.
The streaming flow: Leptos frontend sends a fetch request → Axum backend forwards it to OpenAI with streaming enabled → Axum streams tokens back as SSE → Leptos reads the SSE stream and appends tokens to the UI reactively.
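The last step of that flow, reading the SSE stream, can be sketched without any framework. This is an illustration under stated assumptions, not the project's code: `sse_data_payloads` is a hypothetical helper, it assumes the backend relays events in the OpenAI streaming format (one `data:` line per chunk, ending with a `data: [DONE]` sentinel), and it treats each `data:` line as a separate payload instead of implementing the full SSE rules for multi-line events.

```rust
/// Extracts the `data:` payloads from a piece of SSE text.
/// Stops at the `[DONE]` sentinel used by OpenAI-style streams.
/// SIMPLIFICATION: each `data:` line is returned as its own payload.
fn sse_data_payloads(body: &str) -> Vec<String> {
    let mut payloads = Vec::new();
    for line in body.lines() {
        // Only `data:` fields carry content; blank separator lines,
        // comments (lines starting with ':') and other fields are skipped.
        let Some(rest) = line.strip_prefix("data:") else {
            continue;
        };
        let payload = rest.trim_start();
        if payload == "[DONE]" {
            // End of the stream: nothing after this is meaningful.
            break;
        }
        payloads.push(payload.to_string());
    }
    payloads
}
```

Each returned payload is a JSON chunk. The frontend would then decode it and append the contained text to the conversation as it arrives.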
There is surely something fun to do!

