Releases: ggozad/oterm
0.15.0
Highlights
oterm 0.15.0 is the multi-provider release. Underneath the same TUI, chats now run through pydantic-ai, so any provider it supports is available alongside Ollama. The chat UI also got a refresh, MCP integration was rewritten, and the test suite was rebuilt from scratch.
Breaking changes
- **Multi-provider via pydantic-ai.** `oterm` is no longer Ollama-only. Set the matching API key for OpenAI, Anthropic, Google (AI / Vertex), Groq, Mistral, Cohere, AWS Bedrock, DeepSeek, Cerebras, Grok, or Hugging Face and the provider appears in the new-chat dropdown. OpenAI-compatible endpoints (vLLM, LM Studio, llama.cpp, OpenRouter, LiteLLM, …) are first-class too, configured under `openaiCompatible` in `config.json`.
- **MCP rewrite.** The `mcpServers` config block now uses pydantic-ai's standard schema, which is compatible with Claude Desktop and Cursor configs. MCP prompts have been removed. See docs/mcp for the full migration notes.
- **No more pull-model command.** Use `ollama pull` directly instead.
- **Removed legacy config fields and CSS classes.** Older custom themes that relied on internal class names may need touching up.
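For illustration, an `mcpServers` entry in the Claude Desktop-style schema looks like this (the server name, command, and arguments below are placeholders; see docs/mcp for the authoritative migration notes):

```json
{
  "mcpServers": {
    "git": {
      "command": "uvx",
      "args": ["mcp-server-git", "--repository", "/path/to/repo"],
      "env": {}
    }
  }
}
```

Because the schema matches Claude Desktop and Cursor, existing entries from those tools can typically be pasted in unchanged.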
New
- **Refreshed chat UI.** Borderless, accent-driven layout, auto-growing prompt, inline `[Image #N]` attachment tokens, a thinking section that collapses once the response starts streaming, and a live token-usage footer in place of the spinner.
- **Faster streaming.** Markdown is updated as deltas arrive rather than being re-rendered on every token, so long responses no longer slow the terminal as they grow.
- **Capability-aware new-chat form.** Tools, thinking, and vision toggles are enabled or disabled based on what the chosen provider and model actually support.
- **Generic, pydantic-ai-driven chat parameters.** Temperature, top_p, max_tokens, and seed are forwarded through pydantic-ai's `ModelSettings`, with the supported set discovered per provider. Unknown keys saved by older versions are ignored cleanly.
- **Stick-to-bottom autoscroll** during streaming, with a small threshold so a single line of scroll-back keeps you anchored where you were.
- **Anthropic extended thinking** is handled correctly for non-Opus-4.7 models too: `temperature` and `top_p` are stripped and `max_tokens` is bumped above the thinking budget.
- **OpenAI-compatible endpoints with API keys** can now use `${VAR}` interpolation in `config.json`, with a guard against accidentally leaking `OPENAI_API_KEY` to third-party endpoints.
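As a rough sketch of the `${VAR}` interpolation (the exact `openaiCompatible` field names here are an assumption, not taken from the oterm docs), an OpenRouter entry might look like:

```json
{
  "openaiCompatible": {
    "OpenRouter": {
      "base_url": "https://openrouter.ai/api/v1",
      "api_key": "${OPENROUTER_API_KEY}"
    }
  }
}
```

Using a dedicated variable such as `OPENROUTER_API_KEY`, rather than reusing `OPENAI_API_KEY`, is exactly the pattern the leak guard is meant to encourage.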
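The chat-parameter handling above — forward what the provider supports, silently drop stale keys — can be sketched in a few lines (a minimal illustration, not oterm's actual code; `SUPPORTED` stands in for the per-provider set discovered through pydantic-ai):

```python
# Chat parameters saved by oterm, possibly including keys from older versions.
saved = {"temperature": 0.7, "top_p": 0.9, "max_tokens": 1024,
         "seed": 42, "legacy_option": True}

# Hypothetical set of settings the chosen provider supports.
SUPPORTED = {"temperature", "top_p", "max_tokens", "seed"}

# Unknown keys are dropped cleanly instead of raising.
settings = {k: v for k, v in saved.items() if k in SUPPORTED}

# pydantic-ai's ModelSettings is a TypedDict, so a plain dict like this
# can be passed wherever ModelSettings is expected.
```
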
Improvements
- Friendlier failure messages when Ollama isn't reachable, when an `openai-compat` endpoint is missing from config, when `show_model` fails, and when a chat references a tool that's no longer available.
- Regenerate truncates the full prior turn instead of just the last two messages.
- Escape during streaming now cancels the in-flight inference task.
- Ctrl+Tab cycles through chat tabs in DOM order with priority bindings.
- Edit screen opens cleanly even when a chat's saved provider is no longer configured.
Docs
- Documentation rewritten around the pydantic-ai architecture, including provider setup, `openaiCompatible` configuration, and the new MCP schema.
Tooling
- CI now runs lint, type-check, and tests across Python 3.10 through 3.14 on every push and PR, with coverage reported to Codecov and held at 100%.
- Switched to `ty` for type checking and `ruff` for lint and format.
- Test suite rebuilt around `TestModel`/`FunctionModel`, with no live Ollama or VCR cassettes required.
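The idea behind `TestModel`/`FunctionModel` is to swap the real LLM for a deterministic stand-in so tests need no network or local Ollama. A minimal, library-free sketch of the pattern (the class and method names here are illustrative, not pydantic-ai's actual API):

```python
from typing import Callable


class FunctionModelStub:
    """Stand-in model that answers prompts via a supplied function."""

    def __init__(self, fn: Callable[[str], str]) -> None:
        self.fn = fn

    def run(self, prompt: str) -> str:
        # A real FunctionModel would produce structured responses; a plain
        # string is enough to show the deterministic-test idea.
        return self.fn(prompt)


# A chat test can now assert on output with no live inference server.
model = FunctionModelStub(lambda p: f"echo: {p}")
assert model.run("hello") == "echo: hello"
```
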
0.14.7
0.14.6
0.14.5
0.14.4
0.14.3
0.14.2
0.14.1
0.14.0
What's Changed
- Support Streamable HTTP MCP servers by @ggozad in #252
- Support for bearer authentication in Streamable HTTP MCP servers by @ggozad in #251
- Delay importing the app to avoid sixel detection unless necessary by @ggozad in #248
- Remove completion method in favour of streaming in OllamaLLM. by @ggozad in #250
Full Changelog: 0.13.1...0.14.0