Conversation
|
How do I do the setup? |
Copy an image to clipboard (screenshot, browser, etc.) and paste into the Hermes CLI. The image is saved to ~/.hermes/images/, shown as a badge above the input ([📎 Image #1]), and sent to the model as a base64-encoded OpenAI vision multimodal content block. Implementation: - hermes_cli/clipboard.py: clean module with platform-specific extraction - macOS: pngpaste (if installed) → osascript fallback (always available) - Linux: xclip (apt install xclip) - cli.py: BracketedPaste key handler checks clipboard on every paste, image bar widget shows attached images, chat() converts to multimodal content format, Ctrl+C clears attachments Inspired by @m0at's fork (https://github.com/m0at/hermes-agent) which implemented image paste support for local vision models. Reimplemented cleanly as a separate module with tests.
Fixes NousResearch#633 Problem: - Sequential numbering gaps (e.g., NousResearch#1, NousResearch#2, NousResearch#5, NousResearch#8) confuse users - 200 char truncation too aggressive - Tool messages completely hidden with no indication Fix: 1. Use separate counter for displayed messages only 2. Skip tool messages but show count at end 3. Skip system messages 4. Increase truncation to 300 chars 5. Display 'N tool messages hidden' summary Impact: - Consistent numbering: NousResearch#1, NousResearch#2, NousResearch#3, NousResearch#4 - Users know when tool calls occurred - More context visible per message
Three bugs fixed in gateway fallback chain:
1. _load_fallback_chain() called build_auto_chain() which scanned env vars
without primary provider context, causing the active provider to appear
in its own fallback list and triggering API mode switches mid-conversation
(anthropic_messages -> chat_completions) on transient errors.
Fix: return None when no explicit config exists.
2. Fallback notify callback captured asyncio event loop inside run_sync()
which runs in a thread pool via run_in_executor. Python 3.11+ raises
RuntimeError('no running event loop') in worker threads.
Fix: build the callback in the async _run_agent context after
loop = asyncio.get_event_loop(), before run_in_executor call.
3. When no fallback chain was configured, AIAgent.__init__ auto-built one
from env vars (same bug as NousResearch#1 but inside the agent).
Fix: always pass explicit FallbackChain(entries=[], enabled=False)
when gateway has no chain configured.
teknium1
left a comment
There was a problem hiding this comment.
REQUEST_CHANGES
This change is directionally reasonable (429s should distinguish transient rate limits from daily-quota exhaustion), but the implementation reintroduces the exact bug it sets out to fix.
The message classifier keys on if "requests per" in err_message.lower(). That substring is wrong in both directions:
- Daily-quota 429s commonly say "... requests per day" -> matched as transient -> user told to "retry shortly" when they're out of daily quota. FALSE POSITIVE.
- Per-minute token limits ("tokens per minute"/TPM) or "rate limit exceeded, retry after Ns" lack "requests per" -> fall through to "Daily quota exhausted. See /gquota". FALSE NEGATIVE.
It also contradicts the code path in the same function: code is derived from the structured error_reason (RATE_LIMIT_EXCEEDED / MODEL_CAPACITY_EXHAUSTED), but the human-facing message ignores error_reason and matches English prose, so code and message can disagree.
Recommendation: derive the message from error_reason (the same reliable signal used for code), or at minimum match explicit windows ("per day"/daily vs "per minute"/RPM/TPM). Adding a unit test covering "requests per day" (daily) and "tokens per minute" (transient) would lock this in. Requesting changes until the classifier no longer misclassifies these common cases.
|
REQUEST_CHANGES — verdict at a glance The |
Uh oh!
There was an error while loading. Please reload this page.