Skip to content

feat: OpenBao token migration and agent service improvements#84

Merged
jaredglaser merged 16 commits intomainfrom
stack/01-openbao-token-migration
Mar 23, 2026
Merged

feat: OpenBao token migration and agent service improvements#84
jaredglaser merged 16 commits intomainfrom
stack/01-openbao-token-migration

Conversation

@jaredglaser
Copy link
Copy Markdown
Owner

@jaredglaser jaredglaser commented Mar 22, 2026

Summary

  • Migrate agent tokens from database to OpenBao KV store
  • Add host secret methods to OpenBao client
  • Refactor agent stats collector to accept token via constructor
  • Simplify pullImage, clean up proxmox client
  • Fix code review and SonarQube findings

Stack

PR 1 of 8 — base: main

Test plan

  • bun run typecheck passes
  • bun test passes (1354 tests)

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Managed host lifecycle: add/remove hosts, provisioning, updates, and health checks with external secret storage.
    • Host-scoped secret operations for storing, retrieving, and deleting secrets.
  • Bug Fixes

    • More robust SSE stream parsing and improved stream error handling.
    • Fixed stack expansion/tracking and icon/variable rendering issues.
  • Refactor

    • Centralized input schemas and reorganized server/data modules.
    • Removed persistent token storage from host records; token handling moved to secret manager.

…vements

- Migrate agent tokens from database to OpenBao KV store
- Add host secret methods to OpenBao client
- Refactor agent stats collector to accept token via constructor
- Simplify pullImage (remove timeout), clean up proxmox client
- Fix various code review and SonarQube findings

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Mar 22, 2026

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 005087eb-19d8-4a3f-9e90-a315368ca539

📥 Commits

Reviewing files that changed from the base of the PR and between 756274f and 573a944.

📒 Files selected for processing (24)
  • agent/src/routes/stats.ts
  • docker-compose.agent.yml
  • src/components/stacks/ComposeEditor.tsx
  • src/components/stacks/__tests__/ComposeEditor.test.tsx
  • src/data/docker/__tests__/schemas.test.ts
  • src/data/docker/schemas.ts
  • src/data/hosts/__tests__/handlers.test.ts
  • src/data/hosts/handlers.ts
  • src/data/settings/__tests__/schemas.test.ts
  • src/data/stacks/__tests__/schemas.test.ts
  • src/data/stacks/schemas.ts
  • src/hooks/useSettings.tsx
  • src/lib/clients/__tests__/openbao-client.test.ts
  • src/lib/clients/openbao-client.ts
  • src/lib/clients/proxmox-client.ts
  • src/lib/hosts/host-utils.ts
  • src/lib/services/__tests__/agent-update-service.test.ts
  • src/lib/services/agent-provisioning-service.ts
  • src/lib/services/agent-update-service.ts
  • src/lib/stacks/stack-service.ts
  • src/routes/api/git.$.ts
  • src/worker/__tests__/collector-factory.test.ts
  • src/worker/collector.ts
  • src/worker/collectors/__tests__/agent-stats-collector.test.ts

📝 Walkthrough

Walkthrough

Refactors host/token management to use OpenBao secrets, removes DB-stored agent tokens, restructures host handlers and server functions, inlines and simplifies agent log/stats streaming and Proxmox helpers, extracts Zod schemas to dedicated files, updates worker collector token wiring, and adjusts many import paths and tests.

Changes

Cohort / File(s) Summary
Agent routes
agent/src/routes/logs.ts, agent/src/routes/stats.ts
Inlined helper logic into handlers; moved Docker log demuxing and stats refresh into handlers; switched SSE error emission to event: error payloads; replaced context-controlled stream state with local variables.
Host management surface
src/data/hosts/functions.tsx, src/data/hosts/handlers.ts, src/data/hosts/schemas.ts
New server functions, dependency-injected handlers, and Zod schemas; add/remove/update/check host flows rewritten to provision agents via Docker and persist tokens to OpenBao (store/delete), with feature-flag gating.
OpenBao client & token APIs
src/lib/clients/openbao-client.ts, src/lib/services/token-service.ts
Added host-scoped KV v2 secret methods (getHostSecret, setHostSecret, deleteHostSecret); removed bcrypt-based token hash/verify exports, leaving only token generation.
Database & migrations
src/lib/database/repositories/*, migrations/013_drop_agent_token_columns.sql, migrations/*
Removed agent token/hash fields from repository types and SQL; deleted/added migrations to drop token columns; adjusted repository queries and removed updateTokenHash.
Worker & collectors
src/worker/collector-factory.ts, src/worker/collector.ts, src/worker/collectors/agent-stats-collector.ts, src/worker/collectors/docker-collector.ts
Collectors now accept injected per-host token (getToken); OpenBao init occurs in worker main; inlined docker-collector metadata/stat row building and container refresh logic; AgentStatsCollector now uses host.name and provided token.
Agent provisioning / update services
src/lib/services/agent-provisioning-service.ts, src/lib/services/agent-update-service.ts
Removed hostIp option and host-bound PortBindings; changed container naming to prefix + hostId; removed rollback flow in updates (stop→remove→create→start); health-check host derived from container env.
Stats repository
src/lib/database/repositories/stats-repository.ts
Removed generic bulkInsert helper; replaced with table-specific per-column array construction and typed unnest INSERTs for Docker/ZFS/Proxmox stats; added try/catch logging.
Schemas extraction
src/data/*/schemas.ts, src/data/*/functions.tsx
Moved inline Zod schemas into dedicated modules for docker, proxmox, zfs, stacks, hosts, settings; functions now import these schemas; added tests for schemas.
Stacks UI & props
src/components/stacks/ComposeEditor.tsx, ComposeEditorLoader.tsx, StackDetail.tsx, StackRow.tsx, StacksTable.tsx, VariablesPanel.tsx
Added required host prop to compose editor/loader/detail; replaced variable extraction implementation; changed stack sorting to name then host; adjusted variable display rendering.
Proxmox types & logic
src/lib/clients/proxmox-client.ts, src/types/proxmox.ts, src/lib/utils/proxmox-overview-builder.ts
Removed exported helpers flattenPerNodeResults/calculateClusterTotals and WithNode<T> type; inlined flattening/totals computation and replaced WithNode with intersection types; silenced dispatcher init error handling change.
Import path & alias migration
src/lib/constants/preload-queries.ts, vite.config.ts, many route/component imports
Systematic transition from @/data/*.functions to @/data/*/functions aliasing and adjusted Vite demo aliases.
Mocks & tests
many src/**/__tests__/* files (hosts, worker, services, clients, schemas)
Widespread test updates: remove token/hash expectations, update fixtures to remove token fields, adapt to new constructors/import paths, add schema tests, simplify Docker/mock behaviors, and adjust SSE test parsing.
Config / compose
docker-compose.agent.yml, bunfig.toml
Make AGENT_TOKEN required in compose env; add coveragePathIgnorePatterns in bunfig to ignore certain server-function files.

Sequence Diagram(s)

mermaid
sequenceDiagram
participant Client
participant ServerFns as Server Functions
participant Handlers as Handlers
participant DB as Database (HostRepo)
participant Docker
participant OpenBao
participant Worker

Client->>ServerFns: POST /addHost { name, socketProxyUrl, agentPort? }
ServerFns->>Handlers: handleAddHost(deps, data)
Handlers->>DB: repo.create({ name, socketProxyUrl, agent_url: '' })
DB-->>Handlers: createdHost (id)
Handlers->>Docker: provision(socketProxyUrl, { hostId, agentPort, agentToken, agentImage })
Docker-->>Handlers: { agentUrl }
Handlers->>OpenBao: storeToken(host.name, agentToken)
OpenBao-->>Handlers: OK
Handlers->>DB: repo.updateAgentUrl(hostId, agentUrl)
DB-->>Handlers: OK
Handlers->>Handlers: retryHealthCheck(agentUrl)
alt healthy
    Handlers->>DB: repo.updateAgentVersion(hostId, version)
    DB-->>Handlers: OK
    Handlers-->>Client: success { host: HostListItem }
    Handlers->>Worker: (worker will later call getHostSecret/getToken)
else unhealthy or failure
    Handlers->>OpenBao: deleteToken(host.name) (best-effort)
    Handlers->>Docker: removeAgent(socketProxyUrl, hostId) (best-effort)
    Handlers->>DB: repo.delete(hostId)
    Handlers-->>Client: error

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Possibly related PRs

Poem

🐰 I hopped through code with tiny paws,
Tokens tucked away from DB claws.
Secrets now snug in OpenBao's den,
Streams simplified, helpers penned —
A carrot for tests and one for fans! 🥕

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch stack/01-openbao-token-migration

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

jaredglaser and others added 2 commits March 22, 2026 14:12
…orts)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Clean up hosts.functions.tsx: remove banner comments, use ManagedHost type
instead of inlining row shape in HostRepo interface.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 27

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
src/lib/utils/proxmox-overview-builder.ts (1)

67-83: ⚠️ Potential issue | 🟠 Major

Type contract is not enforced—node coercion masks malformed database rows.

These three arrays promise a non-null node: string to consumers, but ProxmoxStatsRow.node is defined as string | null in the row type, and the converter does not validate. For entity types qemu, lxc, and storage (per-node resources), a null node value should never occur, but the mappings silently coerce it to '' instead. This hides malformed rows downstream.

Either enforce non-null node at the database/converter layer for these entity types, or validate and throw/skip malformed rows here rather than silently defaulting to empty string.

Applies to lines: 67, 86, 108

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lib/utils/proxmox-overview-builder.ts` around lines 67 - 83, The mapping
that builds vms (and the similar mappings for containers and storage where you
map vmRows/nodeRows) currently coerces ProxmoxStatsRow.node to '' which masks
malformed rows; change the converter to validate that r.node is non-null for
entity types that must have a node (e.g., in the vms mapping for vmRows and the
corresponding container/storage mappings), and either throw a descriptive error
or skip the row when r.node === null instead of defaulting to ''. Update the
mapping logic in the functions/variables named vms (and the analogous arrays for
qemu/lxc/storage) to assert/guard r.node and include the row identity
(vmid/entity_name) in the error message when rejecting rows.
src/data/stacks.functions.tsx (1)

38-43: ⚠️ Potential issue | 🔴 Critical

Add explicit { method: 'POST' } to all mutating server functions.

createServerFn() defaults to GET in TanStack Start. The functions triggerDeploy, saveComposeFile, and updateStackIcon are mutating operations that must explicitly use { method: 'POST' } to prevent unintended side effects from GET caching and automatic prefetching.

Example fix
export const triggerDeploy = createServerFn()
+  .method('POST')
  .inputValidator(triggerDeploySchema)
  .handler(async ({ data }): Promise<{ deployId: number }> => {
    const { triggerStackDeploy } = await import('@/lib/stacks/stack-service');
    return triggerStackDeploy(data);
  });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/data/stacks.functions.tsx` around lines 38 - 43, The mutating server
functions use createServerFn() which defaults to GET; update the createServerFn
call for triggerDeploy (and similarly for saveComposeFile and updateStackIcon)
to explicitly specify POST by changing createServerFn() to createServerFn({
method: 'POST' }) so these handlers are not treated as GET (avoiding
caching/prefetching); ensure you apply the same change to the functions named
triggerDeploy, saveComposeFile, and updateStackIcon and keep their existing
.inputValidator(...).handler(...) chains intact.
src/data/__tests__/hosts.functions.handlers.test.ts (1)

167-175: 🛠️ Refactor suggestion | 🟠 Major

Make the success-path mock reflect what create() stored.

Line 170 still asserts the hard-coded fixture row, so this test never verifies the name/agent_url that handleAddHost() actually persisted and returned. As written, it would stay green even if the handler saved agent_url: '' and returned a blank URL. Have the create mock echo its input and assert the created host fields.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/data/__tests__/hosts.functions.handlers.test.ts` around lines 167 - 175,
Update the test "provisions and creates host on success" so the repo.create mock
echoes the input it receives and the assertions check the persisted values
instead of the hard-coded fixture; specifically modify deps.repo.create to be a
jest mock that returns an object built from the create input (including id and
agent_url/agentUrl derived from socketProxyUrl/agentPort), then change the
expect(result.host.name) to assert the input name ('new') and add an assertion
for the returned agent_url/agentUrl matching the value the handler should have
persisted; keep the existing assertions for deps.provision, deps.storeToken, and
deps.repo.updateStatus.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@agent/src/routes/logs.ts`:
- Around line 55-69: The data handler for logStream currently treats each
chunk/frame as complete and uses a single muxedRemainder buffer, causing
split/truncated lines; modify the logStream.on('data', ...) logic so that
parseTtyChunk and parseMuxedChunk receive and return partial remainders which
are preserved across invocations, maintain separate pending remainder buffers
for multiplexed stdout and stderr (not just muxedRemainder) when using
parseMuxedChunk, and update the surrounding logic to concatenate incoming chunk
+ the appropriate remainder before parsing; also ensure any final unterminated
remainder(s) are flushed/emitted in the logStream 'end' handler so no partial
lines are dropped.

In `@agent/src/routes/stats.ts`:
- Around line 201-207: The initial containers snapshot returned by
docker.listContainers() should be emitted immediately before entering the
refresh loop so clients receive the authoritative (possibly empty) ID list
without waiting for the first refresh; after obtaining the list in the variable
containers (from docker.listContainers) call the same event emission path you
use inside the refresh loop (the code that sends the 'containers' event to the
client) and then proceed to openContainerStream(docker, c) for each container
and enter the lastRefresh / consecutiveFailures refresh loop as before so the
initial state is delivered right away.

In `@docker-compose.agent.yml`:
- Line 56: The docker-compose service port mappings were changed from
loopback-only to all-interfaces (e.g., "9090:9090"), exposing the agent/OpenBao
alongside predictable dev tokens (dev-agent-token, dev-root-token); revert the
port mappings to bind to localhost (use the loopback-style mapping such as
"127.0.0.1:9090:9090" for the affected 9090 and the other port at the other
noted line) or, if you must expose them, add a prominent comment/README note
about the exposure risk and ensure tokens are not defaulted in compose. Update
the compose entries that currently read "9090:9090" (and the analogous mapping
at the other flagged line) to use explicit 127.0.0.1 bindings or document the
risk and rotate/remove dev tokens.

In `@migrations/012_drop_agent_token_columns.sql`:
- Around line 1-4: The migration file named 012_drop_agent_token_columns.sql
conflicts with an existing migration 012_managed_hosts_agent_token.sql; rename
this migration to the next unused sequential prefix (e.g.,
013_drop_agent_token_columns.sql or appropriate next number) and update any
migration manifest or registry that lists migrations so the sequence is unique
and ordered, ensuring the SQL content (DROP COLUMN IF EXISTS statements) is
unchanged.

In `@migrations/012_managed_hosts_agent_token.sql`:
- Around line 1-3: Migrations with the shared "012" prefix conflict and cause
DROP to run before ADD (see 012_drop_agent_token_columns.sql vs
012_managed_hosts_agent_token.sql and 012_managed_hosts_updated_at.sql); resolve
by consolidating the intended changes into a single migration that reflects the
final schema (either keep the column or remove it) or renumber the files to
enforce the correct order (e.g., rename so add runs before any drop), remove
duplicate/obsolete files like the extra 013_managed_hosts_agent_token.sql or the
stray 011_* migration if they are superseded, and verify the final schema state
against the code that expects agent_token so the migrate runner (which sorts
filenames) applies them deterministically.

In `@src/components/stacks/ComposeEditor.tsx`:
- Around line 27-29: ComposeEditor initializes editorContent and detectedVars
from props only once; add a prop-sync path so changes to stackName or incoming
content/initialVariables reset the editor state. Inside ComposeEditor, add a
useEffect that watches stackName, content, and initialVariables and calls
setEditorContent(content) and setDetectedVars(initialVariables), and also
clear/reset isDirty via setIsDirty(false) (or the equivalent state setter) so
the editor reflects canonical content after saves or when switching stacks.
Ensure this useEffect targets the existing state vars editorContent,
detectedVars, and isDirty so the editor reliably follows prop changes.

In `@src/components/stacks/StackRow.tsx`:
- Line 1: The component's iconError state can become stale when the stack's
iconUrl changes; add a useEffect in StackRow that resets iconError by calling
setIconError(false) whenever iconUrl changes (i.e., useEffect(() =>
setIconError(false), [iconUrl])) so a newly provided valid slug will reattempt
loading instead of permanently showing the fallback Layers icon.

In `@src/components/stacks/StacksTable.tsx`:
- Line 24: The sort comparator only compares a.name and can return 0 for stacks
with identical names on different hosts, causing non-deterministic ordering;
update the comparator used where stacks are prepared (the line creating
[...stacks].sort(...)) to compare name first and when equal fall back to
comparing host (e.g., if nameCompare === 0 then return
a.host.localeCompare(b.host)) so ordering is deterministic across
renders/virtualization.

In `@src/data/__tests__/hosts.functions.test.ts`:
- Around line 29-55: Extract the duplicated Zod schemas (socketProxyUrlSchema,
addHostSchema, removeHostSchema, updateAgentSchema, checkHostHealthSchema) into
a single shared module and export them; then in both the server implementation
(hosts.functions.tsx) and this test file replace the local recreated schema
definitions with imports from that shared module so the tests exercise the
actual production schemas, update any paths/exports accordingly, and run tests
to ensure imports compile and validation behavior remains identical.

In `@src/data/hosts.functions.tsx`:
- Around line 205-211: The error message currently asserts that the host
record/container and token were cleaned up unconditionally; change the error
handling in the block around deps.storeToken (and similarly the block around
deps.deleteToken) to track the actual outcomes of cleanup calls (e.g., capture
success/failure booleans from deps.repo.delete, deps.removeAgent, and
deps.deleteToken) and build the thrown Error message to only claim cleanup for
steps that returned success; update the catch blocks in the function containing
deps.storeToken and the one containing deps.deleteToken to record each cleanup
result and include those specific results in the error text rather than blanket
statements.
- Around line 276-280: The handlers are directly instantiating OpenBaoClient
(via OpenBaoClient and loadOpenBaoConfig) and calling ensureSecretsEngine inside
createServerFn, violating the DI rule; move the OpenBaoClient construction and
any calls that touch token/secret paths (e.g., ensureSecretsEngine) into the
server dependency injection input so the server fn receives a
pre-built/open-configured client (and similarly replace direct new
AgentProvisioningService() usage with an injected provService), update
createServerFn signatures to accept these injected deps, and ensure tests can
stub the token path by providing a mocked OpenBaoClient via the injected deps
(also apply the same change for the other occurrence where
OpenBaoClient/loadOpenBaoConfig are constructed).
- Around line 59-64: The new host creation currently writes an empty string for
agent_url so the resolved provisionResult.agentUrl is never persisted; update
the create flow (the function that inserts the new host row and the code path
that later calls toHostListItem) to store provisionResult.agentUrl into the
agent_url column before returning the created host — either include agent_url:
provisionResult.agentUrl in the initial INSERT or immediately perform an UPDATE
using the created row's id, and remove the hard-coded '' value; also apply the
same persistence fix in the later path that builds the response around
toHostListItem (the block around where toHostListItem is called) so subsequent
handleCheckHostHealth()/handleUpdateAgent() reads the persisted URL.

In `@src/hooks/__tests__/useStackExpansion.test.ts`:
- Around line 33-58: Tests in useStackExpansion.test.ts use plain display names
('plex', 'traefik', 'grafana') which can collide across hosts; update the tests
to use host-prefixed entity IDs (e.g., 'hostA/plex', 'hostB/traefik',
'hostA/grafana') when calling result.current.toggleStackExpanded and
result.current.isStackExpanded so the expansion-state keys are unique.
Specifically, change the values passed to toggleStackExpanded and
isStackExpanded in the test cases that call useStackExpansion (the renderHook
wrapper created by createTestWrapper) to host-prefixed IDs to satisfy the
uniqueness contract.

In `@src/lib/clients/proxmox-client.ts`:
- Around line 41-45: The async import of undici's Agent currently races with
get()'s fetch call so the dispatcher may be undefined; ensure the dispatcher is
fully initialized before any fetch by turning the dynamic import into an
initialization promise (e.g., store a this.dispatcherReady Promise when doing
import('undici') and resolving after setting this.fetchOptions.dispatcher) and
await that promise at the start of the get() method (or any method that calls
fetch) so the Agent is set when allowSelfSignedCerts is true; look for the
import('undici') block and the get() method and add awaiting logic tied to a
shared dispatcher-ready symbol.

In `@src/lib/deploy/types.ts`:
- Around line 50-52: Summary: The file both re-exports HostStatus as
ManagedHostStatus and also defines a local alias ManagedHostStatus, which is
redundant. Fix: remove the re-export line "export type { HostStatus as
ManagedHostStatus }" and instead export the local alias by replacing "type
ManagedHostStatus = HostStatus;" with "export type ManagedHostStatus =
HostStatus;". This keeps a single alias definition (ManagedHostStatus) used by
downstream code and preserves the original HostStatus import.

In `@src/lib/hosts/host-utils.ts`:
- Line 58: The agentVersion assignment in host-utils (the ternary using
overrides && 'agentVersion' in overrides ? overrides.agentVersion! :
row.agent_version) can leak undefined when overrides.agentVersion is explicitly
passed as undefined; change the branch that reads overrides.agentVersion to
coerce undefined to null (e.g., use the nullish coalescing behavior) so the
value returned for agentVersion is always string | null; update the expression
that references overrides.agentVersion in that ternary to normalize undefined to
null and keep the rest of the logic unchanged.

In `@src/lib/services/__tests__/agent-update-service.test.ts`:
- Around line 11-50: The test uses the same mockContainer for both the existing
and newly-created container which masks incorrect behavior in updateAgent;
change the mocks so getContainer() returns an "old" container object (e.g.,
oldMockContainer with its own inspect/stop/remove/start spies/state) and
createContainer() returns a distinct "new" container object (e.g.,
newMockContainer with separate spies), then adjust assertions to reference those
distinct mocks to verify updateAgent stops/removes the old container and starts
the new one.

In `@src/lib/services/__tests__/token-service.test.ts`:
- Line 2: The test imports generateToken using a relative path; update the
import in src/lib/services/__tests__/token-service.test.ts to use the project
alias instead of a relative path so it follows the guideline for imports outside
__tests__; specifically replace the ../token-service import with the alias
import that references the generateToken export from token-service (e.g., use
the "@/lib/services/token-service" style path) so the test imports the
generateToken function via the alias.

In `@src/lib/services/agent-provisioning-service.ts`:
- Around line 67-71: The PortBindings entry currently omits HostIp which causes
Docker to bind the agent port on 0.0.0.0 (all interfaces); update the container
creation logic around PortBindings to accept and validate an optional host IP
(e.g., a new options.hostIp or options.managementIp) and, when provided, set
HostIp to that value instead of leaving it out; if no host IP is supplied,
document/explicitly fall back to the current behavior and/or validate that the
deployment is on an isolated management network. Target the PortBindings
construction and any caller that builds options (references: PortBindings,
HostIp, options.agentPort) to implement and test this change.

In `@src/lib/services/agent-update-service.ts`:
- Around line 47-56: The current logic stops and removes existingContainer
immediately, which leaves no rollback if createContainer()/start() or the health
check fails; change the flow to keep existingContainer running until the
replacement is verified healthy: create the replacement container (use a
temporary/new name), start it via its start() call, run the existing health
check routine against the new container (same checks currently used), and only
after the new container passes health remove/stop the old existingContainer; if
the new container fails, remove the failed new container and leave
existingContainer untouched so existingContainer can continue serving.
- Around line 58-68: When creating the replacement container in
docker.createContainer (the block constructing newContainer), don't cherry-pick
HostConfig fields; copy the full oldHostConfig to preserve runtime settings.
Replace the HostConfig object that currently lists only Binds, PortBindings, and
RestartPolicy with a full clone/spread of oldHostConfig (e.g., use the
equivalent of { ...oldHostConfig }) so the new container created with
containerName, Image newImage, Env oldEnv, and ExposedPorts
(inspectData.Config.ExposedPorts) retains all original HostConfig settings.

In `@src/lib/stacks/stack-mappers.ts`:
- Around line 40-42: The extractVariableNames function's regex currently only
matches ${VAR} and ${VAR:-default} but excludes valid Docker Compose forms like
${VAR-default}, ${VAR:?error}/${VAR?error}, and
${VAR:+alternate}/${VAR+alternate}; update the regex in extractVariableNames to
also accept the operators -, :-, ?, :?, +, and :+ and their trailing token
portions (so the capture group still returns the variable name only) OR, if the
narrower scope was intentional, add a clarifying comment above
extractVariableNames describing the deliberate limitation and examples of
unsupported forms.

In `@src/lib/stacks/stack-service.ts`:
- Around line 112-117: The current loop fetches secrets sequentially which
causes N round-trips; change to parallel fetching by mapping variables to
baoClient.getSecret calls, await Promise.all on that array, then iterate results
to build the const secrets: Record<string,string> object (keeping the existing
filter for non-null values). Use the existing symbols variables and
baoClient.getSecret and populate the same secrets record so behavior is
unchanged except for parallelism.

In `@src/routes/api/git`.$.ts:
- Around line 34-36: Wrap the atob(authHeader.slice('Basic '.length)) call in a
try/catch to avoid throwing InvalidCharacterError for malformed Base64: catch
decode errors when computing decoded and set providedToken to undefined (or
leave it empty) so the auth flow returns a 401 challenge instead of propagating
a 500; update the block that currently assigns providedToken (using decoded and
colonIndex) to run only when decoding succeeds and ensure the request handler
returns a 401 on decode failure.

In `@src/worker/collector-factory.ts`:
- Around line 126-136: In createCollectorsForManagedHosts, the await
getToken(host.name) call can throw and abort the whole hosts loop; wrap the
getToken call in a try/catch per host (inside the for loop), log an error
mentioning host.name and the caught error, and continue to the next host if an
exception occurs (preserving the existing falsy-token check/continue for
no-token cases); then proceed to instantiate AgentStatsCollector and call
stack.use only when a token was successfully retrieved.

In `@src/worker/collector.ts`:
- Around line 88-96: Wrap the OpenBao initialization block (when
isDockerManagementEnabled() is true) in a try-catch around loadOpenBaoConfig(),
new OpenBaoClient(...) and await baoClient.ensureSecretsEngine(), catch errors
(specifically ZodError from loadOpenBaoConfig) and log a clear, contextual
message including the error details (e.g., via console.error or existing
processLogger) before rethrowing or exiting so operators see which env/config is
missing; keep getToken assignment to baoClient.getHostSecret(hostname,
'agent_token') inside the try so it only gets set on successful init.

In `@src/worker/collectors/agent-stats-collector.ts`:
- Around line 100-109: The parsed SSE payload assigned to event after JSON.parse
may be a primitive or partial object; update the handling in the block around
JSON.parse so you validate the parsed value before using it as an
AgentStatsEvent: after parsing, check that the result is a non-null object
(typeof result === 'object' && result !== null) and optionally verify expected
keys (e.g., hasOwnProperty('error') or 'containerId') before running "'error' in
event" or other property reads; if the parsed value fails validation, log the
failure (include this.name and a snippet of the payload) and continue the
collector loop to avoid crashing code paths that assume an AgentStatsEvent.
Ensure references to AgentStatsEvent, the local variable event, and the
JSON.parse call are the ones being guarded.

---

Outside diff comments:
In `@src/data/__tests__/hosts.functions.handlers.test.ts`:
- Around line 167-175: Update the test "provisions and creates host on success"
so the repo.create mock echoes the input it receives and the assertions check
the persisted values instead of the hard-coded fixture; specifically modify
deps.repo.create to be a jest mock that returns an object built from the create
input (including id and agent_url/agentUrl derived from
socketProxyUrl/agentPort), then change the expect(result.host.name) to assert
the input name ('new') and add an assertion for the returned agent_url/agentUrl
matching the value the handler should have persisted; keep the existing
assertions for deps.provision, deps.storeToken, and deps.repo.updateStatus.

In `@src/data/stacks.functions.tsx`:
- Around line 38-43: The mutating server functions use createServerFn() which
defaults to GET; update the createServerFn call for triggerDeploy (and similarly
for saveComposeFile and updateStackIcon) to explicitly specify POST by changing
createServerFn() to createServerFn({ method: 'POST' }) so these handlers are not
treated as GET (avoiding caching/prefetching); ensure you apply the same change
to the functions named triggerDeploy, saveComposeFile, and updateStackIcon and
keep their existing .inputValidator(...).handler(...) chains intact.

In `@src/lib/utils/proxmox-overview-builder.ts`:
- Around line 67-83: The mapping that builds vms (and the similar mappings for
containers and storage where you map vmRows/nodeRows) currently coerces
ProxmoxStatsRow.node to '' which masks malformed rows; change the converter to
validate that r.node is non-null for entity types that must have a node (e.g.,
in the vms mapping for vmRows and the corresponding container/storage mappings),
and either throw a descriptive error or skip the row when r.node === null
instead of defaulting to ''. Update the mapping logic in the functions/variables
named vms (and the analogous arrays for qemu/lxc/storage) to assert/guard r.node
and include the row identity (vmid/entity_name) in the error message when
rejecting rows.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 1a636f79-f47c-4bb6-a52b-67b38c45b55b

📥 Commits

Reviewing files that changed from the base of the PR and between a994258 and 4af109e.

📒 Files selected for processing (57)
  • LICENSE
  • agent/src/routes/logs.ts
  • agent/src/routes/stats.ts
  • docker-compose.agent.yml
  • docker-compose.local.yml
  • migrations/010_managed_hosts_updated_at.sql
  • migrations/012_drop_agent_token_columns.sql
  • migrations/012_managed_hosts_agent_token.sql
  • self-hosting/README.md
  • src/components/stacks/ComposeEditor.tsx
  • src/components/stacks/StackDetail.tsx
  • src/components/stacks/StackRow.tsx
  • src/components/stacks/StacksTable.tsx
  • src/components/stacks/VariablesPanel.tsx
  • src/components/stacks/__tests__/ComposeEditor.test.tsx
  • src/data/__tests__/hosts.functions.handlers.test.ts
  • src/data/__tests__/hosts.functions.test.ts
  • src/data/hosts.functions.tsx
  • src/data/stacks.functions.tsx
  • src/hooks/__tests__/useStackExpansion.test.ts
  • src/hooks/useSettings.tsx
  • src/lib/clients/__tests__/openbao-client.test.ts
  • src/lib/clients/__tests__/proxmox-client.test.ts
  • src/lib/clients/openbao-client.ts
  • src/lib/clients/proxmox-client.ts
  • src/lib/database/repositories/__tests__/host-repository.test.ts
  • src/lib/database/repositories/__tests__/managed-hosts-repository.test.ts
  • src/lib/database/repositories/host-repository.ts
  • src/lib/database/repositories/managed-hosts-repository.ts
  • src/lib/database/repositories/stats-repository.ts
  • src/lib/deploy/__tests__/pipeline.test.ts
  • src/lib/deploy/types.ts
  • src/lib/hosts/__tests__/host-utils.test.ts
  • src/lib/hosts/host-utils.ts
  • src/lib/mock/functions/stacks.functions.tsx
  • src/lib/services/__tests__/agent-health-service.test.ts
  • src/lib/services/__tests__/agent-provisioning-service.test.ts
  • src/lib/services/__tests__/agent-update-service.test.ts
  • src/lib/services/__tests__/docker-image-utils.test.ts
  • src/lib/services/__tests__/token-service.test.ts
  • src/lib/services/agent-provisioning-service.ts
  • src/lib/services/agent-update-service.ts
  • src/lib/services/docker-image-utils.ts
  • src/lib/services/token-service.ts
  • src/lib/stacks/__tests__/stack-service.test.ts
  • src/lib/stacks/stack-mappers.ts
  • src/lib/stacks/stack-service.ts
  • src/lib/utils/proxmox-overview-builder.ts
  • src/routes/api/git.$.ts
  • src/types/proxmox.ts
  • src/worker/__tests__/collector-factory.test.ts
  • src/worker/collector-factory.ts
  • src/worker/collector.ts
  • src/worker/collectors/__tests__/agent-stats-collector.test.ts
  • src/worker/collectors/__tests__/docker-collector.test.ts
  • src/worker/collectors/agent-stats-collector.ts
  • src/worker/collectors/docker-collector.ts
💤 Files with no reviewable changes (6)
  • src/lib/services/tests/agent-health-service.test.ts
  • self-hosting/README.md
  • src/lib/deploy/tests/pipeline.test.ts
  • docker-compose.local.yml
  • src/lib/database/repositories/tests/host-repository.test.ts
  • src/lib/services/token-service.ts

Comment thread agent/src/routes/logs.ts
Comment on lines +55 to +69
let muxedRemainder: Buffer = Buffer.alloc(0);

logStream.on('data', (chunk: Buffer) => {
if (closed) return;

let lines: LogLine[];
if (isTty) {
lines = parseTtyChunk(chunk);
} else {
const input = muxedRemainder.length > 0
? Buffer.concat([muxedRemainder, chunk])
: chunk;
const result = parseMuxedChunk(input);
lines = result.lines;
muxedRemainder = result.remainder;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Preserve partial log lines across chunk and frame boundaries.

Both branches currently treat each incoming chunk/frame as a complete line. Docker log streams are byte-oriented, so one logical line can span multiple TTY chunks or multiple multiplexed frames; in multiplexed mode you also need a separate pending fragment per stream (stdout/stderr) to avoid merging interleaved output. As written, this will emit truncated/split log messages and drop any final unterminated fragment unless you carry the text remainder forward and flush it on end.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@agent/src/routes/logs.ts` around lines 55 - 69, The data handler for
logStream currently treats each chunk/frame as complete and uses a single
muxedRemainder buffer, causing split/truncated lines; modify the
logStream.on('data', ...) logic so that parseTtyChunk and parseMuxedChunk
receive and return partial remainders which are preserved across invocations,
maintain separate pending remainder buffers for multiplexed stdout and stderr
(not just muxedRemainder) when using parseMuxedChunk, and update the surrounding
logic to concatenate incoming chunk + the appropriate remainder before parsing;
also ensure any final unterminated remainder(s) are flushed/emitted in the
logStream 'end' handler so no partial lines are dropped.

Comment thread agent/src/routes/stats.ts Outdated
Comment thread docker-compose.agent.yml Outdated
Comment on lines +1 to +4
-- Drop agent token columns from managed_hosts.
-- Tokens are now stored in OpenBao at secret/hosts/<hostname>/agent_token.
ALTER TABLE managed_hosts DROP COLUMN IF EXISTS agent_token;
ALTER TABLE managed_hosts DROP COLUMN IF EXISTS agent_token_hash;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Migration logic is correct, but see numbering concern.

The migration correctly uses DROP COLUMN IF EXISTS for safe removal. The comment appropriately documents that tokens are now stored in OpenBao. However, per the earlier comment, the 012 prefix conflict with 012_managed_hosts_agent_token.sql needs resolution.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@migrations/012_drop_agent_token_columns.sql` around lines 1 - 4, The
migration file named 012_drop_agent_token_columns.sql conflicts with an existing
migration 012_managed_hosts_agent_token.sql; rename this migration to the next
unused sequential prefix (e.g., 013_drop_agent_token_columns.sql or appropriate
next number) and update any migration manifest or registry that lists migrations
so the sequence is unique and ordered, ensuring the SQL content (DROP COLUMN IF
EXISTS statements) is unchanged.

Comment thread migrations/012_managed_hosts_agent_token.sql Outdated
Comment thread src/lib/stacks/stack-service.ts
Comment thread src/routes/api/git.$.ts Outdated
Comment thread src/worker/collector-factory.ts
Comment thread src/worker/collector.ts
Comment on lines +100 to +109
let event: AgentStatsEvent;
try {
parsed = JSON.parse(jsonStr);
event = JSON.parse(jsonStr);
} catch {
console.error(`[${this.name}] Failed to parse SSE event: ${jsonStr.substring(0, 100)}`);
continue;
}

if (typeof parsed !== 'object' || parsed === null) continue;

// Skip error events from the agent
if ('error' in parsed && !('containerId' in parsed)) continue;

const event = parsed as AgentStatsEvent;
if ('error' in event && !('containerId' in event)) continue;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Keep a runtime guard around parsed SSE payloads.

JSON.parse() can return a primitive or a partial object. With the direct assignment here, a malformed-but-valid payload will trip 'error' in event or later property reads and tear down the collector loop for that host. Please validate the parsed value before treating it as AgentStatsEvent. Based on learnings, Agent (agent/) is a separate Bun package running as a sidecar container alongside Docker hosts.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/worker/collectors/agent-stats-collector.ts` around lines 100 - 109, The
parsed SSE payload assigned to event after JSON.parse may be a primitive or
partial object; update the handling in the block around JSON.parse so you
validate the parsed value before using it as an AgentStatsEvent: after parsing,
check that the result is a non-null object (typeof result === 'object' && result
!== null) and optionally verify expected keys (e.g., hasOwnProperty('error') or
'containerId') before running "'error' in event" or other property reads; if the
parsed value fails validation, log the failure (include this.name and a snippet
of the payload) and continue the collector loop to avoid crashing code paths
that assume an AgentStatsEvent. Ensure references to AgentStatsEvent, the local
variable event, and the JSON.parse call are the ones being guarded.

jaredglaser and others added 8 commits March 22, 2026 14:23
Extract handler functions and types to hosts.handlers.ts (fully testable).
Keep only createServerFn wiring in hosts.functions.tsx (untestable dynamic
imports). Exclude server function wiring files from coverage via bunfig.toml.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Extract Zod schemas to hosts.schemas.ts (imported by both functions and tests)
- Replace 595-line test that duplicated schemas with 80-line test importing them
- hosts.functions.tsx now only contains createServerFn wiring (excluded from coverage)
- hosts.handlers.ts has all handler logic (100% function coverage)
- hosts.schemas.ts has all validation (100% coverage)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…s, and functions

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- ComposeEditor: add useEffect to sync editorContent/detectedVars when
  stackName, content, or initialVariables props change
- StackRow: reset iconError state when iconUrl changes
- StacksTable: add host tiebreaker to sort comparator for deterministic
  ordering when stack names are identical across hosts

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The initial container list from listContainers() was only used to open
per-container stats streams but never emitted as an SSE `containers`
event, leaving the client unaware of the initial set of container IDs
until the first refresh cycle (60s later by default).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Agent:
- Emit initial containers snapshot before stats refresh loop
- Logs parsing already handles remainders correctly (verified, no change)

Infrastructure:
- Revert agent/OpenBao ports to loopback-only (127.0.0.1)
- Resolve migration numbering conflicts (consolidate 010-013)

UI:
- ComposeEditor: sync state when stackName/content props change
- StackRow: reset iconError when iconUrl changes
- StacksTable: add host tiebreaker to sort comparator

Host management:
- Persist agent_url after provisioning (was left as empty string)
- Fix agentVersion undefined leak (use ?? null instead of \!)
- Track container cleanup outcome in error messages

Worker/clients:
- Fix proxmox-client dispatcher race condition (await import)
- Wrap git auth atob() in try/catch for malformed Base64
- Wrap getToken() per-host in collector-factory to prevent loop abort
- Wrap OpenBao init in try-catch for better error messages
- Parallelize secret fetching with Promise.all
- Broaden extractVariableNames regex for Docker Compose forms

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 12

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/data/hosts/__tests__/handlers.test.ts (1)

167-175: 🧹 Nitpick | 🔵 Trivial

Assert the stored host fields here, not just that create() ran.

Because mockRepo.create() always resolves to mockRow(), this test still passes if handleAddHost() persists agent_url: '' or the wrong name. Make the mock echo the create input or assert on the call payload so this path actually guards the add-host contract.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/data/hosts/__tests__/handlers.test.ts` around lines 167 - 175, The test
for handleAddHost currently only checks that deps.repo.create was called, but
because mockRepo.create returns a fixed mockRow the test won't catch incorrect
persisted fields; update the test to either make the mock echo the create input
(change mockRepo.create to return the payload it was called with) or assert on
the actual call payload by inspecting deps.repo.create.mock.calls (or
equivalent) to verify fields like name, agent_url/socketProxyUrl, agent_port and
token are set correctly when handleAddHost is invoked; reference handleAddHost,
deps.repo.create (or mockRepo.create), mockRow and deps.storeToken when making
these assertions so the test will fail if incorrect host fields are persisted.
♻️ Duplicate comments (2)
src/components/stacks/ComposeEditor.tsx (1)

1-1: ⚠️ Potential issue | 🟠 Major

Reintroduce prop-to-state sync for editor content/variables.

Line 27 and Line 29 initialize local state from props once, but there’s still no sync path when stackName or content changes. That can keep stale text/variables and make Line 33 isDirty inaccurate after stack switches or post-save canonical updates.

Suggested fix
-import { useCallback, useRef, useState } from 'react';
+import { useCallback, useEffect, useRef, useState } from 'react';
...
 export default function ComposeEditor({ stackName, content, variables: initialVariables }: ComposeEditorProps) {
   const [editorContent, setEditorContent] = useState(content);
   const [detectedVars, setDetectedVars] = useState<string[]>(initialVariables);
+  useEffect(() => {
+    setEditorContent(content);
+    setDetectedVars(initialVariables);
+  }, [stackName, content, initialVariables]);

Also applies to: 27-29, 33-33, 61-65

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/components/stacks/ComposeEditor.tsx` at line 1, ComposeEditor initializes
local state from props (content, variables) once which leads to stale editor
text when props like stackName or content change; add a useEffect inside the
ComposeEditor component that listens for changes to props.stackName and
props.content (and props.variables if present) and calls the state setters
(e.g., setContent, setVariables) to reset the local editor state and any
canonical/original values used to compute isDirty (the isDirty check that
compares current state to the original prop-derived value), so switching stacks
or after save updates immediately clear dirty state and keep the editor in sync.
src/data/hosts/functions.tsx (1)

37-43: 🛠️ Refactor suggestion | 🟠 Major

Move service/client construction out of the server-function body.

OpenBaoClient, AgentProvisioningService, and AgentUpdateService are still being instantiated inside createServerFn() handlers, which keeps this wiring hard to stub and bypasses the repo's DI rule for server functions. As per coding guidelines, "All server logic via createServerFn() + middleware injection. Never create clients directly in server functions."

Also applies to: 60-64, 82-83

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/data/hosts/functions.tsx` around lines 37 - 43, The server functions are
directly instantiating OpenBaoClient (with
loadOpenBaoConfig/ensureSecretsEngine), AgentProvisioningService, and
AgentUpdateService inside createServerFn handlers; move these constructions out
of the handler body and provide them via dependency injection or middleware
instead — either create and export a shared factory or provider that constructs
OpenBaoClient (calling loadOpenBaoConfig and ensureSecretsEngine) and the
AgentProvisioningService/AgentUpdateService once at module scope, or accept them
as injected dependencies into createServerFn so the handlers receive
already-constructed instances; update all handlers that call new OpenBaoClient,
new AgentProvisioningService, or new AgentUpdateService to use the
injected/provider instances and remove any direct "new" usage from inside the
server function bodies.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@bunfig.toml`:
- Around line 5-8: The coverage exclusion is hiding testable handler logic
inside src/data/*/functions.tsx; extract pure handler functions into a new
module (e.g., create src/data/<area>/handlers.ts) and move logic into exported
functions such as handleListStacks and handleGetStackDetail, then update the
original functions.tsx to only wire createServerFn handlers (e.g., listStacks,
getStackDetail) that import and call those handlers; adjust tests to target the
new handlers module and narrow the coverage ignore to only the minimal wiring
code (the functions.tsx or specific wiring lines) rather than excluding the
entire file.

In `@src/components/stacks/StackDetail.tsx`:
- Around line 7-18: StackDetail's queries are keyed only by stackName causing
cache collisions across hosts; update StackDetailProps to accept the host (from
StackSummary) and include it in both query keys and any entity IDs used: change
StackDetailProps to include host, update the component signature StackDetail({
stackName, host }), and modify the useQuery queryKey arrays to ['stack-detail',
host, stackName] and ['deploy-history', host, stackName] (and update
getStackDetail call/payload if you need to pass host there as well).

In `@src/data/hosts/functions.tsx`:
- Around line 60-68: The OpenBao bootstrap (loadOpenBaoConfig and new
OpenBaoClient) is happening eagerly and can throw before handleRemoveHost runs;
move that initialization into the best-effort deleteToken path so failures only
downgrade to warnings. Modify the wrapper passed to handleRemoveHost: replace
the preconstructed baoClient and loadOpenBaoConfig usage with a deleteToken
implementation that calls loadOpenBaoConfig() and constructs new OpenBaoClient()
inside deleteToken(hostname) and calls deleteHostSecret(hostname, 'agent_token')
inside a try/catch, logging or ignoring errors so handleRemoveHost's deletion
remains best-effort; keep removeAgent using loadDockerClient and
AgentProvisioningService as-is.

In `@src/data/hosts/handlers.ts`:
- Around line 145-149: The host is persisted with agent_url: '' so later loads
miss the resolved provisionResult.agentUrl; update the creation and the
subsequent branch (the block referencing lines 151-159) to persist the resolved
agent URL from provisionResult by passing agent_url: provisionResult.agentUrl
(or calling the repo update method with { id: host.id, agent_url:
provisionResult.agentUrl }) so handleCheckHostHealth and handleUpdateAgent will
read the actual stored URL; locate the repo call via deps.repo.create and the
follow-up repo update calls to set the agent_url field to
provisionResult.agentUrl before returning success.
- Around line 87-90: The current try/catch around deps.deleteToken in
handlers.ts (see deleteToken usage near the add-host rollback and the removeHost
flow) only logs the error and swallows it, leaving stale OpenBao tokens; change
both catch blocks to propagate the error after logging (either rethrow the
caught error or return a rejected Promise) so the caller is aware of cleanup
failure and can fail the operation; ensure you update the error handling in the
add-host rollback path and the removeHost path (lines referencing deleteToken)
to not silently swallow errors and to use the same logging context when
rethrowing.

In `@src/data/proxmox/schemas.ts`:
- Around line 3-4: The schema getHistoricalProxmoxStatsSchema currently allows
zero, negative, and fractional values for the seconds field; update the seconds
validator to require a positive integer greater than zero (e.g., use an integer
+ positive constraint and keep the default 120) so invalid inputs are rejected
at the schema boundary before reaching the DB layer; target the seconds key in
getHistoricalProxmoxStatsSchema and replace its current
z.number().optional().default(120) with a z number validator that enforces
integer and positive constraints.

In `@src/data/settings/schemas.ts`:
- Around line 3-5: The updateSettingSchema currently allows any non-empty string
for key; replace that with a constrained schema using the canonical settings
keys (import the exported enum/array from src/lib/constants/settings-keys.ts and
use z.enum(...) or z.union(z.literal(...)) to validate against that set) so only
known setting keys are accepted; update the import in this file and change the
key line in updateSettingSchema to reference the enum/union and adjust any
dependent types/tests accordingly.

In `@src/data/stacks/schemas.ts`:
- Line 4: The stack identifier schemas currently use plain z.string().min(1)
(symbol: stackName) and must reject path traversal characters; update stackName
(and the other similar schema fields in this file) to add a restrictive regex
such as .regex(/^[a-zA-Z0-9._-]+$/) so that values disallow slashes and patterns
like ".." (i.e., change z.string().min(1) to z.string().min(1).regex(...)), and
ensure the error message conveys invalid characters for clearer validation
failures.
- Around line 13-15: The getDeployHistorySchema's limit currently uses
z.number(), allowing non-integer values; update the limit definition to enforce
integers by adding .int() (i.e., change the limit chain on
getDeployHistorySchema to use
z.number().int().min(1).max(100).optional().default(20)) so pagination/SQL LIMIT
receives only whole numbers.

In `@src/data/zfs/schemas.ts`:
- Around line 3-5: The schema getHistoricalZFSStatsSchema currently allows
seconds to be 0, negative, or fractional; tighten it at the schema boundary by
changing the seconds field to only accept positive integers and keep the default
of 60 (e.g., use z.number().int().positive().optional().default(60) or
equivalent with .min(1)); update getHistoricalZFSStatsSchema's seconds
definition so invalid windows are rejected before they reach the query logic.

In `@src/hooks/useSettings.tsx`:
- Around line 30-31: The expansion state is keyed by display-only stackName
which can collide across hosts; update the functions and any state map to use
host-prefixed entity IDs instead (e.g., server/stackId) so keys are unique:
change the signatures and callers of toggleStackExpanded and isStackExpanded to
accept the host-prefixed stack ID (or compute it inside these functions from
host + stackName) and replace all usages that index the expansion state (the
internal Map/Object used around the current toggleStackExpanded/isStackExpanded
implementation and the block around lines 137-143) to use that host-prefixed ID
for set/get operations and uniqueness checks.

In `@src/lib/stacks/stack-service.ts`:
- Line 42: The mapping layer dropped metadata hydration so
manifestEntryToSummary and manifestEntryToDetail end up returning icon: null,
making icons write-only; restore the icon in the read paths by looking up the
persisted icon slug (the same place updateStackIconSlug writes to) when building
summaries/details from manifest.stacks, or re-run the metadata hydration used
previously before calling manifestEntryToSummary/manifestEntryToDetail so both
functions receive and return the icon field; update the code that returns
Object.entries(manifest.stacks).map(...) to merge in the stored icon for each
stack name or rehydrate metadata so icons are preserved on reload.

---

Outside diff comments:
In `@src/data/hosts/__tests__/handlers.test.ts`:
- Around line 167-175: The test for handleAddHost currently only checks that
deps.repo.create was called, but because mockRepo.create returns a fixed mockRow
the test won't catch incorrect persisted fields; update the test to either make
the mock echo the create input (change mockRepo.create to return the payload it
was called with) or assert on the actual call payload by inspecting
deps.repo.create.mock.calls (or equivalent) to verify fields like name,
agent_url/socketProxyUrl, agent_port and token are set correctly when
handleAddHost is invoked; reference handleAddHost, deps.repo.create (or
mockRepo.create), mockRow and deps.storeToken when making these assertions so
the test will fail if incorrect host fields are persisted.

---

Duplicate comments:
In `@src/components/stacks/ComposeEditor.tsx`:
- Line 1: ComposeEditor initializes local state from props (content, variables)
once which leads to stale editor text when props like stackName or content
change; add a useEffect inside the ComposeEditor component that listens for
changes to props.stackName and props.content (and props.variables if present)
and calls the state setters (e.g., setContent, setVariables) to reset the local
editor state and any canonical/original values used to compute isDirty (the
isDirty check that compares current state to the original prop-derived value),
so switching stacks or after save updates immediately clear dirty state and keep
the editor in sync.

In `@src/data/hosts/functions.tsx`:
- Around line 37-43: The server functions are directly instantiating
OpenBaoClient (with loadOpenBaoConfig/ensureSecretsEngine),
AgentProvisioningService, and AgentUpdateService inside createServerFn handlers;
move these constructions out of the handler body and provide them via dependency
injection or middleware instead — either create and export a shared factory or
provider that constructs OpenBaoClient (calling loadOpenBaoConfig and
ensureSecretsEngine) and the AgentProvisioningService/AgentUpdateService once at
module scope, or accept them as injected dependencies into createServerFn so the
handlers receive already-constructed instances; update all handlers that call
new OpenBaoClient, new AgentProvisioningService, or new AgentUpdateService to
use the injected/provider instances and remove any direct "new" usage from
inside the server function bodies.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 5c8541e6-431f-45a5-b952-21887b6704e0

📥 Commits

Reviewing files that changed from the base of the PR and between 4af109e and 54a2d92.

📒 Files selected for processing (38)
  • bunfig.toml
  • src/components/docker/ContainerHistoryPage.tsx
  • src/components/docker/ContainerRow.tsx
  • src/components/docker/ContainerTable.tsx
  • src/components/stacks/ComposeEditor.tsx
  • src/components/stacks/StackDetail.tsx
  • src/data/__tests__/hosts.functions.test.ts
  • src/data/docker/__tests__/schemas.test.ts
  • src/data/docker/functions.tsx
  • src/data/docker/schemas.ts
  • src/data/hosts/__tests__/handlers.test.ts
  • src/data/hosts/__tests__/schemas.test.ts
  • src/data/hosts/functions.tsx
  • src/data/hosts/handlers.ts
  • src/data/hosts/schemas.ts
  • src/data/proxmox/__tests__/schemas.test.ts
  • src/data/proxmox/functions.tsx
  • src/data/proxmox/schemas.ts
  • src/data/settings/__tests__/schemas.test.ts
  • src/data/settings/functions.tsx
  • src/data/settings/schemas.ts
  • src/data/stacks/__tests__/functions.test.ts
  • src/data/stacks/__tests__/schemas.test.ts
  • src/data/stacks/functions.tsx
  • src/data/stacks/schemas.ts
  • src/data/zfs/__tests__/schemas.test.ts
  • src/data/zfs/functions.tsx
  • src/data/zfs/schemas.ts
  • src/hooks/__tests__/useSettings.test.ts
  • src/hooks/__tests__/useStackExpansion.test.ts
  • src/hooks/useSettings.tsx
  • src/lib/constants/preload-queries.ts
  • src/lib/mock/functions/hosts.functions.ts
  • src/lib/stacks/stack-service.ts
  • src/routes/docker.stacks.tsx
  • src/routes/docker.tsx
  • src/routes/proxmox.tsx
  • vite.config.ts
💤 Files with no reviewable changes (1)
  • src/data/tests/hosts.functions.test.ts

Comment thread bunfig.toml
Comment on lines +5 to +8
coveragePathIgnorePatterns = [
"src/data/*/functions.tsx",
"src/lib/server-functions/",
]
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Excluding entire files may hide testable code.

The pattern src/data/*/functions.tsx excludes entire files, but the context snippets show these files contain handler logic that could be tested with dependency injection, not just untestable framework wiring.

For comparison, openbao-server-functions.ts (context snippet 1) separates handler logic (lines 24-60, tested) from server function wiring (lines 68-107, excluded). However, stacks/functions.tsx and hosts/functions.tsx mix both in the same file, so excluding them hides testable code.

Consider extracting handler logic from functions.tsx files into separate modules (like handlers.ts) that can be tested, and only exclude the minimal wiring code from coverage.

♻️ Suggested refactoring approach

For each src/data/*/functions.tsx file:

  1. Extract handler logic to a separate handlers.ts module:
// src/data/stacks/handlers.ts
export async function handleListStacks(): Promise<StackSummary[]> {
  const { getStackSummaries } = await import('@/lib/stacks/stack-service');
  return getStackSummaries();
}

export async function handleGetStackDetail(data: GetStackDetailInput): Promise<StackDetail | null> {
  const { getStackDetailByName } = await import('@/lib/stacks/stack-service');
  return getStackDetailByName(data.stackName);
}
  1. Keep only the wiring in functions.tsx:
// src/data/stacks/functions.tsx
import { createServerFn } from '@tanstack/react-start';
import { handleListStacks, handleGetStackDetail } from './handlers';

export const listStacks = createServerFn()
  .handler(handleListStacks);

export const getStackDetail = createServerFn()
  .inputValidator(getStackDetailSchema)
  .handler(({ data }) => handleGetStackDetail(data));
  1. Test the handlers in __tests__/handlers.test.ts

This approach ensures business logic is tested while excluding only the framework integration.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@bunfig.toml` around lines 5 - 8, The coverage exclusion is hiding testable
handler logic inside src/data/*/functions.tsx; extract pure handler functions
into a new module (e.g., create src/data/<area>/handlers.ts) and move logic into
exported functions such as handleListStacks and handleGetStackDetail, then
update the original functions.tsx to only wire createServerFn handlers (e.g.,
listStacks, getStackDetail) that import and call those handlers; adjust tests to
target the new handlers module and narrow the coverage ignore to only the
minimal wiring code (the functions.tsx or specific wiring lines) rather than
excluding the entire file.

Comment thread src/components/stacks/StackDetail.tsx Outdated
Comment thread src/data/hosts/functions.tsx Outdated
Comment on lines +87 to +90
try {
await deps.deleteToken(host.name);
} catch (err) {
console.error(`[removeHost] Failed to delete OpenBao token for ${host.name}:`, err instanceof Error ? err.message : err);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Don't silently leave stale OpenBao tokens behind.

Both cleanup paths swallow deleteToken() errors. On add-host rollback the thrown message can still say the token was cleaned up, and on remove the caller gets no warning at all, so operators can be told cleanup succeeded while the credential is still present.

Also applies to: 177-194

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/data/hosts/handlers.ts` around lines 87 - 90, The current try/catch
around deps.deleteToken in handlers.ts (see deleteToken usage near the add-host
rollback and the removeHost flow) only logs the error and swallows it, leaving
stale OpenBao tokens; change both catch blocks to propagate the error after
logging (either rethrow the caught error or return a rejected Promise) so the
caller is aware of cleanup failure and can fail the operation; ensure you update
the error handling in the add-host rollback path and the removeHost path (lines
referencing deleteToken) to not silently swallow errors and to use the same
logging context when rethrowing.

Comment thread src/data/hosts/handlers.ts
Comment thread src/data/stacks/schemas.ts Outdated
Comment thread src/data/stacks/schemas.ts Outdated
Comment thread src/data/zfs/schemas.ts Outdated
Comment thread src/hooks/useSettings.tsx Outdated
}

return summaries;
return Object.entries(manifest.stacks).map(([name, entry]) => manifestEntryToSummary(name, entry));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

This makes stack icons write-only.

manifestEntryToSummary() and manifestEntryToDetail() both return icon: null (src/lib/stacks/stack-mappers.ts:12-38). After dropping the metadata hydration here, both read paths now lose the icon that updateStackIconSlug() still persists below, so stack icons disappear after a reload unless you restore the lookup or move icon storage into the manifest.

Also applies to: 68-68

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lib/stacks/stack-service.ts` at line 42, The mapping layer dropped
metadata hydration so manifestEntryToSummary and manifestEntryToDetail end up
returning icon: null, making icons write-only; restore the icon in the read
paths by looking up the persisted icon slug (the same place updateStackIconSlug
writes to) when building summaries/details from manifest.stacks, or re-run the
metadata hydration used previously before calling
manifestEntryToSummary/manifestEntryToDetail so both functions receive and
return the icon field; update the code that returns
Object.entries(manifest.stacks).map(...) to merge in the stored icon for each
stack name or rehydrate metadata so icons are preserved on reload.

jaredglaser and others added 3 commits March 22, 2026 15:09
… improvements

- Add .int().positive() to proxmox and zfs seconds schemas
- Add path traversal protection to stack name fields via regex
- Add .int() to deploy history limit schema
- Assert create payload fields in handleAddHost test
- Add test coverage for new validation constraints

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/worker/collector.ts (1)

23-23: 🧹 Nitpick | 🔵 Trivial

Use console.info for operational messages instead of console.log.

Per coding guidelines, console.log should be used sparingly for temporary debugging only and not committed. Operational messages (startup, shutdown, configuration) should use console.info. Multiple lines in this file use console.log for operational output.

♻️ Proposed fix to use console.info
 async function main() {
-  console.log('[Worker] Starting homelab-manager background collector');
+  console.info('[Worker] Starting homelab-manager background collector');
   ...
     if (!workerConfig.enabled) {
-      console.log('[Worker] Worker disabled via WORKER_ENABLED=false, exiting');
+      console.info('[Worker] Worker disabled via WORKER_ENABLED=false, exiting');
       process.exit(0);
     }

-    console.log('[Worker] Configuration loaded:', {
+    console.info('[Worker] Configuration loaded:', {
       ...
     });

-    console.log('[Worker] Connecting to PostgreSQL...');
+    console.info('[Worker] Connecting to PostgreSQL...');
     ...
-    console.log('[Worker] Running database migrations...');
+    console.info('[Worker] Running database migrations...');
     ...
-      console.log(`[Worker] Using update interval from database: ${resolvedInterval}ms`);
+      console.info(`[Worker] Using update interval from database: ${resolvedInterval}ms`);
     } else {
-      console.log(`[Worker] Using update interval from config: ${workerConfig.collection.interval}ms`);
+      console.info(`[Worker] Using update interval from config: ${workerConfig.collection.interval}ms`);
     }
     ...
     const shutdown = () => {
-      console.log('[Worker] Shutdown signal received, aborting collectors...');
+      console.info('[Worker] Shutdown signal received, aborting collectors...');
       ...
     };
     ...
       if (runners.length === 0) {
-        console.log('[Worker] No collectors enabled, exiting');
+        console.info('[Worker] No collectors enabled, exiting');
         process.exit(0);
       }
       ...
-      console.log(`[Worker] ${runners.length} collector(s) started, running...`);
+      console.info(`[Worker] ${runners.length} collector(s) started, running...`);
       ...
-    console.log('[Worker] Closing connections...');
+    console.info('[Worker] Closing connections...');
     ...
-    console.log('[Worker] Shutdown complete');
+    console.info('[Worker] Shutdown complete');

As per coding guidelines: "Use console.error for actual errors, console.info for operational messages (startup, shutdown), and console.log sparingly for temporary debugging only (do not commit)."

Also applies to: 30-30, 34-34, 42-42, 45-45, 53-55, 72-72, 112-112, 149-149, 154-154, 162-162

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/worker/collector.ts` at line 23, Replace operational console.log calls
with console.info: change the console.log that prints "[Worker] Starting
homelab-manager background collector" and all other operational messages
identified (lines noted in the review: 30, 34, 42, 45, 53-55, 72, 112, 149, 154,
162) to use console.info instead; leave console.error for error paths and keep
any true debug-only console.log usages only if they are temporary. Locate and
update each occurrence of console.log in src/worker/collector.ts (search for the
exact message strings like "[Worker] Starting homelab-manager background
collector" to find them) and replace with console.info to conform to the logging
guidelines.
♻️ Duplicate comments (7)
src/routes/api/git.$.ts (1)

38-38: ⚠️ Potential issue | 🟠 Major

Use 401 challenge for malformed Basic auth, not 400.

Line 38 should return an auth challenge (401 + WWW-Authenticate) so malformed Basic credentials are treated as authentication failure consistently with the rest of this handler and Git client expectations.

Proposed fix
-      return new Response('Malformed Basic auth encoding', { status: 400 });
+      return new Response('Unauthorized', {
+        status: 401,
+        headers: { 'WWW-Authenticate': 'Basic realm="git"' },
+      });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/routes/api/git`.$.ts at line 38, Replace the 400 response on the
"Malformed Basic auth encoding" branch with a 401 authentication challenge:
return a Response with status 401 and include a WWW-Authenticate header for
Basic (e.g. 'WWW-Authenticate': 'Basic realm="Git", charset="UTF-8"') so this
malformed credential case is treated as an auth failure consistent with the rest
of the auth handling in the git handler (locate the return that currently
produces 'Malformed Basic auth encoding').
docker-compose.agent.yml (1)

52-52: ⚠️ Potential issue | 🟠 Major

Avoid predictable fallback credentials for AGENT_TOKEN.

Line 52 sets a known default token (dev-agent-token), which weakens secure defaults and can be unintentionally reused outside strictly local scenarios. Prefer fail-fast token injection instead of a hardcoded fallback.

🔐 Suggested change
-      AGENT_TOKEN: "${AGENT_TOKEN:-dev-agent-token}"
+      AGENT_TOKEN: "${AGENT_TOKEN:?AGENT_TOKEN must be set}"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docker-compose.agent.yml` at line 52, The docker-compose service currently
provides a predictable fallback AGENT_TOKEN value ("dev-agent-token") which
weakens security; remove the hardcoded default in the AGENT_TOKEN environment
entry in docker-compose.agent.yml so the service requires an injected
AGENT_TOKEN (i.e., use the variable without the ":-dev-agent-token" fallback)
and add a fail-fast check in the container startup logic (entrypoint or
application bootstrap) to exit with a clear error if AGENT_TOKEN is missing;
reference the AGENT_TOKEN environment variable and the container
entrypoint/bootstrap code to implement the runtime validation.
src/data/stacks/__tests__/schemas.test.ts (1)

23-26: ⚠️ Potential issue | 🟠 Major

Add a regression test for stackName: '..' traversal segment.

Current traversal tests only cover values with /. .. alone should also be rejected to prevent parent-directory resolution bugs.

✅ Suggested test addition
   it('rejects stackName with slashes (path traversal)', () => {
     expect(() => getStackDetailSchema.parse({ stackName: '../etc' })).toThrow();
     expect(() => getStackDetailSchema.parse({ stackName: 'foo/bar' })).toThrow();
+    expect(() => getStackDetailSchema.parse({ stackName: '..' })).toThrow();
   });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/data/stacks/__tests__/schemas.test.ts` around lines 23 - 26, The
traversal tests for getStackDetailSchema currently only assert failures for
values containing slashes; add a regression case to also reject a bare '..'
segment so parent-directory resolution is blocked. Update the test in
src/data/stacks/__tests__/schemas.test.ts to call getStackDetailSchema.parse({
stackName: '..' }) and expect it to throw (similar to the existing foo/bar and
../etc assertions) to ensure stackName '..' is rejected.
src/data/stacks/schemas.ts (1)

4-4: ⚠️ Potential issue | 🔴 Critical

stackNameField still permits .., enabling traversal-segment inputs.

Blocking / is not sufficient. .. can still traverse when used in path joins. Reject dot-segments explicitly.

🔒 Suggested hardening
-const stackNameField = z.string().min(1).regex(/^[a-zA-Z0-9._-]+$/, 'Stack name must be alphanumeric (hyphens, underscores, dots allowed)');
+const stackNameField = z
+  .string()
+  .min(1)
+  .regex(/^(?!\.{1,2}$)(?!.*\.\.)[a-zA-Z0-9._-]+$/, 'Stack name contains invalid path segments');
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/data/stacks/schemas.ts` at line 4, The stackNameField regex currently
allows consecutive dots, which permits traversal segments like ".."; update the
schema for stackNameField to explicitly reject dot-segments by either tightening
the regex or adding a refinement: keep the existing /^[a-zA-Z0-9._-]+$/ check
but add a .refine(...) that fails if value.includes("..") or value === "." ||
value === ".." (and keep rejecting '/' if needed), and return a clear error
message such as "Stack name must not contain dot-segments (e.g. '..')".
src/hooks/useSettings.tsx (1)

31-32: ⚠️ Potential issue | 🟠 Major

Use host-prefixed stack entity IDs for expansion state keys.

Keying expansion by stackName will collide for same-named stacks on different hosts, causing cross-host toggle bleed. Use a host-prefixed stackEntityId (e.g., host/stack) in the interface and in expandedStacks set operations.

Suggested fix
-  toggleStackExpanded: (stackName: string) => void;
-  isStackExpanded: (stackName: string) => boolean;
+  toggleStackExpanded: (stackEntityId: string) => void;
+  isStackExpanded: (stackEntityId: string) => boolean;
@@
-  const toggleStackExpanded = useCallback((stackName: string) => {
-    optimisticSet(SETTINGS_KEYS.stacks.expandedStacks, prev => toggleInSet(prev, stackName));
+  const toggleStackExpanded = useCallback((stackEntityId: string) => {
+    optimisticSet(SETTINGS_KEYS.stacks.expandedStacks, prev => toggleInSet(prev, stackEntityId));
   }, [optimisticSet]);
@@
-    (stackName: string): boolean => {
-      return settings.stacks.expandedStacks.has(stackName);
+    (stackEntityId: string): boolean => {
+      return settings.stacks.expandedStacks.has(stackEntityId);
     },
As per coding guidelines, "Always use entity IDs with host prefix (e.g., `server1/tank`, `192.168.1.10/abc123`) for state keys and uniqueness checks. Never use display names - they collide across hosts."

Also applies to: 138-145

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/hooks/useSettings.tsx` around lines 31 - 32, The expansion state
currently keys by display name and will collide across hosts; change the
interface methods toggleStackExpanded(stackName: string) and
isStackExpanded(stackName: string) to use a host-prefixed stackEntityId (e.g.,
"host/stack") instead, update the expandedStacks set operations to
add/remove/check using that stackEntityId, and update any callers to pass the
host-prefixed entity ID; specifically modify the functions toggleStackExpanded
and isStackExpanded and the places where expandedStacks is mutated/queried so
they operate on stackEntityId (host/stack) rather than stackName to guarantee
global uniqueness.
src/data/hosts/functions.tsx (1)

39-43: 🛠️ Refactor suggestion | 🟠 Major

Keep client/service construction out of the server fn bodies.

addHost, removeHost, and updateAgent still new up OpenBaoClient, AgentProvisioningService, and AgentUpdateService inside the createServerFn() handlers. That bypasses the repo's DI rule and keeps these paths harder to stub and evolve.

As per coding guidelines, "All server logic via createServerFn() + middleware injection. Never create clients directly in server functions."

Also applies to: 60-61, 66-69, 84-85

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/data/hosts/functions.tsx` around lines 39 - 43, The handlers addHost,
removeHost, and updateAgent are constructing OpenBaoClient,
AgentProvisioningService, and AgentUpdateService inside createServerFn handlers;
move those constructions out of the server function bodies and inject them
instead (e.g., create module-level factories/singletons or a provider object and
pass them into createServerFn or middleware), replace direct new
OpenBaoClient(...), new AgentProvisioningService(), and new AgentUpdateService()
in the handlers with references to the injected instances, and update the
handler signatures to accept the injected client/service so the code follows the
repo DI rule and becomes testable/stubbable.
src/data/hosts/handlers.ts (1)

88-92: ⚠️ Potential issue | 🟠 Major

Surface token-cleanup failures to the caller.

Both deleteToken() catches swallow the error. On remove, callers get full success even if the OpenBao secret is still live; on add rollback, the thrown message can still claim the token was cleaned up when it wasn't.

Also applies to: 185-201

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/data/hosts/handlers.ts` around lines 88 - 92, The catch around
deps.deleteToken in removeHost (and the corresponding catch in the add/rollback
path) currently swallows errors; change these catches to propagate the failure
to the caller instead of just console.error by rethrowing (or throwing a new
Error with context) after logging so callers see token-cleanup failures; update
the catch blocks around deps.deleteToken to log the error and then throw
(preserving the original error message) rather than returning/suppressing it.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@agent/src/routes/stats.ts`:
- Around line 201-239: The code can open or reconcile container streams after
the client has disconnected because late-resolving docker.listContainers() and
container.stats() promises are not checked; update the stats flow and
openContainerStream() to bail early when ctx.closed: after any await of
docker.listContainers() (e.g., the initial list and the periodic refresh into
variable containers/current) check ctx.closed and skip calling
openContainerStream() or reconcileContainers() if true, and inside
openContainerStream()’s container.stats({stream:true}).then(...) handler
immediately destroy the returned readable (call readable.destroy() or
readable.destroy?.()) and return if ctx.closed before registering it in
ctx.containerStreams; this ensures no new streams are added after
destroyAllStreams() runs.

In `@src/components/stacks/ComposeEditor.tsx`:
- Around line 17-26: parseVariables currently only recognizes ${VAR} and
${VAR:-default} but extractVariableNames supports all Docker Compose operator
forms (e.g. ${VAR-default}, ${VAR:?err}, ${VAR?err}, ${VAR:+alt}, ${VAR+alt}),
causing mismatch when editing content; update the regex in parseVariables to
mirror extractVariableNames' pattern so it captures all operator variants, and
update the JSDoc comment above parseVariables to state it supports full Docker
Compose variable/operator syntax; reference the parseVariables function and
align its regex with the pattern used in extractVariableNames.

In `@src/data/hosts/handlers.ts`:
- Around line 166-167: The code calls deps.repo.updateAgentUrl(host.id,
provisionResult.agentUrl) but continues using the stale host object (with empty
agent_url) when calling toHostListItem(host, …), and if updateAgentUrl throws it
exits without rolling back; fix by assigning the result of updateAgentUrl to a
new/overwritten variable (e.g., host = await deps.repo.updateAgentUrl(...)) and
pass that updated host into toHostListItem, and wrap updateAgentUrl in a
try/catch so failures trigger the existing rollback/cleanup path (stop/remove
the agent container and delete created host) before rethrowing; apply the same
change to the other identical call site that uses updateAgentUrl followed by
toHostListItem.

In `@src/data/settings/__tests__/schemas.test.ts`:
- Around line 11-25: Replace the hardcoded knownKeys list with the canonical
SETTINGS_KEYS constant so the test stays in sync; import SETTINGS_KEYS (the
module that exports the settings map/enum) into the test and iterate over its
keys (e.g., Object.keys(SETTINGS_KEYS) or appropriate accessor for the exported
shape) and assert updateSettingSchema.parse({ key, value: 'x' }) does not throw
for each key; keep the existing assertion and test name unchanged.

In `@src/hooks/useSettings.tsx`:
- Around line 14-15: The file mixes "@/..." and relative imports; update this
file to use only absolute "@/..." imports per repo rules: replace any "./..." or
"../..." imports in src/hooks/useSettings.tsx with their corresponding "@/..."
paths (e.g., ensure imports for updateSetting and SettingsKey remain or become
"@/data/settings/functions" and "@/data/settings/schemas" and convert any other
local imports in this file to the matching "@/..." module paths), keeping
existing exported symbol names (like updateSetting and SettingsKey) unchanged.

In `@src/lib/clients/proxmox-client.ts`:
- Around line 179-186: The code currently scans allVMs and allContainers twice
to compute running and stopped counts; replace those duplicate filters by a
single pass per collection (e.g., use Array.prototype.reduce over allVMs to
produce runningVMs and stoppedVMs in one traversal and similarly reduce
allContainers to produce runningContainers and stoppedContainers) to avoid extra
allocations and improve performance; update the logic that assigns runningVMs,
stoppedVMs, runningContainers, stoppedContainers to be derived from the
respective reduce results while preserving the same status check (vm.status ===
'running' / ct.status === 'running').

In `@src/lib/hosts/host-utils.ts`:
- Line 82: In retryHealthCheck (in src/lib/hosts/host-utils.ts) change the log
level for expected retry attempts from console.error to console.info: replace
the console.error call that prints `[retryHealthCheck] Attempt ${i +
1}/${delays.length} failed for ${agentUrl}: ${result.error}` with console.info
so operational backoff messages are info-level; keep the same message content
(including agentUrl, attempt index i and delays.length and result.error) and
formatting so diagnostics remain intact.

---

Outside diff comments:
In `@src/worker/collector.ts`:
- Line 23: Replace operational console.log calls with console.info: change the
console.log that prints "[Worker] Starting homelab-manager background collector"
and all other operational messages identified (lines noted in the review: 30,
34, 42, 45, 53-55, 72, 112, 149, 154, 162) to use console.info instead; leave
console.error for error paths and keep any true debug-only console.log usages
only if they are temporary. Locate and update each occurrence of console.log in
src/worker/collector.ts (search for the exact message strings like "[Worker]
Starting homelab-manager background collector" to find them) and replace with
console.info to conform to the logging guidelines.

---

Duplicate comments:
In `@docker-compose.agent.yml`:
- Line 52: The docker-compose service currently provides a predictable fallback
AGENT_TOKEN value ("dev-agent-token") which weakens security; remove the
hardcoded default in the AGENT_TOKEN environment entry in
docker-compose.agent.yml so the service requires an injected AGENT_TOKEN (i.e.,
use the variable without the ":-dev-agent-token" fallback) and add a fail-fast
check in the container startup logic (entrypoint or application bootstrap) to
exit with a clear error if AGENT_TOKEN is missing; reference the AGENT_TOKEN
environment variable and the container entrypoint/bootstrap code to implement
the runtime validation.

In `@src/data/hosts/functions.tsx`:
- Around line 39-43: The handlers addHost, removeHost, and updateAgent are
constructing OpenBaoClient, AgentProvisioningService, and AgentUpdateService
inside createServerFn handlers; move those constructions out of the server
function bodies and inject them instead (e.g., create module-level
factories/singletons or a provider object and pass them into createServerFn or
middleware), replace direct new OpenBaoClient(...), new
AgentProvisioningService(), and new AgentUpdateService() in the handlers with
references to the injected instances, and update the handler signatures to
accept the injected client/service so the code follows the repo DI rule and
becomes testable/stubbable.

In `@src/data/hosts/handlers.ts`:
- Around line 88-92: The catch around deps.deleteToken in removeHost (and the
corresponding catch in the add/rollback path) currently swallows errors; change
these catches to propagate the failure to the caller instead of just
console.error by rethrowing (or throwing a new Error with context) after logging
so callers see token-cleanup failures; update the catch blocks around
deps.deleteToken to log the error and then throw (preserving the original error
message) rather than returning/suppressing it.

In `@src/data/stacks/__tests__/schemas.test.ts`:
- Around line 23-26: The traversal tests for getStackDetailSchema currently only
assert failures for values containing slashes; add a regression case to also
reject a bare '..' segment so parent-directory resolution is blocked. Update the
test in src/data/stacks/__tests__/schemas.test.ts to call
getStackDetailSchema.parse({ stackName: '..' }) and expect it to throw (similar
to the existing foo/bar and ../etc assertions) to ensure stackName '..' is
rejected.

In `@src/data/stacks/schemas.ts`:
- Line 4: The stackNameField regex currently allows consecutive dots, which
permits traversal segments like ".."; update the schema for stackNameField to
explicitly reject dot-segments by either tightening the regex or adding a
refinement: keep the existing /^[a-zA-Z0-9._-]+$/ check but add a .refine(...)
that fails if value.includes("..") or value === "." || value === ".." (and keep
rejecting '/' if needed), and return a clear error message such as "Stack name
must not contain dot-segments (e.g. '..')".

In `@src/hooks/useSettings.tsx`:
- Around line 31-32: The expansion state currently keys by display name and will
collide across hosts; change the interface methods
toggleStackExpanded(stackName: string) and isStackExpanded(stackName: string) to
use a host-prefixed stackEntityId (e.g., "host/stack") instead, update the
expandedStacks set operations to add/remove/check using that stackEntityId, and
update any callers to pass the host-prefixed entity ID; specifically modify the
functions toggleStackExpanded and isStackExpanded and the places where
expandedStacks is mutated/queried so they operate on stackEntityId (host/stack)
rather than stackName to guarantee global uniqueness.

In `@src/routes/api/git`.$.ts:
- Line 38: Replace the 400 response on the "Malformed Basic auth encoding"
branch with a 401 authentication challenge: return a Response with status 401
and include a WWW-Authenticate header for Basic (e.g. 'WWW-Authenticate': 'Basic
realm="Git", charset="UTF-8"') so this malformed credential case is treated as
an auth failure consistent with the rest of the auth handling in the git handler
(locate the return that currently produces 'Malformed Basic auth encoding').

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 2a378de6-5604-457f-8810-fad5d1a61894

📥 Commits

Reviewing files that changed from the base of the PR and between 54a2d92 and 756274f.

📒 Files selected for processing (34)
  • agent/src/__tests__/stats.test.ts
  • agent/src/routes/stats.ts
  • docker-compose.agent.yml
  • migrations/011_managed_hosts_agent_token.sql
  • migrations/013_drop_agent_token_columns.sql
  • migrations/013_managed_hosts_agent_token.sql
  • src/components/stacks/ComposeEditor.tsx
  • src/components/stacks/ComposeEditorLoader.tsx
  • src/components/stacks/StackDetail.tsx
  • src/components/stacks/StackRow.tsx
  • src/components/stacks/StacksTable.tsx
  • src/components/stacks/__tests__/ComposeEditor.test.tsx
  • src/data/hosts/__tests__/handlers.test.ts
  • src/data/hosts/functions.tsx
  • src/data/hosts/handlers.ts
  • src/data/proxmox/__tests__/schemas.test.ts
  • src/data/proxmox/schemas.ts
  • src/data/settings/__tests__/schemas.test.ts
  • src/data/settings/schemas.ts
  • src/data/stacks/__tests__/schemas.test.ts
  • src/data/stacks/schemas.ts
  • src/data/zfs/__tests__/schemas.test.ts
  • src/data/zfs/schemas.ts
  • src/hooks/useSettings.tsx
  • src/lib/clients/__tests__/proxmox-client.test.ts
  • src/lib/clients/proxmox-client.ts
  • src/lib/database/repositories/__tests__/host-repository.test.ts
  • src/lib/database/repositories/host-repository.ts
  • src/lib/hosts/host-utils.ts
  • src/lib/stacks/stack-mappers.ts
  • src/lib/stacks/stack-service.ts
  • src/routes/api/git.$.ts
  • src/worker/collector-factory.ts
  • src/worker/collector.ts
💤 Files with no reviewable changes (2)
  • migrations/013_managed_hosts_agent_token.sql
  • migrations/011_managed_hosts_agent_token.sql

Comment thread agent/src/routes/stats.ts Outdated
Comment thread src/components/stacks/ComposeEditor.tsx Outdated
Comment thread src/data/hosts/handlers.ts Outdated
Comment thread src/data/settings/__tests__/schemas.test.ts
Comment thread src/hooks/useSettings.tsx
Comment thread src/lib/clients/proxmox-client.ts Outdated
Comment thread src/lib/hosts/host-utils.ts Outdated
jaredglaser and others added 2 commits March 22, 2026 20:56
… alignment

Address all 12 issues from comprehensive PR review:

- Align stack name pattern with SAFE_PATH_SEGMENT_PATTERN (no dots)
- Restore error handling regressions in handleAddHost/handleUpdateAgent
- Convert HostOperationResult to discriminated union
- Add fromMs/toMs constraints and compose content validation
- Fix worker silent degradation when OpenBao init fails
- Add logging to empty rollback catch blocks
- Use console.info for intermediate retry failures
- Filter non-string keys in listSecrets
- Add 9 new tests covering error paths and edge cases

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- agent/stats: add ctx.closed checks after listContainers() awaits and
  before registering streams in openContainerStream to prevent post-disconnect
  resource leaks
- ComposeEditor: align parseVariables regex with extractVariableNames to
  support all Docker Compose variable operator forms
- handlers: fix stale host.agent_url passed to toHostListItem after
  updateAgentUrl, extract error helpers for DRY rollback logic
- settings test: derive known keys from SETTINGS_KEYS instead of hardcoded list
- useSettings: fix relative imports to use @/ paths, rename stackName params
  to stackEntityId to match actual usage (callers pass host/name entity IDs)
- proxmox-client: replace 4 filter().length calls with 2 reduce() + subtraction
- collector: change operational console.log to console.info per logging rules
- docker-compose.agent: remove predictable dev-agent-token fallback, require
  AGENT_TOKEN via :? syntax (agent already fails fast if missing)
- stacks schemas test: add '..' path traversal regression case
- git handler: return 401 + WWW-Authenticate for malformed Basic auth encoding

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@jaredglaser jaredglaser merged commit ce4418b into main Mar 23, 2026
1 of 2 checks passed
@jaredglaser jaredglaser deleted the stack/01-openbao-token-migration branch March 23, 2026 01:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

stack: 1/8 Review first

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant