Skip to content

nodeagent: Phase 2 — Zboot/RebootStore/path seams + tests#5915

Draft
eriknordmark wants to merge 4 commits intolf-edge:masterfrom
eriknordmark:nodeagent-unit-tests-phase2
Draft

nodeagent: Phase 2 — Zboot/RebootStore/path seams + tests#5915
eriknordmark wants to merge 4 commits intolf-edge:masterfrom
eriknordmark:nodeagent-unit-tests-phase2

Conversation

@eriknordmark
Copy link
Copy Markdown
Contributor

Description

Phase 2 of the nodeagent unit-test work. Stacks on #5914 (Phase 1) and is best reviewed after it. Brings package coverage from 48.7% (end of Phase 1) to 70.3% of statements.

The PR is split into two commits so the refactor can be reviewed independently from the tests:

  1. nodeagent: add Zboot/RebootStore/path seams — refactor only, no behaviour change. Three small interfaces let the goroutine-spawned reboot path, the boot-reason reconstruction, and the file-touching helpers be driven from unit tests without invoking /sbin/zboot, persisting under /persist, or calling os.Exit from the test binary.

    • Zboot — wraps the subset of pkg/pillar/zboot (current/other partition, valid labels, Reset, Poweroff). realZboot adapts the package; nodeagentContext.zboot defaults to it.
    • RebootStore — wraps pkg/pillar/agentlog reboot/boot-reason persistence.
    • pathConfig — centralises the on-disk paths nodeagent reads or writes (firstboot file, installer log + send-require, restart counter, fault-injection file, SMART current/previous). Reuses existing package constants where they exist.
    • Plus two adjacent fixes:
      • handleNodeOperation now uses ctxPtr.minRebootDelay rather than the package constant — finally honours the existing comment "Some constants.. Declared here as variables to enable unit tests".
      • newNodeagentContext now uses its logArg parameter (was _).
    • mocks_test.go is extended with mockZboot and mockRebootStore so existing Phase-1 tests keep building.
  2. nodeagent: add Phase-2 unit tests — uses the seams. No production change.

    • nodeoperation_test.gohandleNodeOperation reboot/poweroff/shutdown plus waitForAllDomainsHalted's empty and bounded-wait branches.
    • handlelastreboot_test.gohandleLastRebootReason wrapper (stored-reason, first-boot, stack-tagging, restart-counter).
    • installlog_test.gohandleInstallationLog with/without the send-require marker.
    • wrappers_test.go — smoke tests for the trivial Create/Modify/Delete dispatch wrappers across all six pubsub topics + the GlobalConfig non-global-key fast path.
    • Folded into existing files: incrementRestartCounter and parseSMARTData wrapper tests, publishZbootConfigAll, publishZbootConfig invalid-label rejection, getZbootConfigAll empty/populated, handleDeviceTimers no-op baseline.

Coverage delta

Source Before this PR After
go test -count=1 -cover ./cmd/nodeagent/... 48.7% 70.3%

Remaining 0% targets are now exclusively things best covered end-to-end by Eden (Phase 3 of the plan):

  • Run() lifecycle / pubsub wiring, newNodeagentContext, initNodeDrainPubSub — agent boot.
  • The real adapter methods (realZboot.X, realRebootStore.X) — already covered by pkg/pillar/zboot and pkg/pillar/agentlog tests, plus Eden e2e.

Why draft

Stacks on #5914. Will mark ready for review once that PR has had a pass.

How to test and validate this PR

cd pkg/pillar
go test -count=1 -race -cover ./cmd/nodeagent/...
go test -count=1 -coverprofile=/tmp/n.cov ./cmd/nodeagent/...
go tool cover -func=/tmp/n.cov | grep total

Expected: all tests pass, coverage ≈ 70.3%.

Changelog notes

No user-facing changes.

PR Backports

  • 16.0-stable: No, test-only addition.
  • 14.5-stable: No, test-only addition.
  • 13.4-stable: No, test-only addition.

Checklist

  • I've provided a proper description
  • I've added the proper documentation
  • I've tested my PR on amd64 device — N/A, host-side unit tests
  • I've tested my PR on arm64 device — N/A, host-side unit tests
  • I've written the test verification instructions
  • I've set the proper labels to this PR

eriknordmark and others added 4 commits May 7, 2026 19:34
Lifts four pieces of logic out of nodeagent.go and handletimers.go
without changing any behaviour, so the upcoming unit tests can drive
them directly:

- incrementRestartCounterIn(path) — file-path-parameterised core of
  incrementRestartCounter.
- parseSMARTDataFiles(currPath, prevPath, curr, prev) — likewise for
  parseSMARTData.
- synthesizeRebootReason(...) and truncateRebootStack(...) — pure
  functions extracted from handleLastRebootReason.
- nodeagentContext.startNodeOperation — function-pointer seam
  defaulting to "go handleNodeOperation"; scheduleNodeOperation now
  calls it instead of inlining the goroutine spawn, so tests can
  observe the scheduled op without actually launching the reboot
  path that would call zboot.Reset/Poweroff and os.Exit.

Signed-off-by: eriknordmark <erik@zededa.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds unit-test coverage for nodeagent's pure logic and per-handler
state transitions. No production behaviour change. The previous
commit added the small refactors that make these tests possible.

Coverage of the package goes from 0% to 48.7% of statements
(go test -count=1 -cover). The remaining uncovered code is the
Run() wiring, the goroutine-driven handleNodeOperation, the
installer-log replay, and the thin Create/Modify/Delete pubsub
wrappers — all expected at this phase.

Files added (one per logical concern):

- helpers_test.go            test logger init
- mocks_test.go              mockPubSub (Publication+Subscription)
                             and newTestCtx fixture
- restartcounter_test.go     §3.1 absent / existing / garbage
- smart_test.go              §3.2 both / one missing / malformed
- maintmode_test.go          §3.3 add+remove, idempotent, multi
- rebootreason_test.go       §3.4 first boot, power-fail, kernel,
                             unknown, stored-wins, stack truncation
- handletimers_test.go       §3.5 fallback / reset / vault-locked /
                             upgrade-test-validation timers,
                             scheduleNodeOperation idempotency
- zedagentstatus_test.go     §3.6 ConfigGetStatus transitions,
                             handleDeviceCmd reboot/shutdown/poweroff,
                             handleZedAgentStatusImpl, certs-refused
- zbootstatus_test.go        §3.7 other-updating reboot, current→active
                             commit, doZbootBaseOsTestValidationComplete,
                             initiateBaseOsControllerTestComplete
- handlers_test.go           §3.8 volume / TPM / vault / kube
                             node-drain / allDomainsHalted

Signed-off-by: eriknordmark <erik@zededa.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Three small interfaces let the goroutine-spawned reboot path, the
boot-reason reconstruction, and the file-touching helpers be driven
from unit tests without actually invoking /sbin/zboot, persisting
files under /persist, or os.Exit-ing the test binary.

- Zboot wraps the subset of pkg/pillar/zboot used by nodeagent
  (EveCurrentPartition, IsCurrentPartitionStateInProgress,
  IsValidPartitionLabel, GetValidPartitionLabels, GetOtherPartition,
  IsOtherPartitionStateUpdating, Reset, Poweroff). realZboot adapts
  the package; nodeagentContext.zboot is defaulted to it.
- RebootStore wraps the subset of pkg/pillar/agentlog used by
  nodeagent for reboot-reason persistence. realRebootStore adapts
  the package.
- pathConfig holds the on-disk paths nodeagent reads or writes
  (firstboot file, installer log + send-require, restart counter,
  fault injection, SMART current/previous). defaultPathConfig
  reuses the existing package constants where they exist.

The production code is updated to dispatch through ctxPtr.zboot,
ctxPtr.rebootStore and ctxPtr.paths instead of calling the zboot /
agentlog packages directly. Two adjacent fixes pulled in:

- handleNodeOperation now uses ctxPtr.minRebootDelay rather than
  the package constant, finally honouring the comment "Some
  constants.. Declared here as variables to enable unit tests".
- newNodeagentContext receives logArg (was an unused param) and
  threads it into realRebootStore.

Test fixture (mocks_test.go) is extended with mockZboot and
mockRebootStore so existing Phase-1 tests keep building.

No production behaviour change.

Signed-off-by: eriknordmark <erik@zededa.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Builds on the seams added in the previous commit to bring nodeagent's
package coverage from 48.7% to 70.3%. No production change.

New test files:

- nodeoperation_test.go: handleNodeOperation reboot / poweroff /
  shutdown paths, plus waitForAllDomainsHalted's empty/bounded-wait
  branches.
- handlelastreboot_test.go: handleLastRebootReason wrapper for the
  stored-reason, first-boot, stack-tagging, and restart-counter
  branches (uses the RebootStore + path seams).
- installlog_test.go: handleInstallationLog with and without the
  send-require marker (uses agentbase.Init for ctx.Logger()).
- wrappers_test.go: smoke tests for the trivial Create / Modify /
  Delete dispatch wrappers across ZedAgentStatus, ZbootStatus,
  VaultStatus, VolumeMgrStatus, TpmSanityStatus, NodeDrainStatus,
  and the GlobalConfig non-global key fast path.

New tests folded into existing files:

- restartcounter_test.go: incrementRestartCounter wrapper using
  ctx.paths.restartCounterFile.
- smart_test.go: parseSMARTData wrapper using ctx.paths.smartCurrent
  / smartPrevious.
- zbootstatus_test.go: publishZbootConfigAll, publishZbootConfig
  rejecting an invalid label, getZbootConfigAll empty + populated,
  handleDeviceTimers as a no-op baseline.

Signed-off-by: eriknordmark <erik@zededa.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@codecov
Copy link
Copy Markdown

codecov Bot commented May 7, 2026

Codecov Report

❌ Patch coverage is 70.68966% with 34 lines in your changes missing coverage. Please review.
✅ Project coverage is 22.74%. Comparing base (63fa63d) to head (49e35ec).
⚠️ Report is 6 commits behind head on master.

Files with missing lines Patch % Lines
pkg/pillar/cmd/nodeagent/interfaces.go 32.14% 19 Missing ⚠️
pkg/pillar/cmd/nodeagent/nodeagent.go 83.54% 12 Missing and 1 partial ⚠️
pkg/pillar/cmd/nodeagent/handlezboot.go 50.00% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5915      +/-   ##
==========================================
+ Coverage   21.62%   22.74%   +1.11%     
==========================================
  Files         464      475      +11     
  Lines       83994    85741    +1747     
==========================================
+ Hits        18166    19502    +1336     
- Misses      64300    64507     +207     
- Partials     1528     1732     +204     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant