
build.yml: fix eve job cache handling #5665

Open
europaul wants to merge 1 commit into lf-edge:master from europaul:fix-ci-build-cache

Conversation


@europaul europaul commented Mar 10, 2026

Description

The eve job was rebuilding arm64 packages from scratch instead of using the ones already built by the packages job. Investigating the root cause revealed several interrelated issues.

  1. Redundant 'pkgs' target in the eve build command

    The eve job ran 'make pkgs eve', but the packages job already builds and caches all packages. Since the eve job restores the cache first, the 'pkgs' target should be a no-op. Removed it.

  2. arm64 packages were never restored from cache

    The cache restore logic had a conditional: if the runner arch matched the matrix arch, it skipped both clearing the linuxkit cache and restoring the target arch cache. The assumption was that the first cache restore (for tool images) already had the right packages. But that first restore always fetched the amd64 generic cache — even on arm64 runners. So arm64 jobs were left with amd64 packages in the cache, and 'make pkgs' (see issue number 1) was silently rebuilding everything for arm64.

  3. Tool images were hardcoded to amd64

    The cache key for loading tool images (mkconf, mkimage-raw-efi, mkrootfs-squash, etc.) into docker was hardcoded to amd64. On arm64 runners this is wrong — they need arm64 tool images. Since for native builds the target cache already contains these tools, we now load them directly from the target cache. The two-cache dance (load tools from one arch, then restore packages from another) is only needed for riscv64 cross-builds on amd64.

  4. The 'rt' platform maps to generic packages

    No build-rt.yml files exist anywhere in pkg/, so PLATFORM=rt produces identical packages to PLATFORM=generic. Rather than adding a redundant amd64/rt entry to the packages matrix, we map 'rt' to 'generic' in the cache key.
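
    The 'rt' to 'generic' mapping from item 4 could be sketched roughly as below. This is only an illustration: the function names `cache_platform` and `cache_key` and the `linuxkit-<arch>-<platform>` key layout are hypothetical and not taken from build.yml.

```shell
#!/bin/sh
# Hypothetical helper: map a build platform to the platform whose cached
# packages it can reuse. Per the description, no build-rt.yml files exist,
# so 'rt' shares the 'generic' package cache. Other platforms pass through.
cache_platform() {
    case "$1" in
        rt) echo "generic" ;;
        *)  echo "$1" ;;
    esac
}

# Hypothetical key layout; the real cache key in build.yml may differ.
# $1 = arch, $2 = platform
cache_key() {
    echo "linuxkit-$1-$(cache_platform "$2")"
}

cache_key amd64 rt        # prints linuxkit-amd64-generic
cache_key arm64 generic   # prints linuxkit-arm64-generic
```

    Mapping in the key avoids a separate amd64/rt entry in the packages matrix while still letting the rt build hit the generic cache.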

The fix simplifies the eve job's cache handling:

  • Native builds (amd64, arm64): restore target cache, load tools, build
  • Cross-builds (riscv64): restore amd64 cache, load tools, clear, restore riscv64 cache, build
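
The two flows listed above can be sketched as a small shell function that prints the ordered steps for a given runner and target arch. The step names and the function `plan_cache_steps` are illustrative only. Comparing runner arch to target arch is an assumption of this sketch; the actual workflow may instead test for riscv64 directly.

```shell
#!/bin/sh
# Print the ordered cache-handling steps for one eve job.
# $1 = runner arch, $2 = target (matrix) arch. Step names are hypothetical.
plan_cache_steps() {
    runner="$1"
    target="$2"
    if [ "$runner" = "$target" ]; then
        # Native build: the target cache already contains the tool images.
        printf '%s\n' "restore-cache $target" "load-tools" "build"
    else
        # Cross-build: tools come from the runner-arch cache, packages from
        # the target-arch cache, with a clear in between.
        printf '%s\n' "restore-cache $runner" "load-tools" "clear-cache" \
            "restore-cache $target" "build"
    fi
}

plan_cache_steps arm64 arm64
plan_cache_steps amd64 riscv64
```

For `arm64 arm64` this prints three steps starting with `restore-cache arm64`; for `amd64 riscv64` it prints five, with `restore-cache riscv64` as the fourth.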

The "Arch Runner is Matrix" step is removed as it is no longer used.

PR dependencies

None

How to test and validate this PR

Run CI. The eve job from build.yml should no longer rebuild the packages and should run faster on every architecture.

Changelog notes

N/A

PR Backports

- 16.0-stable: To be backported.
- 14.5-stable: To be backported.
- 13.4-stable: To be backported.

Checklist

  • I've provided a proper description
  • I've added the proper documentation
  • I've tested my PR on amd64 device
  • I've tested my PR on arm64 device
  • I've written the test verification instructions
  • I've set the proper labels to this PR

Signed-off-by: Paul Gaiduk <paulg@zededa.com>
@europaul europaul added the stable Should be backported to stable release(s) label Mar 10, 2026
@github-actions github-actions bot requested a review from uncleDecart March 10, 2026 13:20

- name: Build EVE ${{ matrix.hv }}-${{ matrix.arch }}-${{ matrix.platform }}
  run: |
    make V=1 ROOTFS_VERSION="$VERSION" PLATFORM=${{ matrix.platform }} HV=${{ matrix.hv }} ZARCH=${{ matrix.arch }} pkgs eve # note that this already loads it into docker

@rene rene Mar 10, 2026


@europaul, I'm not sure you can get rid of make pkgs, because all packages pointed to by PKGS_DOCKER_LOAD in the Makefile must be loaded into docker on the host. So it is not only about being available in the linuxkit cache; they must also be loaded into docker...


OK, I see you added the command to load these tool packages.

@eriknordmark

  • The eve job ran 'make pkgs eve', but the packages job already builds and caches all packages. Since the eve job restores the cache first, the 'pkgs' target should be a no-op. Removed it.

When I type 'make eve' in my workspace it does not rebuild all of pkg/* from source. I think the intent is that the workflow does that (unless I'm missing something) so I don't know how this can be considered a no-op.

@europaul


We first run make pkgs in the packages job, so the packages should already be built and cached by the time we run make eve in the eve job.
