Skip to content

feat(find-images): 4509 include archives in find image#4551

Open
chaospuppy wants to merge 44 commits intozarf-dev:mainfrom
chaospuppy:4509-include-archives-in-find-image
Open

feat(find-images): 4509 include archives in find image#4551
chaospuppy wants to merge 44 commits intozarf-dev:mainfrom
chaospuppy:4509-include-archives-in-find-image

Conversation

@chaospuppy
Copy link
Copy Markdown
Contributor

@chaospuppy chaospuppy commented Jan 23, 2026

Description

This MR resolves #4577 by doing the following:

  • Allow schema validation to succeed by removing findings related to an empty image list under imageArchives keys. Currently this finding is only removed when calling PackageDefinition from FindImages
  • Add FindImagesInArchive to identify and return all images in imageArchives.path
  • Update unit tests

Related Issue

Fixes #4577

Checklist before merging

@netlify
Copy link
Copy Markdown

netlify bot commented Jan 23, 2026

Deploy Preview for zarf-docs ready!

Name Link
🔨 Latest commit c67b9be
🔍 Latest deploy log https://app.netlify.com/projects/zarf-docs/deploys/69de69f570f9900008bf1d7f
😎 Deploy Preview https://deploy-preview-4551--zarf-docs.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify project configuration.

@chaospuppy
Copy link
Copy Markdown
Contributor Author

@AustinAbro321 when you have a moment, please take a look. I am still going to update a few unit tests.

@chaospuppy chaospuppy force-pushed the 4509-include-archives-in-find-image branch from aad61df to 6d872f4 Compare January 24, 2026 00:49
@chaospuppy chaospuppy marked this pull request as ready for review January 24, 2026 00:50
@chaospuppy chaospuppy requested review from a team as code owners January 24, 2026 00:50
@chaospuppy chaospuppy force-pushed the 4509-include-archives-in-find-image branch 2 times, most recently from b03b517 to f35514c Compare January 27, 2026 20:43
@chaospuppy chaospuppy changed the title 4509 include archives in find image feat(find-images): 4509 include archives in find image Jan 27, 2026
@codecov
Copy link
Copy Markdown

codecov bot commented Jan 27, 2026

Codecov Report

❌ Patch coverage is 56.65236% with 101 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
src/cmd/dev.go 0.00% 38 Missing ⚠️
src/pkg/packager/find_images.go 65.62% 15 Missing and 7 partials ⚠️
src/pkg/images/unpack.go 64.15% 12 Missing and 7 partials ⚠️
src/pkg/packager/update.go 76.47% 11 Missing and 5 partials ⚠️
src/pkg/images/common.go 40.00% 4 Missing and 2 partials ⚠️
Files with missing lines Coverage Δ
src/pkg/packager/load/load.go 43.45% <ø> (ø)
src/pkg/images/common.go 35.87% <40.00%> (+0.34%) ⬆️
src/pkg/packager/update.go 56.91% <76.47%> (+8.95%) ⬆️
src/pkg/images/unpack.go 59.74% <64.15%> (+5.64%) ⬆️
src/pkg/packager/find_images.go 57.25% <65.62%> (+1.39%) ⬆️
src/cmd/dev.go 43.32% <0.00%> (-2.33%) ⬇️
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link
Copy Markdown
Member

@AustinAbro321 AustinAbro321 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good start! Thanks for taking this on

Comment thread src/pkg/images/unpack.go Outdated
// It returns a list of images found in the extracted OCI layout.
func FindImagesInArchive(ctx context.Context, imageArchive, destDir string) ([]string, error) {
// Create a temporary directory for extraction
tmpdir, err := utils.MakeTempDir("")
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The extraction logic in this block is mostly duplicated. Lets make a common function to de-duplicate it

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made a few changes with this in mind

Comment thread src/pkg/packager/load/load.go Outdated
// SkipVersionCheck skips version requirement validation
SkipVersionCheck bool
// SkipImageArchivesImages ignores schema validation errors when imageArchives does not include an images list
SkipEmptyImageArchivesImages bool
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer a strategy where we create a function load.PackageDefinitionNoValidate() then also make a public function load.Validate(). Keep the load.PackageDefinition function this way validation happens by default.

I believe a public validate function could be generally useful for SDK users who might be creating packages without zarf.yaml files, additionally it'll leave more room in the future for more generic solutions if we need to skip validation for a similar reason

Copy link
Copy Markdown
Contributor Author

@chaospuppy chaospuppy Jan 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy to make that split still, but our failure was while validating the file itself here, rather than during the validation of v1alpha1.ZarfPackage. Therefore, it wasn't useful for us to delay validation after adding a placeholder image as we previously discussed to the v1alpha1.ZarfPackage.

I could split them and then add some work to change the zarf.ymal on disk to add an empty array there, but that didn't seem like desirable behavior.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively, I could also split out a Validate and ValidateWithoutSchemaValidation (not literally that name, but something like that) and use the later

Comment thread src/pkg/packager/find_images.go Outdated
return nil, fmt.Errorf("failed to unpack image archive %s: %w", archive.Path, err)
}
for _, image := range archiveImages {
matchedImages[image] = true
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think matched images isn't the right construct. Maybe a new field should be added such as archiveImages. One of the goals of find-images is to be able to copy and paste the output into your zarf.yaml. Currently you'd see this output, which would result in trying to pull the image from the internet if you copied and pasted it

2026-01-27 16:38:42 INF looking up cosign artifacts for discovered images count=3

components:
  - name: kiwix-serve
    images:
      - docker.io/library/kiwix-data:local
      - ghcr.io/kiwix/kiwix-serve:3.5.0-2
      - kiwix-data:local

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, good point, I missed the desire to pull images from the image archive on disk after zarf.yaml has been updated with the output of zarf dev find-images. I'll fix that up!

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AustinAbro321 to clarify, are you hoping to see images elements expressed in a such as way as to direct Zarf to pull them from the archive, rather than from the internet?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I missed your comment before asking, I think I see the answer in #4577

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just went back and clarified #4577 a bit, but just to write here as well, we should no longer include images because they're in an image archive. If they're in both a manifest we should check if the image archive contains that image, then in the output to the user structure it so it's in the image archive, and the user can simply copy and paste.

Comment thread src/pkg/packager/find_images.go
@github-project-automation github-project-automation bot moved this to In progress in Zarf Jan 27, 2026
@chaospuppy chaospuppy force-pushed the 4509-include-archives-in-find-image branch from 08c339e to 9534ea6 Compare January 31, 2026 01:53
@AustinAbro321
Copy link
Copy Markdown
Member

Hello @chaospuppy, I reazlied that the behavior I proposed in #4509, didn't really fit the zarf dev find-images command. However, there is still work to be done to get image archives to behave properly for zarf dev find-images. I've created #4577 to track this work. If you're interested, I believe this PR could be adjusted for that issue. LMK if you have any questions.

If you do adjust this PR for #4577, don't worry about validation, assume the given zarf.yaml file is valid

@chaospuppy
Copy link
Copy Markdown
Contributor Author

chaospuppy commented Feb 6, 2026

@AustinAbro321 Are you thinking about handling validation differently in the near future? Trying to get an understanding for which changes I should back out. I am also happy to keep the validation changes in, including making Validation public and creating PackageDefinitionWithoutValidation.

@AustinAbro321
Copy link
Copy Markdown
Member

@chaospuppy Very possibly handling validation in the near future yes. I suggest backing out of the validation changes in this PR

chaospuppy and others added 13 commits February 9, 2026 14:22
…eholder images list, add skipimagearchivesimages definition option

Signed-off-by: Tim Seagren <timseagren@defenseunicorns.com>
…efinitionOption for skipping imagearchives images schema findings

Signed-off-by: Tim Seagren <timseagren@defenseunicorns.com>
…es output

Signed-off-by: Tim Seagren <timseagren@defenseunicorns.com>
Signed-off-by: Tim Seagren <timseagren@defenseunicorns.com>
Signed-off-by: Tim Seagren <timseagren@defenseunicorns.com>
Signed-off-by: Tim Seagren <timseagren@defenseunicorns.com>
Signed-off-by: Tim Seagren <timseagren@defenseunicorns.com>
…dev#4554)

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Tim Seagren <timseagren@defenseunicorns.com>
…-organization group (zarf-dev#4553)

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Tim Seagren <timseagren@defenseunicorns.com>
…ev#4552)

Signed-off-by: Jason Washburn <jason.washburn@gmail.com>
Signed-off-by: Tim Seagren <timseagren@defenseunicorns.com>
Signed-off-by: Austin Abro <austinabro321@gmail.com>
Signed-off-by: Tim Seagren <timseagren@defenseunicorns.com>
Signed-off-by: Austin Abro <austinabro321@gmail.com>
Signed-off-by: Tim Seagren <timseagren@defenseunicorns.com>
Signed-off-by: Austin Abro <austinabro321@gmail.com>
Signed-off-by: Tim Seagren <timseagren@defenseunicorns.com>
Signed-off-by: Tim Seagren <timseagren@defenseunicorns.com>
@chaospuppy chaospuppy marked this pull request as ready for review March 4, 2026 05:19
Comment thread src/pkg/packager/find_images.go Outdated

type ImageArchivesScan struct {
ComponentName string
ImageArchives []ImageArchive
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I considered using v1alpha1.ZarfComponent.ImageArchive here, but thought perhaps there could be some divergence in the future. Let me know if you'd prefer the re-use of v1alpha1.ZarfComponent.ImageArchive

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think they are coupled in that if image archives were to change, we'd also want to change this block since ultimately we'd want to use the information here to update / print a package definition

Comment thread src/pkg/packager/update.go Outdated
chaospuppy and others added 2 commits March 3, 2026 21:28
Signed-off-by: Tim Seagren <timseagren@defenseunicorns.com>
Copy link
Copy Markdown
Member

@AustinAbro321 AustinAbro321 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good progress. When running this locally on examples/kiwix I get

components:
  - name: kiwix-serve
    images:
      - ghcr.io/kiwix/kiwix-serve:3.5.0-2
      - kiwix-data:local
  # Archive images - kiwix-serve
  - name: kiwix-serve
    imageArchives:
      - path: kiwix-data.tar
        images: []

Whereas I'd expect

  - name: kiwix-serve
    images:
      - ghcr.io/kiwix/kiwix-serve:3.5.0-2
    imageArchives:
      - path: kiwix-data.tar
        images: 
         - kiwix-data:local

The kiwix-data:local is in the wrong spot, and they are printed as different components.

Comment thread src/cmd/dev.go
Comment thread src/pkg/packager/update.go Outdated
Comment thread src/pkg/packager/find_images.go Outdated

type ImageArchivesScan struct {
ComponentName string
ImageArchives []ImageArchive
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think they are coupled in that if image archives were to change, we'd also want to change this block since ultimately we'd want to use the information here to update / print a package definition

Signed-off-by: Tim Seagren <timseagren@defenseunicorns.com>
@chaospuppy
Copy link
Copy Markdown
Contributor Author

chaospuppy commented Mar 4, 2026

@AustinAbro321 I've resolved the duplicate components issue, thanks for spotting that. To your point about kiwix-data:local missing from imageArchives, I'm wondering how we want to handle that exactly. Since creating a tarball of an image tagged without a registry with docker save always results in an index.json io.containerd.image.name that is pre-pended with docker.io/library/, there is some ambiguity over if kiwix-data:local really does correspond to docker.io/library/kiwix-data:local found in the archive. The user could also be looking for ghcr.io/kiwix/kiwix-data:local, for example. Are do we also want to assume that images without registries in their references refer to docker.io/library images?

@AustinAbro321
Copy link
Copy Markdown
Member

So a fun thing about containers is that if the registry is not specified then it's always assumed to be docker.io. Docker was the registry that popularized containers, so they made themselves the default. Generally the community wants to move away from this. IMO it would be a positive change if the find images command started returning the full path, so instead of kiwix-data:local, it'd return docker.io/library/kiwix-data:local. This should simplify things for you as well.

You should be able to use the transform.ParseImageRef library to get the qualified name.

Signed-off-by: Tim Seagren <timseagren@defenseunicorns.com>
@chaospuppy
Copy link
Copy Markdown
Contributor Author

@AustinAbro321 updated to return qualified names for scan Matches. I did not do the same for PotentialMatches since if I understand the use case for PotentialMatches, we wouldn't want to push an image to the zarf registry from the archive that isn't actually used.

Signed-off-by: Tim Seagren <timseagren@defenseunicorns.com>
Signed-off-by: Tim Seagren <timseagren@defenseunicorns.com>
Signed-off-by: Tim Seagren <timseagren@defenseunicorns.com>
Copy link
Copy Markdown
Member

@AustinAbro321 AustinAbro321 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good progress, some more requests as I go through

Comment thread src/pkg/images/unpack.go Outdated
Comment thread src/pkg/images/unpack.go Outdated
Comment thread src/pkg/images/unpack.go Outdated
Comment thread src/pkg/packager/find_images.go Outdated
Comment thread src/pkg/packager/find_images.go Outdated
Comment thread src/cmd/dev.go Outdated
Comment thread src/pkg/packager/find_images.go Outdated
…ind-image

Signed-off-by: Tim Seagren <timseagren@defenseunicorns.com>
@chaospuppy chaospuppy force-pushed the 4509-include-archives-in-find-image branch from 987f2e9 to 3327eae Compare April 7, 2026 17:06
Signed-off-by: Tim Seagren <timseagren@defenseunicorns.com>
Copy link
Copy Markdown
Member

@AustinAbro321 AustinAbro321 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Requesting an architecture change. Appreciate the iterations

// FilterImagesFoundInArchives recieves imageScans and discovers if any of its elements are duplicated by
// images provided in imageArchives parsed out of the package located at packagePath. If duplciates are found, they
// are removed from imageScans and added to archiveImageScans.
func FilterImagesFoundInArchives(ctx context.Context, packagePath string, imageScans []ComponentImageScan) (_ []ImageArchivesScan, _ []ComponentImageScan, err error) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking more about the architecture we're going with here. I'd like to see this function private and another abstract function introduced FindDefinitionImages. This function would return a struct like this, and would do the logic of filtering the images.

  type DefinitionImageResult struct {                                                                                                                           
      ComponentImageScan                                                                                                                                                
      ImageArchives []v1alpha1.ImageArchive
  }  

This should solve a few problems:

  1. dev.go and UpdateImages will be simplified since all data will already be in one struct
  2. when --why is used, we can just call the original Packager.FindImages, it's no longer order dependent in dev.go
  3. We can introduce a private function packager.findImages which accepts a package definition. This way we only have to load the package once.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little confused around FindDefinitionImages returning a struct like DefinitionImageResult. I'm thinking FindDefinitionImages would need to return a []DefinitionImageResult to that the rest of the logic in dev.go and UpdateImages can remain relatively unchanged. Is that right?

Aside from that, I'm also not totally certain I understand the situations in which we call packager.FindImages, packer.findImages, and packager.FindDefintionImages. My current idea is this:

  • FindImages is only called if --why is set, in dev.go
  • FindDefinitionImages is called if --why is not set, in dev.go
  • findImages is called by both FindImages and FindDefinitionImages
  • Both FindImages and FindDefintitionImages handle loading the package before passing it as an argument to findImages

Is that right?

Copy link
Copy Markdown
Member

@AustinAbro321 AustinAbro321 Apr 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I misspoke, it would return the struct. Your understanding and flow is correct

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: In progress

Development

Successfully merging this pull request may close these issues.

zarf dev find-images should place images within Image Archives when archives hold relevant images

3 participants