feat: Add multi-arch build API fields and pre-flight validations#2127
feat: Add multi-arch build API fields and pre-flight validations#2127anchi205 wants to merge 1 commit intoshipwright-io:mainfrom
Conversation
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
af45e2c to
00999d8
Compare
There was a problem hiding this comment.
Pull request overview
This PR introduces first-class multi-architecture image build support to Shipwright Builds/BuildRuns, adding API fields for platform selection and implementing a PipelineRun-based fan-out/fan-in orchestration that produces an OCI image index and per-platform status reporting.
Changes:
- Extend Build/BuildRun APIs and CRDs with
spec.output.multiArch.platformsand BuildRun status fields for per-platform results and manifest list digest. - Generate multi-arch PipelineRuns that push a source bundle, execute per-platform build tasks on matching nodes, and assemble an OCI image index.
- Add image-processing CLI support and supporting
pkg/imageutilities for source bundling and index assembly, plus tests and docs.
Reviewed changes
Copilot reviewed 23 out of 23 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| pkg/validate/output_test.go | Adds validation tests for multi-arch platforms, executor mode, node conflicts, and node availability checks. |
| pkg/validate/output.go | Implements multi-arch validation helpers (platforms, nodeSelector conflicts, executor, node presence). |
| pkg/reconciler/buildrun/resources/results_test.go | Adds tests for aggregating multi-arch PipelineRun child TaskRun results into BuildRun status. |
| pkg/reconciler/buildrun/resources/results.go | Implements multi-arch result extraction and status population (PlatformResults + ManifestDigest). |
| pkg/reconciler/buildrun/resources/resource_builders.go | Switches nodeSelector merge helper to exported MergeMaps; adds multi-arch workspace bindings. |
| pkg/reconciler/buildrun/resources/pipelinerun_generator.go | Adds multi-arch detection and generates fan-out PipelineRun with per-platform tasks + assemble step. |
| pkg/reconciler/buildrun/resources/multiarch_test.go | New tests for multi-arch PipelineRun generation behavior and wiring. |
| pkg/reconciler/buildrun/resources/multiarch.go | New multi-arch PipelineRun building blocks (task naming, source bundle, per-platform tasks, assemble-index, TaskRunSpecs). |
| pkg/reconciler/buildrun/resources/image_processing.go | Renames mergeMaps to exported MergeMaps and updates usage. |
| pkg/reconciler/buildrun/buildrun.go | Adds multi-arch validation at reconcile time and multi-arch result extraction for PipelineRun execution. |
| pkg/image/index_test.go | Adds tests for assembling an image index from per-platform images. |
| pkg/image/index.go | Adds AssembleImageIndex helper for creating OCI image indexes (manifest lists). |
| pkg/image/bundle_test.go | Adds tests for bundling a directory into a single-layer OCI image (including symlink behavior). |
| pkg/image/bundle.go | Adds source directory bundling as an OCI image layer for cross-node source distribution. |
| pkg/apis/build/v1beta1/zz_generated.deepcopy.go | Updates generated deep-copies for new API fields (MultiArch, PlatformResults, etc.). |
| pkg/apis/build/v1beta1/buildrun_types.go | Adds BuildRun status types for multi-arch per-platform results and manifest digest. |
| pkg/apis/build/v1beta1/build_types.go | Adds API types ImagePlatform and MultiArch plus new BuildReasons for multi-arch validation failures. |
| docs/build.md | Documents how to configure multi-arch builds, requirements, and examples. |
| deploy/crds/shipwright.io_builds.yaml | Updates Build CRD schema with spec.output.multiArch.platforms. |
| deploy/crds/shipwright.io_buildruns.yaml | Updates BuildRun CRD schema with spec.output.multiArch.platforms and status fields. |
| deploy/200-role.yaml | Grants controller RBAC permissions to list/watch nodes for platform availability validation. |
| cmd/image-processing/main_test.go | Adds tests for source bundle push and index assembly CLI modes. |
| cmd/image-processing/main.go | Adds CLI modes for pushing source bundles and assembling/pushing image indexes, plus platform-image parsing. |
Comments suppressed due to low confidence (1)
pkg/reconciler/buildrun/resources/image_processing.go:289
- MergeMaps mutates and returns the first input map in-place. Several new call sites pass maps from informer-cached objects (e.g., build.Spec.NodeSelector / build.Spec.Output.Annotations), so this can unintentionally mutate cached resources and cause hard-to-debug reconciliation behavior. Make MergeMaps allocate a new map, copy entries from first, then overlay second, without modifying either input map (and handle nil maps).
// MergeMaps merges two string maps. Values in second take precedence
// over first when keys overlap.
func MergeMaps(first map[string]string, second map[string]string) map[string]string {
if len(first) == 0 {
first = map[string]string{}
}
for k, v := range second {
first[k] = v
}
return first
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if in.PlatformResults != nil { | ||
| in, out := &in.PlatformResults, &out.PlatformResults | ||
| *out = make([]PlatformBuildResult, len(*in)) | ||
| copy(*out, *in) |
There was a problem hiding this comment.
BuildRunStatus.DeepCopyInto shallow-copies PlatformResults with copy(*out, *in). PlatformBuildResult contains a Vulnerabilities slice, so this will share underlying slice data between copies and violates Kubernetes deepcopy semantics. Regenerate deepcopy with controller-gen (or deep-copy each element via DeepCopyInto) so nested slices are fully copied.
| copy(*out, *in) | |
| for i := range *in { | |
| (*in)[i].DeepCopyInto(&(*out)[i]) | |
| } |
There was a problem hiding this comment.
This is auto-generated code from controller-gen, so I don't think we should be changing this. I have verified this doesn't cause issues because the reconciler rebuilds PlatformResults from scratch on each reconciliation rather than mutating deep-copied structs.
| } | ||
| } | ||
| } | ||
| } | ||
| }) |
There was a problem hiding this comment.
This test file has mismatched/missing closing braces in the nested for/if blocks (around the vulnerability scanning args assertion). As written it will not compile, which will break the test suite. Please fix the block structure and run gofmt to ensure the file builds.
| } | |
| } | |
| } | |
| } | |
| }) | |
| } | |
| } | |
| } | |
| } | |
| }) |
There was a problem hiding this comment.
Its compiling properly, so this is a false positive.
pkg/image/bundle.go
Outdated
| defer f.Close() | ||
|
|
||
| if _, err := io.Copy(tarWriter, f); err != nil { | ||
| return fmt.Errorf("writing %s to tar: %w", relPath, err) | ||
| } | ||
|
|
There was a problem hiding this comment.
createTarGz defers f.Close() inside the filepath.Walk callback. With many files this will keep file descriptors open until the entire walk finishes and can exhaust FD limits. Close the file immediately after copying (avoid defer in the loop/callback).
| defer f.Close() | |
| if _, err := io.Copy(tarWriter, f); err != nil { | |
| return fmt.Errorf("writing %s to tar: %w", relPath, err) | |
| } | |
| if _, err := io.Copy(tarWriter, f); err != nil { | |
| _ = f.Close() | |
| return fmt.Errorf("writing %s to tar: %w", relPath, err) | |
| } | |
| if err := f.Close(); err != nil { | |
| return fmt.Errorf("closing %s: %w", filePath, err) | |
| } |
There was a problem hiding this comment.
This is a false positive
The defer f.Close() is inside the anonymous function passed to filepath.Walk
filepath.Walk calls this anonymous function once per file as a separate function invocation. f.Close() runs when each callback returns, not when the outer createTarGz returns. Each file descriptor is opened, copied, and closed before the next file is processed.
There was a problem hiding this comment.
I think it makes sense to turn this anonymous function into a real function. It should fix the false positive and also make this behavior easier to unit test.
IrvingMg
left a comment
There was a problem hiding this comment.
Thanks for the work here! Would it be possible to split this into smaller PRs to make it easier to review?
Also, is there an issue for this? If yes, could you link it for more context?
SHIP for this : https://github.com/shipwright-io/community/blob/main/ships/0043-multi-arch-image-builds.md I will be splitting this entire work into 3 PRs. Please review it once I am done. Thank you |
|
We have a parent "Epic" issue in the Community repo where we are tracking all the work for this feature: shipwright-io/community#285. This is also linked on the project Roadmap. This is most directly related to #2075, although I need to clean up that feature's description. I used an AI assistant to generate these features, and on this one it's quite clear that it hallucinated the requirements. |
Add ImagePlatform and MultiArch types to Build and BuildRun CRDs under spec.output.multiArch, allowing users to declare target platforms for multi-architecture image builds as described in SHIP-0043. Pre-flight validation in the BuildRun controller rejects builds early when platform fields are invalid, nodeSelector conflicts with os/arch labels, the controller is not in PipelineRun executor mode, or the cluster lacks a Ready schedulable node for a requested platform. RBAC is extended with get/list/watch on nodes for the controller ServiceAccount, required by the node availability check. Closes shipwright-io#2075 Assisted-by: Cursor Signed-off-by: Anchita Borah <anborah@redhat.com>
Changes
Add ImagePlatform and MultiArch types to Build and BuildRun CRDs under
spec.output.multiArch, allowing users to declare target platforms for
multi-architecture image builds as described in SHIP-0043.
Pre-flight validation in the BuildRun controller rejects builds early
when platform fields are invalid, nodeSelector conflicts with os/arch
labels, the controller is not in PipelineRun executor mode, or the
cluster lacks a Ready schedulable node for a requested platform.
RBAC is extended with get/list/watch on nodes for the controller
ServiceAccount, required by the node availability check.
Closes #2075
Assisted-by: Cursor
Submitter Checklist
Release Notes