(metro-core): add manifest SHA-256 hashes and optional bundle cache layer integration #4576
…d add register function
…undle URL resolution
jbroma
left a comment
There was a problem hiding this comment.
hey @zhongwuzw, nice job with the implementation so far! I have a few questions and one idea about how to make this even better, let me know what you think!
There was a problem hiding this comment.
since there are no package.json changes in this PR, this is probably a leftover from previous changes, could you please verify this?
| const bundleContent = await fs.readFile( | ||
| saveBundleOpts.bundleOutput, | ||
| 'utf-8', | ||
| ); |
There was a problem hiding this comment.
perhaps we can use bundle directly here? I don't see any benefit of reading it again from the filesystem
There was a problem hiding this comment.
@jbroma Hi, I see we have --bundle-encoding in bundle-mf-remote — do we actually support non-UTF-8? https://github.com/zhongwuzw/core/blob/b2472772250dad91947eefa31b724582441f95e9/packages/metro-core/src/commands/bundle-remote/index.ts#L335
The option accepts utf8 | utf16le | ascii, but encoding info is never passed to the upload layer or stored in the manifest. The server/CDN has no idea what encoding the bundle was written in, and the client always decodes as UTF-8? If a user passes --bundle-encoding utf16le, the bundle will silently break at runtime I think. Did I miss anything?
There was a problem hiding this comment.
Probably not, it's fine to assume utf-8 for now - ideally we would store this information somewhere so we don't have to guess
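The two points in this thread (hash the bundle that is already in memory, and make the encoding explicit instead of guessing) could be sketched together roughly as follows. This is a minimal sketch assuming Node's built-in crypto module; `hashBundle` and `BundleEncoding` are hypothetical names for illustration, not part of metro-core.

```typescript
import { createHash } from 'node:crypto';

// Hypothetical type mirroring the values --bundle-encoding accepts per the thread.
type BundleEncoding = 'utf8' | 'utf16le' | 'ascii';

// Hash the in-memory bundle string using the same encoding it will be written with,
// so the digest matches the bytes on disk without reading the file back.
function hashBundle(code: string, encoding: BundleEncoding = 'utf8'): string {
  const bytes = Buffer.from(code, encoding);
  return createHash('sha256').update(bytes).digest('hex');
}
```

The same source string produces a different digest under utf16le than under utf8, which is why a client that always assumes UTF-8 would fail verification for a utf16le bundle.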
| // Inject container bundle hash into metaData.buildInfo.hash | ||
| const containerFilename = federationConfig.filename; | ||
| if (bundleHashMap.has(containerFilename)) { | ||
| rawManifest.metaData.buildInfo.hash = |
There was a problem hiding this comment.
is this typed in the MF core or is this a non-standard field?
There was a problem hiding this comment.
shared[].hash is already typed in MF core (StatsShared.hash: string) — Metro generates it as an empty string and I populate it post-build. metaData.buildInfo.hash and expose[].hash are non-standard — StatsBuildInfo and StatsExpose don't have a hash field. Happy to add hash?: string to the core types if you'd like to formalize it.
There was a problem hiding this comment.
yes, let's standardize those then 👍
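For illustration, standardizing the fields could look roughly like the sketch below. The interfaces here are simplified, hypothetical stand-ins (the real StatsBuildInfo, StatsExpose and StatsShared carry more fields); the only point is that an optional `hash` on the types removes the need for `as any` casts when reading it.

```typescript
// Simplified, hypothetical shapes; the real types live in the MF SDK stats typings.
interface BuildInfoLike {
  buildVersion: string;
  hash?: string; // proposed optional field
}
interface ExposeLike {
  name: string;
  hash?: string; // proposed optional field
}
interface SharedLike {
  name: string;
  hash: string; // already typed as a string, per the thread
}
interface ManifestLike {
  metaData: { buildInfo: BuildInfoLike };
  exposes: ExposeLike[];
  shared: SharedLike[];
}

// With `hash` on the types, consumers can read it without casting.
function readContainerHash(manifest: ManifestLike): string | undefined {
  const hash = manifest.metaData.buildInfo.hash;
  return hash ? hash : undefined; // treat '' (unpopulated) as missing
}
```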
| afterResolve: (args) => { | ||
| // Register bundle hashes with cache layer for integrity verification | ||
| try { | ||
| const cacheLayer = (globalThis as any).__MFE_CACHE_LAYER__ as | ||
| | ICacheLayer | ||
| | undefined; | ||
| if (!cacheLayer) return args; | ||
|
|
||
| const __loadBundleAsync = | ||
| globalThis[`${__METRO_GLOBAL_PREFIX__ ?? ''}__loadBundleAsync`]; | ||
| const { origin, remoteInfo, remote } = args; | ||
| const manifestUrl = | ||
| 'entry' in remote ? (remote as any).entry : undefined; | ||
| if (manifestUrl && origin.snapshotHandler?.manifestCache) { | ||
| const manifest = | ||
| origin.snapshotHandler.manifestCache.get(manifestUrl); | ||
| if (manifest) { | ||
| // Container bundle hash | ||
| const containerHash = (manifest.metaData?.buildInfo as any)?.hash; | ||
| if (containerHash && remoteInfo.entry) { | ||
| cacheLayer.registerBundleHash(remoteInfo.entry, containerHash); | ||
| } | ||
|
|
||
| const loadBundleAsync = | ||
| __loadBundleAsync as typeof globalThis.__loadBundleAsync; | ||
| // Exposed + shared bundle hashes | ||
| const hashes = extractBundleHashes(manifest, manifestUrl); | ||
| for (const [url, hash] of hashes) { | ||
| // Strip query params — loadBundle looks up hashes by bare URL | ||
| cacheLayer.registerBundleHash(url.split('?')[0], hash); | ||
| } | ||
|
|
||
| if (!loadBundleAsync) { | ||
| throw new Error('loadBundleAsync is not defined'); | ||
| } | ||
| // Register manifest source for polling | ||
| cacheLayer.registerManifestSource(manifestUrl, extractBundleHashes); | ||
| } | ||
| } | ||
| } catch { | ||
| // non-critical — hash validation is best-effort | ||
| } | ||
| return args; | ||
| }, |
There was a problem hiding this comment.
I think we could extract this to the cache package and make it a separate runtime plugin instead - do you think it's possible?
There was a problem hiding this comment.
@jbroma Hey.
I think keeping it in metroCorePlugin is the better fit here. Two reasons:
Separation of concerns — the afterResolve hook parses MF manifest structure and builds Metro-specific URLs, then passes clean (url, hash) pairs to the cache layer via registerBundleHash(). Moving it to the cache package would couple cache with MF manifest types and Metro URL format.
Zero-config — currently it's a no-op when cache isn't registered (if (!cacheLayer) return args). If extracted to a separate runtime plugin, every app that resolves remotes (host + nested remotes) would need to add it to runtimePlugins.
What do u think?
There was a problem hiding this comment.
oh yeah, very good point, we need to keep this here then 👍
There was a problem hiding this comment.
perhaps we could make this PR more complete by providing a default in-memory implementation of this cache interface and always pipe everything through cache - this way we would just swap cache backends later, there would be no need for branching
There was a problem hiding this comment.
Just to make sure I understand: are you suggesting providing a no-op default ICacheLayer so afterResolve always pipes through the cache interface without the if (!cacheLayer) return args guard? If so, my concern is that it would still execute the manifest parsing, hash extraction, and URL resolution on every afterResolve call, only to feed the results into empty no-op functions. The early return avoids that unnecessary work when no cache backend is registered.
There was a problem hiding this comment.
yes, on second thought even tho it streamlines the process and keeps things branchless, there is a lot of work to do that's totally skippable, just an idea I had, thank you for taking time to think this through.
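The trade-off agreed on here can be shown with a small sketch: with the guard, the (potentially expensive) extraction step is never invoked when no cache is registered. `CacheLike` and `registerHashes` are hypothetical names used only for this illustration.

```typescript
// Hypothetical minimal cache contract for this illustration.
interface CacheLike {
  registerBundleHash(url: string, hash: string): void;
}

// Registers extracted hashes with the cache and returns how many were registered.
function registerHashes(
  cache: CacheLike | undefined,
  extract: () => Map<string, string>,
): number {
  if (!cache) return 0; // early return: extract() is never called
  let count = 0;
  extract().forEach((hash, url) => {
    cache.registerBundleHash(url, hash);
    count++;
  });
  return count;
}
```

A no-op default cache would keep the call site branchless, but `extract()` would still run on every call.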
There was a problem hiding this comment.
btw I've found out about MF global plugins:
All instances on a page share a singleton array at window.__FEDERATION__.__GLOBAL_PLUGIN__ (core/packages/runtime-core/src/global.ts, lines 272 to 287 in 1c02710).
When any instance initializes, host or remote, it includes the global plugins (core/packages/runtime-core/src/utils/plugin.ts, lines 5 to 35 in 1c02710).
- registerGlobalPlugins([plugin]) → applies to every instance (host + all remotes)
so the current approach, with just the host using this, is actually valid 👍
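A self-contained sketch of the mechanism described above, using a local stand-in object rather than the real federation global. The function names and the de-duplication by plugin name are assumptions made for this illustration; the referenced runtime-core files are the authoritative source.

```typescript
interface PluginLike {
  name: string;
}

// Stand-in for the shared singleton array described above; in the real runtime this
// lives on the federation global, here it is a local object for illustration.
const federationGlobal: { __GLOBAL_PLUGIN__: PluginLike[] } = { __GLOBAL_PLUGIN__: [] };

// Register plugins once, skipping names that are already present (assumed behavior).
function registerGlobalPluginsSketch(plugins: PluginLike[]): void {
  for (const plugin of plugins) {
    const exists = federationGlobal.__GLOBAL_PLUGIN__.some((p) => p.name === plugin.name);
    if (!exists) federationGlobal.__GLOBAL_PLUGIN__.push(plugin);
  }
}

// Every instance (host or remote) merges the global plugins with its own.
function collectPlugins(own: PluginLike[]): PluginLike[] {
  const merged = [...federationGlobal.__GLOBAL_PLUGIN__];
  for (const plugin of own) {
    if (!merged.some((p) => p.name === plugin.name)) merged.push(plugin);
  }
  return merged;
}
```

Registering once from the host is enough for a remote that lists no plugins of its own to still pick up the cache plugin.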
| // For remote split bundles with cache enabled, convert relative paths to | ||
| // full URLs so they enter the same cache path as container bundles. | ||
| // In dev mode, getBundlePath returns relative paths unchanged, but we need | ||
| // full URLs for the cache layer (download + eval). | ||
| if (isSplitBundle && cacheLayer && publicPath && !isUrl(bundlePath)) { | ||
| bundlePath = joinComponents(publicPath, bundlePath); | ||
| } | ||
|
|
||
| // --- Cache layer: intercept bundles with full URLs (containers + remote split bundles) --- | ||
| if (cacheLayer && isUrl(bundlePath)) { | ||
| const { status } = await cacheLayer.loadBundle(bundlePath); | ||
| if (status === 'skipped') { | ||
| // Cache layer skipped — fall back to network load | ||
| const encodedBundlePath = bundlePath.replaceAll('../', '..%2F'); | ||
| await loadBundleAsync(encodedBundlePath); | ||
| } | ||
| // else: 'cache-hit' or 'downloaded' — bundle already eval'd by cache layer | ||
| } else { | ||
| // No cache: host split bundles (no publicPath), cache disabled, or native-cache not installed | ||
| const encodedBundlePath = bundlePath.replaceAll('../', '..%2F'); | ||
| await loadBundleAsync(encodedBundlePath); | ||
| } |
There was a problem hiding this comment.
This looks good, but to me it feels out of place: with an optional layer we are introducing a lot of complexity into this module. Do you think it would be possible to create a wrapper around loadBundleAsync that introduces the cache layer, and then load the MF wrapper on top?
I think we could rework the asyncRequire implementation so that it would be possible to modify the actual loadBundleAsync and then add the MF wrapper on top of it:
- InitializeCore runs -> __loadBundleAsync gets initialized with the default implementation
- Cache wrapper runs -> enhances __loadBundleAsync with caching capabilities
- MF metro-core wrapper runs -> adapts __loadBundleAsync to work with MF
This approach would most likely require splitting asyncRequire into two modules, something like this:
- mf:init-async-require -> injects the expo impl of async require if missing
- mf:adapt-async-require -> adapts the existing impl of async require to work with federation
In the cache plugin you could then modify the resolver for either of those to ensure the module with the cache wrapper runs after init and before adapt.
mf:async-require, left over for compatibility, could then be just:
import 'mf:init-async-require';
import 'mf:adapt-async-require';
let me know what you think & if that makes sense to you
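The three-step layering proposed above can be sketched as plain function composition. Everything here is a hypothetical stand-in (the `calls` log, `withCache`, `adaptForFederation`); it only illustrates the ordering init -> cache wrapper -> MF adapter, not the actual asyncRequire modules.

```typescript
type LoadBundleAsync = (path: string) => Promise<void>;

// Records what each layer did, so the ordering is observable.
const calls: string[] = [];

// 1. Init: default implementation (stand-in for what InitializeCore would install).
const defaultLoader: LoadBundleAsync = async (path) => {
  calls.push(`network:${path}`);
};

// 2. Cache wrapper: serves repeated paths without calling the inner loader.
function withCache(inner: LoadBundleAsync, cached: Set<string>): LoadBundleAsync {
  return async (path) => {
    if (cached.has(path)) {
      calls.push(`cache:${path}`);
      return;
    }
    await inner(path);
    cached.add(path);
  };
}

// 3. MF adapter: encodes '../' segments before delegating, as in the diff above.
function adaptForFederation(inner: LoadBundleAsync): LoadBundleAsync {
  return (path) => inner(path.split('../').join('..%2F'));
}

const loader = adaptForFederation(withCache(defaultLoader, new Set<string>()));
```

Because the adapter is outermost, the cache layer only ever sees the already-encoded path, and neither layer needs to know about the other.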
There was a problem hiding this comment.
| const isSplitBundle = !isUrl(originalBundlePath); | ||
|
|
||
| // Cache handler registered externally (e.g. by zephyr-native-cache register()). | ||
| const cacheHandler = (globalThis as any).__MFE_CACHE__ as |
There was a problem hiding this comment.
let's group this under globalThis.__FEDERATION__.__NATIVE__, e.g. globalThis.__FEDERATION__.__NATIVE__.__CACHE__. Important: do not add a scope qualifier via __METRO_GLOBAL_PREFIX__, so that this stays a singleton entity.
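A minimal sketch of the suggested namespacing, assuming the handler signature seen elsewhere in this PR (`cacheHandler(loadBundleAsync, encodedBundlePath)`). `registerNativeCache` and `getNativeCache` are hypothetical helper names; the key point is that the path contains no per-bundle prefix, so every bundle on the page reads the same slot.

```typescript
type CacheHandler = (
  loadBundleAsync: (path: string) => Promise<void>,
  path: string,
) => Promise<void>;

// Writes the handler to the shared, unprefixed slot, creating parents as needed.
function registerNativeCache(handler: CacheHandler): void {
  const g = globalThis as any;
  g.__FEDERATION__ = g.__FEDERATION__ || {};
  g.__FEDERATION__.__NATIVE__ = g.__FEDERATION__.__NATIVE__ || {};
  g.__FEDERATION__.__NATIVE__.__CACHE__ = handler;
}

// Reads the handler; returns undefined when no cache has been registered.
function getNativeCache(): CacheHandler | undefined {
  return (globalThis as any).__FEDERATION__?.__NATIVE__?.__CACHE__;
}
```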
| // entry is always in the root directory of assets associated with remote | ||
| // based on that, we extract the public path from the origin URL | ||
| // e.g. http://example.com/a/b/c/mf-manfiest.json -> http://example.com/a/b/c |
| bundlePath = joinComponents(publicPath, bundlePath); | ||
| } | ||
|
|
||
| // ../../node_modules/ -> ..%2F..%2Fnode_modules/ so that it's not automatically sanitized |
There was a problem hiding this comment.
This interface is now redundant, since we have changed the requirements in asyncRequire, which no longer depends on this type, and the afterResolve hook is scheduled to be moved to the cache impl. itself
@jbroma Hi. Please review again.
jbroma
left a comment
There was a problem hiding this comment.
Hey @zhongwuzw, thanks for addressing the latest round of feedback — the namespace migration and hash type standardization look great.
I've been testing this branch and experimenting with a few changes that could reduce the mf-core surface area. I'm keen on getting this PR wrapped up and merged — I'm building on top of it so getting the final shape nailed down will really help move things forward.
Build-time hashing in the serializer
While testing I found it useful to compute SHA-256 hashes directly in the serializer pipeline rather than post-build. Since the serializer already has the bundle code in hand, we can hash it right there and thread the hashes through manifest generation. This means hashes get populated in the manifest as bundles are produced — no disk round-trip needed.
The core addition to manifest.ts:
type BundleHashMap = Map<string, string>;
export function recordBundleHash(
hashes: BundleHashMap,
code: string,
entryPoint: string,
projectRoot: string,
config: ModuleFederationConfigNormalized,
): void {
const hash = crypto.createHash('sha256').update(code).digest('hex');
const key = resolveBundleKey(entryPoint, projectRoot, config);
if (key) hashes.set(key, hash);
}
resolveBundleKey maps each entryPoint to its manifest slot: container:X when the basename matches config.filename, expose:X when the relative path matches an expose config entry, or shared:X when the path contains a node_modules/ package name present in config.shared.
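The comment describes resolveBundleKey without showing it. One possible reading of that description is sketched below; this is not the reviewer's implementation. `ConfigLike` is a simplified, hypothetical stand-in for ModuleFederationConfigNormalized, and the choice of the expose name and package name as the `X` part of each key is an assumption.

```typescript
import * as path from 'node:path';

// Hypothetical, simplified config shape for this sketch.
interface ConfigLike {
  name: string;
  filename: string;
  exposes: Record<string, string>; // expose name -> relative source path
  shared: Record<string, unknown>; // package name -> share options
}

function resolveBundleKeySketch(
  entryPoint: string,
  projectRoot: string,
  config: ConfigLike,
): string | undefined {
  // Container: the entry point's basename matches the configured filename.
  if (path.basename(entryPoint) === config.filename) return `container:${config.name}`;

  // Expose: the project-relative path matches an expose entry.
  const relative = path.relative(projectRoot, entryPoint).split(path.sep).join('/');
  for (const exposeName of Object.keys(config.exposes)) {
    const target = config.exposes[exposeName].replace(/^\.\//, '');
    if (relative === target) return `expose:${exposeName}`;
  }

  // Shared: the path goes through node_modules/<pkg> for a configured shared package.
  const marker = 'node_modules/';
  const idx = relative.lastIndexOf(marker);
  if (idx !== -1) {
    const rest = relative.slice(idx + marker.length);
    for (const pkg of Object.keys(config.shared)) {
      if (rest === pkg || rest.startsWith(`${pkg}/`)) return `shared:${pkg}`;
    }
  }
  return undefined; // not a bundle tracked in the manifest
}
```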
Then in serializer.ts, we accumulate hashes across serializer invocations and rewrite the manifest after each bundle:
export function getModuleFederationSerializer(
mfConfig: ModuleFederationConfigNormalized,
isUsingMFBundleCommand: boolean,
manifestPath?: string, // ← new param, threaded from augmentConfig
): CustomSerializer {
const bundleHashes = new Map<string, string>();
return async (entryPoint, preModules, graph, options) => {
// ... existing serialization logic produces `code` ...
if (manifestPath) {
recordBundleHash(bundleHashes, code, entryPoint, options.projectRoot, mfConfig);
updateManifest(manifestPath, mfConfig, bundleHashes);
}
return code;
};
}
The manifest generation functions (generateMetaData, generateExposes, generateShared) each accept an optional hashes?: BundleHashMap and use it to populate the hash fields, e.g. hash: hashes?.get(`container:${config.name}`) ?? ''.
This also has the nice side effect of making hashes available during dev server builds, not just production bundle-remote runs.
Moving the runtime cache registration out of metroCorePlugin
Remember our earlier discussion about extracting the afterResolve hook to the cache package (comment thread starting at my initial suggestion)? At the time, you raised two very valid points — separation of concerns and zero-config — and I agreed to keep it in metroCorePlugin.
Since then though, I found the global plugins mechanism (__FEDERATION__.__GLOBAL_PLUGIN__, comment here), which addresses the zero-config concern. A consumer can provide their own runtime plugin that uses beforeInit to register into __GLOBAL_PLUGIN__ — add it once to the host config via runtimePlugins, and it automatically applies to all instances including nested remotes. No per-remote configuration needed.
With that in place, the afterResolve hook, extractBundleHashes, buildUrlForSplitBundle, and cache-interface.ts could all move out of mf-core, and metroCorePlugin would go back to its minimal form — just loadEntry + generatePreloadAssets. The asyncRequire changes would stay exactly as they are.
This would keep mf-core cache-agnostic while still providing all the hooks needed for external cache integration. I've been working with this approach and can push the mf-core side as commits on your branch if you'd like to see it in action — should help us get this across the finish line faster.
What do you think?
@jbroma Hey, thanks for your review. Here are my thoughts: the serializer approach is a nice idea, but I'd lean towards keeping hashing in bundle-remote for now. It's the last point before saveBundleAndMap, so we're guaranteed to hash exactly what gets written. If any post-processing were ever added between the serializer and the disk write, serializer-based hashes could silently diverge. As for dev-mode hashing, I intentionally skipped it since local dev server loads are fast enough and caching/polling doesn't add much value there. That said, happy to revisit if you feel strongly about it!
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 46c118120e
| const cacheHandler = (globalThis as any).__FEDERATION__?.__NATIVE__ | ||
| ?.__CACHE__ as |
There was a problem hiding this comment.
Read cache layer from the registered global key
afterResolve stores and uses the cache object at __FEDERATION__.__NATIVE__.__CACHE_LAYER__ (metroCorePlugin.ts), but buildLoadBundleAsyncWrapper looks for __FEDERATION__.__NATIVE__.__CACHE__ here. In environments that implement the new ICacheLayer contract, the loader never sees the cache layer, so split/entry bundle loads bypass caching and the hash registrations done in afterResolve are effectively unused. This makes the new cache integration non-functional unless consumers also set a second, undocumented global.
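The mismatch flagged here can be reproduced in a few lines, and one way to avoid it is to route both the writer and the reader through a single shared key. The constant and helper names below are hypothetical, and which key name the PR should settle on is left open; the sketch only shows why two different literals silently disconnect the two sides.

```typescript
// Hypothetical shared constant: both the writer and the reader use this one key.
const NATIVE_CACHE_KEY = '__CACHE_LAYER__';

// Returns the shared __FEDERATION__.__NATIVE__ object, creating it if needed.
function nativeSlot(): Record<string, unknown> {
  const g = globalThis as any;
  g.__FEDERATION__ = g.__FEDERATION__ || {};
  g.__FEDERATION__.__NATIVE__ = g.__FEDERATION__.__NATIVE__ || {};
  return g.__FEDERATION__.__NATIVE__;
}

function writeCacheLayer(layer: unknown): void {
  nativeSlot()[NATIVE_CACHE_KEY] = layer;
}

function readCacheLayer(): unknown {
  return nativeSlot()[NATIVE_CACHE_KEY];
}

// The situation described above: a reader using a different literal key sees nothing.
function readWithMismatchedKey(): unknown {
  return nativeSlot()['__CACHE__'];
}
```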
There was a problem hiding this comment.
Pull request overview
Adds bundle integrity metadata (SHA-256) to Metro federation manifests and introduces a (global) optional runtime cache-layer integration intended to use these hashes for verified bundle loading.
Changes:
- Compute SHA-256 hashes for container/exposed/shared bundles during bundle-remote and inject them into mf-manifest.json.
- Add a cache-layer contract (ICacheLayer) and register expected bundle hashes / manifest sources in metroCorePlugin.afterResolve.
- Route async bundle loading through an externally-registered cache handler in asyncRequire (with URL normalization for split bundles).
Reviewed changes
Copilot reviewed 5 out of 6 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| packages/sdk/src/types/stats.ts | Adds optional hash fields to stats types used by manifests/build info. |
| packages/metro-core/src/modules/metroCorePlugin.ts | Extracts/registers bundle hashes from manifest into an optional cache layer during afterResolve. |
| packages/metro-core/src/modules/cache-interface.ts | Introduces ICacheLayer contract and documents the intended global attachment point. |
| packages/metro-core/src/modules/asyncRequire.ts | Adds cache-handler interception for __loadBundleAsync, with URL normalization for split bundles. |
| packages/metro-core/src/commands/bundle-remote/index.ts | Computes SHA-256 hashes for emitted bundles and injects them into the manifest. |
| .gitignore | Ignores additional native build artifacts. |
|
|
||
| // entry is always in the root directory of assets associated with remote | ||
| // based on that, we extract the public path from the origin URL | ||
| // e.g. http://example.com/a/b/c/mf-manfiest.json -> http://example.com/a/b/c |
| /** | ||
| * Interface for the global cache layer (`globalThis.__FEDERATION__.__NATIVE__.__CACHE_LAYER__`). | ||
| * | ||
| * Metro-core never imports native-cache directly — it only | ||
| * reads this global, keeping the two packages decoupled. | ||
| */ |
| function addHashes(items: any[] | undefined, isContainer: boolean) { | ||
| if (!Array.isArray(items)) return; | ||
| for (const item of items) { | ||
| const hash = (item as any)?.hash; | ||
| const syncJs = item?.assets?.js?.sync; | ||
| if (hash && syncJs) { | ||
| for (const assetPath of syncJs) { | ||
| // In dev, asset paths use source extensions (.tsx/.ts) — normalize to .bundle | ||
| const bundlePath = assetPath.replace(/\.\w+$/, '.bundle'); | ||
| const bareUrl = resolvedPublicPath | ||
| ? `${resolvedPublicPath.replace(/\/+$/, '')}/${bundlePath.replace(/^\.?\//, '')}` | ||
| : bundlePath; | ||
| const fullUrl = isContainer | ||
| ? buildUrlForEntryBundle(bareUrl) | ||
| : buildUrlForSplitBundle(bareUrl); | ||
| hashes.set(fullUrl, hash); | ||
| } | ||
| } | ||
| } | ||
| } | ||
|
|
||
| addHashes(manifest?.exposes, false); | ||
| addHashes(manifest?.shared, false); |
| const cacheLayer = (globalThis as any).__FEDERATION__?.__NATIVE__ | ||
| ?.__CACHE_LAYER__ as | ||
| | ICacheLayer | ||
| | undefined; | ||
| if (!cacheLayer) return args; |
| if (cacheHandler) { | ||
| await cacheHandler(loadBundleAsync, encodedBundlePath); | ||
| } else { | ||
| result = await loadBundleAsync(encodedBundlePath); |