Skip to content

Conversation

@Ayoub-Mabrouk
Copy link

Optimizes neverIndex lookup by replacing Array.includes() with SafeSet.has() for O(1) lookup performance.

  • SafeSet instead of Array: O(1) hash-based lookup vs O(n) linear search
  • for loop instead of .map(): Direct Set construction avoids intermediate array allocation

Replace array-based neverIndex lookup with SafeSet for O(1) lookup
performance. This improves buildNgHeaderString performance from
O(n) to O(1) per header, providing significant speedup in hot path
scenarios with many headers.
@nodejs-github-bot
Copy link
Collaborator

Review requested:

  • @nodejs/http2
  • @nodejs/net

@nodejs-github-bot nodejs-github-bot added http2 Issues or PRs related to the http2 subsystem. needs-ci PRs that need a full CI run. labels Dec 30, 2025
@codecov
Copy link

codecov bot commented Dec 30, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 88.54%. Comparing base (5c8ce91) to head (0046eff).
⚠️ Report is 10 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #61223      +/-   ##
==========================================
+ Coverage   88.52%   88.54%   +0.01%     
==========================================
  Files         703      703              
  Lines      208613   208616       +3     
  Branches    40252    40238      -14     
==========================================
+ Hits       184684   184722      +38     
+ Misses      15951    15909      -42     
- Partials     7978     7985       +7     
Files with missing lines Coverage Δ
lib/internal/http2/util.js 92.71% <100.00%> (+0.02%) ⬆️

... and 31 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@VoltrexKeyva VoltrexKeyva added the request-ci Add this label to start a Jenkins CI on a PR. label Dec 30, 2025
@github-actions github-actions bot removed the request-ci Add this label to start a Jenkins CI on a PR. label Dec 30, 2025
@nodejs-github-bot
Copy link
Collaborator

const neverIndex = sensitiveHeaders.map((v) => v.toLowerCase());
const neverIndex = new SafeSet();
for (let i = 0; i < sensitiveHeaders.length; i++) {
neverIndex.add(sensitiveHeaders[i].toLowerCase());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
neverIndex.add(sensitiveHeaders[i].toLowerCase());
neverIndex.add(StringPrototypeToLowerCase(sensitiveHeaders[i]));

?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The usage of primordials is not advised in some cases like http2
I guess it's been stated here

https://github.com/nodejs/node/blob/main/doc/contributing/primordials.md

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

then wouldn't SafeSet, also a primordial, be discouraged?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SafeSet is already imported and used throughout this file and the http2 module it’s the standard safe data structure in Node.js core. The primordials.md guidance about avoiding primordials in http2 refers to the verbose primordial methods (like ArrayPrototypeMap, StringPrototypeToLowerCase), not safe data structures like SafeSet.

@ljharb ljharb added the needs-benchmark-ci PR that need a benchmark CI run. label Dec 31, 2025
@ljharb
Copy link
Member

ljharb commented Dec 31, 2025

it's also worth noting that v8 has long optimized .map in this case so that there's no intermediate array allocation to speak of.

@Ayoub-Mabrouk
Copy link
Author

it's also worth noting that v8 has long optimized .map in this case so that there's no intermediate array allocation to speak of.

The main optimization here is SafeSet.has() (O(1)) vs Array.includes() (O(n)) for lookups. The construction method is secondary, happy to switch back to .map() if preferred

Copy link
Member

@ovflowd ovflowd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, won't be a massive performance improvement, but going from O(n) to O(1) is a good win here and the diff is small enough and clean enough.

SGTM.

@ovflowd
Copy link
Member

ovflowd commented Dec 31, 2025

it's also worth noting that v8 has long optimized .map in this case so that there's no intermediate array allocation to speak of.

Jordan has a point here, but the line of optimization is soo thin that in this scenario maybe relying on explicit optimizations is worth it. That said, these are "technicalities" at best, both the old diff and the proposed changes should work perfectly fine without performance issues or observable performance difference. (Jordan knows way better than me here tho)

@Ayoub-Mabrouk
Copy link
Author

That's a great point about V8's .map() optimization, thank you @ovflowd for clarifying! I went with the explicit for loop since I've seen feedback from collaborators about preferring explicit loops over iteration-based methods as mentioned in primordials.md. I'm just trying to follow the contribution guidelines as closely as possible, but I appreciate you both pointing out the nuances here, you and @ljharb clearly have way more experience with these V8 optimization details than I do. Happy to adjust to whatever approach you think is best!

@RafaelGSS RafaelGSS added the performance Issues and PRs related to the performance of Node.js. label Dec 31, 2025
@ljharb
Copy link
Member

ljharb commented Dec 31, 2025

the iterator protocol should be staunchly avoided, but array iteration methods are wonderful :-) a for loop is fine, and more obvious that it's performant, it's just not more performant than .map and friends.

@Ayoub-Mabrouk
Copy link
Author

the iterator protocol should be staunchly avoided, but array iteration methods are wonderful :-) a for loop is fine, and more obvious that it's performant, it's just not more performant than .map and friends.

Thanks for the thorough explanation, really appreciate the clarification on iterator protocol vs array methods! I'm completely flexible here, if you'd prefer .map() for this case, I'm happy to update it immediately. Otherwise, I'm fine keeping the explicit for loop. Either way, I'd love to get your approval if you're satisfied with the overall approach. Just let me know your preference!

Copy link
Member

@pimterry pimterry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, but I don't think this will be an improvement in practice.

I agree with all the discussion in terms of the general case for this kind of logic, it's a good optimization approach most of the time, but it misses the reality here: in 99.99% of cases this array is empty. Nearly nobody uses explicit sensitive header configuration - the defaults within nghttp2 itself already cover authorization headers & many cookies, and I don't think many users are aware of it anyway. In the few remaining cases it'll usually contain one single element. O(N) vs O(1) isn't relevant.

In practice, I expect the overhead of set creation vs array creation will dominate this change, and the impact on the vast majority case outweighs the performance boost in the extremely niche scenario where this array is large. It's not a big difference in either case, but I do think overall this'll make http2 fractionally slower for almost all real-world usage.

Doing a very rough local benchmark, I see empty set creation being about 20x slower than empty array creation, empty set.has being about 10x slower than empty array;includes, and set.has vs array.includes taking about the same amount of time for single-element arrays. There's little benefit until the set gets larger (and this one nearly never will).

Open to benchmarks or arguments against if I've missed something of course. If we do want to improve this, imo the best gains would be in optimizing this for the empty-array case, to skip the map & includes entirely there (if we can show a benchmark that actually improves things in that very common scenario).

@ovflowd
Copy link
Member

ovflowd commented Dec 31, 2025

@pimterry quite insightful, had no idea this specific Set creation was that much slower. If that's the case, it completely diminishes the point of this PR as the creation itself would worse the overall performance.

I'm unfamiliar with the defaults here and the callstack chain of this method so... appreciate for clarifying.

@Ayoub-Mabrouk
Copy link
Author

Sorry, but I don't think this will be an improvement in practice.

I agree with all the discussion in terms of the general case for this kind of logic, it's a good optimization approach most of the time, but it misses the reality here: in 99.99% of cases this array is empty. Nearly nobody uses explicit sensitive header configuration - the defaults within nghttp2 itself already cover authorization headers & many cookies, and I don't think many users are aware of it anyway. In the few remaining cases it'll usually contain one single element. O(N) vs O(1) isn't relevant.

In practice, I expect the overhead of set creation vs array creation will dominate this change, and the impact on the vast majority case outweighs the performance boost in the extremely niche scenario where this array is large. It's not a big difference in either case, but I do think overall this'll make http2 fractionally slower for almost all real-world usage.

Doing a very rough local benchmark, I see empty set creation being about 20x slower than empty array creation, empty set.has being about 10x slower than empty array;includes, and set.has vs array.includes taking about the same amount of time for single-element arrays. There's little benefit until the set gets larger (and this one nearly never will).

Open to benchmarks or arguments against if I've missed something of course. If we do want to improve this, imo the best gains would be in optimizing this for the empty-array case, to skip the map & includes entirely there (if we can show a benchmark that actually improves things in that very common scenario).

Thanks for looking into this! A few thoughts:
The set creation happens once at module load, not per request - even if it’s slower to create, that’s a one-time startup cost. The .has() lookup runs in the hot path every time buildNgHeaderString is called, which is where the O(1) vs O(n) matters.
Re: the benchmarks - I’d genuinely be interested to see them if you have numbers to share. “Rough local benchmarks” are hard to evaluate without the actual data and methodology. If there’s a real performance regression in the common case, I’m definitely open to seeing it.
Also worth noting that users who do configure sensitive headers (security-focused apps, enterprise setups, etc.) would benefit from this. Optimizing only for the empty array case means ignoring their use case entirely.
If you think this makes things slower, would you be able to share benchmark results showing that? Otherwise I’m not sure how to address the concern without concrete data.

@Ayoub-Mabrouk
Copy link
Author

@pimterry quite insightful, had no idea this specific Set creation was that much slower. If that's the case, it completely diminishes the point of this PR as the creation itself would worse the overall performance.

I'm unfamiliar with the defaults here and the callstack chain of this method so... appreciate for clarifying.

Thanks for taking another look! Just want to clarify the concern about set creation:
The set is created once when the module loads, not per-request. Even if set creation has overhead, it’s a one-time startup cost. The lookup (.has() vs .includes()) runs repeatedly in buildNgHeaderString, which is where the O(1) improvement matters.
The benchmarks mentioned were described as “rough” without specific numbers shared. I think it’d be helpful to see actual data before drawing conclusions about performance impact.
Appreciate you both taking time to review this carefully!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

http2 Issues or PRs related to the http2 subsystem. needs-benchmark-ci PR that need a benchmark CI run. needs-ci PRs that need a full CI run. performance Issues and PRs related to the performance of Node.js.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants