[ORCA] Fix flaky "Invalid key is inaccessible" fallback (#15147) #437
In the CI pipeline there were occasional test failures due to ORCA fallback, with the following stack trace.
A core dump of the failure showed that CMemoryPool::m_hash_key held the invalid key value 0xffffffff. Hence, the query raised an assertion error and fell back to PLANNER.
The issue is that CMemoryPool::m_hash_key was never directly initialized. This suggests that it relied on uninitialized memory to produce randomness in the key. When that memory contains 0xffffffff in just the right place, the value of CMemoryPool::m_hash_key is the invalid key and ORCA falls back.
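The failure mode described above can be sketched in isolation. This is a minimal illustration, not the ORCA code or the actual fix: `HashKey`, `kInvalidKey`, `KeyFromRawMemory`, and `Pool` are hypothetical names, and the 32-bit key width is an assumption chosen only to match the 0xffffffff value quoted from the core dump. The raw bytes are copied with `memcpy` so the demonstration itself does not read an indeterminate value.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-ins; none of these names come from the ORCA sources.
using HashKey = std::uint32_t;
constexpr HashKey kInvalidKey = 0xffffffffu;  // sentinel meaning "no valid key"

// Models a key that is never assigned: its value is whatever bytes the
// allocator left behind. fill_byte plays the role of that leftover content.
HashKey KeyFromRawMemory(unsigned char fill_byte) {
    unsigned char raw[sizeof(HashKey)];
    std::memset(raw, fill_byte, sizeof(raw));
    HashKey key;
    std::memcpy(&key, raw, sizeof(key));  // well-defined copy of the raw bytes
    return key;
}

// One possible remedy (an assumption, not necessarily what the patch does):
// always assign the key in the constructor and never let it equal the sentinel.
struct Pool {
    HashKey hash_key;

    explicit Pool(HashKey seed)
        : hash_key(seed == kInvalidKey ? seed - 1 : seed) {}

    bool HasValidKey() const { return hash_key != kInvalidKey; }
};
```

With leftover bytes of 0xFF, `KeyFromRawMemory(0xFF)` equals `kInvalidKey`, which is the situation seen in the core dump. A `Pool` built through the constructor always reports `HasValidKey()`, whatever seed it is given.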
The following patch demonstrates the issue:
```
diff src/backend/utils/mmgr/aset.c
@@ -989,6 +989,8 @@ AllocSetAlloc(MemoryContext context, Size size)
```
A few lines above that patch, you can see that when compiled with RANDOMIZE_ALLOCATED_MEMORY the memory is randomly initialized. So we can make no assumptions about the contents of uninitialized memory, meaning that 0xffffffff is a possible value.
Note: This failure seemed to manifest more commonly with JIT ICW runs.
(cherry picked from commit 2c7152f46aced9328d86dc1025d0395fcf467455)