[WIP] Fix CSE side effect and definition extraction#19125
[WIP] Fix CSE side effect and definition extraction#19125briansull merged 1 commit intodotnet:masterfrom
Conversation
src/jit/gentree.cpp
Outdated
| assert(!IS_CSE_INDEX(node->gtCSEnum)); | ||
|
|
||
| return Compiler::WALK_CONTINUE; | ||
| if (node->OperIs(GT_LCL_VAR)) |
There was a problem hiding this comment.
@AndyAyersMS Broken ref count update in existing CSE code? I see no reason to decrement the ref count for GT_LCL_VAR and ignore GT_LCL_FLD.
There was a problem hiding this comment.
Oops, the existing CSE code is actually here:
Lines 281 to 307 in d38774d
There was a problem hiding this comment.
My ambition is to remove incremental ref counting entirely.
|
No framework JIT diffs. Tests (appart from the newly added test) contain a couple of diffs that show similar CSE ordering issues but that have no ill/visible effects. For example https://github.com/dotnet/coreclr/blob/master/tests/src/JIT/Regression/JitBlue/DevDiv_605447/DevDiv_605447.il generates diff https://gist.github.com/mikedn/1f0551dd02c677ac4453aac4a62ef3a7 The newly added test generates diff https://gist.github.com/mikedn/21d772b9cf2fa614276213233cd2c7b9 Though the diffs (with or without PMI) for -mov rax, 0x218AB502980
-mov rsi, gword ptr [rax]
call Test2:M2():int
-movzx rax, word ptr [reloc classVar[0xe2334af8]]
+mov rax, 0x19713FA2980
+mov r8, gword ptr [rax]
+movzx rax, word ptr [reloc classVar[0xe2694af8]]
mov eax, eax
cdq
-idiv rdx:rax, qword ptr [reloc classVar[0xe2334b10]]
-mov rcx, 0x218AB502978
-mov rdx, rsi
+idiv rdx:rax, qword ptr [reloc classVar[0xe2694b10]]
+mov rcx, 0x19713FA2978
+mov rdx, r8
call CORINFO_HELP_CHECKED_ASSIGN_REFBasically the call to |
|
@briansull This is the fix for #18770. I don't know if it will conflict with your work on #8648, it can stay as WIP until you sort that one out. |
When performing CSE on an use both side effects and CSE defs need to be preserved. Their ordering must be preserved as well because side effect trees (as extracted by
gtExtractSideEffList) may contain CSE defs (potential assignments) and uses.However, the current implementation extracts side effects and CSE defs during 2 separate tree traversals and this makes preserving the original order difficult or downright impossible. The implementation contains a workaround for a specific case - a CSE def having a child side effect. Since the side effect tree was already extracted, attempting to also extract the CSE def would result in the side effect tree having multiple uses. The workaround simply rejects CSE in such cases.
Incorrect reordering can still occur as the following example shows:
Performing CSE on use #3 first extracts the LT subtree due to it having an assignment side effect (the assignment just introduced for CSE #3 def). The remaining NE subtree has a CSE def (CSE #2) that will be extracted separately and appended to the side effect list. This yields:
But these 2 trees are now in reverse order so CSE use #1 occurs before CSE def #1.
The only reasonable way to preserve the correct order seems to be extracting side effects and CSE defs at the same time, during a single tree traversal.
gtExtractSideEffListcan be extended rather easily to also look at CSE defs when the already existingGTF_IS_IN_CSEis specified. Additionally, CSE uses can be unmarked during the same traversal, avoiding the need for another traversal that needs to look at removed trees.While working on this I noticed that
optPropagateNonCSEandoptHasNonCSEChildwere not used so I deleted them. I also deletedGTF_PERSISTENT_SIDE_EFFECTS_IN_CSE, it's used only in 2 places and it's IMO more confusing than ORingGTF_IS_IN_CSEexplicitly.Fixes #18770