inference: reinfer and track missing code for inlining#59413
Conversation
|
I'm aware of a couple inlining issues, e.g. #58906, and and (though these might not all be strictly just inlining). This does not fix any of them, but we should try to find an example. |
|
No, those are the sort of issues. This is more related to precompile files causing performance to suffer later. (and that first example is already fixed on master, I think by c05b68f) |
f3fc258 to
9b3d3f1
Compare
c4c85f3 to
21158a0
Compare
Compiler/src/typeinfer.jl
Outdated
| return ci_has_source(interp, code) | ||
| end | ||
|
|
||
| function ci_get_source(interp::AbstractInterpreter, code::CodeInstance) |
There was a problem hiding this comment.
I think this definition should only be used for NativeInterpreter, and for AbstractInterpreter it should be defined to perform a lookup to code_cache by default?
The current test failure should also be fixed by this.
There was a problem hiding this comment.
Yes, using codegen_cache felt like a hack I added just for getting the PR started. With your refactoring, I think we should not need this anymore for any interpreter (since the code should now be in the regular local cache)? I'm not really sure there is a point to having separate codegen_cache and local_cache--seems like we can and should just merge those, so that everything done in this AbstractInterpreter is now predictably in local_cache; and optionally also in global_cache.
2894282 to
5b2f8e7
Compare
|
I believe I finally tracked down the last failing test to be an IRInterp problem. While the normal interpreter prepares code for successful inlining with update stmt info, the IRInterp appears that it may not. So since it appears that disabling IRInterp bypasses that bug, it should be possible to make sure IRInterp is doing the correct work in |
5b2f8e7 to
077c3f2
Compare
|
I finally tracked it down even more fully. It is a known implementation problem with IRInterp that it degrades code optimization quality when used, which has had a TODO in the code from #48913, but it was previously hidden from tests by the combination of luck, this bug, #58183, and that we have incomplete test coverage of when exactly inlining must occur. I put a hack in place to ensure IRInterp is used no more nor less after this PR than before, so that it cannot cause any new regressions than already existed. |
077c3f2 to
43944c6
Compare
43944c6 to
f1b6506
Compare
|
Thanks for looking into the test failures.
Could you elaborate what specific fixes you made for this? I thought most of the tests newly marked as |
|
Seems good. It is a little awkward intermediate state, since once this PR is done, we should deprecate
Prohibits IRInterp unless the method body is inlineable. That aligns with the old implementation, before this PR fixed the problems with getting code and caused the worse code generated by IRInterp to be caught by an existing test. |
|
Looks like the other current test failure just needs to be relaxed somewhat on ordering to just make sure flbi is invalidating useflbi: diff --git a/test/precompile.jl b/test/precompile.jl
index 04ec1b2e007..73dbd7b99c6 100644
--- a/test/precompile.jl
+++ b/test/precompile.jl
@@ -1124,9 +1124,9 @@ precompile_test_harness("code caching") do dir
idxv = findnext(==("verify_methods"), invalidations, idxv+1)
end
@test invalidations[idxv-1].def.def.name === :flbi
- idxv = findnext(==("verify_methods"), invalidations, idxv+1)
- @show invalidations
- @test invalidations[idxv-1].def.def.name === :useflbi
+ idxv = findnext(==(invalidations[idxv-1]), invalidations, idxv+1)
+ @test invalidations[idxv-1] == "verify_methods"
+ @test invalidations[idxv-2].def.def.name === :useflbi
m = only(methods(MB.map_nbits))
@test !hasvalid(m.specializations::Core.MethodInstance, world+1) # insert_backedges invalidations also trigger their backedges |
f1b6506 to
5f16afd
Compare
I agree. |
95a8224 to
5a1faf4
Compare
|
d80d173 to
c083fd9
Compare
When code is potentially needed for inlining, but missing, be sure to regenerate it during inference with the correct `ci_meets_requirement` flags (SOURCE_MODE_GET_SOURCE instead of NOT_REQUIRED). This removes the dependence on what had happened to be in the cache when deciding optimization heuristic decisions. To achieve this, this commit comprehensively refactors the data structures for passing source from inference to inlining. Specifically it introduces the `InferredCallResult` type as a centralized data structure for holding source information, which is stored in the `info.call_results` field of `info::MethodMatchInfo`. The `info.call_results` field is managed in the same way as `info.edges`, making the code simpler. Also be sure to use CACHE_MODE_LOCAL, since it needs to actually end up in the local cache (as we know the global cache already has it, but volatile may discard it before inlining). This simplifies the role of many steps, since inference has complete responsibility for making all IPO decisions, and the optimizer and codegen have only the ability to consume and change the code inside of a function. The special invoke resolver for inlining is simplified now by delegating to general resolver code: the special resolver for :invoke-d calls now uses info.result from InvokeCallInfo and delegates to the general resolve_todo function instead of reimplementing the inlining logic. It should be noted that `ConstCallInfo` has been removed. This would be a breaking change for packages like Cthulhu, but since no information is lost, preserving existing functionality should not be impossible. Fix a finalizer optimizer bug: declared type was Nothing, which does not match runtime behavior and results in segfaults if code is reachable. 🤖 Produced with some use of Claude Code
Prevents cache entries being created with wider applicability than actually valid, from global cache :invoke lookup inlining. Delay updating CodeInstance fields until we can fill them all in correctly at once, avoiding torn state updates.
Incomplete work (other than not being in the cache) should be kept out of being reachable from any part of the cache. Only populate the parts that we know the optimizer will unfortunately try to look at.
c083fd9 to
0d19993
Compare
When code is potentially needed for inlining, but missing for any reason, be sure to regenerate it during inference with the correct
ci_meets_requirementflags (SOURCE_MODE_GET_SOURCE instead of NOT_REQUIRED) so it is prepared for the optimizer if needed.This was supposed to be the correct fix for someone else's bug with missing inlining when expected, but I don't remember what bug it was anymore.
Fixes #59846