Add optimizer support for invoke(f, ::CodeInstance, args...) #60442

adienes merged 22 commits into JuliaLang:master

Conversation
Does this/should this do a world-age check?
So just to be clear, I am not proposing introducing a new mechanism here. The new mechanism was already added for v1.12 in #56660. The only thing this PR does is make it so that the mechanism isn't slow to call. I don't really know what one can or cannot do by messing around with the invoke pointer; presumably arbitrary things, but I'm not sure. Regarding a comparison to opaque closures, I think the idea is that these `CodeInstance`s are a bit more 'static', rather than combining runtime data with alternative interpretation. Another difference is that a …

I didn't make the feature though; that was @Keno, so maybe he can motivate it a bit more if the linked PR and my crappy explanation aren't sufficient.
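For readers unfamiliar with the contrast being drawn: opaque closures combine captured runtime data with code, while a `CodeInstance` is just a compiled code entry. A minimal sketch of the opaque-closure side, using the documented `Base.Experimental.@opaque` macro (illustrative only, not code from this PR):

```julia
import Base.Experimental: @opaque

# An opaque closure captures runtime data (here `scale`) at construction time,
# which is the "combining runtime data with alternative interpretation" half of
# the comparison; a CodeInstance carries no such captured state.
scale = 2.0
oc = @opaque x -> scale * x
result = oc(3.0)  # 6.0
```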
I had initially assumed that
For the world-age check, how do we do that? It seems that the
Okay, I have
`invoke(f, ::CodeInstance, args...)`
I have moved all the testing to

This is now quite minimal and hopefully easy to review.
Thank you! Is this backportable to 1.12/1.13?
I think it's OK to backport to 1.13. It does open up a bit of a can of worms, but it's in a new and experimental feature.
…Lang#60442)

~~(I wrote this with assistance from Gemini since I'm not very used to writing LLVM IR)~~ No longer using that code.

This is an attempt to fix JuliaLang#60441. After bumbling around a bit, it seems that the problem is that `invoke(f, ::CodeInstance, args...)` calls are not turned into `Expr(:invoke)` statements in the IR, but remain as `:call`s to the `invoke` builtin, which ends up going through the runtime.

~~There's probably a better way to do this, but the way I found was to just detect these builtin calls in the LLVM IR and send them to `emit_invoke`.~~ I'm now detecting `InvokeCICallInfo`s in the inlining step of the optimizer and turning those into `Expr(:invoke)`s.

It appears to resolve the issue:

```julia
using BenchmarkTools

mysin(x::Float64) = sin(x)
@assert mysin(1.0) == sin(1.0)

const mysin_ci = Base.specialize_method(Base._which(Tuple{typeof(mysin), Float64})).cache
```

Before this PR:

```julia
julia> @btime invoke(mysin, mysin_ci, x) setup=(x=rand())
  24.952 ns (2 allocations: 32 bytes)
0.7024964043721993
```

After this PR:

```julia
julia> @btime invoke(mysin, mysin_ci, x) setup=(x=rand())
  4.748 ns (0 allocations: 0 bytes)
0.32283046823183426
```
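Beyond timing, one hedged way to observe the change described above is to inspect the typed IR of a wrapper around the call: before this PR the statement remains a `:call` to the `invoke` builtin, while with the optimizer support it should appear as an invoke statement. A sketch, reusing the `mysin`/`mysin_ci` definitions from the example (the exact printed form depends on the Julia build):

```julia
# Hypothetical wrapper to inspect how invoke-with-CodeInstance is lowered.
mysin(x::Float64) = sin(x)
const mysin_ci = Base.specialize_method(Base._which(Tuple{typeof(mysin), Float64})).cache

caller(x) = invoke(mysin, mysin_ci, x)

# On a build with this PR, the printed IR should contain an invoke statement
# rather than a plain call to the `invoke` builtin.
code_typed(caller, (Float64,))
```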
(cherry picked from commit e1dda38)