Use size based MultiplicativeInverse to speedup sequential access of ReshapedArray #43518
N5N3 merged 6 commits into JuliaLang:master from
Update reshapedarray.jl
This still appears to be a decent performance improvement (and non-conflicted): on my machine, 2 ms --> 1.6 ms. Any interest in picking it back up?
I believe it is because of the sequence of …, and note that the improvement can get quite large when dimensionality is high; on e.g. … this PR is 3x faster than master.
seems reasonable to me! |
The test error is real, but weird. I reduced it to this: … which works on master but segfaults on this PR. I doubt this is technically this PR's fault, but I will keep trying to understand why it happens anyway.
Fixed after #59525.
I hope it is OK that I pushed directly here. I also changed the …
Thanks so much for picking this up and exploring the root cause of the improvements, @adienes!
This performance difference was found when working on #42736.
Currently, our `ReshapedArray` uses a stride-based `MultiplicativeInverse` to speed up index transformation. For example, for `a::AbstractArray{T,3}` and `b = vec(a)`, the index transformation is equivalent to recovering `(i1, i2, i3)` from the linear index via `divrem` by the precomputed strides (each `stride` divisor is wrapped in a `MultiplicativeInverse` to accelerate the `divrem`). This PR replaces that machinery with a size-based transformation, dividing by `size(a, d)` one dimension at a time.
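To make the two schemes concrete, here is a hypothetical sketch (my own names, not the PR's actual code) of the linear-to-Cartesian transformation for a 3-d array, once stride-based and once size-based; in Base each fixed divisor would be wrapped in a `MultiplicativeInverse`:

```julia
# Hypothetical sketch, not the PR's code: recover (i1, i2, i3) in a
# from a linear index i into b = vec(a), for a of size sz.

# Stride-based: divide by precomputed strides, largest dimension first.
function stride_based(i::Int, sz::NTuple{3,Int})
    s2 = sz[1]             # stride of dimension 2
    s3 = sz[1] * sz[2]     # stride of dimension 3
    i -= 1
    i3, r  = divrem(i, s3) # each divisor would be a MultiplicativeInverse
    i2, i1 = divrem(r, s2)
    return (i1 + 1, i2 + 1, i3 + 1)
end

# Size-based: peel off one dimension at a time with divrem by size(a, d).
function size_based(i::Int, sz::NTuple{3,Int})
    i -= 1
    i, i1  = divrem(i, sz[1])
    i3, i2 = divrem(i, sz[2])
    return (i1 + 1, i2 + 1, i3 + 1)
end
```

Both agree with `CartesianIndices(sz)[i]`; the difference is only in which divisors get precomputed and reused.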
For random access, the two should have the same computational cost, but for sequential access, like `sum(b)`, the size-based transformation seems faster. To avoid a bottleneck from IO, use `reshape(::CartesianIndices, x...)` to benchmark. I haven't looked into the reason for this performance difference.
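A hedged benchmark sketch along those lines (the sizes and the use of BenchmarkTools are my own choices, not taken from the PR):

```julia
using BenchmarkTools  # assumed available; any timing harness would do

# An index-only source: reading its elements costs no memory traffic,
# so the timing is dominated by the linear -> Cartesian transformation.
a = CartesianIndices((7, 11, 13, 17))
b = reshape(a, :)  # a ReshapedArray exercising the transformation per element

@btime sum($b)     # sequential access; compare this PR against master
```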
Besides acceleration, this also makes it possible to reuse the `MultiplicativeInverse` in some cases (like #42736). So I think it might be useful?
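For reference, a small example of the building block itself (note that `Base.MultiplicativeInverses` is internal API, so names and behavior may change between Julia versions):

```julia
# Division by a fixed divisor, rewritten as multiply-plus-shift at
# construction time so hot loops avoid hardware division.
using Base.MultiplicativeInverses: SignedMultiplicativeInverse

d = SignedMultiplicativeInverse(7)  # precompute the inverse once
divrem(123, d)                      # same result as divrem(123, 7)
```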