Use cublas<t>matinvBatched() for N <= 32 #739

Merged: cliffburdick merged 1 commit into main from optimize-inv-operator-for-small-systems on Aug 27, 2024

@tbensonatl
Collaborator

Use the cublas<t>matinvBatched() family of functions to invert linear systems of size N <= 32. This has two advantages over the more general pair of getrfBatched() and getriBatched() functions:

  1. Higher performance, since a single kernel replaces the two split kernels.
  2. The matinv functions support in-place transforms and leave the input unmodified for out-of-place transforms, so no temporary input work buffer is needed when the input is a tensor view.
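
The size-based choice described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `InvPath`, `InvPlan`, `plan_inverse`, and `kMatinvMaxN` are hypothetical names, and the rule that the split path copies the input only when it is a tensor view is an assumption drawn from the description. The N <= 32 limit matches the documented restriction of cublas<t>matinvBatched(), and getrfBatched() overwrites its input with the LU factors.

```cpp
#include <cassert>

// Hypothetical names for illustration only; not the library's internals.
enum class InvPath {
  MatinvBatched,      // single kernel: cublas<t>matinvBatched()
  GetrfGetriBatched   // split kernels: getrfBatched() then getriBatched()
};

// cublas<t>matinvBatched() is documented to support only n <= 32.
constexpr int kMatinvMaxN = 32;

struct InvPlan {
  InvPath path;           // which cuBLAS routine family to call
  bool needs_input_copy;  // whether a temporary input work buffer is required
};

// Choose the inversion path for a batch of n x n matrices.
// input_is_view: the input is a tensor view that must not be overwritten
// (assumed here to be the only case that needs a copy).
inline InvPlan plan_inverse(int n, bool input_is_view) {
  assert(n > 0);
  if (n <= kMatinvMaxN) {
    // matinv leaves the input untouched for out-of-place transforms,
    // so no work buffer is needed even for a view.
    return {InvPath::MatinvBatched, false};
  }
  // getrfBatched() factors in place and overwrites its input with LU,
  // so a view has to be copied into a temporary buffer first.
  return {InvPath::GetrfGetriBatched, input_is_view};
}
```

With this sketch, `plan_inverse(16, true)` selects the single-kernel path with no copy, while `plan_inverse(64, true)` falls back to the split pair and requests a temporary buffer.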

@tbensonatl self-assigned this on Aug 27, 2024
@tbensonatl
Collaborator Author

/build

@coveralls

Coverage Status

coverage: 93.386% (-0.02%) from 93.406%
when pulling c6ae9fa on optimize-inv-operator-for-small-systems
into 77f2901 on main.

@cliffburdick merged commit d9053d6 into main on Aug 27, 2024
@cliffburdick deleted the optimize-inv-operator-for-small-systems branch on August 27, 2024 18:29
@cliffburdick
Collaborator

/build
