Conversation

@jeremylt
Member

@jeremylt jeremylt commented Nov 15, 2024

This PR adds AtPoints support to /gpu/[cuda,hip]/shared, which is a prerequisite for an AtPoints capability in /gpu/[cuda,hip]/gen.

Work in progress. The t35* series tests pass, but some of the t59* series tests still fail.

There seems to be an issue with the transpose interp and grad in 2D and 3D when there is more than one element.

The libCEED t* tests now pass, but petsc-bpsswarm isn't passing yet. A few Ratel tests (the MMS tests, it seems) also get stuck, probably on a __syncthreads().

@jeremylt jeremylt self-assigned this Nov 15, 2024
@jeremylt jeremylt force-pushed the jeremy/shared-at-points branch 6 times, most recently from 37e9e49 to 514184d Compare November 18, 2024 20:28
@jeremylt
Member Author

Yay - it works locally but fails on Noether. Those are always fun to debug.

@jeremylt
Member Author

CUDA is more betterer now. I still need to check the thread block size setup for HIP in 3D.

@jeremylt jeremylt force-pushed the jeremy/shared-at-points branch 7 times, most recently from dcfc06f to ff3d054 Compare November 19, 2024 23:24
@jeremylt
Member Author

Ok, now every element other than the first in 3D is wrong for HIP. Progress, but it's super bizarre that it doesn't behave as I expect. The logic is almost the same as for CUDA, so I must be missing some subtle difference between HIP shared and CUDA shared, since those two have slightly diverged.

@jeremylt jeremylt force-pushed the jeremy/shared-at-points branch from 88a3d5f to ebcc04c Compare November 21, 2024 19:43
@jeremylt jeremylt force-pushed the jeremy/shared-at-points branch from ebcc04c to 1b3d9bd Compare November 21, 2024 20:20
@jeremylt
Member Author

Ok, the kernels work now. We can optimize them in the future.

@jeremylt
Member Author

Local testing for Ratel passes

@jeremylt jeremylt mentioned this pull request Dec 2, 2024
Collaborator

@zatkins-dev zatkins-dev left a comment

Overall looks good! I have a few clarifying questions, but I think this is good to merge.

co-authored-by: zatkins-dev <zach.atkins@colorado.edu>
@jeremylt jeremylt merged commit 290fc47 into main Dec 2, 2024
@jeremylt jeremylt deleted the jeremy/shared-at-points branch December 2, 2024 23:27