ggml-cuda : add rope f16, restore performance with parallel decoding #3272

Merged
ggerganov merged 4 commits into custom-attention-mask from cam-cuda-2 on Sep 20, 2023

Commits

Commits on Sep 19, 2023

Commits on Sep 20, 2023