Llama2 70b support by bretello · Pull Request #522 · abetlen/llama-cpp-python

bretello · 2023-07-24T13:56:33Z

Add support for llama2 70b (uses n_gqa parameter)

See #488 (comment) and ggml-org/llama.cpp#2276
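For readers unfamiliar with the parameter: n_gqa is the grouped-query attention factor, i.e. how many query heads share one key/value head. The sketch below only illustrates that arithmetic; the helper name `gqa_factor` is hypothetical and not part of this PR, and the head counts are the published Llama 2 70B figures (64 attention heads, 8 key/value heads).

```python
def gqa_factor(n_head: int, n_head_kv: int) -> int:
    """Return how many query heads share each key/value head."""
    if n_head <= 0 or n_head_kv <= 0 or n_head % n_head_kv != 0:
        raise ValueError("n_head must be a positive multiple of n_head_kv")
    return n_head // n_head_kv


# Llama 2 70B: 64 attention heads grouped over 8 key/value heads.
N_GQA_70B = gqa_factor(64, 8)
print(N_GQA_70B)  # 8
```

The smaller Llama 2 models (7B, 13B) use one key/value head per query head, so the factor is 1 and the parameter can be left unset for them.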

bretello · 2023-07-24T15:58:29Z

From my limited testing, it seems that the GPU is not being used when passing n_gpu_layers=1, even when compiled with Metal. I haven't tried running other models yet, but I'm probably going to work on this in the coming days.

oobabooga · 2023-07-24T16:07:00Z

I ran a test and got 70b working with GPU acceleration by compiling it like this (probably half of it is redundant):

pip uninstall llama-cpp-python
git clone 'https://github.com/bretello/llama-cpp-python' -b llama2-70b-support
cd llama-cpp-python/
git submodule update --init --recursive
cd vendor/llama.cpp/
make LLAMA_CUBLAS=1 -j8
cd ../..
CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install . --no-cache-dir

oobabooga · 2023-07-24T16:24:54Z

Maybe rms_norm_eps is also necessary?

ggml-org/llama.cpp#2374

It seems like it currently needs to be manually set to 1e-5 for llama-2 (the default is 1e-6).
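As a rough sketch of what this means on the Python side, the two settings discussed in this thread could be collected into constructor keyword arguments as below. The helper `llama2_init_kwargs` and the model path are hypothetical; the keyword names n_gqa, rms_norm_eps and n_gpu_layers are the ones used in this thread, and whether a given release of llama-cpp-python accepts them should be checked against its own documentation.

```python
from typing import Any, Dict


def llama2_init_kwargs(
    model_path: str, is_70b: bool, n_gpu_layers: int = 0
) -> Dict[str, Any]:
    """Build keyword arguments for loading a Llama 2 GGML model (illustrative only)."""
    kwargs: Dict[str, Any] = {
        "model_path": model_path,
        "n_gpu_layers": n_gpu_layers,
        # Llama 2 needs 1e-5; the default mentioned in the thread is 1e-6.
        "rms_norm_eps": 1e-5,
    }
    if is_70b:
        # Grouped-query attention factor for the 70B model (64 heads / 8 KV heads).
        kwargs["n_gqa"] = 8
    return kwargs


# Hypothetical model path, for illustration only.
kwargs = llama2_init_kwargs("./llama-2-70b.ggmlv3.q4_0.bin", is_70b=True, n_gpu_layers=1)
print(kwargs)

# Assumed usage, requires llama-cpp-python and a real model file:
# from llama_cpp import Llama
# llm = Llama(**kwargs)
```

Keeping the values in a plain dict makes it easy to drop n_gqa for the smaller models while still overriding rms_norm_eps for all Llama 2 variants.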

bretello · 2023-07-24T17:39:53Z

@oobabooga Rebased on main (v0.1.76), now that's also included.

abetlen · 2023-07-24T17:48:18Z

@bretello thanks for rebasing, I'll merge this in and publish a new release since I don't think it can work without setting that param during initialisation.

oobabooga mentioned this pull request Jul 24, 2023

Add llama-2-70b GGML support oobabooga/text-generation-webui#3285

Merged

add support for llama2 70b

0f09f10

bretello force-pushed the llama2-70b-support branch from fae32b5 to 0f09f10 Compare July 24, 2023 17:39

abetlen merged commit e4431a6 into abetlen:main Jul 24, 2023

bretello deleted the llama2-70b-support branch July 24, 2023 18:05

CorentinWicht mentioned this pull request Sep 8, 2023

Issues loading Facebook-LLaMA2-70B serge-chat/serge#675

Closed

2 tasks
