feat: add a local endpoint type for inference directly from chat-ui #1778
Merged
Something is going wrong in the build step... Found this relevant issue, trying to fix.
Works well. You can do something like […]; it will automatically use your GPU if available and download models to the […]. It's still super rough, as it doesn't handle running out of memory gracefully, so I'm still working on dealing with this better. I also want to automatically expose any .gguf files in the […]
Merging for now, it works well in local testing! Will update the docs to explain this when I'm done with the quick setup.
csanz91 pushed a commit to csanz91/chat-ui that referenced this pull request on Apr 7, 2025:
…uggingface#1778)
* feat: add a local endpoint type running llama.cpp from chat-ui
* fix: build image
* fix: lock file
* wip: try to make it more reliable
* feat: load chat template from .gguf file
* feat: load gguf models from `models/` folder
* fix: default config
* feat: make endpoint use chatSession instead of completion
* refactor: improve exit handling, exit immediately on second sinal
* fix: various fixes to improve reliability when calling multiple models at once
* docs: add instructions for adding .gguf files to the models directory
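The "exit immediately on second signal" behaviour mentioned in the commit list can be sketched roughly as below. This is a minimal illustration, not the code from this PR: `createSignalHandler`, `cleanup`, and `exit` are hypothetical names, and the exit function is injected so the logic can be exercised without killing the process.

```typescript
// Hypothetical helper: not taken from the chat-ui source.
type ExitFn = (code: number) => void;

// Returns a handler meant to be registered for SIGINT/SIGTERM.
// First call: start graceful cleanup, then exit with 0 (or 1 if cleanup fails).
// Any later call: exit right away with 1 without waiting for cleanup.
function createSignalHandler(cleanup: () => Promise<void>, exit: ExitFn): () => void {
  let signalCount = 0;
  return () => {
    signalCount += 1;
    if (signalCount > 1) {
      // Second signal: the user is insisting, so stop waiting.
      exit(1);
      return;
    }
    cleanup().then(
      () => exit(0),
      () => exit(1)
    );
  };
}
```

In a Node.js process this could be wired up with something like `process.on("SIGINT", handler)` and `process.on("SIGTERM", handler)`, passing `process.exit` as the `exit` argument and a function that unloads models as `cleanup`.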
csanz91 pushed a commit with the same message to csanz91/chat-ui that referenced this pull request on Apr 24, 2025
maksym-work pushed a commit with the same message to siilats/chat-ui that referenced this pull request on Jul 2, 2025
Matsenas pushed a commit with the same message to Matsenas/chat-ui that referenced this pull request on Jul 4, 2025
Matsenas pushed a commit with the same message to Matsenas/chat-ui that referenced this pull request on Jul 4, 2025
Part of #1774
`models/` as a model if `MODELS` is undefined
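A rough sketch of that fallback, assuming the directory listing has already been read (for example with Node's `fs.readdirSync`). The `LocalModelConfig` shape, the `"local"` type string, the `modelPath` field, and both function names are assumptions made for illustration; the actual chat-ui config schema may differ.

```typescript
// Hypothetical model entry shape; not confirmed against the chat-ui schema.
interface LocalModelConfig {
  name: string;
  endpoints: { type: "local"; modelPath: string }[];
}

// Turn a directory listing into model entries, keeping only .gguf files.
function ggufFilesToModels(fileNames: string[], modelsDir = "models"): LocalModelConfig[] {
  return fileNames
    .filter((f) => f.toLowerCase().endsWith(".gguf"))
    .sort()
    .map((f): LocalModelConfig => ({
      // Model name is the file name without its extension.
      name: f.slice(0, -".gguf".length),
      endpoints: [{ type: "local", modelPath: `${modelsDir}/${f}` }],
    }));
}

// Prefer the explicitly configured models; fall back to discovered files
// only when no configuration was provided at all.
function resolveModels(
  configured: LocalModelConfig[] | undefined,
  fileNames: string[]
): LocalModelConfig[] {
  return configured !== undefined ? configured : ggufFilesToModels(fileNames);
}
```

Keeping the discovery logic as a pure function over file names makes it easy to test without touching the filesystem; the caller decides how `MODELS` is parsed and how the folder is listed.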