toggle to remove buffering from token streaming #56

@Rose22

I'd like the option to have direct token-by-token streaming instead of kobold's current buffering method. Direct streaming makes things feel much faster and more responsive.

My idea is to have it as a setting you can toggle between kobold's buffering method and the more direct method (which is what llama.cpp does).

Also, you told me in the voice call to remind you to investigate whether there's a way to set how many tokens to buffer per chunk.
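
For what it's worth, here's a rough sketch of how the toggle and the per-chunk token count could be the same setting. This is not kobold's actual code: `stream_tokens` and `buffer_size` are made-up names for illustration, and it assumes the generator hands over tokens one at a time as strings. With `buffer_size=1` every token is sent as soon as it arrives (the direct mode), and anything larger groups that many tokens per chunk.

```python
from typing import Iterable, Iterator, List


def stream_tokens(tokens: Iterable[str], buffer_size: int = 1) -> Iterator[str]:
    """Yield text chunks made of up to `buffer_size` tokens each.

    Hypothetical helper, not part of kobold or llama.cpp.
    buffer_size=1 means direct token-by-token streaming.
    """
    if buffer_size < 1:
        raise ValueError("buffer_size must be at least 1")

    pending: List[str] = []
    for token in tokens:
        pending.append(token)
        # Flush as soon as the buffer holds the requested number of tokens.
        if len(pending) >= buffer_size:
            yield "".join(pending)
            pending.clear()

    # Flush whatever is left when generation ends mid-chunk.
    if pending:
        yield "".join(pending)


if __name__ == "__main__":
    demo = ["Hel", "lo", ",", " wor", "ld"]
    print(list(stream_tokens(demo, buffer_size=1)))  # direct mode
    print(list(stream_tokens(demo, buffer_size=2)))  # buffered mode
```

A single integer setting like this would cover both requests: the toggle is just switching between 1 and some larger default.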
