toggle to remove buffering from token streaming

I'd like the option to have direct token-by-token streaming instead of kobold's current buffering method. It makes things feel much faster and more responsive.

My idea is to have it as a setting you can toggle between kobold's buffering method and the more direct method (which is what llama.cpp does).

Also, you told me in the voice call to remind you to investigate whether there's a way to set how many tokens to buffer per chunk.
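
To make the request concrete, here is a rough sketch of how one setting could cover both the toggle and the per-chunk size. This is not kobold's actual streaming code (I haven't checked how the current buffering is implemented); `stream_tokens` and `chunk_size` are made-up names for illustration only. The idea is that a chunk size of 1 would behave like direct token-by-token streaming, and anything larger would buffer that many tokens before sending them.

```python
from typing import Iterable, Iterator, List


def stream_tokens(tokens: Iterable[str], chunk_size: int = 1) -> Iterator[str]:
    """Yield generated text in chunks of `chunk_size` tokens.

    Hypothetical helper, not part of any real API.
    chunk_size=1 emits every token as soon as it arrives (direct streaming);
    larger values hold tokens back and emit them together (buffered streaming).
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")

    buffer: List[str] = []
    for token in tokens:
        buffer.append(token)
        # Emit once enough tokens have accumulated.
        if len(buffer) >= chunk_size:
            yield "".join(buffer)
            buffer.clear()

    # Flush whatever is left if generation ends in the middle of a chunk.
    if buffer:
        yield "".join(buffer)


if __name__ == "__main__":
    generated = ["Hel", "lo", ",", " wor", "ld"]
    print(list(stream_tokens(generated, chunk_size=1)))  # direct
    print(list(stream_tokens(generated, chunk_size=2)))  # buffered
```

With something like this, the toggle in the UI could just switch between a chunk size of 1 and whatever the buffered default ends up being.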

toggle to remove buffering from token streaming #56