Is your feature request related to a problem? Please describe.
No, but rather a limitation. The current models available on the gallery are great, but require capable hardware for doing vision.
Describe the solution you'd like
Recently, llama.cpp added support for vision (https://news.ycombinator.com/item?id=43943047), and now it's possible to run much smaller models that are good enough for things like surveillance video analysis in realtime (e.g. using SmolVLM)
Describe alternatives you've considered
I'm currently using Gemma3 as an alternative, but it's very hardware intensive.
Additional context
N/A
Is your feature request related to a problem? Please describe.
No, but rather a limitation. The current models available on the gallery are great, but require capable hardware for doing vision.
Describe the solution you'd like
Recently, llama.cpp added support for vision (https://news.ycombinator.com/item?id=43943047), and now it's possible to run much smaller models that are good enough for things like surveillance video analysis in realtime (e.g. using SmolVLM)
Describe alternatives you've considered
I'm currently using Gemma3 as an alternative, but it's very hardware intensive.
Additional context
N/A