## Goal

- The model folder should be able to handle models from different sources:
  - Built-in models (e.g. `janhq/llama3:7b-tensorrt-llm`)
  - Hugging Face GGUF repos with multiple quants (e.g. `bartowski/llama3-gguf`)
  - A specific Hugging Face GGUF file (there may be several from the same directory)
  - In future: NVIDIA NGC or TensorRT Cloud
- Do we use sub-folders?
- How does `model.yaml` work?
- Model detection should not depend on the model folder

## Tasklist

- [x] #1300
- [x] #1320
- [x] https://github.com/janhq/cortex.cpp/issues/1121
- [x] https://github.com/janhq/cortex.cpp/issues/1241

## Decisions

- https://github.com/janhq/cortex.cpp/discussions/1113
- https://github.com/janhq/cortex.cpp/discussions/1123
- #1178
- Legacy model folder structure: https://github.com/janhq/jan/issues/3541#issuecomment-2328413473

## Bugs

- [x] https://github.com/janhq/cortex.cpp/issues/1270 (`cortex pull invalid_url` creates a model folder)
- [x] https://github.com/janhq/cortex.cpp/issues/1274

## Edge Cases

- What if we download multiple GGUFs from the same Hugging Face repo?
  - They are saved in the same repo folder, which then holds multiple `.gguf` files (see https://github.com/janhq/cortex.cpp/issues/1320#issuecomment-2387671325)
- How does `cortex model update <model>` work?
  - https://github.com/janhq/cortex.cpp/issues/1121
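
To make the first edge case concrete, here is a minimal sketch of a repo folder holding several `.gguf` files, plus a helper that enumerates them. This is not cortex.cpp code: the `bartowski/llama3-gguf` path layout, the file names, and the `list_gguf_files` helper are all assumptions made for illustration, and the actual layout is whatever the linked decisions settle on.

```python
from pathlib import Path
import tempfile


def list_gguf_files(repo_dir: Path) -> list[str]:
    """Return the sorted names of all .gguf files directly inside repo_dir."""
    return sorted(
        p.name for p in repo_dir.iterdir() if p.is_file() and p.suffix == ".gguf"
    )


def demo() -> list[str]:
    """Build a hypothetical repo folder with two quants and list its GGUF files."""
    with tempfile.TemporaryDirectory() as tmp:
        # Assumed layout: <models root>/<author>/<repo>/<file>.gguf
        repo = Path(tmp) / "bartowski" / "llama3-gguf"
        repo.mkdir(parents=True)
        # Hypothetical file names; two quants pulled from the same repo.
        (repo / "llama3-Q4_K_M.gguf").touch()
        (repo / "llama3-Q8_0.gguf").touch()
        # A non-GGUF file in the same folder is ignored by the helper.
        (repo / "README.md").touch()
        return list_gguf_files(repo)


if __name__ == "__main__":
    for name in demo():
        print(name)
```

Under this assumed layout, pulling a second quant from the same repo adds another file next to the first rather than creating a new folder, so anything that maps a model ID to a file has to key on the file and not only on the folder.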