This repository was archived by the owner on Jul 4, 2025. It is now read-only.

epic: Implement new Model Folder and model.yaml #1154

Description

@dan-menlo

Goal

  • The model folder should be able to handle models from different sources:
    • Built-in models (e.g. janhq/llama3:7b-tensorrt-llm)
    • Hugging Face GGUF repos with multiple quants (e.g. bartowski/llama3-gguf)
    • Specific Hugging Face GGUF files (there may be several from the same directory)
    • In the future: Nvidia NGC or TensorRT Cloud
  • Do we use sub-folders?
  • How does model.yaml work?
  • Model detection should not depend on the model folder's layout
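
One way to picture the model sources listed above is a small classifier that maps a model identifier to a source kind, so later steps (folder placement, model.yaml generation) do not need to guess from paths. This is only a sketch under stated assumptions, not a decided design: `ModelSource`, `ModelRef`, and `classify_model_id` are hypothetical names, and the rules (a `:` tag means built-in, a trailing `.gguf` means a specific Hugging Face file, anything else is a whole repo) are inferred from the examples in this issue.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ModelSource(Enum):
    """Hypothetical source kinds, mirroring the bullet list in the Goal."""
    BUILT_IN = "built-in"
    HF_REPO = "huggingface-repo"
    HF_FILE = "huggingface-file"


@dataclass(frozen=True)
class ModelRef:
    source: ModelSource
    repo: str                      # e.g. "janhq/llama3" or "bartowski/llama3-gguf"
    variant: Optional[str] = None  # tag for built-ins, file name for single GGUF files


def classify_model_id(model_id: str) -> ModelRef:
    """Classify an identifier using assumed rules (not a confirmed spec)."""
    model_id = model_id.strip()

    # Assumption: an identifier ending in ".gguf" names one specific file.
    if model_id.lower().endswith(".gguf"):
        repo, _, filename = model_id.rpartition("/")
        return ModelRef(ModelSource.HF_FILE, repo, filename)

    # Assumption: "name:tag" denotes a built-in model, as in the example above.
    if ":" in model_id:
        repo, _, tag = model_id.partition(":")
        return ModelRef(ModelSource.BUILT_IN, repo, tag)

    # Otherwise treat it as a Hugging Face repo that may hold several quants.
    return ModelRef(ModelSource.HF_REPO, model_id)


if __name__ == "__main__":
    for example in (
        "janhq/llama3:7b-tensorrt-llm",
        "bartowski/llama3-gguf",
        "bartowski/llama3-gguf/example-Q4.gguf",  # hypothetical file name
    ):
        print(example, "->", classify_model_id(example))
```

Keeping the source kind in a value like `ModelRef` rather than deriving it from where files sit on disk is one way to keep detection independent of the folder layout; whether sub-folders are used then becomes a separate decision.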

Tasklist

Decisions

Bugs

Edge Cases
