Conversation
xenova left a comment:
Very exciting PR! 🙌 Just a quick review from scanning the PR briefly.
// Hub utilities for cache and file management
export { get_file_metadata } from './utils/hub.js';
export { get_files } from './utils/hub/get_files.js';
export { get_model_files } from './utils/hub/get_model_files.js';
export { get_tokenizer_files } from './utils/hub/get_tokenizer_files.js';
export { get_processor_files } from './utils/hub/get_processor_files.js';
export { is_cached } from './utils/hub/is_cached.js';
can we find a way to export a single Object/static class, which encapsulates all these methods?
e.g.,
import { ClassName } from '@huggingface/transformers'; // no idea what to call it yet haha
const x = await ClassName.get_files(...);

I feel like this could be more future-proof. No idea what to call this wrapper class, but something along the lines of CacheRegistry, FileRegistry, or HubRegistry could work? Maybe we can find a similar abstraction in the transformers/huggingface_hub library?
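For illustration, a minimal sketch of the kind of wrapper being suggested, built from the exports in the diff above (the class name is a placeholder, not part of the PR):

// hub_registry.js: hypothetical single entry point wrapping the new utilities.
import { get_file_metadata } from './utils/hub.js';
import { get_files } from './utils/hub/get_files.js';
import { get_model_files } from './utils/hub/get_model_files.js';
import { get_tokenizer_files } from './utils/hub/get_tokenizer_files.js';
import { get_processor_files } from './utils/hub/get_processor_files.js';
import { is_cached } from './utils/hub/is_cached.js';

export class HubRegistry {
    // Static references so everything is reachable through one export.
    static get_file_metadata = get_file_metadata;
    static get_files = get_files;
    static get_model_files = get_model_files;
    static get_tokenizer_files = get_tokenizer_files;
    static get_processor_files = get_processor_files;
    static is_cached = is_cached;
}

// Usage, as in the suggestion above:
// import { HubRegistry } from '@huggingface/transformers';
// const files = await HubRegistry.get_files(...);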
Good point. I think the perfect solution would be to have a different pattern for the pipeline:
const textGenerator = new Pipeline(
    "text-generation",
    "onnx-community/gemma-3-270m-it-ONNX",
    {
        dtype: "fp32",
        device: "webgpu",
    }
);
const isCached = await textGenerator.isCached();
const files = await textGenerator.getFiles();
await textGenerator.load({ progress_callback: console.log })
const output = await textGenerator.generate([
    {
        role: "system",
        content: "You are a helpful assistant",
    },
    {
        role: "user",
        content: "How are you?",
    },
], {
    max_new_tokens: 1024,
})
Then those methods would be on the pipe. But I definitely see your point. The whole feature should be encapsulated in one export. I'll think of a cleaner solution.
if (!foundInMapping) {
    if (config.is_encoder_decoder) {
        const modelName = config.model_type;
        if (['whisper', 'vision-encoder-decoder'].includes(modelName)) {
            modelType = MODEL_TYPES.Vision2Seq;
        } else if (modelName === 'musicgen') {
            modelType = MODEL_TYPES.Musicgen;
        } else {
            modelType = MODEL_TYPES.Seq2Seq;
        }
    } else {
        if (architectures.some((arch) => arch.includes('CausalLM') || arch.includes('LMHead'))) {
            modelType = MODEL_TYPES.DecoderOnly;
        } else {
            modelType = MODEL_TYPES.EncoderOnly;
        }
    }
}
can we check which models are not correctly set? Ideally, we shouldn't need manual checking here.
Yes, we can check! I added logging to detect models not in MODEL_TYPE_MAPPING:
console.warn(
`[get_model_files] Architecture(s) not found in MODEL_TYPE_MAPPING: [${archList}] ` +
`for model type '${config.model_type}'. Using heuristic detection. ` +
`Consider adding to packages/transformers/src/models/registry.js`
);
I also improved the lookup to check model_type before falling back to heuristics. This fixed models like modnet (no architectures array) and briaai/RMBG-1.4, which were triggering the fallback unnecessarily.
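Roughly, the improved lookup order looks like this (a sketch only; MODEL_TYPE_MAPPING is assumed here to be a Map-like lookup, and the PR's actual code may differ):

// Hypothetical lookup order: architectures first, then model_type,
// and only then the heuristics from the diff above.
const architectures = config.architectures ?? []; // e.g. modnet ships no architectures array
let modelType = architectures
    .map((arch) => MODEL_TYPE_MAPPING.get(arch))
    .find((type) => type !== undefined);

if (modelType === undefined) {
    modelType = MODEL_TYPE_MAPPING.get(config.model_type);
}

if (modelType === undefined) {
    const archList = architectures.join(', ');
    console.warn(
        `[get_model_files] Architecture(s) not found in MODEL_TYPE_MAPPING: [${archList}] ` +
        `for model type '${config.model_type}'. Using heuristic detection.`
    );
    // ...fall through to the is_encoder_decoder / CausalLM / LMHead heuristics above.
}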
// Note: generation_config.json is only loaded for generation models (e.g., T5ForConditionalGeneration),
// not for base models (e.g., T5Model). Since we can't determine the specific class here,
// we include it as it's loaded for most use cases.
if generation_config.json isn't found, do we throw an error, or gracefully ignore it? 👀
Luckily, as you said, users very rarely load the non-generation variants of these models.
Good question. I mean, I wouldn't throw an error. If there is no generation_config.json, I don't see a problem. It would expect a file, but the expected size will be 0, so the total would still be correct. On the progress callback it will then load all the other files up to 100%, then it's done. I think it's fine.
In fact, it's better to add a file if we are not sure. The only downside could be a 404 on the metadata check, but I don't think that's a problem.
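A sketch of that behaviour in the metadata probe (the function name and return shape are illustrative, not necessarily what get_file_metadata does in the PR):

// Hypothetical: probe a file with a Range request; a 404 means the file is
// simply absent, so it contributes 0 bytes to the expected total instead of
// failing the whole load.
async function probe_file_size(url) {
    const response = await fetch(url, { headers: { Range: 'bytes=0-0' } });
    if (response.status === 404) {
        return { exists: false, size: 0 };
    }
    // A partial response carries "content-range: bytes 0-0/12345"; the full
    // size is the number after the slash.
    const contentRange = response.headers.get('content-range');
    const size = contentRange ? Number(contentRange.split('/')[1]) : 0;
    return { exists: true, size };
}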
…tion that does not check for tokenizer files or processor files if the task does not use them
Co-authored-by: Joshua Lochner <admin@xenova.com>
Improved Download Progress Tracking
Problem
Transformers.js couldn't reliably track total download progress because the full set of required files isn't known before downloading starts, and per-file sizes are unavailable when the content-length header is missing.
Solution
New Exported Functions
- get_files(): Determines required files before downloading
- get_model_files() / get_tokenizer_files() / get_processor_files(): Helper functions to identify files for each component
- get_file_metadata(): Fetches file metadata using Range requests without downloading full content
- fromCache boolean to identify cached files
- is_cached(): Checks if all files from a model are already in cache

Enhanced Progress Tracking
- readResponse() with expectedSize: Falls back to metadata when the content-length header is missing
- total_progress callback: Provides aggregate progress across all files
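For illustration, aggregate progress of the kind total_progress reports can be derived from the per-file sizes gathered up front (a sketch of the idea, not the PR's exact implementation):

// Hypothetical aggregation: once expected sizes are known (e.g. via get_file_metadata),
// overall progress is bytes loaded across all files divided by the expected total.
function make_total_progress(expected_sizes) {
    const loaded = new Map();
    const total = Object.values(expected_sizes).reduce((sum, size) => sum + size, 0);
    return (file, loaded_bytes) => {
        loaded.set(file, loaded_bytes);
        const done = [...loaded.values()].reduce((sum, bytes) => sum + bytes, 0);
        return total > 0 ? (100 * done) / total : 100;
    };
}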
Review

One thing I am not super confident about is the get_model_files function. I tried to test it with different model architectures, but maybe I missed some that load files not covered by that function. @xenova, could you smoke-test some models and let me know which ones fail?
Easiest way to do that is:
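For illustration, something along these lines (a sketch only; the argument shapes of get_files and is_cached are assumptions rather than the confirmed API, and the model ID is just a placeholder):

import { get_files, is_cached } from '@huggingface/transformers';

// Hypothetical smoke test: ask the new helper which files it expects a model
// to need, then compare against what the pipeline actually downloads.
const task = 'text-generation';
const model_id = 'onnx-community/gemma-3-270m-it-ONNX'; // placeholder model
const files = await get_files(task, model_id, { dtype: 'fp32' });
console.log(files);

// After one full load, everything should be reported as cached.
console.log(await is_cached(task, model_id, { dtype: 'fp32' }));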