model: Add eager-embed embedding model #3602
Samoed merged 10 commits into embeddings-benchmark:main from
Conversation
KennethEnevoldsen
left a comment
A few minor comments - otherwise the submission looks good
Also, it would probably be better to integrate your model with sentence transformers.
Do you mean loading the model from sentence transformers instead of from transformers? What do I need to change?
Yes. This can be complicated. You can see how this was done for other models as an example.
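(For context, a minimal sketch of what loading through sentence-transformers instead of raw transformers could look like. This is not the PR's actual wrapper; the model id comes from the linked model card, and the revision, trust_remote_code, and normalization settings are assumptions.)

```python
# Minimal sketch, not the PR implementation: load the model via
# sentence-transformers so it handles batching, pooling and normalization.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer(
    "eagerworks/eager-embed-v1",  # model id from the linked model card
    revision="main",              # placeholder; pin a real revision in practice
    trust_remote_code=True,       # assumed: custom Qwen3-VL-based architecture
)

embeddings = model.encode(
    ["a query about ESG reports"],
    normalize_embeddings=True,    # assumed; the model card may specify otherwise
)
print(embeddings.shape)           # expected (1, 2560) per the PR description
```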
Force-pushed from 99d692c to 33fc5b9
@KennethEnevoldsen @Samoed Thanks for your comments, the code is much cleaner now. Implemented most of them and left some questions. Thanks!
KennethEnevoldsen
left a comment
I think this is good to merge - @Samoed do you have any remaining issues?
@jpbalarini Did you try to encode images and texts together, without separating them by image/text modality?
@Samoed I did, but I was getting breaking changes with the batches when running some tasks (specifically Vidore2ESGReportsHLRetrieval). I rolled back the changes just to check whether I hit the same issues with the changes above, and it's the same. Then I remembered why I added this separation in the first place (it was because of this bug): #3602 (comment). Just in case, here's my attempt at the unified embeddings method (I get the same error as above when running the benchmark). I assume I must be doing something wrong with how I handle the tensors, but I've been debugging this for several hours with no luck so far.
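(For illustration only, here is a rough sketch of the kind of unified text/image embedding path being discussed, i.e. running both modalities through a single processor call and one forward pass. This is not the code from the PR and does not reproduce the bug; the model id, trust_remote_code, and mean pooling are assumptions.)

```python
# Hedged sketch of a "unified" embedding function: text and images are
# padded together in one processor call so the whole mixed batch shares
# consistent tensor shapes. Not the PR code; pooling is an assumption.
import torch
from transformers import AutoModel, AutoProcessor

MODEL_ID = "eagerworks/eager-embed-v1"  # from the PR description

processor = AutoProcessor.from_pretrained(MODEL_ID, trust_remote_code=True)
model = AutoModel.from_pretrained(MODEL_ID, trust_remote_code=True).eval()

def embed(texts=None, images=None):
    # One processor call for both modalities keeps padded sequence lengths
    # consistent across the batch, a common source of shape mismatches when
    # modalities are batched separately.
    inputs = processor(text=texts, images=images, padding=True, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    # Assumed pooling: mean over the last hidden state; the real model may
    # use a different strategy.
    return outputs.last_hidden_state.mean(dim=1)
```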
@jpbalarini I've added a fix in #3618. Thank you for reporting!
You're welcome! Let me add the latest changes and rerun the benchmark to see that everything works as expected |
@jpbalarini Is this the final version? Have you submitted all the results with text and images processed together?
Force-pushed from 62e623e to 569f056
Yes @Samoed, I updated the results here (and added vidore v3 too).
Great work! |
Add inference code for the eager-embed embedding model.
eager-embed-v1 is a multimodal dense embedding model with a 2560-dimensional embedding space, based on Qwen3-VL and fine-tuned on multiple public datasets.
More info here:
https://huggingface.co/eagerworks/eager-embed-v1
https://github.com/eagerworks/eager-embed
Checklist:
The model can be loaded with mteb.get_model(model_name, revision) and mteb.get_model_meta(model_name, revision)
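A quick way to sanity-check that checklist item, sketched under the assumption that the model is registered as "eagerworks/eager-embed-v1" (the name on the linked model card) and using the task mentioned earlier in the thread:

```python
import mteb

# Model name assumed from the linked Hugging Face card; no revision pinned here.
meta = mteb.get_model_meta("eagerworks/eager-embed-v1")
model = mteb.get_model("eagerworks/eager-embed-v1")
print(meta.name)

# Run one of the retrieval tasks mentioned in this thread as a smoke test.
tasks = mteb.get_tasks(tasks=["Vidore2ESGReportsHLRetrieval"])
results = mteb.MTEB(tasks=tasks).run(model, output_folder="results")
```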