
Look into integrating mteb with Hugging Face's community evals #4055

@KennethEnevoldsen

Description


Currently it seems to be mostly relevant for generative models (blog), but I talked with @tomaarsen and it sounds like we could do something similar, with mteb replacing Inspect AI as the runner.


I would say that we need the following:

    1. Include the required metadata in the dataset repo (equivalent to the eval.yaml). I suppose we need the task metadata as well as any additional settings such as input_column_name, label_column_name, etc.; we could of course also allow a script itself as the config, and could even allow an "mteb version" tag for debugging. A rough sketch of such a file follows this list.
    2. We need to make this format loadable into MTEB.
    3. We need a way for the community to push results (see the upload sketch below).
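As a starting point for discussion, here is what such a metadata file could look like for a classification task. The layout and every field name apart from input_column_name/label_column_name are assumptions; no such format exists in mteb or on the Hub yet:

```yaml
# Hypothetical eval.yaml stored in the dataset repo. The layout is
# illustrative only, not an existing mteb or Hugging Face format.
task_name: my-sentiment-classification   # assumed task identifier
task_type: Classification                # which task family to dispatch to
input_column_name: text
label_column_name: label
eval_splits: [test]
main_score: accuracy
mteb_version: ">=1.0"                    # optional version tag for debugging
```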

(the leaderboard in this case will only be a task-specific leaderboard, not for a full benchmark)
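For point 3, one conceivable flow (a sketch only; nothing like this exists in mteb today) is to push the JSON result files mteb already writes into a shared results dataset repo, from which a task-specific leaderboard could be built:

```python
# Minimal sketch of community result submission. The repo id and file layout
# are hypothetical; only the huggingface_hub upload call is a real API.
from huggingface_hub import HfApi

api = HfApi()
api.upload_file(
    path_or_fileobj="results/my-model/my-task.json",  # local mteb result file
    path_in_repo="results/my-model/my-task.json",
    repo_id="community/my-task-results",  # hypothetical shared results repo
    repo_type="dataset",
)
```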

Fixing 1 and 2 would also allow us to convert a lot of our existing task code into metadata files that can simply be loaded in.
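As an illustration of points 1 and 2 together, a loader could fetch the metadata file from the Hub and hand it to mteb. The sketch below assumes the hypothetical eval.yaml above; mteb has no such loader today, and `load_eval_config` is an invented helper:

```python
# Sketch: fetch and parse the hypothetical eval.yaml from a dataset repo.
import yaml
from huggingface_hub import hf_hub_download

def load_eval_config(repo_id: str) -> dict:
    """Download and parse the hypothetical eval.yaml from `repo_id`."""
    path = hf_hub_download(repo_id=repo_id, filename="eval.yaml", repo_type="dataset")
    with open(path) as f:
        return yaml.safe_load(f)

config = load_eval_config("username/my-sentiment-dataset")  # hypothetical repo
# A future mteb entry point could dispatch on config["task_type"] to build
# the corresponding task object from this metadata.
print(config["task_type"], config["input_column_name"])
```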

Labels: needs discussion (This issue is still being discussed. It would be premature to implement it.)
