fix: softmax in predict() method only on the scores of the selected labels #1

Open
jradola wants to merge 1 commit into cisnlp:main from jradola:fix-softmax-after-limiting-language-labels

jradola commented Dec 10, 2025

Replace
`softmax_result = self._softmax(result_vector)[self.language_indices]`
with
`softmax_result = self._softmax(result_vector[self.language_indices])`

This ensures that, once the set of labels is restricted, the assigned probabilities sum to 1. Previously the softmax was computed over all languages and the values at the selected language indices were picked out afterwards, so the returned probabilities generally did not sum to 1.

Example:

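import fasttext
from masklid import MaskLID
from huggingface_hub import hf_hub_download

# Download the GlotLID model file and get its local path.
model_path = hf_hub_download(repo_id="cis-lmu/glotlid", filename="model.bin", cache_dir=None)
# Restrict prediction to English and Polish.
labels = ['__label__eng_Latn', '__label__pol_Latn']
model = MaskLID(model_path, languages=labels)
# Ask for the top 2 labels among the restricted set.
model.predict('i am', k=2)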

Output before:
(('__label__pol_Latn', '__label__eng_Latn'), array([4.6378150e-06, 2.2964262e-15], dtype=float32))
Output after:
(('__label__pol_Latn', '__label__eng_Latn'), array([1.000000e+00, 4.951526e-10], dtype=float32))
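
The difference between the two orderings can be reproduced with NumPy alone. The sketch below is illustrative, not the MaskLID code: the `softmax` helper is a stand-in assumed to behave like `_softmax`, and the logits and indices are made-up values.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax; assumed equivalent to MaskLID's _softmax.
    e = np.exp(x - np.max(x))
    return e / e.sum()

# Hypothetical scores for five labels; indices 1 and 3 play the role
# of self.language_indices (the labels the user restricted to).
result_vector = np.array([2.0, 0.5, 4.0, -1.0, 3.0])
language_indices = [1, 3]

# Old behaviour: softmax over all labels, then select.
before = softmax(result_vector)[language_indices]
# New behaviour: select, then softmax over the selected scores only.
after = softmax(result_vector[language_indices])

print(before, before.sum())  # sum is below 1: mass went to unselected labels
print(after, after.sum())    # sum is 1 up to floating-point rounding

# Selecting first is the same as renormalizing the old values.
print(np.allclose(after, before / before.sum()))
```

The ranking of the selected labels is the same either way; only the scale changes. This matches the outputs above: 2.2964262e-15 / 4.6378150e-06 is approximately 4.9515e-10, the value reported for `__label__eng_Latn` after the change.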
