
Expose repetition-aware CT2 decoding options in inference engine #131

Open
Legedith wants to merge 1 commit into AI4Bharat:main from Legedith:codex/ct2-repetition-controls

@Legedith

Expose repetition-aware CTranslate2 decoding options in inference/engine.py.

What changed

  • Added repetition_penalty and no_repeat_ngram_size to Model.
  • Validated both values before use.
  • Forwarded both options to ctranslate2.Translator.translate_batch().
  • Kept fairseq behavior unchanged; CT2-only decoding options are rejected there.
  • Centralized CT2 decoding kwargs in one place.
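
As a rough illustration of the validation and centralized kwargs described above, here is a minimal sketch. The helper name build_ct2_decoding_kwargs and the exact validation rules are assumptions for illustration, not the code in this PR; only the two option names come from the description, and the defaults mirror CTranslate2's own (1 and 0, meaning disabled).

```python
from numbers import Real


def build_ct2_decoding_kwargs(repetition_penalty=1.0, no_repeat_ngram_size=0):
    """Hypothetical helper: validate both options and return them as kwargs."""
    # bool is a subclass of int, so exclude it explicitly in both checks.
    if isinstance(repetition_penalty, bool) or not isinstance(repetition_penalty, Real):
        raise TypeError("repetition_penalty must be a real number")
    if repetition_penalty <= 0:
        raise ValueError("repetition_penalty must be > 0")
    if isinstance(no_repeat_ngram_size, bool) or not isinstance(no_repeat_ngram_size, int):
        raise TypeError("no_repeat_ngram_size must be an int")
    if no_repeat_ngram_size < 0:
        raise ValueError("no_repeat_ngram_size must be >= 0")
    # One dict, built in one place, so every CT2 call site forwards the same options.
    return {
        "repetition_penalty": float(repetition_penalty),
        "no_repeat_ngram_size": no_repeat_ngram_size,
    }


# Assumed call site (translator is a ctranslate2.Translator instance):
# translator.translate_batch(tokens, **build_ct2_decoding_kwargs(1.2, 3))
```

Building the dict in a single function means a future decoding option only needs to be added and validated once.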

Why

This makes repetition control available during CTranslate2 inference and helps reduce repeated words or phrases in generated translations.

Notes

  • Defaults preserve existing behavior.
  • Callers that instantiate Model(...) positionally should verify argument order.
  • The change only affects the CTranslate2 path. The fairseq path behaves as before unless non-default repetition options are passed, in which case they are rejected.
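
To make the notes above concrete, the sketch below uses a stand-in class, SketchModel, which is hypothetical and is not the real Model signature. It shows one possible way to reject non-default repetition options on a fairseq backend, and it declares the new options keyword-only so that positional callers cannot be shifted by them. The keyword-only choice is an assumption about a possible design, not something the PR states it does.

```python
class SketchModel:
    """Hypothetical stand-in for the engine's Model class (not the real signature)."""

    def __init__(self, ckpt_dir, model_type="ctranslate2", *,
                 repetition_penalty=1.0, no_repeat_ngram_size=0):
        # Defaults (1.0 and 0) mean "no repetition control", i.e. existing behavior.
        non_default = repetition_penalty != 1.0 or no_repeat_ngram_size != 0
        if model_type == "fairseq" and non_default:
            # CT2-only options are rejected on the fairseq path.
            raise ValueError(
                "repetition_penalty and no_repeat_ngram_size are only "
                "supported with model_type='ctranslate2'"
            )
        self.ckpt_dir = ckpt_dir
        self.model_type = model_type
        self.repetition_penalty = repetition_penalty
        self.no_repeat_ngram_size = no_repeat_ngram_size


# Keyword arguments keep the call unambiguous regardless of parameter order.
model = SketchModel("path/to/ckpt", model_type="ctranslate2",
                    repetition_penalty=1.2, no_repeat_ngram_size=3)
```

Callers that already pass every argument by keyword are unaffected by where the new parameters sit in the signature.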
