parakeet-unified-en-0.6b generates ?? for words with non-English pronunciation #15657

@livefantasia

Describe the bug

English speech sometimes contains non-English words, such as Spanish or French names. For these, parakeet-unified-en-0.6b generates ?? and broken words, while earlier Parakeet models such as parakeet-0.6b-v2 handle them more smoothly.

Steps/Code to reproduce bug

Run the model in streaming mode on an audio clip (e.g. https://www.youtube.com/watch?v=mV15tm_5048 ) in which the Pope's name is spoken with a non-English pronunciation: "Cardinal Robert Prevost / Prévost". parakeet-unified-en-0.6b generates "Pr ⁇ vos", while parakeet-0.6b-v2 correctly generates "Prevost".
This may be a general error, not one specific to this word.
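
To check whether the problem is general rather than tied to this one name, a small helper like the one below could scan transcripts from both models for the ⁇ marker. This is only a sketch: `UNK_MARK` and `broken_fragments` are names made up here, it assumes the transcripts are already available as plain strings (running the models is not shown), and the idea that ⁇ marks an unknown tokenizer piece is an assumption, not something verified against the model.

```python
# Hypothetical helper, not part of NeMo: finds word fragments around the
# "⁇" (U+2047) marker in a transcript string.
UNK_MARK = "\u2047"  # the character seen in the broken output "Pr ⁇ vos"


def broken_fragments(transcript: str) -> list[str]:
    """Return each marker-containing token together with its neighbours."""
    tokens = transcript.split()
    fragments = []
    for i, tok in enumerate(tokens):
        if UNK_MARK in tok:
            # The marker is usually surrounded by spaces, so include the
            # token before and after it to recover the broken word.
            left = tokens[i - 1] if i > 0 else ""
            right = tokens[i + 1] if i + 1 < len(tokens) else ""
            fragments.append(" ".join(t for t in (left, tok, right) if t))
    return fragments


if __name__ == "__main__":
    # Output strings quoted in this report for the two models.
    print(broken_fragments("Cardinal Robert Pr \u2047 vos"))
    print(broken_fragments("Cardinal Robert Prevost"))
```

Running this over transcripts of several clips with non-English names would show whether the broken output is limited to "Prevost" or appears more widely.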

Expected behavior

The model should generate the correct word "Prevost", or something with a similar pronunciation, not the broken word "Pr ⁇ vos".
