Different results between cli, stream, and command #3706

@OddLingo

Description

If I record some audio from my microphone into a WAV file (using arecord or Audacity) and give that to whisper-cli, the results are nearly perfect, regardless of utterance length. I would like to do this in real time, as whisper-stream is supposed to do.

But whisper-stream barely picks up one word here and there. It is not that it outputs the wrong word; it is more like it cannot hear me at all most of the time, and it prints lots of [BLANK_AUDIO] messages. Perhaps the difference is in the use of the SDL package? Are there tuning parameters?
