Merge updates of Multi-Talker Parakeet Model, Modules, Dataloader and Utils PR 01#14905

Merged
tango4j merged 38 commits into NVIDIA-NeMo:main from KunalDhawan:mt_parakeet_pr01 on Oct 15, 2025

@weiqingw4ng weiqingw4ng commented Oct 9, 2025

What does this PR do?

Adds the model, module, dataloader, and utility code for the multi-talker Parakeet ASR model.

Collection: ASR-SpeakerTask

Changelog

Added an RNNT-based streaming multi-talker ASR model (Parakeet).
Added dataloader code and utility files.
Added example scripts for running streaming multi-talker Parakeet.

Usage

# output_path is where the output seglst file is saved
python3 <NeMoROOT>/examples/asr/asr_cache_aware_streaming/speech_to_text_multitalker_streaming_infer.py \
          asr_model=nvidia/multitalker-parakeet-streaming-0.6b-v1 \
          diar_model=nvidia/diar_streaming_sortformer_4spk-v2 \
          audio_file=example.wav \
          max_num_of_spks=4 \
          masked_asr=false \
          parallel_speaker_strategy=true \
          att_context_size=[70,13] \
          output_path=./output.json \
          print_path=./print_script.sh
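
When the command is launched from another script, it may be easier to assemble the `key=value` overrides programmatically. The following is a minimal Python sketch, not part of this PR: the helper names `format_value` and `build_infer_command` and the `/path/to/NeMo` root are hypothetical, while the script path and override keys are copied from the usage above. Passing the result to `subprocess.run` as a list avoids any shell interpretation of the brackets in `att_context_size=[70,13]`.

```python
import shlex
import sys


def format_value(value):
    """Render a Python value in the key=value style used in the usage above."""
    if isinstance(value, bool):  # check bool before other types: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, (list, tuple)):
        return "[" + ",".join(format_value(v) for v in value) + "]"
    return str(value)


def build_infer_command(nemo_root, overrides):
    """Return an argv list for the streaming multi-talker inference script."""
    script = (
        f"{nemo_root}/examples/asr/asr_cache_aware_streaming/"
        "speech_to_text_multitalker_streaming_infer.py"
    )
    argv = [sys.executable, script]
    for key, value in overrides.items():
        argv.append(f"{key}={format_value(value)}")
    return argv


# Values taken from the usage example; "/path/to/NeMo" is a placeholder.
overrides = {
    "asr_model": "nvidia/multitalker-parakeet-streaming-0.6b-v1",
    "diar_model": "nvidia/diar_streaming_sortformer_4spk-v2",
    "audio_file": "example.wav",
    "max_num_of_spks": 4,
    "masked_asr": False,
    "parallel_speaker_strategy": True,
    "att_context_size": [70, 13],
    "output_path": "./output.json",
    "print_path": "./print_script.sh",
}
argv = build_infer_command("/path/to/NeMo", overrides)

# Print a shell-quoted form for inspection. To actually run it:
#   import subprocess; subprocess.run(argv, check=True)
print(shlex.join(argv))
```

The sketch only builds and prints the command; it does not check that the script exists or that the override names are accepted by the installed NeMo version.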

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you add or update any necessary documentation?
  • Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
    • Reviewer: Does the PR have correct import guards for all optional libraries?

PR Type:

  • New Feature
  • Bugfix
  • Documentation

If you haven't finished some of the above items, you can still open a "Draft" PR.

Who can review?

Anyone in NeMo SpeechAI ASR

Additional Information

Signed-off-by: Weiqing Wang <weiqingw@nvidia.com>
Signed-off-by: weiqingw4ng <weiqingw4ng@users.noreply.github.com>
@weiqingw4ng weiqingw4ng changed the title from "initiate PR 01 for MT-Parakeet" to "Merge updates of Multi-Talker Parakeet PR 01" Oct 9, 2025

@github-advanced-security github-advanced-security bot left a comment

CodeQL found more than 20 potential problems in the proposed changes. Check the Files changed tab for more details.

weiqingw4ng and others added 3 commits October 9, 2025 14:27
Signed-off-by: Weiqing Wang <weiqingw@nvidia.com>
Signed-off-by: Weiqing Wang <weiqingw@nvidia.com>
Signed-off-by: weiqingw4ng <weiqingw4ng@users.noreply.github.com>
weiqingw4ng and others added 3 commits October 9, 2025 15:23
Signed-off-by: Weiqing Wang <weiqingw@nvidia.com>
Signed-off-by: weiqingw4ng <weiqingw4ng@users.noreply.github.com>
@tango4j tango4j added Run CICD and removed Run CICD labels Oct 10, 2025
Signed-off-by: tango4j <tango4j@users.noreply.github.com>
Signed-off-by: Weiqing Wang <weiqingw@nvidia.com>
Signed-off-by: weiqingw4ng <weiqingw4ng@users.noreply.github.com>
@tango4j tango4j requested a review from ipmedenn October 14, 2025 15:40
@tango4j tango4j marked this pull request as ready for review October 14, 2025 15:40
Signed-off-by: Weiqing Wang <weiqingw@nvidia.com>
Signed-off-by: weiqingw4ng <weiqingw4ng@users.noreply.github.com>
Signed-off-by: Weiqing Wang <weiqingw@nvidia.com>
Signed-off-by: tango4j <tango4j@users.noreply.github.com>
Signed-off-by: Weiqing Wang <weiqingw@nvidia.com>

@tango4j tango4j left a comment

LGTM. Approving.

3 participants