Streaming Sortformer release PR03: NeMo documentations and tutorial notebook#14388
tango4j merged 24 commits into NVIDIA-NeMo:main
Signed-off-by: taejinp <tango4j@gmail.com>
Pull Request Overview
This PR adds documentation for the streaming version of the Sortformer diarization model to NeMo's ASR speaker diarization documentation. The streaming Sortformer enables real-time speaker diarization by processing audio in chunks while maintaining speaker identity through an Arrival-Order Speaker Cache (AOSC).
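The idea of keeping speaker labels stable across chunks by arrival order can be illustrated with a toy sketch. This is not the NeMo implementation: the class `ToyArrivalOrderCache`, the function `label_stream`, the cosine-similarity matching, the 0.8 threshold, and the 2-D "embeddings" are all hypothetical and chosen only for illustration. The actual model is a neural network, and its cache mechanics are described in the documentation this PR adds.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


class ToyArrivalOrderCache:
    """Hypothetical toy cache: slot i holds the i-th speaker to appear."""

    def __init__(self, max_speakers: int = 4, threshold: float = 0.8) -> None:
        self.max_speakers = max_speakers
        self.threshold = threshold  # assumed similarity cut-off, illustration only
        self.slots: List[List[float]] = []  # one reference vector per speaker

    def assign(self, embedding: Sequence[float]) -> int:
        """Return the arrival-order index for one speaker embedding."""
        best, best_sim = -1, -1.0
        for idx, ref in enumerate(self.slots):
            sim = cosine(embedding, ref)
            if sim > best_sim:
                best, best_sim = idx, sim
        if best >= 0 and best_sim >= self.threshold:
            return best  # known speaker keeps the index it got on arrival
        if len(self.slots) < self.max_speakers:
            self.slots.append(list(embedding))  # new speaker takes the next slot
            return len(self.slots) - 1
        return best  # cache full: fall back to the closest known speaker


def label_stream(
    chunks: Sequence[Sequence[Sequence[float]]], cache: ToyArrivalOrderCache
) -> List[List[int]]:
    """Label every embedding in every chunk, reusing one cache throughout."""
    return [[cache.assign(e) for e in chunk] for chunk in chunks]


stream = [
    [[1.0, 0.0]],              # chunk 0: first speaker arrives
    [[0.0, 1.0], [0.9, 0.1]],  # chunk 1: a new speaker, then the first one again
    [[0.1, 0.9]],              # chunk 2: the second speaker again
]
labels = label_stream(stream, ToyArrivalOrderCache())
print(labels)  # [[0], [1, 0], [1]]
```

Because the cache persists between chunks, a speaker seen in chunk 0 keeps index 0 when they reappear in chunk 1, which is the property the overview attributes to the AOSC.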
Key changes include:
- Addition of streaming Sortformer architecture explanation and visual aids
- Documentation of training and inference procedures for streaming mode
- Complete configuration reference for streaming Sortformer models
Reviewed Changes
Copilot reviewed 3 out of 5 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| docs/source/asr/speaker_diarization/results.rst | Adds training and inference commands for streaming Sortformer diarizer |
| docs/source/asr/speaker_diarization/models.rst | Introduces streaming Sortformer concept with architectural diagrams and explanations |
| docs/source/asr/speaker_diarization/configs.rst | Provides comprehensive configuration documentation for streaming Sortformer training |
Comments suppressed due to low confidence (1)
docs/source/asr/speaker_diarization/configs.rst:234
- [nitpick] The model name 'StreamingSortFormerDiarizer' uses inconsistent capitalization compared to 'Sortformer' used elsewhere in the documentation. Consider using 'StreamingSortformerDiarizer' for consistency.
name: "StreamingSortFormerDiarizer"
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Signed-off-by: Taejin Park <tango4j@gmail.com>
Signed-off-by: ipmedenn <65592416+ipmedenn@users.noreply.github.com>
Signed-off-by: tango4j <tango4j@users.noreply.github.com>
@ko3n1g Now that streaming Sortformer V2 has been released, https://huggingface.co/nvidia/diar_streaming_sortformer_4spk-v2
Hi, I am getting a link error in something I haven't touched. Can we disable or skip this check?
…otebook (NVIDIA-NeMo#14388)
* Adding streaming sortformer images and descriptions
  Signed-off-by: taejinp <tango4j@gmail.com>
* Update docs/source/asr/speaker_diarization/models.rst
  Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
  Signed-off-by: Taejin Park <tango4j@gmail.com>
* updating docs for streaming sortformer
  Signed-off-by: ipmedenn <65592416+ipmedenn@users.noreply.github.com>
* Adding GIF images
  Signed-off-by: taejinp <tango4j@gmail.com>
* Uploading bugfixes, refactored vars and yamlfile name changes
  Signed-off-by: taejinp <tango4j@gmail.com>
* Adding the missing offline pp yamls
  Signed-off-by: taejinp <tango4j@gmail.com>
* Streaming Sortformer release PR02: unit tests for sortformer models and modules
  Signed-off-by: taejinp <tango4j@gmail.com>
* Apply isort and black reformatting
  Signed-off-by: tango4j <tango4j@users.noreply.github.com>
* Resolved CODE QL and test issues
  Signed-off-by: taejinp <tango4j@gmail.com>
* Added tutorial notebook
  Signed-off-by: taejinp <tango4j@gmail.com>
* Finalized the tutorial notebook
  Signed-off-by: taejinp <tango4j@gmail.com>
* updating docs
  Signed-off-by: ipmedenn <65592416+ipmedenn@users.noreply.github.com>
* updating sortformer.png
  Signed-off-by: ipmedenn <65592416+ipmedenn@users.noreply.github.com>
* updating streaming sortformer inference tutorial
  Signed-off-by: ipmedenn <65592416+ipmedenn@users.noreply.github.com>
* minor changes in streaming params validation
  Signed-off-by: ipmedenn <65592416+ipmedenn@users.noreply.github.com>
* updating sortformer animations in docs
  Signed-off-by: ipmedenn <65592416+ipmedenn@users.noreply.github.com>
* updating streaming sortformer train configs
  Signed-off-by: ipmedenn <65592416+ipmedenn@users.noreply.github.com>
---------
Signed-off-by: taejinp <tango4j@gmail.com>
Signed-off-by: Taejin Park <tango4j@gmail.com>
Signed-off-by: ipmedenn <65592416+ipmedenn@users.noreply.github.com>
Signed-off-by: tango4j <tango4j@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: ipmedenn <65592416+ipmedenn@users.noreply.github.com>
Co-authored-by: tango4j <tango4j@users.noreply.github.com>
Signed-off-by: Guyue Huang <guyueh@nvidia.com>


What does this PR do?
This PR adds NeMo documentation and a tutorial notebook for streaming Sortformer.
Collection: ASR/speaker_tasks
High-level changes in this PR:
- Adds two images.
- Adds descriptions for streaming Sortformer.
Who can review?
Anyone in NeMo ASR