fix: Log exception and error in FirstRankPerNode before exiting#1468

Merged
thomasdhc merged 1 commit into main from athitten/log_error
Mar 10, 2026
@athitten athitten commented Mar 6, 2026

What does this PR do?

FirstRankPerNode() exited silently when an error or exception occurred, which hid the real error and made failures hard to debug. This PR logs the error before exiting.
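As an illustration of the pattern described above, here is a minimal sketch of a context manager that logs an exception with its traceback before exiting. It is not the actual FirstRankPerNode code: the name `log_before_exit`, the logger name, and the choice of `SystemExit` as the exit mechanism are all assumptions made for this example.

```python
import logging
from contextlib import contextmanager

# Hypothetical logger name, chosen for this sketch only.
logger = logging.getLogger("first_rank_sketch")


@contextmanager
def log_before_exit(exit_code: int = 1):
    """Hypothetical guard: log any exception from the block, then exit.

    Without the logging call, the process would terminate with no trace
    of what went wrong inside the guarded block.
    """
    try:
        yield
    except Exception:
        # logger.exception logs at ERROR level and appends the traceback
        # of the exception currently being handled.
        logger.exception("Error inside guarded block; exiting")
        # Assumption: a plain SystemExit stands in for whatever exit
        # path the real code uses.
        raise SystemExit(exit_code)
```

Wrapping a block in `with log_before_exit():` leaves the success path unchanged. On failure, the traceback reaches the log before the process terminates.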

Changelog

  • Add specific line by line info of high level changes in this PR.

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you add or update any necessary documentation?

If you haven't finished some of the above items you can still open "Draft" PR.

Additional Information

  • Related to # (issue)

Signed-off-by: Abhishree <abhishreetm@gmail.com>

copy-pr-bot bot commented Mar 6, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@athitten athitten changed the title Log exception and error in FirstRankPerNode before exiting fix: Log exception and error in FirstRankPerNode before exiting Mar 6, 2026

@HuiyingLi HuiyingLi left a comment

LGTM thanks!

@HuiyingLi

/ok to test 76f6dd5

@thomasdhc thomasdhc left a comment

Fast merge; skipping codecov.

@thomasdhc thomasdhc merged commit 0e9c11e into main Mar 10, 2026
50 of 55 checks passed
@thomasdhc thomasdhc deleted the athitten/log_error branch March 10, 2026 13:34

3 participants