Skip to content

TEST/COMMON: Fix file descriptor leaks between tests#11222

Closed
guy-ealey-morag wants to merge 8 commits intoopenucx:masterfrom
guy-ealey-morag:fix-fd-leaks
Closed

TEST/COMMON: Fix file descriptor leaks between tests#11222
guy-ealey-morag wants to merge 8 commits intoopenucx:masterfrom
guy-ealey-morag:fix-fd-leaks

Conversation

@guy-ealey-morag
Copy link
Copy Markdown
Contributor

@guy-ealey-morag guy-ealey-morag commented Feb 27, 2026

What?

Fix file descriptor leaks between tests.

@guy-ealey-morag guy-ealey-morag marked this pull request as ready for review March 4, 2026 08:37
Comment thread src/uct/ib/rdmacm/rdmacm_listener.c Outdated
Comment thread test/gtest/common/test.cc
m_state = RUNNING;
} catch (test_skip_exception& e) {
} catch (test_skip_exception &e) {
cleanup();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we don't want to call cleanup if init() failed (same as we don't call d'tor if c'tor throws)

Copy link
Copy Markdown
Contributor Author

@guy-ealey-morag guy-ealey-morag Mar 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In some cases the test_skip_exception is thrown after some initialization already happened, so the cleanup is needed to avoid fd leaks.

I think that the cleanup functions should assume that initialization is not complete and adapt to that (clean only what is initialized, and not assume anything).
It is also possible to check if (m_state != INITIALIZING) to do only part of the cleanup if needed.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are we able to fix the cases to always throw before init?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll try to map out when it can happen

Copy link
Copy Markdown
Contributor Author

@guy-ealey-morag guy-ealey-morag Mar 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tvegas1 It looks like there are many cases where the exception is thrown during initialization in a way that can't be be performed during check_skip_test() without a major refactor (or possibly not at all)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, "cleanup functions should assume that initialization is not complete and adapt to that " is also valid option so let's document it in the functions' declaration. I'd prefer to avoid "m_state != INITIALIZING" checks, like in test_log for example, and make the instead make the cleanup function more "robust". like done in test_p2p_am for example.

Comment thread test/gtest/uct/test_stats.cc Outdated
@guy-ealey-morag
Copy link
Copy Markdown
Contributor Author

Reverted the changes to gdaki.c and moved them to #11170 to make CI pass.

tvegas1
tvegas1 previously approved these changes Mar 6, 2026
@openucx openucx deleted a comment from tvegas1 Mar 6, 2026
Comment thread src/uct/ib/mlx5/gdaki/gdaki.c Outdated
Comment thread src/uct/ib/rdmacm/rdmacm_listener.c Outdated
EXPECT_LE(files_count, ucs_global_opts.log_file_rotate + 1);
EXPECT_NE(0, files_count);
ucs_log_init();
if (m_state != INITIALIZING) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we also skip it for NEW?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cleanup should be never called when m_state is NEW, I also added an assert in ucs::test::cleanup to enforce that

@guy-ealey-morag
Copy link
Copy Markdown
Contributor Author

The major fd leak fix was merged with another PR, and the rest of the fixes proved to be false-positives because the leak check was placed in TearDownProxy() instead of the destructor ~test_base().
So the changes proposed in this PR are not necessary.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants