Skip to content

list: fix IsSet check for global root flag#5207

Open
RedMakeUp wants to merge 1 commit intoopencontainers:mainfrom
RedMakeUp:5203-fix-list-isset-root
Open

list: fix IsSet check for global root flag#5207
RedMakeUp wants to merge 1 commit intoopencontainers:mainfrom
RedMakeUp:5203-fix-list-isset-root

Conversation

@RedMakeUp
Copy link
Copy Markdown

@RedMakeUp RedMakeUp commented Mar 31, 2026

Summary

Fix runc list erroring with open /run/runc: no such file or directory when the default root directory doesn't exist (i.e., no containers have been created yet).

Two bugs in list.go getContainers():

  • Wrong method: context.IsSet("root") always returns false in subcommand context because root is a global flag. Use context.GlobalIsSet("root") instead.
  • Inverted logic: The condition should be !context.GlobalIsSet("root") — ignore a non-existing default root directory, but report error for an explicitly set non-existent --root.

Fixes #5203

Test plan

Added two integration tests in tests/integration/list.bats:

  • list with non-existent default root succeeds — backs up and removes /run/runc, then runs ROOT="" runc list (which goes through the standard wrapper without --root) to verify the fix returns an empty result instead of erroring.
  • list with empty default root succeeds — ensures runc list works correctly when the default root exists but contains no containers.

Both tests skip for rootless (the default root /run/runc is only used as real root; rootless uses $XDG_RUNTIME_DIR/runc instead, which is auto-created by app.Before).

Verified locally:

  • All three tests pass with the fix
  • list with non-existent default root succeeds fails without the fix

Signed-off-by: RedMakeUp girafeeblue@gmail.com

# When the default root (/run/runc) does not exist, runc list
# should succeed with an empty list rather than error out.
unset XDG_RUNTIME_DIR
[[ ! -d /run/runc ]] || skip "/run/runc exists"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. nit: use [ not [[ here I guess.
  2. Does this test actually runs in some of our CI?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This whole PR seems LLM-generated so I wouldn't be surprised that it didn't check if the test actually runs. 🤷

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your review firstly. I did use an LLM to assist with drafting parts of the testcase! However, I want to clarify that the fix itself, the bug, and the testcase are 100% real and based on a strict production environment I am currently working with.
My Scenario
I am currently upgrading my stack (runc 1.1.2 -> 1.3.4, containerd 1.7.14 -> 1.7.30, kubelet 1.29.3-> 1.32.11)

  • standalone kubelet: I run a standalone kubelet with static pods only (no apiserver), using HostNetwork and HostPID for control plane services.
  • Direct runc Usage: To reduce memory overhead from containerd-shim, I bypass containerd and invoke runc directly to run VM images.
  • Almost everything is under the root user: non-root users can not access my env

The actual problem is that, in my control plane services, I use runc list --format json to list current VMs.
In runc 1.1.2, if no containers were running yet (and thus the default /run/runc didn't exist), it safely returned null and exited 0.
In runc 1.3.4, it throws an error and breaks my workflow.

So I unset the rootless (XDG_RUNTIME_DIR) aspect and the setup_bundle behavior, because I don't care about them in my services, not I blindly rely on the LLM.

I'll rework and adjust the testcase based on your suggestions.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • nit: use [ not [[ here I guess.
  • Does this test actually runs in some of our CI?
  1. Done, switched to [.
  2. In the previous version, the test would skip when /run/runc exists, which could happen in CI. I considered a few approaches:
    • Go unit test (like notify_socket_test.go): deterministic, but felt out of place for a one-off case.
    • unshare -m with a fresh tmpfs on /run: guarantees /run/runc doesn't exist, but adds complexity .
    • Backup and remove /run/runc: simple, uses the standard runc wrapper, and teardown() guarantees restoration even if the test fails.

Went with the third approach. The test now runs in all root CI rounds, and skips only for rootless where this bug doesn't apply (rootless uses $XDG_RUNTIME_DIR/runc which is auto-created by app.Before).

unset XDG_RUNTIME_DIR
[[ ! -d /run/runc ]] || skip "/run/runc exists"

sane_run "$RUNC" list
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can just use runc list here.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We auto-set ROOT (and thus add --root $ROOT) for all of our tests in setup_bundle, and that would break this test AFAICS. (Actually, I'm a little surprised we do that -- I thought we only did this for the rootless tests...)

I think we should test the XDG_RUNTIME_DIR case properly though...

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd add a comment telling that we use sane_run $RUNC to avoid passing --root here.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We auto-set ROOT (and thus add --root $ROOT) for all of our tests in setup_bundle, and that would break this test AFAICS. (Actually, I'm a little surprised we do that -- I thought we only did this for the rootless tests...)

I think we should test the XDG_RUNTIME_DIR case properly though...

Resolved the setup_bundle issue by using ROOT="" runc list — empty ROOT makes setup_runc_cmdline skip adding --root, so runc uses the default /run/runc. I looked into testing the XDG_RUNTIME_DIR path, but this bug is actually not reachable there:

  • As real root, shouldHonorXDGRuntimeDir() returns false, so XDG_RUNTIME_DIR is ignored entirely.
  • As rootless, app.Before auto-creates $XDG_RUNTIME_DIR/runc via MkdirAll, so os.ReadDir never hits ErrNotExist.

So the tests skip for rootless with a comment explaining this.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd add a comment telling that we use sane_run $RUNC to avoid passing --root here.

I found ROOT="" runc list can also work to avoid passing --root, so I've changed to that based on your previous comment. But I wonder which one is better

sane_run "$RUNC" list
[ "$status" -eq 0 ]

sane_run "$RUNC" list --format json
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

[ "$status" -eq 0 ]
[[ "${output}" == "null" ]]

sane_run "$RUNC" list -q
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@RedMakeUp RedMakeUp force-pushed the 5203-fix-list-isset-root branch from 9fd7538 to bedb747 Compare April 1, 2026 17:32
@test "list with non-existent default root succeeds" {
# The default root /run/runc is only used when running as real
# root; rootless uses $XDG_RUNTIME_DIR/runc instead.
[ "$(id -u)" -eq 0 ] || skip "requires real root"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
[ "$(id -u)" -eq 0 ] || skip "requires real root"
requires root

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@RedMakeUp RedMakeUp force-pushed the 5203-fix-list-isset-root branch 2 times, most recently from 79a8f83 to c68c18e Compare April 1, 2026 18:18
kolyshkin added a commit to kolyshkin/runc that referenced this pull request Apr 1, 2026
Some of runc integration tests may do something that I would not like
when running those on my development laptop. Examples include

 - changing the root mount propagation [1];
 - replacing /root/runc [2];
 - changing the file in /etc (see checkpoint.bats).

Yet it is totally fine to do all that in a throwaway CI environment,
or inside a Docker container.

Introduce a mechanism to skip specific "unsafe" tests unless an
environment variable, RUNC_ALLOW_UNSAFE_TESTS, is set. Use it
from a specific checkpoint/restore test which modifies
/etc/criu/default.conf.

[1]: opencontainers#5200
[2]: opencontainers#5207

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
kolyshkin added a commit to kolyshkin/runc that referenced this pull request Apr 2, 2026
Some of runc integration tests may do something that I would not like
when running those on my development laptop. Examples include

 - changing the root mount propagation [1];
 - replacing /root/runc [2];
 - changing the file in /etc (see checkpoint.bats).

Yet it is totally fine to do all that in a throwaway CI environment,
or inside a Docker container.

Introduce a mechanism to skip specific "unsafe" tests unless an
environment variable, RUNC_ALLOW_UNSAFE_TESTS, is set. Use it
from a specific checkpoint/restore test which modifies
/etc/criu/default.conf.

[1]: opencontainers#5200
[2]: opencontainers#5207

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
@kolyshkin
Copy link
Copy Markdown
Contributor

The test in this PR (which modifies /root/runc) should benefit from the feature from #5212, so if that one is approved and merged, the test here needs to add require unsafe.

@RedMakeUp
Copy link
Copy Markdown
Author

The test in this PR (which modifies /root/runc) should benefit from the feature from #5212, so if that one is approved and merged, the test here needs to add require unsafe.

That sounds great!

kolyshkin added a commit to kolyshkin/runc that referenced this pull request Apr 3, 2026
Some of runc integration tests may do something that I would not like
when running those on my development laptop. Examples include

 - changing the root mount propagation [1];
 - replacing /root/runc [2];
 - changing the file in /etc (see checkpoint.bats).

Yet it is totally fine to do all that in a throwaway CI environment,
or inside a Docker container.

Introduce a mechanism to skip specific "unsafe" tests unless an
environment variable, RUNC_ALLOW_UNSAFE_TESTS, is set. Use it
from a specific checkpoint/restore test which modifies
/etc/criu/default.conf.

[1]: opencontainers#5200
[2]: opencontainers#5207

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
@RedMakeUp RedMakeUp force-pushed the 5203-fix-list-isset-root branch from c68c18e to 3bffef9 Compare April 5, 2026 06:40
context.IsSet("root") always returns false in subcommand context
because "root" is a global flag. Use context.GlobalIsSet instead.
Also fix the inverted logic -- the intent (per the comment and
commit d1fca8e) is to ignore a non-existing default root directory.

Fixes opencontainers#5203

Signed-off-by: RedMakeUp <girafeeblue@gmail.com>
@RedMakeUp RedMakeUp force-pushed the 5203-fix-list-isset-root branch from 3bffef9 to dcdc5ee Compare April 5, 2026 07:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

list: context.IsSet("root") always returns false for global flag, causing error on default root

3 participants