list: fix IsSet check for global root flag#5207
list: fix IsSet check for global root flag#5207RedMakeUp wants to merge 1 commit intoopencontainers:mainfrom
Conversation
tests/integration/list.bats
Outdated
| # When the default root (/run/runc) does not exist, runc list | ||
| # should succeed with an empty list rather than error out. | ||
| unset XDG_RUNTIME_DIR | ||
| [[ ! -d /run/runc ]] || skip "/run/runc exists" |
There was a problem hiding this comment.
- nit: use
[not[[here I guess. - Does this test actually runs in some of our CI?
There was a problem hiding this comment.
This whole PR seems LLM-generated so I wouldn't be surprised that it didn't check if the test actually runs. 🤷
There was a problem hiding this comment.
Thanks for your review firstly. I did use an LLM to assist with drafting parts of the testcase! However, I want to clarify that the fix itself, the bug, and the testcase are 100% real and based on a strict production environment I am currently working with.
My Scenario
I am currently upgrading my stack (runc 1.1.2 -> 1.3.4, containerd 1.7.14 -> 1.7.30, kubelet 1.29.3-> 1.32.11)
- standalone kubelet: I run a standalone kubelet with static pods only (no apiserver), using HostNetwork and HostPID for control plane services.
- Direct
runcUsage: To reduce memory overhead fromcontainerd-shim, I bypass containerd and invokeruncdirectly to run VM images. - Almost everything is under the root user: non-root users can not access my env
The actual problem is that, in my control plane services, I use runc list --format json to list current VMs.
In runc 1.1.2, if no containers were running yet (and thus the default /run/runc didn't exist), it safely returned null and exited 0.
In runc 1.3.4, it throws an error and breaks my workflow.
So I unset the rootless (XDG_RUNTIME_DIR) aspect and the setup_bundle behavior, because I don't care about them in my services, not I blindly rely on the LLM.
I'll rework and adjust the testcase based on your suggestions.
There was a problem hiding this comment.
- nit: use
[not[[here I guess.- Does this test actually runs in some of our CI?
- Done, switched to [.
- In the previous version, the test would skip when /run/runc exists, which could happen in CI. I considered a few approaches:
- Go unit test (like notify_socket_test.go): deterministic, but felt out of place for a one-off case.
- unshare -m with a fresh tmpfs on /run: guarantees /run/runc doesn't exist, but adds complexity .
- Backup and remove /run/runc: simple, uses the standard runc wrapper, and teardown() guarantees restoration even if the test fails.
Went with the third approach. The test now runs in all root CI rounds, and skips only for rootless where this bug doesn't apply (rootless uses $XDG_RUNTIME_DIR/runc which is auto-created by app.Before).
tests/integration/list.bats
Outdated
| unset XDG_RUNTIME_DIR | ||
| [[ ! -d /run/runc ]] || skip "/run/runc exists" | ||
|
|
||
| sane_run "$RUNC" list |
There was a problem hiding this comment.
I think you can just use runc list here.
There was a problem hiding this comment.
We auto-set ROOT (and thus add --root $ROOT) for all of our tests in setup_bundle, and that would break this test AFAICS. (Actually, I'm a little surprised we do that -- I thought we only did this for the rootless tests...)
I think we should test the XDG_RUNTIME_DIR case properly though...
There was a problem hiding this comment.
I'd add a comment telling that we use sane_run $RUNC to avoid passing --root here.
There was a problem hiding this comment.
We auto-set
ROOT(and thus add--root $ROOT) for all of our tests insetup_bundle, and that would break this test AFAICS. (Actually, I'm a little surprised we do that -- I thought we only did this for the rootless tests...)I think we should test the
XDG_RUNTIME_DIRcase properly though...
Resolved the setup_bundle issue by using ROOT="" runc list — empty ROOT makes setup_runc_cmdline skip adding --root, so runc uses the default /run/runc. I looked into testing the XDG_RUNTIME_DIR path, but this bug is actually not reachable there:
- As real root, shouldHonorXDGRuntimeDir() returns false, so XDG_RUNTIME_DIR is ignored entirely.
- As rootless, app.Before auto-creates $XDG_RUNTIME_DIR/runc via MkdirAll, so os.ReadDir never hits ErrNotExist.
So the tests skip for rootless with a comment explaining this.
There was a problem hiding this comment.
I'd add a comment telling that we use
sane_run $RUNCto avoid passing--roothere.
I found ROOT="" runc list can also work to avoid passing --root, so I've changed to that based on your previous comment. But I wonder which one is better
tests/integration/list.bats
Outdated
| sane_run "$RUNC" list | ||
| [ "$status" -eq 0 ] | ||
|
|
||
| sane_run "$RUNC" list --format json |
tests/integration/list.bats
Outdated
| [ "$status" -eq 0 ] | ||
| [[ "${output}" == "null" ]] | ||
|
|
||
| sane_run "$RUNC" list -q |
9fd7538 to
bedb747
Compare
tests/integration/list.bats
Outdated
| @test "list with non-existent default root succeeds" { | ||
| # The default root /run/runc is only used when running as real | ||
| # root; rootless uses $XDG_RUNTIME_DIR/runc instead. | ||
| [ "$(id -u)" -eq 0 ] || skip "requires real root" |
There was a problem hiding this comment.
| [ "$(id -u)" -eq 0 ] || skip "requires real root" | |
| requires root |
79a8f83 to
c68c18e
Compare
Some of runc integration tests may do something that I would not like when running those on my development laptop. Examples include - changing the root mount propagation [1]; - replacing /root/runc [2]; - changing the file in /etc (see checkpoint.bats). Yet it is totally fine to do all that in a throwaway CI environment, or inside a Docker container. Introduce a mechanism to skip specific "unsafe" tests unless an environment variable, RUNC_ALLOW_UNSAFE_TESTS, is set. Use it from a specific checkpoint/restore test which modifies /etc/criu/default.conf. [1]: opencontainers#5200 [2]: opencontainers#5207 Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Some of runc integration tests may do something that I would not like when running those on my development laptop. Examples include - changing the root mount propagation [1]; - replacing /root/runc [2]; - changing the file in /etc (see checkpoint.bats). Yet it is totally fine to do all that in a throwaway CI environment, or inside a Docker container. Introduce a mechanism to skip specific "unsafe" tests unless an environment variable, RUNC_ALLOW_UNSAFE_TESTS, is set. Use it from a specific checkpoint/restore test which modifies /etc/criu/default.conf. [1]: opencontainers#5200 [2]: opencontainers#5207 Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
|
The test in this PR (which modifies /root/runc) should benefit from the feature from #5212, so if that one is approved and merged, the test here needs to add |
That sounds great! |
Some of runc integration tests may do something that I would not like when running those on my development laptop. Examples include - changing the root mount propagation [1]; - replacing /root/runc [2]; - changing the file in /etc (see checkpoint.bats). Yet it is totally fine to do all that in a throwaway CI environment, or inside a Docker container. Introduce a mechanism to skip specific "unsafe" tests unless an environment variable, RUNC_ALLOW_UNSAFE_TESTS, is set. Use it from a specific checkpoint/restore test which modifies /etc/criu/default.conf. [1]: opencontainers#5200 [2]: opencontainers#5207 Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
c68c18e to
3bffef9
Compare
context.IsSet("root") always returns false in subcommand context
because "root" is a global flag. Use context.GlobalIsSet instead.
Also fix the inverted logic -- the intent (per the comment and
commit d1fca8e) is to ignore a non-existing default root directory.
Fixes opencontainers#5203
Signed-off-by: RedMakeUp <girafeeblue@gmail.com>
3bffef9 to
dcdc5ee
Compare
Summary
Fix
runc listerroring withopen /run/runc: no such file or directorywhen the default root directory doesn't exist (i.e., no containers have been created yet).Two bugs in
list.gogetContainers():context.IsSet("root")always returnsfalsein subcommand context becauserootis a global flag. Usecontext.GlobalIsSet("root")instead.!context.GlobalIsSet("root")— ignore a non-existing default root directory, but report error for an explicitly set non-existent--root.Fixes #5203
Test plan
Added two integration tests in
tests/integration/list.bats:list with non-existent default root succeeds— backs up and removes/run/runc, then runsROOT="" runc list(which goes through the standard wrapper without--root) to verify the fix returns an empty result instead of erroring.list with empty default root succeeds— ensuresrunc listworks correctly when the default root exists but contains no containers.Both tests skip for rootless (the default root
/run/runcis only used as real root; rootless uses$XDG_RUNTIME_DIR/runcinstead, which is auto-created byapp.Before).Verified locally:
list with non-existent default root succeedsfails without the fixSigned-off-by: RedMakeUp girafeeblue@gmail.com