feat(namespace): implement mount propagation semantics (shared/private/slave/unbindable) for container bind mounts #1849

@fslongjin

Motivation

Mount propagation controls how mount/unmount events propagate between namespaces and
peer groups. It is the mechanism that allows container runtimes to:

  • Bind-mount host directories into containers (docker run -v /host:/container)
  • Prevent container mounts from leaking back to the host (MS_PRIVATE)
  • Share specific mount points between containers (MS_SHARED)
  • Implement overlay filesystems for container image layers

Without correct mount propagation, runc/containerd cannot set up a container
rootfs correctly.

Current Status

  • kernel/src/process/namespace/mnt.rs: basic mount namespace structure exists
  • kernel/src/process/namespace/propagation.rs: partial implementation
  • MS_BIND, MS_REC, MS_SHARED, MS_PRIVATE, MS_SLAVE, MS_UNBINDABLE: flags defined
  • Full event propagation (peer group traversal): status unclear / incomplete
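
For reference, the flags above carry the following values in Linux's include/uapi/linux/mount.h. The sketch below also shows how a propagation-change request can be validated: Linux rejects the request unless exactly one propagation flag is set and no unrelated flags (other than MS_REC and MS_SILENT) accompany it. The helper name `propagation_type` and the `PROPAGATION_MASK` constant are hypothetical and do not refer to existing code in this repository.

```rust
// Flag values as defined in Linux's include/uapi/linux/mount.h.
pub const MS_BIND: u64 = 1 << 12;
pub const MS_REC: u64 = 1 << 14;
pub const MS_SILENT: u64 = 1 << 15;
pub const MS_UNBINDABLE: u64 = 1 << 17;
pub const MS_PRIVATE: u64 = 1 << 18;
pub const MS_SLAVE: u64 = 1 << 19;
pub const MS_SHARED: u64 = 1 << 20;

// Hypothetical name: the set of flags that select a propagation type.
const PROPAGATION_MASK: u64 = MS_SHARED | MS_PRIVATE | MS_SLAVE | MS_UNBINDABLE;

/// Hypothetical helper: extracts the single propagation flag from `flags`.
/// Returns `None` where a real mount(2) would fail with EINVAL.
pub fn propagation_type(flags: u64) -> Option<u64> {
    // MS_REC and MS_SILENT are modifiers, not propagation types.
    let ty = flags & !(MS_REC | MS_SILENT);
    // Any flag outside the propagation set makes the request invalid.
    if (ty & !PROPAGATION_MASK) != 0 {
        return None;
    }
    // Exactly one propagation flag must be set (zero is not a power of two).
    if !ty.is_power_of_two() {
        return None;
    }
    Some(ty)
}
```

With this helper, `MS_PRIVATE | MS_REC` (what `mount --make-rprivate` passes) resolves to `MS_PRIVATE`, while `MS_SHARED | MS_SLAVE` is rejected.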

Scope

Implement and verify the following against Linux 6.6 semantics:

  • MS_PRIVATE — a mount receives no propagation events and sends none
  • MS_SHARED — mounts in the peer group see each other's mount/unmount events
  • MS_SLAVE — receives events from its master peer group but does not propagate its own events back
  • MS_UNBINDABLE — like MS_PRIVATE but also cannot be bind-mounted
  • MS_BIND + MS_REC — recursive bind mount of a subtree
  • pivot_root(new_root, put_old) — all Linux error conditions (non-mount-point,
    shared mount, same dir, etc.) must be handled correctly
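
The four propagation types listed above can be illustrated with a small user-space model. This is a minimal sketch, not the kernel design: the types `Propagation` and `Mount` and the functions `propagation_targets` and `can_be_bind_source` are hypothetical names. The model deliberately ignores two things real Linux handles: a mount that is both slave and shared, and transitive propagation through chains of slaves.

```rust
/// Hypothetical model of a mount's propagation type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Propagation {
    Private,
    /// Member of the peer group with this id.
    Shared { group: u32 },
    /// Receives events from the peer group with this id.
    Slave { master: u32 },
    Unbindable,
}

#[derive(Debug, Clone, Copy)]
pub struct Mount {
    pub id: u32,
    pub prop: Propagation,
}

/// Returns the ids of mounts that receive a mount/unmount event
/// originating at the mount `source`, in input order.
pub fn propagation_targets(mounts: &[Mount], source: u32) -> Vec<u32> {
    // Only a shared mount sends events; everything else sends none.
    let group = match mounts.iter().find(|m| m.id == source).map(|m| m.prop) {
        Some(Propagation::Shared { group }) => group,
        _ => return Vec::new(),
    };
    mounts
        .iter()
        .filter(|m| m.id != source)
        .filter(|m| match m.prop {
            // Peers in the same group see the event.
            Propagation::Shared { group: g } => g == group,
            // Slaves of that group receive it but never send back.
            Propagation::Slave { master } => master == group,
            Propagation::Private | Propagation::Unbindable => false,
        })
        .map(|m| m.id)
        .collect()
}

/// An unbindable mount behaves like a private one and additionally
/// cannot be used as the source of a bind mount.
pub fn can_be_bind_source(prop: Propagation) -> bool {
    prop != Propagation::Unbindable
}
```

In this model an event on a shared mount reaches its peers and their slaves, while an event on a slave, private, or unbindable mount reaches nobody.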

Implementation Notes

  • Linux 6.6 reference: fs/namespace.c (do_mount(), mount_setattr()) and
    fs/pnode.c (propagate_mnt(), propagate_umount()).
  • Asterinas reference: kernel/src/fs/vfs/path/mount_namespace.rs.
  • gVisor tests: gvisor/test/syscalls/linux/mount.cc, pivot_root.cc.
  • This work is closely related to the correctness of pivot_root, which is the
    standard mechanism for container rootfs switching.
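
As a starting point for the pivot_root work, the error conditions documented in pivot_root(2) can be written down as a pure precondition check. This is a sketch under stated assumptions: `PivotRootArgs`, its fields, and `check_pivot_root` are hypothetical names, the facts about each path are assumed to have been resolved by the caller, and the order of the checks is illustrative rather than a claim about the order Linux uses.

```rust
/// Linux errno values relevant to pivot_root(2).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Errno {
    EPERM = 1,
    EBUSY = 16,
    ENOTDIR = 20,
    EINVAL = 22,
}

/// Hypothetical bundle of facts about `new_root` and `put_old`.
#[derive(Debug, Clone, Copy)]
pub struct PivotRootArgs {
    pub has_cap_sys_admin: bool,
    pub new_root_is_dir: bool,
    pub put_old_is_dir: bool,
    pub new_root_is_mount_point: bool,
    pub current_root_is_mount_point: bool,
    pub current_root_on_rootfs: bool,
    pub put_old_under_new_root: bool,
    pub new_root_on_current_root_mount: bool,
    pub put_old_on_current_root_mount: bool,
    pub new_root_or_parent_shared: bool,
    pub put_old_is_shared_mount_point: bool,
}

impl PivotRootArgs {
    /// A baseline in which every precondition holds.
    pub const fn valid() -> Self {
        Self {
            has_cap_sys_admin: true,
            new_root_is_dir: true,
            put_old_is_dir: true,
            new_root_is_mount_point: true,
            current_root_is_mount_point: true,
            current_root_on_rootfs: false,
            put_old_under_new_root: true,
            new_root_on_current_root_mount: false,
            put_old_on_current_root_mount: false,
            new_root_or_parent_shared: false,
            put_old_is_shared_mount_point: false,
        }
    }
}

pub fn check_pivot_root(a: &PivotRootArgs) -> Result<(), Errno> {
    if !a.has_cap_sys_admin {
        return Err(Errno::EPERM);
    }
    if !a.new_root_is_dir || !a.put_old_is_dir {
        return Err(Errno::ENOTDIR);
    }
    // Shared propagation on new_root, its parent mount, or put_old.
    if a.new_root_or_parent_shared || a.put_old_is_shared_mount_point {
        return Err(Errno::EINVAL);
    }
    // The current root must be a real mount point and not the initial rootfs.
    if !a.current_root_is_mount_point || a.current_root_on_rootfs {
        return Err(Errno::EINVAL);
    }
    if !a.new_root_is_mount_point {
        return Err(Errno::EINVAL);
    }
    if a.new_root_on_current_root_mount || a.put_old_on_current_root_mount {
        return Err(Errno::EBUSY);
    }
    // put_old must be at or underneath new_root.
    if !a.put_old_under_new_root {
        return Err(Errno::EINVAL);
    }
    Ok(())
}
```

Keeping the checks in one side-effect-free function makes each error path testable without performing a real mount.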

Acceptance Criteria

  • After mount --make-private /, a subsequent bind mount does not propagate to other namespaces.
  • After mount --make-shared /run, bind mounts under /run propagate to peer namespaces.
  • MS_BIND | MS_REC correctly bind-mounts a subtree including all sub-mounts.
  • pivot_root works correctly for a tmpfs-based rootfs (standard container setup).
  • The gVisor mount.cc and pivot_root.cc tests pass.
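
The recursive bind criterion above can be pinned down with a path-level model of which mounts `MS_BIND | MS_REC` should create. This is a hedged sketch: `SubMount`, `strictly_below`, and `recursive_bind_targets` are hypothetical names, and paths are plain strings rather than kernel path objects. It follows the Linux behavior that an unbindable sub-mount, together with everything mounted beneath it, is pruned from the copy.

```rust
/// Hypothetical description of an existing mount point.
#[derive(Debug, Clone)]
pub struct SubMount {
    pub path: String,
    pub unbindable: bool,
}

/// If `path` lies strictly below `root`, returns the suffix starting
/// with '/'. A string prefix match alone is not enough: "/srv/database"
/// is not below "/srv/data".
fn strictly_below<'a>(path: &'a str, root: &str) -> Option<&'a str> {
    let root = root.trim_end_matches('/'); // "/" becomes ""
    let rest = path.strip_prefix(root)?;
    if rest.starts_with('/') && rest.len() > 1 {
        Some(rest)
    } else {
        None
    }
}

/// Returns the mount points a recursive bind of `src` onto `dst` creates:
/// `dst` itself plus one entry per copied sub-mount.
pub fn recursive_bind_targets(mounts: &[SubMount], src: &str, dst: &str) -> Vec<String> {
    let mut result = vec![dst.to_string()];
    let mut pruned: Vec<&str> = Vec::new();

    // Sorting puts every parent path before its children.
    let mut sorted: Vec<&SubMount> = mounts.iter().collect();
    sorted.sort_by(|a, b| a.path.cmp(&b.path));

    for m in sorted {
        let suffix = match strictly_below(&m.path, src) {
            Some(s) => s,
            None => continue, // outside the subtree being bound
        };
        // Skip anything beneath an already pruned unbindable mount.
        if pruned.iter().any(|p| strictly_below(&m.path, p).is_some()) {
            continue;
        }
        if m.unbindable {
            pruned.push(&m.path);
            continue;
        }
        result.push(format!("{}{}", dst.trim_end_matches('/'), suffix));
    }
    result
}
```

A test against the real kernel could compare the mount table after the bind with the list this function returns.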
