If you think of “the set of mounts” as being (at least) a set of (device, mount point) pairs, rather than merely a set of mount points, then it starts to look a lot like the fstab or the output of the mount command (with no arguments), albeit without the additional information about flags and options (e.g. rw, nosuid, etc.).
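To make the “set of (device, mount point) pairs” view concrete, here is a minimal sketch that extracts those pairs from fstab-style text. The function name parse_mount_table and the SAMPLE table are made up for illustration; the only assumption about the input is that the first two whitespace-separated fields of each line are the device and the mount point, as they are in fstab.

```python
from typing import List, Tuple


def parse_mount_table(text: str) -> List[Tuple[str, str]]:
    """Return (device, mount point) pairs from fstab-style text.

    Flags and options (rw, nosuid, ...) are deliberately dropped,
    mirroring the simplified view described above.
    """
    pairs = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 2:
            continue  # malformed line: no mount point field
        pairs.append((fields[0], fields[1]))
    return pairs


# Hypothetical sample table, invented for this example.
SAMPLE = """\
# device   mount point  type  options    dump  pass
/dev/sda1  /            ext4  rw         0     1
/dev/sda2  /home        ext4  rw,nosuid  0     2
proc       /proc        proc  rw         0     0
"""

if __name__ == "__main__":
    for device, mount_point in parse_mount_table(SAMPLE):
        print(f"{device} on {mount_point}")
```

On Linux, the same function can probably be pointed at the contents of /proc/self/mounts, which lists the mounts visible to the calling process in a similar column layout, though that is not something the sketch relies on.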
Such a “set of mounts” provides complete information about what filesystems are mounted where. This is, by definition, the “mount namespace” of a process. Once you go from the traditional situation of having one global mount namespace to having per-process mount namespaces, additional questions arise when a process calls fork().
Traditionally, mounting or unmounting a filesystem changed the filesystem as seen by all processes.
With per-process mount namespaces, it is possible for a child process to have a different mount namespace from its parent. A question now arises:
Should changes to the mount namespace made by the child propagate back to the parent?
Clearly, this functionality must at least be supported and, indeed, must probably be the default. Otherwise, launching the mount command itself would effect no change (since the filesystem as seen by the parent shell would be unaffected).
Equally clearly, it must also be possible to suppress this propagation. Otherwise we could never create a child process whose mount namespace differs from its parent's, and we would be back to one global mount namespace (the filesystem as seen by init).
Thus, when a child process is created, we must decide which of two things it gets. It can get its own copy of the parent's data about mounted filesystems, which it can change without affecting the parent. Or it can get a pointer to the same data structures as the parent, so that its changes propagate back (as they must when you launch mount from the shell).
If the CLONE_NEWNS flag is passed to clone(), the child gets a copy of its parent's mounted filesystem data, which it can change without affecting the parent's mount namespace. Otherwise, it gets a pointer to the parent's data structure, where changes made by the child will be seen by the parent (so the mount command itself can work). fork() takes no flags, so a forked child always falls into this second case.
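The copy-versus-pointer distinction can be illustrated with a toy model. This is only a sketch of the semantics described above, not how the kernel implements them: the Process class and its methods are hypothetical, and a “mount namespace” is reduced to a Python set of (device, mount point) pairs. The CLONE_NEWNS value is the one defined in the Linux headers.

```python
CLONE_NEWNS = 0x00020000  # flag value from <linux/sched.h>


class Process:
    """Toy process whose only state is its mount namespace."""

    def __init__(self, mounts):
        # `mounts` is a set of (device, mount point) pairs.
        self.mounts = mounts

    def clone(self, flags=0):
        if flags & CLONE_NEWNS:
            # Child gets a private copy: later changes stay local.
            return Process(set(self.mounts))
        # Child gets a reference to the very same set: changes are shared.
        return Process(self.mounts)

    def fork(self):
        # fork() has no flags, so the namespace is always shared.
        return self.clone(0)

    def mount(self, device, mount_point):
        self.mounts.add((device, mount_point))


if __name__ == "__main__":
    shell = Process({("/dev/sda1", "/")})

    # Running "mount" from the shell: the child shares the namespace,
    # so the shell sees the new mount.
    mount_cmd = shell.fork()
    mount_cmd.mount("/dev/sdb1", "/mnt")
    print(("/dev/sdb1", "/mnt") in shell.mounts)

    # A child created with CLONE_NEWNS works on its own copy.
    sandbox = shell.clone(CLONE_NEWNS)
    sandbox.mount("tmpfs", "/tmp")
    print(("tmpfs", "/tmp") in shell.mounts)
```

In the model, the forked child's mount shows up in the shell's set, while the mount made by the CLONE_NEWNS child does not.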