This function is cgroupv1-specific, is only used once, and its name
is very close to the name of another function, FindCgroupMountpoint.
Inline it into the (only) caller.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
This fixes a few cases of accessing m.paths map directly without holding
the mutex lock.
Fixes: 9087f2e82
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Since commit 714c91e9f7, method GetPaths() should only be used
for saving container state. For other uses, we have a new method,
Path(), which is cleaner.
Fix GetPaths() usage introduced by recent commits 859a780d6f and 9087f2e82.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
When we use cgroup with systemd driver, the cgroup path will be auto removed
by systemd when all processes exited. So we should check cgroup path exists
when we access the cgroup path, for example in `kill/ps`, or else we will
got an error.
Signed-off-by: lifubang <lifubang@acmcoder.com>
Commit 4e65e0e90a added a check for cpu shares. Apparently, the
kernel allows to set a value higher than max or lower than min without
an error, but the value read back is always within the limits.
The check (which was later moved out to a separate CheckCpushares()
function) is always performed after setting the cpu shares, so let's
move it to the very place where it is set.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Commit 18ebc51b3cc3 "Reset Swap when memory is set to unlimited (-1)"
added handling of the case when a user updates the container limits
to set memory to unlimited (-1) but do not set any other limits.
Apparently, in this case, if swap limit was previously set, kernel fails
to set memory.limit_in_bytes to -1 if memory.memsw.limit_in_bytes is
not set to -1.
What the above commit fails to handle correctly is the request when
Memory is set to -1 and MemorySwap is set to some specific limit N
(where N > 0). In this case, the value of N is silently discarded
and MemorySwap is set to -1 instead.
This is wrong thing to do, as the limit set, even if incorrectly,
should not be ignored.
Fix this by only assigning MemorySwap == -1 in case it was not
explicitly set.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
The function GetClosestMountpointAncestor is not very efficient,
does not really belong to cgroup package, and is only used once
(from fs/cpuset.go).
Remove it, replacing with the implementation based on moby/sys/mountinfo
parser.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
This function is not very efficient, does not really belong to cgroup
package, and is only used once (from fs/cpuset.go).
Prepare to remove it by replacing with the implementation based on
the parser from github.com/moby/sys/mountinfo parser.
This commit is here to make sure the proposed replacement passes the
unit test.
Funny, but the unit test need to be slightly modified since it
supplies the wrong mountinfo (space as the first character, empty line
at the end).
Validated by
$ go test -v -run Ance
=== RUN TestGetClosestMountpointAncestor
--- PASS: TestGetClosestMountpointAncestor (0.00s)
PASS
ok github.com/opencontainers/runc/libcontainer/cgroups 0.002s
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Now that all of the infrastructure for devices.Emulator is in place, we
can finally implement minimal transition rules for devices cgroups. This
allows for minimal disruption to running containers if a rule update is
requested. Only in very rare circumstances (black-list cgroups and mode
switching) will a clear-all rule be written. As a result, containers
should no longer see spurious errors.
A similar issue affects the cgroupv2 devices setup, but that is a topic
for another time (as the solution is drastically different).
Signed-off-by: Aleksa Sarai <asarai@suse.de>
Making them the same type is simply confusing, but also means that you
could accidentally use one in the wrong context. This eliminates that
problem. This also includes a whole bunch of cleanups for the types
within DeviceRule, so that they can be used more ergonomically.
Signed-off-by: Aleksa Sarai <asarai@suse.de>
These lists have been in the codebase for a very long time, and have
been unused for a large portion of that time -- specconv doesn't
generate them and the only user of these flags has been tests (which
doesn't inspire much confidence).
In addition, we had an incorrect implementation of a white-list policy.
This wasn't exploitable because all of our users explicitly specify
"deny all" as the first rule, but it was a pretty glaring issue that
came from the "feature" that users can select whether they prefer a
white- or black- list. Fix this by always writing a deny-all rule (which
is what our users were doing anyway, to work around this bug).
This is one of many changes needed to clean up the devices cgroup code.
Signed-off-by: Aleksa Sarai <asarai@suse.de>
This is effectively a nicer implementation of the container.isPaused()
helper, but to be used within the cgroup code for handling some fun
issues we have to fix with the systemd cgroup driver.
Signed-off-by: Aleksa Sarai <asarai@suse.de>
This unties the Gordian Knot of using GetPaths in cgroupv2 code.
The problem is, the current code uses GetPaths for three kinds of things:
1. Get all the paths to cgroup v1 controllers to save its state (see
(*linuxContainer).currentState(), (*LinuxFactory).loadState()
methods).
2. Get all the paths to cgroup v1 controllers to have the setns process
enter the proper cgroups in `(*setnsProcess).start()`.
3. Get the path to a specific controller (for example,
`m.GetPaths()["devices"]`).
Now, for cgroup v2 instead of a set of per-controller paths, we have only
one single unified path, and a dedicated function `GetUnifiedPath()` to get it.
This discrepancy between v1 and v2 cgroupManager API leads to the
following problems with the code:
- multiple if/else code blocks that have to treat v1 and v2 separately;
- backward-compatible GetPaths() methods in v2 controllers;
- - repeated writing of the PID into the same cgroup for v2;
Overall, it's hard to write the right code with all this, and the code
that is written is kinda hard to follow.
The solution is to slightly change the API to do the 3 things outlined
above in the same manner for v1 and v2:
1. Use `GetPaths()` for state saving and setns process cgroups entering.
2. Introduce and use Path(subsys string) to obtain a path to a
subsystem. For v2, the argument is ignored and the unified path is
returned.
This commit converts all the controllers to the new API, and modifies
all the users to use it.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
1. Prevent theoretical "concurrent map access" error to m.paths.
2. There is no need to call m.Paths -- we can access m.paths directly.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
In many places (not all of them though) we can use `unix.`
instead of `syscall.` as these are indentical.
In particular, x/sys/unix defines:
```go
type Signal = syscall.Signal
type Errno = syscall.Errno
type SysProcAttr = syscall.SysProcAttr
const ENODEV = syscall.Errno(0x13)
```
and unix.Exec() calls syscall.Exec().
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Make use of errors.Is() and errors.As() where appropriate to check
the underlying error. The biggest motivation is to simplify the code.
The feature requires go 1.13 but since merging #2256 we are already
not supporting go 1.12 (which is an unsupported release anyway).
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Using errors.Unwrap() is not the best thing to do, since it returns
nil in case of an error which was not wrapped. More to say,
errors package provides more elegant ways to check for underlying
errors, such as errors.As() and errors.Is().
This reverts commit f8e138855d, reversing
changes made to 6ca9d8e6da.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
`configs.Cgroup` contains the configuration used to create cgroups. This
configuration must be saved to disk, since it's required to restore the
cgroup manager that was used to create the cgroups.
Add method to get cgroup configuration from cgroup Manager to allow API users
save it to disk and restore a cgroup manager later.
fixes#2176
Signed-off-by: Julio Montes <julio.montes@intel.com>
split fs2 package from fs, as mixing up fs and fs2 is very likely to result in
unmaintainable code.
Inspired by containerd/cgroups#109
Fix#2157
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
As the baby step, only unit tests are executed.
Failing tests are currently skipped and will be fixed in follow-up PRs.
Fix#2124
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
Implemented `runc ps` for cgroup v2 , using a newly added method `m.GetUnifiedPath()`.
Unlike the v1 implementation that checks `m.GetPaths()["devices"]`, the v2 implementation does not require the device controller to be available.
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
allow to set what subsystems are used by
libcontainer/cgroups/fs.Manager.
subsystemsUnified is used on a system running with cgroups v2 unified
mode.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>