In cgroup v2, when memory and memorySwap set to the same value which is greater than zero,
runc should write zero in `memory.swap.max` to disable swap.
Signed-off-by: lifubang <lifubang@acmcoder.com>
1. Instead of enabling all available controllers, figure out which
ones are required, and only enable those.
2. Amend all setFoo() functions to call isFooSet(). While this might
seem unnecessary, it might actually help to uncover a bug.
Imagine someone:
- adds a cgroup.Resources.CpuFoo setting;
- modifies setCpu() to apply the new setting;
- but forgets to amend isCpuSet() accordingly <-- BUG
In this case, a test case modifying CpuFoo will help
to uncover the BUG. This is the reason why it's added.
This patch *could be* amended by enabling controllers on a best-effort
basis, i.e. :
- do not return an error early if we can't enable some controllers;
- if we fail to enable all controllers at once (usually because one
of them can't be enabled), try enabling them one by one.
Currently this is not implemented, and it's not clear whether this
would be a good way to go or not.
[v2: add/use is${Controller}Set() functions]
[v3: document neededControllers()]
[v4: drop "best-effort" part]
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Commit 6905b72154 treats all negative values as "max",
citing cgroup v1 compatibility as a reason. In fact, in
cgroup v1 only -1 is treated as "unlimited", and other
negative values usually calse an error.
Treat -1 as "max", pass other negative values as is
(the error will be returned from the kernel).
Fixes: 6905b72154
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
The resources.MemorySwap field from OCI is memory+swap, while cgroupv2
has a separate swap limit, so subtract memory from the limit (and make
sure values are set and sane).
Make sure to set MemorySwapMax for systemd, too. Since systemd does not
have MemorySwapMax for cgroupv1, it is only needed for v2 driver.
[v2: return -1 on any negative value, add unit test]
[v3: treat any negative value other than -1 as error]
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Cgroup v1 kernel doc [1] says:
> We can write "-1" to reset the ``*.limit_in_bytes(unlimited)``.
and cgroup v2 kernel documentation [2] says:
> - If a controller implements an absolute resource guarantee and/or
> limit, the interface files should be named "min" and "max"
> respectively. If a controller implements best effort resource
> guarantee and/or limit, the interface files should be named "low"
> and "high" respectively.
>
> In the above four control files, the special token "max" should be
> used to represent upward infinity for both reading and writing.
Allow -1 value to still be used for v2, converting it to "max"
where it makes sense to do so.
This fixes the following issue:
> runc update test_update --memory-swap -1:
> error while setting cgroup v2: [write /sys/fs/cgroup/machine.slice/runc-cgroups-integration-test.scope/memory.swap.max: invalid argument
> failed to write "-1" to "/sys/fs/cgroup/machine.slice/runc-cgroups-integration-test.scope/memory.swap.max"
> github.com/opencontainers/runc/libcontainer/cgroups/fscommon.WriteFile
> /home/kir/go/src/github.com/opencontainers/runc/libcontainer/cgroups/fscommon/fscommon.go:21
> github.com/opencontainers/runc/libcontainer/cgroups/fs2.setMemory
> /home/kir/go/src/github.com/opencontainers/runc/libcontainer/cgroups/fs2/memory.go:20
> github.com/opencontainers/runc/libcontainer/cgroups/fs2.(*manager).Set
> /home/kir/go/src/github.com/opencontainers/runc/libcontainer/cgroups/fs2/fs2.go:175
> github.com/opencontainers/runc/libcontainer/cgroups/systemd.(*UnifiedManager).Set
> /home/kir/go/src/github.com/opencontainers/runc/libcontainer/cgroups/systemd/unified_hierarchy.go:290
> github.com/opencontainers/runc/libcontainer.(*linuxContainer).Set
> /home/kir/go/src/github.com/opencontainers/runc/libcontainer/container_linux.go:211
[1] linux/Documentation/admin-guide/cgroup-v1/memory.rst
[2] linux/Documentation/admin-guide/cgroup-v2.rst
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
split fs2 package from fs, as mixing up fs and fs2 is very likely to result in
unmaintainable code.
Inspired by containerd/cgroups#109
Fix#2157
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>