There have been cases where, instead of `v$VER.0-$OS`, the systemdVersion returned is just `$VER` or `$VER-1`.
Handle these cases.
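The three version formats mentioned above could be handled roughly as follows. This is a minimal sketch, not the actual runc code: the helper name `parseSystemdVersion` is hypothetical, and stripping surrounding double quotes is an assumption about how the value arrives.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseSystemdVersion is a hypothetical helper (not runc's actual code).
// It extracts the leading integer from strings such as "v245.4-1.fc32",
// "245" or "245-1". Surrounding double quotes, if any, are stripped
// (an assumption about the input).
func parseSystemdVersion(s string) (int, error) {
	s = strings.Trim(s, `"`)
	s = strings.TrimPrefix(s, "v")
	// Keep only the leading run of ASCII digits.
	end := 0
	for end < len(s) && s[end] >= '0' && s[end] <= '9' {
		end++
	}
	if end == 0 {
		return 0, fmt.Errorf("can't parse systemd version %q", s)
	}
	return strconv.Atoi(s[:end])
}

func main() {
	for _, v := range []string{"v245.4-1.fc32", "245", "245-1"} {
		n, err := parseSystemdVersion(v)
		fmt.Println(v, n, err)
	}
}
```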
Signed-off-by: Peter Hunt <pehunt@redhat.com>
Not sure why, but the errors from the scanner were ignored. Such errors
can happen if open(2) has succeeded but the subsequent read(2) fails.
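A minimal sketch of the pattern, assuming the scanner in question is a `bufio.Scanner`: `Scan` returns false both at EOF and on a read error, so `Err` has to be checked after the loop. The `failingReader` type and the `countLines` helper are hypothetical and exist only to simulate a failing read.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"io"
	"strings"
)

// errRead is a made-up error used to simulate a failing read(2).
var errRead = errors.New("simulated read error")

// failingReader is a hypothetical reader that returns its data and then
// fails instead of reporting EOF.
type failingReader struct {
	r io.Reader
}

func (f *failingReader) Read(p []byte) (int, error) {
	n, err := f.r.Read(p)
	if err == io.EOF {
		return n, errRead
	}
	return n, err
}

// newFailingReader wraps data in a failingReader.
func newFailingReader(data string) io.Reader {
	return &failingReader{r: strings.NewReader(data)}
}

// countLines counts lines and reports any read error.
func countLines(r io.Reader) (int, error) {
	n := 0
	s := bufio.NewScanner(r)
	for s.Scan() {
		n++
	}
	// Scan returns false both at EOF and on error; only Err tells them apart.
	if err := s.Err(); err != nil {
		return n, err
	}
	return n, nil
}

func main() {
	n, err := countLines(newFailingReader("a\nb\n"))
	fmt.Println(n, err)
}
```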
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
For some reason, runc systemd drivers (both v1 and v2) never set
systemd unit property named `CPUQuotaPeriod` (known as
`CPUQuotaPeriodUSec` on dbus and in `systemctl show` output).
Set it, and add a check to all the integration tests. The check is less
than trivial because, when not set, the value is shown as "infinity" but
when set to the same (default) value, shown as "100ms", so in case we
expect 100ms (period = 100000 us), we have to _also_ check for
"infinity".
[v2: add systemd version checks since CPUQuotaPeriod requires v242+]
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
When testing GetCgroupMounts, the map data is supposed to be obtained
from /proc/self/cgroup, but since we're mocking things, we provide
our own map.
Unfortunately, not all controllers existing in mountinfos were listed.
Also, "name=systemd" needs special handling, so add it.
The controllers added were:
* for fedoraMountinfo case: name=systemd
* for systemdMountinfo case: name=systemd, net_prio
* for bedrockMountinfo case: name=systemd, net_prio, pids
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
In most projects, "utils" is a big mess, and this is not an exception.
Try to clean it up a bit by moving cgroup v1 specific code to a separate
source file.
There are no code changes in this commit, just moving it from one file
to another.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
This function is cgroupv1-specific, is only used once, and its name
is very close to the name of another function, FindCgroupMountpoint.
Inline it into the (only) caller.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
This function is only called from cgroupv1 code, so there is no need
for it to implement cgroupv2 stuff.
Make it v1-specific, and panic if it is called from v2 code (since this
is an internal function, the panic would mean incorrect runc code).
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
It's bad and wrong to use these functions for any cgroupv2 code,
and there are no existing users (in runc, at least).
Make them return an error in such a case.
Also, remove the cgroupv2-specific handling from
findCgroupMountpointAndRootFromReader().
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
This function should not really be used for cgroupv2 code.
Currently it is used in kubernetes code, so we can't remove
the v2 case yet.
Add a TODO item to remove v2 code once kubernetes is converted
to not use it, and separate out v1 code.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
This function is not used and was never used in any cgroupv2 code.
To keep it that way, let it return an error in case it's called
for v2.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
This fixes a few cases of accessing m.paths map directly without holding
the mutex lock.
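The locking pattern can be sketched as follows. The `manager` type, its `mu` field, and the `setPath` helper are hypothetical stand-ins for illustration; only the idea of guarding every access to the `paths` map with the mutex is taken from the description above.

```go
package main

import (
	"fmt"
	"sync"
)

// manager is a simplified, hypothetical stand-in for a cgroup manager.
type manager struct {
	mu    sync.Mutex
	paths map[string]string
}

// Path returns the path for a subsystem, holding the lock while
// reading the map.
func (m *manager) Path(subsys string) string {
	m.mu.Lock()
	defer m.mu.Unlock()
	return m.paths[subsys]
}

// setPath (hypothetical) stores a path, holding the lock while
// writing to the map.
func (m *manager) setPath(subsys, path string) {
	m.mu.Lock()
	defer m.mu.Unlock()
	if m.paths == nil {
		m.paths = make(map[string]string)
	}
	m.paths[subsys] = path
}

func main() {
	m := &manager{}
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			m.setPath(fmt.Sprintf("ctrl%d", i), "/sys/fs/cgroup/example")
			_ = m.Path("ctrl0")
		}(i)
	}
	wg.Wait()
	fmt.Println(m.Path("ctrl0"))
}
```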
Fixes: 9087f2e82
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Since commit 714c91e9f7, method GetPaths() should only be used
for saving container state. For other uses, we have a new method,
Path(), which is cleaner.
Fix GetPaths() usage introduced by recent commits 859a780d6f and 9087f2e82.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
systemd drivers ignore --cpu-quota during update if the CPU
period was not set earlier.
Fixed by adding the default for the period.
The test will be added by the following commit.
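As an illustration of why a default period is needed, here is a minimal sketch of converting quota and period into a per-second value. The helper name `quotaPerSecUSec` is hypothetical, the default of 100000 us is the usual CFS default period, and the real driver code may round or handle edge cases differently.

```go
package main

import "fmt"

// defaultCPUPeriod is the default CFS period in microseconds (100ms).
const defaultCPUPeriod = 100000

// quotaPerSecUSec is a hypothetical helper. It converts a quota and a
// period (both in microseconds) into microseconds of CPU time per second
// of wall-clock time. The second return value is false when there is
// nothing to set (quota <= 0 in this sketch).
func quotaPerSecUSec(quota int64, period uint64) (uint64, bool) {
	if quota <= 0 {
		return 0, false
	}
	if period == 0 {
		// Without this default, a quota given without a period
		// could not be converted and would be ignored.
		period = defaultCPUPeriod
	}
	return uint64(quota) * 1000000 / period, true
}

func main() {
	fmt.Println(quotaPerSecUSec(50000, 0))
	fmt.Println(quotaPerSecUSec(50000, 200000))
}
```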
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
This (and the converting function) is only used by one of the four
cgroup drivers. The other three do some checking and conversion in
place, so let the fs2 do the same.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
The code that adds CPUQuotaPerSecUSec is the same in the v1 and v2
systemd cgroup drivers. Move it to common.
No functional change.
Note that the comment saying that we always set this property
contradicts the current code, and therefore it is removed.
[v2: drop cgroupv1-specific comment]
[v3: drop returning error as it's not used]
[v4: remove an obsoleted comment]
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Use r instead of c.Resources for readability. No functional change.
This commit has been brought to you by '<,'>s/c\.Resources\./r./g
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
When we use cgroups with the systemd driver, the cgroup path is automatically
removed by systemd once all processes have exited. So we should check that the
cgroup path exists before we access it, for example in `kill/ps`, or else we
will get an error.
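A minimal sketch of such an existence check, assuming a plain `os.Stat` is sufficient. The helper name `pathExists` is hypothetical and is not taken from the runc code.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// pathExists is a hypothetical helper. It reports whether path exists,
// treating "not found" as a normal result rather than an error.
func pathExists(path string) (bool, error) {
	_, err := os.Stat(path)
	if err == nil {
		return true, nil
	}
	if errors.Is(err, os.ErrNotExist) {
		// The cgroup may already have been removed; this is not a failure.
		return false, nil
	}
	return false, err
}

func main() {
	fmt.Println(pathExists("."))
	fmt.Println(pathExists("/nonexistent-cgroup-path-for-sketch"))
}
```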
Signed-off-by: lifubang <lifubang@acmcoder.com>
This is a regression from commit 1d4ccc8e0. We only need to enable
kernel memory accounting once, from (*legacyManager).Apply(),
and there is no need to do it in (*legacyManager).Set().
While at it, rename the method to better reflect what it's doing.
This saves 1 call to mountinfo parser.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Commit 4e65e0e90a added a check for cpu shares. Apparently, the
kernel allows setting a value higher than max or lower than min without
an error, but the value read back is always within the limits.
The check (which was later moved out to a separate CheckCpushares()
function) is always performed after setting the cpu shares, so let's
move it to the very place where it is set.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
1. In case there are no sub-cgroups, a single rmdir should be faster
than iterating through the list of files.
2. Use unix.Rmdir() to save one more syscall since os.Remove() tries
unlink(2) first which fails on a directory, and only then tries
rmdir(2).
3. Re-use rmdir.
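The approach in the list above can be sketched as follows. This is not the actual runc code: `removeCgroupDir` and `demo` are hypothetical names, and the standard library's `syscall.Rmdir` is used in place of `unix.Rmdir()` so that the example has no external dependencies. The demo tree contains only directories, since on a real cgroup filesystem the control files disappear together with their directory.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
	"syscall"
)

// removeCgroupDir is a hypothetical helper. It first tries a single
// rmdir(2); only if that fails does it walk the sub-directories.
func removeCgroupDir(dir string) error {
	// Fast path: no sub-cgroups, one syscall is enough.
	err := syscall.Rmdir(dir)
	if err == nil || errors.Is(err, syscall.ENOENT) {
		return nil
	}
	// Slow path: remove sub-directories first, then retry.
	entries, rerr := os.ReadDir(dir)
	if rerr != nil {
		return rerr
	}
	for _, e := range entries {
		if e.IsDir() {
			if err := removeCgroupDir(filepath.Join(dir, e.Name())); err != nil {
				return err
			}
		}
	}
	return syscall.Rmdir(dir)
}

// demo builds a small tree of directories (a stand-in for nested
// cgroups), removes it, and reports whether it is gone.
func demo() (bool, error) {
	top, err := os.MkdirTemp("", "rmdir-sketch")
	if err != nil {
		return false, err
	}
	if err := os.MkdirAll(filepath.Join(top, "a", "b"), 0o755); err != nil {
		return false, err
	}
	if err := removeCgroupDir(top); err != nil {
		return false, err
	}
	_, err = os.Stat(top)
	return errors.Is(err, os.ErrNotExist), nil
}

func main() {
	fmt.Println(demo())
}
```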
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
This is a quick-n-dirty fix for the regression introduced by commit
06d7c1d, which made it impossible to only set CpuQuota
(without the CpuPeriod). It partially reverts the above commit,
and adds a test case.
The proper fix will follow.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
... and mem+swap is not explicitly set otherwise.
This ensures compatibility with cgroupv1 controller which interprets
things this way.
With this fixed, we can finally enable swap tests for cgroupv2.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
1. Partially revert "CreateCgroupPath: only enable needed controllers".
If we update a resource that was not limited in the beginning,
the update has no effect.
2. Return an error if a controller that is not enabled is used;
otherwise the user may think the update succeeded while it actually had no effect.
Signed-off-by: lifubang <lifubang@acmcoder.com>
Commit 18ebc51b3cc3 "Reset Swap when memory is set to unlimited (-1)"
added handling of the case when a user updates the container limits
to set memory to unlimited (-1) but does not set any other limits.
Apparently, in this case, if swap limit was previously set, kernel fails
to set memory.limit_in_bytes to -1 if memory.memsw.limit_in_bytes is
not set to -1.
What the above commit fails to handle correctly is the request when
Memory is set to -1 and MemorySwap is set to some specific limit N
(where N > 0). In this case, the value of N is silently discarded
and MemorySwap is set to -1 instead.
This is the wrong thing to do, as the limit set, even if incorrectly,
should not be ignored.
Fix this by only assigning MemorySwap == -1 in case it was not
explicitly set.
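The intended condition can be sketched as follows. The `resources` struct and the `fixupMemorySwap` helper are hypothetical, and the sketch assumes that a value of 0 means "not explicitly set".

```go
package main

import "fmt"

// resources is a hypothetical, trimmed-down stand-in for the cgroup
// resources struct. Assumption: 0 means "not set", -1 means "unlimited".
type resources struct {
	Memory     int64
	MemorySwap int64
}

// fixupMemorySwap (hypothetical) resets swap to unlimited only when
// memory is set to unlimited and swap was not explicitly set.
func fixupMemorySwap(r *resources) {
	if r.Memory == -1 && r.MemorySwap == 0 {
		r.MemorySwap = -1
	}
}

func main() {
	a := &resources{Memory: -1}
	fixupMemorySwap(a)
	fmt.Println(a.MemorySwap)

	b := &resources{Memory: -1, MemorySwap: 1 << 30}
	fixupMemorySwap(b)
	fmt.Println(b.MemorySwap)
}
```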
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Currently, both systemd cgroup drivers (v1 and v2) only set
"TasksMax" unit property if the value > 0, so there is no
way to update the limit to -1 / unlimited / infinity / max.
Since the systemd driver is backed by the fs driver, and both fs and fs2
set the limit of -1 properly, it works, but systemd still has
the old value:
    # runc --systemd-cgroup update $CT --pids-limit 42
    # systemctl show runc-$CT.scope | grep TasksMax
    TasksMax=42
    # cat /sys/fs/cgroup/system.slice/runc-$CT.scope/pids.max
    42
    # ./runc --systemd-cgroup update $CT --pids-limit -1
    # systemctl show runc-$CT.scope | grep TasksMax=
    TasksMax=42
    # cat /sys/fs/cgroup/system.slice/runc-xx77.scope/pids.max
    max
Fix by changing the condition to allow -1 as a valid value.
NOTE: other negative values are still ignored by the systemd drivers
(as before). I am not sure whether this is correct, or whether we
should return an error instead.
A test case is added.
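The changed condition can be sketched as follows. The helper name `tasksMax` is hypothetical, and mapping -1 to the maximum uint64 value as "infinity" is an assumption about how the property value is encoded, not something stated above.

```go
package main

import (
	"fmt"
	"math"
)

// tasksMax is a hypothetical helper. It returns the value to use for the
// "TasksMax" unit property and whether the property should be set at all.
func tasksMax(pidsLimit int64) (uint64, bool) {
	switch {
	case pidsLimit > 0:
		return uint64(pidsLimit), true
	case pidsLimit == -1:
		// Assumption: the maximum uint64 value stands for "infinity".
		return math.MaxUint64, true
	default:
		// Zero and other negative values are ignored, as before.
		return 0, false
	}
}

func main() {
	fmt.Println(tasksMax(42))
	fmt.Println(tasksMax(-1))
	fmt.Println(tasksMax(-5))
}
```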
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
1. do not allow setting a quota without a period or a period without a quota,
as we won't be able to calculate the new value for CPUQuotaPerSecUSec otherwise.
2. do not ignore setting quota to -1 when a period is not set.
3. update the test case accordingly.
Note that systemd value checks will be added in the next commit.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
The function GetClosestMountpointAncestor is not very efficient,
does not really belong in the cgroup package, and is only used once
(from fs/cpuset.go).
Remove it, replacing it with an implementation based on the
moby/sys/mountinfo parser.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>