opencontainers/runc

mirror of https://github.com/opencontainers/runc.git synced 2025-08-01 05:06:52 +03:00

Author	SHA1	Message	Date
Mrunal Patel	406298fdf0	Merge pull request #2466 from kolyshkin/systemd-cpu-quota-period cgroups/systemd: add setting CPUQuotaPeriod prop	2020-06-17 12:03:30 -07:00
Kir Kolyshkin	e751a168dc	cgroups/systemd: add setting CPUQuotaPeriod prop For some reason, runc systemd drivers (both v1 and v2) never set systemd unit property named `CPUQuotaPeriod` (known as `CPUQuotaPeriodUSec` on dbus and in `systemctl show` output). Set it, and add a check to all the integration tests. The check is less than trivial because, when not set, the value is shown as "infinity" but when set to the same (default) value, shown as "100ms", so in case we expect 100ms (period = 100000 us), we have to _also_ check for "infinity". [v2: add systemd version checks since CPUQuotaPeriod requires v242+] Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2020-06-16 15:48:06 -07:00
Kir Kolyshkin	dd2426d067	libct/cgroups: fix m.paths map access This fixes a few cases of accessing m.paths map directly without holding the mutex lock. Fixes: `9087f2e82` Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2020-06-15 18:30:16 -07:00
Kir Kolyshkin	5b247e739c	Merge pull request #2338 from lifubang/systemdcgroupv2 fix path error in systemd when stopped LGTMs: @mrunalp @AkihiroSuda	2020-06-15 18:01:13 -07:00
Kir Kolyshkin	8b9646775e	cgroups/systemd: unify adding CpuQuota The code that adds CpuQuotaPerSecUSec is the same in v1 and v2 systemd cgroup driver. Move it to common. No functional change. Note that the comment telling that we always set this property contradicts with the current code, and therefore it is removed. [v2: drop cgroupv1-specific comment] [v3: drop returning error as it's not used] [v4: remove an obsoleted comment] Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2020-06-09 17:14:43 -07:00
Kir Kolyshkin	2ce20ed158	cgroups/systemd: simplify gen*ResourcesProperties Use r instead of c.Resources for readability. No functional change. This commit has been brought to you by '<,'>s/c\.Resources\./r./g Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2020-06-08 13:42:09 -07:00
lifubang	9087f2e827	fix path error in systemd when stopped When we use cgroups with the systemd driver, systemd automatically removes the cgroup path once all processes have exited. So we should check that the cgroup path exists before we access it, for example in `kill/ps`, or else we will get an error. Signed-off-by: lifubang <lifubang@acmcoder.com>	2020-06-02 18:17:43 +08:00
Mrunal Patel	332a84581e	Merge pull request #2443 from kolyshkin/kmem-fixup cgroupv1/systemd.Set: don't enable kernel memory acct	2020-05-31 10:04:45 -07:00
Kir Kolyshkin	3fe6e04510	cgroupv1/systemd.Set: don't enable kernel memory acct This is a regression from commit `1d4ccc8e0`. We only need to enable kernel memory accounting once, from the (legacyManager).Apply(), and there is no need to do it in (legacyManager).Set(). While at it, rename the method to better reflect what it's doing. This saves 1 call to mountinfo parser. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2020-05-29 17:54:50 -07:00
Kir Kolyshkin	3249e2379c	cgroupv1: check cpu shares in place Commit `4e65e0e90a` added a check for cpu shares. Apparently, the kernel allows to set a value higher than max or lower than min without an error, but the value read back is always within the limits. The check (which was later moved out to a separate CheckCpushares() function) is always performed after setting the cpu shares, so let's move it to the very place where it is set. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2020-05-29 16:46:28 -07:00
Kir Kolyshkin	be5467872d	cgroupv1: minimal fix for cpu quota regression This is a quick-n-dirty fix for the regression introduced by commit `06d7c1d`, which made it impossible to only set CpuQuota (without the CpuPeriod). It partially reverts the above commit, and adds a test case. The proper fix will follow. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2020-05-26 11:02:16 -07:00
Kir Kolyshkin	59897367c4	cgroups/systemd: allow to set -1 as pids.limit Currently, both systemd cgroup drivers (v1 and v2) only set "TasksMax" unit property if the value > 0, so there is no way to update the limit to -1 / unlimited / infinity / max. Since systemd driver is backed by fs driver, and both fs and fs2 set the limit of -1 properly, it works, but systemd still has the old value: # runc --systemd-cgroup update $CT --pids-limit 42 # systemctl show runc-$CT.scope | grep TasksMax TasksMax=42 # cat /sys/fs/cgroup/system.slice/runc-$CT.scope/pids.max 42 # ./runc --systemd-cgroup update $CT --pids-limit -1 # systemctl show runc-$CT.scope | grep TasksMax= TasksMax=42 # cat /sys/fs/cgroup/system.slice/runc-xx77.scope/pids.max max Fix by changing the condition to allow -1 as a valid value. NOTE other negative values are still being ignored by systemd drivers (as it was done before). I am not sure whether this is correct, or should we return an error. A test case is added. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2020-05-20 13:20:04 -07:00
Kir Kolyshkin	06d7c1d261	systemd+cgroupv1: fix updating CPUQuotaPerSecUSec 1. do not allow to set quota without period or period without quota, as we won't be able to calculate new value for CPUQuotaPerSecUSec otherwise. 2. do not ignore setting quota to -1 when a period is not set. 3. update the test case accordingly. Note that systemd value checks will be added in the next commit. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2020-05-20 13:17:18 -07:00
Aleksa Sarai	b810da1490	cgroups: systemd: make use of Device= properties It seems we missed that systemd added support for the devices cgroup, as a result systemd would actually write an allow-all rule each time you did 'runc update'* if you used the systemd cgroup driver. This is obviously ... bad and was a clear security bug. Luckily the commits which introduced this were never in an actual runc release. So we simply generate the cgroupv1-style rules (which is what systemd's DeviceAllow wants) and default to a deny-all ruleset. Unfortunately it turns out that systemd is susceptible to the same spurious error failure that we were, so that problem is out of our hands for systemd cgroup users. However, systemd has a similar bug to the one fixed in [1]. It will happily write a disruptive deny-all rule when it is not necessary. Unfortunately, we cannot even use devices.Emulator to generate a minimal set of transition rules because the DBus API is limited (you can only clear or append to the DeviceAllow= list -- so we are forced to always clear it). To work around this, we simply freeze the container during SetUnitProperties. [1]: `afe83489d4` ("cgroupv1: devices: use minimal transition rules with devices.Emulator") Fixes: `1d4ccc8e0c` ("fix data inconsistent when runc update in systemd driven cgroup v1") Fixes: `7682a2b2a5` ("fix data inconsistent when runc update in systemd driven cgroup v2") Signed-off-by: Aleksa Sarai <asarai@suse.de>	2020-05-13 17:43:56 +10:00
Aleksa Sarai	859a780d6f	cgroups: add GetFreezerState() helper to Manager This is effectively a nicer implementation of the container.isPaused() helper, but to be used within the cgroup code for handling some fun issues we have to fix with the systemd cgroup driver. Signed-off-by: Aleksa Sarai <asarai@suse.de>	2020-05-13 17:38:45 +10:00
Kir Kolyshkin	714c91e9f7	Simplify cgroup path handling in v2 via unified API This unties the Gordian Knot of using GetPaths in cgroupv2 code. The problem is, the current code uses GetPaths for three kinds of things: 1. Get all the paths to cgroup v1 controllers to save its state (see (linuxContainer).currentState(), (LinuxFactory).loadState() methods). 2. Get all the paths to cgroup v1 controllers to have the setns process enter the proper cgroups in `(*setnsProcess).start()`. 3. Get the path to a specific controller (for example, `m.GetPaths()["devices"]`). Now, for cgroup v2 instead of a set of per-controller paths, we have only one single unified path, and a dedicated function `GetUnifiedPath()` to get it. This discrepancy between v1 and v2 cgroupManager API leads to the following problems with the code: - multiple if/else code blocks that have to treat v1 and v2 separately; - backward-compatible GetPaths() methods in v2 controllers; - repeated writing of the PID into the same cgroup for v2. Overall, it's hard to write the right code with all this, and the code that is written is kinda hard to follow. The solution is to slightly change the API to do the 3 things outlined above in the same manner for v1 and v2: 1. Use `GetPaths()` for state saving and setns process cgroups entering. 2. Introduce and use Path(subsys string) to obtain a path to a subsystem. For v2, the argument is ignored and the unified path is returned. This commit converts all the controllers to the new API, and modifies all the users to use it. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2020-05-08 12:04:06 -07:00
Kir Kolyshkin	51e1a0842d	libct/cgroups/systemd/v1: privatize v1 manager This patch was generated entirely by gorename -- nothing to review here. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2020-05-08 10:09:48 -07:00
Kir Kolyshkin	d827e323b0	libct/cgroups/systemd/v1: add NewLegacyManager Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2020-05-08 10:07:40 -07:00
Akihiro Suda	bf15cc99b1	cgroup v2: support rootless systemd Tested with both Podman (master) and Moby (master), on Ubuntu 19.10 . $ podman --cgroup-manager=systemd run -it --rm --runtime=runc \ --cgroupns=host --memory 42m --cpus 0.42 --pids-limit 42 alpine / # cat /proc/self/cgroup 0::/user.slice/user-1001.slice/user@1001.service/user.slice/libpod-132ff0d72245e6f13a3bbc6cdc5376886897b60ac59eaa8dea1df7ab959cbf1c.scope / # cat /sys/fs/cgroup/user.slice/user-1001.slice/user@1001.service/user.slice/libpod-132ff0d72245e6f13a3bbc6cdc5376886897b60ac59eaa8dea1df7ab959cbf1c.scope/memory.max 44040192 / # cat /sys/fs/cgroup/user.slice/user-1001.slice/user@1001.service/user.slice/libpod-132ff0d72245e6f13a3bbc6cdc5376886897b60ac59eaa8dea1df7ab959cbf1c.scope/cpu.max 42000 100000 / # cat /sys/fs/cgroup/user.slice/user-1001.slice/user@1001.service/user.slice/libpod-132ff0d72245e6f13a3bbc6cdc5376886897b60ac59eaa8dea1df7ab959cbf1c.scope/pids.max 42 Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>	2020-05-08 12:39:20 +09:00
lifubang	bfa1b2aab3	check that StartTransientUnit and StopUnit succeeds Signed-off-by: lifubang <lifubang@acmcoder.com>	2020-04-28 15:46:28 +08:00
lifubang	1d4ccc8e0c	fix data inconsistent when runc update in systemd driven cgroup v1 Signed-off-by: lifubang <lifubang@acmcoder.com>	2020-04-23 19:32:57 +08:00
Kir Kolyshkin	bb47e35843	cgroup/systemd: reorganize 1. Rename the files - v1.go: cgroupv1 aka legacy; - v2.go: cgroupv2 aka unified hierarchy; - unsupported.go: when systemd is not available. 2. Move the code that is common between v1 and v2 to common.go Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2020-04-19 16:27:40 -07:00

22 Commits
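
Several of the commits listed above deal with translating cgroup limits into systemd unit properties (`CPUQuotaPerSecUSec`, `TasksMax`). The following Go sketch illustrates that translation as the commit messages describe it. It is not the runc code: the function names `cpuQuotaPerSecUSec` and `tasksMax` are hypothetical, and using the maximum `uint64` value to stand for "infinity" is an assumption made for this example. The default period of 100000 us is taken from the `CPUQuotaPeriod` commit message.

```go
package main

import (
	"fmt"
	"math"
)

// defaultPeriodUSec is the default CPU period (100ms), per the
// CPUQuotaPeriod commit message above.
const defaultPeriodUSec uint64 = 100000

// infinity stands for "no limit". Assumption for this sketch:
// it is represented by the largest uint64 value.
const infinity uint64 = math.MaxUint64

// cpuQuotaPerSecUSec (hypothetical helper) converts a quota/period pair,
// both in microseconds, into microseconds of CPU time per second.
// A non-positive quota is treated as unlimited; a zero period falls
// back to the default.
func cpuQuotaPerSecUSec(quota int64, period uint64) uint64 {
	if quota <= 0 {
		return infinity
	}
	if period == 0 {
		period = defaultPeriodUSec
	}
	return uint64(quota) * 1000000 / period
}

// tasksMax (hypothetical helper) maps a pids limit to a TasksMax value.
// The second result reports whether the property should be set at all:
// positive values and -1 (unlimited) are accepted, while zero and other
// negative values are ignored, as the pids.limit commit describes.
func tasksMax(pidsLimit int64) (uint64, bool) {
	switch {
	case pidsLimit > 0:
		return uint64(pidsLimit), true
	case pidsLimit == -1:
		return infinity, true
	default:
		return 0, false
	}
}

func main() {
	// cpu.max "42000 100000" from the rootless systemd commit (--cpus 0.42).
	fmt.Println(cpuQuotaPerSecUSec(42000, 100000)) // prints 420000

	if v, ok := tasksMax(-1); ok {
		fmt.Println(v == infinity) // prints true
	}
}
```

Keeping the "should this property be set" decision as a separate boolean makes the ignored cases (zero and negative values other than -1) explicit instead of hiding them behind a sentinel value.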