opencontainers/runc

mirror of https://github.com/opencontainers/runc.git synced 2025-09-18 05:23:14 +03:00

Author	SHA1	Message	Date
Kir Kolyshkin	89e59902c4	Modernize code for Go 1.24 Brought to you by modernize -fix -test ./... Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2025-08-27 19:11:02 -07:00
Kir Kolyshkin	17570625c0	Use for range over integers This appears in Go 1.22 (see https://tip.golang.org/ref/spec#For_range). Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2025-03-31 17:15:06 -07:00
Kir Kolyshkin	a638f1330b	.golangci.yml: add nolintlint, fix found issues The errrolint linter can finally ignore errors from Close, and it also ignores direct comparisons of errors from x/sys/unix. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2025-03-24 11:59:54 -07:00
Kir Kolyshkin	65e0f2b719	libct/int: use destroyContainer Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2025-03-24 10:02:47 -07:00
Kir Kolyshkin	1aebfa3eab	libct/int: don't use _ = runContainerOk There is no need to explicitly ignore returned value. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2025-03-24 10:02:47 -07:00
Kir Kolyshkin	5ac77ed6d9	libct/int: add/use needUserNS helper Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2025-03-13 08:40:27 -07:00
Kir Kolyshkin	a75076b4a4	Switch to opencontainers/cgroups This removes libcontainer/cgroups packages and starts using those from github.com/opencontainers/cgroups repo. Mostly generated by: git rm -f libcontainer/cgroups find . -type f -name "*.go" -exec sed -i \ 's\|github.com/opencontainers/runc/libcontainer/cgroups\|github.com/opencontainers/cgroups\|g' \ {} + go get github.com/opencontainers/cgroups@v0.0.1 make vendor gofumpt -w . Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2025-02-28 15:20:33 -08:00
Kir Kolyshkin	7dc2486889	libct: switch to numeric UID/GID/groups This addresses the following TODO in the code (added back in 2015 by commit `845fc65e5`): > // TODO: fix libcontainer's API to better support uid/gid in a typesafe way. Historically, libcontainer internally uses strings for user, group, and additional (aka supplementary) groups. Yet, runc receives those credentials as part of runtime-spec's process, which uses integers for all of them (see [1], [2]). What happens next is: 1. runc start/run/exec converts those credentials to strings (a User string containing "UID:GID", and a []string for additional GIDs) and passes those onto runc init. 2. runc init converts them back to int, in the most complicated way possible (parsing container's /etc/passwd and /etc/group). All this conversion and, especially, parsing is totally unnecessary, but is performed on every container exec (and start). The only benefit of all this is, a libcontainer user could use user and group names instead of numeric IDs (but runc itself is not using this feature, and we don't know if there are any other users of this). Let's remove this back and forth translation, hopefully increasing runc exec performance. The only remaining need to parse /etc/passwd is to set HOME environment variable for a specified UID, in case $HOME is not explicitly set in process.Env. This can now be done right in prepareEnv, which simplifies the code flow a lot. Alas, we can not use standard os/user.LookupId, as it could cache host's /etc/passwd or the current user (even with the osusergo tag). PS Note that the structures being changed (initConfig and Process) are never saved to disk as JSON by runc, so there is no compatibility issue for runc users. Still, this is a breaking change in libcontainer, but we never promised that libcontainer API will be stable (and there's a special package that can handle it -- github.com/moby/sys/user). Reflect this in CHANGELOG. For 3998. [1]: https://github.com/opencontainers/runtime-spec/blob/v1.0.2/config.md#posix-platform-user [2]: https://github.com/opencontainers/runtime-spec/blob/v1.0.2/specs-go/config.go#L86 Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2025-02-06 17:49:17 -08:00
Kir Kolyshkin	6c9ddcc648	libct: switch from libct/devices to libct/cgroups/devices/config Use the old package name as an alias to minimize the patch. No functional change; this just eliminates a bunch of deprecation warnings. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2025-01-31 16:51:09 -08:00
Kir Kolyshkin	6171da6005	libct/configs: add HookList.SetDefaultEnv 1. Make CommandHook.Command a pointer, which reduces the amount of data being copied when using hooks, and allows to modify command hooks. 2. Add SetDefaultEnv, which is to be used by the next commit. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2025-01-09 18:22:53 +08:00
Kir Kolyshkin	390641d148	libct/int: improve TestExecInEnvironment This is a slight refactor of TestExecInEnvironment, making it more strict wrt checking the exec output. 1. Explain why DEBUG is added twice to the env. 2. Reuse the execEnv for the check. 3. Make the check more strict -- instead of looking for substrings, check line by line. 4. Add a check for extra environment variables. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2025-01-09 18:22:53 +08:00
Kir Kolyshkin	9a54594752	libct/int: add BenchmarkExecInBigEnv Here's what it shows on my laptop (with -count 10 -benchtime 10s, summarized by benchstat): │ sec/op │ ExecTrue-20 8.477m ± 2% ExecInBigEnv-20 61.53m ± 1% Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2025-01-09 18:22:53 +08:00
Kir Kolyshkin	a56f85f87b	libct/*: switch from configs to cgroups Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2024-12-11 19:08:40 -08:00
Kir Kolyshkin	1e674098f5	libct/int: add exec benchmark This is a benchmark which checks how fast we can execute /bin/true inside a container. Results from my machine are below. As you can see, in default setup about 70% of exec time is spent for CVE-2019-5736 (copying runc binary), and using either RUNC_DMZ=true or memfd-bind helps a lot. This can also be used for profiling (using -test.cpuprofile option). === Default setup === [kir@kir-tp1 integration]$ sudo ./integration.test -test.run xxx -test.v -test.benchtime 5s -test.count 5 -test.bench . . goos: linux goarch: amd64 pkg: github.com/opencontainers/runc/libcontainer/integration cpu: 12th Gen Intel(R) Core(TM) i7-12800H BenchmarkExecTrue BenchmarkExecTrue-20 327 24475677 ns/op BenchmarkExecTrue-20 244 25242718 ns/op BenchmarkExecTrue-20 232 26187174 ns/op BenchmarkExecTrue-20 237 26780030 ns/op BenchmarkExecTrue-20 318 18487219 ns/op PASS === With DMZ enabled === [kir@kir-tp1 integration]$ sudo -E RUNC_DMZ=true ./integration.test -test.run xxx -test.v -test.benchtime 5s -test.count 5 -test.bench . . goos: linux goarch: amd64 pkg: github.com/opencontainers/runc/libcontainer/integration cpu: 12th Gen Intel(R) Core(TM) i7-12800H BenchmarkExecTrue BenchmarkExecTrue-20 694 8263744 ns/op BenchmarkExecTrue-20 778 8483228 ns/op BenchmarkExecTrue-20 784 8456018 ns/op BenchmarkExecTrue-20 732 8160239 ns/op BenchmarkExecTrue-20 769 8236972 ns/op PASS === With memfd-bind === [kir@kir-tp1 integration]$ sudo systemctl start memfd-bind@$(systemd-escape -p $PWD/integration.test) [kir@kir-tp1 integration]$ sudo ./integration.test -test.run xxx -test.v -test.benchtime 5s -test.count 5 -test.bench . . goos: linux goarch: amd64 pkg: github.com/opencontainers/runc/libcontainer/integration cpu: 12th Gen Intel(R) Core(TM) i7-12800H BenchmarkExecTrue BenchmarkExecTrue-20 800 7538839 ns/op BenchmarkExecTrue-20 717 7424755 ns/op BenchmarkExecTrue-20 848 7747787 ns/op BenchmarkExecTrue-20 800 7668740 ns/op BenchmarkExecTrue-20 751 7304373 ns/op PASS Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2024-10-24 13:39:26 -07:00
Kir Kolyshkin	cb20148703	libct/int: use testing.TB for utils ...so that they can be used for benchmarks, too. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2024-10-24 13:39:26 -07:00
Kir Kolyshkin	0de1953333	runc spec, libct/int: do not add ambient capabilities Commit `98fe566c` removed inheritable capabilities from the example spec (used by runc spec) and from the libcontainer/integration test config, but neglected to also remove ambient capabilities. An ambient capability could only be set if the same inheritable capability is set, so as a result of the above change ambient capabilities were not set (but due to a bug in gocapability package, those errors are never reported). Once we start using a library with the fix [1], that bug will become apparent (both bats-based and libct/int tests will fail). [1]: https://github.com/kolyshkin/capability/pull/3 Fixes: `98fe566c` ("runc: do not set inheritable capabilities") Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2024-09-25 21:48:29 -07:00
Kir Kolyshkin	2c398bb41d	libct/int/seccomp_test: simplify exit code checks In all the three cases, we check that the program returned non-zero exit code. This can be done in a much simpler manner. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2024-08-10 14:46:43 -07:00
Aleksa Sarai	81d63f265a	merge #4331 into opencontainers/runc:main Sebastiaan van Stijn (1): libct/userns: split userns detection from internal userns code LGTMs: kolyshkin cyphar	2024-07-03 14:24:05 +10:00
Sebastiaan van Stijn	30b530ca94	libct/userns: split userns detection from internal userns code Commit `4316df8b53` isolated RunningInUserNS to a separate package to make it easier to consume without bringing in additional dependencies, and with the potential to move it separate in a similar fashion as libcontainer/user was moved to a separate module in commit `ca32014adb`. While RunningInUserNS is fairly trivial to implement, it (or variants of this utility) is used in many codebases, and moving to a separate module could consolidate those implementations, as well as making it easier to consume without large dependency trees (when being a package as part of a larger code base). Commit `1912d5988b` and follow-ups introduced cgo code into the userns package, and code introduced in those commits are not intended for external use, therefore complicating the potential of moving the userns package separate. This commit moves the new code to a separate package; some of this code was included in v1.1.11 and up, but I could not find external consumers of `GetUserNamespaceMappings` and `IsSameMapping`. The `Mapping` and `Handles` types (added in `ba0b5e2698`) only exist in main and in non-stable releases (v1.2.0-rc.x), so don't need an alias / deprecation. Signed-off-by: Sebastiaan van Stijn <github@gone.nl>	2024-06-30 20:06:30 +02:00
Sebastiaan van Stijn	c14213399a	remove pre-go1.17 build-tags Removed pre-go1.17 build-tags with go fix; go fix -mod=readonly ./... Signed-off-by: Sebastiaan van Stijn <github@gone.nl>	2024-06-29 15:45:25 +02:00
Kir Kolyshkin	42cea2ecb4	libct: don't allow to start second init process By definition, every container has only 1 init (i.e. PID 1) process. Apparently, libcontainer API supported running more than 1 init, and at least one tests mistakenly used it. Let's not allow that, erroring out if we already have init. Doing otherwise _probably_ results in some confusion inside the library. Fix two cases in libct/int which ran two inits inside a container. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2024-06-10 22:30:03 -07:00
Kir Kolyshkin	36be6d0510	libct/int: checkpoint test: skip pre-dump if not avail Since we're now testing on ARM, the test case fails when trying to do pre-dump since MemTrack is not available. Skip the pre-dump part if so. This also reverts part of commit `3f4a73d6` as it is no longer needed (now, instead of skipping the whole test, we're just skipping the pre-dump). [Review with --ignore-all-space] Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2024-05-10 11:04:58 -07:00
Kir Kolyshkin	e42d981d67	libct/int: rm double logging in checkpoint_test Since commit `c77aaa3f` the tail of criu log is printed by runc, so there's no need to do the same thing in tests. This also fixes a test failure on ARM where showLog fails (because there's no log file) and thus the conditional t.Skip is not called. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2024-05-09 15:18:29 -07:00
Kir Kolyshkin	62a314656a	libct/int/cpt: simplify test pre-check criu check --feature userns also tests for the /proc/self/ns/user presense, so remove the redundant check, and simplify the error message. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2024-05-09 15:18:29 -07:00
lifubang	4ea0bf88fd	update/add some tests for rlimit issues: https://github.com/opencontainers/runc/issues/4195 https://github.com/opencontainers/runc/pull/4265#discussion_r1588599809 Signed-off-by: lifubang <lifubang@acmcoder.com>	2024-05-08 10:57:10 +00:00
Aleksa Sarai	8e1cd2f56d	init: verify after chdir that cwd is inside the container If a file descriptor of a directory in the host's mount namespace is leaked to runc init, a malicious config.json could use /proc/self/fd/... as a working directory to allow for host filesystem access after the container runs. This can also be exploited by a container process if it knows that an administrator will use "runc exec --cwd" and the target --cwd (the attacker can change that cwd to be a symlink pointing to /proc/self/fd/... and wait for the process to exec and then snoop on /proc/$pid/cwd to get access to the host). The former issue can lead to a critical vulnerability in Docker and Kubernetes, while the latter is a container breakout. We can (ab)use the fact that getcwd(2) on Linux detects this exact case, and getcwd(3) and Go's Getwd() return an error as a result. Thus, if we just do os.Getwd() after chdir we can easily detect this case and error out. In runc 1.1, a /sys/fs/cgroup handle happens to be leaked to "runc init", making this exploitable. On runc main it just so happens that the leaked /sys/fs/cgroup gets clobbered and thus this is only consistently exploitable for runc 1.1. Fixes: GHSA-xr7r-f8xq-vfvv CVE-2024-21626 Co-developed-by: lifubang <lifubang@acmcoder.com> Signed-off-by: lifubang <lifubang@acmcoder.com> [refactored the implementation and added more comments] Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>	2024-01-24 00:20:58 +11:00
Akihiro Suda	3f4a73d632	TestCheckpoint: skip on ErrCriuMissingFeatures Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>	2023-12-19 18:47:03 +09:00
Aleksa Sarai	8e8b136c49	tree-wide: use /proc/thread-self for thread-local state With the idmap work, we will have a tainted Go thread in our thread-group that has a different mount namespace to the other threads. It seems that (due to some bad luck) the Go scheduler tends to make this thread the thread-group leader in our tests, which results in very baffling failures where /proc/self/mountinfo produces gibberish results. In order to avoid this, switch to using /proc/thread-self for everything that is thread-local. This primarily includes switching all file descriptor paths (CLONE_FS), all of the places that check the current cgroup (technically we never will run a single runc thread in a separate cgroup, but better to be safe than sorry), and the aforementioned mountinfo code. We don't need to do anything for the following because the results we need aren't thread-local: * Checks that certain namespaces are supported by stat(2)ing /proc/self/ns/... * /proc/self/exe and /proc/self/cmdline are not thread-local. * While threads can be in different cgroups, we do not do this for the runc binary (or libcontainer) and thus we do not need to switch to the thread-local version of /proc/self/cgroups. * All of the CLONE_NEWUSER files are not thread-local because you cannot set the usernamespace of a single thread (setns(CLONE_NEWUSER) is blocked for multi-threaded programs). Note that we have to use runtime.LockOSThread when we have an open handle to a tid-specific procfs file that we are operating on multiple times. Go can reschedule us such that we are running on a different thread and then kill the original thread (causing -ENOENT or similarly confusing errors). This is not strictly necessary for most usages of /proc/thread-self (such as using /proc/thread-self/fd/$n directly) since only operating on the actual inodes associated with the tid requires this locking, but because of the pre-3.17 fallback for CentOS, we have to do this in most cases. In addition, CentOS's kernel is too old for /proc/thread-self, which requires us to emulate it -- however in rootfs_linux.go, we are in the container pid namespace but /proc is the host's procfs. This leads to the incredibly frustrating situation where there is no way (on pre-4.1 Linux) to figure out which /proc/self/task/... entry refers to the current tid. We can just use /proc/self in this case. Yes this is all pretty ugly. I also wish it wasn't necessary. Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>	2023-12-14 11:36:41 +11:00
Aleksa Sarai	09822c3da8	configs: disallow ambiguous userns and timens configurations For userns and timens, the mappings (and offsets, respectively) cannot be changed after the namespace is first configured. Thus, configuring a container with a namespace path to join means that you cannot also provide configuration for said namespace. Previously we would silently ignore the configuration (and just join the provided path), but we really should be returning an error (especially when you consider that the configuration userns mappings are used quite a bit in runc with the assumption that they are the correct mapping for the userns -- but in this case they are not). In the case of userns, the mappings are also required if you _do not_ specify a path, while in the case of the time namespace you can have a container with a timens but no mappings specified. It should be noted that the case checking that the user has not specified a userns path and a userns mapping needs to be handled in specconv (as opposed to the configuration validator) because with this patchset we now cache the mappings of path-based userns configurations and thus the validator can't be sure whether the mapping is a cached mapping or a user-specified one. So we do the validation in specconv, and thus the test for this needs to be an integration test. Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>	2023-12-05 17:46:09 +11:00
Aleksa Sarai	f81ef1493d	libcontainer: sync: cleanup synchronisation code This includes quite a few cleanups and improvements to the way we do synchronisation. The core behaviour is unchanged, but switching to embedding json.RawMessage into the synchronisation structure will allow us to do more complicated synchronisation operations in future patches. The file descriptor passing through the synchronisation system feature will be used as part of the idmapped-mount and bind-mount-source features when switching that code to use the new mount API outside of nsexec.c. Signed-off-by: Aleksa Sarai <cyphar@cyphar.com> Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2023-08-15 19:54:24 -07:00
Kir Kolyshkin	883aef789b	libct/init: unify init, fix its error logic This commit does two things: 1. Consolidate StartInitialization calling logic into Init(). 2. Fix init error handling logic. The main issues at hand are: - the "unable to convert _LIBCONTAINER_INITPIPE" error from StartInitialization is never shown; - errors from WriteSync and WriteJSON are never shown; - the StartInit calling code is triplicated; - using panic is questionable. Generally, our goals are: - if there's any error, do our best to show it; - but only show each error once; - simplify the code, unify init implementations. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2023-08-04 13:00:35 -07:00
Francis Laniel	c47f58c4e9	Capitalize [UG]idMappings as [UG]IDMappings Signed-off-by: Francis Laniel <flaniel@linux.microsoft.com>	2023-07-21 13:55:34 +02:00
Kir Kolyshkin	f8ad20f500	runc kill: drop -a option As of previous commit, this is implied in a particular scenario. In fact, this is the one and only scenario that justifies the use of -a. Drop the option from the documentation. For backward compatibility, do recognize it, and retain the feature of ignoring the "container is stopped" error when set. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2023-06-08 09:30:40 -07:00
Kir Kolyshkin	9583b3d1c2	libct: move killing logic to container.Signal By default, the container has its own PID namespace, and killing (with SIGKILL) its init process from the parent PID namespace also kills all the other processes. Obviously, it does not work that way when the container is sharing its PID namespace with the host or another container, since init is no longer special (it's not PID 1). In this case, killing container's init will result in a bunch of other processes left running (and thus the inability to remove the cgroup). The solution to the above problem is killing all the container processes, not just init. The problem with the current implementation is, the killing logic is implemented in libcontainer's initProcess.wait, and thus only available to libcontainer users, but not the runc kill command (which uses nonChildProcess.kill and does not use wait at all). So, some workarounds exist: - func destroy(c *Container) calls signalAllProcesses; - runc kill implements -a flag. This code became very tangled over time. Let's simplify things by moving the killing all processes from initProcess.wait to container.Signal, and documents the new behavior. In essence, this also makes `runc kill` to automatically kill all container processes when the container does not have its own PID namespace. Document that as well. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2023-06-08 09:29:25 -07:00
Kir Kolyshkin	2a7dcbbb40	libct: fix shared pidns detection When someone is using libcontainer to start and kill containers from a long lived process (i.e. the same process creates and removes the container), initProcess.wait method is used, which has a kludge to work around killing containers that do not have their own PID namespace. The code that checks for own PID namespace is not entirely correct. To be exact, it does not set sharePidns flag when the host/caller PID namespace is implicitly used. As a result, the above mentioned kludge does not work. Fix the issue, add a test case (which fails without the fix). Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2023-06-08 09:23:29 -07:00
Kir Kolyshkin	5b8f8712a4	libct: signalAllProcesses: remove child reaping There are two very distinct usage scenarios for signalAllProcesses: * when used from the runc binary ("runc kill" command), the processes that it kills are not the children of "runc kill", and so calling wait(2) on each process is totally useless, as it will return ECHLD; * when used from a program that have created the container (such as libcontainer/integration test suite), that program can and should call wait(2), not the signalling code. So, the child reaping code is totally useless in the first case, and should be implemented by the program using libcontainer in the second case. I was not able to track down how this code was added, my best guess is it happened when this code was part of dockerd, which did not have a proper child reaper implemented at that time. Remove it, and add a proper documentation piece. Change the integration test accordingly. PS the first attempt to disable the child reaping code in signalAllProcesses was made in commit `bb912eb00c`, which used a questionable heuristic to figure out whether wait(2) should be called. This heuristic worked for a particular use case, but is not correct in general. While at it: - simplify signalAllProcesses to use unix.Kill; - document (container).Signal. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2023-06-08 09:23:29 -07:00
Kir Kolyshkin	7e481ee2eb	libct/int: remove logger from init Currently, TestInit sets up logrus, and init uses it to log an error from StartInitialization(). This is solely used by TestExecInError to check that error returned from StartInitialization is the one it expects. Note that the very same error is communicated to the runc init parent and is ultimately returned by container.Run(), so checking what StartInitialization returned is redundant. Remove logrus setup and use from TestMain/init. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2023-05-17 12:46:27 -07:00
utam0k	d9230602e9	Implement to set a domainname opencontainers/runtime-spec#1156 Signed-off-by: utam0k <k0ma@utam0k.jp>	2023-04-12 13:31:20 +00:00
Kir Kolyshkin	f2e71b085d	libct/int: make TestFdLeaks more robust The purpose of this test is to check that there are no extra file descriptors left open after repeated calls to runContainer. In fact, the first call to runContainer leaves a few file descriptors opened, and this is by design. Previously, this test relied on two things: 1. some other tests were run before it (and thus all such opened-once file descriptors are already opened); 2. explicitly excluding fd opened to /sys/fs/cgroup. Now, if we run this test separately, it will fail (because of 1 above). The same may happen if the tests are run in a random order. To fix this, add a container run before collection the initial fd list, so those fds that are opened once are included and won't be reported. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2023-02-22 02:58:47 -08:00
Kir Kolyshkin	be7e03940f	libct/int: wording nits Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2023-02-22 02:58:47 -08:00
Kir Kolyshkin	7c75e84e22	libc/int: add/use runContainerOk wrapper This is to de-duplicate the code that checks that err is nil and that the exit code is zero. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2023-02-22 02:58:47 -08:00
Kir Kolyshkin	4fd4af5b1c	CI: workaround CentOS Stream 9 criu issue Older criu builds fail to work properly on CentOS Stream 9 due to changes in glibc's rseq. Skip criu tests if an older criu version is found. Fixes: https://github.com/opencontainers/runc/issues/3532 Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2022-07-27 17:51:41 -07:00
Kir Kolyshkin	b6967fa84c	Decouple cgroup devices handling This commit separates the functionality of setting cgroup device rules out of libct/cgroups to libct/cgroups/devices package. This package, if imported, sets the function variables in libct/cgroups and libct/cgroups/systemd, so that a cgroup manager can use those to manage devices. If those function variables are nil (when libct/cgroups/devices are not imported), a cgroup manager returns the ErrDevicesUnsupported in case any device rules are set in Resources. It also consolidates the code from libct/cgroups/ebpf and libct/cgroups/ebpf/devicefilter into libct/cgroups/devices. Moved some tests in libct/cg/sd that require device management to libct/sd/devices. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2022-05-18 11:17:08 -07:00
Kir Kolyshkin	98fe566c52	runc: do not set inheritable capabilities Do not set inheritable capabilities in runc spec, runc exec --cap, and in libcontainer integration tests. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2022-05-12 08:14:50 +10:00
Kir Kolyshkin	102b8abd26	libct: rm BaseContainer and Container interfaces The only implementation of these is linuxContainer. It does not make sense to have an interface with a single implementation, and we do not foresee other types of containers being added to runc. Remove BaseContainer and Container interfaces, moving their methods documentation to linuxContainer. Rename linuxContainer to Container. Adopt users from using interface to using struct. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2022-03-23 11:04:12 -07:00
Kir Kolyshkin	6a3fe1618f	libcontainer: remove LinuxFactory Since LinuxFactory has become the means to specify containers state top directory (aka --root), and is only used by two methods (Create and Load), it is easier to pass root to them directly. Modify all the users and the docs accordingly. While at it, fix Create and Load docs (those that were originally moved from the Factory interface docs). Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2022-03-22 23:44:31 -07:00
Kir Kolyshkin	8358a0ecbb	libct: StartInitialization: decouple from factory StartInitialization does not have to be a method of Factory (while it is clear why it was done that way initially, now we only have Linux containers so it does not make sense). Fix callers and docs accordingly. No change in functionality. Also, since this was the only user of libcontainer.New with the empty string as an argument, the corresponding check can now be removed from it. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2022-03-22 23:29:24 -07:00
Kir Kolyshkin	0fec1c2d8c	libct: Mount: rm {Pre,Post}mountCmds Those were added by commit `59c5c3ac0` back in Apr 2015, but AFAICS were never used and are obsoleted by more generic container hooks (initially added by commit `05567f2c94` in Sep 2015). Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2022-01-26 15:51:55 -08:00
Kir Kolyshkin	953e56c56f	libct/int: runContainer: drop console arg It is not and was never ever used. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2021-11-29 20:10:22 -08:00
Kir Kolyshkin	972aea3af0	libct/configs/validate: allow / in sysctl names Runtime spec says: > sysctl (object, OPTIONAL) allows kernel parameters to be modified at > runtime for the container. For more information, see the sysctl(8) > man page. and sysctl(8) says: > variable > The name of a key to read from. An example is > kernel.ostype. The '/' separator is also accepted in place of a '.'. Apparently, runc config validator do not support sysctls with / as a separator. Fortunately this is a one-line fix. Add some more test data where / is used as a separator. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2021-10-29 09:45:55 -07:00

1 2 3 4 5

217 Commits