1
0
mirror of https://github.com/opencontainers/runc.git synced 2025-08-07 01:22:38 +03:00
Commit Graph

84 Commits

Author SHA1 Message Date
Renaud Gaubert
ccdd75760c Add the CreateRuntime, CreateContainer and StartContainer Hooks
Signed-off-by: Renaud Gaubert <rgaubert@nvidia.com>
2020-06-17 02:10:00 +00:00
Kir Kolyshkin
4189cb65f8 cgroups: remove cgroup.Resources.CpuMax
This (and the converting function) is only used by one of the four
cgroup drivers. The other three do some checking and conversion in
place, so let the fs2 do the same.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-06-09 17:15:38 -07:00
Giuseppe Scrivano
41aa19662b libcontainer: honor seccomp errnoRet
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2020-05-20 09:11:55 +02:00
Aleksa Sarai
24388be71e configs: use different types for .Devices and .Resources.Devices
Making them the same type is simply confusing, but also means that you
could accidentally use one in the wrong context. This eliminates that
problem. This also includes a whole bunch of cleanups for the types
within DeviceRule, so that they can be used more ergonomically.

Signed-off-by: Aleksa Sarai <asarai@suse.de>
2020-05-13 17:38:45 +10:00
Aleksa Sarai
60e21ec26e specconv: remove default /dev/console access
/dev/console is a host resouce which gives a bunch of permissions that
we really shouldn't be giving to containers, not to mention that
/dev/console in containers is actually /dev/pts/$n. Drop this since
arguably this is a fairly scary thing to allow...

Signed-off-by: Aleksa Sarai <asarai@suse.de>
2020-05-13 17:38:45 +10:00
Aleksa Sarai
b2bec9806f cgroup: devices: eradicate the Allow/Deny lists
These lists have been in the codebase for a very long time, and have
been unused for a large portion of that time -- specconv doesn't
generate them and the only user of these flags has been tests (which
doesn't inspire much confidence).

In addition, we had an incorrect implementation of a white-list policy.
This wasn't exploitable because all of our users explicitly specify
"deny all" as the first rule, but it was a pretty glaring issue that
came from the "feature" that users can select whether they prefer a
white- or black- list. Fix this by always writing a deny-all rule (which
is what our users were doing anyway, to work around this bug).

This is one of many changes needed to clean up the devices cgroup code.

Signed-off-by: Aleksa Sarai <asarai@suse.de>
2020-05-13 17:38:45 +10:00
Pradyumna Agrawal
4aa9101477 Honor spec.Process.NoNewPrivileges in specconv.CreateLibcontainerConfig
The change ensures that the passed in value of NoNewPrivileges under spec.Process
is reflected in the container config generated by specconv.CreateLibcontainerConfig

Closes #2397

Signed-off-by: Pradyumna Agrawal <pradyumnaa@vmware.com>
2020-05-11 13:38:14 -07:00
lifubang
d2a9c5da37 using default allowed devices when linux resources is null
Signed-off-by: lifubang <lifubang@acmcoder.com>
2020-04-16 11:40:44 +08:00
Akihiro Suda
cc183ca662 Merge pull request #2242 from AkihiroSuda/vendor-systemd
vendor: update go-systemd and godbus
2020-03-25 02:40:22 +09:00
Akihiro Suda
492d525e55 vendor: update go-systemd and godbus
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2020-03-16 13:26:03 +09:00
Akihiro Suda
aa269315a4 cgroup2: add CpuMax conversion
Fix #2243

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2020-03-13 02:58:39 +09:00
Kir Kolyshkin
1cd71dfd71 systemd properties: support for *Sec values
Some systemd properties are documented as having "Sec" suffix
(e.g. "TimeoutStopSec") but are expected to have "USec" suffix
when passed over dbus, so let's provide appropriate conversion
to improve compatibility.

This means, one can specify TimeoutStopSec with a numeric argument,
in seconds, and it will be properly converted to TimeoutStopUsec
with the argument in microseconds. As a side bonus, even float
values are converted, so e.g. TimeoutStopSec=1.5 is possible.

This turned out a bit more tricky to implement when I was
originally expected, since there are a handful of numeric
types in dbus and each one requires explicit conversion.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-02-17 16:07:19 -08:00
Kir Kolyshkin
4c5c3fb960 Support for setting systemd properties via annotations
In case systemd is used to set cgroups for the container,
it creates a scope unit dedicated to it (usually named
`runc-$ID.scope`).

This patch adds an ability to set arbitrary systemd properties
for the systemd unit via runtime spec annotations.

Initially this was developed as an ability to specify the
`TimeoutStopUSec` property, but later generalized to work with
arbitrary ones.

Example usage: add the following to runtime spec (config.json):

```
	"annotations": {
		"org.systemd.property.TimeoutStopUSec": "uint64 123456789",
		"org.systemd.property.CollectMode":"'inactive-or-failed'"
	},
```

and start the container (e.g. `runc --systemd-cgroup run $ID`).

The above will set the following systemd parameters:
* `TimeoutStopSec` to 2 minutes and 3 seconds,
* `CollectMode` to "inactive-or-failed".

The values are in the gvariant format (see [1]). To figure out
which type systemd expects for a particular parameter, see
systemd sources.

In particular, parameters with `USec` suffix require an `uint64`
typed argument, while gvariant assumes int32 for a numeric values,
therefore the explicit type is required.

NOTE that systemd receives the time-typed parameters as *USec
but shows them (in `systemctl show`) as *Sec. For example,
the stop timeout should be set as `TimeoutStopUSec` but
is shown as `TimeoutStopSec`.

[1] https://developer.gnome.org/glib/stable/gvariant-text.html

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-02-17 16:07:19 -08:00
Boris Popovschi
7c439cc6f6 Added conversion for cpu.weight v2
Signed-off-by: Boris Popovschi <zyqsempai@mail.ru>
2020-02-12 11:32:34 +02:00
Julio Montes
cd7c59d042 libcontainer: export createCgroupConfig
A `config.Cgroups` object is required to manipulate cgroups v1 and v2 using
libcontainer.
Export `createCgroupConfig` to allow API users to create `config.Cgroups`
objects using directly libcontainer API.

Signed-off-by: Julio Montes <julio.montes@intel.com>
2019-12-17 22:46:03 +00:00
Akihiro Suda
faf673ee45 cgroup2: port over eBPF device controller from crun
The implementation is based on https://github.com/containers/crun/blob/0.10.2/src/libcrun/ebpf.c

Although ebpf.c is originally licensed under LGPL-3.0-or-later, the author
Giuseppe Scrivano agreed to relicense the file in Apache License 2.0:
https://github.com/opencontainers/runc/issues/2144#issuecomment-543116397

See libcontainer/cgroups/ebpf/devicefilter/devicefilter_test.go for tested configurations.

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
2019-10-31 14:01:46 +09:00
Aleksa Sarai
8296826da5 specconv: always set "type: bind" in case of MS_BIND
We discovered in umoci that setting a dummy type of "none" would result
in file-based bind-mounts no longer working properly, which is caused by
a restriction for when specconv will change the device type to "bind" to
work around rootfs_linux.go's ... issues.

However, bind-mounts don't have a type (and Linux will ignore any type
specifier you give it) because the type is copied from the source of the
bind-mount. So we should always overwrite it to avoid user confusion.

Signed-off-by: Aleksa Sarai <asarai@suse.de>
2019-04-08 15:08:08 +10:00
Filipe Brandenburger
4b2b978291 Add cgroup name to error message
More information should help troubleshoot an issue when this error occurs.

Signed-off-by: Filipe Brandenburger <filbranden@google.com>
2019-03-14 10:25:00 -07:00
Mrunal Patel
4769cdf607 Merge pull request #1916 from crosbymichael/cgns
Add support for cgroup namespace
2018-11-13 12:21:38 -08:00
Ace-Tang
16d55f17a8 libcontainer: fix potential panic if spec.Process is nil
for the code logic, pointer 'spec.Process' should be judge first
to avoid panic.

Signed-off-by: Ace-Tang <aceapril@126.com>
2018-11-06 11:55:30 +08:00
Aleksa Sarai
9a3a8a5ebf libcontainer: implement CLONE_NEWCGROUP
This is a very simple implementation because it doesn't require any
configuration unlike the other namespaces, and in its current state it
only masks paths.

This feature is available in Linux 4.6+ and is enabled by default for
kernels compiled with CONFIG_CGROUP=y.

Signed-off-by: Aleksa Sarai <asarai@suse.de>
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2018-10-23 16:23:00 -04:00
Xiaochen Shen
27560ace2f libcontainer: intelrdt: add support for Intel RDT/MBA in runc
Memory Bandwidth Allocation (MBA) is a resource allocation sub-feature
of Intel Resource Director Technology (RDT) which is supported on some
Intel Xeon platforms. Intel RDT/MBA provides indirect and approximate
throttle over memory bandwidth for the software. A user controls the
resource by indicating the percentage of maximum memory bandwidth.

Hardware details of Intel RDT/MBA can be found in section 17.18 of
Intel Software Developer Manual:
https://software.intel.com/en-us/articles/intel-sdm

In Linux 4.12 kernel and newer, Intel RDT/MBA is enabled by kernel
config CONFIG_INTEL_RDT. If hardware support, CPU flags `rdt_a` and
`mba` will be set in /proc/cpuinfo.

Intel RDT "resource control" filesystem hierarchy:
mount -t resctrl resctrl /sys/fs/resctrl
tree /sys/fs/resctrl
/sys/fs/resctrl/
|-- info
|   |-- L3
|   |   |-- cbm_mask
|   |   |-- min_cbm_bits
|   |   |-- num_closids
|   |-- MB
|       |-- bandwidth_gran
|       |-- delay_linear
|       |-- min_bandwidth
|       |-- num_closids
|-- ...
|-- schemata
|-- tasks
|-- <container_id>
    |-- ...
    |-- schemata
    |-- tasks

For MBA support for `runc`, we will reuse the infrastructure and code
base of Intel RDT/CAT which implemented in #1279. We could also make
use of `tasks` and `schemata` configuration for memory bandwidth
resource constraints.

The file `tasks` has a list of tasks that belongs to this group (e.g.,
<container_id>" group). Tasks can be added to a group by writing the
task ID to the "tasks" file (which will automatically remove them from
the previous group to which they belonged). New tasks created by
fork(2) and clone(2) are added to the same group as their parent.

The file `schemata` has a list of all the resources available to this
group. Each resource (L3 cache, memory bandwidth) has its own line and
format.

Memory bandwidth schema:
It has allocation values for memory bandwidth on each socket, which
contains L3 cache id and memory bandwidth percentage.
    Format: "MB:<cache_id0>=bandwidth0;<cache_id1>=bandwidth1;..."

The minimum bandwidth percentage value for each CPU model is predefined
and can be looked up through "info/MB/min_bandwidth". The bandwidth
granularity that is allocated is also dependent on the CPU model and
can be looked up at "info/MB/bandwidth_gran". The available bandwidth
control steps are: min_bw + N * bw_gran. Intermediate values are
rounded to the next control step available on the hardware.

For more information about Intel RDT kernel interface:
https://www.kernel.org/doc/Documentation/x86/intel_rdt_ui.txt

An example for runc:
Consider a two-socket machine with two L3 caches where the minimum
memory bandwidth of 10% with a memory bandwidth granularity of 10%.
Tasks inside the container may use a maximum memory bandwidth of 20%
on socket 0 and 70% on socket 1.

"linux": {
    "intelRdt": {
        "memBwSchema": "MB:0=20;1=70"
    }
}

Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
2018-10-16 14:29:29 +08:00
Mrunal Patel
a00bf01908 Merge pull request #1862 from AkihiroSuda/decompose-rootless-pr
Disable rootless mode except RootlessCgMgr when executed as the root in userns (fix Docker-in-LXD regression)
2018-10-15 17:32:15 -07:00
Jonathan Marler
1499c746a1 Move spec.Linux.IntelRdt check to spec.Linux != nil block
Signed-off-by: Jonathan Marler <johnnymarler@gmail.com>
2018-10-04 21:30:55 -06:00
Akihiro Suda
06f789cf26 Disable rootless mode except RootlessCgMgr when executed as the root in userns
This PR decomposes `libcontainer/configs.Config.Rootless bool` into `RootlessEUID bool` and
`RootlessCgroups bool`, so as to make "runc-in-userns" to be more compatible with "rootful" runc.

`RootlessEUID` denotes that runc is being executed as a non-root user (euid != 0) in
the current user namespace. `RootlessEUID` is almost identical to the former `Rootless`
except cgroups stuff.

`RootlessCgroups` denotes that runc is unlikely to have the full access to cgroups.
`RootlessCgroups` is set to false if runc is executed as the root (euid == 0) in the initial namespace.
Otherwise `RootlessCgroups` is set to true.
(Hint: if `RootlessEUID` is true, `RootlessCgroups` becomes true as well)

When runc is executed as the root (euid == 0) in an user namespace (e.g. by Docker-in-LXD, Podman, Usernetes),
`RootlessEUID` is set to false but `RootlessCgroups` is set to true.
So, "runc-in-userns" behaves almost same as "rootful" runc except that cgroups errors are ignored.

This PR does not have any impact on CLI flags and `state.json`.

Note about CLI:
* Now `runc --rootless=(auto|true|false)` CLI flag is only used for setting `RootlessCgroups`.
* Now `runc spec --rootless` is only required when `RootlessEUID` is set to true.
  For runc-in-userns, `runc spec`  without `--rootless` should work, when sufficient numbers of
  UID/GID are mapped.

Note about `$XDG_RUNTIME_DIR` (e.g. `/run/user/1000`):
* `$XDG_RUNTIME_DIR` is ignored if runc is being executed as the root (euid == 0) in the initial namespace, for backward compatibility.
  (`/run/runc` is used)
* If runc is executed as the root (euid == 0) in an user namespace, `$XDG_RUNTIME_DIR` is honored if `$USER != "" && $USER != "root"`.
  This allows unprivileged users to allow execute runc as the root in userns, without mounting writable `/run/runc`.

Note about `state.json`:
* `rootless` is set to true when `RootlessEUID == true && RootlessCgroups == true`.

Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>
2018-09-07 15:05:03 +09:00
Alban Crequy
3321aa1af7 Fix regression with mounts with non-absolute source path
PR #1753 introduced a test on the mount flags but the binary operator
was wrong, see https://github.com/opencontainers/runc/pull/1753#discussion_r203445652

This was noticed when investigating https://github.com/opencontainers/runtime-tools/issues/651

Symptoms: in the container, /proc/self/mountinfo displays some mounts as
follow:

296 279 0:67 / /tmp rw,nosuid - tmpfs /home/dpark/go/src/github.com/opencontainers/runc/tmpfs rw,size=65536k,mode=755

Signed-off-by: Alban Crequy <alban@kinvolk.io>
2018-07-18 18:30:49 +02:00
Qiang Huang
dd67ab10d7 Merge pull request #1759 from cyphar/rootless-erofs-as-eperm
rootless: cgroup: treat EROFS as a skippable error
2018-05-25 09:24:16 +08:00
dlorenc
40680b2d37 Make the setupSeccomp function public.
This function is useful for converting from the OCI spec format to the one used by runC/libcontainer.

Signed-off-by: dlorenc <lorenc.d@gmail.com>
2018-04-17 10:47:22 -07:00
Michael Crosby
d56f6cc202 Merge pull request #1753 from wking/do-not-require-bind-mount-type
libcontainer/specconv/spec_linux: Support empty 'type' for bind mounts
2018-04-16 11:01:53 -04:00
Michael Crosby
9f0eca2a94 Merge pull request #1777 from nalind/no-config-for-extant-netns
Only configure networking when creating a net ns
2018-04-12 10:55:02 -04:00
Nalin Dahyabhai
4521d4b19c Only configure networking when creating a net ns
When joining an existing namespace, don't default to configuring a
loopback interface in that namespace.

Its creator should have done that, and we don't want to fail to create
the container when we don't have sufficient privileges to configure the
network namespace.

Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
2018-04-11 13:28:19 -04:00
Aleksa Sarai
fd3a6e6c83 libcontainer: handle unset oomScoreAdj corectly
Previously if oomScoreAdj was not set in config.json we would implicitly
set oom_score_adj to 0. This is not allowed according to the spec:

> If oomScoreAdj is not set, the runtime MUST NOT change the value of
> oom_score_adj.

Change this so that we do not modify oom_score_adj if oomScoreAdj is not
present in the configuration. While this modifies our internal
configuration types, the on-disk format is still compatible.

Signed-off-by: Aleksa Sarai <asarai@suse.de>
2018-03-17 13:53:42 +11:00
W. Trevor King
0aa6e4e5d3 libcontainer/specconv/spec_linux: Support empty 'type' for bind mounts
From the "Creating a bind mount" section of mount(2) [1]:

> If mountflags includes MS_BIND (available since Linux 2.4), then
> perform a bind mount...
>
> The filesystemtype and data arguments are ignored.

This commit adds support for configurations that leave the OPTIONAL
type [2] unset for bind mounts.  There's a related spec-example change
in flight with [3], although my personal preference would be a more
explicit spec for the whole mount structure [4].

[1]: http://man7.org/linux/man-pages/man2/mount.2.html
[2]: https://github.com/opencontainers/runtime-spec/blame/v1.0.1/config.md#L102
[3]: https://github.com/opencontainers/runtime-spec/pull/954
[4]: https://github.com/opencontainers/runtime-spec/pull/771

Signed-off-by: W. Trevor King <wking@tremily.us>
2018-03-07 10:23:42 -08:00
ynirk
2420eb1f4d The setupUserNamespace function is always called.
The function is called even if the usernamespace is not set.
This results having wrong uid/gid set on devices.

This fix add a test to check if usernamespace is set befor calling
setupUserNamespace.

Fixes #1742

Signed-off-by: Julien Lavesque <julien.lavesque@gmail.com>
2018-02-26 14:27:11 +01:00
Mrunal Patel
c6e4a1ebeb Merge pull request #1665 from Mashimiao/gidmapping-valid-fix
specconv: avoid skipping gidmappings applied when uidmappings is empty
2017-12-11 09:50:54 -08:00
Ma Shimiao
57edfbbaf2 specconv: avoid skipping gidmappings applied when uidmappings is empty
Signed-off-by: Ma Shimiao <mashimiao.fnst@cn.fujitsu.com>
2017-11-30 16:24:36 +08:00
Ma Shimiao
17db6560be support unbindable,runbindable for rootfs propagation
Signed-off-by: Ma Shimiao <mashimiao.fnst@cn.fujitsu.com>
2017-11-17 16:14:15 +08:00
Aleksa Sarai
d4f0f9a52b specconv: emit an error when using MS_PRIVATE with --no-pivot
Due to the semantics of chroot(2) when it comes to mount namespaces, it
is not generally safe to use MS_PRIVATE as a mount propgation when using
chroot(2). The reason for this is that this effectively results in a set
of mount references being held by the chroot'd namespace which the
namespace cannot free. pivot_root(2) does not have this issue because
the @old_root can be unmounted by the process.

Ultimately, --no-pivot is not really necessary anymore as a commonly
used option since f8e6b5af5e ("rootfs: make pivot_root not use a
temporary directory") resolved the read-only issue. But if someone
really needs to use it, MS_PRIVATE is never a good idea.

Signed-off-by: Aleksa Sarai <asarai@suse.de>
2017-10-08 17:50:55 +11:00
Qiang Huang
79ad714374 Merge pull request #1598 from euank/ragent
libcontainer: default mount propagation correctly
2017-09-25 11:55:29 +08:00
Euan Kemp
4301b440d6 libcontainer: default mount propagation correctly
The code in prepareRoot (e385f67a0e/libcontainer/rootfs_linux.go (L599-L605))
attempts to default the rootfs mount to `rslave`. However, since the spec
conversion has already defaulted it to `rprivate`, that code doesn't
actually ever do anything.

This changes the spec conversion code to accept "" and treat it as 0.

Implicitly, this makes rootfs propagation default to `rslave`, which is
a part of fixing the moby bug https://github.com/moby/moby/issues/34672

Alternate implementatoins include changing this defaulting to be
`rslave` and removing the defaulting code in prepareRoot, or skipping
the mapping entirely for "", but I think this change is the cleanest of
those options.

Signed-off-by: Euan Kemp <euan.kemp@coreos.com>
2017-09-22 13:36:23 -07:00
Ma Shimiao
c3d20e7817 Fixes #1585 config.Namespaces is empty when accessed
Signed-off-by: Ma Shimiao <mashimiao.fnst@cn.fujitsu.com>
2017-09-08 09:30:07 +08:00
Xiaochen Shen
692f6e1e27 libcontainer: add support for Intel RDT/CAT in runc
About Intel RDT/CAT feature:
Intel platforms with new Xeon CPU support Intel Resource Director Technology
(RDT). Cache Allocation Technology (CAT) is a sub-feature of RDT, which
currently supports L3 cache resource allocation.

This feature provides a way for the software to restrict cache allocation to a
defined 'subset' of L3 cache which may be overlapping with other 'subsets'.
The different subsets are identified by class of service (CLOS) and each CLOS
has a capacity bitmask (CBM).

For more information about Intel RDT/CAT can be found in the section 17.17
of Intel Software Developer Manual.

About Intel RDT/CAT kernel interface:
In Linux 4.10 kernel or newer, the interface is defined and exposed via
"resource control" filesystem, which is a "cgroup-like" interface.

Comparing with cgroups, it has similar process management lifecycle and
interfaces in a container. But unlike cgroups' hierarchy, it has single level
filesystem layout.

Intel RDT "resource control" filesystem hierarchy:
mount -t resctrl resctrl /sys/fs/resctrl
tree /sys/fs/resctrl
/sys/fs/resctrl/
|-- info
|   |-- L3
|       |-- cbm_mask
|       |-- min_cbm_bits
|       |-- num_closids
|-- cpus
|-- schemata
|-- tasks
|-- <container_id>
    |-- cpus
    |-- schemata
    |-- tasks

For runc, we can make use of `tasks` and `schemata` configuration for L3 cache
resource constraints.

The file `tasks` has a list of tasks that belongs to this group (e.g.,
<container_id>" group). Tasks can be added to a group by writing the task ID
to the "tasks" file  (which will automatically remove them from the previous
group to which they belonged). New tasks created by fork(2) and clone(2) are
added to the same group as their parent. If a pid is not in any sub group, it
Is in root group.

The file `schemata` has allocation bitmasks/values for L3 cache on each socket,
which contains L3 cache id and capacity bitmask (CBM).
	Format: "L3:<cache_id0>=<cbm0>;<cache_id1>=<cbm1>;..."
For example, on a two-socket machine, L3's schema line could be `L3:0=ff;1=c0`
which means L3 cache id 0's CBM is 0xff, and L3 cache id 1's CBM is 0xc0.

The valid L3 cache CBM is a *contiguous bits set* and number of bits that can
be set is less than the max bit. The max bits in the CBM is varied among
supported Intel Xeon platforms. In Intel RDT "resource control" filesystem
layout, the CBM in a group should be a subset of the CBM in root. Kernel will
check if it is valid when writing. e.g., 0xfffff in root indicates the max bits
of CBM is 20 bits, which mapping to entire L3 cache capacity. Some valid CBM
values to set in a group: 0xf, 0xf0, 0x3ff, 0x1f00 and etc.

For more information about Intel RDT/CAT kernel interface:
https://www.kernel.org/doc/Documentation/x86/intel_rdt_ui.txt

An example for runc:
Consider a two-socket machine with two L3 caches where the default CBM is
0xfffff and the max CBM length is 20 bits. With this configuration, tasks
inside the container only have access to the "upper" 80% of L3 cache id 0 and
the "lower" 50% L3 cache id 1:

"linux": {
	"intelRdt": {
		"l3CacheSchema": "L3:0=ffff0;1=3ff"
	}
}

Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
2017-09-01 14:26:33 +08:00
Ma Shimiao
2333e7dc67 fix panic when Linux is nil for rootless case
congfig.Sysctl setting is duplicated.
when contianer is rootless and Linux is nil, runc will panic.

Signed-off-by: Ma Shimiao <mashimiao.fnst@cn.fujitsu.com>
2017-08-16 09:11:13 +08:00
Ma Shimiao
527dc5acbb fix panic when Linux is nil
Linux is not always not nil.
If Linux is nil, panic will occur.

Signed-off-by: Ma Shimiao <mashimiao.fnst@cn.fujitsu.com>
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2017-08-10 15:57:49 -04:00
Michael Crosby
eb70c213ba Update runtime-spec to rc6
Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2017-07-12 16:24:04 -07:00
Michael Crosby
fef3aced0e Merge pull request #1460 from wking/mount-option-lazytime
libcontainer/specconv/spec_linux: Add support for (no)lazytime
2017-06-29 10:06:23 -07:00
W. Trevor King
4f81337e95 libcontainer/specconv/spec_linux: Add support for (no)lazytime
And also silent, loud, (no)iversion, and (no)acl.  This is part of
catching runC up with the spec, which punts valid options to mount(8)
[1,2].

(no)acl is a filesystem-specific entry in mount(8), but it's
represented by a MS_* flag in mount(2) so we need an entry in the
translation table.

[1]: https://github.com/opencontainers/runtime-spec/blame/v1.0.0-rc5/config.md#L68
[2]: https://github.com/opencontainers/runtime-spec/pull/771

Signed-off-by: W. Trevor King <wking@tremily.us>
2017-06-01 20:43:35 -07:00
Michael Crosby
854b41d81e Update spec to 239c4e44f2
This provides updates to runc for the spec changes with *Process and
OOMScoreAdj

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2017-06-01 16:29:47 -07:00
Christy Perez
3d7cb4293c Move libcontainer to x/sys/unix
Since syscall is outdated and broken for some architectures,
use x/sys/unix instead.

There are still some dependencies on the syscall package that will
remain in syscall for the forseeable future:

Errno
Signal
SysProcAttr

Additionally:
- os still uses syscall, so it needs to be kept for anything
returning *os.ProcessState, such as process.Wait.

Signed-off-by: Christy Perez <christy@linux.vnet.ibm.com>
2017-05-22 17:35:20 -05:00
Aleksa Sarai
f0876b0427 libcontainer: configs: add proper HostUID and HostGID
Previously Host{U,G}ID only gave you the root mapping, which isn't very
useful if you are trying to do other things with the IDMaps.

Signed-off-by: Aleksa Sarai <asarai@suse.de>
2017-03-23 20:46:20 +11:00