During early startup memcpy or memset must not be called since many targets
use ifuncs for them which won't be initialized yet. Security hardening may
use -ftrivial-auto-var-init=zero which inserts calls to memset. Redirect
memset to memset_generic by including dl-symbol-redir-ifunc.h in cpu-features.c.
This fixes BZ #33112.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
(cherry picked from commit 681a24ae4d)
The BZ 32653 fix (12a497c716) kept the
stack pointer zeroing from make_main_stack_executable on
_dl_make_stack_executable. However, previously the 'stack_endp'
pointed to temporary variable created before the call of
_dl_map_object_from_fd; while now we use the __libc_stack_end
directly.
Since pthread_getattr_np relies on correct __libc_stack_end, if
_dl_make_stack_executable is called (for instance, when
glibc.rtld.execstack=2 is set) __libc_stack_end will be set to zero,
and the call will always fail.
The __libc_stack_end zero was used a mitigation hardening, but since
52a01100ad it is used solely on
pthread_getattr_np code. So there is no point in zeroing anymore.
Checked on x86_64-linux-gnu and i686-linux-gnu.
Reviewed-by: Sam James <sam@gentoo.org>
(cherry picked from commit 0c34259423)
Due to the extensible nature of the rseq area we can't explictly
initialize fields that are not part of the ABI yet. It was agreed with
upstream that all new fields will be documented as zero initialized by
userspace. Future kernels configured with CONFIG_DEBUG_RSEQ will
validate the content of all fields during registration.
Replace the explicit field initialization with a memset of the whole
rseq area which will cover fields as they are added to future kernels.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
(cherry picked from commit 689a62a421)
Test that when we return from a function that enabled GCS at runtime
we get SIGSEGV. Also test that ucontext contains GCS block with the
GCS pointer.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
These tests validate that GCS tunable works as expected depending
on the GCS markings in the test binaries.
Tests validate both static and dynamically linked binaries.
These new tests are AArch64 specific. Moreover, they are included only
if linker supports the "-z gcs=<value>" option. If built, these tests
will run on systems with and without HWCAP_GCS. In the latter case the
tests will be reported as UNSUPPORTED.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The syscall pkey_alloc can return ENOSPC to indicate either that all
keys are in use or that the system runs in a mode in which memory
protection keys are disabled. In such case the test should not fail and
just return unsupported.
This matches the behaviour of the generic tst-pkey.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
(cherry picked from commit 60f2d6be65)
Linux 6.13 was released with a change that overwrites those bytes.
This means that the check_unused subtest fails.
Update the manual accordingly.
Tested-by: Xi Ruoyao <xry111@xry111.site>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Unlike for BTI, the kernel does not process GCS properties so update
GL(dl_aarch64_gcs) before the GCS status is set.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Use the ARCH_SETUP_TLS hook to enable GCS in the static linked case.
The system call must be inlined and then GCS is enabled on a top
level stack frame that does not return and has no exception handlers
above it.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This tunable controls Guarded Control Stack (GCS) for the process.
0 = disabled: do not enable GCS
1 = enforced: check markings and fail if any binary is not marked
2 = optional: check markings but keep GCS off if a binary is unmarked
3 = override: enable GCS, markings are ignored
By default it is 0, so GCS is disabled, value 1 will enable GCS.
The status is stored into GL(dl_aarch64_gcs) early and only applied
later, since enabling GCS is tricky: it must happen on a top level
stack frame. Using GL instead of GLRO because it may need updates
depending on loaded libraries that happen after readonly protection
is applied, however library marking based GCS setting is not yet
implemented.
Describe new tunable in the manual.
Co-authored-by: Yury Khrustalev <yury.khrustalev@arm.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Changed the makecontext logic: previously the first setcontext jumped
straight to the user callback function and the return address is set
to __startcontext. This does not work when GCS is enabled as the
integrity of the return address is protected, so instead the context
is setup such that setcontext jumps to __startcontext which calls the
user callback (passed in x20).
The map_shadow_stack syscall is used to allocate a suitably sized GCS
(which includes some reserved area to account for altstack signal
handlers and otherwise supports maximum number of 16 byte aligned
stack frames on the given stack) however the GCS is never freed as
the lifetime of ucontext and related stack is user managed.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Userspace ucontext needs to store GCSPR, it does not have to be
compatible with the kernel ucontext. For now we use the linux
struct gcs_context layout but only use the gcspr field from it.
Similar implementation to the longjmp code, supports switching GCS
if the target GCS is capped, and unwinding a continuous GCS to a
previous state.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
The new test elf/tst-rseq-tls-range-4096-static reliably detected
the extra TLS allocation problem (tcb_offset was dropped from
the allocation size) on aarch64. It also failed with a crash
in dlopen *before* the extra TLS changes, so TLS alignment with
static dlopen was already broken.
Reviewed-by: Michael Jeanson <mjeanson@efficios.com>
Careful updates of grnd_alloc.len are required to ensure that
after fork, grnd_alloc.states does not contain entries that
are also encountered by __getrandom_reset_state in TCBs.
For the same reason, it is necessary to overwrite the TCB state
pointer with NULL before updating grnd_alloc.states in
__getrandom_vdso_release.
Before this change, different TCBs could share the same getrandom
state after multi-threaded fork. This would be a critical security
bug (predictable randomness) if not caught during development.
The additional check in stdlib/tst-arc4random-thread makes it more
likely that the test fails due to the bugs mentioned above.
Both __getrandom_reset_state and __getrandom_vdso_release could
put reserved NULL pointers into the states array. This is also
fixed with this commit. After these changes, no null pointers were
observed in the states array during testing.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Some kernels on S390 appear to return a CPU affinity mask based on
configured processors rather than the ones online. Overallocate the CPU
set to match that, but operate only on the ones online.
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Co-authored-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
If the bit is not 0, the operations FRCHG and FSCHG are
undefined and cause a trap; qemu now checks for this as
well, so we set it to 0 temporarily and restore the old
value in getcontext afterwards (setcontext/swapcontext
already do so).
From the discussion in the bugreport, this can probably
be optimised in one place but none of the people involved
are SH4 assembly experts, this patch is field-tested, and
it’s not a code path run often. The other question, what
happens if a signal occurs while the bit is temporarily 0,
is also still unsolved, but to fix that a kernel change is
most likely needed; this patch changes a certain trap on
many CPUs for a hard-to-get trap in a signal handler if a
signal is delivered during the few instructions the PR bit
is temporarily set to 0, so it’s not a regression for most
users.
See BZ and https://bugs.launchpad.net/qemu/+bug/1796520 for
related discussion, references and review comments.
Signed-off-by: mirabilos <tg@debian.org>
Reviewed-by: Oleg Endo <olegendo@gcc.gnu.org>
Tested-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
clang issues:
error: value size does not match register size specified by the
constraint and modifier [-Werror,-Wasm-operand-widths]
while tryng to use 32 bit variables with 'mrs' to get/set the
fpsr, dczid_el0, and ctr.
Move the rseq area to the newly added 'extra TLS' block, this is the
last step in adding support for the rseq extended ABI. The size of the
rseq area is now dynamic and depends on the rseq features reported by
the kernel through the elf auxiliary vector. This will allow
applications to use rseq features past the 32 bytes of the original rseq
ABI as they become available in future kernels.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
In preparation to move the rseq area to the 'extra TLS' block, we need
accessors based on the thread pointer and the rseq offset. The ONCE
variant of the accessors ensures single-copy atomicity for loads and
stores which is required for all fields once the registration is active.
A separate header is required to allow including <atomic.h> which
results in an include loop when added to <tcb-access.h>.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
This allows accessing the internal aliases of __rseq_size and
__rseq_offset from ld.so without ifdefs and avoids dynamic symbol
binding at run time for both variables.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Add the Linux implementation of 'extra TLS' which will allocate space
for the rseq area at the end of the TLS blocks in allocation order.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Get the rseq feature size and alignment requirement from the auxiliary
vector for use inside the dynamic loader. Use '__rseq_size' directly to
store the feature size. If the main thread registration fails or is
disabled by tunable, reset the value to 0.
This will be used in the TLS block allocator to compute the size and
alignment of the rseq area block for the extended ABI support.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Add a couple of tests to verify that CPU affinity set using
sched_setaffinity and pthread_setaffinity_np are inherited by a child
process and child thread.
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Commit 8f8dd904c4 (“elf:
rtld_multiple_ref is always true”) removed some code that happened
to enable compatibility with programs that do not link against
libc.so. Such programs cannot call dlopen or any dynamic linker
functions (except __tls_get_addr), so this is not really useful.
Still ld.so should not crash with a null-pointer dereference
or undefined symbol reference in these cases.
In the main relocation loop, call _dl_relocate_object unconditionally
because it already checks if the object has been relocated.
If libc.so was loaded, self-relocate ld.so against it and call
__rtld_mutex_init and __rtld_malloc_init_real to activate the full
implementations. Those are available only if libc.so is there,
so skip these initialization steps if libc.so is absent. Without
libc.so, the global scope can be completely empty. This can cause
ld.so self-relocation to fail because if it uses symbol-based
relocations, which is why the second ld.so self-relocation is not
performed if libc.so is missing.
The previous concern regarding GOT updates through self-relocation
no longer applies because function pointers are updated
explicitly through __rtld_mutex_init and __rtld_malloc_init_real,
and not through relocation. However, the second ld.so self-relocation
is still delayed, in case there are other symbols being used.
Fixes commit 8f8dd904c4 (“elf:
rtld_multiple_ref is always true”).
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Since have-mtls-descriptor is only used for glibc testing, rename it to
have-test-mtls-descriptor. Also enable tst-gnu2-tls2-amx only if
$(have-test-mtls-descriptor) == gnu2.
Tested with GCC 14 and Clang 19/18/17 on x86-64.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
When Clang is used to test fortify glibc build configured with
--enable-fortify-source=N
clang issues errors like
In file included from tst-rfc3484.c:60:
In file included from ./getaddrinfo.c:81:
../sysdeps/unix/sysv/linux/not-cancel.h:36:10: error: reference to overloaded function could not be resolved; did you mean to call it?
36 | __typeof (open64) __open64_nocancel;
| ^~~~~~~~
../include/bits/../../io/bits/fcntl2.h:127:1: note: possible target for call
127 | open64 (__fortify_clang_overload_arg (const char *, ,__path), int __oflag,
| ^
../include/bits/../../io/bits/fcntl2.h:118:1: note: possible target for call
118 | open64 (__fortify_clang_overload_arg (const char *, ,__path), int __oflag)
| ^
../include/bits/../../io/bits/fcntl2.h:114:1: note: possible target for call
114 | open64 (const char *__path, int __oflag, mode_t __mode, ...)
| ^
../io/fcntl.h:219:12: note: possible target for call
219 | extern int open64 (const char *__file, int __oflag, ...) __nonnull ((1));
| ^
because clang fortify support for functions with variable arguments relies
on function overload. Update not-cancel.h to avoid __typeof on functions
with variable arguments.
Co-Authored-By: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
If some shared library loaded with dlopen/dlmopen requires an executable
stack, either implicitly because of a missing GNU_STACK ELF header
(where the ABI default flags implies in the executable bit) or explicitly
because of the executable bit from GNU_STACK; the loader will try to set
the both the main thread and all thread stacks (from the pthread cache)
as executable.
Besides the issue where any __nptl_change_stack_perm failure does not
undo the previous executable transition (meaning that if the library
fails to load, there can be thread stacks with executable stacks), this
behavior was used on a CVE [1] as a vector for RCE.
This patch changes that if a shared library requires an executable
stack, and the current stack is not executable, dlopen fails. The
change is done only for dynamically loaded modules, if the program
or any dependency requires an executable stack, the loader will still
change the main thread before program execution and any thread created
with default stack configuration.
[1] https://www.qualys.com/2023/07/19/cve-2023-38408/rce-openssh-forwarded-ssh-agent.txt
Checked on x86_64-linux-gnu and i686-linux-gnu.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Move the x86-64 loader first, before the i386 and x32 loaders. In
most cases, it's the loader the script needs. This avoids an error
message if the i386 loader does not work.
The effect of this change to the generated ldd script looks like this:
-RTLDLIST="/lib/ld-linux.so.2 /lib64/ld-linux-x86-64.so.2 /libx32/ld-linux-x32.so.2"
+RTLDLIST="/lib64/ld-linux-x86-64.so.2 /lib/ld-linux.so.2 /libx32/ld-linux-x32.so.2"
Reviewed-by: Sam James <sam@gentoo.org>
Add __attribute_optimization_barrier__ to disable inlining and cloning on a
function. For Clang, expand it to
__attribute__ ((optnone))
Otherwise, expand it to
__attribute__ ((noinline, clone))
Co-Authored-By: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
This simplifies the handling of sanity check errors in clone.S.
Adjusted a couple of comments to reflect current code.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
The hppa Linux kernel supports the cacheflush() syscall
since version 6.5. This adds the glibc syscall wrapper.
Signed-off-by: Helge Deller <deller@gmx.de>
---
v2: This patch was too late in release cycle for GLIBC_2.40,
so update now to GLIBC_2.41 instead.
Clang emits the following warnings:
../sysdeps/unix/sysv/linux/tst-getdents64.c:111:18: error: fields must
have a constant size: 'variable length array in structure' extension
will never be supported
char buffer[buffer_size];
^
Co-Authored-By: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
Linux 6.12 adds a constant MSG_SOCK_DEVMEM (recall that various
constants such as this one are defined in the non-uapi linux/socket.h
but still form part of the kernel/userspace interface, so that
non-uapi header is one that needs checking each release for new such
constants). Add it to glibc's bits/socket.h.
Tested for x86_64.