Code used during early static startup in elf/dl-tls.c uses
__mempcpy.
Fixes commit cbd9fd2369 ("Consolidate
TLS block allocation for static binaries with ld.so").
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
It's not necessary to introduce temporaries because the compiler
is able to evaluate l_soname just once in constructs like:
l_soname (l) != NULL && strcmp (l_soname (l), LIBC_SO) != 0
This reduces code size and dependencies on ld.so internals from
libc.so.
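For illustration, the soname lookup can be sketched against the public
link_map fields (a sketch only; the internal accessor reads the cached
l_info entries instead):

  #include <link.h>

  /* Return the DT_SONAME string of a loaded object, or NULL.  With
     glibc's ld.so the d_ptr values in the dynamic section have
     already been relocated; on other systems l->l_addr may need to
     be added.  */
  static const char *
  soname_of (const struct link_map *l)
  {
    const char *strtab = NULL;
    ElfW(Xword) soname_off = 0;
    int have_soname = 0;
    for (const ElfW(Dyn) *d = l->l_ld; d->d_tag != DT_NULL; ++d)
      if (d->d_tag == DT_STRTAB)
        strtab = (const char *) d->d_un.d_ptr;
      else if (d->d_tag == DT_SONAME)
        {
          soname_off = d->d_un.d_val;
          have_soname = 1;
        }
    return (strtab != NULL && have_soname) ? strtab + soname_off : NULL;
  }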
Fixes commit f4c142bb9f
("arm: Use _dl_find_object on __gnu_Unwind_Find_exidx (BZ 31405)").
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Decorate BSS mappings with [anon: glibc: .bss <file>], for example
[anon: glibc: .bss /lib/libc.so.6]. The string ".bss" is already used
by bionic, so use the same string, but append the file name as well.
If the resulting name would be longer than the kernel allows, drop the
directory part of the path.
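The decoration uses the kernel's anonymous-VMA naming interface; a
minimal sketch of the call (not the glibc-internal helper):

  #include <stddef.h>
  #include <sys/prctl.h>

  /* Name an anonymous mapping so it shows up in /proc/<pid>/maps,
     e.g. as [anon: glibc: .bss /lib/libc.so.6].  Requires a kernel
     built with CONFIG_ANON_VMA_NAME; failures are ignored.  */
  static void
  set_vma_name (void *addr, size_t len, const char *name)
  {
    (void) prctl (PR_SET_VMA, PR_SET_VMA_ANON_NAME,
                  (unsigned long) addr, len, (unsigned long) name);
  }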
Refactor the glibc.mem.decorate_maps check into a separate function and
use it to avoid assembling a name that would not be used later.
Signed-off-by: Petr Malat <oss@malat.biz>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Linux 6.13 (commit 662df3e5c3766) added a lightweight way to define
guard areas through the madvise syscall. Instead of protecting the
guard region with PROT_NONE through mprotect, userland can madvise the
same area with a special flag, and the kernel ensures that accessing
the area triggers a SIGSEGV (as for a PROT_NONE mapping).
The madvise approach has the advantage of lower kernel memory
consumption for the process page table (one less VMA per guard area),
and slightly less contention in the kernel (also due to fewer VMAs
being tracked).
pthread_create allocates a new thread stack in one of two ways: if a
guard area is set (the default), it allocates the required memory range
with PROT_NONE and then mprotects the usable stack area. Otherwise, if
a guard page is not set, it allocates the region with the required
flags.
For MADV_GUARD_INSTALL support, the stack region is allocated with the
required flags and the guard region is then installed with madvise. If
the kernel does not support it, the usual way is used instead (and
MADV_GUARD_INSTALL is disabled for future stack creations).
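A minimal sketch of that fallback logic (the helper name and the
bookkeeping flag are illustrative, not the actual patch):

  #include <errno.h>
  #include <stdbool.h>
  #include <stddef.h>
  #include <sys/mman.h>

  #ifndef MADV_GUARD_INSTALL
  # define MADV_GUARD_INSTALL 102   /* From the Linux 6.13 uapi.  */
  #endif

  static bool madv_guard_supported = true;

  /* Install a guard area over [guard, guard+size), preferring the
     madvise-based guard and falling back to a PROT_NONE mapping.  */
  static bool
  install_guard (void *guard, size_t size)
  {
    if (madv_guard_supported)
      {
        if (madvise (guard, size, MADV_GUARD_INSTALL) == 0)
          return true;
        if (errno == EINVAL)
          /* Old kernel: disable for future stack creations.  */
          madv_guard_supported = false;
      }
    return mprotect (guard, size, PROT_NONE) == 0;
  }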
The stack allocation strategy is recorded in the pthread struct, and it
is used in case the guard region needs to be resized. To avoid needing
an extra field, 'user_stack' is repurposed and renamed to 'stack_mode'.
This patch also adds a proper test for the pthread guard.
I checked on x86_64, aarch64, powerpc64le, and hppa with kernel 6.13.0-rc7.
Reviewed-by: DJ Delorie <dj@redhat.com>
I haven't exposed _pthread_mutex_lock, _pthread_mutex_trylock, and
_pthread_mutex_unlock in GLIBC_PRIVATE since they aren't used in any
code in libpthread.
Message-ID: <20250103103750.870897-3-gfleury@disroot.org>
Add some basic tests for fopen, covering different modes, stream
positioning, and concurrent read/write operations on files.
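For instance, a check along these lines exercises update mode (a
sketch, not one of the added tests verbatim):

  #include <assert.h>
  #include <stdio.h>
  #include <string.h>

  static void
  check_fopen_update_mode (void)
  {
    FILE *fp = fopen ("tst-fopen.tmp", "w+");
    assert (fp != NULL);
    assert (fputs ("hello", fp) >= 0);
    /* A positioning call is required between a write and a read on
       the same update stream.  */
    rewind (fp);
    char buf[8];
    assert (fgets (buf, sizeof buf, fp) != NULL);
    assert (strcmp (buf, "hello") == 0);
    assert (fclose (fp) == 0);
    remove ("tst-fopen.tmp");
  }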
Reviewed-by: DJ Delorie <dj@redhat.com>
Include the space needed to store the length of the message itself, in
addition to the message string. This resolves BZ #32582.
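A sketch of the sizing fix, assuming a flexible-array message struct
like glibc's struct abort_msg_s:

  #include <stddef.h>
  #include <string.h>

  struct abort_msg_s
  {
    unsigned int size;   /* Length of the whole allocation.  */
    char msg[];          /* NUL-terminated message string.  */
  };

  static size_t
  abort_msg_alloc_size (const char *str)
  {
    /* The buggy code counted only the string and omitted the header,
       underallocating by sizeof (struct abort_msg_s).  */
    return sizeof (struct abort_msg_s) + strlen (str) + 1;
  }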
Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Linux 6.13 was released with a change that overwrites those bytes.
This means that the check_unused subtest fails.
Update the manual accordingly.
Tested-by: Xi Ruoyao <xry111@xry111.site>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
- Add GCS marking to some of the tests when the target supports GCS
- Fix tst-ro-dynamic-mod.map linker script to avoid removing
GNU properties
- Add header with macros for GNU properties
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Allocate the GCS based on the stack size; this can be used for
coroutines (makecontext) and thread creation (if the kernel allows
user-allocated GCS).
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Unlike for BTI, the kernel does not process GCS properties, so update
GL(dl_aarch64_gcs) before the GCS status is set.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
check_gcs is called for each dependency of a DSO, but the GNU property
of ld.so is not processed, so ldso->l_mach.gcs may not be correct.
Just assume ld.so is GCS compatible regardless of its ELF marking.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
- Handle GCS marking
- Use l_searchlist.r_list for gcs (allows using the same
  function for static executables)
Co-authored-by: Yury Khrustalev <yury.khrustalev@arm.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Use the dynamic linker start code to enable GCS in the dynamically
linked case after _dl_start returns and before _dl_start_user, which
marks the point after which user code may run.
As in the statically linked case, this ensures that GCS is enabled on
a top-level stack frame.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Use the ARCH_SETUP_TLS hook to enable GCS in the statically linked
case. The system call must be inlined, and then GCS is enabled on a
top-level stack frame that does not return and has no exception
handlers above it.
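A sketch of the enabling call, using the Linux 6.13 prctl interface
(the real code inlines the syscall; the syscall () wrapper here is
only for readability):

  #include <sys/syscall.h>
  #include <unistd.h>

  #ifndef PR_SET_SHADOW_STACK_STATUS
  # define PR_SET_SHADOW_STACK_STATUS 75
  # define PR_SHADOW_STACK_ENABLE (1UL << 0)
  #endif

  static inline void
  enable_gcs (void)
  {
    /* After this point every live frame must have a valid GCS
       entry, hence the top-level-frame requirement.  */
    syscall (SYS_prctl, PR_SET_SHADOW_STACK_STATUS,
             PR_SHADOW_STACK_ENABLE, 0UL, 0UL, 0UL);
  }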
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This tunable controls Guarded Control Stack (GCS) for the process.
0 = disabled: do not enable GCS
1 = enforced: check markings and fail if any binary is not marked
2 = optional: check markings but keep GCS off if a binary is unmarked
3 = override: enable GCS, markings are ignored
By default it is 0, so GCS is disabled; a value of 1 will enable GCS.
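For example, GLIBC_TUNABLES=glibc.cpu.aarch64_gcs=1 runs a program
with GCS enforced (tunable name as used in the manual update described
below).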
The status is stored into GL(dl_aarch64_gcs) early and only applied
later, since enabling GCS is tricky: it must happen on a top-level
stack frame. GL is used instead of GLRO because the value may need
updates depending on libraries that are loaded after read-only
protection is applied; however, library-marking-based GCS setting is
not yet implemented.
Describe new tunable in the manual.
Co-authored-by: Yury Khrustalev <yury.khrustalev@arm.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Change the makecontext logic: previously, the first setcontext jumped
straight to the user callback function with the return address set to
__startcontext. This does not work when GCS is enabled, as the
integrity of the return address is protected; instead, the context is
set up so that setcontext jumps to __startcontext, which calls the
user callback (passed in x20).
The map_shadow_stack syscall is used to allocate a suitably sized GCS
(which includes some reserved area to account for altstack signal
handlers and otherwise supports the maximum number of 16-byte-aligned
stack frames on the given stack); however, the GCS is never freed, as
the lifetime of the ucontext and the related stack is user-managed.
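A sketch of the allocation (syscall number and flags from the Linux
uapi; the sizing heuristic here is only illustrative):

  #include <stddef.h>
  #include <sys/syscall.h>
  #include <unistd.h>

  #ifndef __NR_map_shadow_stack
  # define __NR_map_shadow_stack 453
  #endif
  #ifndef SHADOW_STACK_SET_TOKEN
  # define SHADOW_STACK_SET_TOKEN (1UL << 0)  /* Add a cap token.  */
  # define SHADOW_STACK_SET_MARKER (1UL << 1) /* Add a top marker.  */
  #endif

  static void *
  alloc_gcs_for_stack (size_t stack_size)
  {
    /* One 8-byte GCS entry per 16-byte-aligned frame, plus slack
       for altstack signal handlers; round up to a page.  */
    size_t size = (stack_size / 2 + 4095) & ~(size_t) 4095;
    return (void *) syscall (__NR_map_shadow_stack, 0UL, size,
                             SHADOW_STACK_SET_TOKEN
                             | SHADOW_STACK_SET_MARKER);
  }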
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The userspace ucontext needs to store GCSPR; it does not have to be
compatible with the kernel ucontext. For now we use the Linux
struct gcs_context layout, but only use the gcspr field from it.
The implementation is similar to the longjmp code: it supports
switching GCS if the target GCS is capped, and unwinding a continuous
GCS to a previous state.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
This implementation ensures that longjmp across different stacks
works: it scans for the GCS cap token and switches GCS if necessary,
then the target GCSPR is restored with a GCSPOPM loop once the current
GCSPR is on the same GCS.
This makes longjmp linear time in the number of jumped-over stack
frames when GCS is enabled.
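A hypothetical C rendering of the loop (the real implementation is
assembly; switching to the target GCS via its cap token with
GCSSS1/GCSSS2 is omitted here):

  #include <stdint.h>

  /* Read GCSPR_EL0 (architectural encoding S3_3_C2_C5_1).  */
  static inline uint64_t *
  gcspr (void)
  {
    uint64_t *p;
    __asm__ volatile ("mrs %0, S3_3_C2_C5_1" : "=r" (p));
    return p;
  }

  /* GCSPOPM (alias of SYSL Xt, #3, C7, C7, #1): pop one GCS entry.  */
  static inline uint64_t
  gcs_popm (void)
  {
    uint64_t v;
    __asm__ volatile ("sysl %0, #3, C7, C7, #1" : "=r" (v));
    return v;
  }

  /* The GCS grows downwards, so the entry of an older frame sits at
     a higher address; pop until GCSPR reaches the saved value.  */
  static inline void
  unwind_gcs_to (uint64_t *target)
  {
    while (gcspr () < target)
      gcs_popm ();
  }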
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
The target-specific internal __longjmp is called with a __jmp_buf
argument whose size is exposed in the ABI. On aarch64 this has no
space left, so GCSPR cannot be restored in longjmp in the usual way,
which is needed for the Guarded Control Stack (GCS) extension.
setjmp is implemented via __sigsetjmp, which has a jmp_buf argument;
however, it is also called with a __pthread_unwind_buf_t argument cast
to jmp_buf (in cancellation cleanup code built with -fno-exceptions).
The two types, jmp_buf and __pthread_unwind_buf_t, have common bits
beyond the __jmp_buf field and there is unused space there which we
can use for saving GCSPR.
For this to work some bits of those two generic types have to be
reserved for target specific use and the generic code in glibc has
to ensure that __longjmp is always called with a __jmp_buf that is
embedded into one of those two types. Morally __longjmp should be
changed to take jmp_buf as argument, but that is an intrusive change
across targets.
Note: longjmp is never called with __pthread_unwind_buf_t from user
code, only the internal __libc_longjmp is called with that type and
thus the two types could have separate longjmp implementations on a
target. We don't rely on this now (but might in the future given that
cancellation unwind does not need to restore GCSPR).
Given the above, this patch finds an unused slot for GCSPR. This
placement is not exposed in the ABI so it may change in the future.
This is also very target ABI specific so the generic types cannot
be easily changed to clearly mark the reserved fields.
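For reference, the two generic types in question (as declared in
glibc's <setjmp.h> and <pthread.h>):

  /* <setjmp.h>: jmp_buf is an array of one of these.  */
  struct __jmp_buf_tag
  {
    __jmp_buf __jmpbuf;        /* Register state, size fixed by ABI.  */
    int __mask_was_saved;      /* Whether __saved_mask is valid.  */
    __sigset_t __saved_mask;   /* Saved signal mask.  */
  };

  /* <pthread.h>: cancellation buffer, cast to jmp_buf when calling
     __sigsetjmp; the space past __cancel_jmp_buf[0] lines up with
     the tail of struct __jmp_buf_tag.  */
  typedef struct
  {
    struct
    {
      __jmp_buf __cancel_jmp_buf;
      int __mask_was_saved;
    } __cancel_jmp_buf[1];
    void *__pad[4];            /* Reserved; unused space lives here.  */
  } __attribute__ ((__aligned__)) __pthread_unwind_buf_t;

The patch stores GCSPR in a slot that both layouts leave unused.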
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
The Guarded Control Stack instructions can be present even if the
hardware does not support the extension (it is a runtime-checked
feature), so the asm code should be backward compatible with old
assemblers.
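One way to do that, sketched here, is to spell the new system
registers and instructions by their numeric encodings, which any
assembler accepts:

  #include <stdint.h>

  /* Read GCSPR_EL0 via its architectural encoding S3_3_C2_C5_1;
     old assemblers accept the numeric form even without GCS
     support.  */
  static inline uint64_t
  get_gcspr (void)
  {
    uint64_t r;
    __asm__ volatile ("mrs %0, S3_3_C2_C5_1" : "=r" (r));
    return r;
  }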
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
This variable used to be needed so that group switching could wait
until all sleepers had confirmed that they had woken. This is no
longer the case: nothing waits on this variable, so there is no need
to track how many threads are currently asleep in each group.
Signed-off-by: Malte Skarupke <malteskarupke@fastmail.fm>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
The new test elf/tst-rseq-tls-range-4096-static reliably detected
the extra TLS allocation problem (tcb_offset was dropped from
the allocation size) on aarch64. It also failed with a crash
in dlopen *before* the extra TLS changes, so TLS alignment with
static dlopen was already broken.
Reviewed-by: Michael Jeanson <mjeanson@efficios.com>