mirror of https://sourceware.org/git/glibc.git synced 2025-10-12 19:04:54 +03:00
Commit Graph

16796 Commits

Author SHA1 Message Date
Florian Weimer
89e61e96b7 i386: Update ulps for *pi functions
As seen with GCC 11.5 on an AMD Ryzen 9 7950X CPU, with an
-mfpmath=sse, --disable-multi-arch build of glibc.
2025-01-20 11:34:38 +01:00
Yury Khrustalev
d3f2b71ef1 aarch64: Fix tests not compatible with targets supporting GCS
- Add GCS marking to some of the tests when the target supports GCS
- Fix the tst-ro-dynamic-mod.map linker script to avoid removing
  GNU properties
- Add a header with macros for GNU properties

Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
2025-01-20 09:36:19 +00:00
Szabolcs Nagy
a335acb8b8 aarch64: Use __alloc_gcs in makecontext
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2025-01-20 09:36:19 +00:00
Szabolcs Nagy
3d8da0d91b aarch64: Add GCS user-space allocation logic
Allocate the GCS based on the stack size; this can be used for
coroutines (makecontext) and for thread creation (if the kernel allows
user-allocated GCS).

Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
2025-01-20 09:36:19 +00:00
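
For illustration, a minimal user-space sketch of the allocation pattern
the commit above describes, assuming the Linux map_shadow_stack system
call is available; the SHADOW_STACK_SET_TOKEN fallback value and the
sizing heuristic are assumptions, and this is not the actual __alloc_gcs
code:

    /* Illustrative only: allocate a guarded control stack sized from the
       normal stack size.  Assumes a kernel with map_shadow_stack; the
       fallback flag value below is an assumption, check <asm/mman.h>.  */
    #include <errno.h>
    #include <stddef.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    #ifndef SHADOW_STACK_SET_TOKEN
    # define SHADOW_STACK_SET_TOKEN 0x1UL   /* assumed value */
    #endif

    void *
    alloc_gcs_sketch (size_t stack_size)
    {
    #ifdef __NR_map_shadow_stack
      /* Heuristic: half the stack size, rounded up to a page, at least
         one page.  The real allocator also reserves room for altstack
         signal handlers.  */
      size_t page = (size_t) sysconf (_SC_PAGESIZE);
      size_t size = stack_size / 2;
      size = (size + page - 1) & ~(page - 1);
      if (size < page)
        size = page;
      long ret = syscall (__NR_map_shadow_stack, 0, size,
                          SHADOW_STACK_SET_TOKEN);
      return ret == -1 ? NULL : (void *) ret;
    #else
      (void) stack_size;
      errno = ENOSYS;
      return NULL;
    #endif
    }
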
Szabolcs Nagy
d3df351338 aarch64: Process gnu properties in static exe
Unlike for BTI, the kernel does not process GCS properties, so update
GL(dl_aarch64_gcs) before the GCS status is set.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2025-01-20 09:36:19 +00:00
Szabolcs Nagy
29476485f9 aarch64: Ignore GCS property of ld.so
check_gcs is called for each dependency of a DSO, but the GNU property
of ld.so is not processed, so ldso->l_mach.gcs may not be correct.
Just assume ld.so is GCS compatible, independent of its ELF marking.

Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
2025-01-20 09:36:19 +00:00
Szabolcs Nagy
4d56a5bbd6 aarch64: Handle GCS marking
- Handle GCS marking
- Use l_searchlist.r_list for GCS (allows using the
  same function for a static exe)

Co-authored-by: Yury Khrustalev <yury.khrustalev@arm.com>
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2025-01-20 09:35:56 +00:00
Szabolcs Nagy
8d516b6f85 aarch64: Use l_searchlist.r_list for bti
Allows using the same function for static exe.

Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
2025-01-20 09:31:47 +00:00
Szabolcs Nagy
76b79f7241 aarch64: Mark objects with GCS property note
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2025-01-20 09:31:47 +00:00
Szabolcs Nagy
01f52b11de aarch64: Enable GCS in dynamic linked exe
Use the dynamic linker start code to enable GCS in the dynamically
linked case after _dl_start returns and before _dl_start_user, which
marks the point after which user code may run.

As in the statically linked case, this ensures that GCS is enabled on
a top-level stack frame.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2025-01-20 09:31:47 +00:00
Szabolcs Nagy
b81ee54bc9 aarch64: Enable GCS in static linked exe
Use the ARCH_SETUP_TLS hook to enable GCS in the statically linked
case. The system call must be inlined; GCS is then enabled on a
top-level stack frame that does not return and has no exception
handlers above it.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2025-01-20 09:31:47 +00:00
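
For reference, a hedged sketch of what enabling GCS looks like at the
prctl level; the PR_SET_SHADOW_STACK_STATUS and PR_SHADOW_STACK_ENABLE
names are only used when the installed headers define them, and glibc
itself inlines the raw system call in the ARCH_SETUP_TLS hook rather
than calling prctl like this:

    /* Illustrative only: enable GCS for the calling thread via prctl.
       glibc must instead inline the syscall in a frame that never
       returns, as described above.  */
    #include <stdio.h>
    #include <sys/prctl.h>

    int
    enable_gcs_sketch (void)
    {
    #if defined PR_SET_SHADOW_STACK_STATUS && defined PR_SHADOW_STACK_ENABLE
      if (prctl (PR_SET_SHADOW_STACK_STATUS, PR_SHADOW_STACK_ENABLE,
                 0, 0, 0) != 0)
        {
          perror ("PR_SET_SHADOW_STACK_STATUS");
          return -1;
        }
      return 0;
    #else
      fputs ("GCS prctl constants not defined by these headers\n", stderr);
      return -1;
    #endif
    }
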
Szabolcs Nagy
9ad3d9267d aarch64: Add glibc.cpu.aarch64_gcs tunable
This tunable controls Guarded Control Stack (GCS) for the process.

0 = disabled: do not enable GCS
1 = enforced: check markings and fail if any binary is not marked
2 = optional: check markings but keep GCS off if a binary is unmarked
3 = override: enable GCS, markings are ignored

By default it is 0, so GCS is disabled; a value of 1 enables GCS.

The status is stored into GL(dl_aarch64_gcs) early and only applied
later, since enabling GCS is tricky: it must happen on a top-level
stack frame. GL is used instead of GLRO because the value may need
updates for libraries loaded after read-only protection is applied;
however, library-marking-based GCS setting is not yet implemented.

Describe new tunable in the manual.

Co-authored-by: Yury Khrustalev <yury.khrustalev@arm.com>
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2025-01-20 09:31:33 +00:00
Szabolcs Nagy
3ac237fb71 aarch64: Add GCS support for makecontext
This changes the makecontext logic: previously the first setcontext
jumped straight to the user callback function, with the return address
set to __startcontext. That does not work when GCS is enabled, because
the integrity of the return address is protected; instead the context
is set up so that setcontext jumps to __startcontext, which calls the
user callback (passed in x20).

The map_shadow_stack syscall is used to allocate a suitably sized GCS
(which includes some reserved area to account for altstack signal
handlers and otherwise supports the maximum number of 16-byte-aligned
stack frames on the given stack). The GCS is never freed, since the
lifetime of the ucontext and the related stack is user managed.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2025-01-20 09:22:41 +00:00
Szabolcs Nagy
7d22054db7 aarch64: Mark swapcontext with indirect_return
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2025-01-20 09:22:41 +00:00
Szabolcs Nagy
9885d13b66 aarch64: Add GCS support for setcontext
The userspace ucontext needs to store GCSPR; it does not have to be
compatible with the kernel ucontext. For now we use the Linux
struct gcs_context layout but only use the gcspr field from it.

The implementation is similar to the longjmp code: it supports
switching GCS if the target GCS is capped, and unwinding a continuous
GCS to a previous state.

Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2025-01-20 09:22:41 +00:00
Szabolcs Nagy
1cf59c2603 aarch64: Add GCS support to vfork
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2025-01-20 09:22:41 +00:00
Szabolcs Nagy
5ff5e7836e aarch64: Add GCS support to longjmp
This implementation ensures that longjmp across different stacks
works: it scans for a GCS cap token and switches GCS if necessary;
then the target GCSPR is restored with a GCSPOPM loop once the
current GCSPR is on the same GCS.

This makes longjmp linear in the number of jumped-over stack frames
when GCS is enabled.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
2025-01-20 09:22:41 +00:00
Szabolcs Nagy
13cbbb0cb2 aarch64: Define jmp_buf offset for GCS
The target specific internal __longjmp is called with a __jmp_buf
argument which has its size exposed in the ABI. On aarch64 this has
no space left, so GCSPR cannot be restored in longjmp in the usual
way, which is needed for the Guarded Control Stack (GCS) extension.

setjmp is implemented via __sigsetjmp, which has a jmp_buf argument;
however, it is also called with a __pthread_unwind_buf_t argument cast
to jmp_buf (in cancellation cleanup code built with -fno-exceptions).
The two types, jmp_buf and __pthread_unwind_buf_t, have common bits
beyond the __jmp_buf field, and there is unused space there which we
can use for saving GCSPR.

For this to work some bits of those two generic types have to be
reserved for target specific use and the generic code in glibc has
to ensure that __longjmp is always called with a __jmp_buf that is
embedded into one of those two types. Morally __longjmp should be
changed to take jmp_buf as argument, but that is an intrusive change
across targets.

Note: longjmp is never called with __pthread_unwind_buf_t from user
code, only the internal __libc_longjmp is called with that type and
thus the two types could have separate longjmp implementations on a
target. We don't rely on this now (but might in the future given that
cancellation unwind does not need to restore GCSPR).

Given the above this patch finds an unused slot for GCSPR. This
placement is not exposed in the ABI so it may change in the future.
This is also very target ABI specific so the generic types cannot
be easily changed to clearly mark the reserved fields.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
2025-01-20 09:22:41 +00:00
Szabolcs Nagy
58771b8a59 aarch64: Add asm helpers for GCS
The Guarded Control Stack instructions can be present even if the
hardware does not support the extension (runtime checked feature),
so the asm code should be backward compatible with old assemblers.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
2025-01-20 09:22:41 +00:00
Samuel Thibault
8ef1791950 hurd: Fix EINVAL error on linking to a slash-trailing path [BZ #32569]
When the target path ends with a slash, __file_name_split_at returns
an empty file name. We can test for this and refuse to do the link.
2025-01-19 15:11:44 +01:00
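
A small hedged test sketch of the behaviour described above: create a
file, then try to link it to a name with a trailing slash; the call is
expected to fail (the specific errno is not asserted here, and the file
names are made up for illustration):

    #include <errno.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    int
    main (void)
    {
      int fd = open ("linktest-src", O_CREAT | O_WRONLY, 0600);
      if (fd >= 0)
        close (fd);

      /* The trailing slash makes the final component an empty file name,
         so the link has to be refused.  */
      if (link ("linktest-src", "linktest-dst/") != 0)
        printf ("link refused: %s\n", strerror (errno));
      else
        puts ("link unexpectedly succeeded");

      unlink ("linktest-src");
      return 0;
    }
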
Malte Skarupke
c36fc50781 nptl: Remove g_refs from condition variables
This variable used to be needed during group switching, to wait until
all sleepers had confirmed that they had woken. That is no longer
necessary: nothing waits on this variable, so there is no need to track
how many threads are currently asleep in each group.

Signed-off-by: Malte Skarupke <malteskarupke@fastmail.fm>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2025-01-17 14:56:58 -05:00
Adhemerval Zanella
109c40ed7a math: update arm ulps
GCC 14.2.1 with -mfpu=neon-vfpv4 -mfloat-abi=hard -mtls-dialect=gnu
-marm -march=armv7-a+neon-vfpv4 on Neoverse-N1.
2025-01-17 19:36:22 +00:00
Andreas K. Hüttel
ae33fb452f math: update arm ulps
CC="gcc -O2 -pipe -march=armv7-a -mfpu=vfpv3-d16 -mfloat-abi=hard"
linux32 chroot on aarch64

Signed-off-by: Andreas K. Hüttel <dilfridge@gentoo.org>
2025-01-17 14:50:38 +01:00
Florian Weimer
37b9a5aacc Linux: Add tests that check that TLS and rseq area are separate
The new test elf/tst-rseq-tls-range-4096-static reliably detected
the extra TLS allocation problem (tcb_offset was dropped from
the allocation size) on aarch64.  It also failed with a crash
in dlopen *before* the extra TLS changes, so TLS alignment with
static dlopen was already broken.

Reviewed-by: Michael Jeanson <mjeanson@efficios.com>
2025-01-16 20:02:42 +01:00
Florian Weimer
abeae3c006 Linux: Fixes for getrandom fork handling
Careful updates of grnd_alloc.len are required to ensure that
after fork, grnd_alloc.states does not contain entries that
are also encountered by __getrandom_reset_state in TCBs.
For the same reason, it is necessary to overwrite the TCB state
pointer with NULL before updating grnd_alloc.states in
__getrandom_vdso_release.

Before this change, different TCBs could share the same getrandom
state after multi-threaded fork.  This would be a critical security
bug (predictable randomness) if not caught during development.

The additional check in stdlib/tst-arc4random-thread makes it more
likely that the test fails due to the bugs mentioned above.

Both __getrandom_reset_state and __getrandom_vdso_release could
put reserved NULL pointers into the states array.  This is also
fixed with this commit.  After these changes, no null pointers were
observed in the states array during testing.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2025-01-16 19:58:09 +01:00
Pavel Kozlov
252fc3628b arc: Update libm test ulps
Update fpu and nofpu ULPs. Regenerated on HSDK-4xD board
running Linux 6.12.7 / GCC 14.2.0.
2025-01-15 11:41:30 +00:00
Stefan Liebler
09ea1afec7 affinity-inheritance: Overallocate CPU sets
Some kernels on S390 appear to return a CPU affinity mask based on
configured processors rather than the ones online.  Overallocate the CPU
set to match that, but operate only on the ones online.

Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
Co-authored-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2025-01-14 09:23:36 -05:00
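
A minimal sketch of the over-allocation pattern described above, using
the public CPU_ALLOC interfaces: size the CPU set for the configured
processor count rather than the online count, then operate only on the
bits that are actually set:

    #define _GNU_SOURCE
    #include <sched.h>
    #include <stdio.h>
    #include <unistd.h>

    int
    main (void)
    {
      long configured = sysconf (_SC_NPROCESSORS_CONF);
      if (configured < 1)
        configured = 1;

      cpu_set_t *set = CPU_ALLOC (configured);
      size_t setsize = CPU_ALLOC_SIZE (configured);
      if (set == NULL)
        return 1;

      CPU_ZERO_S (setsize, set);
      if (sched_getaffinity (0, setsize, set) != 0)
        {
          perror ("sched_getaffinity");
          CPU_FREE (set);
          return 1;
        }

      /* Only the CPUs present in the returned mask are used.  */
      for (long cpu = 0; cpu < configured; cpu++)
        if (CPU_ISSET_S (cpu, setsize, set))
          printf ("usable cpu: %ld\n", cpu);

      CPU_FREE (set);
      return 0;
    }
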
Samuel Thibault
2ac7701888 mach: Fix fallthrough warning
gcc would not recognize the /* FALLTHROUGH */ comment inside the #ifdef.
2025-01-14 00:11:35 +01:00
mirabilos
f42634f824 sh4: ensure FPSCR.PR==0 when executing FRCHG [BZ #27543]
If the bit is not 0, the operations FRCHG and FSCHG are
undefined and cause a trap; qemu now checks for this as
well, so we set it to 0 temporarily and restore the old
value in getcontext afterwards (setcontext/swapcontext
already do so).

From the discussion in the bug report, this can probably be optimised
in one place, but none of the people involved are SH4 assembly experts,
this patch is field-tested, and it is not a code path that runs often.
The other question, what happens if a signal occurs while the bit is
temporarily 0, is also still unsolved; fixing that most likely needs a
kernel change. This patch trades a certain trap on many CPUs for a
hard-to-get trap in a signal handler, if a signal is delivered during
the few instructions the PR bit is temporarily set to 0, so it is not
a regression for most users.

See BZ and https://bugs.launchpad.net/qemu/+bug/1796520 for
related discussion, references and review comments.

Signed-off-by: mirabilos <tg@debian.org>
Reviewed-by: Oleg Endo <olegendo@gcc.gnu.org>
Tested-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2025-01-13 11:25:23 -03:00
Adhemerval Zanella
6c575d835e aarch64: Use 64-bit variable to access the special registers
clang issues:

  error: value size does not match register size specified by the
  constraint and modifier [-Werror,-Wasm-operand-widths]

while trying to use 32-bit variables with 'mrs' to get/set the
fpsr, dczid_el0, and ctr registers.
2025-01-13 10:17:38 -03:00
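
A minimal aarch64-only sketch of the fix: 'mrs' always transfers a full
64-bit X register, so the C operand has to be 64 bits wide to keep clang
happy (the fpsr read here is just for illustration):

    #include <stdint.h>
    #include <stdio.h>

    int
    main (void)
    {
    #ifdef __aarch64__
      uint64_t fpsr;
      /* A 64-bit operand matches the register width.  A uint32_t operand
         here would trigger -Wasm-operand-widths under clang.  */
      asm volatile ("mrs %0, fpsr" : "=r" (fpsr));
      printf ("fpsr = %#llx\n", (unsigned long long) fpsr);
    #endif
      return 0;
    }
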
Samuel Thibault
e9f16cb6d1 hurd: Set _POSIX_MONOTONIC_CLOCK to 200809L
Now that CLOCK_MONOTONIC is supported.
2025-01-12 22:47:00 +01:00
Samuel Thibault
b31d490222 hurd: Add CLOCK_MONOTONIC to clock_nanosleep 2025-01-12 22:47:00 +01:00
Zhaoming Luo
3782ffaf3e mach: Add CLOCK_MONOTONIC case in clock_gettime()
The Mach RPC host_get_uptime64() is now implemented. It returns the
elapsed time since bootup. See

https://git.savannah.gnu.org/cgit/hurd/gnumach.git/commit/?id=fc494bfe3fb6363e1077dc035eb119970d84a9d1

In this patch, the RPC is used to implement the monotonic clock for
Mach.

* config.h.in: Add HAVE_HOST_GET_UPTIME64 config entry
* sysdeps/mach/clock_gettime.c: Add CLOCK_MONOTONIC case
* sysdeps/mach/configure: Check the existence of host_get_uptime64 RPC
* sysdeps/mach/configure.ac: Check the existence of host_get_uptime64 RPC

Message-ID: <20250106043907.1046-1-zhmingluo@163.com>
2025-01-12 22:47:00 +01:00
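
From the application side the new case is reached through the standard
interface; a minimal usage example:

    #include <stdio.h>
    #include <time.h>

    int
    main (void)
    {
      struct timespec ts;
      if (clock_gettime (CLOCK_MONOTONIC, &ts) != 0)
        {
          perror ("clock_gettime");
          return 1;
        }
      printf ("monotonic time: %lld.%09ld s\n",
              (long long) ts.tv_sec, ts.tv_nsec);
      return 0;
    }
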
Samuel Thibault
73b854e955 hurd: Mark more memory-hungry tests as unsupported
until RLIMIT_AS support gets committed in gnumach.
2025-01-12 16:06:00 +01:00
Samuel Thibault
dbe3e6e022 hurd: Mark more memory-hungry tests as unsupported
until RLIMIT_AS support gets committed in gnumach.
2025-01-12 01:03:13 +01:00
Samuel Thibault
1a09aa03ee hurd: Mark tst-tls-allocation-failure-static-patched as supported
The failure was not due to RLIMIT_AS but to an unsupported intentional
early abort.
2025-01-12 00:55:56 +01:00
Samuel Thibault
0c48562508 hurd: Cope with signals sent to ourself early
Typically when aborting during initialization, before signals are set
up.
2025-01-12 00:55:56 +01:00
H.J. Lu
0b6ad02b33 x86-64: Cast __rseq_offset to long long int [BZ #32543]
commit 494d65129e
Author: Michael Jeanson <mjeanson@efficios.com>
Date:   Thu Aug 1 10:35:34 2024 -0400

    nptl: Introduce <rseq-access.h> for RSEQ_* accessors

added things like

       asm volatile ("movl %%fs:%P1(%q2),%0"                                  \
                     : "=r" (__value)                                         \
                     : "i" (offsetof (struct rseq_area, member)),             \
                       "r" (__rseq_offset));				      \

But this doesn't work for x32 when __rseq_offset is negative since the
address is computed as

FS + 32-bit to 64-bit zero extension of __rseq_offset
+ offsetof (struct rseq_area, member)

Cast __rseq_offset to long long int

                       "r" ((long long int) __rseq_offset));		      \

to sign-extend 32-bit __rseq_offset to 64-bit.  This is a no-op for x86-64
since x86-64 __rseq_offset is 64-bit.  This fixes BZ #32543.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
2025-01-12 07:08:27 +08:00
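
The extension issue is easy to reproduce in isolation; a small hedged
sketch (plain C, no x32 toolchain needed, values made up) of why a
negative 32-bit offset must be sign- rather than zero-extended before
it is added to a 64-bit base:

    #include <stdint.h>
    #include <stdio.h>

    int
    main (void)
    {
      int32_t rseq_offset = -128;            /* e.g. area below the TCB */
      uint64_t fs_base = 0x7f0000001000ULL;  /* made-up FS base */

      uint64_t zero_ext = fs_base + (uint32_t) rseq_offset;          /* wrong */
      uint64_t sign_ext = fs_base + (uint64_t)(int64_t) rseq_offset; /* right */

      printf ("zero-extended: %#llx\n", (unsigned long long) zero_ext);
      printf ("sign-extended: %#llx\n", (unsigned long long) sign_ext);
      return 0;
    }
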
Samuel Thibault
53a71b9f66 hurd: Mark more memory-hungry tests as unsupported
until RLIMIT_AS support gets committed in gnumach.
2025-01-11 04:17:38 +01:00
Michael Jeanson
072795229c Linux: Update internal copy of '<sys/rseq.h>'
Sync the internal copy of '<sys/rseq.h>' with the latest Linux kernel
'include/uapi/linux/rseq.h'.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
2025-01-10 20:20:48 +00:00
Michael Jeanson
93d0bfbe8f nptl: Move the rseq area to the 'extra TLS' block
Move the rseq area to the newly added 'extra TLS' block; this is the
last step in adding support for the rseq extended ABI. The size of the
rseq area is now dynamic and depends on the rseq features reported by
the kernel through the ELF auxiliary vector. This will allow
applications to use rseq features beyond the 32 bytes of the original
rseq ABI as they become available in future kernels.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
2025-01-10 20:20:27 +00:00
Michael Jeanson
494d65129e nptl: Introduce <rseq-access.h> for RSEQ_* accessors
In preparation for moving the rseq area to the 'extra TLS' block, we
need accessors based on the thread pointer and the rseq offset. The
ONCE variant of the accessors ensures single-copy atomicity for loads
and stores, which is required for all fields once the registration is
active.

A separate header is required to allow including <atomic.h>, which
would otherwise cause an include loop if added to <tcb-access.h>.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
2025-01-10 20:20:17 +00:00
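
For orientation, an application-side view of the same addressing scheme,
using the public __rseq_offset and __rseq_size symbols from <sys/rseq.h>;
this is only a sketch (it assumes a compiler providing
__builtin_thread_pointer) and not the internal RSEQ_* accessors, which
additionally guarantee single-copy atomicity:

    #include <stdio.h>
    #include <sys/rseq.h>

    int
    main (void)
    {
      if (__rseq_size == 0)
        {
          puts ("rseq not registered by glibc");
          return 0;
        }
      /* The rseq area lives at a fixed offset from the thread pointer.  */
      struct rseq *rs = (struct rseq *)
        ((char *) __builtin_thread_pointer () + __rseq_offset);
      printf ("running on cpu %u\n", rs->cpu_id);
      return 0;
    }
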
Michael Jeanson
be440f6c38 nptl: add rtld_hidden_proto to __rseq_size and __rseq_offset
This allows accessing the internal aliases of __rseq_size and
__rseq_offset from ld.so without ifdefs and avoids dynamic symbol
binding at run time for both variables.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
2025-01-10 20:19:53 +00:00
Michael Jeanson
304221775c Add Linux 'extra TLS'
Add the Linux implementation of 'extra TLS' which will allocate space
for the rseq area at the end of the TLS blocks in allocation order.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
2025-01-10 20:19:40 +00:00
Michael Jeanson
0e411c5d30 Add generic 'extra TLS'
Add the logic to append an 'extra TLS' block in the TLS block allocator
with a generic stub implementation. The duplicated code in
'csu/libc-tls.c' and 'elf/dl-tls.c' is to handle both statically linked
applications and the ELF dynamic loader.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
2025-01-10 20:19:28 +00:00
Michael Jeanson
c813c1490d nptl: Add rseq auxvals
Get the rseq feature size and alignment requirement from the auxiliary
vector for use inside the dynamic loader. Use '__rseq_size' directly to
store the feature size. If the main thread registration fails or is
disabled by tunable, reset the value to 0.

This will be used in the TLS block allocator to compute the size and
alignment of the rseq area block for the extended ABI support.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
2025-01-10 20:19:07 +00:00
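
The same auxiliary vector entries can be read from an application with
getauxval; a minimal sketch, guarded since older headers and kernels do
not provide the AT_RSEQ_* constants:

    #include <elf.h>
    #include <stdio.h>
    #include <sys/auxv.h>

    int
    main (void)
    {
    #if defined AT_RSEQ_FEATURE_SIZE && defined AT_RSEQ_ALIGN
      unsigned long feat  = getauxval (AT_RSEQ_FEATURE_SIZE);
      unsigned long align = getauxval (AT_RSEQ_ALIGN);
      printf ("rseq feature size: %lu, alignment: %lu\n", feat, align);
    #else
      puts ("AT_RSEQ_* constants not defined by these headers");
    #endif
      return 0;
    }
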
Florian Weimer
4a9a8a5098 Add missing include guards to <dl-tls.h>
Some architecture-specific variants lack header inclusion guards.
Add them for consistency with the generic version.
2025-01-10 19:02:47 +01:00
Florian Weimer
d1da011118 elf: Always define TLS_TP_OFFSET
This will be needed to compute __rseq_offset outside of the TLS
relocation machinery.

Reviewed-by: Michael Jeanson <mjeanson@efficios.com>
2025-01-09 19:30:44 +01:00
Florian Weimer
9b71570c46 x86: Add missing #include <features.h> to <thread_pointer.h>
It is required for __GNUC_PREREQ.

Reviewed-by: Michael Jeanson <mjeanson@efficios.com>
2025-01-09 19:30:41 +01:00
Florian Weimer
7a3e2e877a Move <thread_pointer.h> to kernel-independent sysdeps directories
Hurd is expected to use the same thread ABI as Linux.

Reviewed-by: Michael Jeanson <mjeanson@efficios.com>
2025-01-09 19:30:16 +01:00