Now we finally support modern GCC and binutils, it's time for a cleanup.
Use PAC and BTI instructions unconditionally and use proper assembler syntax.
Remove the PR target/94791 strip_pac workarounds for buggy GCCs. Remove the
PAC/BTI configure checks - always emit GNU property notes on assembly files.
Change cfi_window_save to the correct cfi_negate_ra_state unwind directive.
Reviewed-by: Matthieu Longo <matthieu.longo@arm.com>
Implement double and single precision variants of the C23 routine atan2pi
for both AdvSIMD and SVE.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Implement double and single precision variants of the C23 routine atanpi
for both AdvSIMD and SVE.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Implement double and single precision variants of the C23 routine asinpi
for both AdvSIMD and SVE.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Implement double and single precision variants of the C23 routine acospi
for both AdvSIMD and SVE.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
C23 adds various <math.h> function families originally defined in TS
18661-4. Add the rootn functions, which compute the Yth root of X for
integer Y (with a domain error if Y is 0, even if X is a NaN). The
integer exponent has type long long int in C23; it was intmax_t in TS
18661-4, and as with other interfaces changed after their initial
appearance in the TS, I don't think we need to support the original
version of the interface.
As with pown and compoundn, I strongly encourage searching for worst
cases for ulps error for these implementations (necessarily
non-exhaustively, given the size of the input space). I also expect a
custom implementation for a given format could be much faster as well
as more accurate, although the implementation is simpler than those
for pown and compoundn.
This completes adding to glibc those TS 18661-4 functions (ignoring
DFP) that are included in C23. See
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118592 regarding the C23
mathematical functions (not just the TS 18661-4 ones) missing built-in
functions in GCC, where such functions might usefully be added.
Tested for x86_64 and x86, and with build-many-glibcs.py.
C23 adds various <math.h> function families originally defined in TS
18661-4. Add the compoundn functions, which compute (1+X) to the
power Y for integer Y (and X at least -1). The integer exponent has
type long long int in C23; it was intmax_t in TS 18661-4, and as with
other interfaces changed after their initial appearance in the TS, I
don't think we need to support the original version of the interface.
Note that these functions are "compoundn" with a trailing "n", *not*
"compound" (CORE-MATH has the wrong name, for example).
As with pown, I strongly encourage searching for worst cases for ulps
error for these implementations (necessarily non-exhaustively, given
the size of the input space). I also expect a custom implementation
for a given format could be much faster as well as more accurate (I
haven't tested or benchmarked the CORE-MATH implementation for
binary32); this is one of the more complicated and less efficient
functions to implement in a type-generic way.
As with exp2m1 and exp10m1, this showed up places where the
powerpc64le IFUNC setup is not as self-contained as one might hope (in
this case, without the changes specific to powerpc64le, there were
undefined references to __GI___expf128).
Tested for x86_64 and x86, and with build-many-glibcs.py.
C2Y adds unsigned versions of the abs functions (see C2Y draft N3467 and
proposal N3349).
Tested for x86_64.
Signed-off-by: Lenard Mollenkopf <glibc@lenardmollenkopf.de>
When libgcc is built with pac-ret, it requires to autenticate the
unwinding frame based on CFI information. The _dl_tlsdesc_dynamic
uses a custom calling convention, where it is responsible to save
and restore all registers it might use (even volatile).
The pac-ret support added by 1be3d6eb82
was added only on the slow-path, but the fast path also adds DWARF
Register Rule Instruction (cfi_adjust_cfa_offset) since it requires
to save/restore some auxiliary register. It seems that this is not
fully supported neither by libgcc nor AArch64 ABI [1].
Instead, move paciasp/autiasp to function prologue/epilogue to be
used on both fast and slow paths.
I also corrected the _dl_tlsdesc_dynamic comment description, it was
copied from i386 implementation without any adjustment.
Checked on aarch64-linux-gnu with a toolchain built with
--enable-standard-branch-protection on a system with pac-ret
support.
[1] https://github.com/ARM-software/abi-aa/blob/main/aadwarf64/aadwarf64.rst#id1
Reviewed-by: Yury Khrustalev <yury.khrustalev@arm.com>
C23 adds various <math.h> function families originally defined in TS
18661-4. Add the pown functions, which are like pow but with an
integer exponent. That exponent has type long long int in C23; it was
intmax_t in TS 18661-4, and as with other interfaces changed after
their initial appearance in the TS, I don't think we need to support
the original version of the interface. The test inputs are based on
the subset of test inputs for pow that use integer exponents that fit
in long long.
As the first such template implementation that saves and restores the
rounding mode internally (to avoid possible issues with directed
rounding and intermediate overflows or underflows in the wrong
rounding mode), support also needed to be added for using
SET_RESTORE_ROUND* in such template function implementations. This
required math-type-macros-float128.h to include <fenv_private.h>, so
it can tell whether SET_RESTORE_ROUNDF128 is defined. In turn, the
include order with <fenv_private.h> included before <math_private.h>
broke loongarch builds, showing up that
sysdeps/loongarch/math_private.h is really a fenv_private.h file
(maybe implemented internally before the consistent split of those
headers in 2018?) and needed to be renamed to fenv_private.h to avoid
errors with duplicate macro definitions if <math_private.h> is
included after <fenv_private.h>.
The underlying implementation uses __ieee754_pow functions (called
more than once in some cases, where the exponent does not fit in the
floating type). I expect a custom implementation for a given format,
that only handles integer exponents but handles larger exponents
directly, could be faster and more accurate in some cases.
I encourage searching for worst cases for ulps error for these
implementations (necessarily non-exhaustively, given the size of the
input space).
Tested for x86_64 and x86, and with build-many-glibcs.py.
Add function __inet_pton_chk which calls __chk_fail when the size of
argument dst is too small. inet_pton is redirected to __inet_pton_chk
or __inet_pton_warn when _FORTIFY_SOURCE is > 0.
Also add tests to debug/tst-fortify.c, update the abilist with
__inet_pton_chk and mention inet_pton fortification in maint.texi.
Co-authored-by: Frédéric Bérat <fberat@redhat.com>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
- Create the __inet_ntop_chk routine that verifies that the builtin size
of the destination buffer is at least as big as the size given by the
user.
- Redirect calls from inet_ntop to __inet_ntop_chk or __inet_ntop_warn
- Update the abilist for this new routine
- Update the manual to mention the new fortification
Reviewed-by: Florian Weimer <fweimer@redhat.com>
C23 adds various <math.h> function families originally defined in TS
18661-4. Add the powr functions, which are like pow, but with simpler
handling of special cases (based on exp(y*log(x)), so negative x and
0^0 are domain errors, powers of -0 are always +0 or +Inf never -0 or
-Inf, and 1^+-Inf and Inf^0 are also domain errors, while NaN^0 and
1^NaN are NaN). The test inputs are taken from those for pow, with
appropriate adjustments (including removing all tests that would be
domain errors from those in auto-libm-test-in and adding some more
such tests in libm-test-powr.inc).
The underlying implementation uses __ieee754_pow functions after
dealing with all special cases that need to be handled differently.
It might be a little faster (avoiding a wrapper and redundant checks
for special cases) to have an underlying implementation built
separately for both pow and powr with compile-time conditionals for
special-case handling, but I expect the benefit of that would be
limited given that both functions will end up needing to use the same
logic for computing pow outside of special cases.
My understanding is that powr(negative, qNaN) should raise "invalid":
that the rule on "invalid" for an argument outside the domain of the
function takes precedence over a quiet NaN argument producing a quiet
NaN result with no exceptions raised (for rootn it's explicit that the
0th root of qNaN raises "invalid"). I've raised this on the WG14
reflector to confirm the intent.
Tested for x86_64 and x86, and with build-many-glibcs.py.
Linux 6.13 adds four new syscalls. Update syscall-names.list and
regenerate the arch-syscall.h headers with build-many-glibcs.py
update-syscalls.
Tested with build-many-glibcs.py.
Current Bionic has this function, with enhanced error checking
(the undefined case terminates the process).
Reviewed-by: Joseph Myers <josmyers@redhat.com>
C23 adds various <math.h> function families originally defined in TS
18661-4. Add the rsqrt functions (1/sqrt(x)). The test inputs are
taken from those for sqrt.
Tested for x86_64 and x86, and with build-many-glibcs.py.
Remove unused _dl_hwcap_string defines. As a result many dl-procinfo.h headers
can be removed. This also removes target specific _dl_procinfo implementations
which only printed HWCAP strings using dl_hwcap_string.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This series removes various ILP32 defines that are now
no longer needed.
Remove PTR_ARG/SIZE_ARG.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Test that when we return from a function that enabled GCS at runtime
we get SIGSEGV. Also test that ucontext contains GCS block with the
GCS pointer.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
These tests validate that GCS tunable works as expected depending
on the GCS markings in the test binaries.
Tests validate both static and dynamically linked binaries.
These new tests are AArch64 specific. Moreover, they are included only
if linker supports the "-z gcs=<value>" option. If built, these tests
will run on systems with and without HWCAP_GCS. In the latter case the
tests will be reported as UNSUPPORTED.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The syscall pkey_alloc can return ENOSPC to indicate either that all
keys are in use or that the system runs in a mode in which memory
protection keys are disabled. In such case the test should not fail and
just return unsupported.
This matches the behaviour of the generic tst-pkey.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Unlike for BTI, the kernel does not process GCS properties so update
GL(dl_aarch64_gcs) before the GCS status is set.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Use the ARCH_SETUP_TLS hook to enable GCS in the static linked case.
The system call must be inlined and then GCS is enabled on a top
level stack frame that does not return and has no exception handlers
above it.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
This tunable controls Guarded Control Stack (GCS) for the process.
0 = disabled: do not enable GCS
1 = enforced: check markings and fail if any binary is not marked
2 = optional: check markings but keep GCS off if a binary is unmarked
3 = override: enable GCS, markings are ignored
By default it is 0, so GCS is disabled, value 1 will enable GCS.
The status is stored into GL(dl_aarch64_gcs) early and only applied
later, since enabling GCS is tricky: it must happen on a top level
stack frame. Using GL instead of GLRO because it may need updates
depending on loaded libraries that happen after readonly protection
is applied, however library marking based GCS setting is not yet
implemented.
Describe new tunable in the manual.
Co-authored-by: Yury Khrustalev <yury.khrustalev@arm.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Changed the makecontext logic: previously the first setcontext jumped
straight to the user callback function and the return address is set
to __startcontext. This does not work when GCS is enabled as the
integrity of the return address is protected, so instead the context
is setup such that setcontext jumps to __startcontext which calls the
user callback (passed in x20).
The map_shadow_stack syscall is used to allocate a suitably sized GCS
(which includes some reserved area to account for altstack signal
handlers and otherwise supports maximum number of 16 byte aligned
stack frames on the given stack) however the GCS is never freed as
the lifetime of ucontext and related stack is user managed.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Userspace ucontext needs to store GCSPR, it does not have to be
compatible with the kernel ucontext. For now we use the linux
struct gcs_context layout but only use the gcspr field from it.
Similar implementation to the longjmp code, supports switching GCS
if the target GCS is capped, and unwinding a continuous GCS to a
previous state.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
clang issues:
error: value size does not match register size specified by the
constraint and modifier [-Werror,-Wasm-operand-widths]
while tryng to use 32 bit variables with 'mrs' to get/set the
fpsr, dczid_el0, and ctr.
C23 adds various <math.h> function families originally defined in TS
18661-4. Add the atan2pi functions (atan2(y,x)/pi).
Tested for x86_64 and x86, and with build-many-glibcs.py.
C23 adds various <math.h> function families originally defined in TS
18661-4. Add the atanpi functions (atan(x)/pi).
Tested for x86_64 and x86, and with build-many-glibcs.py.
C23 adds various <math.h> function families originally defined in TS
18661-4. Add the asinpi functions (asin(x)/pi).
Tested for x86_64 and x86, and with build-many-glibcs.py.
C23 adds various <math.h> function families originally defined in TS
18661-4. Add the acospi functions (acos(x)/pi).
Tested for x86_64 and x86, and with build-many-glibcs.py.
C23 adds various <math.h> function families originally defined in TS
18661-4. Add the tanpi functions (tan(pi*x)).
Tested for x86_64 and x86, and with build-many-glibcs.py.
C23 adds various <math.h> function families originally defined in TS
18661-4. Add the sinpi functions (sin(pi*x)).
Tested for x86_64 and x86, and with build-many-glibcs.py.
C23 adds various <math.h> function families originally defined in TS
18661-4. Add the cospi functions (cos(pi*x)).
Tested for x86_64 and x86, and with build-many-glibcs.py.
This patch adds support for memory protection keys on AArch64 systems with
enabled Stage 1 permission overlays feature introduced in Armv8.9 / 9.4
(FEAT_S1POE) [1].
1. Internal functions "pkey_read" and "pkey_write" to access data
associated with memory protection keys.
2. Implementation of API functions "pkey_get" and "pkey_set" for
the AArch64 target.
3. AArch64-specific PKEY flags for READ and EXECUTE (see below).
4. New target-specific test that checks behaviour of pkeys on
AArch64 targets.
5. This patch also extends existing generic test for pkeys.
6. HWCAP constant for Permission Overlay Extension feature.
To support more accurate mapping of underlying permissions to the
PKEY flags, we introduce additional AArch64-specific flags. The full
list of flags is:
- PKEY_UNRESTRICTED: 0x0 (for completeness)
- PKEY_DISABLE_ACCESS: 0x1 (existing flag)
- PKEY_DISABLE_WRITE: 0x2 (existing flag)
- PKEY_DISABLE_EXECUTE: 0x4 (new flag, AArch64 specific)
- PKEY_DISABLE_READ: 0x8 (new flag, AArch64 specific)
The problem here is that PKEY_DISABLE_ACCESS has unusual semantics as
it overlaps with existing PKEY_DISABLE_WRITE and new PKEY_DISABLE_READ.
For this reason mapping between permission bits RWX and "restrictions"
bits awxr (a for disable access, etc) becomes complicated:
- PKEY_DISABLE_ACCESS disables both R and W
- PKEY_DISABLE_{WRITE,READ} disables W and R respectively
- PKEY_DISABLE_EXECUTE disables X
Combinations like the one below are accepted although they are redundant:
- PKEY_DISABLE_ACCESS | PKEY_DISABLE_READ | PKEY_DISABLE_WRITE
Reverse mapping tries to retain backward compatibility and ORs
PKEY_DISABLE_ACCESS whenever both flags PKEY_DISABLE_READ and
PKEY_DISABLE_WRITE would be present.
This will break code that compares pkey_get output with == instead
of using bitwise operations. The latter is more correct since PKEY_*
constants are essentially bit flags.
It should be noted that PKEY_DISABLE_ACCESS does not prevent execution.
[1] https://developer.arm.com/documentation/ddi0487/ka/ section D8.4.1.4
Co-authored-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Linux 6.11 has getrandom() in vDSO. It operates on a thread-local opaque
state allocated with mmap using flags specified by the vDSO.
Multiple states are allocated at once, as many as fit into a page, and
these are held in an array of available states to be doled out to each
thread upon first use, and recycled when a thread terminates. As these
states run low, more are allocated.
To make this procedure async-signal-safe, a simple guard is used in the
LSB of the opaque state address, falling back to the syscall if there's
reentrancy contention.
Also, _Fork() is handled by blocking signals on opaque state allocation
(so _Fork() always sees a consistent state even if it interrupts a
getrandom() call) and by iterating over the thread stack cache on
reclaim_stack. Each opaque state will be in the free states list
(grnd_alloc.states) or allocated to a running thread.
The cancellation is handled by always using GRND_NONBLOCK flags while
calling the vDSO, and falling back to the cancellable syscall if the
kernel returns EAGAIN (would block). Since getrandom is not defined by
POSIX and cancellation is supported as an extension, the cancellation is
handled as 'may occur' instead of 'shall occur' [1], meaning that if
vDSO does not block (the expected behavior) getrandom will not act as a
cancellation entrypoint. It avoids a pthread_testcancel call on the fast
path (different than 'shall occur' functions, like sem_wait()).
It is currently enabled for x86_64, which is available in Linux 6.11,
and aarch64, powerpc32, powerpc64, loongarch64, and s390x, which are
available in Linux 6.12.
Link: https://pubs.opengroup.org/onlinepubs/9799919799/nframe.html [1]
Co-developed-by: Jason A. Donenfeld <Jason@zx2c4.com>
Tested-by: Jason A. Donenfeld <Jason@zx2c4.com> # x86_64
Tested-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> # x86_64, aarch64
Tested-by: Xi Ruoyao <xry111@xry111.site> # x86_64, aarch64, loongarch64
Tested-by: Stefan Liebler <stli@linux.ibm.com> # s390x
This enables vectorisation of C23 logp1, which is an alias for log1p.
There are no new tests or ulp entries because the new symbols are simply
aliases.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
And struct sched_attr.
In sysdeps/unix/sysv/linux/bits/sched.h, the hack that defines
sched_param around the inclusion of <linux/sched/types.h> is quite
ugly, but the definition of struct sched_param has already been
dropped by the kernel, so there is nothing else we can do and maintain
compatibility of <sched.h> with a wide range of kernel header
versions. (An alternative would involve introducing a separate header
for this functionality, but this seems unnecessary.)
The existing sched_* functions that change scheduler parameters
are already incompatible with PTHREAD_PRIO_PROTECT mutexes, so
there is no harm in adding more functionality in this area.
The documentation mostly defers to the Linux manual pages.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>