1
0
mirror of https://sourceware.org/git/glibc.git synced 2025-11-15 15:21:18 +03:00
Commit Graph

6789 Commits

Author SHA1 Message Date
Adhemerval Zanella
3d52fd274e linux: Add mseal syscall support
It has been added on Linux 6.10 (8be7258aad44b5e25977a98db136f677fa6f4370)
as a way to block operations such as mapping, moving to another location,
shrinking the size, expanding the size, or modifying it to a pre-existing
memory mapping.

Although the system only works on 64-bit CPUs, the entrypoint was added
for all ABIs (since the kernel might eventually implement it for additional
ones and/or the ABI can execute on a 64-bit kernel).

Checked on x86_64-linux-gnu and aarch64-linux-gnu.

Reviewed-by: Collin Funk <collin.funk1@gmail.com>
2025-11-12 15:27:28 -03:00
Adhemerval Zanella
3078358ac6 math: Remove the SVID error handling from tgammaf
It improves latency for about 1.5% and throughput for about 2-4%.

Tested on x86_64-linux-gnu and i686-linux-gnu.
Reviewed-by: Wilco Dijkstra  <Wilco.Dijkstra@arm.com>
2025-11-05 10:19:37 -03:00
Adhemerval Zanella
de0e623434 math: Remove the SVID error handling from lgammaf/lgammaf_r
It improves latency throughput for about 2%.

Tested on x86_64-linux-gnu and i686-linux-gnu.
Reviewed-by: Wilco Dijkstra  <Wilco.Dijkstra@arm.com>
2025-11-05 09:27:07 -03:00
Adhemerval Zanella
c0be0b4527 Add FD_PIDFS_ROOT from Linux 6.17 to bits/fcntl-linux.h
It was added by commit 3941e37f62fe2c3c8b8675c12183185f20450539

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2025-11-05 07:15:52 -03:00
Adhemerval Zanella
1e750f62c4 Add AT_EXECVE_CHECK from Linux 6.14 to bits/fcntl-linux.h
It was added by commit a5874fde3c0884a33ed4145101052318c5e17c74

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2025-11-05 07:15:52 -03:00
Adhemerval Zanella
04e6bdb437 Add AT_HANDLE_CONNECTABLE from Linux 6.13 to bits/fcntl-linux.h
It was added by commit c374196b2b9f4b803fccd59ed82f0712041e21e1.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2025-11-05 07:15:52 -03:00
Adhemerval Zanella
03d9cb23b8 Update syscall lists for Linux 6.17
Linux 6.16 adds no new syscalls, while Linux 6.17 adds file_getattr
and file_setattr (commit be7efb2d20d67f334a7de2aef77ae6c69367e646).
Update syscall-names.list and regenerate the arch-syscall.h headers
with build-many-glibcs.py update-syscalls.
2025-11-05 07:15:52 -03:00
Adhemerval Zanella
c0c9524a11 Update PIDFD_* constants for Linux 6.17
The pidfd interface was extended with:

  * PIDFD_GET_INFO and pidfd_info (along with related extra flags) to
    allow get information about the process without the need to parse
    /proc (commit cdda1f26e74ba, Linux 6.13).

  * PIDFD_SELF_{THREAD,THREAD_GROUP,SELF,SELF_PROCESS} to allow
    pidfd_send_signal refer to the own process or thread lead groups
    without the need of allocating a file descriptor (commit f08d0c3a71114,
    Linux 6.15).

  * PIDFD_INFO_COREDUMP that extends PIDFD_GET_INFO to obtain coredump
    information.

Linux uAPI header defines both PIDFD_SELF_THREAD and
PIDFD_SELF_THREAD_GROUP on linux/fcntl.h (since they reserve part of the
AT_* values), however for glibc I do not see any good reason to add pidfd
definitions on fcntl-linux.h.

The tst-pidfd.c is extended with some PIDFD_SELF_* tests and a new
‘tst-pidfd_getinfo.c’ test is added to check PIDFD_GET_INFO. The
PIDFD_INFO_COREDUMP tests would require very large and complex tests
that are already covered by kernel tests.

Checked on aarch64-linux-gnu and x86_64-linux-gnu on kernels 6.8 and
6.17.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2025-11-05 07:15:52 -03:00
Adhemerval Zanella
bd7be9f447 Update kernel version to 6.17 in header constant tests
There are no new constants covered by tst-mman-consts.py,
tst-mount-consts.py or tst-sched-consts.py in Linux 6.17.
2025-11-05 07:15:52 -03:00
Adhemerval Zanella
7ec8eb5676 math: Remove the SVID error handling from atan2f
It improves latency for about 3-6% and throughput for about 5-12%.

Tested on x86_64-linux-gnu and i686-linux-gnu.
Reviewed-by: Wilco Dijkstra  <Wilco.Dijkstra@arm.com>
2025-11-05 07:15:52 -03:00
Peter Bergner
47975914fb riscv: Add vector registers to __SYSCALL_CLOBBERS
The Linux kernel ABI specifies that the vector registers are not preserved
across system calls, but the __SYSCALL_CLOBBERS macro doesn't mention them.
This could possibly lead to compilers trying to keep data in the vector
registers across the syscall leading to corruption.  Add the vector registers
to __SYSCALL_CLOBBERS when the vector extension is enabled.  If the vector
extension is enabled, then require GCC 15 or later and RVV 1.0 or later.

Fixes: 36960f0c76 ("RISC-V: Linux Syscall Interface")

Signed-off-by: Peter Bergner <bergner@tenstorrent.com>
2025-11-04 09:18:56 -06:00
Adhemerval Zanella
0dfc849eff math: Remove the SVID error handling wrapper from sqrt
i386 and m68k architectures should use math-use-builtins-sqrt.h rather
than relying on architecture-specific or inline assembly implementations.

The PowerPC optimization for PPC 601/603 (30 years old) is removed.

Tested on x86_64-linux-gnu and i686-linux-gnu.

Reviewed-by: Wilco Dijkstra  <Wilco.Dijkstra@arm.com>
2025-11-04 04:14:01 -03:00
Adhemerval Zanella
f27a146409 math: Remove the SVID error handling from sinhf
It improves latency for about 3-10% and throughput for about 5-15%.

Tested on x86_64-linux-gnu and i686-linux-gnu.

Reviewed-by: Wilco Dijkstra  <Wilco.Dijkstra@arm.com>
2025-11-04 04:14:01 -03:00
Adhemerval Zanella
0e1a1178ee math: Remove the SVID error handling from remainder
The optimized i386 version is faster than the generic one, and
gcc implements it through the builtin. This optimization enables
us to migrate the implementation to a C version.  The performance
on a Zen3 chip is similar to the SVID one.

The m68k provided an optimized version through __m81_u(remainderf)
(mathimpl.h), and gcc does not implement it through a builtin
(different than i386).

Performance improves a bit on x86_64 (Zen3, gcc 15.2.1):

reciprocal-throughput           input    master   NO-SVID  improvement
x86_64                     subnormals   18.8522   16.2506       13.80%
x86_64                         normal  421.8260  403.9270        4.24%
x86_64                 close-exponent   21.0579   18.7642       10.89%
i686                       subnormals   21.3443   21.4229       -0.37%
i686                           normal  525.8380   538.807       -2.47%
i686                   close-exponent   21.6589   21.7983       -0.64%

Tested on x86_64-linux-gnu and i686-linux-gnu.

Reviewed-by: Wilco Dijkstra  <Wilco.Dijkstra@arm.com>
2025-11-04 04:14:01 -03:00
Adhemerval Zanella
c4c6c79d70 math: Remove the SVID error handling from remainderf
The optimized i386 version is faster than the generic one, and gcc
implements it through the builtin.  This optimization enables us to
migrate the implementation to a C version.  The performance on a Zen3
chip is similar to the SVID one.

The m68k provided an optimized version through __m81_u(remainderf)
(mathimpl.h), and gcc does not implement it through a builtin (different
than i386).

Performance improves a bit on x86_64 (Zen3, gcc 15.2.1):

reciprocal-throughput          input   master  NO-SVID  improvement
x86_64                    subnormals  17.5349  15.6125       10.96%
x86_64                        normal  53.8134  52.5754        2.30%
x86_64                close-exponent  20.0211  18.6656        6.77%
i686                      subnormals  21.8105  20.1856        7.45%
i686                          normal  73.1945  71.2199        2.70%
i686                  close-exponent  22.2141   20.331        8.48%

Tested on x86_64-linux-gnu and i686-linux-gnu.

Reviewed-by: Wilco Dijkstra  <Wilco.Dijkstra@arm.com>
2025-11-04 04:14:01 -03:00
Adhemerval Zanella
7e5fe1974c sh: Move atomic-machine to generic sysdep
There is no Linux specific definitions.

Reviewed-by: Wilco Dijkstra  <Wilco.Dijkstra@arm.com>
2025-11-04 04:14:01 -03:00
Adhemerval Zanella
1f5d8663ea riscv: Consolidade atomic-machine.h and remove ununsed atomic macros
The resulting definitions are not Linux specific, so move the header
to generic sysdep folder.

Reviewed-by: Wilco Dijkstra  <Wilco.Dijkstra@arm.com>
2025-11-04 04:14:01 -03:00
Adhemerval Zanella
9201eabed8 loongarch: Consolidate atomic-machine.h and remove ununsed atomic macros
These are already provided by the generic include/atomic.h and
the resulting macros are not Linux specific.

Reviewed-by: Wilco Dijkstra  <Wilco.Dijkstra@arm.com>
2025-11-04 04:14:01 -03:00
Adhemerval Zanella
3642bf4800 m68k: Consolidade atomic-machine.h and Remove ununsed atomic macros
Both m68k and m68k-colfire do not support 64 bit atomis.  The
atomic_barrier syscall on m68k is a no-op, so it can use the compiler
builtin.

Reviewed-by: Wilco Dijkstra  <Wilco.Dijkstra@arm.com>
2025-11-04 04:14:01 -03:00
Adhemerval Zanella
6322a325fc hppa: Move atomic-machine to generic sysdep
There is no Linux specific definitions.

Reviewed-by: Wilco Dijkstra  <Wilco.Dijkstra@arm.com>
2025-11-04 04:14:01 -03:00
Adhemerval Zanella
5a7a9a57c2 arm: Consolidate atomic-machine.h and Remove ununsed atomic macros
The libgcc provides the required support to calling the kernel
auxiliary routines for !__GCC_HAVE_SYNC_COMPARE_AND_SWAP_4.

Reviewed-by: Wilco Dijkstra  <Wilco.Dijkstra@arm.com>
2025-11-04 04:14:01 -03:00
Yury Khrustalev
2f77aec043 aarch64: fix cfi directives around __libc_arm_za_disable
Incorrect CFI directive corrupted call stack information
and prevented debuggers from correctly displaying call
stack information.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2025-10-31 09:48:47 +00:00
Adhemerval Zanella
ee946212fe math: Remove the SVID error handling wrapper from yn/jn
Tested on x86_64-linux-gnu and i686-linux-gnu.

Reviewed-by: Wilco Dijkstra  <Wilco.Dijkstra@arm.com>
2025-10-30 15:41:35 -03:00
Adhemerval Zanella
8d4815e6d7 math: Remove the SVID error handling wrapper from y1/j1
Tested on x86_64-linux-gnu and i686-linux-gnu.

Reviewed-by: Wilco Dijkstra  <Wilco.Dijkstra@arm.com>
2025-10-30 15:41:33 -03:00
Adhemerval Zanella
b050cb53b0 math: Remove the SVID error handling wrapper from y0/j0
Tested on x86_64-linux-gnu and i686-linux-gnu.

Reviewed-by: Wilco Dijkstra  <Wilco.Dijkstra@arm.com>
2025-10-30 15:41:31 -03:00
Adhemerval Zanella
03eeeba705 math: Remove the SVID error handling from coshf
It improves latency for about 3-10% and throughput for about 5-15%.

Tested on x86_64-linux-gnu and i686-linux-gnu.

Reviewed-by: Wilco Dijkstra  <Wilco.Dijkstra@arm.com>
2025-10-30 15:41:28 -03:00
Adhemerval Zanella
555c39c0fc math: Remove the SVID error handling from atanhf
It improves latency for about 1-10% and throughput for about 5-10%.

Tested on x86_64-linux-gnu and i686-linux-gnu.

Reviewed-by: Wilco Dijkstra  <Wilco.Dijkstra@arm.com>
2025-10-30 15:41:26 -03:00
Adhemerval Zanella
8facb464b4 math: Remove the SVID error handling from acoshf
It improves latency for about 3-7% and throughput for about 5-10%.

Tested on x86_64-linux-gnu and i686-linux-gnu.

Reviewed-by: Wilco Dijkstra  <Wilco.Dijkstra@arm.com>
2025-10-30 15:41:24 -03:00
Adhemerval Zanella
f92aba68bc math: Remove the SVID error handling from asinf
It improves latency for about 2% and throughput for about 5%.

Tested on x86_64-linux-gnu and i686-linux-gnu.

Reviewed-by: Wilco Dijkstra  <Wilco.Dijkstra@arm.com>
2025-10-30 15:41:22 -03:00
Adhemerval Zanella
9f8dea5b5d math: Remove the SVID error handling from acosf
It improves latency for about 2-10% and throughput for about 5-10%.

Tested on x86_64-linux-gnu and i686-linux-gnu.

Reviewed-by: Wilco Dijkstra  <Wilco.Dijkstra@arm.com>
2025-10-30 15:41:20 -03:00
Adhemerval Zanella
0b484d7b77 math: Remove the SVID error handling from log10f
It improves latency for about 3-10% and throughput for about 5-10%.

Tested on x86_64-linux-gnu and i686-linux-gnu.

Reviewed-by: Wilco Dijkstra  <Wilco.Dijkstra@arm.com>
2025-10-30 15:41:17 -03:00
Adhemerval Zanella
6deadd4eb6 m68k: Remove SVID error handling on fmod
The m68k provided an optimized version through __m81_u(fmod)
(mathimpl.h), and gcc does not implement it through a builtin
(different than i386).

Reviewed-by: Wilco Dijkstra  <Wilco.Dijkstra@arm.com>
2025-10-30 15:41:15 -03:00
Adhemerval Zanella
ade9f30ce2 m68k: Remove the SVID error handling from fmodf
The m68k provided an optimized version through __m81_u(fmodf)
(mathimpl.h), and gcc does not implement it through a builtin
(different than i386).

Reviewed-by: Wilco Dijkstra  <Wilco.Dijkstra@arm.com>
2025-10-30 15:41:10 -03:00
Adhemerval Zanella
1dd2163e51 i386: Remove the SVID error handling from fmodf
The optimized i386 version is faster than the generic one, and gcc
implements it through the builtin. It allows us to move the
implementation to a C one.

The performance on a Zen3 chip is slight better:

reciprocal-throughput           input   master  no-SVID  improvement
i686                       subnormals  22.4741  20.1571       10.31%
i686                           normal  74.1631  70.3606        5.13%
i686                   close-exponent  22.5625  20.2435       10.28%

Tested on i686-linux-gnu.

Reviewed-by: Wilco Dijkstra  <Wilco.Dijkstra@arm.com>
2025-10-30 15:41:07 -03:00
Adhemerval Zanella
bfee89dc8a i386: Remove the SVID error handling from fmod
The optimized i386 version is faster than the generic one, and gcc
implements it through the builtin. It allows us to move the
implementation to a C one. The performance on a Zen3 chip is
similar to the SVID one.

Tested on i686-linux-gnu.

Reviewed-by: Wilco Dijkstra  <Wilco.Dijkstra@arm.com>
2025-10-30 15:40:41 -03:00
Joseph Myers
096fcdc0a5 Rename uimaxabs to umaxabs (bug 33325)
The C2y function uimaxabs has been renamed to umaxabs.  Implement this
change in glibc, keeping a compat symbol under the old name, copying
the test to test the new name and changing the old test to test the
compat symbol.  Jakub has done the corresponding change to the
built-in function in GCC.

Tested for x86_64 and x86.
2025-10-28 12:15:02 +00:00
Collin Funk
3d20d746c3 Linux: fix tst-copy_file_range-large test on 32-bit platforms.
Since SSIZE_MAX is less than UINT_MAX on 32-bit platforms we must AND
the expression with SSIZE_MAX.

Tested on x86_64 and x86.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2025-10-26 19:39:49 -07:00
Adhemerval Zanella
062510a0c1 termios: Suppress clang -Winitializer-overrider on ___cbaud_to_speed
clang-18 and onwards issues:

../sysdeps/unix/sysv/linux/speed.c:71:23: error: initializer overrides prior initialization of this subobject [-Werror,-Winitializer-overrides]
   71 |       [_cbix(__B0)] = 0,
      |                       ^
../sysdeps/unix/sysv/linux/speed.c:70:34: note: previous initialization is here
   70 |       [0 ... _cbix(CBAUDMASK)] = -1,
[...]

The override is explicit used to support the same initialization on
multiple platforms (since the baud values differ on alpha and powerpc).

Reviewed-by: Collin Funk <collin.funk1@gmail.com>
2025-10-21 09:26:04 -03:00
Adhemerval Zanella
76dfd91275 Suppress -Wmaybe-uninitialized only for gcc
The warning is not supported by clang.

Reviewed-by: Sam James <sam@gentoo.org>
2025-10-21 09:24:05 -03:00
Luc Michel
c284fd5eaf microblaze: fix __syscall_cancel_arch (BZ 33547)
The __syscall_cancel_arch function has an epilogue that does not match
the prologue. The stack is not used and the return address still lies in
r15 when reaching the epilogue. Fix the epilogue by simply returning
from the function.

Signed-off-by: Luc Michel <luc.michel@amd.com>
Tested-by: gopi@sankhya.com
Reviewed-by: Neal Frager <neal.frager@amd.com>
Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2025-10-20 11:33:54 -03:00
Adhemerval Zanella
a252205e1c linux: Fix function point cast on vDSO handling
There is no need to cast to avoid, both pointer already have the
expected type.

It fixes the clang -Wpointer-type-mismatch error:

../sysdeps/unix/sysv/linux/gettimeofday.c:43:6: error: pointer type mismatch ('int (*)(struct timeval *, void *)' and 'void *') [-Werror,-Wpointer-type-mismatch]
   41 | libc_ifunc (__gettimeofday,
      | ~~~~~~~~~~~~~~~~~~~~~~~~~~~
   42 |             GLRO(dl_vdso_gettimeofday) != NULL
      |             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   43 |             ? VDSO_IFUNC_RET (GLRO(dl_vdso_gettimeofday))
      |             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   44 |             : (void*) __gettimeofday_syscall)
      |             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./../include/libc-symbols.h:789:53: note: expanded from macro 'libc_ifunc'
  789 | #define libc_ifunc(name, expr) __ifunc (name, name, expr, void, INIT_ARCH)
      |                                ~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~
./../include/libc-symbols.h:705:34: note: expanded from macro '__ifunc'
  705 |   __ifunc_args (type_name, name, expr, init, arg)
      |   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~
./../include/libc-symbols.h:677:38: note: expanded from macro '__ifunc_args'
  677 |   __ifunc_resolver (type_name, name, expr, init, static, __VA_ARGS__);  \
      |   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./../include/libc-symbols.h:667:33: note: expanded from macro '__ifunc_resolver'
  667 |     __typeof (type_name) *res = expr;                                   \
      |                                 ^~~~

Reviewed-by: Sam James <sam@gentoo.org>
2025-10-20 11:33:54 -03:00
Adhemerval Zanella
8ec0754067 aarch64: Fix gcs linker flags
clang does not work by using whitespace for defining the -z option:

$ make test t=misc/tst-gcs-disabled
[...]
clang: error: no such file or directory: 'gcs=always'

Use the usual comma separate way.

Reviewed-by: Florian Weimer <fweimer@redhat.com>
2025-10-20 11:33:54 -03:00
Adhemerval Zanella
917425ca6d posix: Defined _POSIX_VDISABLE as integer literal
The constant should be used with c_cc, which for all supported ABIs
is defined as unsigned char.  By using it as literar char constant,
clang triggers an error when compared with signal literal on ABIs that
define 'char' as unsigned.

On aarch64, clang shows:

  ../sysdeps/posix/fpathconf.c:118:21: error: right side of operator
  converted from negative value to unsigned: -1 to 18446744073709551615
  [-Werror]
  #if _POSIX_VDISABLE == -1
    ~~~~~~~~~~~~~~~ ^  ~~

Reviewed-by: Collin Funk <collin.funk1@gmail.com>
2025-10-20 11:33:54 -03:00
Joseph Myers
ea18d5a4c2 Implement C23 memalignment
Add the C23 memalignment function (query the alignment of a pointer)
to glibc.

Given how simple this operation is, it would make sense for compilers
to inline calls to this function, but I'm treating that as a compiler
matter (compilers should add it as a built-in function) rather than
adding an inline version to glibc headers (although such an inline
version would be reasonable as well).  I've filed
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122117 for this feature
in GCC.

Tested for x86_64 and x86.
2025-10-17 16:56:59 +00:00
Yury Khrustalev
27effb3d50 aarch64: clear ZA state of SME before clone and clone3 syscalls
This change adds a call to the __arm_za_disable() function immediately
before the SVC instruction inside clone() and clone3() wrappers. It also
adds a macro for inline clone() used in fork() and adds the same call to
the vfork implementation. This sets the ZA state of SME to "off" on return
from these functions (for both the child and the parent).

The __arm_za_disable() function is described in [1] (8.1.3). Note that
the internal Glibc name for this function is __libc_arm_za_disable().

When this change was originally proposed [2,3], it generated a long
discussion where several questions and concerns were raised. Here we
will address these concerns and explain why this change is useful and,
in fact, necessary.

In a nutshell, a C library that conforms to the AAPCS64 spec [1] (pertinent
to this change, mainly, the chapters 6.2 and 6.6), should have a call to the
__arm_za_disable() function in clone() and clone3() wrappers. The following
explains in detail why this is the case.

When we consider using the __arm_za_disable() function inside the clone()
and clone3() libc wrappers, we talk about the C library subroutines clone()
and clone3() rather than the syscalls with similar names. In the current
version of Glibc, clone() is public and clone3() is private, but it being
private is not pertinent to this discussion.

We will begin with stating that this change is NOT a bug fix for something
in the kernel. The requirement to call __arm_za_disable() does NOT come from
the kernel. It also is NOT needed to satisfy a contract between the kernel
and userspace. This is why it is not for the kernel documentation to describe
this requirement. This requirement is instead needed to satisfy a pure userspace
scheme outlined in [1] and to make sure that software that uses Glibc (or any
other C library that has correct handling of SME states (see below)) conforms
to [1] without having to unnecessarily become SME-aware thus losing portability.

To recap (see [1] (6.2)), SME extension defines SME state which is part of
processor state. Part of this SME state is ZA state that is necessary to
manage ZA storage register in the context of the ZA lazy saving scheme [1]
(6.6). This scheme exists because it would be challenging to handle ZA
storage of SME in either callee-saved or caller-saved manner.

There are 3 kinds of ZA state that are defined in terms of the PSTATE.ZA
bit and the TPIDR2_EL0 register (see [1] (6.6.3)):

- "off":       PSTATE.ZA == 0
- "active":    PSTATE.ZA == 1 TPIDR2_EL0 == null
- "dormant":   PSTATE.ZA == 1 TPIDR2_EL0 != null

As [1] (6.7.2) outlines, every subroutine has exactly one SME-interface
depending on the permitted ZA-states on entry and on normal return from
a call to this subroutine. Callers of a subroutine must know and respect
the ZA-interface of the subroutines they are using. Using a subroutine
in a way that is not permitted by its ZA-interface is undefined behaviour.

In particular, clone() and clone3() (the C library functions) have the
ZA-private interface. This means that the permitted ZA-states on entry
are "off" and "dormant" and that the permitted states on return are "off"
or "dormant" (but if and only if it was "dormant" on entry).

This means that both functions in question should correctly handle both
"off" and "dormant" ZA-states on entry. The conforming states on return
are "off" and "dormant" (if inbound state was already "dormant").

This change ensures that the ZA-state on return is always "off". Note,
that, in the context of clone() and clone3(), "on return" means a point
when execution resumes at certain address after transferring from clone()
or clone3(). For the caller (we may refer to it as "parent") this is the
return address in the link register where the RET instruction jumps. For
the "child", this is the target branch address.

So, the "off" state on return is permitted and conformant. Why can't we
retain the "dormant" state? In theory, we can, but we shouldn't, here is
why.

Every subroutine with a private-ZA interface, including clone() and clone3(),
must comply with the lazy saving scheme [1] (6.7.2). This puts additional
responsibility on a subroutine if ZA-state on return is "dormant" because
this state has special meaning. The "caller" (that is the place in code
where execution is transferred to, so this include both "parent" and "child")
may check the ZA-state and use it as per the spec of the "dormant" state that
is outlined in [1] (6.6.6 and 6.6.7).

Conforming to this would require more code inside of clone() and clone3()
which hardly is desirable.

For the return to "parent" this could be achieved in theory, but given that
neither clone() nor clone3() are supposed to be used in the middle of an
SME operation, if wouldn't be useful. For the "return" to "child" this
would be particularly difficult to achieve given the complexity of these
functions and their interfaces. Most importantly, it would be illegal
and somewhat meaningless to allow a "child" to start execution in the
"dormant" ZA-state because the very essence of the "dormant" state implies
that there is a place to return and that there is some outer context that
we are allowed to interact with.

To sum up, calling __arm_za_disable() to ensure the "off" ZA-state when the
execution resumes after a call to clone() or clone3() is correct and also
the most simple way to conform to [1].

Can there be situations when we can avoid calling __arm_za_disable()?

Calling __arm_za_disable() implies certain (sufficiently small) overhead,
so one might rightly ponder avoiding making a call to this function when
we can afford not to. The most trivial cases like this (e.g. when the
calling thread doesn't have access to SME or to the TPIDR2_EL0 register)
are already handled by this function (see [1] (8.1.3 and 8.1.2)). Reasoning
about other possible use cases would require making code inside clone() and
clone3() more complicated and it would defeat the point of trying to make
an optimisation of not calling __arm_za_disable().

Why can't the kernel do this instead?

The handling of SME state by the kernel is described in [4]. In short,
kernel must not impose a specific ZA-interface onto a userspace function.
Interaction with the kernel happens (among other thing) via system calls.
In Glibc many of the system calls (notably, including SYS_clone and
SYS_clone3) are used via wrappers, and the kernel has no control of them
and, moreover, it cannot dictate how these wrappers should behave because
it is simply outside of the kernel's remit.

However, in certain cases, the kernel may ensure that a "child" doesn't
start in an incorrect state. This is what is done by the recent change
included in 6.16 kernel [5]. This is not enough to ensure that code that
uses clone() and clone3() function conforms to [1] when it runs on a
system that provides SME, hence this change.

[1]: https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aapcs64.rst
[2]: https://inbox.sourceware.org/libc-alpha/20250522114828.2291047-1-yury.khrustalev@arm.com
[3]: https://inbox.sourceware.org/libc-alpha/20250609121407.3316070-1-yury.khrustalev@arm.com
[4]: https://www.kernel.org/doc/html/v6.16/arch/arm64/sme.html
[5]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=cde5c32db55740659fca6d56c09b88800d88fd29

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2025-10-14 09:42:46 +01:00
Yury Khrustalev
b4b713bd89 aarch64: define macro for calling __libc_arm_za_disable
A common sequence of instructions is used in several places
in assembly files, so define it in one place as an assembly
macro.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2025-10-14 09:42:46 +01:00
Arjun Shankar
88ce558a31 string: Add tests for unique strerror and strsignal strings
strerror, strsignal, and their variants should return unique strings for
each known (and, depending on the function, unknown) error/signal.  Add
tests to verify this for strerror, strerror_r (GNU and XSI compliant
variants), and strerror_l (for the C locale), strerrordesc_np,
strsignal, sigabbrev_np, and sigdescr_np.

Co-authored-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
2025-10-13 19:04:44 +02:00
Uros Bizjak
3a0a8eae50 x86: Fix trivial code formatting erros in my last two commits
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
2025-10-12 17:59:16 +02:00
Uros Bizjak
bb019bc68f i386: Use __seg_gs qualifiers in PTR_{MANGLE,DEMANGLE}() macros
Use __seg_gs named address space qualifiers in PTR_MANGLE() and
PTR_DEMANGLE() macros to access the pointer_guard field in the TCB.

This change allows the compiler to eliminate redundant reads of
the variable, reducing the number of reads from 105 to 94 and
decreasing the text size of the library by 280 bytes.

While at it, fix a few trivial whitespace issues as well

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2025-10-12 17:48:55 +02:00
Uros Bizjak
60e3ada68d x86_64: Use __seg_fs qualifiers in PTR_{MANGLE,DEMANGLE}() macros
Use __seg_fs named address space qualifiers in PTR_MANGLE() and
PTR_DEMANGLE() macros to access the pointer_guard field in the TCB.

This change allows the compiler to eliminate redundant reads of
the variable, reducing the number of reads from 98 to 89 and
decreasing the text size of the library by 512 bytes.

While at it, fix a few trivial whitespace issues as well.

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2025-10-12 17:47:55 +02:00