Some kernels return EINVAL for ioctl (PIDFD_GET_INFO) on pidfd
descriptors.
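A test can treat such an error as "feature unavailable" rather than as a
failure. A minimal sketch, assuming a hypothetical helper name (not part
of glibc) and assuming that ENOTTY and ENOSYS should be handled the same
way as EINVAL:

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical helper: decide whether an errno value reported by
   ioctl (fd, PIDFD_GET_INFO, ...) means "not supported by this kernel".
   The set of accepted values is an assumption of this sketch.  */
static bool
pidfd_getinfo_unsupported (int err)
{
  return err == EINVAL || err == ENOTTY || err == ENOSYS;
}
```

A caller would check this right after a failing ioctl and mark the test
as unsupported instead of failed.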
Checked on aarch64-linux-gnu with Linux 6.12.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Verify that the kernel side of the termios interface gets the various
speed fields set according to our current canonicalization policy.
[ v2.1: fix formatting - Adhemerval Netto ]
[ v4: fix typo in patch description - Dan Horák ]
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> (v2.1)
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
The support for lock elision was already deprecated with glibc 2.42:
commit 77438db8cf
"Mark support for lock elision as deprecated."
See also discussions:
https://sourceware.org/pipermail/libc-alpha/2025-July/168492.html
This patch removes the architecture specific support for lock elision
for x86, powerpc and s390 by removing the elision-conf.h, elision-conf.c,
elision-lock.c, elision-timed.c, elision-unlock.c, elide.h, htm.h/hle.h files.
Those generic files are also removed.
The architecture specific structures are adjusted and the elision fields are
marked as unused. See struct_mutex.h files.
Furthermore in struct_rwlock.h, the leftover __rwelision was also removed.
Those were originally removed with commit 0377a7fde6
"nptl: Remove rwlock elision definitions"
and by chance reintroduced with commit 7df8af43ad
"nptl: Add struct_rwlock.h"
The common code (e.g. the pthread_mutex-files) is changed back to the time
before lock elision was introduced with the x86-support:
- commit 1cdbe57948
"Add the low level infrastructure for pthreads lock elision with TSX"
- commit b023e4ca99
"Add new internal mutex type flags for elision."
- commit 68cc29355f
"Add minimal test suite changes for elision enabled kernels"
- commit e8c659d74e
"Add elision to pthread_mutex_{try,timed,un}lock"
- commit 49186d21ef
"Disable elision for any pthread_mutexattr_settype call"
- commit 1717da59ae
"Add a configure option to enable lock elision and disable by default"
Elision is removed also from the tunables, the initialization part, the
pretty-printers and the manual.
Some extra handling in the testsuite is removed as well as the full tst-mutex10
testcase, which tested a race while enabling lock elision.
I've also searched the code for "elision", "elide", "transaction" and e.g.
cleaned some comments.
I've run the testsuite on x86_64 and s390x and run the build-many-glibcs.py
script.
Thanks to Sachin Monga, this patch is also tested on powerpc.
A NEWS entry also mentions the removal.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Factor out the internal kernel interface from termios_internal.h, so
that it can be used in test code without causing breakage due to glibc
internals used in headers.
[ v3: fix Alpha build breakage ]
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
After getting more experience with the various broken direct-to-ioctl
termios2 hacks using Fedora 43 beta, I have found a fair number of
cases where the software would fail to set, or clear CIBAUD for
non-split-speed operation.
Thus it seems it will help improve compatibility to clear the kernel-side
version of c_cflag & CIBAUD (having the same meaning to the Linux
kernel as the speed 0 has for cfsetibaud(), i.e. force the input speed
to equal the output speed) for non-split-speed operation, rather than
having it explicitly equal the output speed in CBAUD.
When writing the code that went into glibc 2.42 I had considered this
issue, and had to make an educated guess which way would be more
likely to break fewer things. Unfortunately, it appears I guessed
wrong.
A third option would be to *always* set CIBAUD to __BOTHER, even for
the standard baud rates. However, that is an even bigger departure
from legacy behavior, whereas this variant mostly preserves current
behavior in terms of under what conditions buggy utilities will
continue to work.
This change is in tcsetattr() rather than
___termios2_canonicalize_speeds(), as it should not be run for
tcgetattr(); that would break split speed support for the legacy
interface versions of cfgetispeed() and cfsetispeed().
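The idea can be sketched as follows. This is not the glibc code: the
SKETCH_ constants and the function name are hypothetical, and the values
mirror the generic Linux layout (some architectures differ).

```c
#include <stdint.h>

/* Illustrative constants; the SKETCH_ names are hypothetical and the
   values follow the generic Linux layout only.  */
#define SKETCH_CBAUD   0x0000100fU
#define SKETCH_IBSHIFT 16
#define SKETCH_CIBAUD  (SKETCH_CBAUD << SKETCH_IBSHIFT)

/* Return the c_cflag value to pass to the kernel.  When the input speed
   field equals the output speed field (no split speed), clear CIBAUD so
   that the kernel makes the input speed follow the output speed.  */
static uint32_t
kernel_cflag_sketch (uint32_t cflag)
{
  uint32_t out = cflag & SKETCH_CBAUD;
  uint32_t in = (cflag & SKETCH_CIBAUD) >> SKETCH_IBSHIFT;
  if (in == out)
    cflag &= ~SKETCH_CIBAUD;
  return cflag;
}
```

A split-speed setting (different input and output fields) passes through
unchanged, which is why the transformation belongs on the set path only.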
[ v2: fixed comment style ]
Resolves: BZ #33340
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The file was added with a GPL reference (but LGPL statement) in
commit 0d6bed7150 ("hppa: Add
____longjmp_check C implementation.").
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Reviewed-by: Collin Funk <collin.funk1@gmail.com>
As discussed in bug 28327, C23 changed the fromfp functions to return
floating types instead of intmax_t / uintmax_t. (Although the
motivation in N2548 was reducing the use of intmax_t in library
interfaces, the new version does have the advantage of being able to
specify arbitrary integer widths for e.g. assigning the result to a
_BitInt, as well as being able to indicate an error case in-band with
a NaN return.)
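To illustrate the in-band error reporting, here is a minimal sketch of
the round-toward-zero case. The function name is hypothetical, this is
not the glibc implementation, it only handles widths 1 to 32, and it
ignores the sign of zero results and floating-point exceptions.

```c
#include <math.h>

/* Hypothetical helper: truncate X toward zero and return the result as
   a double if it fits a signed integer of WIDTH bits, otherwise NaN.  */
static double
fromfp_trunc_sketch (double x, unsigned int width)
{
  if (x != x || width == 0 || width > 32)
    return NAN;
  double lim = (double) (1LL << (width - 1));   /* 2^(width-1) */
  /* The truncated value must lie in [-lim, lim - 1].  */
  if (x <= -lim - 1.0 || x >= lim)
    return NAN;
  return (double) (long long) x;   /* conversion truncates toward zero */
}
```

Because the result is a floating type, a caller can test it with isnan
before converting it to whatever integer type it needs.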
As with other such changes from interfaces introduced in TS 18661,
implement the new types as a replacement for the old ones, with the
old functions remaining as compat symbols but not supported as an API.
The test generator used for many of the tests is updated to handle
both versions of the functions.
Tested for x86_64 and x86, and with build-many-glibcs.py.
Also tested tgmath tests for x86_64 with GCC 7 to make sure that the
modified case for older compilers in <tgmath.h> does work.
Also tested for powerpc64le to cover the ldbl-128ibm implementation
and the other things that are handled differently for that
configuration. The new tests fail for ibm128, but all the failures
relate to incorrect signs of zero results and turn out to arise from
bugs in the underlying roundl, ceill, truncl and floorl
implementations that I've reported in bug 33623, rather than
indicating any bug in the actual new implementation of the functions
for that format. So given fixes for those functions (which shouldn't
be hard, and of course should add to the tests for those functions
rather than relying only on indirect testing via fromfp), the fromfp
tests should start passing for ibm128 as well.
It was added in Linux 6.10 (8be7258aad44b5e25977a98db136f677fa6f4370)
as a way to block operations on a pre-existing memory mapping, such as
mapping over it, moving it to another location, shrinking its size,
expanding its size, or modifying it.
Although the system only works on 64-bit CPUs, the entrypoint was added
for all ABIs (since the kernel might eventually implement it for additional
ones and/or the ABI can execute on a 64-bit kernel).
Checked on x86_64-linux-gnu and aarch64-linux-gnu.
Reviewed-by: Collin Funk <collin.funk1@gmail.com>
It improves latency by about 1.5% and throughput by about 2-4%.
Tested on x86_64-linux-gnu and i686-linux-gnu.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Linux 6.16 adds no new syscalls, while Linux 6.17 adds file_getattr
and file_setattr (commit be7efb2d20d67f334a7de2aef77ae6c69367e646).
Update syscall-names.list and regenerate the arch-syscall.h headers
with build-many-glibcs.py update-syscalls.
The pidfd interface was extended with:
* PIDFD_GET_INFO and pidfd_info (along with related extra flags) to
allow getting information about the process without the need to parse
/proc (commit cdda1f26e74ba, Linux 6.13).
* PIDFD_SELF_{THREAD,THREAD_GROUP,SELF,SELF_PROCESS} to allow
pidfd_send_signal to refer to the calling thread or its thread-group
leader without the need to allocate a file descriptor (commit
f08d0c3a71114, Linux 6.15).
* PIDFD_INFO_COREDUMP that extends PIDFD_GET_INFO to obtain coredump
information.
The Linux uAPI header defines both PIDFD_SELF_THREAD and
PIDFD_SELF_THREAD_GROUP in linux/fcntl.h (since they reserve part of the
AT_* values); however, for glibc I do not see any good reason to add pidfd
definitions to fcntl-linux.h.
The tst-pidfd.c is extended with some PIDFD_SELF_* tests and a new
‘tst-pidfd_getinfo.c’ test is added to check PIDFD_GET_INFO. The
PIDFD_INFO_COREDUMP tests would require very large and complex tests
that are already covered by kernel tests.
Checked on aarch64-linux-gnu and x86_64-linux-gnu on kernels 6.8 and
6.17.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
It improves latency by about 3-6% and throughput by about 5-12%.
Tested on x86_64-linux-gnu and i686-linux-gnu.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
The Linux kernel ABI specifies that the vector registers are not preserved
across system calls, but the __SYSCALL_CLOBBERS macro doesn't mention them.
This could possibly lead to compilers trying to keep data in the vector
registers across the syscall leading to corruption. Add the vector registers
to __SYSCALL_CLOBBERS when the vector extension is enabled. If the vector
extension is enabled, then require GCC 15 or later and RVV 1.0 or later.
Fixes: 36960f0c76 ("RISC-V: Linux Syscall Interface")
Signed-off-by: Peter Bergner <bergner@tenstorrent.com>
i386 and m68k architectures should use math-use-builtins-sqrt.h rather
than relying on architecture-specific or inline assembly implementations.
The PowerPC optimization for PPC 601/603 (30 years old) is removed.
Tested on x86_64-linux-gnu and i686-linux-gnu.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
It improves latency by about 3-10% and throughput by about 5-15%.
Tested on x86_64-linux-gnu and i686-linux-gnu.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
The optimized i386 version is faster than the generic one, and
gcc implements it through the builtin. This optimization enables
us to migrate the implementation to a C version. The performance
on a Zen3 chip is similar to the SVID one.
The m68k provided an optimized version through __m81_u(remainderf)
(mathimpl.h), and gcc does not implement it through a builtin
(different than i386).
Performance improves a bit on x86_64 (Zen3, gcc 15.2.1):
reciprocal-throughput  input            master     NO-SVID    improvement
x86_64                 subnormals       18.8522    16.2506    13.80%
x86_64                 normal           421.8260   403.9270   4.24%
x86_64                 close-exponent   21.0579    18.7642    10.89%
i686                   subnormals       21.3443    21.4229    -0.37%
i686                   normal           525.8380   538.807    -2.47%
i686                   close-exponent   21.6589    21.7983    -0.64%
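The improvement column is consistent with the relative reduction in
reciprocal throughput versus master. A small sketch of that arithmetic
(the function name is hypothetical):

```c
/* Relative reduction of PATCHED versus MASTER, in percent.  A negative
   result means the patched build is slower.  */
static double
improvement_percent (double master, double patched)
{
  return (master - patched) / master * 100.0;
}
```

For the x86_64 subnormals row, improvement_percent (18.8522, 16.2506)
gives about 13.80.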
Tested on x86_64-linux-gnu and i686-linux-gnu.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
The optimized i386 version is faster than the generic one, and gcc
implements it through the builtin. This optimization enables us to
migrate the implementation to a C version. The performance on a Zen3
chip is similar to the SVID one.
The m68k provided an optimized version through __m81_u(remainderf)
(mathimpl.h), and gcc does not implement it through a builtin (different
than i386).
Performance improves a bit on x86_64 (Zen3, gcc 15.2.1):
reciprocal-throughput  input            master    NO-SVID   improvement
x86_64                 subnormals       17.5349   15.6125   10.96%
x86_64                 normal           53.8134   52.5754   2.30%
x86_64                 close-exponent   20.0211   18.6656   6.77%
i686                   subnormals       21.8105   20.1856   7.45%
i686                   normal           73.1945   71.2199   2.70%
i686                   close-exponent   22.2141   20.331    8.48%
Tested on x86_64-linux-gnu and i686-linux-gnu.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
These are already provided by the generic include/atomic.h and
the resulting macros are not Linux specific.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Neither m68k nor m68k-coldfire supports 64-bit atomics. The
atomic_barrier syscall on m68k is a no-op, so it can use the compiler
builtin.
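As an illustration of a barrier expressed through a compiler builtin, a
minimal sketch follows. The macro and function names are hypothetical,
and the choice of __atomic_thread_fence is an assumption of this sketch,
not necessarily the builtin used by the patch.

```c
/* Hypothetical stand-in for a full barrier macro built on a compiler
   builtin instead of a system call.  */
#define full_barrier_sketch() __atomic_thread_fence (__ATOMIC_SEQ_CST)

/* Store VALUE to *DATA, then set *FLAG, with a full barrier between the
   two stores so that the data store is ordered before the flag store.  */
static int
publish_sketch (int *data, int *flag, int value)
{
  __atomic_store_n (data, value, __ATOMIC_RELAXED);
  full_barrier_sketch ();
  __atomic_store_n (flag, 1, __ATOMIC_RELAXED);
  return __atomic_load_n (data, __ATOMIC_RELAXED);
}
```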
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
libgcc provides the required support for calling the kernel
auxiliary routines for !__GCC_HAVE_SYNC_COMPARE_AND_SWAP_4.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
It improves latency by about 3-10% and throughput by about 5-15%.
Tested on x86_64-linux-gnu and i686-linux-gnu.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
It improves latency by about 1-10% and throughput by about 5-10%.
Tested on x86_64-linux-gnu and i686-linux-gnu.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
It improves latency by about 3-7% and throughput by about 5-10%.
Tested on x86_64-linux-gnu and i686-linux-gnu.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
It improves latency by about 2% and throughput by about 5%.
Tested on x86_64-linux-gnu and i686-linux-gnu.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
It improves latency by about 2-10% and throughput by about 5-10%.
Tested on x86_64-linux-gnu and i686-linux-gnu.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
It improves latency by about 3-10% and throughput by about 5-10%.
Tested on x86_64-linux-gnu and i686-linux-gnu.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
The m68k provided an optimized version through __m81_u(fmod)
(mathimpl.h), and gcc does not implement it through a builtin
(different than i386).
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
The m68k provided an optimized version through __m81_u(fmodf)
(mathimpl.h), and gcc does not implement it through a builtin
(different than i386).
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
The optimized i386 version is faster than the generic one, and gcc
implements it through the builtin. It allows us to move the
implementation to a C one.
The performance on a Zen3 chip is slightly better:
reciprocal-throughput  input            master    no-SVID   improvement
i686                   subnormals       22.4741   20.1571   10.31%
i686                   normal           74.1631   70.3606   5.13%
i686                   close-exponent   22.5625   20.2435   10.28%
Tested on i686-linux-gnu.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
The optimized i386 version is faster than the generic one, and gcc
implements it through the builtin. It allows us to move the
implementation to a C one. The performance on a Zen3 chip is
similar to the SVID one.
Tested on i686-linux-gnu.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
The C2y function uimaxabs has been renamed to umaxabs. Implement this
change in glibc, keeping a compat symbol under the old name, copying
the test to test the new name and changing the old test to test the
compat symbol. Jakub has done the corresponding change to the
built-in function in GCC.
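For reference, the semantics can be sketched as below. The name
umaxabs_sketch is hypothetical and this is not the glibc implementation;
it only shows why the result type is unsigned.

```c
#include <stdint.h>

/* Absolute value of X as an unsigned result.  Negating in the unsigned
   type keeps INTMAX_MIN well defined.  */
static uintmax_t
umaxabs_sketch (intmax_t x)
{
  return x < 0 ? -(uintmax_t) x : (uintmax_t) x;
}
```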
Tested for x86_64 and x86.
Since SSIZE_MAX is less than UINT_MAX on 32-bit platforms we must AND
the expression with SSIZE_MAX.
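A minimal sketch of the masking, assuming a hypothetical helper that
bounds an arbitrary unsigned value to the ssize_t range:

```c
#define _POSIX_C_SOURCE 200809L
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: keep only the bits that fit in SSIZE_MAX, so the
   result is never larger than the biggest ssize_t value.  */
static size_t
bound_to_ssize (size_t value)
{
  return value & (size_t) SSIZE_MAX;
}
```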
Tested on x86_64 and x86.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
clang-18 and onwards issues:
../sysdeps/unix/sysv/linux/speed.c:71:23: error: initializer overrides prior initialization of this subobject [-Werror,-Winitializer-overrides]
71 | [_cbix(__B0)] = 0,
| ^
../sysdeps/unix/sysv/linux/speed.c:70:34: note: previous initialization is here
70 | [0 ... _cbix(CBAUDMASK)] = -1,
[...]
The override is explicitly used to support the same initialization on
multiple platforms (since the baud values differ on alpha and powerpc).
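The pattern, and one way to silence the diagnostic locally, can be
sketched as below. The table name and contents are hypothetical, and
the pragma-based suppression is an assumption of this sketch rather than
a description of how the patch does it.

```c
#if defined __clang__
# pragma clang diagnostic push
# pragma clang diagnostic ignored "-Winitializer-overrides"
#elif defined __GNUC__
# pragma GCC diagnostic push
# pragma GCC diagnostic ignored "-Woverride-init"
#endif

/* Hypothetical table: every slot defaults to -1, then selected slots are
   overridden on purpose.  The [a ... b] range designator is a GNU
   extension.  */
static const int speed_table_sketch[8] =
  {
    [0 ... 7] = -1,
    [0] = 0,
    [3] = 9600,
  };

#if defined __clang__
# pragma clang diagnostic pop
#elif defined __GNUC__
# pragma GCC diagnostic pop
#endif
```

The later designators win, so slots 0 and 3 hold the overriding values
and every other slot keeps the default.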
Reviewed-by: Collin Funk <collin.funk1@gmail.com>
The __syscall_cancel_arch function has an epilogue that does not match
the prologue. The stack is not used and the return address still lies in
r15 when reaching the epilogue. Fix the epilogue by simply returning
from the function.
Signed-off-by: Luc Michel <luc.michel@amd.com>
Tested-by: gopi@sankhya.com
Reviewed-by: Neal Frager <neal.frager@amd.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>