* hurd/Makefile: add new tests
* hurd/test-sig-rpc-interrupted.c: check xstate save and restore in
the case where a signal is delivered to a thread which is waiting
for an rpc. This test implements the rpc interruption protocol used
by the hurd servers. It was so far passing on Debian thanks to the
local-intr-msg-clobber.diff patch, which is now obsolete.
* hurd/test-sig-xstate.c: check xstate save and restore in the case
where a signal is delivered to a running thread, making sure that
the xstate is modified in the signal handler.
* hurd/test-xstate.h: add helpers to test xstate
* sysdeps/mach/hurd/i386/bits/sigcontext.h: add xstate to the
sigcontext structure.
+ sysdeps/mach/hurd/i386/sigreturn.c: restore xstate from the saved
context
* sysdeps/mach/hurd/x86/trampoline.c: save xstate if
supported. Otherwise we fall back to the previous behaviour of
ignoring xstate.
* sysdeps/mach/hurd/x86_64/bits/sigcontext.h: add xstate to the
sigcontext structure.
* sysdeps/mach/hurd/x86_64/sigreturn.c: restore xstate from the saved
context
Signed-off-by: Luca Dariz <luca@orpolo.org>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Message-ID: <20250319171118.142163-1-luca@orpolo.org>
This patch moves any calls of tcache_init away after tcache hot paths.
Since there is no reason to initialize tcaches in the hot path and since
we need to be able to check tcache != NULL in any case, because of
tcache_thread_shutdown function, moving tcache_init away from hot path
can only be beneficial.
The patch also removes the initialization of tcaches within the
__libc_free call. It only makes sense to initialize tcaches for the
thread after it calls one of the allocation functions. Also the patch
removes the save/restore of errno from tcache_init code, as it is no
longer needed.
I misunderstood the recommendation from the hardware team about non-temporal
load/stores. It is still recommended to use them in memset for large sizes. It
was not recommended for their use with device memory and memset is already
not valid to be used with device memory.
This reverts commit e6590f0c86632c36c9a784cf96075f4be2e920d2.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
I misunderstood the recommendation from the hardware team about non-temporal
load/stores. It is still recommended to use them in memcpy for large sizes. It
was not recommended for their use with device memory and memcpy is already
not valid to be use with device memory.
This reverts commit eb5eeb47403e0a91de834868e501b4d62b8d2cb9.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Use tailcalls to avoid the overhead of a frame on the free fastpath.
Move tcache initialization to _int_free_chunk(). Add malloc_printerr_tail()
which can be tailcalled without forcing a frame like no-return functions.
Change tcache_double_free_verify() to retry via __libc_free() after clearing
the key.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Inline tcache_free since it's only used by __libc_free. Add __glibc_likely
for the tcache checks.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
The checks on size can be merged and use __builtin_add_overflow. Since
tcache only handles small sizes (and rejects sizes < MINSIZE), delay this
check until after tcache.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Currently __libc_free checks for a freed mmap chunk in the fast path.
Also errno is always saved and restored to preserve it. Since mmap chunks
are larger than the largest tcache chunk, it is safe to delay this and
handle tcache, smallbin and medium bin blocks first. Move saving of errno
to cases that actually need it. Remove a safety check that fails on mmap
chunks and a check that mmap chunks cannot be added to tcache.
Performance of bench-malloc-thread improves by 9.2% for 1 thread and
6.9% for 32 threads on Neoverse V2.
Reviewed-by: DJ Delorie <dj@redhat.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
There is a spelling mistake in a test filename. Fix it.
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
There are spelling mistakes in assert messages. Fix them.
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
There is a spelling mistake in a puts statement. Fix it.
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* manual/stdio.texi (Line Input): Document that getline and getdelim
where GNU extensions until standardized in POSIX.1-2008. Add restrict
to function prototypes.
Signed-off-by: Collin Funk <collin.funk1@gmail.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Since one path uses _IO_SYNC and the other _IO_OVERFLOW, the newly added
test cases verifies that `fflush (FILE)` and `fflush (NULL)` are
semantically equivalent from the FILE perspective.
Reviewed-by: Joseph Myers <josmyers@redhat.com>
This is required so that fclose, when trying to seek to the right
position after filling the input buffer, does not fail with EINVAL.
This fclose code path only ignores ESPIPE errors.
Reported by Petr Pisar on
<https://bugzilla.redhat.com/show_bug.cgi?id=2358265>.
Fixes commit be6818be31e756398e45f70e2819d78be0961223 ("Make fclose
seek input file to right offset (bug 12724)").
Reviewed-by: Frédéric Bérat <fberat@redhat.com>
Detect Intel Diamond Rapids and tune it similar to Intel Granite Rapids.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sunil K Pandey <skpgkp2@gmail.com>
Enable default tuning for unknown Intel processor.
Tested on x86, no regression.
Co-Authored-By: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
Hi Joseph,
As we discussed, this patch just makes C23 include every check that is
performed by C11.
I tested the commit by adding the ISO23 Make and Python variables to be
the same as ISO11. So the only difference was compiling with -DISO23
instead of -DISO11. And changed the temporary directories to instead
use the format f'/tmp/glibc-{self.standard}-{self.header}'. Then I used
a shell script to run 'cmp' on each file in the ISO11 and ISO23
directories for each header to make sure they were the same.
-- 8< --
Make C23 checks include every test that is performed by C11. Done by
running the following command:
find conform -name '*.h-data' | xargs sed -i \
-e 's| !defined ISO11| !defined ISO11 \&\& !defined ISO23|g' \
-e 's| defined ISO11| defined ISO11 \|\| defined ISO23|g' \
-e 's|ifdef ISO11|if defined ISO11 \|\| defined ISO23|g' \
-e 's|ifndef ISO11|if !defined ISO11 \&\& !defined ISO23|g'
Signed-off-by: Collin Funk <collin.funk1@gmail.com>
- Add ARROWLAKE model detection.
- Add PANTHERLAKE model detection.
- Add CLEARWATERFOREST model detection.
Intel® Architecture Instruction Set Extensions Programming Reference
https://cdrdv2.intel.com/v1/dl/getContent/671368 Section 1.2.
No regression, validated model detection on SDE.
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
This hopefully provides additional information about why the
test failed, in case the fix in commit 62db87ab24f9ca483f97f
("timezone: Fix tst-bz28707 Makefile rule") turns out to be
insufficient.
Reviewed-by: Paul Eggert <eggert@cs.ucla.edu>
This is only needed for -mno-secure-plt, and this linkage mode
is not supported with powerpc64 and powerp64le.
Reviewed-by: Peter Bergner <bergner@linux.ibm.com>
As mentioned by the reporter in a pull request against gcc-mirror,
the THREEp96 constant in e_expl.c is incorrect, it is actually 0x3.p+94f128
rather than 0x3.p+96f128.
The algorithm uses that to compute the t2 integer (tval2), by whose
delta it adjusts the x+xl pair and then in the result uses the precomputed
exp value for that entry.
Using 0x3.p+94f128 rather than 0x3.p+96f128 results in tval2 sometimes
being one smaller, sometimes one larger than the desired value, thus can mean
the x+xl pair after adjustment will be larger in absolute value than it
should be.
DesWursters created a test program for this
https://github.com/DesWurstes/comparefloats
and his results were
total: 1135000000 not_equal: 4322 earlier_score: 674 later_score: 3648
I've modified this so with
https://sourceware.org/bugzilla/show_bug.cgi?id=32411#c3
so that it actually tests pseudo-random _Float128 values with range
(-16384.,16384) with strong bias on values larger than 0.0002 in absolute
value (so that tval1/tval2 aren't zero most of the time) and that gave
total: 10000000000 not_equal: 29861 earlier_score: 4606 later_score: 25255
So, in both cases, in most cases the change doesn't result in any differences,
and in those rare cases where does, about 85% have smaller ulp than without
the patch.
Additionally I've tried
https://sourceware.org/bugzilla/show_bug.cgi?id=32411#c4
and in 2 billion iterations it didn't find any case where x+xl after the
adjustments without this change would be smaller in absolute value compared
to x+xl after the adjustments with this change.
Reviewed-by: Joseph Myers <josmyers@redhat.com>
From the bug report [1], multiple programs still require to dlopen
shared libraries with either missing PT_GNU_STACK or with the executable
bit set. Although, in some cases, it seems to be a hard-craft assembly
source without the required .note.GNU-stack marking (so the static linker
is forced to set the stack executable if the ABI requires it), other
cases seem that the library uses trampolines [2].
Unfortunately, READ_IMPLIES_EXEC is not an option since on some ABIs
(x86_64), the kernel clears the bit, making it unsupported. To avoid
reinstating the broken code that changes stack permission on dlopen
(0ca8785a28), this patch extends the glibc.rtld.execstack tunable to
allow an option to force an executable stack at the program startup.
The tunable is a security issue because it defeats the PT_GNU_STACK
hardening. It has the slight advantage of making it explicit by the
caller, and, as for other tunables, this is disabled for setuid binaries.
A tunable also allows us to eventually remove it, but from previous
experiences, it would require some time.
Checked on aarch64-linux-gnu, x86_64-linux-gnu, and i686-linux-gnu.
[1] https://sourceware.org/bugzilla/show_bug.cgi?id=32653
[2] https://github.com/conda-forge/ctng-compiler-activation-feedstock/issues/143
Reviewed-by: Sam James <sam@gentoo.org>
C2Y adds unsigned versions of the abs functions (see C2Y draft N3467 and
proposal N3349).
Tested for x86_64.
Signed-off-by: Lenard Mollenkopf <glibc@lenardmollenkopf.de>
The helper thread may get canceled before the open system
call succeds. Then ThreadData.fd remains zero, and eventually
the xclose call in end_reader_thread fails because descriptor 0
is not open.
Instead, initialize the fd member to -1 (not a valid descriptor)
and close the descriptor only if valid. Do this in a new end_thread
helper routine.
Also add more error checking to close operations.
Fixes commit 95b780c1d0549678c0a244c6e2112ec97edf0839 ("stdio: Add
more setvbuf tests").
Scan xstate IDs up to the maximum supported xstate ID. Remove the
separate AMX xstate calculation. Instead, exclude the AMX space from
the start of TILECFG to the end of TILEDATA in xsave_state_size.
Completed validation on SKL/SKX/SPR/SDE and compared xsave state size
with "ld.so --list-diagnostics" option, no regression.
Co-Authored-By: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sunil K Pandey <skpgkp2@gmail.com>
If the input buffer exceeds the stack auxiliary buffer, qsort will
malloc a temporary one to call mergesort. Since C++ standard does
allow the callback comparison function to throw [1], the glibc
implementation can potentially leak memory.
The fixes uses a pthread_cleanup_combined_push and
pthread_cleanup_combined_pop, so it can work with and without
exception enables. The qsort code path that calls malloc now
requires some extra setup and a call to __pthread_cleanup_push
anmd __pthread_cleanup_pop (which should be ok since they just
setup some buffer state).
Checked on x86_64-linux-gnu.
[1] https://timsong-cpp.github.io/cppwp/n4950/alg.c.library#4
Reviewed-by: DJ Delorie <dj@redhat.com>
We mistakenly dropped the check in 27b96e069aad17cefea9437542180bff448ac3a0;
there's some other checks which we *can* drop, but let's worry about that
later.
Fixes the build on ppc64le where GCC is configured with --with-long-double-format=ieee.
Reviewed-by: Andreas Schwab <schwab@suse.de>
Linux 6.14 has no new syscalls. Update the version number in
syscall-names.list to reflect that it is still current for 6.14.
Tested with build-many-glibcs.py.
This fixes a test build failure on Hurd.
Fixes commit 145097dff170507fe73190e8e41194f5b5f7e6bf ("x86: Use separate
variable for TLSDESC XSAVE/XSAVEC state size (bug 32810)").
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
- Change longopt.c's backticks to single quotes
- puts() does not use format specifiers
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Here is a patch updating the documentation to mention GNU and BSD
extensions that were adopted by POSIX.1-2024.
* manual/llio.texi (Memory-mapped I/O): Add that MAP_ANON and
MAP_ANONYMOUS were added by POSIX.1-2024.
* manual/memory.texi (Changing Block Size): Mention that reallocarray
was added by POSIX.1-2024.
* manual/message.texi (Message Translation): Adjust wording to match
standardization.
(Translation with gettext): Mention the gettext family of functions were
added by POSIX.1-2024.
* manual/pattern.texi (Wildcard Matching): Mention that FNM_CASEFOLD was
added by POSIX.1-2024.
* manual/process.texi (Creating a Process): Mention that _Fork and
WCOREDUMP were added by POSIX.1-2024.
* manual/signal.texi (Miscellaneous Signals): Mention that SIGWINCH was
added by POSIX-1.2024.
* manual/startup.texi (Environment Access): Mention that secure_getenv
was added by POSIX.1-2024.
* manual/string.texi (Truncating Strings): Mention that strlcpy,
strlcat, wcslcpy, and wslcat were added by POSIX-1.2024.
(Search Functions): Document that memmem was added by POSIX-1.2024.
* manual/terminal.texi (Allocation): Mention that ptsname_r was added by
POSIX-1.2024.
* manual/threads.texi (Waiting with Explicit Clocks): Move node under
POSIX Threads. Mention pthread_cond_clockwait,
pthread_rwlock_clockrdlock, and pthread_rwlock_clockwrlock were added by
POSIX-1.2024.
(Joining Threads): New node under Non-POSIX Extensions.
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Signed-off-by: Collin Funk <collin.funk1@gmail.com>
When libgcc is built with pac-ret, it requires to autenticate the
unwinding frame based on CFI information. The _dl_tlsdesc_dynamic
uses a custom calling convention, where it is responsible to save
and restore all registers it might use (even volatile).
The pac-ret support added by 1be3d6eb823d8b952fa54b7bbc90cbecb8981380
was added only on the slow-path, but the fast path also adds DWARF
Register Rule Instruction (cfi_adjust_cfa_offset) since it requires
to save/restore some auxiliary register. It seems that this is not
fully supported neither by libgcc nor AArch64 ABI [1].
Instead, move paciasp/autiasp to function prologue/epilogue to be
used on both fast and slow paths.
I also corrected the _dl_tlsdesc_dynamic comment description, it was
copied from i386 implementation without any adjustment.
Checked on aarch64-linux-gnu with a toolchain built with
--enable-standard-branch-protection on a system with pac-ret
support.
[1] https://github.com/ARM-software/abi-aa/blob/main/aadwarf64/aadwarf64.rst#id1
Reviewed-by: Yury Khrustalev <yury.khrustalev@arm.com>
Previously, the initialization code reused the xsave_state_full_size
member of struct cpu_features for the TLSDESC state size. However,
the tunable processing code assumes that this member has the
original XSAVE (non-compact) state size, so that it can use its
value if XSAVEC is disabled via tunable.
This change uses a separate variable and not a struct member because
the value is only needed in ld.so and the static libc, but not in
libc.so. As a result, struct cpu_features layout does not change,
helping a future backport of this change.
Fixes commit 9b7091415af47082664717210ac49d51551456ab ("x86-64:
Update _dl_tlsdesc_dynamic to preserve AMX registers").
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
If we have to use XSAVE or XSAVEC trampolines, do not adjust the size
information they need. Technically, it is an operator error to try to
run with -XSAVE,-XSAVEC on such builds, but this change here disables
some unnecessary code with higher ISA levels and simplifies testing.
Related to commit befe2d3c4dec8be2cdd01a47132e47bdb7020922
("x86-64: Don't use SSE resolvers for ISA level 3 or above").
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
Improve performance of __libc_malloc by splitting it into 2 parts: first handle
the tcache fastpath, then do the rest in a separate tailcalled function.
This results in significant performance gains since __libc_malloc doesn't need
to setup a frame and we delay tcache initialization and setting of errno until
later.
On Neoverse V2, bench-malloc-simple improves by 6.7% overall (up to 8.5% for
ST case) and bench-malloc-thread improves by 20.3% for 1 thread and 14.4% for
32 threads.
Reviewed-by: DJ Delorie <dj@redhat.com>
Reject invalid formatted scanf real input data the exponent part of
which is comprised of an exponent introducing character, optionally
followed by a sign, and with no actual digits following. Such data is a
prefix of, but not a matching input sequence and it is required by ISO C
to cause a matching failure.
Currently a matching success is instead incorrectly produced along with
the conversion result according to the input significand read and the
exponent of zero, with the significand and the exponent part wholly
consumed from input.
Correct an invalid `tstscanf.c' test accordingly that expects a matching
success for input data provided in the ISO C standard as an example for
a matching failure.
Enable input data that causes test failures without this fix in place.
Reviewed-by: Joseph Myers <josmyers@redhat.com>
Reject invalid formatted scanf real input data that is comprised of a
hexadecimal prefix, optionally preceded by a sign, and with no actual
digits following owing to the field width restriction in effect. Such
data is a prefix of, but not a matching input sequence and it is
required by ISO C to cause a matching failure.
Currently a matching success is instead incorrectly produced along with
the conversion result of zero, with the prefix wholly consumed from
input. Where the end of input is marked by the end-of-file condition
rather than the field width restriction in effect a matching failure is
already correctly produced.
Enable input data that causes test failures without this fix in place.
Reviewed-by: Joseph Myers <josmyers@redhat.com>