The left shift overflows for 'int', use uint64_t instead. It syncs
with CORE-MATH commit 4d6192d2.
Checked on aarch64-linux-gnu, x86_64-linux-gnu, and i686-linux-gnu.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
The libm size improvement built with "--enable-stack-protector=strong
--enable-bind-now=yes --enable-fortify-source=2":
Before:
text data bss dec hex filename
585192 860 12 586064 8f150 aarch64-linux-gnu/math/libm.so
960775 1068 12 961855 ead3f x86_64-linux-gnu/math/libm.so
1189174 5544 368 1195086 123c4e powerpc64le-linux-gnu/math/libm.so
After:
text data bss dec hex filename
584952 860 12 585824 8f060 aarch64-linux-gnu/math/libm.so
960615 1068 12 961695 eac9f x86_64-linux-gnu/math/libm.so
1189078 5544 368 1194990 123bee powerpc64le-linux-gnu/math/libm.so
The are small code changes for x86_64 and powerpc64le, which do not
affect performance; but on aarch64 with gcc-14 I see a slight better
code generation due the usage of ldq for floating point constant loading.
Reviewed-by: Andreas K. Huettel <dilfridge@gentoo.org>
The CORE-MATH implementation is correctly rounded (for any rounding mode),
although it should worse performance than current one. The current
implementation performance comes mainly from the internal usage of
the optimize expf implementation, and shows a maximum ULPs of 2 for
FE_TONEAREST and 3 for other rounding modes.
The code was adapted to glibc style and to use the definition of
math_config.h (to handle errno, overflow, and underflow).
Benchtest on x64_64 (Ryzen 9 5900X, gcc 14.2.1), aarch64 (Neoverse-N1,
gcc 13.3.1), and powerpc (POWER10, gcc 13.2.1):
Latency master patched improvement
x86_64 40.6995 49.0737 -20.58%
x86_64v2 40.5841 44.3604 -9.30%
x86_64v3 39.3879 39.7502 -0.92%
i686 112.3380 129.8570 -15.59%
aarch64 (Neoverse) 18.6914 17.0946 8.54%
power10 11.1343 9.3245 16.25%
reciprocal-throughput master patched improvement
x86_64 18.6471 24.1077 -29.28%
x86_64v2 17.7501 20.2946 -14.34%
x86_64v3 17.8262 17.1877 3.58%
i686 64.1454 86.5645 -34.95%
aarch64 (Neoverse) 9.77226 12.2314 -25.16%
power10 4.0200 5.3316 -32.63%
Signed-off-by: Alexei Sibidanov <sibid@uvic.ca>
Signed-off-by: Paul Zimmermann <Paul.Zimmermann@inria.fr>
Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: DJ Delorie <dj@redhat.com>
We stopped adding "Contributed by" or similar lines in sources in 2012
in favour of git logs and keeping the Contributors section of the
glibc manual up to date. Removing these lines makes the license
header a bit more consistent across files and also removes the
possibility of error in attribution when license blocks or files are
copied across since the contributed-by lines don't actually reflect
reality in those cases.
Move all "Contributed by" and similar lines (Written by, Test by,
etc.) into a new file CONTRIBUTED-BY to retain record of these
contributions. These contributors are also mentioned in
manual/contrib.texi, so we just maintain this additional record as a
courtesy to the earlier developers.
The following scripts were used to filter a list of files to edit in
place and to clean up the CONTRIBUTED-BY file respectively. These
were not added to the glibc sources because they're not expected to be
of any use in future given that this is a one time task:
https://gist.github.com/siddhesh/b5ecac94eabfd72ed2916d6d8157e7dchttps://gist.github.com/siddhesh/15ea1f5e435ace9774f485030695ee02
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
This patch adds a new macro, libm_alias_finite, to define all _finite
symbol. It sets all _finite symbol as compat symbol based on its first
version (obtained from the definition at built generated first-versions.h).
The <fn>f128_finite symbols were introduced in GLIBC 2.26 and so need
special treatment in code that is shared between long double and float128.
It is done by adding a list, similar to internal symbol redifinition,
on sysdeps/ieee754/float128/float128_private.h.
Alpha also needs some tricky changes to ensure we still emit 2 compat
symbols for sqrt(f).
Passes buildmanyglibc.
Co-authored-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
This patch continues cleaning up the math_private.h header, which
contains lots of different definitions many of which are only needed
by a limited subset of files using that header (and some of which are
overridden by architectures that only want to override selected parts
of the header), by moving the math_narrow_eval macro out to a separate
math-narrow-eval.h header, only included by those files that need it.
That header is placed in include/ (since it's used in stdlib/, not
just files built in math/, but no sysdeps variants are needed at
present).
Tested for x86_64, and with build-many-glibcs.py. (Installed stripped
shared libraries change because of line numbers in assertions in
strtod_l.c.)
* include/math-narrow-eval.h: New file. Contents moved from ....
* sysdeps/generic/math_private.h: ... here.
(math_narrow_eval): Remove macro. Moved to math-narrow-eval.h.
[FLT_EVAL_METHOD != 0] (excess_precision): Likewise.
* math/s_fdim_template.c: Include <math-narrow-eval.h>.
* stdlib/strtod_l.c: Likewise.
* sysdeps/i386/fpu/s_f32xaddf64.c: Likewise.
* sysdeps/i386/fpu/s_f32xsubf64.c: Likewise.
* sysdeps/i386/fpu/s_fdim.c: Likewise.
* sysdeps/ieee754/dbl-64/e_cosh.c: Likewise.
* sysdeps/ieee754/dbl-64/e_gamma_r.c: Likewise.
* sysdeps/ieee754/dbl-64/e_j1.c: Likewise.
* sysdeps/ieee754/dbl-64/e_jn.c: Likewise.
* sysdeps/ieee754/dbl-64/e_lgamma_r.c: Likewise.
* sysdeps/ieee754/dbl-64/e_sinh.c: Likewise.
* sysdeps/ieee754/dbl-64/gamma_productf.c: Likewise.
* sysdeps/ieee754/dbl-64/k_rem_pio2.c: Likewise.
* sysdeps/ieee754/dbl-64/lgamma_neg.c: Likewise.
* sysdeps/ieee754/dbl-64/s_erf.c: Likewise.
* sysdeps/ieee754/dbl-64/s_llrint.c: Likewise.
* sysdeps/ieee754/dbl-64/s_lrint.c: Likewise.
* sysdeps/ieee754/flt-32/e_coshf.c: Likewise.
* sysdeps/ieee754/flt-32/e_exp2f.c: Likewise.
* sysdeps/ieee754/flt-32/e_expf.c: Likewise.
* sysdeps/ieee754/flt-32/e_gammaf_r.c: Likewise.
* sysdeps/ieee754/flt-32/e_j1f.c: Likewise.
* sysdeps/ieee754/flt-32/e_jnf.c: Likewise.
* sysdeps/ieee754/flt-32/e_lgammaf_r.c: Likewise.
* sysdeps/ieee754/flt-32/e_sinhf.c: Likewise.
* sysdeps/ieee754/flt-32/k_rem_pio2f.c: Likewise.
* sysdeps/ieee754/flt-32/lgamma_negf.c: Likewise.
* sysdeps/ieee754/flt-32/s_erff.c: Likewise.
* sysdeps/ieee754/flt-32/s_llrintf.c: Likewise.
* sysdeps/ieee754/flt-32/s_lrintf.c: Likewise.
* sysdeps/ieee754/ldbl-96/gamma_product.c: Likewise.
Various i386 libm functions return values with excess range and
precision; Wilco Dijkstra's patches to make isfinite etc. expand
inline cause this pre-existing issue to result in test failures (when
e.g. a result that overflows float but not long double gets counted as
overflowing for some purposes but not others).
This patch addresses those cases arising from functions defined in C,
adding a math_narrow_eval macro that forces values to memory to
eliminate excess precision if FLT_EVAL_METHOD indicates this is
needed, and is a no-op otherwise. I'll convert existing uses of
volatile and asm for this purpose to use the new macro later, once
i386 has clean test results again (which requires fixes for .S files
as well).
Tested for x86_64 and x86. Committed.
[BZ #18980]
* sysdeps/generic/math_private.h: Include <float.h>.
(math_narrow_eval): New macro.
[FLT_EVAL_METHOD != 0] (excess_precision): Likewise.
* sysdeps/ieee754/dbl-64/e_cosh.c (__ieee754_cosh): Use
math_narrow_eval on overflowing return value.
* sysdeps/ieee754/dbl-64/e_lgamma_r.c (__ieee754_lgamma_r):
Likewise.
* sysdeps/ieee754/dbl-64/e_sinh.c (__ieee754_sinh): Likewise.
* sysdeps/ieee754/flt-32/e_coshf.c (__ieee754_coshf): Likewise.
* sysdeps/ieee754/flt-32/e_lgammaf_r.c (__ieee754_lgammaf_r):
Likewise.
* sysdeps/ieee754/flt-32/e_sinhf.c (__ieee754_sinhf): Likewise.
This patch fixes bug 16354, spurious underflows from cosh when a tiny
argument is passed to expm1 and expm1 correctly underflows although
the final result of cosh should be 1. As noted in that bug, some
cases are latent because of expm1 implementations not raising
underflow (bug 16353), but all the implementations are fixed
similarly. They already contained checks for tiny arguments, but the
checks were too late to avoid underflow from expm1 (although they
would avoid underflow from subsequent squaring of the result of
expm1); they are moved before the expm1 calls.
The thresholds used for considering arguments tiny are not
particularly consistent in how they relate to the precision of the
floating-point format in question. They are, however, all sufficient
to ensure that the round-to-nearest result of cosh is indeed 1 below
the threshold (although sometimes they are smaller than necessary).
But the previous logic did not return 1, but the previously computed 1
+ expm1(abs(x)) value. And the thresholds in the ldbl-128 and
ldbl-128ibm code (0x1p-71L - I suspect 0x3f8b was intended in the code
instead of 0x3fb8 - and (roughly) 0x1p-55L) are not sufficient for
that value to be 1. So by moving the test for tiny arguments, and
consequently returning 1 directly now the expm1 value hasn't been
computed by that point, this patch also fixes bug 17061, the (large
number of ulps) inaccuracy for small arguments in those
implementations. Tests for that bug are duly added.
Tested x86_64 and x86 and ulps updated accordingly. Also tested for
mips64 and powerpc32 to validate the ldbl-128 and ldbl-128ibm changes.
[BZ #16354]
[BZ #17061]
* sysdeps/ieee754/dbl-64/e_cosh.c (__ieee754_cosh): Check for
small arguments before calling __expm1.
* sysdeps/ieee754/flt-32/e_coshf.c (__ieee754_coshf): Check for
small arguments before calling __expm1f.
* sysdeps/ieee754/ldbl-128/e_coshl.c (__ieee754_coshl): Check for
small arguments before calling __expm1l.
* sysdeps/ieee754/ldbl-128ibm/e_coshl.c (__ieee754_coshl):
Likewise.
* sysdeps/ieee754/ldbl-96/e_coshl.c (__ieee754_coshl): Likewise.
* math/auto-libm-test-in: Add more cosh tests. Do not allow
spurious underflow for some cosh tests.
* math/auto-libm-test-out: Regenerated.
* sysdeps/i386/fpu/libm-test-ulps: Update.