Adhemerval Zanella
9583836785
math: Use coshf from CORE-MATH
...
The CORE-MATH implementation is correctly rounded (for any rounding mode),
although it shows worse performance than the current one. The current
implementation's performance comes mainly from its internal use of
the optimized expf implementation, and it shows a maximum error of 2 ULPs
for FE_TONEAREST and 3 ULPs for other rounding modes.
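To make the ULP figures concrete, here is a minimal sketch of how an error in ULPs could be estimated for a single-precision result. It is not the glibc test harness: the helper names float_ulp and ulp_error are hypothetical, and the reference is a double-precision value, which is itself only an approximation of the exact result.

```c
#include <math.h>

/* Hypothetical helper: spacing of binary32 values near REF (one ULP).  */
static double
float_ulp (double ref)
{
  int e;
  if (ref == 0.0 || !isfinite (ref))
    return ldexp (1.0, -149);   /* Smallest subnormal spacing.  */
  frexp (fabs (ref), &e);       /* |ref| = m * 2^e with 0.5 <= m < 1.  */
  e -= 1;                       /* Now |ref| lies in [2^e, 2^(e+1)).  */
  if (e < -126)
    e = -126;                   /* Subnormals share the spacing of 2^-126.  */
  return ldexp (1.0, e - 23);   /* binary32 has 23 fraction bits.  */
}

/* Hypothetical helper: distance between GOT and REF in binary32 ULPs.  */
double
ulp_error (float got, double ref)
{
  return fabs ((double) got - ref) / float_ulp (ref);
}
```

For a single input x, ulp_error (coshf (x), cosh ((double) x)) gives an estimate of the error; a correctly rounded result should stay at or below 0.5 ULP in FE_TONEAREST, up to the accuracy of the reference.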
The code was adapted to glibc style and to use the definitions in
math_config.h (to handle errno, overflow, and underflow).
Benchtest on x86_64 (Ryzen 9 5900X, gcc 14.2.1), aarch64 (Neoverse-N1,
gcc 13.3.1), and powerpc (POWER10, gcc 13.2.1):
Latency                   master   patched  improvement
x86_64                   40.6995   49.0737      -20.58%
x86_64v2                 40.5841   44.3604       -9.30%
x86_64v3                 39.3879   39.7502       -0.92%
i686                    112.3380  129.8570      -15.59%
aarch64 (Neoverse)       18.6914   17.0946        8.54%
power10                  11.1343    9.3245       16.25%
reciprocal-throughput     master   patched  improvement
x86_64                   18.6471   24.1077      -29.28%
x86_64v2                 17.7501   20.2946      -14.34%
x86_64v3                 17.8262   17.1877        3.58%
i686                     64.1454   86.5645      -34.95%
aarch64 (Neoverse)       9.77226   12.2314      -25.16%
power10                   4.0200    5.3316      -32.63%
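The improvement column is consistent with the relative change (master - patched) / master, so a negative value means the patched version is slower. This formula is inferred from the numbers, not stated in the commit message; the function name below is hypothetical.

```c
/* Relative change from MASTER to PATCHED, in percent.  Positive when the
   patched figure is lower (faster), negative when it is higher.  */
double
improvement_pct (double master, double patched)
{
  return (master - patched) / master * 100.0;
}
```

For the x86_64 latency row, improvement_pct (40.6995, 49.0737) evaluates to roughly -20.58.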
Signed-off-by: Alexei Sibidanov <sibid@uvic.ca>
Signed-off-by: Paul Zimmermann <Paul.Zimmermann@inria.fr>
Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: DJ Delorie <dj@redhat.com>
2024-12-18 17:24:43 -03:00
Adhemerval Zanella
6f9bacf36b
math: Use atan2f from CORE-MATH
...
The CORE-MATH implementation is correctly rounded (for any rounding mode)
and shows slightly better performance than the generic atan2f.
The code was adapted to glibc style and to use the definitions in
math_config.h (to handle errno, overflow, and underflow).
Benchtest on x86_64 (Ryzen 9 5900X, gcc 14.2.1), aarch64 (Neoverse-N1,
gcc 13.3.1), and powerpc (POWER10, gcc 13.2.1):
Latency                   master   patched  improvement
x86_64                   68.1175   69.2014       -1.59%
x86_64v2                 66.9884   66.0081        1.46%
x86_64v3                 57.7034   61.6407       -6.82%
i686                    189.8690  152.7560       19.55%
aarch64 (Neoverse)       32.6151   24.5382       24.76%
power10                  21.7282   17.1896       20.89%
reciprocal-throughput     master   patched  improvement
x86_64                   34.5202   31.6155        8.41%
x86_64v2                 32.6379   30.3372        7.05%
x86_64v3                 34.3677   23.6455       31.20%
i686                    157.7290   75.8308       51.92%
aarch64 (Neoverse)       27.7788   16.2671       41.44%
power10                  15.5715    8.1588       47.60%
Signed-off-by: Alexei Sibidanov <sibid@uvic.ca>
Signed-off-by: Paul Zimmermann <Paul.Zimmermann@inria.fr>
Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: DJ Delorie <dj@redhat.com>
2024-12-18 17:24:43 -03:00
Siddhesh Poyarekar
30891f35fa
Remove "Contributed by" lines
...
We stopped adding "Contributed by" or similar lines in sources in 2012
in favour of git logs and keeping the Contributors section of the
glibc manual up to date. Removing these lines makes the license
header a bit more consistent across files and also removes the
possibility of error in attribution when license blocks or files are
copied across since the contributed-by lines don't actually reflect
reality in those cases.
Move all "Contributed by" and similar lines (Written by, Test by,
etc.) into a new file CONTRIBUTED-BY to retain record of these
contributions. These contributors are also mentioned in
manual/contrib.texi, so we just maintain this additional record as a
courtesy to the earlier developers.
The following scripts were used to filter a list of files to edit in
place and to clean up the CONTRIBUTED-BY file respectively. These
were not added to the glibc sources because they're not expected to be
of any use in future given that this is a one time task:
https://gist.github.com/siddhesh/b5ecac94eabfd72ed2916d6d8157e7dc
https://gist.github.com/siddhesh/15ea1f5e435ace9774f485030695ee02
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2021-09-03 22:06:44 +05:30
Wilco Dijkstra
220622dde5
Add libm_alias_finite for _finite symbols
...
This patch adds a new macro, libm_alias_finite, to define all _finite
symbols. It sets each _finite symbol as a compat symbol based on its first
version (obtained from the build-generated first-versions.h).
The <fn>f128_finite symbols were introduced in GLIBC 2.26 and so need
special treatment in code that is shared between long double and float128.
This is done by adding a list, similar to the internal symbol redefinitions,
in sysdeps/ieee754/float128/float128_private.h.
Alpha also needs some tricky changes to ensure we still emit 2 compat
symbols for sqrt(f).
Passes build-many-glibcs.py.
Co-authored-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
2020-01-03 10:02:04 -03:00
Joseph Myers
24ab7723b8
Consistently use uintN_t not u_intN_t in libm.
...
This patch changes libm code to make consistent use of C99 uintN_t
types instead of sometimes using those and sometimes using the older
nonstandard u_intN_t names. This makes sense as a cleanup in its own
right, and also facilitates merges to GCC's libquadmath (which gets
the types from stdint.h and so may not have u_intN_t available at
all).
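As an illustration of the kind of code this cleanup touches, here is a minimal sketch that reads the bit pattern of a float into a C99 uint32_t from stdint.h. It is an assumption-laden stand-in, not glibc's code: glibc accesses the bits through its own types and macros in math_private.h, while this sketch uses memcpy and a hypothetical function name.

```c
#include <stdint.h>
#include <string.h>

/* Return the IEEE 754 binary32 representation of X as a C99 uint32_t.
   Hypothetical helper, written with memcpy rather than glibc's macros.  */
uint32_t
float_to_bits (float x)
{
  uint32_t bits;
  memcpy (&bits, &x, sizeof bits);
  return bits;
}
```

Because uint32_t comes from stdint.h, code written this way does not depend on the nonstandard u_int32_t name being available.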
Tested for x86_64, and with build-many-glibcs.py.
* math/s_nextafter.c (__nextafter): Use uintN_t instead of
u_intN_t.
* math/s_nexttowardf.c (__nexttowardf): Likewise.
* sysdeps/generic/math_private.h (ieee_double_shape_type):
Likewise.
(ieee_float_shape_type): Likewise.
* sysdeps/i386/fpu/s_fpclassifyl.c (__fpclassifyl): Likewise.
* sysdeps/i386/fpu/s_isnanl.c (__isnanl): Likewise.
* sysdeps/i386/fpu/s_nextafterl.c (__nextafterl): Likewise.
* sysdeps/i386/fpu/s_nexttoward.c (__nexttoward): Likewise.
* sysdeps/i386/fpu/s_nexttowardf.c (__nexttowardf): Likewise.
* sysdeps/ieee754/dbl-64/e_acosh.c (__ieee754_acosh): Likewise.
* sysdeps/ieee754/dbl-64/e_cosh.c (__ieee754_cosh): Likewise.
* sysdeps/ieee754/dbl-64/e_fmod.c (__ieee754_fmod): Likewise.
* sysdeps/ieee754/dbl-64/e_gamma_r.c (__ieee754_gamma_r):
Likewise.
* sysdeps/ieee754/dbl-64/e_hypot.c (__ieee754_hypot): Likewise.
* sysdeps/ieee754/dbl-64/e_jn.c (__ieee754_jn): Likewise.
(__ieee754_yn): Likewise.
* sysdeps/ieee754/dbl-64/e_log10.c (__ieee754_log10): Likewise.
* sysdeps/ieee754/dbl-64/e_log2.c (__ieee754_log2): Likewise.
* sysdeps/ieee754/dbl-64/e_rem_pio2.c (__ieee754_rem_pio2):
Likewise.
* sysdeps/ieee754/dbl-64/e_sinh.c (__ieee754_sinh): Likewise.
* sysdeps/ieee754/dbl-64/s_ceil.c (__ceil): Likewise.
* sysdeps/ieee754/dbl-64/s_copysign.c (__copysign): Likewise.
* sysdeps/ieee754/dbl-64/s_erf.c (__erf): Likewise.
(__erfc): Likewise.
* sysdeps/ieee754/dbl-64/s_expm1.c (__expm1): Likewise.
* sysdeps/ieee754/dbl-64/s_finite.c (FINITE): Likewise.
* sysdeps/ieee754/dbl-64/s_floor.c (__floor): Likewise.
* sysdeps/ieee754/dbl-64/s_fpclassify.c (__fpclassify): Likewise.
* sysdeps/ieee754/dbl-64/s_isnan.c (__isnan): Likewise.
* sysdeps/ieee754/dbl-64/s_issignaling.c (__issignaling):
Likewise.
* sysdeps/ieee754/dbl-64/s_llrint.c (__llrint): Likewise.
* sysdeps/ieee754/dbl-64/s_llround.c (__llround): Likewise.
* sysdeps/ieee754/dbl-64/s_lrint.c (__lrint): Likewise.
* sysdeps/ieee754/dbl-64/s_lround.c (__lround): Likewise.
* sysdeps/ieee754/dbl-64/s_modf.c (__modf): Likewise.
* sysdeps/ieee754/dbl-64/s_nextup.c (__nextup): Likewise.
* sysdeps/ieee754/dbl-64/s_remquo.c (__remquo): Likewise.
* sysdeps/ieee754/dbl-64/s_round.c (__round): Likewise.
* sysdeps/ieee754/dbl-64/s_trunc.c (__trunc): Likewise.
* sysdeps/ieee754/dbl-64/wordsize-64/s_issignaling.c
(__issignaling): Likewise.
* sysdeps/ieee754/flt-32/e_atan2f.c (__ieee754_atan2f): Likewise.
* sysdeps/ieee754/flt-32/e_fmodf.c (__ieee754_fmodf): Likewise.
* sysdeps/ieee754/flt-32/e_gammaf_r.c (__ieee754_gammaf_r):
Likewise.
* sysdeps/ieee754/flt-32/e_jnf.c (__ieee754_ynf): Likewise.
* sysdeps/ieee754/flt-32/e_log10f.c (__ieee754_log10f): Likewise.
* sysdeps/ieee754/flt-32/e_powf.c (__ieee754_powf): Likewise.
* sysdeps/ieee754/flt-32/e_rem_pio2f.c (__ieee754_rem_pio2f):
Likewise.
* sysdeps/ieee754/flt-32/e_remainderf.c (__ieee754_remainderf):
Likewise.
* sysdeps/ieee754/flt-32/e_sqrtf.c (__ieee754_sqrtf): Likewise.
* sysdeps/ieee754/flt-32/s_ceilf.c (__ceilf): Likewise.
* sysdeps/ieee754/flt-32/s_copysignf.c (__copysignf): Likewise.
* sysdeps/ieee754/flt-32/s_erff.c (__erff): Likewise.
(__erfcf): Likewise.
* sysdeps/ieee754/flt-32/s_expm1f.c (__expm1f): Likewise.
* sysdeps/ieee754/flt-32/s_finitef.c (FINITEF): Likewise.
* sysdeps/ieee754/flt-32/s_floorf.c (__floorf): Likewise.
* sysdeps/ieee754/flt-32/s_fpclassifyf.c (__fpclassifyf):
Likewise.
* sysdeps/ieee754/flt-32/s_isnanf.c (__isnanf): Likewise.
* sysdeps/ieee754/flt-32/s_issignalingf.c (__issignalingf):
Likewise.
* sysdeps/ieee754/flt-32/s_llrintf.c (__llrintf): Likewise.
* sysdeps/ieee754/flt-32/s_llroundf.c (__llroundf): Likewise.
* sysdeps/ieee754/flt-32/s_lrintf.c (__lrintf): Likewise.
* sysdeps/ieee754/flt-32/s_lroundf.c (__lroundf): Likewise.
* sysdeps/ieee754/flt-32/s_modff.c (__modff): Likewise.
* sysdeps/ieee754/flt-32/s_remquof.c (__remquof): Likewise.
* sysdeps/ieee754/flt-32/s_roundf.c (__roundf): Likewise.
* sysdeps/ieee754/ldbl-128/e_acoshl.c (__ieee754_acoshl):
Likewise.
* sysdeps/ieee754/ldbl-128/e_atan2l.c (__ieee754_atan2l):
Likewise.
* sysdeps/ieee754/ldbl-128/e_atanhl.c (__ieee754_atanhl):
Likewise.
* sysdeps/ieee754/ldbl-128/e_fmodl.c (__ieee754_fmodl): Likewise.
* sysdeps/ieee754/ldbl-128/e_gammal_r.c (__ieee754_gammal_r):
Likewise.
* sysdeps/ieee754/ldbl-128/e_hypotl.c (__ieee754_hypotl):
Likewise.
* sysdeps/ieee754/ldbl-128/e_jnl.c (__ieee754_jnl): Likewise.
(__ieee754_ynl): Likewise.
* sysdeps/ieee754/ldbl-128/e_powl.c (__ieee754_powl): Likewise.
* sysdeps/ieee754/ldbl-128/e_rem_pio2l.c (__ieee754_rem_pio2l):
Likewise.
* sysdeps/ieee754/ldbl-128/e_remainderl.c (__ieee754_remainderl):
Likewise.
* sysdeps/ieee754/ldbl-128/e_sinhl.c (__ieee754_sinhl): Likewise.
* sysdeps/ieee754/ldbl-128/k_cosl.c (__kernel_cosl): Likewise.
* sysdeps/ieee754/ldbl-128/k_sincosl.c (__kernel_sincosl):
Likewise.
* sysdeps/ieee754/ldbl-128/k_sinl.c (__kernel_sinl): Likewise.
* sysdeps/ieee754/ldbl-128/s_ceill.c (__ceill): Likewise.
* sysdeps/ieee754/ldbl-128/s_copysignl.c (__copysignl): Likewise.
* sysdeps/ieee754/ldbl-128/s_erfl.c (__erfcl): Likewise.
* sysdeps/ieee754/ldbl-128/s_fabsl.c (__fabsl): Likewise.
* sysdeps/ieee754/ldbl-128/s_finitel.c (__finitel): Likewise.
* sysdeps/ieee754/ldbl-128/s_floorl.c (__floorl): Likewise.
* sysdeps/ieee754/ldbl-128/s_fpclassifyl.c (__fpclassifyl):
Likewise.
* sysdeps/ieee754/ldbl-128/s_frexpl.c (__frexpl): Likewise.
* sysdeps/ieee754/ldbl-128/s_isnanl.c (__isnanl): Likewise.
* sysdeps/ieee754/ldbl-128/s_issignalingl.c (__issignalingl):
Likewise.
* sysdeps/ieee754/ldbl-128/s_llrintl.c (__llrintl): Likewise.
* sysdeps/ieee754/ldbl-128/s_llroundl.c (__llroundl): Likewise.
* sysdeps/ieee754/ldbl-128/s_lrintl.c (__lrintl): Likewise.
* sysdeps/ieee754/ldbl-128/s_lroundl.c (__lroundl): Likewise.
* sysdeps/ieee754/ldbl-128/s_modfl.c (__modfl): Likewise.
* sysdeps/ieee754/ldbl-128/s_nearbyintl.c (__nearbyintl):
Likewise.
* sysdeps/ieee754/ldbl-128/s_nextafterl.c (__nextafterl):
Likewise.
* sysdeps/ieee754/ldbl-128/s_nexttoward.c (__nexttoward):
Likewise.
* sysdeps/ieee754/ldbl-128/s_nexttowardf.c (__nexttowardf):
Likewise.
* sysdeps/ieee754/ldbl-128/s_nextupl.c (__nextupl): Likewise.
* sysdeps/ieee754/ldbl-128/s_remquol.c (__remquol): Likewise.
* sysdeps/ieee754/ldbl-128/s_rintl.c (__rintl): Likewise.
* sysdeps/ieee754/ldbl-128/s_roundl.c (__roundl): Likewise.
* sysdeps/ieee754/ldbl-128/s_tanhl.c (__tanhl): Likewise.
* sysdeps/ieee754/ldbl-128/s_truncl.c (__truncl): Likewise.
* sysdeps/ieee754/ldbl-128ibm/e_fmodl.c (__ieee754_fmodl):
Likewise.
* sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c (__ieee754_gammal_r):
Likewise.
* sysdeps/ieee754/ldbl-128ibm/e_powl.c (__ieee754_powl): Likewise.
* sysdeps/ieee754/ldbl-128ibm/e_rem_pio2l.c (__ieee754_rem_pio2l):
Likewise.
* sysdeps/ieee754/ldbl-128ibm/e_remainderl.c
(__ieee754_remainderl): Likewise.
* sysdeps/ieee754/ldbl-128ibm/k_cosl.c (__kernel_cosl): Likewise.
* sysdeps/ieee754/ldbl-128ibm/k_sinl.c (__kernel_sinl): Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_fabsl.c (__fabsl): Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_fpclassifyl.c (___fpclassifyl):
Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_modfl.c (__modfl): Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_nexttowardf.c (__nexttowardf):
Likewise.
* sysdeps/ieee754/ldbl-128ibm/s_remquol.c (__remquol): Likewise.
* sysdeps/ieee754/ldbl-96/e_acoshl.c (__ieee754_acoshl): Likewise.
* sysdeps/ieee754/ldbl-96/e_asinl.c (__ieee754_asinl): Likewise.
* sysdeps/ieee754/ldbl-96/e_atanhl.c (__ieee754_atanhl): Likewise.
* sysdeps/ieee754/ldbl-96/e_coshl.c (__ieee754_coshl): Likewise.
* sysdeps/ieee754/ldbl-96/e_gammal_r.c (__ieee754_gammal_r):
Likewise.
* sysdeps/ieee754/ldbl-96/e_hypotl.c (__ieee754_hypotl): Likewise.
* sysdeps/ieee754/ldbl-96/e_j0l.c (__ieee754_j0l): Likewise.
(__ieee754_y0l): Likewise.
(pzero): Likewise.
(qzero): Likewise.
* sysdeps/ieee754/ldbl-96/e_j1l.c (__ieee754_j1l): Likewise.
(__ieee754_y1l): Likewise.
(pone): Likewise.
(qone): Likewise.
* sysdeps/ieee754/ldbl-96/e_jnl.c (__ieee754_jnl): Likewise.
(__ieee754_ynl): Likewise.
* sysdeps/ieee754/ldbl-96/e_lgammal_r.c (sin_pi): Likewise.
(__ieee754_lgammal_r): Likewise.
* sysdeps/ieee754/ldbl-96/e_rem_pio2l.c (__ieee754_rem_pio2l):
Likewise.
* sysdeps/ieee754/ldbl-96/e_sinhl.c (__ieee754_sinhl): Likewise.
* sysdeps/ieee754/ldbl-96/s_copysignl.c (__copysignl): Likewise.
* sysdeps/ieee754/ldbl-96/s_erfl.c (__erfl): Likewise.
(__erfcl): Likewise.
* sysdeps/ieee754/ldbl-96/s_frexpl.c (__frexpl): Likewise.
* sysdeps/ieee754/ldbl-96/s_issignalingl.c (__issignalingl):
Likewise.
* sysdeps/ieee754/ldbl-96/s_llrintl.c (__llrintl): Likewise.
* sysdeps/ieee754/ldbl-96/s_llroundl.c (__llroundl): Likewise.
* sysdeps/ieee754/ldbl-96/s_lrintl.c (__lrintl): Likewise.
* sysdeps/ieee754/ldbl-96/s_lroundl.c (__lroundl): Likewise.
* sysdeps/ieee754/ldbl-96/s_modfl.c (__modfl): Likewise.
* sysdeps/ieee754/ldbl-96/s_nexttoward.c (__nexttoward): Likewise.
* sysdeps/ieee754/ldbl-96/s_nexttowardf.c (__nexttowardf):
Likewise.
* sysdeps/ieee754/ldbl-96/s_nextupl.c (__nextupl): Likewise.
* sysdeps/ieee754/ldbl-96/s_remquol.c (__remquol): Likewise.
* sysdeps/ieee754/ldbl-96/s_roundl.c (__roundl): Likewise.
* sysdeps/ieee754/ldbl-96/s_tanhl.c (__tanhl): Likewise.
* sysdeps/ieee754/ldbl-opt/s_nexttowardfd.c (__nldbl_nexttowardf):
Likewise.
* sysdeps/m68k/m680x0/fpu/e_pow.c (s(__ieee754_pow)): Likewise.
* sysdeps/m68k/m680x0/fpu/s_fpclassifyl.c (__fpclassifyl):
Likewise.
* sysdeps/m68k/m680x0/fpu/s_llrint.c (__llrint): Likewise.
* sysdeps/m68k/m680x0/fpu/s_llrintf.c (__llrintf): Likewise.
* sysdeps/m68k/m680x0/fpu/s_llrintl.c (__llrintl): Likewise.
* sysdeps/m68k/m680x0/fpu/s_nextafterl.c (__nextafterl): Likewise.
* sysdeps/x86/fpu/powl_helper.c (__powl_helper): Likewise.
2017-08-03 19:55:04 +00:00
Richard Henderson
1ed0291c31
Use <> for math.h and math_private.h everywhere.
...
Entire tree edited via find | grep | sed.
2012-03-09 16:09:10 -08:00
Ulrich Drepper
0ac5ae2335
Optimize libm
...
libm is now somewhat integrated with gcc's -ffinite-math-only option
and lots of the wrapper functions have been optimized.
2011-10-12 11:27:51 -04:00
Ulrich Drepper
aa2ebe015a
2005-07-20 Bob Wilson <bob.wilson@acm.org>
...
Darin Petkov <darin@tensilica.com>
* sysdeps/ieee754/flt-32/e_atan2f.c (pi_lo): Correct exponent value.
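A constant named pi_lo is presumably the low-order correction term of a two-part single-precision representation of pi, where a wrong exponent would make the correction far too large or too small. The sketch below shows how such a split can be derived; the names are hypothetical and the values are computed from the decimal expansion of pi, not copied from e_atan2f.c.

```c
/* Double-precision reference value of pi used for the split.  */
static const double pi_ref = 3.14159265358979323846;

/* High part: pi rounded to binary32.  */
float
pi_high (void)
{
  return (float) pi_ref;
}

/* Low part: what the high part missed, again rounded to binary32.  */
float
pi_low (void)
{
  return (float) (pi_ref - (double) pi_high ());
}
```

The low part is tiny compared with the high part (on the order of 1e-7 here), so its exponent is the piece most easily mistyped when the constant is written out by hand.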
2005-07-20 18:02:49 +00:00
Ulrich Drepper
a334319f65
(CFLAGS-tst-align.c): Add -mpreferred-stack-boundary=4.
2004-12-22 20:10:10 +00:00
Jakub Jelinek
0ecb606cb6
2.5-18.1
2007-07-12 18:26:36 +00:00
Ulrich Drepper
abfbdde177
Update.
1999-07-14 00:54:57 +00:00