glibc

lib/glibc

mirror of https://sourceware.org/git/glibc.git synced 2025-12-24 17:51:17 +03:00

Author	SHA1	Message	Date
Adhemerval Zanella	3078358ac6	math: Remove the SVID error handling from tgammaf It improves latency for about 1.5% and throughput for about 2-4%. Tested on x86_64-linux-gnu and i686-linux-gnu. Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>	2025-11-05 10:19:37 -03:00
Adhemerval Zanella	de0e623434	math: Remove the SVID error handling from lgammaf/lgammaf_r It improves latency and throughput by about 2%. Tested on x86_64-linux-gnu and i686-linux-gnu. Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>	2025-11-05 09:27:07 -03:00
Adhemerval Zanella	7ec8eb5676	math: Remove the SVID error handling from atan2f It improves latency for about 3-6% and throughput for about 5-12%. Tested on x86_64-linux-gnu and i686-linux-gnu. Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>	2025-11-05 07:15:52 -03:00
Joseph Myers	26e4810210	Rename fromfp files in preparation for changing types for C23 As discussed in bug 28327, the fromfp functions changed type in C23 (compared to the version in TS 18661-1); they now return the same type as the floating-point argument, instead of intmax_t / uintmax_t. As with other such incompatible changes compared to the initial TS 18661 versions of interfaces (the types of totalorder functions, in particular), it seems appropriate to support only the new version as an API, not the old one (although many programs written for the old API might in fact work with the new one as well). Thus, the existing implementations should become compat symbols. They are sufficiently different from how I'd expect to implement the new version that using separate implementations in separate files is more convenient than trying to share code, and directly sharing testcases would be problematic as well. Rename the existing fromfp implementation and test files to names reflecting how they're intended to become compat symbols, so freeing up the existing filenames for a subsequent implementation of the C23 versions of these functions (which is the point at which the existing implementations would actually become compat symbols). gen-fromfp-tests.py and gen-fromfp-tests-inputs are not renamed; I think it will make sense to adapt the test generator to be able to generate most tests for both versions of the functions (with extra test inputs added that are only of interest with the C23 version). The ldbl-opt/nldbl-* files are also not renamed; since those are for a static only library, no compat versions are needed, and they'll just have their contents changed when the C23 version is implemented. Tested for x86_64, and with build-many-glibcs.py.	2025-11-04 23:41:35 +00:00
Joseph Myers	26d11a0944	Add C23 long_double_t, _FloatN_t C23 Annex H adds <math.h> typedefs long_double_t and _FloatN_t (originally introduced in TS 18661-3), analogous to float_t and double_t. Add these typedefs to glibc. (There are no _FloatNx_t typedefs.) C23 also slightly changes the rules for how such typedef names should be defined, compared to the definition in TS 18661-3. In both cases, <TYPE>_t corresponds to the evaluation format for <TYPE>, as specified by FLT_EVAL_METHOD (for which <math.h> uses glibc's internal __GLIBC_FLT_EVAL_METHOD). Specifically, each FLT_EVAL_METHOD value corresponds to some type U (for example, 64 corresponds to U = _Float64), and for types with exactly the same set of values as U, TS 18661-3 says expressions with those types are to be evaluated to the range and precision of type U (so <TYPE>_t is defined to U), whereas C23 only does that for types whose values are a strict subset of those of type U (so <TYPE>_t is defined to <TYPE>). As with other cases where semantics changed between TS 18661 and C23, this patch only implements the newer version of the semantics (including adjusting existing definitions of float_t and double_t as needed). The new semantics are contradictory between the main standard and Annex H for the case of FLT_EVAL_METHOD == 2 and the choice of double_t when double and long double have the same values (the main standard says it's defined as long double in that case, whereas Annex H would define it as double), which I've raised on the WG14 reflector (but I think setting FLT_EVAL_METHOD == 2 when double and long double have the same values is a fairly theoretical combination of features); for now glibc follows the value in the main standard in that case. 
Note that I think all existing GCC targets supported by glibc only use values -1, 0, 1, 2 or 16 for FLT_EVAL_METHOD (so most of the header code is somewhat theoretical, though potentially relevant with other compilers since the choice of FLT_EVAL_METHOD is only an API choice, not an ABI one; it can vary with compiler options, and these typedefs should not be used in ABIs). The testcase (expanded to cover the new typedefs) is really just repeating the same logic in a second place (so all it really tests is that __GLIBC_FLT_EVAL_METHOD is consistent with FLT_EVAL_METHOD). Tested for x86_64 and x86, and with build-many-glibcs.py.	2025-11-04 17:12:00 +00:00
Adhemerval Zanella	0dfc849eff	math: Remove the SVID error handling wrapper from sqrt i386 and m68k architectures should use math-use-builtins-sqrt.h rather than relying on architecture-specific or inline assembly implementations. The PowerPC optimization for PPC 601/603 (30 years old) is removed. Tested on x86_64-linux-gnu and i686-linux-gnu. Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>	2025-11-04 04:14:01 -03:00
Adhemerval Zanella	f27a146409	math: Remove the SVID error handling from sinhf It improves latency for about 3-10% and throughput for about 5-15%. Tested on x86_64-linux-gnu and i686-linux-gnu. Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>	2025-11-04 04:14:01 -03:00
Adhemerval Zanella	0e1a1178ee	math: Remove the SVID error handling from remainder The optimized i386 version is faster than the generic one, and gcc implements it through the builtin. This optimization enables us to migrate the implementation to a C version. The performance on a Zen3 chip is similar to the SVID one. The m68k provided an optimized version through __m81_u(remainderf) (mathimpl.h), and gcc does not implement it through a builtin (different than i386). Performance improves a bit on x86_64 (Zen3, gcc 15.2.1): reciprocal-throughput input master NO-SVID improvement x86_64 subnormals 18.8522 16.2506 13.80% x86_64 normal 421.8260 403.9270 4.24% x86_64 close-exponent 21.0579 18.7642 10.89% i686 subnormals 21.3443 21.4229 -0.37% i686 normal 525.8380 538.807 -2.47% i686 close-exponent 21.6589 21.7983 -0.64% Tested on x86_64-linux-gnu and i686-linux-gnu. Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>	2025-11-04 04:14:01 -03:00
Adhemerval Zanella	c4c6c79d70	math: Remove the SVID error handling from remainderf The optimized i386 version is faster than the generic one, and gcc implements it through the builtin. This optimization enables us to migrate the implementation to a C version. The performance on a Zen3 chip is similar to the SVID one. The m68k provided an optimized version through __m81_u(remainderf) (mathimpl.h), and gcc does not implement it through a builtin (different than i386). Performance improves a bit on x86_64 (Zen3, gcc 15.2.1): reciprocal-throughput input master NO-SVID improvement x86_64 subnormals 17.5349 15.6125 10.96% x86_64 normal 53.8134 52.5754 2.30% x86_64 close-exponent 20.0211 18.6656 6.77% i686 subnormals 21.8105 20.1856 7.45% i686 normal 73.1945 71.2199 2.70% i686 close-exponent 22.2141 20.331 8.48% Tested on x86_64-linux-gnu and i686-linux-gnu. Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>	2025-11-04 04:14:01 -03:00
Wilco Dijkstra	1136c036a3	math: Remove xfail from pow test [BZ #33563] Remove xfail from pow testcase since pow and powf have been fixed. Also check float128 maximum value. See BZ #33563. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2025-10-31 19:13:53 +00:00
Adhemerval Zanella	ee946212fe	math: Remove the SVID error handling wrapper from yn/jn Tested on x86_64-linux-gnu and i686-linux-gnu. Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>	2025-10-30 15:41:35 -03:00
Adhemerval Zanella	8d4815e6d7	math: Remove the SVID error handling wrapper from y1/j1 Tested on x86_64-linux-gnu and i686-linux-gnu. Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>	2025-10-30 15:41:33 -03:00
Adhemerval Zanella	b050cb53b0	math: Remove the SVID error handling wrapper from y0/j0 Tested on x86_64-linux-gnu and i686-linux-gnu. Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>	2025-10-30 15:41:31 -03:00
Adhemerval Zanella	03eeeba705	math: Remove the SVID error handling from coshf It improves latency for about 3-10% and throughput for about 5-15%. Tested on x86_64-linux-gnu and i686-linux-gnu. Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>	2025-10-30 15:41:28 -03:00
Adhemerval Zanella	555c39c0fc	math: Remove the SVID error handling from atanhf It improves latency for about 1-10% and throughput for about 5-10%. Tested on x86_64-linux-gnu and i686-linux-gnu. Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>	2025-10-30 15:41:26 -03:00
Adhemerval Zanella	8facb464b4	math: Remove the SVID error handling from acoshf It improves latency for about 3-7% and throughput for about 5-10%. Tested on x86_64-linux-gnu and i686-linux-gnu. Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>	2025-10-30 15:41:24 -03:00
Adhemerval Zanella	f92aba68bc	math: Remove the SVID error handling from asinf It improves latency for about 2% and throughput for about 5%. Tested on x86_64-linux-gnu and i686-linux-gnu. Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>	2025-10-30 15:41:22 -03:00
Adhemerval Zanella	9f8dea5b5d	math: Remove the SVID error handling from acosf It improves latency for about 2-10% and throughput for about 5-10%. Tested on x86_64-linux-gnu and i686-linux-gnu. Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>	2025-10-30 15:41:20 -03:00
Adhemerval Zanella	0b484d7b77	math: Remove the SVID error handling from log10f It improves latency for about 3-10% and throughput for about 5-10%. Tested on x86_64-linux-gnu and i686-linux-gnu. Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>	2025-10-30 15:41:17 -03:00
Adhemerval Zanella	e4d812c980	math: Consolidate erf/erfc definitions The common code definitions are consolidated in s_erf_common.h and s_erf_common.c. Checked on x86_64-linux-gnu, aarch64-linux-gnu, and powerpc64le-linux-gnu. Reviewed-by: DJ Delorie <dj@redhat.com>	2025-10-27 09:46:01 -03:00
Adhemerval Zanella	fc419290f9	math: Consolidate internal erf/erfc tables The shared internal data definitions are consolidated in s_erf_data.c and the erfc only one are moved to s_erfc_data.c. Checked on x86_64-linux-gnu, aarch64-linux-gnu, and powerpc64le-linux-gnu. Reviewed-by: DJ Delorie <dj@redhat.com>	2025-10-27 09:34:04 -03:00
Adhemerval Zanella	acaad9ab06	math: Use erfc from CORE-MATH The current implementation precision shows the following accuracy, on three ranges ([-DBL_MAX,-5], [-5,5], [5,DBL_MAX]) with 10e9 uniform randomly generated numbers for each range (first column is the accuracy in ULP, with '0' being correctly rounded, second is the number of samples with the corresponding precision): * Range [-DBL_MAX, -5] * FE_TONEAREST 0: 10000000000 100.00% * FE_UPWARD 0: 10000000000 100.00% * FE_DOWNWARD 0: 10000000000 100.00% * FE_TOWARDZERO 0: 10000000000 100.00% * Range [-5, 5] * FE_TONEAREST 0: 8069309665 80.69% 1: 1882910247 18.83% 2: 47485296 0.47% 3: 293749 0.00% 4: 1043 0.00% * FE_UPWARD 0: 5540301026 55.40% 1: 2026739127 20.27% 2: 1774882486 17.75% 3: 567324466 5.67% 4: 86913847 0.87% 5: 3820789 0.04% 6: 18259 0.00% * FE_DOWNWARD 0: 5520969586 55.21% 1: 2057293099 20.57% 2: 1778334818 17.78% 3: 557521494 5.58% 4: 82473927 0.82% 5: 3393276 0.03% 6: 13800 0.00% * FE_TOWARDZERO 0: 6220287175 62.20% 1: 2323846149 23.24% 2: 1251999920 12.52% 3: 190748245 1.91% 4: 12996232 0.13% 5: 122279 0.00% * Range [5, DBL_MAX] * FE_TONEAREST 0: 10000000000 100.00% * FE_UPWARD 0: 10000000000 100.00% * FE_DOWNWARD 0: 10000000000 100.00% * FE_TOWARDZERO 0: 10000000000 100.00% The CORE-MATH implementation is correctly rounded for any rounding mode. The code was adapted to glibc style and to use the definition of math_config.h (to handle errno, overflow, and underflow). 
Benchtest on x86_64 (Ryzen 9 5900X, gcc 14.2.1), aarch64 (Neoverse-N1, gcc 13.3.1), and powerpc (POWER10, gcc 13.2.1) shows: reciprocal-throughput master patched improvement x86_64 49.0980 267.0660 -443.94% x86_64v2 49.3220 257.6310 -422.34% x86_64v3 42.9539 84.9571 -97.79% aarch64 28.7266 52.9096 -84.18% power10 14.1673 25.1273 -77.36% Latency master patched improvement x86_64 95.6640 269.7060 -181.93% x86_64v2 95.8296 260.4860 -171.82% x86_64v3 91.1658 112.7150 -23.64% aarch64 37.0745 58.6791 -58.27% power10 23.3197 31.5737 -35.39% Checked on x86_64-linux-gnu, aarch64-linux-gnu, and powerpc64le-linux-gnu. Reviewed-by: DJ Delorie <dj@redhat.com>	2025-10-27 09:34:04 -03:00
Adhemerval Zanella	72a48e45bd	math: Use erf from CORE-MATH The current implementation precision shows the following accuracy, on three ranges ([-DBL_MIN, -4.2], [-4.2, 4.2], [4.2, DBL_MAX]) with 10e9 uniform randomly generated numbers for each range (first column is the accuracy in ULP, with '0' being correctly rounded, second is the number of samples with the corresponding precision): * Range [-DBL_MIN, -4.2] * FE_TONEAREST 0: 10000000000 100.00% * FE_UPWARD 0: 10000000000 100.00% * FE_DOWNWARD 0: 10000000000 100.00% * FE_TOWARDZERO 0: 10000000000 100.00% * Range [-4.2, 4.2] * FE_TONEAREST 0: 9764404513 97.64% 1: 235595487 2.36% * FE_UPWARD 0: 9468013928 94.68% 1: 531986072 5.32% * FE_DOWNWARD 0: 9493787693 94.94% 1: 506212307 5.06% * FE_TOWARDZERO 0: 9585271351 95.85% 1: 414728649 4.15% * Range [4.2, DBL_MAX] * FE_TONEAREST 0: 10000000000 100.00% * FE_UPWARD 0: 10000000000 100.00% * FE_DOWNWARD 0: 10000000000 100.00% * FE_TOWARDZERO 0: 10000000000 100.00% The CORE-MATH implementation is correctly rounded for any rounding mode. The code was adapted to glibc style and to use the definition of math_config.h (to handle errno, overflow, and underflow). Benchtest on x86_64 (Ryzen 9 5900X, gcc 14.2.1), aarch64 (Neoverse-N1, gcc 13.3.1), and powerpc (POWER10, gcc 13.2.1) shows: reciprocal-throughput master patched improvement x86_64 38.2754 78.0311 -103.87% x86_64v2 38.3325 75.7555 -97.63% x86_64v3 34.6604 28.3182 18.30% aarch64 23.1499 21.4307 7.43% power10 12.3051 9.3766 23.80% Latency master patched improvement x86_64 84.3062 121.3580 -43.95% x86_64v2 84.1817 117.4250 -39.49% x86_64v3 81.0933 70.6458 12.88% aarch64 35.012 29.5012 15.74% power10 21.7205 18.4589 15.02% For x86_64/x86_64-v2, most performance hit came from the fma call through the ifunc mechanism. Checked on x86_64-linux-gnu, aarch64-linux-gnu, and powerpc64le-linux-gnu. Reviewed-by: DJ Delorie <dj@redhat.com>	2025-10-27 09:34:04 -03:00
Adhemerval Zanella	1cae0550e8	math: Use tgamma from CORE-MATH The current implementation precision shows the following accuracy, on one range ([-20,20]) with 10e9 uniform randomly generated numbers for each range (first column is the accuracy in ULP, with '0' being correctly rounded, second is the number of samples with the corresponding precision): * Range [-20,20] * FE_TONEAREST 0: 4504877808 45.05% 1: 4402224940 44.02% 2: 947652295 9.48% 3: 131076831 1.31% 4: 13222216 0.13% 5: 910045 0.01% 6: 35253 0.00% 7: 606 0.00% 8: 6 0.00% * FE_UPWARD 0: 3477307921 34.77% 1: 4838637866 48.39% 2: 1413942684 14.14% 3: 240762564 2.41% 4: 27113094 0.27% 5: 2130934 0.02% 6: 102599 0.00% 7: 2324 0.00% 8: 14 0.00% * FE_DOWNWARD 0: 3923545410 39.24% 1: 4745067290 47.45% 2: 1137899814 11.38% 3: 171596912 1.72% 4: 20013805 0.20% 5: 1773899 0.02% 6: 99911 0.00% 7: 2928 0.00% 8: 31 0.00% * FE_TOWARDZERO 0: 3697160741 36.97% 1: 4731951491 47.32% 2: 1303092738 13.03% 3: 231969191 2.32% 4: 32344517 0.32% 5: 3283092 0.03% 6: 193010 0.00% 7: 5175 0.00% 8: 45 0.00% The CORE-MATH implementation is correctly rounded for any rounding mode. The code was adapted to glibc style and to use the definition of math_config.h (to handle errno, overflow, and underflow). Benchtest on x86_64 (Ryzen 9 5900X, gcc 14.2.1), aarch64 (Neoverse-N1, gcc 13.3.1), and powerpc (POWER10, gcc 13.2.1) shows: reciprocal-throughput master patched improvement x86_64 237.7960 175.4090 26.24% x86_64v2 232.9320 163.4460 29.83% x86_64v3 193.0680 89.7721 53.50% aarch64 113.6340 56.7350 50.07% power10 92.0617 26.6137 71.09% Latency master patched improvement x86_64 266.7190 208.0130 22.01% x86_64v2 263.6070 200.0280 24.12% x86_64v3 214.0260 146.5180 31.54% aarch64 114.4760 58.5235 48.88% power10 84.3718 35.7473 57.63% Checked on x86_64-linux-gnu, aarch64-linux-gnu, and powerpc64le-linux-gnu. Reviewed-by: DJ Delorie <dj@redhat.com>	2025-10-27 09:34:04 -03:00
Adhemerval Zanella	d67d2f4688	math: Use lgamma from CORE-MATH The current implementation precision shows the following accuracy, on one range ([-1,1]) with 10e9 uniform randomly generated numbers for each range (first column is the accuracy in ULP, with '0' being correctly rounded, second is the number of samples with the corresponding precision): * Range [-20, 20] * FE_TONEAREST 0: 6701254075 67.01% 1: 3230897408 32.31% 2: 63986940 0.64% 3: 3605417 0.04% 4: 233189 0.00% 5: 20973 0.00% 6: 1869 0.00% 7: 125 0.00% 8: 4 0.00% * FE_UPWARD 0: 4207428861 42.07% 1: 5001137116 50.01% 2: 740542213 7.41% 3: 49116304 0.49% 4: 1715617 0.02% 5: 54464 0.00% 6: 4956 0.00% 7: 451 0.00% 8: 16 0.00% 9: 2 0.00% * FE_DOWNWARD 0: 4155925193 41.56% 1: 4989821364 49.90% 2: 770312796 7.70% 3: 72014726 0.72% 4: 11040522 0.11% 5: 872811 0.01% 6: 12480 0.00% 7: 106 0.00% 8: 2 0.00% * FE_TOWARDZERO 0: 4225861532 42.26% 1: 5027051105 50.27% 2: 706443411 7.06% 3: 39877908 0.40% 4: 713109 0.01% 5: 47513 0.00% 6: 4961 0.00% 7: 438 0.00% 8: 23 0.00% * Range [20, 0x5.d53649e2d4674p+1012] * FE_TONEAREST 0: 7262241995 72.62% 1: 2737758005 27.38% * FE_UPWARD 0: 4690392401 46.90% 1: 5143728216 51.44% 2: 165879383 1.66% * FE_DOWNWARD 0: 4690333331 46.90% 1: 5143794937 51.44% 2: 165871732 1.66% * FE_TOWARDZERO 0: 4690343071 46.90% 1: 5143786761 51.44% 2: 165870168 1.66% The CORE-MATH implementation is correctly rounded for any rounding mode. The code was adapted to glibc style and to use the definition of math_config.h (to handle errno, overflow, and underflow). 
Benchtest on x86_64 (Ryzen 9 5900X, gcc 14.2.1), aarch64 (Neoverse-N1, gcc 13.3.1), and powerpc (POWER10, gcc 13.2.1) shows: reciprocal-throughput master patched improvement x86_64 112.9740 135.8640 -20.26% x86_64v2 111.8910 131.7590 -17.76% x86_64v3 108.2800 68.0935 37.11% aarch64 61.3759 49.2403 19.77% power10 42.4483 24.1943 43.00% Latency master patched improvement x86_64 144.0090 167.9750 -16.64% x86_64v2 139.2690 167.1900 -20.05% x86_64v3 130.1320 96.9347 25.51% aarch64 66.8538 53.2747 20.31% power10 49.5076 29.6917 40.03% For x86_64/x86_64-v2, most performance hit came from the fma call through the ifunc mechanism. Checked on x86_64-linux-gnu, aarch64-linux-gnu, and powerpc64le-linux-gnu. Reviewed-by: DJ Delorie <dj@redhat.com>	2025-10-27 09:34:04 -03:00
Adhemerval Zanella	140e802cb3	math: Move atanh internal data to separate file The internal data definitions are moved to s_atanh_data.c. It helps on ABIs that build the implementation multiple times for ifunc optimizations, like x86_64. Reviewed-by: DJ Delorie <dj@redhat.com>	2025-10-27 09:34:04 -03:00
Adhemerval Zanella	cb8d1575b6	math: Consolidate acosh and asinh internal table The shared internal data definitions are consolidated in s_asincosh_data.c. Reviewed-by: DJ Delorie <dj@redhat.com>	2025-10-27 09:34:04 -03:00
Paul Zimmermann	48fde7b026	various fixes detected with -Wdouble-promotion Changes with respect to v1: - added comment in e_j1f.c to explain why using float is enough Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2025-10-22 12:35:40 +02:00
Siddhesh Poyarekar	1b657c53c2	Simplify powl computation for small integral y [BZ #33411] The powl implementation for x86_64 ends up multiplying X once more than necessary and then throwing away that result. This results in an overflow flag being set in cases where there is no overflow. Simplify the relevant portion by special casing the -3 to 3 range and simply multiplying repeatedly. Resolves: BZ #33411 Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org> Reviewed-by: Paul Zimmermann <Paul.Zimmermann@inria.fr>	2025-10-21 14:00:10 -04:00
Adhemerval Zanella	0e4ca88bd2	math: Fix compare sort function on compoundn Use the fabs variant that matches the type in use instead of the double variant. This fixes a build issue with clang: ./s_compoundn_template.c:64:14: error: absolute value function 'fabs' given an argument of type 'const long double' but has parameter of type 'double' which may cause truncation of value [-Werror,-Wabsolute-value] 64 | FLOAT pd = fabs ((const FLOAT ) p); | ^ ./s_compoundn_template.c:64:14: note: use function 'fabsl' instead 64 | FLOAT pd = fabs ((const FLOAT ) p); | ^~~~ | fabsl Reviewed-by: Collin Funk <collin.funk1@gmail.com>	2025-10-21 09:27:05 -03:00
Adhemerval Zanella	b9b28ce35f	math: Suppress more aliases builtin type conflicts Reviewed-by: Sam James <sam@gentoo.org>	2025-10-21 09:26:02 -03:00
Adhemerval Zanella	39bf95c1ba	math: Suppress clang -Wabsolute-value warning on math_check_force_underflow clang warns: ../sysdeps/x86/fpu/powl_helper.c:233:3: error: absolute value function '__builtin_fabsf' given an argument of type 'typeof (res)' (aka 'long double') but has parameter of type 'float' which may cause truncation of value [-Werror,-Wabsolute-value] math_check_force_underflow (res); ^ ./math-underflow.h:45:11: note: expanded from macro 'math_check_force_underflow' if (fabs_tg (force_underflow_tmp) \ ^ ./math-underflow.h:27:20: note: expanded from macro 'fabs_tg' #define fabs_tg(x) __MATH_TG ((x), (__typeof (x)) __builtin_fabs, (x)) ^ ../math/math.h:899:16: note: expanded from macro '__MATH_TG' float: FUNC ## f ARGS, \ ^ <scratch space>:73:1: note: expanded from here __builtin_fabsf ^ This is due to the use of _Generic from TG_MATH. Reviewed-by: Sam James <sam@gentoo.org>	2025-10-21 09:24:21 -03:00
Adhemerval Zanella	850d93f514	math: Use binary search on lgammaf slow path And remove some unused entries of the fallback table. Checked on x86_64-linux-gnu and aarch64-linux-gnu. Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>	2025-10-14 11:12:08 -03:00
Adhemerval Zanella	ae49afe74d	math: Optimize fma call on log2p1f The fma is required only for x == -0x1.da285cp-5 in FE_TONEAREST to provide correctly rounded results. Checked on x86_64-linux-gnu and i686-linux-gnu. Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>	2025-10-14 11:12:00 -03:00
Adhemerval Zanella	82a4f50b4e	math: Optimize fma call on asinpif The fma is required only for x == +/-0x1.6371e8p-4f in FE_TOWARDZERO to provide correctly rounded results. Checked on x86_64-linux-gnu and aarch64-linux-gnu. Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>	2025-10-14 11:11:56 -03:00
Adhemerval Zanella	1c459af1ee	math: Update auto-libm-test-out-log2p1 Commit `0797283910` did not update the log2p1 output with the newer values.	2025-10-14 08:46:06 -03:00
Luna Lamb	653e6c4fff	AArch64: Implement AdvSIMD and SVE log10p1(f) routines Vector variants of the new C23 log10p1 routines. Note: Benchmark inputs for log10p1(f) are identical to log1p(f) Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>	2025-09-27 12:45:59 +00:00
Luna Lamb	db42732474	AArch64: Implement AdvSIMD and SVE log2p1(f) routines Vector variants of the new C23 log2p1 routines. Note: Benchmark inputs for log2p1(f) are identical to log1p(f). Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>	2025-09-27 12:44:09 +00:00
Adhemerval Zanella	63ba1a1509	math: Add fetestexcept internal alias To avoid linknamespace issues on old standards. It is required if the fallback fma implementation is used if/when it is also used internally for other implementation. Reviewed-by: DJ Delorie <dj@redhat.com>	2025-09-11 14:46:07 -03:00
Adhemerval Zanella	2eb8836de7	math: Add feclearexcept internal alias To avoid linknamespace issues on old standards. It is required if the fallback fma implementation is used if/when it is also used internally for other implementation. Reviewed-by: DJ Delorie <dj@redhat.com>	2025-09-11 14:46:07 -03:00
Hasaan Khan	8ced7815fb	AArch64: Implement exp2m1 and exp10m1 routines Vector variants of the new C23 exp2m1 & exp10m1 routines. Note: Benchmark inputs for exp2m1 & exp10m1 are identical to exp2 & exp10 respectively, this also includes the floating point variations. Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>	2025-09-02 16:50:24 +00:00
Adhemerval Zanella	6ab36c4e6d	math: Update auto-libm-tests-in with ldbl-128ibm compoundn/pown failures It fixes `ce488f7c16` which updated the out files without using gen-auto-libm-tests.c instructions. Checked on powerpc64le-linux-gnu. Tested-by: Andreas K. Huettel <dilfridge@gentoo.org> Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2025-07-28 13:58:54 -03:00
Sachin Monga	ce488f7c16	math: xfail some pown and compoundn tests for ibm128-libgcc On powerpc math/test-ibm128-pown shows below failures: testing long double (without inline functions) infinity has wrong sign. Failure: Test: pown_downward (-inf, 0x7fffffffffffffffLL) Result: is: inf inf should be: -inf -inf Failure: Test: pown_downward (-0, 9223372036854775807LL) Result: is: 0.00000000000000000000000000000000e+00 0x0.000000000000000000000000000p+0 should be: -0.00000000000000000000000000000000e+00 -0x0.000000000000000000000000000p+0 difference: 0.00000000000000000000000000000000e+00 0x0.000000000000000000000000000p+0 ulp : 0.0000 max.ulp : 16.0000 Failure: pown_downward (-0x1p+0, 9223372036854775807LL): Exception "Invalid operation" set Failure: pown_downward (-0x1p+0, 9223372036854775807LL): errno set to 34, expected 0 (unchanged) Failure: Test: pown_downward (-0x1p+0, 9223372036854775807LL) Result: is: qNaN should be: -1.00000000000000000000000000000000e+00 -0x1.000000000000000000000000000p+0 infinity has wrong sign. Failure: Test: pown_towardzero (-0, -0x7fffffffffffffffLL) Result: is: inf inf should be: -inf -inf infinity has wrong sign. 
Failure: Test: pown_towardzero (-inf, 0x7fffffffffffffffLL) Result: is: inf inf should be: -inf -inf Failure: Test: pown_towardzero (-inf, -0x7fffffffffffffffLL) Result: is: 0.00000000000000000000000000000000e+00 0x0.000000000000000000000000000p+0 should be: -0.00000000000000000000000000000000e+00 -0x0.000000000000000000000000000p+0 difference: 0.00000000000000000000000000000000e+00 0x0.000000000000000000000000000p+0 ulp : 0.0000 max.ulp : 16.0000 Failure: Test: pown_towardzero (-0, 9223372036854775807LL) Result: is: 0.00000000000000000000000000000000e+00 0x0.000000000000000000000000000p+0 should be: -0.00000000000000000000000000000000e+00 -0x0.000000000000000000000000000p+0 difference: 0.00000000000000000000000000000000e+00 0x0.000000000000000000000000000p+0 ulp : 0.0000 max.ulp : 16.0000 Failure: pown_towardzero (-0x1p+0, -9223372036854775807LL): Exception "Invalid operation" set Failure: pown_towardzero (-0x1p+0, -9223372036854775807LL): errno set to 34, expected 0 (unchanged) Failure: Test: pown_towardzero (-0x1p+0, -9223372036854775807LL) Result: is: qNaN should be: -1.00000000000000000000000000000000e+00 -0x1.000000000000000000000000000p+0 Failure: pown_towardzero (-0x1p+0, 9223372036854775807LL): Exception "Invalid operation" set Failure: pown_towardzero (-0x1p+0, 9223372036854775807LL): errno set to 34, expected 0 (unchanged) Failure: Test: pown_towardzero (-0x1p+0, 9223372036854775807LL) Result: is: qNaN should be: -1.00000000000000000000000000000000e+00 -0x1.000000000000000000000000000p+0 infinity has wrong sign. 
Failure: Test: pown_upward (-0, -0x7fffffffffffffffLL) Result: is: inf inf should be: -inf -inf Failure: Test: pown_upward (-inf, -0x7fffffffffffffffLL) Result: is: 0.00000000000000000000000000000000e+00 0x0.000000000000000000000000000p+0 should be: -0.00000000000000000000000000000000e+00 -0x0.000000000000000000000000000p+0 difference: 0.00000000000000000000000000000000e+00 0x0.000000000000000000000000000p+0 ulp : 0.0000 max.ulp : 16.0000 Failure: pown_upward (-0x1p+0, -9223372036854775807LL): Exception "Invalid operation" set Failure: pown_upward (-0x1p+0, -9223372036854775807LL): errno set to 34, expected 0 (unchanged) Failure: Test: pown_upward (-0x1p+0, -9223372036854775807LL) Result: is: qNaN should be: -1.00000000000000000000000000000000e+00 -0x1.000000000000000000000000000p+0 Likewise, math/test-ibm128-compoundn shows below failure: testing long double (without inline functions) Failure: compoundn_upward (0xf.ffffffffffff8p+1020, 1LL): Exception "Overflow" set Failure: compoundn_upward (0xf.ffffffffffff8p+1020, 1LL): errno set to 34, expected 0 (unchanged) Failure: Test: compoundn_upward (0xf.ffffffffffff8p+1020, 1LL) Result: is: inf inf should be: 1.79769313486231570814527423731707e+308 0x1.fffffffffffff00000000000008p+1023 Signed-off-by: Sachin Monga <smonga@linux.ibm.com> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2025-07-24 19:36:21 +02:00
Carlos O'Donell	801d566dde	gen-libm-test: Use 'original source' instead of 'master' in code. Use more inclusive language in generated sources. Reviewed-by: Florian Weimer <fweimer@redhat.com>	2025-05-21 12:48:00 -04:00
Dylan Fleming	96abd59bf2	AArch64: Implement AdvSIMD and SVE atan2pi/f Implement double and single precision variants of the C23 routine atan2pi for both AdvSIMD and SVE. Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>	2025-05-19 15:35:25 +00:00
Dylan Fleming	edf6202815	AArch64: Implement AdvSIMD and SVE atanpi/f Implement double and single precision variants of the C23 routine atanpi for both AdvSIMD and SVE. Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>	2025-05-19 15:34:40 +00:00
Dylan Fleming	0ef2cf44e7	AArch64: Implement AdvSIMD and SVE asinpi/f Implement double and single precision variants of the C23 routine asinpi for both AdvSIMD and SVE. Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>	2025-05-19 15:33:50 +00:00
Dylan Fleming	993997ca1b	AArch64: Implement AdvSIMD and SVE acospi/f Implement double and single precision variants of the C23 routine acospi for both AdvSIMD and SVE. Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>	2025-05-19 15:31:59 +00:00
Joseph Myers	06caf53adf	Implement C23 rootn. C23 adds various <math.h> function families originally defined in TS 18661-4. Add the rootn functions, which compute the Yth root of X for integer Y (with a domain error if Y is 0, even if X is a NaN). The integer exponent has type long long int in C23; it was intmax_t in TS 18661-4, and as with other interfaces changed after their initial appearance in the TS, I don't think we need to support the original version of the interface. As with pown and compoundn, I strongly encourage searching for worst cases for ulps error for these implementations (necessarily non-exhaustively, given the size of the input space). I also expect a custom implementation for a given format could be much faster as well as more accurate, although the implementation is simpler than those for pown and compoundn. This completes adding to glibc those TS 18661-4 functions (ignoring DFP) that are included in C23. See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118592 regarding the C23 mathematical functions (not just the TS 18661-4 ones) missing built-in functions in GCC, where such functions might usefully be added. Tested for x86_64 and x86, and with build-many-glibcs.py.	2025-05-14 10:51:46 +00:00
Joseph Myers	ae31254432	Implement C23 compoundn C23 adds various <math.h> function families originally defined in TS 18661-4. Add the compoundn functions, which compute (1+X) to the power Y for integer Y (and X at least -1). The integer exponent has type long long int in C23; it was intmax_t in TS 18661-4, and as with other interfaces changed after their initial appearance in the TS, I don't think we need to support the original version of the interface. Note that these functions are "compoundn" with a trailing "n", not "compound" (CORE-MATH has the wrong name, for example). As with pown, I strongly encourage searching for worst cases for ulps error for these implementations (necessarily non-exhaustively, given the size of the input space). I also expect a custom implementation for a given format could be much faster as well as more accurate (I haven't tested or benchmarked the CORE-MATH implementation for binary32); this is one of the more complicated and less efficient functions to implement in a type-generic way. As with exp2m1 and exp10m1, this showed up places where the powerpc64le IFUNC setup is not as self-contained as one might hope (in this case, without the changes specific to powerpc64le, there were undefined references to __GI___expf128). Tested for x86_64 and x86, and with build-many-glibcs.py.	2025-05-09 15:17:27 +00:00
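
The powl change in commit 1b657c53c2 above special-cases integral exponents in the -3 to 3 range so that no extra product is computed and thrown away. The real fix lives in the x86_64 powl implementation; the C function below is only a minimal sketch of the idea, and the name `pow_small_int` is hypothetical, not a glibc symbol. It performs exactly |y| - 1 multiplications, so an exponent of 1 performs none and cannot raise a spurious overflow flag.

```c
/* Hypothetical helper, not glibc code: compute x**y for an integer y
   in [-3, 3] by repeated multiplication.  Exactly |y| - 1 products are
   formed, so no surplus multiplication can raise a spurious overflow
   exception.  Callers are assumed to have checked the range of y.  */
static long double
pow_small_int (long double x, int y)
{
  int n = y < 0 ? -y : y;	/* Magnitude of the exponent.  */

  if (n == 0)
    return 1.0L;		/* x**0 is 1 for any x.  */

  long double r = x;
  for (int i = 1; i < n; i++)
    r *= x;			/* One product per remaining factor.  */

  /* A negative exponent takes the reciprocal of the positive power.  */
  return y < 0 ? 1.0L / r : r;
}
```

For example, `pow_small_int (2.0L, 3)` forms two products and returns 8.0L, while `pow_small_int (x, 1)` returns `x` without multiplying at all.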


1613 Commits