1
0
mirror of https://sourceware.org/git/glibc.git synced 2025-08-08 17:42:12 +03:00
Commit Graph

3 Commits

Author SHA1 Message Date
Adhemerval Zanella
8eeb7de8a2 math: Fix UB on cospif (BZ 32923)
The left shift overflows for 'int', use uint32_t instead.  It syncs
with CORE-MATH commit bbfabd993a71b049c210b0febfd06d18369fadc1.

Checked on aarch64-linux-gnu, x86_64-linux-gnu, and i686-linux-gnu.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
2025-04-29 15:20:16 -03:00
Adhemerval Zanella
246e52574d math: Consolidate cospif and sinpif internal tables
The libm size improvement built with gcc-14, "--enable-stack-protector=strong
--enable-bind-now=yes --enable-fortify-source=2":

Before:

   text    data     bss     dec     hex filename
 584500     844      12  585356   8ee8c aarch64-linux-gnu/math/libm.so
 977341    1076      12  978429   eedfd x86_64-linux-gnu/math/libm.so
1205762    5608     368 1211738  127d5a powerpc64le-linux-gnu/math/libm.so

After:

   text    data     bss     dec     hex filename
 583444     844      12  584300   8ea6c aarch64-linux-gnu/math/libm.so
 976349    1076      12  977437   eea1d x86_64-linux-gnu/math/libm.so
1204738    5608     368 1210714  12795a powerpc64le-linux-gnu/math/libm.so
Reviewed-by: Andreas K. Huettel <dilfridge@gentoo.org>
2025-02-17 10:09:09 -03:00
Adhemerval Zanella
be85208b9f math: Use cospif from CORE-MATH
The CORE-MATH implementation is correctly rounded (for any rounding mode)
and shows better performance to the generic cospif.

The code was adapted to glibc style and to use the definition of
math_config.h (to handle errno, overflow, and underflow).

Benchtest on x64_64 (Ryzen 9 5900X, gcc 14.2.1), aarch64 (Neoverse-N1,
gcc 13.3.1), and powerpc (POWER10, gcc 13.2.1):

latency                    master        patched   improvement
x86_64                    47.4679        38.4157        19.07%
x86_64v2                  46.9686        38.3329        18.39%
x86_64v3                  43.8929        31.8510        27.43%
aarch64 (Neoverse)        18.8867        13.2089        30.06%
power8                    22.9435         7.8023        65.99%
power10                   15.4472        7.77505        49.67%

reciprocal-throughput      master        patched   improvement
x86_64                    20.9518        11.4991        45.12%
x86_64v2                  19.8699        10.5921        46.69%
x86_64v3                  19.3475         9.3998        51.42%
aarch64 (Neoverse)        12.5767         6.2158        50.58%
power8                    15.0566         3.2654        78.31%
power10                    9.2866         3.1147        66.46%

Reviewed-by: DJ Delorie <dj@redhat.com>
2025-02-12 16:31:57 -03:00