To enable “longlong.h” removal, umul_ppmm is moved to gmp-arch.h.
The generic implementation now uses a static inline function, which
provides better type checking than the GNU extension that casts the asm
constraint (and it works better with clang).
Most architectures use the generic implementation, which is expanded
from a macro; the exceptions are alpha, arm, hppa, x86, m68k, mips,
powerpc, and sparc. On 32-bit architectures the compiler generates
good enough code using uint64_t types, whereas on 64-bit architectures
the patch leverages the math_u128.h definitions, which use 128-bit
integers when available (all 64-bit architectures on gcc 15).
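As an illustration only, the 32-bit case described above could look
roughly like the sketch below. The names limb32_t, umul_ppmm_sketch,
umul_ppmm_example, and umul_ppmm_demo are hypothetical and are not
taken from the patch; the real code operates on mp_limb_t.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 32-bit limb type; the real mp_limb_t comes from the port.  */
typedef uint32_t limb32_t;

/* Full 32x32->64 multiply: *w1 receives the high limb, *w0 the low limb.
   Pointer parameters let the compiler check the argument types.  */
static inline void
umul_ppmm_sketch (limb32_t *w1, limb32_t *w0, limb32_t u, limb32_t v)
{
  uint64_t p = (uint64_t) u * v;
  *w1 = (limb32_t) (p >> 32);
  *w0 = (limb32_t) p;
}

/* Macro wrapper that keeps the traditional lvalue-style call syntax.  */
#define umul_ppmm_example(w1, w0, u, v) \
  umul_ppmm_sketch (&(w1), &(w0), (u), (v))

/* Usage check: 0xffffffff * 0xffffffff == 0xfffffffe00000001.  */
static inline void
umul_ppmm_demo (void)
{
  limb32_t hi, lo;
  umul_ppmm_example (hi, lo, 0xffffffffu, 0xffffffffu);
  assert (hi == 0xfffffffeu && lo == 0x00000001u);
}
```

A 64-bit variant would follow the same shape with a 128-bit
intermediate where the compiler provides one.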
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
To enable “longlong.h” removal, add_ssaaaa and sub_ssaaaa are moved to
gmp-arch.h. The generic implementation now uses a static inline
function. This provides better type checking than the GNU extension,
which casts the asm constraint, and it also works better with clang.
Most architectures use the generic implementation, with the exception
of arc, arm, hppa, x86, m68k, powerpc, and sparc. On 32-bit
architectures the compiler generates good enough code using uint64_t
types, whereas on 64-bit architectures the patch leverages the
math_u128.h definitions, which use 128-bit integers when available
(all 64-bit architectures on gcc 15).
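A minimal sketch of the 32-bit approach, assuming a 32-bit limb and
using hypothetical names (limb32_t, add_ssaaaa_sketch,
sub_ssaaaa_sketch, ssaaaa_demo) that do not come from the patch:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 32-bit limb type; the real code uses mp_limb_t.  */
typedef uint32_t limb32_t;

/* (sh:sl) = (ah:al) + (bh:bl), wrapping modulo 2^64.  */
static inline void
add_ssaaaa_sketch (limb32_t *sh, limb32_t *sl, limb32_t ah, limb32_t al,
                   limb32_t bh, limb32_t bl)
{
  uint64_t s = (((uint64_t) ah << 32) | al) + (((uint64_t) bh << 32) | bl);
  *sh = (limb32_t) (s >> 32);
  *sl = (limb32_t) s;
}

/* (sh:sl) = (ah:al) - (bh:bl), wrapping modulo 2^64.  */
static inline void
sub_ssaaaa_sketch (limb32_t *sh, limb32_t *sl, limb32_t ah, limb32_t al,
                   limb32_t bh, limb32_t bl)
{
  uint64_t s = (((uint64_t) ah << 32) | al) - (((uint64_t) bh << 32) | bl);
  *sh = (limb32_t) (s >> 32);
  *sl = (limb32_t) s;
}

/* Usage check: the carry out of the low limb reaches the high limb.  */
static inline void
ssaaaa_demo (void)
{
  limb32_t hi, lo;
  add_ssaaaa_sketch (&hi, &lo, 0, 0xffffffffu, 0, 1);
  assert (hi == 1u && lo == 0u);
}
```

The double-width arithmetic lets the compiler pick the add-with-carry
or subtract-with-borrow sequence itself.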
The strongly typed implementation required some changes. I adjusted
_FP_W_TYPE, _FP_WS_TYPE, and _FP_I_TYPE to use the same type as
mp_limb_t on aarch64, powerpc64le, x86_64, and riscv64. In practice
this means using “long” instead of “long long”.
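The reason can be sketched as follows: “unsigned long” and “unsigned
long long” are distinct C types even where they have the same size, so
a function taking a pointer to one is not compatible with a pointer to
the other. The names IS_UNSIGNED_LONG, bump_limb, and
type_check_demo are hypothetical and only serve this illustration.

```c
#include <assert.h>

/* Evaluates to 1 only when X has exactly the type unsigned long.  */
#define IS_UNSIGNED_LONG(x) _Generic ((x), unsigned long: 1, default: 0)

/* Stand-in for a typed inline helper that takes a limb pointer.  */
static inline void
bump_limb (unsigned long *p)
{
  *p += 1;
}

static inline void
type_check_demo (void)
{
  unsigned long l = 41;
  unsigned long long ll = 41;

  bump_limb (&l);       /* OK: the pointer type matches.  */
  /* bump_limb (&ll);      Incompatible pointer type: the compiler
                           diagnoses this even when both types are
                           64 bits wide.  */
  (void) ll;

  assert (l == 42);
  assert (IS_UNSIGNED_LONG (l) == 1);
  assert (IS_UNSIGNED_LONG (ll) == 0);
}
```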
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
To enable “longlong.h” removal, udiv_qrnnd is moved to a gmp-arch.h
file. This allows each architecture to implement its own arch-specific
optimizations. The generic implementation now uses a static inline
function, which provides better type checking than the GNU extension
that casts the asm constraint (and it works better with clang).
Most architectures use the generic implementation, which is expanded
from a macro; the exceptions are alpha, x86, m68k, sh, and sparc.
I kept the alpha version, which uses out-of-line implementations, and
the x86 version, where there is no easy way to use the div{q}
instruction from C code. For the rest, the compiler generates good
enough code.
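For reference, the operation itself can be sketched for a 32-bit limb
as below. This only illustrates the semantics (divide the two-limb
value n1:n0 by d, with n1 < d so the quotient fits in one limb) and
is not necessarily how the generic code computes it. The names
limb32_t and udiv_qrnnd_sketch are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 32-bit limb type; the real code uses mp_limb_t.  */
typedef uint32_t limb32_t;

/* Divide (n1:n0) by d, storing the quotient in *q and the remainder
   in *r.  The caller must ensure n1 < d so the quotient fits in a limb.  */
static inline void
udiv_qrnnd_sketch (limb32_t *q, limb32_t *r, limb32_t n1, limb32_t n0,
                   limb32_t d)
{
  assert (d != 0 && n1 < d);
  uint64_t n = ((uint64_t) n1 << 32) | n0;
  *q = (limb32_t) (n / d);
  *r = (limb32_t) (n % d);
}
```

On x86 the same operation maps to a single div instruction, which is
why that port keeps its own version.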
The hppa port also provides arch-specific implementations, but they are
not wired up in “longlong.h” and are thus never used.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>