1
0
mirror of https://github.com/postgres/postgres.git synced 2025-10-24 01:29:19 +03:00
Commit Graph

84 Commits

Author SHA1 Message Date
Peter Eisentraut
9a8313dabe Remove useless configure check
The test for "decltype" as a variant of "typeof" apparently never
worked (see also commit 3582b223d4), so remove it.

Discussion: https://www.postgresql.org/message-id/flat/795b1c54-c64a-47f9-8fa3-880dcab59975%40eisentraut.org
2025-01-05 11:34:28 +01:00
Thomas Munro
962da900ac Use <stdint.h> and <inttypes.h> for c.h integers.
Redefine our exact width types with standard C99 types and macros,
including int64_t, INT64_MAX, INT64_C(), PRId64 etc.  We were already
using <stdint.h> types in a few places.

One complication is that Windows' <inttypes.h> uses format strings like
"%I64d", "%I32", "%I" for PRI*64, PRI*32, PTR*PTR, instead of mapping to
other standardized format strings like "%lld" etc as seen on other known
systems.  Teach our snprintf.c to understand them.

This removes a lot of configure clutter, and should also allow 64-bit
numbers and other standard types to be used in localized messages
without casting.

Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/ME3P282MB3166F9D1F71F787929C0C7E7B6312%40ME3P282MB3166.AUSP282.PROD.OUTLOOK.COM
2024-12-04 15:05:38 +13:00
Nathan Bossart
4b03a27faf Use __attribute__((target(...))) for SSE4.2 CRC-32C support.
Presently, we check for compiler support for the required
intrinsics both with and without the -msse4.2 compiler flag, and
then depending on the results of those checks, we pick which files
to compile with which flags.  This is tedious and complicated, and
it results in unsustainable coding patterns such as separate files
for each portion of code that may need to be built with different
compiler flags.

This commit makes use of the newly-added support for
__attribute__((target(...))) in the SSE4.2 CRC-32C code.  This
simplifies both the configure-time checks and the build scripts,
and it allows us to place the functions that use the intrinsics in
files that we otherwise do not want to build with special CPU
instructions (although this commit refrains from doing so).  This
is also preparatory work for a proposed follow-up commit that will
further optimize the CRC-32C code with AVX-512 instructions.

While at it, this commit modifies meson's checks for SSE4.2 CRC
support to be the same as autoconf's.  meson was choosing whether
to use a runtime check based purely on whether -msse4.2 is
required, while autoconf has long checked for the __SSE4_2__
preprocessor symbol to decide.  meson's previous approach seems to
work just fine, but this change avoids needing to build multiple
test programs and to keep track of whether to actually use
pg_attribute_target().

Ideally we'd use __attribute__((target(...))) for ARMv8 CRC
support, too, but there's little point in doing so because until
clang 16, using the ARM intrinsics still requires special compiler
flags.  Perhaps we can re-evaluate this decision after some time
has passed.

Author: Raghuveer Devulapalli
Discussion: https://postgr.es/m/PH8PR11MB8286BE735A463468415D46B5FB5C2%40PH8PR11MB8286.namprd11.prod.outlook.com
2024-11-27 16:19:05 -06:00
Nathan Bossart
41b98ddb77 Fix __attribute__((target(...))) usage.
The commonly supported way to specify multiple target options is to
surround the entire list with quotes and to use a comma (with no
extra spaces) as the delimiter.

Oversight in commit f78667bd91.

Discussion: https://postgr.es/m/Zy0jya8nF8CPpv3B%40nathan
2024-11-07 15:27:32 -06:00
Nathan Bossart
f78667bd91 Use __attribute__((target(...))) for AVX-512 support.
Presently, we check for compiler support for the required
intrinsics both with and without extra compiler flags (e.g.,
-mxsave), and then depending on the results of those checks, we
pick which files to compile with which flags.  This is tedious and
complicated, and it results in unsustainable coding patterns such
as separate files for each portion of code may need to be built
with different compiler flags.

This commit introduces support for __attribute__((target(...))) and
uses it for the AVX-512 code.  This simplifies both the
configure-time checks and the build scripts, and it allows us to
place the functions that use the intrinsics in files that we
otherwise do not want to build with special CPU instructions.  We
are careful to avoid using __attribute__((target(...))) on
compilers that do not understand it, but we still perform the
configure-time checks in case the compiler allows using the
intrinsics without it (e.g., MSVC).

A similar change could likely be made for some of the CRC-32C code,
but that is left as a future exercise.

Suggested-by: Andres Freund
Reviewed-by: Raghuveer Devulapalli, Andres Freund
Discussion: https://postgr.es/m/20240731205254.vfpap7uxwmebqeaf%40awork3.anarazel.de
2024-11-07 13:58:43 -06:00
Nathan Bossart
792752af4e Optimize pg_popcount() with AVX-512 instructions.
Presently, pg_popcount() processes data in 32-bit or 64-bit chunks
when possible.  Newer hardware that supports AVX-512 instructions
can use 512-bit chunks, which provides a nice speedup, especially
for larger buffers.  This commit introduces the infrastructure
required to detect compiler and CPU support for the required
AVX-512 intrinsic functions, and it adds a new pg_popcount()
implementation that uses these functions.  If CPU support for this
optimized implementation is detected at runtime, a function pointer
is updated so that it is used by subsequent calls to pg_popcount().

Most of the existing in-tree calls to pg_popcount() should benefit
from these instructions, and calls with smaller buffers should at
least not regress compared to v16.  The new infrastructure
introduced by this commit can also be used to optimize
visibilitymap_count(), but that is left for a follow-up commit.

Co-authored-by: Paul Amonson, Ants Aasma
Reviewed-by: Matthias van de Meent, Tom Lane, Noah Misch, Akash Shankaran, Alvaro Herrera, Andres Freund, David Rowley
Discussion: https://postgr.es/m/BL1PR11MB5304097DF7EA81D04C33F3D1DCA6A%40BL1PR11MB5304.namprd11.prod.outlook.com
2024-04-06 21:56:23 -05:00
Heikki Linnakangas
0b16bb8776 Remove AIX support
There isn't a lot of user demand for AIX support, we have a bunch of
hacks to work around AIX-specific compiler bugs and idiosyncrasies,
and no one has stepped up to the plate to properly maintain it.
Remove support for AIX to get rid of that maintenance overhead. It's
still supported for stable versions.

The acute issue that triggered this decision was that after commit
8af2565248, the AIX buildfarm members have been hitting this
assertion:

    TRAP: failed Assert("(uintptr_t) buffer == TYPEALIGN(PG_IO_ALIGN_SIZE, buffer)"), File: "md.c", Line: 472, PID: 2949728

Apperently the "pg_attribute_aligned(a)" attribute doesn't work on AIX
for values larger than PG_IO_ALIGN_SIZE, for a static const variable.
That could be worked around, but we decided to just drop the AIX support
instead.

Discussion: https://www.postgresql.org/message-id/20240224172345.32@rfd.leadboat.com
Reviewed-by: Andres Freund, Noah Misch, Thomas Munro
2024-02-28 15:17:23 +04:00
John Naylor
4d14ccd6af Use native CRC instructions on 64-bit LoongArch
As with the Intel and Arm CRC instructions, compiler intrinsics for
them must be supported by the compiler. In contrast, no runtime check
is needed. Aligned memory access is faster, so use the Arm coding as
a model.

YANG Xudong

Discussion: https://postgr.es/m/b522a0c5-e3b2-99cc-6387-58134fb88cbe%40ymatrix.cn
2023-08-10 11:36:15 +07:00
Andres Freund
9db49fc5bf autoconf: Move export_dynamic determination to configure
Previously export_dynamic was set in src/makefiles/Makefile.$port. For solaris
this required exporting with_gnu_ld. The determination of with_gnu_ld would be
nontrivial to copy for meson PGXS compatibility.  It's also nice to delete
libtool.m4.

This uses -Wl,--export-dynamic on all platforms, previously all platforms but
FreeBSD used -Wl,-E. The likelihood of a name conflict seems lower with the
longer spelling.

Discussion: https://postgr.es/m/20221005200710.luvw5evhwf6clig6@awork3.anarazel.de
2022-12-06 18:55:28 -08:00
Andres Freund
e0f0e08e17 autoconf: Unify CFLAGS_SSE42 and CFLAGS_ARMV8_CRC32C
Until now we emitted the cflags to build the CRC objects into architecture
specific variables. That doesn't make a whole lot of sense to me - we're never
going to target x86 and arm at the same time, so they don't need to be
separate variables.

It might be better to instead continue to have CFLAGS_SSE42 /
CFLAGS_ARMV8_CRC32C be computed by PGAC_ARMV8_CRC32C_INTRINSICS /
PGAC_SSE42_CRC32_INTRINSICS and then set CFLAGS_CRC based on those. But it
seems unlikely that we'd need other sets of CRC specific flags for those two
architectures at the same time.

This simplifies the upcoming meson PGXS compatibility.

Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
Discussion: https://postgr.es/m/20221005200710.luvw5evhwf6clig6@awork3.anarazel.de
2022-12-01 18:46:55 -08:00
Michael Paquier
ec3c9cc202 Add definition pg_attribute_aligned() for MSVC
Visual Studio 2015+ has support for a macro to control the alignement of
structures as of __declspec(align(#)), and this commit adds a definition
of pg_attribute_aligned() based on that.  It happens that this was
already used in the implementation of atomics for MSVC.  Note that there
is still no definition fo pg_attribute_packed(), so this does not impact
itemptr.h.

Author: James Coleman
Discussion: https://postgr.es/m/CAAaqYe-HbtZvR3msoMtk+hYW2S0e0OapzMW8icSMYTMA+mN8Aw@mail.gmail.com
2022-09-21 10:11:23 +09:00
Andres Freund
320f92b744 Rely on __func__ being supported
Previously we fell back to __FUNCTION__ and then NULL. As __func__ is in C99
that shouldn't be necessary anymore.

Solution.pm defined HAVE_FUNCNAME__FUNCTION instead of
HAVE_FUNCNAME__FUNC (originating in 4164e6636e), as at some point in the past
MSVC only supported __FUNCTION__. Our minimum version supports __func__.

Reviewed-By: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/20220807012914.ydz73yte6j3coulo@awork3.anarazel.de
2022-08-07 09:36:01 -07:00
Tom Lane
de447bb8e6 Suppress warning about stack_base_ptr with late-model GCC.
GCC 12 complains that set_stack_base is storing the address of
a local variable in a long-lived pointer.  This is an entirely
reasonable warning (indeed, it just helped us find a bug);
but that behavior is intentional here.  We can work around it
by using __builtin_frame_address(0) instead of a specific local
variable; that produces an address a dozen or so bytes different,
in my testing, but we don't care about such a small difference.
Maybe someday a compiler lacking that function will start to issue
a similar warning, but we'll worry about that when it happens.

Patch by me, per a suggestion from Andres Freund.  Back-patch to
v12, which is as far back as the patch will go without some pain.
(Recently-established project policy would permit a back-patch as
far as 9.2, but I'm disinclined to expend the work until GCC 12
is much more widespread.)

Discussion: https://postgr.es/m/3773792.1645141467@sss.pgh.pa.us
2022-02-17 22:46:01 -05:00
Tom Lane
748507c780 Sync up some inconsistent comments in config/c-compiler.m4.
Make header/trailer comments agree with the actual names of some macros.
These seem like legit names in earlier iterations of respective patches
(commit b779168ff "Detect PG_PRINTF_ATTRIBUTE automatically." and
commit 6869b4f25 "Add C++ support to configure.") but the macro had
been renamed out of sync with the header / trailer comment in the final
committed patch.

Even more nitpickily, make the dashed underlines agree with the lengths
of the macro names everyplace.  There doesn't seem to have been any
meeting of the minds previously on whether those should match or not,
but at least some people have been trying to make 'em match.

Jesse Zhang, Tom Lane

Discussion: https://postgr.es/m/CAGf+fX7DDyq6WfCy6X_KtD28MkbNBE6NkRi26fSf25dfUwX0zw@mail.gmail.com
2020-04-22 15:27:43 -04:00
Tom Lane
f4d59369d2 Assume that we have signed integral types and flexible array members.
These compiler features are required by C99, so remove the configure
probes for them.

This is part of a series of commits to get rid of no-longer-relevant
configure checks and dead src/port/ code.  I'm committing them separately
to make it easier to back out individual changes if they prove less
portable than I expect.

Discussion: https://postgr.es/m/15379.1582221614@sss.pgh.pa.us
2020-02-21 14:30:48 -05:00
Tom Lane
02a6a54ecd Make use of compiler builtins and/or assembly for CLZ, CTZ, POPCNT.
Test for the compiler builtins __builtin_clz, __builtin_ctz, and
__builtin_popcount, and make use of these in preference to
handwritten C code if they're available.  Create src/port
infrastructure for "leftmost one", "rightmost one", and "popcount"
so as to centralize these decisions.

On x86_64, __builtin_popcount generally won't make use of the POPCNT
opcode because that's not universally supported yet.  Provide code
that checks CPUID and then calls POPCNT via asm() if available.
This requires indirecting through a function pointer, which is
an annoying amount of overhead for a one-instruction operation,
but it's probably not worth working harder than this for our
current use-cases.

I'm not sure we've found all the existing places that could profit
from this new infrastructure; but we at least touched all the
ones that used copied-and-pasted versions of the bitmapset.c code,
and got rid of multiple copies of the associated constant arrays.

While at it, replace c-compiler.m4's one-per-builtin-function
macros with a single one that can handle all the cases we need
to worry about so far.  Also, because I'm paranoid, make those
checks into AC_LINK checks rather than just AC_COMPILE; the
former coding failed to verify that libgcc has support for the
builtin, in cases where it's not inline code.

David Rowley, Thomas Munro, Alvaro Herrera, Tom Lane

Discussion: https://postgr.es/m/CAKJS1f9WTAGG1tPeJnD18hiQW5gAk59fQ6WK-vfdAKEHyRg2RA@mail.gmail.com
2019-02-15 23:22:33 -05:00
Alvaro Herrera
457aef0f1f Revert attempts to use POPCNT etc instructions
This reverts commits fc6c72747a, 109de05cbb, d0b4663c23 and
711bab1e4d.

Somebody will have to try harder before submitting this patch again.
I've spent entirely too much time on it already, and the #ifdef maze yet
to be written in order for it to build at all got on my nerves.  The
amount of work needed to get a platform-specific performance improvement
that's barely above the noise level is not worth it.
2019-02-15 16:32:30 -03:00
Alvaro Herrera
fc6c72747a Fix compiler builtin usage in new pg_bitutils.c
Split out these new functions in three parts: one in a new file that
uses the compiler builtin and gets compiled with the -mpopcnt compiler
option if it exists; another one that uses the compiler builtin but not
the compiler option; and finally the fallback with open-coded
algorithms.

Split out the configure logic: in the original commit, it was selecting
to use the -mpopcnt compiler switch together with deciding whether to
use the compiler builtin, but those two things are really separate.
Split them out.  Also, expose whether the builtin exists to
Makefile.global, so that src/port's Makefile can decide whether to
compile the hw-optimized file.

Remove CPUID test for CTZ/CLZ.  Make pg_{right,left}most_ones use either
the compiler intrinsic or open-coded algo; trying to use the
HW-optimized version is a waste of time.  Make them static inline
functions.

Discussion: https://postgr.es/m/20190213221719.GA15976@alvherre.pgsql
2019-02-15 13:39:56 -03:00
Alvaro Herrera
109de05cbb Fix portability issues in pg_bitutils
We were using uint64 function arguments as "long int" arguments to
compiler builtins, which fails on machines where long ints are 32 bits:
the upper half of the uint64 was being ignored.  Fix by using the "ll"
builtin variants instead, which on those machines take 64 bit arguments.

Also, remove configure tests for __builtin_popcountl() (as well as
"long" variants for ctz and clz): the theory here is that any compiler
version will provide all widths or none, so one test suffices.  Were
this theory to be wrong, we'd have to add tests for
__builtin_popcountll() and friends, which would be tedious.

Per failures in buildfarm member lapwing and ensuing discussion.
2019-02-13 20:09:48 -03:00
Alvaro Herrera
711bab1e4d Add basic support for using the POPCNT and SSE4.2s LZCNT opcodes
These opcodes have been around in the AMD world since 2007, and 2008 in
the case of intel.  They're supported in GCC and Clang via some __builtin
macros.  The opcodes may be unavailable during runtime, in which case we
fall back on a C-based implementation of the code.  In order to get the
POPCNT instruction we must pass the -mpopcnt option to the compiler.  We
do this only for the pg_bitutils.c file.

David Rowley (with fragments taken from a patch by Thomas Munro)

Discussion: https://postgr.es/m/CAKJS1f9WTAGG1tPeJnD18hiQW5gAk59fQ6WK-vfdAKEHyRg2RA@mail.gmail.com
2019-02-13 16:10:06 -03:00
Tom Lane
aed9fa0bd8 Select appropriate PG_PRINTF_ATTRIBUTE for recent NetBSD.
NetBSD-current generates a large number of warnings about "%m" not
being appropriate to use with *printf functions.  While that's true
for their native printf, it's surely not true for snprintf.c, so I
think they have misunderstood gcc's definition of the "gnu_printf"
archetype.  Nonetheless, choosing "__syslog__" instead silences the
warnings; so teach configure about that.

Since this is only a cosmetic warning issue (and anyway it depends
on previous hacking to be self-consistent), no back-patch.

Discussion: https://postgr.es/m/16785.1539046036@sss.pgh.pa.us
2018-10-09 11:10:07 -04:00
Tom Lane
96bf88d527 Always use our own versions of *printf().
We've spent an awful lot of effort over the years in coping with
platform-specific vagaries of the *printf family of functions.  Let's just
forget all that mess and standardize on always using src/port/snprintf.c.
This gets rid of a lot of configure logic, and it will allow a saner
approach to dealing with %m (though actually changing that is left for
a follow-on patch).

Preliminary performance testing suggests that as it stands, snprintf.c is
faster than the native printf functions for some tasks on some platforms,
and slower for other cases.  A pending patch will improve that, though
cases with floating-point conversions will doubtless remain slower unless
we want to put a *lot* of effort into that.  Still, we've not observed
that *printf is really a performance bottleneck for most workloads, so
I doubt this matters much.

Patch by me, reviewed by Michael Paquier

Discussion: https://postgr.es/m/2975.1526862605@sss.pgh.pa.us
2018-09-26 13:13:57 -04:00
Andres Freund
8ecdefc261 Remove test for VA_ARGS, implied by C99.
This simplifies logic / reduces duplication in a few headers.

Author: Andres Freund
Discussion: https://postgr.es/m/97d4b165-192d-3605-749c-f614a0c4e783@2ndquadrant.com
2018-08-24 10:41:45 -07:00
Tom Lane
46b5e7c4b5 Revert "Distinguish printf-like functions that support %m from those that don't."
This reverts commit 3a60c8ff89.  Buildfarm
results show that that caused a whole bunch of new warnings on platforms
where gcc believes the local printf to be non-POSIX-compliant.  This
problem outweighs the hypothetical-anyway possibility of getting warnings
for misuse of %m.  We could use gnu_printf archetype when we've substituted
src/port/snprintf.c, but that brings us right back to the problem of not
getting warnings for %m.

A possible answer is to attack it in the other direction by insisting
that %m support be included in printf's feature set, but that will take
more investigation.  In the meantime, revert the previous change, and
update the comment for PGAC_C_PRINTF_ARCHETYPE to more fully explain
what's going on.

Discussion: https://postgr.es/m/2975.1526862605@sss.pgh.pa.us
2018-08-12 18:46:01 -04:00
Tom Lane
3a60c8ff89 Distinguish printf-like functions that support %m from those that don't.
The elog/ereport family of functions certainly support the %m format spec,
because they implement it "by hand".  But elsewhere we have printf wrappers
that might or might not allow it depending on whether the platform's printf
does.  (Most non-glibc versions don't, and notably, src/port/snprintf.c
doesn't.)  Hence, rather than using the gnu_printf format archetype
interchangeably for all these functions, use it only for elog/ereport.
This will allow us to get compiler warnings for mistakes like the ones
fixed in commit a13b47a59, at least on platforms where printf doesn't
take %m and gcc is correctly configured to know it.  (Unfortunately,
that won't happen on Linux, nor on macOS according to my testing.
It remains to be seen what the buildfarm's gcc-on-Windows animals will
think of this, but we may well have to rely on less-popular platforms
to warn us about unportable code of this kind.)

Discussion: https://postgr.es/m/2975.1526862605@sss.pgh.pa.us
2018-08-11 11:11:05 -04:00
Peter Eisentraut
1486f7f981 Fix typos 2018-07-10 11:14:53 +02:00
Peter Eisentraut
f61988d160 Fix typo 2018-07-05 08:31:40 +02:00
Heikki Linnakangas
f044d71e33 Use ARMv8 CRC instructions where available.
ARMv8 introduced special CPU instructions for calculating CRC-32C. Use
them, when available, for speed.

Like with the similar Intel CRC instructions, several factors affect
whether the instructions can be used. The compiler intrinsics for them must
be supported by the compiler, and the instructions must be supported by the
target architecture. If the compilation target architecture does not
support the instructions, but adding "-march=armv8-a+crc" makes them
available, then we compile the code with a runtime check to determine if
the host we're running on supports them or not.

For the runtime check, use glibc getauxval() function. Unfortunately,
that's not very portable, but I couldn't find any more portable way to do
it. If getauxval() is not available, the CRC instructions will still be
used if the target architecture supports them without any additional
compiler flags, but the runtime check will not be available.

Original patch by Yuqi Gu, heavily modified by me. Reviewed by Andres
Freund, Thomas Munro.

Discussion: https://www.postgresql.org/message-id/HE1PR0801MB1323D171938EABC04FFE7FA9E3110%40HE1PR0801MB1323.eurprd08.prod.outlook.com
2018-04-04 12:22:45 +03:00
Andres Freund
6869b4f258 Add C++ support to configure.
This is an optional dependency. It'll be used for the upcoming LLVM
based just in time compilation support, which needs to wrap a few LLVM
C++ APIs so they're accessible from C..

For now test for C++ compilers unconditionally, without failing if not
present, to ensure wide buildfarm coverage. If we're bothered by the
additional test times (which are quite short) or verbosity, we can
later make the tests conditional on --with-llvm.

Author: Andres Freund
Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de
2018-03-20 15:48:48 -07:00
Andres Freund
3de04e4ed1 Add PGAC_PROG_VARCC_VARFLAGS_OPT autoconf macro.
The new macro allows to test flags for different compilers and to
store them in different CFLAG like variables.  The existing
PGAC_PROG_CC_CFLAGS_OPT and PGAC_PROG_CC_VAR_OPT are changed to be
just wrappers around the new function.

This'll be used by the upcoming LLVM support, to separately detect
capabilities used by clang, when generating bitcode.

Author: Andres Freund
Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de
2018-03-20 12:58:08 -07:00
Tom Lane
2082b3745a Extend configure's __int128 test to check for a known gcc bug.
On Sparc64, use of __attribute__(aligned(8)) with __int128 causes faulty
code generation in gcc versions at least through 5.5.0.  We can work around
that by disabling use of __int128, so teach configure to test for the bug.

This solution doesn't fix things for the case of cross-compiling with a
buggy compiler; to support that nicely, we'd need to add a manual disable
switch.  Unless more such cases turn up, it doesn't seem worth the work.
Affected users could always edit pg_config.h manually.

In passing, fix some typos in the existing configure test for __int128.
They're harmless because we only compile that code not run it, but
they're still confusing for anyone looking at it closely.

This is needed in support of commit 751804998, so back-patch to 9.5
as that was.

Marina Polyakova, Victor Wagner, Tom Lane

Discussion: https://postgr.es/m/0d3a9fa264cebe1cb9966f37b7c06e86@postgrespro.ru
2018-01-18 11:09:44 -05:00
Tom Lane
c6d21d56f1 Try harder to detect unavailability of __builtin_mul_overflow(int64).
Commit c04d35f44 didn't quite do the job here, because it still allowed
the compiler to deduce that the function call could be optimized away.
Prevent that by putting the arguments and results in global variables.

Discussion: https://postgr.es/m/20171213213754.pydkyjs6bt2hvsdb@alap3.anarazel.de
2017-12-17 11:52:22 -05:00
Andres Freund
c04d35f442 Try to detect runtime unavailability of __builtin_mul_overflow(int64).
On some systems the results of 64 bit __builtin_mul_overflow()
operations can be computed at compile time, but not at runtime. The
known cases are arm buildfar animals using clang where the runtime
operation is implemented in a unavailable function.

Try to avoid compile-time computation by using volatile arguments to
__builtin_mul_overflow(). In that case we hopefully will get a link
error when unavailable, similar to what buildfarm animals dangomushi
and gull are reporting.

Author: Andres Freund
Discussion: https://postgr.es/m/20171213213754.pydkyjs6bt2hvsdb@alap3.anarazel.de
2017-12-16 12:49:41 -08:00
Tom Lane
9220b00e57 Tighten configure's test for __builtin_constant_p().
Commit 9fa6f00b1 assumed that __builtin_constant_p("string literal")
is TRUE, if the compiler has that function at all.  Buildfarm results
show that Sun Studio 12, at least, breaks that assumption.  Removing
that usage would leave us with no mechanical check for a very fragile
coding requirement, so instead teach configure to ignore
__builtin_constant_p() if it doesn't behave that way.  We could
complicate matters by distinguishing three cases (no such function,
vs does, vs doesn't work for string literals); but for now, that seems
unnecessary because our other existing uses of this function are just
fairly minor optimizations of non-returning elog/ereport.  We can live
without that on the small population of compilers that act this way.

Discussion: https://postgr.es/m/22997.1513264066@sss.pgh.pa.us
2017-12-14 17:19:27 -05:00
Andres Freund
85abb5b297 Make PGAC_C_BUILTIN_OP_OVERFLOW link instead of just compiling.
Otherwise the detection can spuriously detect symbol as available,
because the compiler may just emits reference to non-existant symbol.
2017-12-12 17:21:37 -08:00
Andres Freund
4d6ad31257 Provide overflow safe integer math inline functions.
It's not easy to get signed integer overflow checks correct and
fast. Therefore abstract the necessary infrastructure into a common
header providing addition, subtraction and multiplication for 16, 32,
64 bit signed integers.

The new macros aren't yet used, but a followup commit will convert
several open coded overflow checks.

Author: Andres Freund, with some code stolen from Greg Stark
Reviewed-By: Robert Haas
Discussion: https://postgr.es/m/20171024103954.ztmatprlglz3rwke@alap3.anarazel.de
2017-12-12 16:55:37 -08:00
Tom Lane
7518049980 Prevent int128 from requiring more than MAXALIGN alignment.
Our initial work with int128 neglected alignment considerations, an
oversight that came back to bite us in bug #14897 from Vincent Lachenal.
It is unsurprising that int128 might have a 16-byte alignment requirement;
what's slightly more surprising is that even notoriously lax Intel chips
sometimes enforce that.

Raising MAXALIGN seems out of the question: the costs in wasted disk and
memory space would be significant, and there would also be an on-disk
compatibility break.  Nor does it seem very practical to try to allow some
data structures to have more-than-MAXALIGN alignment requirement, as we'd
have to push knowledge of that throughout various code that copies data
structures around.

The only way out of the box is to make type int128 conform to the system's
alignment assumptions.  Fortunately, gcc supports that via its
__attribute__(aligned()) pragma; and since we don't currently support
int128 on non-gcc-workalike compilers, we shouldn't be losing any platform
support this way.

Although we could have just done pg_attribute_aligned(MAXIMUM_ALIGNOF) and
called it a day, I did a little bit of extra work to make the code more
portable than that: it will also support int128 on compilers without
__attribute__(aligned()), if the native alignment of their 128-bit-int
type is no more than that of int64.

Add a regression test case that exercises the one known instance of the
problem, in parallel aggregation over a bigint column.

This will need to be back-patched, along with the preparatory commit
91aec93e6.  But let's see what the buildfarm makes of it first.

Discussion: https://postgr.es/m/20171110185747.31519.28038@wrigleys.postgresql.org
2017-11-14 15:03:55 -05:00
Andres Freund
510b8cbff1 Extend & revamp pg_bswap.h infrastructure.
Upcoming patches are going to address performance issues that involve
slow system provided ntohs/htons etc. To address that expand
pg_bswap.h to provide pg_ntoh{16,32,64}, pg_hton{16,32,64} and
optimize their respective implementations by using compiler intrinsics
for gcc compatible compilers and msvc. Fall back to manual
implementations using shifts etc otherwise.

Additionally remove multiple evaluation hazards from the existing
BSWAP32/64 macros, by replacing them with inline functions when
necessary. In the course of that the naming scheme is changed to
pg_bswap16/32/64.

Author: Andres Freund
Discussion: https://postgr.es/m/20170927172019.gheidqy6xvlxb325@alap3.anarazel.de
2017-09-29 17:24:39 -07:00
Peter Eisentraut
ddce628971 Fix configure check for typeof 2017-03-28 22:28:56 -04:00
Peter Eisentraut
4cb824699e Cast result of copyObject() to correct type
copyObject() is declared to return void *, which allows easily assigning
the result independent of the input, but it loses all type checking.

If the compiler supports typeof or something similar, cast the result to
the input type.  This creates a greater amount of type safety.  In some
cases, where the result is assigned to a generic type such as Node * or
Expr *, new casts are now necessary, but in general casts are now
unnecessary in the normal case and indicate that something unusual is
happening.

Reviewed-by: Mark Dilger <hornschnorter@gmail.com>
2017-03-28 21:59:23 -04:00
Tom Lane
bc18126a6b Add configure test to see if the C compiler has gcc-style computed gotos.
We'll need this for the upcoming patch to speed up expression evaluation.
Might as well push it now to see if it behaves sanely in the buildfarm.

Andres Freund

Discussion: https://postgr.es/m/20170320062511.hp5qeurtxrwsvfxr@alap3.anarazel.de
2017-03-20 13:35:26 -04:00
Peter Eisentraut
1c0cf52b39 Use return instead of exit() in configure
Using exit() requires stdlib.h, which is not included.  Use return
instead.  Also add return type for main().

Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Thomas Munro <thomas.munro@enterprisedb.com>
2016-09-30 14:03:57 -04:00
Noah Misch
4ad6f13500 Copyedit comments and documentation. 2016-04-01 21:53:10 -04:00
Robert Haas
c171818b27 Add BSWAP64 macro.
This is like BSWAP32, but for 64-bit values.  Since we've got two of
them now and they have use cases (like sortsupport) beyond CRCs, move
the definitions to their own header file.

Peter Geoghegan
2015-10-08 13:01:36 -04:00
Andres Freund
6cf72879e9 Improve configure test for the sse4.2 crc instruction.
With optimizations enabled at least one compiler, clang 3.7, optimized
away the crc intrinsics knowing that the result went on unused and has
no side effects. That can trigger errors in code generation when the
intrinsic is used, as we chose to use the intrinsics without any
additional compiler flag. Return the computed value to prevent that.

With some more pedantic warning flags (-Wold-style-definition) the
configure test failed to recognize the existence of _mm_crc32_u*
intrinsics due to an independent warning in the test because the test
turned on -Werror, but that's not actually needed here.

Discussion: 20150814092039.GH4955@awork2.anarazel.de
Backpatch: 9.5, where the use of crc intrinsics was integrated.
2015-08-17 11:15:46 +02:00
Andres Freund
de6fd1c898 Rely on inline functions even if that causes warnings in older compilers.
So far we have worked around the fact that some very old compilers do
not support 'inline' functions by only using inline functions
conditionally (or not at all). Since such compilers are very rare by
now, we have decided to rely on inline functions from 9.6 onwards.

To avoid breaking these old compilers inline is defined away when not
supported. That'll cause "function x defined but not used" type of
warnings, but since nobody develops on such compilers anymore that's
ok.

This change in policy will allow us to more easily employ inline
functions.

I chose to remove code previously conditional on PG_USE_INLINE as it
seemed confusing to have code dependent on a define that's always
defined.

Blacklisting of compilers, like in c53f73879f, now has to be done
differently. A platform template can define PG_FORCE_DISABLE_INLINE to
force inline to be defined empty.

Discussion: 20150701161447.GB30708@awork2.anarazel.de
2015-08-05 18:19:52 +02:00
Heikki Linnakangas
a2edb023d0 Replace obsolete autoconf macros with their modern replacements.
AC_TRY_COMPILE(...) -> AC_COMPILE_IFELSE([AC_LANG_PROGRAM(...)])
AC_TRY_LINK(...) -> AC_LINK_IFELSE([AC_LANG_PROGRAM(...)])
AC_TRY_RUN(...) -> AC_RUN_IFELSE([AC_LANG_PROGRAM(...)])
AC_LANG_SAVE/RESTORE -> AC_LANG_PUSH/POP
AC_DECL_SYS_SIGLIST -> AC_CHECK_DECLS(...) (per snippet in autoconf manual)

Also use AC_LANG_SOURCE instead of AC_LANG_PROGRAM, where the main()
function is not needed.

With these changes, autoconf -Wall doesn't complain anymore.

Andreas Karlsson
2015-07-02 19:21:23 +03:00
Heikki Linnakangas
936546dcbc Optimize pg_comp_crc32c_sse42 routine slightly, and also use it on x86.
Eliminate the separate 'len' variable from the loops, and also use the 4
byte instruction. This shaves off a few more cycles. Even though this
routine that uses the special SSE 4.2 instructions is much faster than a
generic routine, it's still a hot spot, so let's make it as fast as
possible.

Change the configure test to not test _mm_crc32_u64. That variant is only
available in the 64-bit x86-64 architecture, not in 32-bit x86. Modify
pg_comp_crc32c_sse42 so that it only uses _mm_crc32_u64 on x86-64. With
these changes, the SSE accelerated CRC-32C implementation can also be used
on 32-bit x86 systems.

This also fixes the 32-bit MSVC build.
2015-04-14 23:58:16 +03:00
Heikki Linnakangas
3dc2d62d04 Use Intel SSE 4.2 CRC instructions where available.
Modern x86 and x86-64 processors with SSE 4.2 support have special
instructions, crc32b and crc32q, for calculating CRC-32C. They greatly
speed up CRC calculation.

Whether the instructions can be used or not depends on the compiler and the
target architecture. If generation of SSE 4.2 instructions is allowed for
the target (-msse4.2 flag on gcc and clang), use them. If they are not
allowed by default, but the compiler supports the -msse4.2 flag to enable
them, compile just the CRC-32C function with -msse4.2 flag, and check at
runtime whether the processor we're running on supports it. If it doesn't,
fall back to the slicing-by-8 algorithm. (With the common defaults on
current operating systems, the runtime-check variant is what you get in
practice.)

Abhijit Menon-Sen, heavily modified by me, reviewed by Andres Freund.
2015-04-14 17:05:03 +03:00
Andres Freund
8122e1437e Add, optional, support for 128bit integers.
We will, for the foreseeable future, not expose 128 bit datatypes to
SQL. But being able to use 128bit math will allow us, in a later patch,
to use 128bit accumulators for some aggregates; leading to noticeable
speedups over using numeric.

So far we only detect a gcc/clang extension that supports 128bit math,
but no 128bit literals, and no *printf support. We might want to expand
this in the future to further compilers; if there are any that that
provide similar support.

Discussion: 544BB5F1.50709@proxel.se
Author: Andreas Karlsson, with significant editorializing by me
Reviewed-By: Peter Geoghegan, Oskari Saarenmaa
2015-03-20 10:26:17 +01:00