mirror of https://github.com/MariaDB/server.git synced 2025-08-08 11:22:35 +03:00

MDEV-23633 MY_RELAX_CPU performs unnecessary compare-and-swap on ARM

This follows up MDEV-14374, which was filed against MariaDB Server 10.3.
Back then, on a 48-core Qualcomm Centriq 2400, the performance of
delay loops for spinloops was tested both with and without the dummy
compare-and-swap operation, and it was decided to keep the dummy
operation.

On target architectures where nothing special is available (that is,
anything other than x86 (IA-32, AMD64) or POWER), we perform a dummy
compare-and-swap operation. This is contrary to the idea of the x86
PAUSE instruction and of __ppc_get_timebase() on POWER, which aim to
keep the memory bus idle for a while, so that other cores can better
execute code while a spinloop is waiting for something to change.

On MariaDB Server 10.4 and another implementation of the ARMv8 ISA,
omitting the dummy compare-and-swap improved performance by up to 12%.
So, let us avoid the dummy compare-and-swap on ARM.

For now, we are retaining the dummy compare-and-swap on other ISAs
(such as SPARC, MIPS, S390x, RISC-V) because we do not have any
performance data for them.
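
The selection logic described above can be sketched roughly as follows. This is a minimal illustration, not the MariaDB definition: the names relax_cpu_sketch() and spin_delay_sketch() are hypothetical, the x86 branch uses an inline "pause" as a stand-in for the HAVE_PAUSE_INSTRUCTION path, the POWER branch is omitted, and the fallback uses C11 <stdatomic.h> in place of my_atomic_cas32_strong_explicit().

```c
#include <stdatomic.h>

/* Hypothetical stand-in for MY_RELAX_CPU(); names are illustrative only. */
static inline void relax_cpu_sketch(void)
{
#if defined __GNUC__ && (defined __arm__ || defined __aarch64__)
  /* Compiler barrier only: no instruction is emitted, but the volatile
     asm must be executed on every iteration of an enclosing delay loop,
     so the compiler cannot optimize the loop away. */
  __asm__ __volatile__ ("":::"memory");
#elif defined __GNUC__ && (defined __i386__ || defined __x86_64__)
  /* x86: PAUSE hints that this is a spin-wait loop. */
  __asm__ __volatile__ ("pause":::"memory");
#else
  /* Other ISAs: dummy relaxed compare-and-swap on a local variable,
     retained because no performance data is available for them. */
  atomic_int var = 0;
  int oldval = 0;
  atomic_compare_exchange_strong_explicit(&var, &oldval, 1,
                                          memory_order_relaxed,
                                          memory_order_relaxed);
#endif
}

/* Hypothetical delay loop: relaxes the CPU `iterations` times and
   returns the number of iterations actually performed. */
static unsigned spin_delay_sketch(unsigned iterations)
{
  unsigned done = 0;
  for (unsigned i = 0; i < iterations; i++)
  {
    relax_cpu_sketch();
    done++;
  }
  return done;
}
```

A caller would invoke spin_delay_sketch(n) between polls of a shared variable. On ARM the loop then touches no memory at all, which is the point of dropping the compare-and-swap.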

Author: Marko Mäkelä
Date: 2020-09-04 10:31:41 +03:00
parent 1cda462f46
commit 24f510bba4

@@ -53,6 +53,7 @@
 #ifdef _WIN32
 #elif defined HAVE_PAUSE_INSTRUCTION
 #elif defined(_ARCH_PWR8)
+#elif defined __GNUC__ && (defined __arm__ || defined __aarch64__)
 #else
 # include "my_atomic.h"
 #endif
@@ -80,6 +81,9 @@ static inline void MY_RELAX_CPU(void)
 #endif
 #elif defined(_ARCH_PWR8)
   __ppc_get_timebase();
+#elif defined __GNUC__ && (defined __arm__ || defined __aarch64__)
+  /* Mainly, prevent the compiler from optimizing away delay loops */
+  __asm__ __volatile__ ("":::"memory");
 #else
   int32 var, oldval = 0;
   my_atomic_cas32_strong_explicit(&var, &oldval, 1, MY_MEMORY_ORDER_RELAXED,