mirror of https://github.com/MariaDB/server.git synced 2025-08-08 11:22:35 +03:00

MDEV-36184 - mhnsw: support powerpc64 SIMD instructions

This patch optimises the dot_product function by leveraging
vectorisation through SIMD intrinsics. This transformation enables
parallel execution of multiple operations, significantly improving the
performance of dot product computation on supported architectures.

The original dot_product function does undergo auto-vectorisation when
compiled with -O3. However, performance analysis has shown that the
newly optimised implementation performs better on Power10 and achieves
comparable performance on Power9 machines.
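
For illustration, the kind of hand-vectorised dot product described above could be sketched as follows. This is not the code from the patch: the float element type, the function names and the choice of vec_xl/vec_madd are assumptions made for the example. Only the preprocessor guard and the POWER_IMPLEMENTATION macro come from the diff. On other architectures the sketch falls back to the scalar loop.

```cpp
#include <cstddef>

#if defined __powerpc64__ && defined __VSX__
#include <altivec.h>
#define POWER_IMPLEMENTATION
#endif

// Scalar reference loop; with -O3 the compiler may auto-vectorise it.
static float dot_product_scalar(const float *a, const float *b, size_t len)
{
  float sum= 0.0f;
  for (size_t i= 0; i < len; i++)
    sum+= a[i] * b[i];
  return sum;
}

// Hypothetical SIMD variant (float elements are an assumption).
static float dot_product(const float *a, const float *b, size_t len)
{
#ifdef POWER_IMPLEMENTATION
  // Four float lanes per 128-bit VSX register.
  __vector float acc= vec_splats(0.0f);
  size_t i= 0;
  for (; i + 4 <= len; i+= 4)
  {
    // Unaligned loads, then fused multiply-add into the accumulator.
    acc= vec_madd(vec_xl(0, a + i), vec_xl(0, b + i), acc);
  }
  // Horizontal sum of the four lanes.
  float sum= acc[0] + acc[1] + acc[2] + acc[3];
  // Scalar tail for lengths that are not a multiple of four.
  for (; i < len; i++)
    sum+= a[i] * b[i];
  return sum;
#else
  return dot_product_scalar(a, b, len);
#endif
}
```

Calling dot_product(a, b, n) on two float arrays of length n should give the same result as the scalar loop, up to floating-point summation order.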

Benchmark tests were conducted on both Power9 and Power10 machines,
comparing the time taken by the original (auto-vectorised) code and the
new vectorised code. Both versions were compiled with GCC 11.5.0 at -O3
on RHEL 9.5. The benchmarks used a sample test program with a vector
size of 4096 and 10⁷ loop iterations. Here are the average execution
times (in seconds) over multiple runs:

Power9:
Before change: ~16.364 s
After change: ~16.180 s
Performance gain is modest but measurable, about 1% less execution time.

Power10:
Before change: ~8.989 s
After change: ~6.446 s
Significant improvement, roughly 28% less execution time (about 1.39x
throughput).
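
A timing harness along the lines of the benchmark described above might look like the sketch below. The sample test code itself is not part of this commit, so everything here is an assumption: the helper names time_dot_product and relative_gain, the float element type, and the stand-in dot_product are all hypothetical.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

// Stand-in for the function under test (hypothetical).
static float dot_product(const float *a, const float *b, size_t len)
{
  float sum= 0.0f;
  for (size_t i= 0; i < len; i++)
    sum+= a[i] * b[i];
  return sum;
}

// Returns wall-clock seconds spent in `iterations` calls on `dims` elements.
static double time_dot_product(size_t dims, size_t iterations)
{
  std::vector<float> a(dims, 1.0f), b(dims, 2.0f);
  volatile float sink= 0.0f;  // keeps the result observable
  auto start= std::chrono::steady_clock::now();
  for (size_t i= 0; i < iterations; i++)
  {
    // Perturb one input so the call is not loop-invariant.
    if (dims > 0)
      a[0]= static_cast<float>(i & 1);
    sink= dot_product(a.data(), b.data(), dims);
  }
  auto stop= std::chrono::steady_clock::now();
  (void) sink;
  return std::chrono::duration<double>(stop - start).count();
}

// Fraction of execution time saved: (before - after) / before.
static double relative_gain(double before, double after)
{
  return (before - after) / before;
}
```

With the figures reported above, relative_gain(8.989, 6.446) comes to about 0.283 for Power10 and relative_gain(16.364, 16.180) to about 0.011 for Power9. The reported setup would correspond to time_dot_product(4096, 10000000).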

Signed-off-by: Manjul Mohan <manjul.mohan@ibm.com>
Manjul Mohan
2025-02-21 12:41:50 -05:00
committed by Sergei Golubchik
parent db5bb6f333
commit 6bb92f98ce
2 changed files with 56 additions and 0 deletions

@@ -53,6 +53,10 @@ SOFTWARE.
#define NEON_IMPLEMENTATION
#endif
#endif
#if defined __powerpc64__ && defined __VSX__
#include <altivec.h>
#define POWER_IMPLEMENTATION
#endif
template <typename T>
struct PatternedSimdBloomFilter