mirror of https://sourceware.org/git/glibc.git synced 2025-08-08 17:42:12 +03:00

x86: Make the divisor in setting non_temporal_threshold cpu specific

Different systems prefer different divisors.

From benchmarks[1] so far the following divisors have been found:
    ICX     : 2
    SKX     : 2
    BWD     : 8

For Intel, we generalize that BWD and older prefer 8 as a divisor,
and SKL and newer prefer 2. These values can be tuned further as more
benchmarks are run.

[1]: https://github.com/goldsteinn/memcpy-nt-benchmarks
Reviewed-by: DJ Delorie <dj@redhat.com>
Noah Goldstein
2023-06-07 13:18:03 -05:00
parent f193ea20ed
commit 180897c161
4 changed files with 51 additions and 26 deletions
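
As a rough illustration of the rule described in the commit message, the sketch below picks a divisor per CPU generation and derives the default threshold as cachesize / divisor. The enum, the function names, and the example cache size are hypothetical and chosen only for this example; they are not glibc's actual identifiers or selection logic.

```c
#include <assert.h>

/* Hypothetical CPU-generation tags, for illustration only.  glibc's
   real CPU detection is more detailed than this.  */
enum example_uarch
{
  EXAMPLE_BROADWELL_OR_OLDER,
  EXAMPLE_SKYLAKE_OR_NEWER
};

/* Choose the divisor following the generalization in the commit
   message: 8 for BWD and older, 2 for SKL and newer.  */
static unsigned long int
example_pick_divisor (enum example_uarch uarch)
{
  return uarch == EXAMPLE_SKYLAKE_OR_NEWER ? 2 : 8;
}

/* Default threshold when the user specified none:
   cachesize / cachesize_non_temporal_divisor.  */
static unsigned long int
example_default_threshold (unsigned long int cachesize,
                           unsigned long int divisor)
{
  assert (divisor != 0);
  return cachesize / divisor;
}
```

With an assumed cache size of 32 MiB, this sketch would yield a 16 MiB threshold for the divisor 2 and a 4 MiB threshold for the divisor 8, so copies on older parts would switch to non-temporal stores at a much smaller size.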


@@ -945,6 +945,9 @@ struct cpu_features
   unsigned long int level3_cache_linesize;
   /* /_SC_LEVEL4_CACHE_SIZE. */
   unsigned long int level4_cache_size;
+  /* When no user non_temporal_threshold is specified. We default to
+     cachesize / cachesize_non_temporal_divisor. */
+  unsigned long int cachesize_non_temporal_divisor;
 };
 /* Get a pointer to the CPU features structure. */