Siddhesh Poyarekar
5a67c4fa01
aarch64: Optimized memset for falkor
The generic memset reads dczid_el0 on every call. This has a
significant impact on falkor for a range of sizes because reading
dczid_el0 is slow.
The DZP bit in the dczid_el0 register does not change dynamically, so
it is safe to read once during program startup. With this patch
dczid_el0 is read once during startup and zva_size is cached. This is
used to invoke the falkor-specific memset; the generic memset routine
remains unchanged.
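
As a rough illustration of the caching described above, the sketch below decodes a raw dczid_el0 value once and uses the cached result to pick a memset variant. The mask values follow the architectural layout of DCZID_EL0 (DZP in bit 4, BS in bits [3:0] as log2 of the block size in words). The struct cpu_info type, the select_memset helper, the enum, and the "falkor with a 64-byte ZVA block" selection rule are hypothetical names and assumptions made for this example; they are not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* DCZID_EL0 layout: bit 4 (DZP) prohibits DC ZVA when set;
   bits [3:0] (BS) hold log2 of the block size in 4-byte words.  */
#define DCZID_DZP_MASK (1u << 4)
#define DCZID_BS_MASK  0xfu

/* Decode a raw DCZID_EL0 value into a ZVA block size in bytes.
   Returns 0 when DC ZVA is prohibited.  The raw value is passed in
   so that the decoding can be exercised on any host.  */
static unsigned int
zva_size_from_dczid (uint64_t dczid)
{
  if (dczid & DCZID_DZP_MASK)
    return 0;
  return 4u << (dczid & DCZID_BS_MASK);
}

/* Hypothetical stand-in for the state cached at startup.  */
struct cpu_info
{
  bool is_falkor;
  unsigned int zva_size;
};

enum memset_variant { MEMSET_GENERIC, MEMSET_FALKOR };

/* Assumed selection rule, for illustration only: use the falkor
   variant on falkor when the cached ZVA block size is 64 bytes.  */
static enum memset_variant
select_memset (const struct cpu_info *cpu)
{
  assert (cpu != NULL);
  return (cpu->is_falkor && cpu->zva_size == 64)
	 ? MEMSET_FALKOR : MEMSET_GENERIC;
}
```

With this decoding, a BS field of 4 and DZP clear yields a 64-byte block, and any value with DZP set yields 0, so the selector falls back to the generic variant.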
The gains are significant for falkor, with run time reductions as high
as 48% relative to __memset_generic. Here's a sample from the falkor tests:
Function: memset
Variant: walk
simple_memset __memset_falkor __memset_generic
=====================================================================
length=256, char=0: 139.96 (-698.28%) 9.07 ( 48.26%) 17.53
length=257, char=0: 140.50 (-699.03%) 9.53 ( 45.80%) 17.58
length=258, char=0: 140.96 (-703.95%) 9.58 ( 45.36%) 17.53
length=259, char=0: 141.56 (-705.16%) 9.53 ( 45.79%) 17.58
length=260, char=0: 142.15 (-710.76%) 9.57 ( 45.39%) 17.53
length=261, char=0: 142.50 (-710.39%) 9.53 ( 45.78%) 17.58
length=262, char=0: 142.97 (-715.09%) 9.57 ( 45.42%) 17.54
length=263, char=0: 143.51 (-716.18%) 9.53 ( 45.80%) 17.58
length=264, char=0: 143.93 (-720.55%) 9.58 ( 45.39%) 17.54
length=265, char=0: 144.56 (-722.07%) 9.53 ( 45.80%) 17.59
length=266, char=0: 144.98 (-726.42%) 9.58 ( 45.42%) 17.54
length=267, char=0: 145.53 (-727.53%) 9.53 ( 45.80%) 17.59
length=268, char=0: 146.25 (-731.81%) 9.53 ( 45.79%) 17.58
length=269, char=0: 146.52 (-735.39%) 9.53 ( 45.66%) 17.54
length=270, char=0: 146.97 (-735.81%) 9.53 ( 45.80%) 17.58
length=271, char=0: 147.54 (-741.08%) 9.58 ( 45.38%) 17.54
length=512, char=0: 268.26 (-1307.85%) 12.06 ( 36.71%) 19.05
length=513, char=0: 268.73 (-1273.89%) 13.56 ( 30.68%) 19.56
length=514, char=0: 269.31 (-1276.89%) 13.56 ( 30.68%) 19.56
length=515, char=0: 269.73 (-1279.05%) 13.56 ( 30.68%) 19.56
length=516, char=0: 270.34 (-1282.24%) 13.56 ( 30.67%) 19.56
length=517, char=0: 270.83 (-1284.71%) 13.56 ( 30.66%) 19.56
length=518, char=0: 271.20 (-1286.54%) 13.56 ( 30.67%) 19.56
length=519, char=0: 271.67 (-1288.67%) 13.65 ( 30.24%) 19.56
length=520, char=0: 272.14 (-1291.04%) 13.65 ( 30.22%) 19.56
length=521, char=0: 272.66 (-1293.69%) 13.65 ( 30.23%) 19.56
length=522, char=0: 273.14 (-1296.13%) 13.65 ( 30.20%) 19.56
length=523, char=0: 273.64 (-1298.75%) 13.65 ( 30.23%) 19.56
length=524, char=0: 274.34 (-1302.16%) 13.66 ( 30.20%) 19.57
length=525, char=0: 274.64 (-1297.78%) 13.56 ( 30.99%) 19.65
length=526, char=0: 275.20 (-1300.04%) 13.56 ( 31.01%) 19.66
length=527, char=0: 275.66 (-1302.86%) 13.56 ( 30.99%) 19.65
length=1024, char=0: 524.46 (-2169.75%) 20.12 ( 12.92%) 23.11
length=1025, char=0: 525.14 (-2124.63%) 21.62 ( 8.40%) 23.61
length=1026, char=0: 525.59 (-2125.36%) 21.88 ( 7.37%) 23.62
length=1027, char=0: 525.98 (-2127.14%) 21.62 ( 8.46%) 23.62
length=1028, char=0: 526.68 (-2131.10%) 21.62 ( 8.42%) 23.61
length=1029, char=0: 527.10 (-2131.70%) 21.79 ( 7.73%) 23.62
length=1030, char=0: 527.54 (-2118.51%) 21.62 ( 9.10%) 23.78
length=1031, char=0: 527.98 (-2136.37%) 21.62 ( 8.43%) 23.61
length=1032, char=0: 528.70 (-2139.38%) 21.62 ( 8.43%) 23.61
length=1033, char=0: 529.25 (-2124.37%) 21.62 ( 9.11%) 23.79
length=1034, char=0: 529.48 (-2142.95%) 21.62 ( 8.43%) 23.61
length=1035, char=0: 530.11 (-2145.13%) 21.62 ( 8.44%) 23.61
length=1036, char=0: 530.76 (-2147.10%) 21.79 ( 7.73%) 23.62
length=1037, char=0: 531.03 (-2149.45%) 21.62 ( 8.42%) 23.61
length=1038, char=0: 531.64 (-2151.87%) 21.62 ( 8.42%) 23.61
length=1039, char=0: 531.99 (-2151.63%) 21.80 ( 7.75%) 23.63
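
The parenthesised figures in the table appear to be run time reductions relative to the __memset_generic column; this reading is inferred from the numbers and is not stated in the benchmark output. A minimal sketch of that calculation, with reduction_percent as a hypothetical helper name:

```c
#include <assert.h>

/* Percentage reduction in run time relative to a baseline.
   Positive means the candidate is faster than the baseline,
   negative means it is slower.  */
static double
reduction_percent (double baseline, double candidate)
{
  assert (baseline > 0.0);
  return (baseline - candidate) / baseline * 100.0;
}
```

For the length=256 row, reduction_percent (17.53, 9.07) comes to roughly 48.26, matching the __memset_falkor column. The simple_memset figures differ slightly from this calculation in the last digits, presumably because the displayed timings are rounded.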
* sysdeps/aarch64/memset-reg.h: New file.
* sysdeps/aarch64/memset.S: Use it.
(__memset): Rename to MEMSET macro.
[ZVA_MACRO]: Use zva_macro.
* sysdeps/aarch64/multiarch/Makefile (sysdep_routines):
Add memset_generic and memset_falkor.
* sysdeps/aarch64/multiarch/ifunc-impl-list.c
(__libc_ifunc_impl_list): Add memset ifuncs.
* sysdeps/aarch64/multiarch/init-arch.h (INIT_ARCH): New
local variable zva_size.
* sysdeps/aarch64/multiarch/memset.c: New file.
* sysdeps/aarch64/multiarch/memset_generic.S: New file.
* sysdeps/aarch64/multiarch/memset_falkor.S: New file.
* sysdeps/aarch64/multiarch/rtld-memset.S: New file.
* sysdeps/unix/sysv/linux/aarch64/cpu-features.c
(DCZID_DZP_MASK): New macro.
(DCZID_BS_MASK): Likewise.
(init_cpu_features): Read and set zva_size.
* sysdeps/unix/sysv/linux/aarch64/cpu-features.h
(struct cpu_features): New member zva_size.
2017-11-20 18:25:04 +05:30