Fixes SIGILL in MariaDB Server "main.non_blocking_api" test case on
aarch64 processors supporting Branch Target Identification (BTI).
Branch Target Identification is a new aarch64 feature that prevents
jumping to arbitrary code locations that are not properly marked as
branch targets.
The "-mbranch-protection=standard" flag to GCC will enable code
generation using BTI. This is backwards compatible as the added
instructions are no-ops on older processors.
One example where both "-mbranch-protection=standard" is enabled and the
hardware supports BTI is Amazon Linux 2023 on Graviton 4.
The symptom is the following mtr main.non_blocking_api failure:
mysqltest got signal 4
read_command_buf (0xaaaae6ae27a8): connect
(con_nonblock,localhost,root,,test)
conn->name (0xaaaae6b0c0b8):
Attempting backtrace...
stack_bottom = 0x0 thread_stack 0x3c000
/usr/bin/mariadb-test(my_print_stacktrace+0x48)[0xaaaab6934088]
bits/stdio2.h:79(signal_handler)[0xaaaab68f41a0]
addr2line: 'linux-vdso.so.1': No such file
linux-vdso.so.1(__kernel_rt_sigreturn+0x0)[0xffffa22bc830]
/usr/bin/mariadb-test(my_context_spawn+0x74)[0xaaaab6914d84]
GDB reports that the SIGILL was raised on a perfectly legal 'mov'
instruction:
Program received signal SIGILL, Illegal instruction.
0x0000aaaaaab44d84 in my_context_spawn (c=0xaaaaab07ada8, f=0xfffff7fed240, d=0x0) at
mariadb1011-10.11.11-1.amzn2023.0.1.aarch64/libmariadb/libmariadb/ma_context.c:674
674 __asm__ __volatile__
(gdb) x/i$pc
=> 0xaaaaaab44d84 <my_context_spawn(my_context*, void (*)(void*),
void*)+116>: mov w0, #0x1 // #1
The call sequence to get to this point is as follows:
Breakpoint 1, my_context_spawn (c=0xaaaaab07ada8, f=0xaaaaaab38024
<mysql_real_connect_start_internal(void*)>, d=0xffffffffc648) at
libmariadb/ma_context.c:661
661 register void *stack asm("x13") = c->stack_top;
(gdb) c
Continuing.
Breakpoint 2, my_context_yield (c=0xaaaaab07ada8) at
libmariadb/ma_context.c:843
843 register const uint64_t *save asm("x19") = &c->save[0];
Continuing from here leads to the SIGILL above.
The branch that ends up causing the SIGILL is the "br x11" instruction
in my_context_yield(). This is due to the branch being to a location
that is not marked as a branch target.
The fix is to ensure that these locations in the my_context aarch64
assembler are marked with "bti j" instructions, so that they are valid
targets for the 'br' instructions that jump to them.
Since older compilers don't know about the instruction, we need an ifdef
to define a string that has the raw opcode in it.
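A minimal sketch of such an ifdef, not the exact code from the patch. The
names MY_BTI_J, my_bti_j_marker() and a64_hint_encoding() are hypothetical,
and gating on the ACLE macro __ARM_FEATURE_BTI_DEFAULT is an assumption.
The helper shows why the raw opcode is harmless on older processors: "bti j"
is HINT #36, which lives in the same encoding space as NOP (HINT #0).

```c
#include <stdint.h>

/*
 * A64 HINT #imm instructions encode as 0xd503201f with the 7-bit
 * immediate placed in bits 11:5. HINT #0 is NOP and HINT #36 is "bti j".
 */
uint32_t a64_hint_encoding(unsigned imm)
{
  return 0xd503201fu | ((uint32_t) (imm & 0x7fu) << 5);
}

/*
 * MY_BTI_J is a hypothetical name. The raw opcode is emitted with .inst
 * so that assemblers predating the "bti" mnemonic still accept it.
 * Assumption: BTI markers are only wanted when the compiler itself was
 * asked to generate BTI-protected code.
 */
#if defined(__aarch64__) && defined(__ARM_FEATURE_BTI_DEFAULT)
#define MY_BTI_J ".inst 0xd503249f\n\t"   /* bti j */
#else
#define MY_BTI_J ""
#endif

/* Returns the assembler text to paste at each label reached through "br". */
const char *my_bti_j_marker(void)
{
  return MY_BTI_J;
}
```

In the inline assembler the string would be concatenated directly after each
label that a 'br' instruction can reach, for example "1:\n\t" MY_BTI_J.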
Fixes: d2285fb830
Some research shows that X18 is documented as a platform-reserved
register on most non-Linux platforms, including macOS, Windows, and
FreeBSD. So only put it in the clobber list on Linux.
Note that the ma_context.c code does not itself use the X18 register
in any way. On platforms where X18 is reserved, the co-routine code
will preserve it. On platforms where co-routine code can modify X18,
it does not need to be preserved. Putting X18 in the clobber list is
only to avoid GCC itself generating code that requires that X18 is
preserved.
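The conditional clobber could be sketched as below. MY_X18_CLOBBER,
MY_X18_IS_CLOBBERED and my_x18_is_clobbered() are hypothetical names, and
the exact preprocessor test is an assumption: Android also defines
__linux__, so it is excluded explicitly here because X18 is
platform-reserved there.

```c
/* Hypothetical macro names; the platform test is an assumption. */
#if defined(__linux__) && !defined(__ANDROID__)
#define MY_X18_CLOBBER "x18",
#define MY_X18_IS_CLOBBERED 1
#else
#define MY_X18_CLOBBER
#define MY_X18_IS_CLOBBERED 0
#endif

/* Reports whether this build tells the compiler that X18 may change. */
int my_x18_is_clobbered(void)
{
#if defined(__aarch64__)
  /*
   * Empty asm statement that only illustrates where the macro goes in a
   * clobber list; the real co-routine switch code would be here.
   */
  __asm__ __volatile__("" : : : MY_X18_CLOBBER "memory");
#endif
  return MY_X18_IS_CLOBBERED;
}
```

Because the macro expands either to nothing or to a string followed by a
comma, it can be placed in front of any other clobber without changing the
rest of the list.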
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
X18 is a platform-reserved register on Android, not a callee-save
register. So it will not be touched by the spawned/resumed co-routine
and must not be included in the GCC asm clobber list on this platform.
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
Also CONC-754.
Depending on compiler options (e.g. -fno-dwarf2-cfi-asm), the compiler may
not output .cfi_startproc / .cfi_endproc in the generated assembler, and this
causes a build error on the .cfi_escape directive put in my_context_spawn()
on systems with DWARF support.
Fix by using the proper preprocessor macro __GCC_HAVE_DWARF2_CFI_ASM to test
for .cfi_escape support, rather than a hand-crafted check for various compiler
brands and versions. This macro is only available in clang since version
13.0.0, however, so unconditionally include the .cfi_escape in earlier clang
versions.
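A sketch of the resulting check, under stated assumptions. The names
MY_HAVE_CFI_ESCAPE, MY_CFI_ESCAPE and my_cfi_escape_directive() are
hypothetical, and the directive payload is only illustrative (0x07 is
DW_CFA_undefined and 16 is the return address register in the x86-64 DWARF
numbering); it is not taken from the patch.

```c
/*
 * Emit .cfi_escape only when the compiler itself emits CFI directives,
 * or on clang older than 13, which does not define the macro at all.
 */
#if defined(__GCC_HAVE_DWARF2_CFI_ASM) || \
    (defined(__clang__) && __clang_major__ < 13)
#define MY_HAVE_CFI_ESCAPE 1
#define MY_CFI_ESCAPE ".cfi_escape 0x07, 16\n\t"  /* illustrative payload */
#else
#define MY_HAVE_CFI_ESCAPE 0
#define MY_CFI_ESCAPE ""
#endif

/* Returns the text that would be pasted into the inline assembler. */
const char *my_cfi_escape_directive(void)
{
  return MY_CFI_ESCAPE;
}
```

With -fno-dwarf2-cfi-asm the macro is not defined, so the string is empty
and the assembler never sees a .cfi_escape outside of a
.cfi_startproc / .cfi_endproc pair.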
Thanks to Rainer Orth for the suggested fix.
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
Follow-up patch to fix a copy-paste error that caused registers to be
restored incorrectly in my_context_continue, which could cause crashes on
arm64.
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
The non-blocking API has native (assembler) implementations for x86_64,
i386, and (with recent patch) aarch64; these implementations are the most
efficient. For other architectures, a fallback to ucontext is supported.
But ucontext is not the most efficient, and it is not available on all
platforms (it has been deprecated in POSIX). The boost::context library
provides an alternative fallback that is available on more architectures and
should be more efficient than ucontext (if still not quite as fast as the
native support).
This patch adds a CMake option -DWITH_BOOST_CONTEXT=ON that adds
boost::context as a dependency of libmariadb to provide a fallback on
non-natively supported architectures. Boost::context is preferred over
ucontext when both are available.
The option is off by default and must be explicitly enabled by the
user. This avoids introducing a C++ dependency (including dependency
on a C++ compiler and on libstdc++) unless explicitly requested by the
user (libmariadb is otherwise C-only).
Tested-by: Brad Smith <brad@comstyle.com>
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
Implement native my_context for arm64 (aarch64). This is more
efficient than ucontext, and also makes the non-blocking API available
on arm64 platforms that do not have ucontext such as OpenBSD.
Tested-by: Brad Smith <brad@comstyle.com>
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
Since there is no way in the ISO C standard to specify a
non-obsolescent function prototype indicating that a
function will be called with an arbitrary number (including
zero) of arguments of arbitrary types, we have to cast the
callback function in makecontext() call to avoid compiler
warnings/errors.
See also:
https://pubs.opengroup.org/onlinepubs/009695399/functions/makecontext.html
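A minimal sketch of the cast, assuming a platform that provides
<ucontext.h> (for example Linux with glibc). The names entry_point() and
run_on_context() are hypothetical and are not taken from ma_context.c.

```c
#include <stdlib.h>
#include <ucontext.h>

static ucontext_t caller_ctx, callee_ctx;
static int observed;

/* The callback takes an int, so its type is void (*)(int). */
static void entry_point(int value)
{
  observed = value;
  /* Returning from here resumes the context stored in uc_link. */
}

/* Runs entry_point(value) on its own stack; returns what it saw, or -1. */
int run_on_context(int value)
{
  size_t stack_size = 64 * 1024;
  void *stack = malloc(stack_size);

  if (!stack)
    return -1;
  observed = -1;
  if (getcontext(&callee_ctx) != 0)
  {
    free(stack);
    return -1;
  }
  callee_ctx.uc_stack.ss_sp = stack;
  callee_ctx.uc_stack.ss_size = stack_size;
  callee_ctx.uc_link = &caller_ctx;

  /* makecontext() is declared to take void (*)(void), hence the cast. */
  makecontext(&callee_ctx, (void (*)(void)) entry_point, 1, value);

  if (swapcontext(&caller_ctx, &callee_ctx) != 0)
  {
    free(stack);
    return -1;
  }
  free(stack);
  return observed;
}
```

Calling run_on_context(42) switches to the new context, runs the callback
with the int argument, and switches back when the callback returns.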