Backport of 2f5174e556:
MDEV-33075 Resolve server shutdown issues on macOS, Solaris, and FreeBSD.
This commit addresses multiple server shutdown problems observed on macOS,
Solaris, and FreeBSD:
1. Corrected a non-portable assumption where socket shutdown was expected
to wake up poll() with listening sockets in the main thread.
Use more robust self-pipe to wake up poll() by writing to the pipe's write
end.
Signed-off-by: Ivan Prisyazhnyy <john.koepi@gmail.com>
Backport of 2f5174e556:
MDEV-33075 Resolve server shutdown issues on macOS, Solaris, and FreeBSD.
This commit addresses multiple server shutdown problems observed on macOS,
Solaris, and FreeBSD:
2. Fixed a random crash on macOS from pthread_kill(signal_handler)
when the signal_handler was detached and the thread had already exited.
Use more robust `kill(getpid(), SIGTERM)` to wake up the signal handler
thread.
Additionally, the shutdown code underwent light refactoring
for better readability and maintainability:
- Modified `break_connect_loop()` to no longer wait for the main thread,
aligning behavior with Windows (since 10.4).
Backport of 2f5174e556:
MDEV-33075 Resolve server shutdown issues on macOS, Solaris, and FreeBSD.
This commit addresses multiple server shutdown problems observed on macOS,
Solaris, and FreeBSD:
3. Made sure, that signal handler thread always exits once `abort_loop` is
set, and also calls `my_thread_end()` and clears `signal_thread_in_use`
when exiting.
This fixes warning "1 thread did not exit" by `my_global_thread_end()`
seen on FreeBSD/macOS when the process is terminated via signal.
Additionally, the shutdown code underwent light refactoring
for better readability and maintainability:
- Removed dead code related to the unused `USE_ONE_SIGNAL_HAND`
preprocessor constant.
Signed-off-by: Ivan Prisyazhnyy <john.koepi@gmail.com>
Backport of 2f5174e556:
MDEV-33075 Resolve server shutdown issues on macOS, Solaris, and FreeBSD.
Eliminated support for `#ifndef HAVE_POLL` in `handle_connection_sockets`
This code is also dead, since 10.4
Signed-off-by: Ivan Prisyazhnyy <john.koepi@gmail.com>
Partial commit of the greater MDEV-34348 scope.
MDEV-34348: MariaDB is violating clang-16 -Wcast-function-type-strict
The functions queue_compare, qsort2_cmp, and qsort_cmp2
all had similar interfaces, and were used interchangable
and unsafely cast to one another.
This patch consolidates the functions all into the
qsort_cmp2 interface.
Reviewed By:
============
Marko Mäkelä <marko.makela@mariadb.com>
Set select_thread_in_use only when we're about to enter into the polling
loop, not sooner, allowing early proces aborts to exist cleanly: the
process won't be waiting for a polling loop that isn't yet polling.
Each time a listener socket becomes ready, MariaDB calls accept() ten
times (MAX_ACCEPT_RETRY), even if all but the first one return EAGAIN
because there are no more connections. This causes unnecessary CPU
usage - on our server, the CPU load of that thread, which does nothing
but accept(), saturates one CPU core by ~45%. The loop should stop
after the first EAGAIN.
Perf report:
11.01% mariadbd libc.so.6 [.] accept4
6.42% mariadbd [kernel.kallsyms] [k] finish_task_switch.isra.0
5.50% mariadbd [kernel.kallsyms] [k] _raw_spin_unlock_irqrestore
5.50% mariadbd [kernel.kallsyms] [k] syscall_enter_from_user_mode
4.59% mariadbd [kernel.kallsyms] [k] __fget_light
3.67% mariadbd [kernel.kallsyms] [k] kmem_cache_alloc
2.75% mariadbd [kernel.kallsyms] [k] fput
2.75% mariadbd [kernel.kallsyms] [k] mod_objcg_state
1.83% mariadbd [kernel.kallsyms] [k] __inode_wait_for_writeback
1.83% mariadbd [kernel.kallsyms] [k] __sys_accept4
1.83% mariadbd [kernel.kallsyms] [k] _raw_spin_unlock_irq
1.83% mariadbd [kernel.kallsyms] [k] alloc_inode
1.83% mariadbd [kernel.kallsyms] [k] call_rcu
The server does not log errors after startup when it is started without the
--console parameter and not as a service. This issue arises due to an
undocumented behavior of FreeConsole() in Windows when only a single
process (mariadbd/mysqld) is attached to it, causing the window to close.
In this case stderr is redirected to a file before FreeConsole()
is called. Procmon shows FreeConsole closing file handle
subsequent writes to stderr fail with ERROR_INVALID_HANDLE because
WriteFile() cannot operate on the closed handle. This results in losing
all messages after startup, including warnings, errors, notes, and
crash reports.
Additionally, some users reported stderr being redirected to
multi-master.info and failing at startup, but this could not be reproduced
here.
The workaround involves calling FreeConsole() right before the redirection of
stdout/stderr. This fix has been tested with XAMPP and via cmd.exe using
"start mysqld". Automated testing using MTR is challenging for this case.
The fix is only applicable to version 10.5. In later versions, the
FreeConsole() call has been removed.
The feedback plugin server_uid variable and the calculate_server_uid()
function is moved from feedback/utils.cc to sql/mysqld.cc
server_uid is added as a global variable (shown in 'show variables') and
is written to the error log on server startup together with server version
and server commit id.
We have an issue if a user have the following in a configuration file:
log_slow_filter="" # Log everything to slow query log
log_queries_not_using_indexes=ON
This set log_slow_filter to 'not_using_index' which disables
slow_query_logging of most queries.
In effect, on should never use log_slow_filter="" in config files but
instead use log_slow_filter=ALL.
Fixed by changing log_slow_filter="" that comes either from a
configuration file or from the command line, when starting to the server,
to log_slow_filter=ALL.
A warning will be printed when this happens.
Other things:
- One can now use =ALL for any 'set' variable to set all options at once.
(backported from 10.6)
Immediately close down the signal handler loop when we decide to
break connections as it's the start of process termination
anyway, and there's no need to wait once we've invoked break_connections.
Improve detection for DES support in OpenSSL, to allow compilation
against system OpenSSL without DES.
Note that MariaDB needs to be compiled against OpenSSL-like library
that itself has DES support which cmake detected. Positive detection
is indicated with CMake variable HAVE_des 1.
Signed-off-by: Dimitri John Ledkov <dimitri.ledkov@surgut.co.uk>
- Add additional MTRs for more coverage on invalid options
- Updating a few error messages to be more informative
- Use the exit code from handle_options() when there is an error processing
user options
All new code of the whole pull request, including one or several files that are
either new files or modified ones, are contributed under the BSD-new license. I
am contributing on behalf of my employer Amazon Web Services, Inc.
The signal handler thread can use various different runtime
resources when processing a SIGHUP (e.g. master-info information)
due to calling into reload_acl_and_cache(). Currently, the shutdown
process waits for the termination of the signal thread after
performing cleanup. However, this could cause resources actively
used by the signal handler to be freed while reload_acl_and_cache()
is processing.
The specific resource that caused MDEV-30260 is a race condition for
the hostname_cache, such that mysqld would delete it in
clean_up()::hostname_cache_free(), before the signal handler would
use it in reload_acl_and_cache()::hostname_cache_refresh().
Another similar resource is the active_mi/master_info_index. There
was a race between its deletion by the main thread in end_slave(),
and their usage by the Signal Handler as a part of
Master_info_index::flush_all_relay_logs.read(active_mi) in
reload_acl_and_cache().
This patch fixes these race conditions by relocating where server
shutdown waits for the signal handler to die until after
server-level threads have been killed (i.e., as a last step of
close_connections()). With respect to the hostname_cache, active_mi
and master_info_cache, this ensures that they cannot be destroyed
while the signal handler is still active, and potentially using
them.
Additionally:
1) This requires that Events memory is still in place for SIGHUP
handling's mysql_print_status(). So event deinitialization is moved
into clean_up(), but the event scheduler still needs to be stopped
in close_connections() at the same spot.
2) The function kill_server_thread is no longer used, so it is
deleted
3) The timeout to wait for the death of the signal thread was not
consistent with the comment. The comment mentioned up to 10 seconds,
whereas it was actually 0.01s. The code has been fixed to wait up to
10 seconds.
4) A warning has been added if the signal handler thread fails to
exit in time.
5) Added pthread_join() to end of wait_for_signal_thread_to_end()
if it hadn't ended in 10s with a warning. Note this also removes
the pthread_detached attribute from the signal_thread to allow
for the pthread_join().
Reviewed By:
===========
Vladislav Vaintroub <wlad@mariadb.com>
Andrei Elkin <andrei.elkin@mariadb.com>
signal_hand(): Remove the cmake -DWITH_DBUG_TRACE=ON instrumentation.
It can cause a crash on shutdown when the only other thread is
waiting in wait_for_signal_thread_to_end().
This removes the error:
"Failed to load slave replication state from table mysql.gtid_slave_pos:
1017: Can't find file: './mysql/' (errno: 2 "No such file or directory")
my_malloc_size_cb_func() can be called from contexts where it is not safe to
wait for LOCK_thd_kill, for example while holding LOCK_plugin. This could
lead to (probably very unlikely) deadlock of the server.
Fix by skipping the enforcement of --max-session-mem-used in the rare cases
when LOCK_thd_kill cannot be obtained. The limit will instead be enforced on
the following memory allocation. This does not significantly degrade the
behaviour of --max-session-mem-used; that limit is in any case only enforced
"softly", not taking effect until the next point at which the thread does a
check_killed().
Reviewed-by: Monty <monty@mariadb.org>
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
This test was prone to failures for a few reasons, summarized below:
1) MDEV-32168 introduced “only_running_threads=1” to
slave_stop.inc, which allowed the stop logic to bypass an
attempting-to-reconnect IO thread. That is, the IO thread could
realize the master shutdown in `read_event()`, and thereby call into
`try_to_reconnect()`. This would leave the IO thread up when the
test expected it to be stopped. Fixed by explicitly stopping the
IO thread and allowing an error state, as the above case would
lead to errno 2003.
2) On slow systems (or those running profiling tools, e.g. MSAN),
the waiting-for-ack transaction can complete before the system
processes the `SHUTDOWN WAIT FOR ALL SLAVES`. There was shutdown
preparation logic in-between the transaction and shutdown itself,
which contributes to this problem. This patch also moves this
preparation logic before the transaction, so there is less to do
in-between the calls.
3) Changed work-around for MDEV-28141 to use debug_sync instead
of sleep delay, as it was still possible to hit the bug on very
slow systems.
4) Masked MTR variable reset with disable/enable query log
Reviewed By:
============
Kristian Nielsen <knielsen@knielsen-hq.org>
In commit b4ff64568c the
signature of mysql_show_var_func was changed, but not all functions
of that type were adjusted.
When the server is configured with `cmake -DWITH_ASAN=ON` and
compiled with clang, runtime errors would be flagged for invoking
functions through an incompatible function pointer.
Reviewed by: Michael 'Monty' Widenius
Remove TLSv1.1 from the default tls_version system variable.
Output a warning if TLSv1.0 or TLSv1.1 are selected.
Thanks Tingyao Nian for the feature request.
Create test for for case insensitive gives a basic warning on creating
a test file and the next thing a user might see is an abort.
ProtectHome and other systemd setting protect system services from
accessing user data. Unfortunately some of our users do put things
on /home due space or other reasons.
Rather than enumberate the systemd options in a very clunkly fragile
way we put an error associated with the "Can't create test file" and
hope the user can work it out from there.
%M tip thanks Sergei.
This patch is the result of running
run-clang-tidy -fix -header-filter=.* -checks='-*,modernize-use-equals-default' .
Code style changes have been done on top. The result of this change
leads to the following improvements:
1. Binary size reduction.
* For a -DBUILD_CONFIG=mysql_release build, the binary size is reduced by
~400kb.
* A raw -DCMAKE_BUILD_TYPE=Release reduces the binary size by ~1.4kb.
2. Compiler can better understand the intent of the code, thus it leads
to more optimization possibilities. Additionally it enabled detecting
unused variables that had an empty default constructor but not marked
so explicitly.
Particular change required following this patch in sql/opt_range.cc
result_keys, an unused template class Bitmap now correctly issues
unused variable warnings.
Setting Bitmap template class constructor to default allows the compiler
to identify that there are no side-effects when instantiating the class.
Previously the compiler could not issue the warning as it assumed Bitmap
class (being a template) would not be performing a NO-OP for its default
constructor. This prevented the "unused variable warning".