There are separate flags DBUG_OFF for disabling the DBUG facility
and ENABLED_DEBUG_SYNC for enabling the DEBUG_SYNC facility.
Let us allow debug builds without DEBUG_SYNC.
Note: For CMAKE_BUILD_TYPE=Debug, CMakeLists.txt will continue to
define ENABLED_DEBUG_SYNC.
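
A minimal sketch of how the two macros can be tested independently of
each other. Only the macro names DBUG_OFF and ENABLED_DEBUG_SYNC come
from the text above; the helper functions are hypothetical and exist
only for illustration.

```cpp
// Hypothetical helpers: each facility is controlled by its own macro,
// so a debug build (no DBUG_OFF) may still lack ENABLED_DEBUG_SYNC.
inline bool dbug_facility_enabled() {
#ifndef DBUG_OFF
  return true;   // DBUG is on unless explicitly disabled
#else
  return false;
#endif
}

inline bool debug_sync_facility_enabled() {
#ifdef ENABLED_DEBUG_SYNC
  return true;   // DEBUG_SYNC is on only when explicitly enabled
#else
  return false;
#endif
}
```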
* Update wsrep-lib to a version which contains the fix
* Add a deterministic test case that reproduces the assertion
Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
When a transaction fails in the certification phase, it has consumed one
GTID, but because the transaction must roll back, it does not go through
commit ordering. As a result, wsrep XID checkpointing can also happen out
of order.
This PR makes a thread that has failed certification wait for its commit
order turn before checkpointing the wsrep XID in the InnoDB rollback segment.
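
The idea of waiting for a commit order turn before checkpointing can be
sketched as follows. This is an illustration only, not the wsrep-lib
implementation: the class CommitOrder, the function checkpoint_in_order()
and the vector standing in for the rollback segment are all hypothetical.

```cpp
#include <condition_variable>
#include <cstdint>
#include <mutex>
#include <vector>

// Hypothetical commit-order monitor: seqnos pass through strictly in order.
class CommitOrder {
public:
  explicit CommitOrder(std::uint64_t first) : next_(first) {}

  // Block until it is this seqno's turn.
  void enter(std::uint64_t seqno) {
    std::unique_lock<std::mutex> lock(mutex_);
    cond_.wait(lock, [&] { return next_ == seqno; });
  }

  // Hand the turn over to the following seqno.
  void leave(std::uint64_t seqno) {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      next_ = seqno + 1;
    }
    cond_.notify_all();
  }

private:
  std::mutex mutex_;
  std::condition_variable cond_;
  std::uint64_t next_;
};

// Record the checkpoint for `seqno` only when its turn comes, regardless
// of whether the transaction committed or failed certification.
void checkpoint_in_order(CommitOrder& order,
                         std::vector<std::uint64_t>& checkpoints,
                         std::uint64_t seqno) {
  order.enter(seqno);
  checkpoints.push_back(seqno);  // stands in for writing the wsrep XID
  order.leave(seqno);
}
```

A thread that failed certification would call checkpoint_in_order() with
the seqno it consumed, so its checkpoint cannot overtake or fall behind
those of its neighbours.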
This PR adds a specific mtr test for wsrep XID checkpoint ordering:
mysql-wsrep-bugs-607.
Test galera_slave_replay also depends on this fix, as its second test phase
may also assert on bad wsrep XID checkpointing order.
galera_slave_replay.test also had other problems, which caused the test to
fail immediately; these are now fixed in this PR as well.
A certification failure followed by a clean shutdown would cause an
inconsistency between the sequence number stored in InnoDB and the
sequence number stored in the provider.
This happened both in the case of a local certification failure and in
the case where a dummy writeset is applied.
The fix consists of:
- updating the wsrep position after a dummy writeset is delivered in
`Wsrep_high_priority_service::log_dummy_write_set()`
- updating the wsrep position while releasing commit order on the
wsrep-lib side
Added two tests which stress the situation where a server is shut down
after a certification failure.
Test galera_sr_ddl_master would sometimes fail due to leftover
streaming replication fragments. The rollbacker thread would attempt to
open the streaming_log table to remove the fragments, but would fail in
check_stack_overrun(). Ultimately the check_stack_overrun() failure
was caused by the rollbacker failing to switch the victim's THD thread
stack to the rollbacker's own thread stack.
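
The stack switch can be pictured with a small scope guard. This is a
sketch under assumptions, not the code from the patch: Session is a
hypothetical stand-in for THD with only a thread_stack pointer, and
ThreadStackGuard is an illustrative name.

```cpp
// Hypothetical stand-in for THD: only the stack base pointer matters here.
struct Session {
  const char* thread_stack = nullptr;
};

// While alive, the session's stack base points at the stack of the thread
// currently operating on it; the previous value is restored on scope exit.
class ThreadStackGuard {
public:
  ThreadStackGuard(Session& session, const char* current_stack)
      : session_(session), saved_(session.thread_stack) {
    session_.thread_stack = current_stack;
  }
  ~ThreadStackGuard() { session_.thread_stack = saved_; }

  ThreadStackGuard(const ThreadStackGuard&) = delete;
  ThreadStackGuard& operator=(const ThreadStackGuard&) = delete;

private:
  Session& session_;
  const char* saved_;
};
```

With a guard like this in place, a stack overrun check that measures the
distance from the session's stack base would be computed against the
rollbacker's stack rather than the victim's.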
Also in this patch:
- Remove duplicate functionality in rollbacker helper functions,
and extract rollbacker fragment removal into function
wsrep_remove_streaming_fragments()
- Reuse open_for_write() in wsrep_schema::remove_fragments
- Partially revert changes to galera_sr_ddl_master test from
commit 44a11a7c08. Remove the unnecessary
wait condition and isolation level setting
MariaDB 10.4 was crashing when thread-handling was set to
pool-of-threads and wsrep was enabled.
There were two apparent reasons for the crash:
- Connection handling in threadpool_common.cc was missing calls to
control wsrep client state.
- Thread-specific storage which contains thread variables (THR_KEY_mysys)
was not handled appropriately by the wsrep patch when pool-of-threads
was configured.
This patch addresses the above issues in the following way:
- Wsrep client state open/close was moved into thd_prepare_connection() and
end_connection() to have common handling for one-thread-per-connection
and pool-of-threads.
- Thread-local storage handling in the wsrep patch was reworked by
introducing a set of wsrep_xxx_threadvars() calls which replace calls to
THD store_globals()/reset_globals() and deal with thread handling
specifics internally.
Wsrep-lib was updated to a version which relaxes internal
concurrency-related sanity checks.
Rollback code from wsrep_rollback_process() was extracted into separate
calls for better readability.
The post-rollback thread was removed as it was completely unused.
Galera versions below 4.x do not generate a unique sequence number
for view events. Take this into account when writing the SE checkpoint
to avoid a debug assertion in InnoDB.
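
The exact change is not shown here. As a rough illustration of the kind
of ordering check involved, the sketch below assumes a hypothetical
helper checkpoint_order_ok() and a hypothetical flag telling whether the
provider assigns unique sequence numbers to view events. Neither is a
MariaDB or wsrep-lib name.

```cpp
#include <cstdint>

// Hypothetical helper. Returns true when a checkpoint for `seqno` may
// follow a checkpoint for `last_seqno`.
inline bool checkpoint_order_ok(std::int64_t last_seqno,
                                std::int64_t seqno,
                                bool unique_view_seqno) {
  if (unique_view_seqno) {
    // Every event has its own seqno: require a strict increase.
    return seqno > last_seqno;
  }
  // Assumed older provider behaviour: a view event may repeat the
  // previous seqno, so an equal value is tolerated.
  return seqno >= last_seqno;
}
```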