wsrep_cluster_address_update() causes LOCK_wsrep_slave_threads
to be locked under LOCK_wsrep_cluster_config, while normally
the order should be the opposite.
Fix: don't protect @@wsrep_cluster_address value with the
LOCK_wsrep_cluster_config, LOCK_global_system_variables is enough.
Only protect wsrep reinitialization with the LOCK_wsrep_cluster_config.
And make it use a local copy of the global @@wsrep_cluster_address.
Also, introduce a helper function that checks whether
wsrep_cluster_address is set and also asserts that it can be safely
read by the caller.
rarely (try --repeat 1000), the following happens:
* from wsrep_bf_abort (when a thread is being killed), wsrep-lib
starts streaming_rollback that wants to
convert_streaming_client_to_applier. wsrep_create_streaming_applier
creates a new THD(). All while the other THD is being killed,
so under LOCK_thd_kill and LOCK_thd_data. In particular, THD::init()
takes LOCK_global_system_variables under LOCK_thd_kill.
* updating @@wsrep_slave_threads takes LOCK_global_system_variables
and LOCK_wsrep_cluster_config (in that order) and invokes
wsrep_slave_threads_update() that takes LOCK_wsrep_slave_threads
* wsrep_replication_process() takes LOCK_wsrep_slave_threads and
invokes wsrep_close_applier(), that does thd->set_killed() which
takes LOCK_thd_kill.
et voilà.
As a fix I copied a workaround from wsrep_cluster_address_update()
to wsrep_slave_threads_update(). It seems to be safe: without mutexes
a race condition is possible and a concurrent SET might change
wsrep_slave_threads, but wsrep_slave_threads_update() always verifies
if there's a need to do something, so it will not run twice in this case,
it'll be a no-op.
Added new enum variable `wsrep_mode` which can be used to turn on WSREP
features which are not part of default behaviour.
Added enum `BINLOG_ROW_FORMAT_ONLY`, `REQUIRED_PRIMARY_KEY` and
`STRICT_REPLICATION`. `wsrep-mode=STRICT_REPLICATION` behaves
like variable `wsrep_strict_ddl`.
Variable wsrep_strict_ddl is deprecated and if set we use
new wsrep_mode setting instead.
Reviewed and improved by: Jan Lindström <jan.lindstrom@mariadb.com>
We need to complete SST if both new and old start positions are
not same as initial positions. If they are initial positions
just set local uuid and seqno.
There were multiple problems here
* wsrep_trx_fragment_size should not be set when wsrep is disabled or provider is not loaded
* wsrep_trx_fragment_unit should not be set when wsrep is disabled or provider is not loaded
* wsrep_debug has no effect if wsrep is disabled or provider is not loaded
* wsrep_start_position should not be set when wsrep is disabled or provider is not loaded any other value than default
* wsrep_start_position should be changed only when we are joiner or initialized
* wsrep_start_position should be allowed to set only a value that exits, thus
we need to add error handling to wsrep_sst_complete
Actual assertion mentioned on MDEV seems to be already fixed but
setting seqno to -2 will trigger a different assertion
mysqld: /home/jan/mysql/10.4-bugs/wsrep-lib/src/server_state.cpp:702: void wsrep::server_state::sst_received(wsrep::client_service&, int): Assertion `state_ == s_joiner || state_ == s_initialized' failed.
Fixed this by not allowing user to set seqno < -1 (-1 is special
seqno meaning undefined and seqno is initialized to it). MariaDB
releases 10.2 and 10.3 already do not allow to set seqno < -1.
Galera parameter wsrep_gtid_domain_id was defined using a class where
actual parameter was not a first member. Fixed this by using normal
variable and assigning this value to class member value.
The setting innodb_lock_schedule_algorithm=VATS that was introduced
in MDEV-11039 (commit 021212b525)
causes conflicting exclusive locks to be incorrectly granted to
two transactions. Specifically, in lock_rec_insert_by_trx_age()
the predicate !lock_rec_has_to_wait_in_queue(in_lock) would hold even
though an active transaction is already holding an exclusive lock.
This was observed between two DELETE of the same clustered index record.
The HASH_DELETE invocation in lock_rec_enqueue_waiting() may be related.
Due to lack of progress in diagnosing the problem, we will remove the
option. The unsafe option was enabled by default between
commit 0c15d1a6ff (MariaDB 10.2.3)
and the parent of
commit 1cc1d0429d (MariaDB 10.2.17, 10.3.9),
and it was deprecated in
commit 295e2d500b (MariaDB 10.2.34).
Some invalid wsrep_provider paths may be interpreted as a valid
directory. For example '/invalid/libgalera_smm.so' with UTF character
set is interpreted as '/', which is a valid directory. A early check
that wsrep_provider should not be a directory fixes it.
If the server is compiled WITH_WSREP=OFF, we should avoid evaluating
conditions on a global variable that is constant.
WSREP_ON_: Renamed from WSREP_ON. Defined only WITH_WSREP=ON.
WSREP_ON: Defined as unlikely(WSREP_ON_).
wsrep_on(): Defined as WSREP_ON && wsrep_service->wsrep_on_func().
The reason why we have wsrep_on() at all is that the macro WSREP(thd)
depends on the definition of THD, and that is intentionally an opaque
data type for InnoDB. So, we cannot avoid invoking wsrep_on(), but
we can evaluate the less expensive condition WSREP_ON before calling
the function.
We need to release global system variables mutex before
doing wsrep_init to avoid race with next show status and
we need to save wsrep_on value as it is changed on wsrep_init.
Added test case.
Support for galera GTID consistency thru cluster. All nodes in cluster
should have same GTID for replicated events which are originating from cluster.
Cluster originating commands need to contain sequential WSREP GTID seqno
Ignore manual setting of gtid_seq_no=X.
In master-slave scenario where master is non galera node replicated GTID is
replicated and is preserved in all nodes.
To have this - domain_id, server_id and seqnos should be same on all nodes.
Node which bootstraps the cluster, to achieve this, sends domain_id and
server_id to other nodes and this combination is used to write GTID for events
that are replicated inside cluster.
Cluster nodes that are executing non replicated events are going to have different
GTID than replicated ones, difference will be visible in domain part of gtid.
With wsrep_gtid_domain_id you can set domain_id for WSREP cluster.
Functions WSREP_LAST_WRITTEN_GTID, WSREP_LAST_SEEN_GTID and
WSREP_SYNC_WAIT_UPTO_GTID now works with "native" GTID format.
Fixed galera tests to reflect this chances.
Add variable to manually update WSREP GTID seqno in cluster
Add variable to manipulate and change WSREP GTID seqno. Next command
originating from cluster and on same thread will have set seqno and
cluster should change their internal counter to it's value.
Behavior is same as using @@gtid_seq_no for non WSREP transaction.
Problem was that tests select INFORMATION_SCHEMA.PROCESSLIST processes
from user system user and empty state. Thus, there is not clear
state for slave threads.
Changes:
- Added new status variables that store current amount of applier threads
(wsrep_applier_thread_count) and rollbacker threads
(wsrep_rollbacker_thread_count). This will make clear how many slave threads
of certain type there is.
- Added THD state "wsrep applier idle" when applier slave thread is
waiting for work. This makes finding slave/applier threads easier.
- Added force-restart option for mtr to always restart servers between tests
to avoid race on start of the test
- Added wait_condition_with_debug to wait until the passed statement returns
true, or the operation times out. If operation times out, the additional error
statement will be executed
Changes to be committed:
new file: mysql-test/include/force_restart.inc
new file: mysql-test/include/wait_condition_with_debug.inc
modified: mysql-test/mysql-test-run.pl
modified: mysql-test/suite/galera/disabled.def
modified: mysql-test/suite/galera/r/MW-336.result
modified: mysql-test/suite/galera/r/galera_kill_applier.result
modified: mysql-test/suite/galera/r/galera_var_slave_threads.result
new file: mysql-test/suite/galera/t/MW-336.cnf
modified: mysql-test/suite/galera/t/MW-336.test
modified: mysql-test/suite/galera/t/galera_kill_applier.test
modified: mysql-test/suite/galera/t/galera_parallel_autoinc_largetrx.test
modified: mysql-test/suite/galera/t/galera_parallel_autoinc_manytrx.test
modified: mysql-test/suite/galera/t/galera_var_slave_threads.test
modified: mysql-test/suite/wsrep/disabled.def
modified: mysql-test/suite/wsrep/r/variables.result
modified: mysql-test/suite/wsrep/t/variables.test
modified: sql/mysqld.cc
modified: sql/wsrep_mysqld.cc
modified: sql/wsrep_mysqld.h
modified: sql/wsrep_thd.cc
modified: sql/wsrep_var.cc
Refactored wsrep patch to not use LOCK_thread_count and COND_thread_count anymore.
This has partially been replaced by using old LOCK_wsrep_slave_threads mutex.
For slave thread count change waiting, new COND_wsrep_slave_threads signal has been added
Added LOCK_wsrep_cluster_config mutex to control that cluster address change cannot happen in parallel
Protected wsrep_slave_threads variable changes with LOCK_cluster_config mutex
This is for avoiding concurrent slave thread count and cluster joining operations to happen
Fixes according to Teemu's review
Global variable wsrep_debug now can be used to filter wsrep-lib messages based on debug level provided.
Type of wsrep_debug is now set to be unsigned int, so tests and configuration files changed accordingly.
Problem was that controlling connection i.e. connection that
executed the query SET GLOBAL wsrep_reject_queries = ALL_KILL;
was also killed but server would try to send result from that
query to controlling connection resulting a assertion
mysqld: /home/jan/mysql/10.2-sst/include/mysql/psi/mysql_socket.h:738: inline_mysql_socket_send: Assertion `mysql_socket.fd != -1' failed.
as socket was closed when controlling connection was closed.
wsrep_close_client_connections()
Do not close controlling connection and instead of
wsrep_close_thread() we do now soft kill by THD::awake
wsrep_reject_queries_update()
Call wsrep_close_client_connections using current thd.
Problem was that controlling connection i.e. connection that
executed the query SET GLOBAL wsrep_reject_queries = ALL_KILL;
was also killed but server would try to send result from that
query to controlling connection resulting a assertion
mysqld: /home/jan/mysql/10.2-sst/include/mysql/psi/mysql_socket.h:738: inline_mysql_socket_send: Assertion `mysql_socket.fd != -1' failed.
as socket was closed when controlling connection was closed.
wsrep_close_client_connections()
Do not close controlling connection and instead of
wsrep_close_thread() we do now soft kill by THD::awake
wsrep_reject_queries_update()
Call wsrep_close_client_connections using current thd.