If we have a 2+ node cluster which is replicating from an async master
and the binlog_format is set to STATEMENT and multi-row inserts are executed
on a table with an auto_increment column such that values are automatically
generated by MySQL, then the server node generates wrong auto_increment
values, which are different from what was generated on the async master.
In the title of the MDEV-9519 it was proposed to ban start slave on a Galera
if master binlog_format = statement and wsrep_auto_increment_control = 1,
but the problem can be solved without such a restriction.
The causes and fixes:
1. We need to improve processing of changing the auto-increment values
after changing the cluster size.
2. If wsrep auto_increment_control switched on during operation of
the node, then we should immediately update the auto_increment_increment
and auto_increment_offset global variables, without waiting of the next
invocation of the wsrep_view_handler_cb() callback. In the current version
these variables retain its initial values if wsrep_auto_increment_control
is switched on during operation of the node, which leads to inconsistent
results on the different nodes in some scenarios.
3. If wsrep auto_increment_control switched off during operation of the node,
then we must return the original values of the auto_increment_increment and
auto_increment_offset global variables, as the user has set. To make this
possible, we need to add a "shadow copies" of these variables (which stores
the latest values set by the user).
https://jira.mariadb.org/browse/MDEV-9519
This is used for controlling whether to use a new/optimized
certification rules or the old/classic ones that could cause more
certification failures - when foreign keys are used and two INSERTs are
done concurrently to the child table from different nodes.
(cherry picked from commit 815d73e6af8daace6262ab63ca6c043ffc4204b3)
If we have a 2+ node cluster which is replicating from an async master
and the binlog_format is set to STATEMENT and multi-row inserts are executed
on a table with an auto_increment column such that values are automatically
generated by MySQL, then the server node generates wrong auto_increment
values, which are different from what was generated on the async master.
The causes and fixes:
1. We need to improve processing of changing the auto-increment values
after changing the cluster size.
2. If wsrep auto_increment_control switched on during operation of
the node, then we should immediately update the auto_increment_increment
and auto_increment_offset global variables, without waiting of the next
invocation of the wsrep_view_handler_cb() callback. In the current version
these variables retain its initial values if wsrep_auto_increment_control
is switched on during operation of the node, which leads to inconsistent
results on the different nodes in some scenarios.
3. If wsrep auto_increment_control switched off during operation of the node,
then we must return the original values of the auto_increment_increment and
auto_increment_offset global variables, as the user has set. To make this
possible, we need to add a "shadow copies" of these variables (which stores
the latest values set by the user).
The upper 1M limit for max_prepared_stmt_count was set over 10 years
ago. It doesn't suite current hardware and a sysbench oltp_read_write
test with 512 threads will hit this limit.
MDEV--15609 engines/funcs.crash_manytables_number crashes with error 24
(too many open files)
MDEV-10286 Adjustment of table_open_cache according to system limits
does not work when open-files-limit option is provided
Fixed by adjusting tc_size downwards if there is not enough file
descriptors to use.
Other changes:
- Ensure that there is 30 (was 10) extra file descriptors for other usage
- Decrease TABLE_OPEN_CACHE_MIN to 200 as it's better to have a smaller
table cache than getting error 24
- Increase minimum of max_connections and table_open_cache from 1 to 10
as 1 is not usable for any real application, only for testing.
This is 10.1 version where no merge error exists.
wsrep_on_check
New check function. Galera can't be enabled
if innodb-lock-schedule-algorithm=VATS.
innobase_kill_query
In Galera async kill we could own lock mutex.
innobase_init
If Variance-Aware-Transaction-Sheduling Algorithm (VATS) is
used on Galera we refuse to start InnoDB.
Changed innodb-lock-schedule-algorithm as read-only parameter
as it was designed to be.
lock_rec_other_has_expl_req,
lock_rec_other_has_conflicting,
lock_rec_lock_slow
lock_table_other_has_incompatible
lock_rec_insert_check_and_lock
Change pointer to conflicting lock to normal pointer as this
pointer contents could be changed later.
This patch fixes two problems that may arise when changing the
value of wsrep_slave_threads:
1) Threads may be leaked if wsrep_slave_threads is changed
repeatedly. Specifically, when changing the number of slave
threads, we keep track of wsrep_slave_count_change, the
number of slaves to start / stop. The problem arises when
wsrep_slave_count_change is updated before slaves had a
chance to exit (threads may take some time to exit, as they
exit only after commiting one more replication event).
The fix is to update wsrep_slave_count_change such that it
reflects the number of threads that already exited or are
scheduled to exit.
2) Attempting to set out of range value for wsrep_slave_threads
(below 1 / above 512) results in wsrep_slave_count_change to
be computed based on the out of range value, even though a
warning is generated and wsrep_slave_threads is set to a
truncated value. wsrep_slave_count_change is computed in
wsrep_slave_threads_check(), which is called before mysql
checks for valid range. Fix is to update wsrep_count_change
whenever wsrep_slave_threads is updated with a valid value.
The reason for this is that stop slave takes LOCK_active_mi over the
whole operation while some slave operations will also need LOCK_active_mi
which causes deadlocks.
Fixed by introducing object counting for Master_info and not taking
LOCK_active_mi over stop slave or even stop_all_slaves()
Another benefit of this approach is that it allows:
- Multiple threads can run SHOW SLAVE STATUS at the same time
- START/STOP/RESET/SLAVE STATUS on a slave will not block other slaves
- Simpler interface for handling get_master_info()
- Added some missing unlock of 'log_lock' in error condtions
- Moved rpl_parallel_inactivate_pool(&global_rpl_thread_pool) to end
of stop_slave() to not have to use LOCK_active_mi inside
terminate_slave_threads()
- Changed argument for remove_master_info() to Master_info, as we always
have this available
- Fixed core dump when doing FLUSH TABLES WITH READ LOCK and parallel
replication. Problem was that waiting for pause_for_ftwrl was not done
when deleting rpt->current_owner after a force_abort.
Tasks:-
Changes in wsrep_dirty_reads variable
1.) Global + Session scope (Current: session-only)
2.) Can be set using command line.
3.) Allow all commands that do not change data (besides SELECT)
4.) Allow prepared Statements that do not change data
5.) Works with wsrep_sync_wait enabled