MDEV-17614 flags INSERT…ON DUPLICATE KEY UPDATE unsafe for statement-based
replication when there are multiple unique indexes. This correctly fixes
something whose attempted fix in MySQL 5.7
in mysql/mysql-server@c93b0d9a97
caused lock conflicts. That change was reverted in MySQL 5.7.26
in mysql/mysql-server@066b6fdd43
(with a substantial amount of other changes).
In MDEV-17073 we already disabled the unfortunate MySQL change when
statement-based replication was not being used. Now, thanks to MDEV-17614,
we can actually remove the change altogether.
This reverts commit 8a346f31b913daa011085afec2b2d38450c73e00 (MDEV-17073)
and mysql/mysql-server@c93b0d9a97 while
keeping the test cases.
Problem:-
When mysql executes INSERT ON DUPLICATE KEY INSERT, the storage engine checks
if the inserted row would generate a duplicate key error. If yes, it returns
the existing row to mysql, mysql updates it and sends it back to the storage
engine.When the table has more than one unique or primary key, this statement
is sensitive to the order in which the storage engines checks the keys.
Depending on this order, the storage engine may determine different rows
to mysql, and hence mysql can update different rows.The order that the
storage engine checks keys is not deterministic. For example, InnoDB checks
keys in an order that depends on the order in which indexes were added to
the table. The first added index is checked first. So if master and slave
have added indexes in different orders, then slave may go out of sync.
Solution:-
Make INSERT...ON DUPLICATE KEY UPDATE unsafe while using stmt or mixed format
When there is more then one unique key.
Although there is two exception.
1. Auto Increment key is not counted because Innodb will get gap lock for
failed Insert and concurrent insert will get a next increment value. But if
user supplies auto inc value it can be unsafe.
2. Count only unique keys for which insertion is performed.
So this patch also addresses the bug id #72921
Sources did not compile in some builds because of undeclared
ER_BLOB_KEY_WITHOUT_LENGTH. Moving the implementations of
Key_part_spec::check_key_length_for_blob() from sql_class.h to sql_class.cc
Plugin fixed to not lock the LOCK_operations when not active.
Server fixed to lock the LOCK_plugin less - do it once per
thread and then only if a plugin was installed/uninstalled.
Originally introduced by e972125f1 to avoid harmless wait for
LOCK_global_system_variables in a newly created thread, which creation was
initiated by system variable update.
At the same time it opens dangerous hole, when system variable update
thread already released LOCK_global_system_variables and ack_receiver
thread haven't yet completed new THD construction. In this case THD
constructor goes completely unprotected.
Since ack_receiver.stop() waits for the thread to go down, we have to
temporarily release LOCK_global_system_variables so that it doesn't
deadlock with ack_receiver.run(). Unfortunately it breaks atomicity
of rpl_semi_sync_master_enabled updates and makes them not serialized.
LOCK_rpl_semi_sync_master_enabled was introduced to workaround the above.
TODO: move ack_receiver start/stop into repl_semisync_master
enable_master/disable_master under LOCK_binlog protection?
Part of MDEV-14984 - regression in connect performance
Rather than parsing session_track_system_variables when thread starts, do
it when first trackable event occurs.
Benchmarked on a 2socket/20core/40threads Broadwell system using sysbench
connect brencmark @40 threads (with select 1 disabled):
101379.77 -> 143016.68 CPS, whereas 10.2 is currently at 137766.31 CPS.
Part of MDEV-14984 - regression in connect performance
One less new/delete per connection.
Removed m_mem_flag since most allocs are thread specific. The only
exception are allocs performed during initialization.
Removed State_tracker and Session_tracker constructors as they don't make
sense anymore.
No reason to access session_sysvars_tracker via get_tracker(), so access
it directly instead.
Part of MDEV-14984 - regression in connect performance
If a derived table has SELECT DISTINCT, provide index statistics for it so that the join optimizer in the
upper select knows that ref access to the table will produce one row.
depends on uninitialised value
Initialized THD::force_read_stats introduced in the patch for MDEV-17605.
Leaving this field uninitialized in the constructor of the THD class may
trigger reading statistical data that is not needed.
depends on uninitialised value
Initialized THD::force_read_stats introduced in the patch for MDEV-17605.
Leaving this field uninitialized in the constructor of the THD class may
trigger reading statistical data that is not needed.
Let xid_cache_insert()/xid_cache_delete() handle xa_state.
Let session tracker use is_explicit_XA() rather than xa_state != XA_NOTR.
Fixed open_tables() to refuse data access in XA_ROLLBACK_ONLY state.
Removed dead code from THD::cleanup(). It was supposed to be a reminder,
but it got messed up over time.
spider_internal_start_trx() is called either with XA_NOTR or XA_ACTIVE,
which is guarded by server callers. Thus is_explicit_XA() is acceptable
replacement for XA_ACTIVE check (which was likely wrong anyway).
Setting xa_state to XA_PREPARED in spider_internal_xa_prepare() isn't
meaningful, as this value is never accessed later. It can't be accessed
by current thread and it can't be recovered either. It can only be
accessed by spider internally, which never happens.
Make spider_xa_lock()/spider_xa_unlock() static.
Part of MDEV-7974 - backport fix for mysql bug#12161 (XA and binlog)
XID_STATE::rm_error is never used by internal 2PC, it is intended to be
used by explicit XA only.
Also removed redundant xid reset from THD::init_for_queries(). Must've
been done already either by THD::transaction constructor or by
THD::cleanup().
Part of MDEV-7974 - backport fix for mysql bug#12161 (XA and binlog)
- Changed replaying to always allocate a separate THD object
for applying log events. This is to avoid tampering original
THD state during replay process.
- Return success from sp_instr_stmt::exec_core() if replaying
succeeds.
- Do not push warnings/errors into diagnostics area if the
transaction must be replayed. This is to avoid reporting
transient errors to the client.
Added two tests galera_sp_bf_abort, galera_sp_insert_parallel.
Wsrep-lib position updated.
This bug is caused by pushdown from HAVING into WHERE.
It appears because condition that is pushed wasn't fixed.
It is also discovered that condition pushdown from HAVING into
WHERE is done wrong. There is no need to build clones for some
conditions that can be pushed. They can be simply moved from HAVING
into WHERE without cloning.
build_pushable_cond_for_having_pushdown(),
remove_pushed_top_conjuncts_for_having() methods are changed.
It is found that there is no transformation made for fields of
pushed condition.
field_transformer_for_having_pushdown transformer is added.
New tests are added. Some comments are changed.
ignore FK-prelocked tables when looking for write-prelocked tables
with auto-increment to complain about "Statement is unsafe because
it invokes a trigger or a stored function that inserts into an
AUTO_INCREMENT column"
The MDEV-17262 commit 26432e49d37a37d09b862bb49a021e44bdf4789c
was skipped. In Galera 4, the implementation would seem to require
changes to the streaming replication.
In the tests archive.rnd_pos main.profiling, disable_ps_protocol
for SHOW STATUS and SHOW PROFILE commands until MDEV-18974
has been fixed.
This patch contains a fix for the MDEV-17262/17243 issues and
new mtr test.
These issues (MDEV-17262/17243) have two reasons:
1) After an intermediate commit, a transaction loses its status
of "transaction that registered in the MySQL for 2pc coordinator"
(in the InnoDB) due to the fact that since version 10.2 the
write_row() function (which located in the ha_innodb.cc) does
not call trx_register_for_2pc(m_prebuilt->trx) during the processing
of split transactions. It is necessary to restore this call inside
the write_row() when an intermediate commit was made (for a split
transaction).
Similarly, we need to set the flag of the started transaction
(m_prebuilt->sql_stat_start) after intermediate commit.
The table->file->extra(HA_EXTRA_FAKE_START_STMT) called from the
wsrep_load_data_split() function (which located in sql_load.cc)
will also do this, but it will be too late. As a result, the call
to the wsrep_append_keys() function from the InnoDB engine may be
lost or function may be called with invalid transaction identifier.
2) If a transaction with the LOAD DATA statement is divided into
logical mini-transactions (of the 10K rows) and binlog is rotated,
then in rare cases due to the wsrep handler re-registration at the
boundary of the split, the last portion of data may be lost. Since
splitting of the LOAD DATA into mini-transactions is technical,
I believe that we should not allow these mini-transactions to fall
into separate binlogs. Therefore, it is necessary to prohibit the
rotation of binlog in the middle of processing LOAD DATA statement.
https://jira.mariadb.org/browse/MDEV-17262 and
https://jira.mariadb.org/browse/MDEV-17243