read_statistics_for_tables_if_needed
Regression after 279a907, read_statistics_for_tables_if_needed() was
called after open_normal_and_derived_tables() failure.
Fixed by moving read_statistics_for_tables() call to a branch of
get_schema_stat_record() where result of open_normal_and_derived_tables()
is checked.
Removed THD::force_read_stats, added read_statistics_for_tables() instead.
Simplified away statistics_for_command_is_needed().
MDEV-17614 flags INSERT…ON DUPLICATE KEY UPDATE unsafe for statement-based
replication when there are multiple unique indexes. This correctly fixes
something whose attempted fix in MySQL 5.7
in mysql/mysql-server@c93b0d9a97
caused lock conflicts. That change was reverted in MySQL 5.7.26
in mysql/mysql-server@066b6fdd43
(with a substantial amount of other changes).
In MDEV-17073 we already disabled the unfortunate MySQL change when
statement-based replication was not being used. Now, thanks to MDEV-17614,
we can actually remove the change altogether.
This reverts commit 8a346f31b913daa011085afec2b2d38450c73e00 (MDEV-17073)
and mysql/mysql-server@c93b0d9a97 while
keeping the test cases.
Problem:-
When mysql executes INSERT ON DUPLICATE KEY INSERT, the storage engine checks
if the inserted row would generate a duplicate key error. If yes, it returns
the existing row to mysql, mysql updates it and sends it back to the storage
engine.When the table has more than one unique or primary key, this statement
is sensitive to the order in which the storage engines checks the keys.
Depending on this order, the storage engine may determine different rows
to mysql, and hence mysql can update different rows.The order that the
storage engine checks keys is not deterministic. For example, InnoDB checks
keys in an order that depends on the order in which indexes were added to
the table. The first added index is checked first. So if master and slave
have added indexes in different orders, then slave may go out of sync.
Solution:-
Make INSERT...ON DUPLICATE KEY UPDATE unsafe while using stmt or mixed format
When there is more then one unique key.
Although there is two exception.
1. Auto Increment key is not counted because Innodb will get gap lock for
failed Insert and concurrent insert will get a next increment value. But if
user supplies auto inc value it can be unsafe.
2. Count only unique keys for which insertion is performed.
So this patch also addresses the bug id #72921
Plugin fixed to not lock the LOCK_operations when not active.
Server fixed to lock the LOCK_plugin less - do it once per
thread and then only if a plugin was installed/uninstalled.
Originally introduced by e972125f1 to avoid harmless wait for
LOCK_global_system_variables in a newly created thread, which creation was
initiated by system variable update.
At the same time it opens dangerous hole, when system variable update
thread already released LOCK_global_system_variables and ack_receiver
thread haven't yet completed new THD construction. In this case THD
constructor goes completely unprotected.
Since ack_receiver.stop() waits for the thread to go down, we have to
temporarily release LOCK_global_system_variables so that it doesn't
deadlock with ack_receiver.run(). Unfortunately it breaks atomicity
of rpl_semi_sync_master_enabled updates and makes them not serialized.
LOCK_rpl_semi_sync_master_enabled was introduced to workaround the above.
TODO: move ack_receiver start/stop into repl_semisync_master
enable_master/disable_master under LOCK_binlog protection?
Part of MDEV-14984 - regression in connect performance
Rather than parsing session_track_system_variables when thread starts, do
it when first trackable event occurs.
Benchmarked on a 2socket/20core/40threads Broadwell system using sysbench
connect brencmark @40 threads (with select 1 disabled):
101379.77 -> 143016.68 CPS, whereas 10.2 is currently at 137766.31 CPS.
Part of MDEV-14984 - regression in connect performance
One less new/delete per connection.
Removed m_mem_flag since most allocs are thread specific. The only
exception are allocs performed during initialization.
Removed State_tracker and Session_tracker constructors as they don't make
sense anymore.
No reason to access session_sysvars_tracker via get_tracker(), so access
it directly instead.
Part of MDEV-14984 - regression in connect performance
If a derived table has SELECT DISTINCT, provide index statistics for it so that the join optimizer in the
upper select knows that ref access to the table will produce one row.
depends on uninitialised value
Initialized THD::force_read_stats introduced in the patch for MDEV-17605.
Leaving this field uninitialized in the constructor of the THD class may
trigger reading statistical data that is not needed.
depends on uninitialised value
Initialized THD::force_read_stats introduced in the patch for MDEV-17605.
Leaving this field uninitialized in the constructor of the THD class may
trigger reading statistical data that is not needed.
Let xid_cache_insert()/xid_cache_delete() handle xa_state.
Let session tracker use is_explicit_XA() rather than xa_state != XA_NOTR.
Fixed open_tables() to refuse data access in XA_ROLLBACK_ONLY state.
Removed dead code from THD::cleanup(). It was supposed to be a reminder,
but it got messed up over time.
spider_internal_start_trx() is called either with XA_NOTR or XA_ACTIVE,
which is guarded by server callers. Thus is_explicit_XA() is acceptable
replacement for XA_ACTIVE check (which was likely wrong anyway).
Setting xa_state to XA_PREPARED in spider_internal_xa_prepare() isn't
meaningful, as this value is never accessed later. It can't be accessed
by current thread and it can't be recovered either. It can only be
accessed by spider internally, which never happens.
Make spider_xa_lock()/spider_xa_unlock() static.
Part of MDEV-7974 - backport fix for mysql bug#12161 (XA and binlog)
XID_STATE::rm_error is never used by internal 2PC, it is intended to be
used by explicit XA only.
Also removed redundant xid reset from THD::init_for_queries(). Must've
been done already either by THD::transaction constructor or by
THD::cleanup().
Part of MDEV-7974 - backport fix for mysql bug#12161 (XA and binlog)
- Changed replaying to always allocate a separate THD object
for applying log events. This is to avoid tampering original
THD state during replay process.
- Return success from sp_instr_stmt::exec_core() if replaying
succeeds.
- Do not push warnings/errors into diagnostics area if the
transaction must be replayed. This is to avoid reporting
transient errors to the client.
Added two tests galera_sp_bf_abort, galera_sp_insert_parallel.
Wsrep-lib position updated.
This bug is caused by pushdown from HAVING into WHERE.
It appears because condition that is pushed wasn't fixed.
It is also discovered that condition pushdown from HAVING into
WHERE is done wrong. There is no need to build clones for some
conditions that can be pushed. They can be simply moved from HAVING
into WHERE without cloning.
build_pushable_cond_for_having_pushdown(),
remove_pushed_top_conjuncts_for_having() methods are changed.
It is found that there is no transformation made for fields of
pushed condition.
field_transformer_for_having_pushdown transformer is added.
New tests are added. Some comments are changed.
ignore FK-prelocked tables when looking for write-prelocked tables
with auto-increment to complain about "Statement is unsafe because
it invokes a trigger or a stored function that inserts into an
AUTO_INCREMENT column"
The MDEV-17262 commit 26432e49d37a37d09b862bb49a021e44bdf4789c
was skipped. In Galera 4, the implementation would seem to require
changes to the streaming replication.
In the tests archive.rnd_pos main.profiling, disable_ps_protocol
for SHOW STATUS and SHOW PROFILE commands until MDEV-18974
has been fixed.
This patch contains a fix for the MDEV-17262/17243 issues and
new mtr test.
These issues (MDEV-17262/17243) have two reasons:
1) After an intermediate commit, a transaction loses its status
of "transaction that registered in the MySQL for 2pc coordinator"
(in the InnoDB) due to the fact that since version 10.2 the
write_row() function (which located in the ha_innodb.cc) does
not call trx_register_for_2pc(m_prebuilt->trx) during the processing
of split transactions. It is necessary to restore this call inside
the write_row() when an intermediate commit was made (for a split
transaction).
Similarly, we need to set the flag of the started transaction
(m_prebuilt->sql_stat_start) after intermediate commit.
The table->file->extra(HA_EXTRA_FAKE_START_STMT) called from the
wsrep_load_data_split() function (which located in sql_load.cc)
will also do this, but it will be too late. As a result, the call
to the wsrep_append_keys() function from the InnoDB engine may be
lost or function may be called with invalid transaction identifier.
2) If a transaction with the LOAD DATA statement is divided into
logical mini-transactions (of the 10K rows) and binlog is rotated,
then in rare cases due to the wsrep handler re-registration at the
boundary of the split, the last portion of data may be lost. Since
splitting of the LOAD DATA into mini-transactions is technical,
I believe that we should not allow these mini-transactions to fall
into separate binlogs. Therefore, it is necessary to prohibit the
rotation of binlog in the middle of processing LOAD DATA statement.
https://jira.mariadb.org/browse/MDEV-17262 and
https://jira.mariadb.org/browse/MDEV-17243
* MDEV-16509 Improve wsrep commit performance with binlog disabled
Release commit order critical section early after trx_commit_low() if
binlog is not transaction coordinator. In order to avoid two phase commit,
binlog_hton is not registered for THD during IO_CACHE population.
Implemented a test which verifies that the transactions release
commit order early.
This optimization will change behavior during recovery as the commit
is not two phase when binlog is off. Fixed and recorded wsrep-recover-v25
and wsrep-recover to match the behavior.
* MDEV-18730 Ordering for wsrep binlog group commit
Previously out of order execution was allowed for wsrep commits.
Established proper ordering by populating wait_for_commit
for every wsrep THD and making group commit leader to wait for
prior commits before proceeding to trx_group_commit_leader().
* MDEV-18730 Added a test case to verify correct commit ordering
* MDEV-16509, MDEV-18730 Review fixes
Use WSREP_EMULATE_BINLOG() macro to decide if the binlog_hton
should be registered. Whitespace/syntax fixes and cleanups.
* MDEV-16509 Require binlog for galera_var_innodb_disallow_writes test
If the commit to InnoDB is done in one phase, the native InnoDB behavior
is that the transaction is committed in memory before it is persisted to
disk. This means that the innodb_disallow_writes=ON may not prevent
transaction to become visible to other readers before commit is completely
over. On the other hand, if the commit is two phase (as it is with binlog),
the transaction will be blocked in prepare phase.
Fixed the test to use binlog, which enforces two phase commit, which
in turn makes commit to block before the changes become visible to
other connections. This guarantees that the test produces expected
result.