These tests can sporadically show mutex deadlock warnings between the LOCK_wsrep_thd
and LOCK_thd_data mutexes. This means that these mutexes can be locked in opposite
order by different threads, which can result in a deadlock.
To fix this issue, the locking policy of these mutexes should be revised and
enforced to be uniform. However, a quick code review shows that the number of
lock/unlock operations for these mutexes combined is between 100 and 200, and all of
these mutex invocations would have to be checked and fixed.
On the other hand, it turns out that LOCK_wsrep_thd is used for protecting access to
wsrep variables of THD (wsrep_conflict_state, wsrep_query_state), whereas LOCK_thd_data
protects the query, db and mysys_var variables in THD. Extending LOCK_thd_data to also
protect the wsrep variables looks like a viable solution, as there should not be a use case
where separate threads need simultaneous access to wsrep variables and THD data variables.
In this commit the LOCK_wsrep_thd mutex is refactored to be replaced by LOCK_thd_data.
Bluntly replacing LOCK_wsrep_thd with LOCK_thd_data would result in double locking
of LOCK_thd_data, so some adjustments have been made to fix such situations.
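The merged-mutex approach and the double-locking hazard can be illustrated with a minimal standalone sketch. ThdSketch and its members are hypothetical stand-ins rather than the server's THD, and std::mutex stands in for LOCK_thd_data; splitting out a "_locked" helper is only one possible way to avoid taking the same mutex twice, not necessarily what this commit does.

```cpp
#include <cassert>
#include <mutex>
#include <string>

// Hypothetical stand-in for THD; not the server's actual class.
struct ThdSketch {
  std::mutex lock_thd_data;      // single mutex, stands in for LOCK_thd_data
  std::string query;             // guarded by LOCK_thd_data before and after
  int wsrep_conflict_state = 0;  // formerly guarded by LOCK_wsrep_thd

  // Caller must already hold lock_thd_data.
  void set_conflict_state_locked(int state) { wsrep_conflict_state = state; }

  // Entry point for callers that do not hold the mutex yet.
  void set_conflict_state(int state) {
    std::lock_guard<std::mutex> guard(lock_thd_data);
    set_conflict_state_locked(state);
  }

  // Touches both groups of fields under one acquisition. With two mutexes,
  // every caller of a path like this would have to agree on the lock order.
  void set_query_and_abort(const std::string &q) {
    std::lock_guard<std::mutex> guard(lock_thd_data);
    query = q;
    // Calling set_conflict_state() here would lock the same mutex twice.
    set_conflict_state_locked(1);
  }
};
```

With a single mutex there is no second lock to acquire in the wrong order; the cost is that every former LOCK_wsrep_thd call site has to be checked for whether LOCK_thd_data is already held.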
The problem was that the introduction of max-thread-mem-used can cause
an allocation error very early, even before mysql_parse() is called.
As mysql_parse() calls thd->reset_for_next_command(), which called
clear_error(), the error number was lost.
Fixed by adding an option to have unique messages for each KILL
signal and changing max-thread-mem-used to use this new feature.
This removes a lot of problems with the original approach, where
one could get errors signaled silently almost any time.
Fixed by moving clear_error() from reset_for_next_command() to
do_command(), before any memory allocation for the thread.
Related changes:
- reset_for_next_command() now has an optional parameter that controls whether
clear_error() is called. By default it is called, but no longer from
dispatch_command(), which was the original problem.
- Added an optional parameter to clear_error() to force calling of
reset_diagnostics_area(). Previously clear_error() only called
reset_diagnostics_area() if there was no error, so we normally
called reset_diagnostics_area() twice.
- This change removed several duplicated calls to clear_error()
when starting a query.
- Reset max_mem_used on COM_QUIT, to protect against kill during
quit.
- Use fatal_error() instead of setting is_fatal_error (cleanup)
- Set fatal_error if max_thread_mem_used is signaled.
(Same logic we use for other places where we are out of resources)
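The ordering issue described above can be modeled with a small standalone sketch. The names mirror the commit message, but ThdSketch, do_command_sketch() and kErOutOfMemorySketch are hypothetical, and the control flow is simplified (the real call goes through dispatch_command()); it only shows why clearing the error before the first allocation, and skipping the later clear, preserves an early error.

```cpp
#include <cassert>

// Hypothetical model of the control flow; bodies are illustrative only.
struct ThdSketch {
  int error_number = 0;

  void clear_error() { error_number = 0; }

  // Optional parameter: clears the error by default, but lets a caller
  // that must preserve an already-raised error skip that step.
  void reset_for_next_command(bool do_clear_error = true) {
    if (do_clear_error)
      clear_error();
    // ... other per-statement resets would go here ...
  }
};

// Assumed error code, for illustration only.
constexpr int kErOutOfMemorySketch = 1;

// Models do_command(): the error is cleared first, before any allocation.
// alloc_fails simulates a memory limit being hit before parsing starts.
int do_command_sketch(ThdSketch &thd, bool alloc_fails) {
  thd.clear_error();
  if (alloc_fails)
    thd.error_number = kErOutOfMemorySketch;
  thd.reset_for_next_command(false);  // keeps the early error
  return thd.error_number;
}
```

Had reset_for_next_command() been called with its default argument here, the early allocation error would have been wiped, which is the behaviour the commit describes as the original problem.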
Do not silence uncertain cases, or fix any bugs.
The only functional change should be that ha_federated::extra()
is not calling DBUG_PRINT to report an unhandled case for
HA_EXTRA_PREPARE_FOR_DROP.
- popping PS reprepare observer before BF aborted PS replaying begins;
a dangling observer will cause a failure in open_table() later on
- test case for this anomaly
Transaction replay causes the THD to re-apply the replication
events from execution, using the same path appliers do. While
applying the log events, the THD's timestamp is set to the
timestamp of the event.
Setting the timestamp explicitly causes the function NOW() to
always return the timestamp that was set. To avoid this behavior we
reset the timestamp after replaying is done.
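The effect of a leftover explicit timestamp can be sketched in a few lines. ThdSketch, its user_time field and replay_sketch() are hypothetical names chosen for illustration; the server's own time handling is more involved.

```cpp
#include <cassert>
#include <ctime>
#include <vector>

// Hypothetical stand-in for the THD's time handling.
struct ThdSketch {
  std::time_t user_time = 0;  // 0 means "no explicit timestamp set"

  void set_time(std::time_t t) { user_time = t; }
  void reset_time() { user_time = 0; }

  // Models NOW(): an explicitly set timestamp wins over the wall clock.
  std::time_t now() const {
    return user_time ? user_time : std::time(nullptr);
  }
};

// Applies each event with its own timestamp, then clears the override.
void replay_sketch(ThdSketch &thd, const std::vector<std::time_t> &event_times) {
  for (std::time_t t : event_times) {
    thd.set_time(t);
    // ... apply the event here ...
  }
  // Without this reset, now() would keep returning the last event's time.
  thd.reset_time();
}
```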
Different fix: remove the old ones and wait for the THD to be fully
initialized before continuing with the server startup process.
This reverts commits db2e21b, 13615c5, 3f515a0, 70113ee.
* Wait for aborted thd (victim) to release MDL locks
* Skip aborting an already aborted thd
* Defer setting OK status in case of CTAS
* Minor cosmetic changes
* Added a test case
Annotate_rows event needs to be preserved until the last Rows event has
been applied because after it has been applied thd->query points to the
query stored inside this event.
The wsrep patch uses the same connection name for constructing Master_info
objects. As a result, all existing wsrep Master_info objects refer
to the same rpl_filter object. This could lead to a race when multiple
threads try to delete/destruct a Master_info object, as they would
all try to delete the same relay_log object.
Fixed by adding a check in Master_info's destructor to not free
the "wsrep" rpl_filter, so that it is reused by current and subsequent
wsrep threads and later reclaimed by free_all_rpl_filters() during
server shutdown.
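The ownership rule behind this fix can be sketched as follows. FilterSketch, MasterInfoSketch, g_filters and the helper functions are hypothetical stand-ins, not the server's Rpl_filter or Master_info; the sketch assumes that every non-"wsrep" connection name has exactly one owner and leaves out the locking a multi-threaded server would need.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-ins; the real Rpl_filter and Master_info differ.
struct FilterSketch {};

// Registry of filters by connection name, kept until shutdown.
static std::map<std::string, FilterSketch *> g_filters;

FilterSketch *get_or_create_filter(const std::string &name) {
  FilterSketch *&slot = g_filters[name];
  if (!slot)
    slot = new FilterSketch();
  return slot;
}

struct MasterInfoSketch {
  std::string connection_name;
  FilterSketch *filter;

  explicit MasterInfoSketch(const std::string &name)
      : connection_name(name), filter(get_or_create_filter(name)) {}

  ~MasterInfoSketch() {
    // The shared "wsrep" filter is left alone so other threads can reuse it.
    // Other names are assumed to have a single owner in this sketch.
    if (connection_name != "wsrep") {
      g_filters.erase(connection_name);
      delete filter;
    }
  }
};

// Models free_all_rpl_filters(): reclaims whatever is left at shutdown.
void free_all_filters_sketch() {
  for (auto &entry : g_filters)
    delete entry.second;
  g_filters.clear();
}
```

Destroying several "wsrep" objects then never frees the shared filter more than once, because only the shutdown path releases it.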
Merged lp:maria/maria-10.0-galera up to revision 3879.
Added new functions to the handler API to forcefully abort_transaction,
producing fake_trx_id, get_checkpoint and set_checkpoint for XA. These
were added for the future possibility of adding more storage engines that
could use galera replication.
Analysis: The problem is that we execute galera code when we are actually
executing asynchronous replication.
Fix: Do not execute galera code if the wsrep provider is not set.
This is now otherwise on level wsrep-25.9, but storage/innobase has not been fully merged.
wsrep-5.5 is not a good source for that, so we probably have to cherry-pick the InnoDB changes from wsrep-5.6.