1
0
mirror of https://github.com/MariaDB/server.git synced 2025-07-21 21:22:27 +03:00
Commit Graph

163 Commits

Author SHA1 Message Date
0c1f97b3ab MDEV-15152 Optimistic parallel slave doesnt cope well with START SLAVE UNTIL
The immediate bug was caused by a failure to recognize a correct
position to stop the slave applier run in optimistic parallel mode.
There were the following set of issues that the analysis unveil.
1 incorrect estimate for the event binlog position passed to
  is_until_satisfied
2 wait for workers to complete by the driver thread did not account non-group events
  that could be left unprocessed and thus to mix up the last executed
  binlog group's file and position:
  the file remained old and the position related to the new rotated file
3 incorrect 'slave reached file:pos' by the parallel slave report in the error log
4 relay log UNTIL missed out the parallel slave branch in
  is_until_satisfied.

The patch addresses all of them to simplify logics of log change
notification in either the master and relay-log until case.
P.1 is addressed with passing the event into is_until_satisfied()
for proper analisis by the function.
P.2 is fixed by changes in handle_queued_pos_update().
P.4 required removing relay-log change notification by workers.
Instead the driver thread updates the notion of the current relay-log
fully itself with aid of introduced
bool Relay_log_info::until_relay_log_names_defer.

An extra print out of the requested until file:pos is arranged
with --log-warning=3.
2020-05-26 18:49:43 +03:00
f203245e9e Merge remote-tracking branch 'origin/10.1' into 10.2 2019-10-01 07:11:54 +04:00
9b80f9300d MDEV-20645: Replication consistency is broken as workers miss the error notification from an earlier failed group.
Analysis:
========
In general if there are three groups.
1 - Inserts 32 which fails due to local entry '32' on slave.
2 - Inserts 33
3 - Inserts 34

Each group considers itself as a waiter and it waits for prior group 'waitee'.
This is done in 'register_wait_for_prior_event_group_commit'. If there is no
other parallel group being scheduled then no waitee will be there.

Let us assume 3 groups are being scheduled in parallel.

3-> waits for 2-> waits for->1

'1' upon completion it checks is there any registered subsequent waiter. If
so it wakes up the subsequent waiter with its execution status. This execution
status is stored in wakeup_error.

If '1' failed then it sends corresponding wakeup_error to 2. Then '2' aborts
and it propagates error to '3'.  So all further commits are aborted.  This
mechanism works only when all transactions reach a stage where they are
waiting for their prior commit to complete.

In case of optimistic following scenario occurs.

1,2,3 are scheduled in parallel.

3 - Reaches group_commit_code waits for 2 to complete.
1 - errors out sets stop_on_error_sub_id=1.

When a group execution results in error its corresponding sub_id is set to
'stop_on_error_sub_id'. Any new groups queued for execution will check if
their sub_id is > stop_on_error_sub_id.  If it is true their execution will be
skipped as prior group execution failed.  'skip_event_group=1' will be set.
Since the execution of SQL thread is about to stop we just skip execution of
all the following event groups.  We still do all the normal waiting and wakeup
processing between the event groups as a simple way to ensure that everything
is stopped and cleaned up correctly.

Upon error '1' transaction checks for registered waiters. Since no one is
there it simply goes away.

2 - Starts the execution. It checks do I have a waitee.

Since wait_commit_sub_id == entry->last_committed_sub_id no waitee is set.

Secondly: 'entry->stop_on_error_sub_id' is set by '1'st execution.  Now
'handle_parallel_thread' code checks if the current group 'sub_id' is greater
than the 'sub_id' set within 'stop_on_error_sub_id'.

Since the above is true 'skip_event_group=true' is set.  Simply call
'wait_for_prior_commit' to wakeup all waiters.  Group '2' didn't had any
waitee and its execution is skipped.  Hence its wakeup_error=0.It sends a
positive wakeup signal to '3'. Which commits. This results in a missed
transaction. i.e 33 is missed and 34 is committed.

Fix:
===
When a worker learns that an earlier transaction execution has failed, and it
should not proceed for further execution, it should mark its own execution
status as failed so that it alerts its followers to abort as well.
2019-09-30 13:22:37 +05:30
07815d9555 Merge 10.1 into 10.2 2018-10-11 08:16:08 +03:00
f517d8c742 MDEV-17346 parallel slave start and stop races to workers disappeared
The bug appears as a slave SQL thread hanging in
rpl_parallel_thread_pool::get_thread() while there are no slave worker
threads to awake it.

The reason of the hang is that at the parallel slave worker pool
activation the being stared SQL thread could read the worker pool size
concurrently with pool deactivation. At reading the SQL thread did not
employ necessary protection from a race.

Fixed with making the SQL thread at the pool activation first
to grab the same lock as potential deactivator also does prior
to access the pool size.
2018-10-08 19:46:34 +03:00
30019a48bf MDEV-12746 rpl.rpl_parallel_optimistic_nobinlog fails committing
out of order at retry

The test failures were of two sorts. One is that the number of retries
what the slave thought as a temporary error exceeded
the default value of the slave retry option.
The 2nd issue was an out of order commit by transactions that
were supposed to error out instead.
Both issues are caused by the same reason that the post-temporary-error
retry did not check possibly already existing error status.

This is mended with refining conditions to retry. Specifically, a retrying
worker checks `rpl_parallel_entry::stop_on_error_sub_id` that
a potential failing predecessor could set to its own sub id.
Now should the member be set the retrying follower errors out with
ER_PRIOR_COMMIT_FAILED.
2018-03-13 12:46:07 +02:00
7fc25cfbca Fix for MDEV-12730
Assertion `count > 0' failed in rpl_parallel_thread_pool::
get_thread, rpl.rpl_parallel failed in buildbot

The reason for this is that one thread can call
rpl_parallel_resize_pool_if_no_slaves() while
another thread calls at the same time
rpl_parallel_activate_pool(). If rpl_parallel_active_pool() is
called before rpl_parallel_resize_pool_if_no_slaves() has
finished, pool->count will be set to 0 even if there exists
active slave threads.

Added a mutex lock in rpl_parallel_activate_pool() to protect against this scenario, which seams to fix this issue.
2018-01-24 23:33:21 +02:00
7354dc6773 MDEV-13384 - misc Windows warnings fixed 2017-09-28 17:20:46 +00:00
cb1e76e4de Merge branch '10.1' into 10.2 2017-08-17 11:38:34 +02:00
74543698a7 MDEV-13179 main.errors fails with wrong errno
The problem was that the introduction of max-thread-mem-used can cause
an allocation error very early, even before mysql_parse() is called.
As mysql_parse() calls thd->reset_for_next_command(), which called
clear_error(), the error number was lost.

Fixed by adding an option to have unique messages for each KILL
signal and change max-thread-mem-used to use this new feature.
This removes a lot of problems with the original approach, where
one could get errors signaled silenty almost any time.

ixed by moving clear_error() from reset_for_next_command() to
do_command(), before any memory allocation for the thread.

Related changes:
- reset_for_next_command() now have an optional parameter if we should
  call clear_error() or not. By default it's called, but not anymore from
  dispatch_command() which was the original problem.
- Added optional paramater to clear_error() to force calling of
  reset_diagnostics_area(). Before clear_error() only called
  reset_diagnostics_area() if there was no error, so we normally
  called reset_diagnostics_area() twice.
- This change removed several duplicated calls to clear_error()
  when starting a query.
- Reset max_mem_used on COM_QUIT, to protect against kill during
  quit.
- Use fatal_error() instead of setting is_fatal_error (cleanup)
- Set fatal_error if max_thead_mem_used is signaled.
  (Same logic we use for other places where we are out of resources)
2017-08-07 03:48:58 +03:00
f740d23ce6 Merge 10.1 into 10.2 2017-04-28 12:22:32 +03:00
8c38147cdd Merge 10.0 into 10.1 2017-04-21 12:46:12 +03:00
88613e1df6 MDEV-11201: gtid_ignore_duplicates incorrectly ignores statements when GTID replication is not enabled
When master_use_gtid=no, the IO thread loads the slave GTID state from
the master during connect. This races with the SQL thread when
gtid_ignore_duplicates=1. If an event is in the relay log from before
the new connect and has not been applied yet, moving the slave
position causes the SQL thread to think that event should be skipped
due to gtid_ignore_duplicates=1.

This patch simply disables gtid_ignore_duplicates when not using GTID,
which seems to be what one would expect.
2017-04-10 07:53:27 +02:00
da4d71d10d Merge branch '10.1' into 10.2 2017-03-30 12:48:42 +02:00
09a2107b1b Merge branch '10.0' into 10.1 2017-03-21 19:20:44 +01:00
e7f55fde88 Removed wrong assert
The following is an updated commit message for the following commit
that was pushed before I had a chance to update the commit message:
c5e25c8b40

Fixed dead locks when doing stop slave while slave was starting.

- Added a separate lock for protecting start/stop/reset of a specific slave.
  This solves some possible dead locks when one calls stop slave while
  the slave is starting as the old run_locks was over used for other things.
- Set hash->records to 0 before calling free of all hash elements.
  This was set to stop concurrent threads to loop over hash elements and
  access members that was already freed.
  This was a problem especially in start_all_slaves/stop_all_slaves
  as the mutex protecting the hash was temporarily released while a slave
  was started/stopped.
- Because of change to hash->records during hash_reset(),
  any_slave_sql_running() will return 1 during shutdown as one can't
  loop over master_info_index->master_info_hash while hash_reset() of it
  is in progress.
  This also fixes a potential old bug in any_slave_sql_running() where
  during shutdown and ~Master_info_index(), my_hash_free() we could
  potentially try to access elements that was already freed.
2017-03-16 14:21:32 +02:00
adc91387e3 Merge 10.0 into 10.1 2017-03-03 13:27:12 +02:00
84ed5e1d5f Fixed hang doing FLUSH TABLES WITH READ LOCK and parallel replication
The problem was that waiting for pause_for_ftwrl was done before
event_group was completed. This caused rpl_pause_for_ftwrl() to wait
forever during FLUSH TABLES WITH READ LOCK.
Now we only wait for FLUSH TABLES WITH READ LOCK when we are changing
to a new event group.
2017-02-28 16:10:47 +01:00
c5e25c8b40 Added a separate lock for start/stop/reset slave.
This solves some possible dead locks when one calls stop slave while slave
is starting.
2017-02-28 16:10:46 +01:00
e65f667bb6 MDEV-9573 'Stop slave' hangs on replication slave
The reason for this is that stop slave takes LOCK_active_mi over the
whole operation while some slave operations will also need LOCK_active_mi
which causes deadlocks.

Fixed by introducing object counting for Master_info and not taking
LOCK_active_mi over stop slave or even stop_all_slaves()

Another benefit of this approach is that it allows:
- Multiple threads can run SHOW SLAVE STATUS at the same time
- START/STOP/RESET/SLAVE STATUS on a slave will not block other slaves
- Simpler interface for handling get_master_info()
- Added some missing unlock of 'log_lock' in error condtions
- Moved rpl_parallel_inactivate_pool(&global_rpl_thread_pool) to end
  of stop_slave() to not have to use LOCK_active_mi inside
  terminate_slave_threads()
- Changed argument for remove_master_info() to Master_info, as we always
  have this available
- Fixed core dump when doing FLUSH TABLES WITH READ LOCK and parallel
  replication. Problem was that waiting for pause_for_ftwrl was not done
  when deleting rpt->current_owner after a force_abort.
2017-02-28 16:10:46 +01:00
4a5d25c338 Merge branch '10.1' into 10.2 2016-12-29 13:23:18 +01:00
2f20d297f8 Merge branch '10.0' into 10.1 2016-12-11 09:53:42 +01:00
390f2a013b Fix incorrect reading of events from relaylog in parallel replication.
The SQL thread keeps track of the position in the current relay log from
which to read the next event. This position is not normally used, but a
certain interaction with the IO thread can cause the SQL thread to re-open
the relay log and seek to the stored position.

In parallel replication, there were a couple of places where the position
was not updated. This created a race where a re-open of the relay log could
seek to the wrong position and start re-reading and processing events
already handled once, causing various kinds of problems.

Fix this by moving the position update into a single place in
apply_event_and_update_pos(), which should ensure that the position is
always updated in the parallel replication case.

This problem was found from the testcase of MDEV-10863, but it is logically
a separate problem.
2016-11-16 11:00:38 +01:00
c06bc66816 MDEV-11065: Compressed binary log
Minor review comments/changes:

 - A bunch of style-fixes.

 - Change macros to static inline functions.

 - Update check_event_type() with compressed event types.

 - Small .result file update.
2016-10-20 18:00:59 +02:00
e1ef99c3dc MDEV-7145: Delayed replication
Merge feature into 10.2 from feature branch.

Delayed replication adds an option

  CHANGE MASTER TO master_delay=<seconds>

Replication will then delay applying events with that many
seconds. This creates a replication slave that reflects the state of
the master some time in the past.

Feature is ported from MySQL source tree.

Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
2016-10-16 23:44:44 +02:00
3011060b2a MDEV-7145: Delayed slave.
Extend to work also for parallel replication.

Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
2016-10-14 23:15:59 +02:00
50f19ca809 Remove unnecessary global mutex in parallel replication.
The function apply_event_and_update_pos() is called with the
rli->data_lock mutex held. However, there seems to be nothing in the
function actually needing the mutex to be held. Certainly not in the
parallel replication case, where sql_slave_skip_counter is always 0
since the non-zero case is handled by the SQL driver thread.

So this patch makes parallel replication use a variant of
apply_event_and_update_pos() without the need to take the
rli->data_lock mutex. This avoids one contended global mutex for each
event executed, which might improve performance on CPU-bound workloads
somewhat.

Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
2016-10-14 22:44:40 +02:00
ec47beaba6 Merge parallel replication async deadlock kill into 10.2.
Conflicts:
	sql/mysqld.cc
	sql/slave.cc
2016-09-09 12:15:53 +02:00
7e0c9de864 Parallel replication async deadlock kill
When a deadlock kill is detected inside the storage engine, the kill
is not done immediately, to avoid calling back into the storage engine
kill_query method with various lock subsystem mutexes held. Instead the
kill is queued and done later by a slave background thread.

This patch in preparation for fixing TokuDB optimistic parallel
replication, as well as for removing locking hacks in InnoDB/XtraDB in
10.2.

Signed-off-by: Kristian Nielsen <knielsen at knielsen-hq.org>
2016-09-08 15:25:40 +02:00
96e95b5465 Better SHOW PROCESSLIST for replication
- When waiting for events, start time is now counted from start of wait
- Instead of having "Connect" as "Command" for all replication threads we
  now have:
  - Slave_IO for Slave thread reading relay log
  - Slave_SQL for slave executing SQL commands or distribution queries to
    Slave workers
  - Slave_worker for slave threads executin SQL commands in parallel replication
2016-08-29 13:10:17 +03:00
89685d55d7 Reuse THD for new user connections
- To ensure that mallocs are marked for the correct THD, even if it's
  allocated in another thread, I added the thread_id to the THD constructor
- Added st_my_thread_var to thr_lock_info_init() to avoid a call to my_thread_var
- Moved things from THD::THD() to THD::init()
- Moved some things to THD::cleanup()
- Added THD::free_connection() and THD::reset_for_reuse()
- Added THD to CONNECT::create_thd()
- Added THD::thread_dbug_id and st_my_thread_var->dbug_id. These are needed
  to ensure that we have a constant thread_id used for debugging with a THD,
  even if it changes thread_id (=connection_id)
- Set variables.pseudo_thread_id in constructor. Removed not needed sets.
2016-06-04 09:06:00 +02:00
732adec0a4 Removed some not needed when doing delete thd, which caused warnings about
wrong mutex usage from safe_mutex.
Ensure that LOCK_status is always taken before LOCK_thread_count
2016-04-28 13:39:55 +03:00
cdd4043117 Cleanups:
- Removed some QQ markers
- Removed some rows not compatible with valgrind 3.9.0
- Made mysql_install_db.sh more silent by default. --verbose now gives more information
- Added assert that auto-increment doesn't generate 0 (safety)
- Removed thd->set_time() in some places as it's set in init_for_queries()
- Fixed some --big tests in tokudb
- Fixed a bug in mysql_client_test.cc where sql_mode was not properly reset
2016-04-05 18:00:04 +03:00
f67a2211ec Merge branch '10.1' into 10.2 2016-03-23 22:36:46 +01:00
3b0c7ac1f9 Merge branch '10.0' into 10.1 2016-03-21 13:02:53 +01:00
1777fd5f55 Fix spelling: occurred, execute, which etc 2016-03-04 02:09:37 +02:00
3d4a7390c1 MDEV-6150 Speed up connection speed by moving creation of THD to new thread
Creating a CONNECT object on client connect and pass this to the working thread which creates the THD.
Split LOCK_thread_count to different mutexes
Added LOCK_thread_start to syncronize threads
Moved most usage of LOCK_thread_count to dedicated functions
Use next_thread_id() instead of thread_id++

Other things:
- Thread id now starts from 1 instead of 2
- Added cast for thread_id as thread id is now of type my_thread_id
- Made THD->host const (To ensure it's not changed)
- Removed some DBUG_PRINT() about entering/exiting mutex as these was already logged by mutex code
- Fixed that aborted_connects and connection_errors_internal are counted in all cases
- Don't take locks for current_linfo when we set it (not needed as it was 0 before)
2016-02-07 10:34:03 +02:00
a2bcee626d Merge branch '10.0' into 10.1 2015-12-21 21:24:22 +01:00
c3018b0ff4 Fixes to get all test to run on MacosX Lion 10.7
This includes fixing all utilities to not have any memory leaks,
as safemalloc warnings stopped tests from passing on MacOSX.

- Ensure that all clients takes character-set-dir, as the
  libmysqlclient library will use it.
- mysql-test-run now passes character-set-dir to all external clients.
- Changed dynstr_free() so that it can be called twice (made freeing code easier)
- Changed rpl_global_gtid_slave_state to be allocated dynamicly as it
  includes a mutex that needs to be initizlied/destroyed before my_end() is called.
- Removed rpl_slave_state::init() and rpl_slave_stage::deinit() as
  their job are better handling by constructor and delete.
- Print alias instead of table_name in check_duplicate_key as
  table_name may have been converted to lower case.

Other things:
- Fixed a case in time_to_datetime_with_warn() where we where
  using && instead of & in tests
2015-11-29 17:51:23 +02:00
b30a768e7b Fixed failures in rpl_parallel2
Problem was that we used same condition variable with 2 different mutex.
Fixed by changing to use COND_rpl_thread_stop instead of COND_parallel_entry
for stopping threads.

Patch by Kristian Nielsen
2015-11-23 19:58:30 +02:00
8f2e05f41c Merge branch 'mdev7818-4' into 10.1
Conflicts:
	mysql-test/suite/perfschema/r/stage_mdl_global.result
	sql/rpl_rli.cc
	sql/sql_parse.cc
2015-11-13 14:24:40 +01:00
ba02550166 MDEV-7818: Deadlock occurring with parallel replication and FTWRL
Problem is that FLUSH TABLES WITH READ LOCK first blocks threads from
starting new commits, then waits for running commits to complete. But
in-order parallel replication needs commits to happen in a particular
order, so this can easily deadlock.

To fix this problem, this patch introduces a way to temporarily pause
the parallel replication worker threads. Before starting FTWRL, we let
all worker threads complete in-progress transactions, and then
wait. Then we proceed to take the global read lock. Once the lock is
obtained, we unpause the worker threads. Now commits are blocked from
starting by the global read lock, so the deadlock will no longer occur.
2015-11-13 14:02:15 +01:00
6d96fab7dd MDEV-7818: Deadlock occurring with parallel replication and FTWRL
Preparation patch, moving the GCO wait into a separate function, in
preparation for adding a separate wait phase for FLUSH TABLES WITH
READ LOCK.
2015-11-13 14:02:14 +01:00
75dc267101 Change Seconds_behind_master to be updated only at commit in parallel replication
Before, the Seconds_behind_master was updated already when an event
was queued for a worker thread to execute later. This might lead users
to interpret a low value as the slave being almost up to date with the
master, while in reality there might still be lots and lots of events
still queued up waiting to be applied by the slave.

See https://lists.launchpad.net/maria-developers/msg08958.html for
more detailed discussions.
2015-11-13 10:24:53 +01:00
df9b8aee58 Merge MDEV-8193 into 10.1
Conflicts:
	sql/rpl_rli.cc
2015-09-11 12:01:48 +02:00
51eaa7fe53 MDEV-8193: UNTIL clause in START SLAVE is sporadically disobeyed by parallel replication
The code was using the wrong variable when comparing the binlog name
for the UNTIL position. This could cause the comparison to fail after
binlog rotation, in turn causing the UNTIL clause to not trigger slave
stop.
2015-09-11 10:51:56 +02:00
b85a00161e MDEV-8264 encryption for binlog
* Start_encryption_log_event
* --encrypt-binlog command line option

based on google patches.
2015-09-04 10:33:55 +02:00
ef82cb7c2c Merge MDEV-8725 into 10.1 2015-09-02 10:53:37 +02:00
999c43aeb7 MDEV-8725: Assertion `!(thd->rgi_slave && thd-> rgi_slave->did_mark_start_commit)' failed in ha_rollback_trans
The assertion is there to catch cases where we rollback while
mark_start_commit() is active. This can allow following event groups
to be replicated too early, causing conflicts.

But in this case, we have an _explicit_ ROLLBACK event in the binlog,
which should not assert.

We fix this by delaying the mark_start_commit() in the explicit
ROLLBACK case. It seems safest to delay this in ROLLBACK case anyway,
and there should be no reason to try to optimise this corner case.
2015-09-02 09:57:18 +02:00
dbd205797b Merge MDEV-8302 into 10.1 2015-08-04 12:39:22 +02:00