1
0
mirror of https://github.com/MariaDB/server.git synced 2025-11-30 05:23:50 +03:00
Commit Graph

2352 Commits

Author SHA1 Message Date
unknown
2842f6b5dc MDEV-4506: Parallel replication: error handling.
Add an error code to the wait_for_commit facility.

Now, when a transaction fails, it can signal the error to
any subsequent transaction that is waiting for it to commit.
The waiting transactions then receive the error code back from
wait_for_prior_commit() and can handle the error appropriately.

Also fix one race that could cause crash if @@slave_parallel_threads
were changed several times quickly in succession.
2013-10-14 15:28:16 +02:00
Michael Widenius
2e100cc5a4 Fixes for parallel slave:
- Made slaves temporary table multi-thread slave safe by adding mutex around save_temporary_table usage.
  - rli->save_temporary_tables is the active list of all used temporary tables
  - This is copied to THD->temporary_tables when temporary tables are opened and updated when temporary tables are closed
  - Added THD->lock_temporary_tables() and THD->unlock_temporary_tables() to simplify this.
- Relay_log_info->sql_thd renamed to Relay_log_info->sql_driver_thd to avoid wrong usage for merged code.
- Added is_part_of_group() to mark functions that are part of the next function. This replaces setting IN_STMT when events are executed.
- Added is_begin(), is_commit() and is_rollback() functions to Query_log_event to simplify code.
- If slave_skip_counter is set run things in single threaded mode. This simplifies code for skipping events.
- Updating state of relay log (IN_STMT and IN_TRANSACTION) is moved to one single function: update_state_of_relay_log()
  We can't use OPTION_BEGIN to check for the state anymore as the sql_driver and sql execution threads may be different.
  Clear IN_STMT and IN_TRANSACTION in init_relay_log_pos() and Relay_log_info::cleanup_context() to ensure the flags doesn't survive slave restarts
  is_in_group() is now independent of state of executed transaction.
- Reset thd->transaction.all.modified_non_trans_table() if we did set it for single table row events.
  This was mainly for keeping the flag as documented.
- Changed slave_open_temp_tables to uint32 to be able to use atomic operators on it.
- Relay_log_info::sleep_lock -> rpl_group_info::sleep_lock
- Relay_log_info::sleep_cond -> rpl_group_info::sleep_cond
- Changed some functions to take rpl_group_info instead of Relay_log_info to make them multi-slave safe and to simplify usage
  - do_shall_skip()
  - continue_group()
  - sql_slave_killed()
  - next_event()
- Simplifed arguments to io_salve_killed(), check_io_slave_killed() and sql_slave_killed(); No reason to supply THD as this is part of the given structure.
- set_thd_in_use_temporary_tables() removed as in_use is set on usage
- Added information to thd_proc_info() which thread is waiting for slave mutex to exit.
- In open_table() reuse code from find_temporary_table()

Other things:
- More DBUG statements
- Fixed the rpl_incident.test can be run with --debug
- More comments
- Disabled not used function rpl_connect_master()

mysql-test/suite/perfschema/r/all_instances.result:
  Moved sleep_lock and sleep_cond to rpl_group_info
mysql-test/suite/rpl/r/rpl_incident.result:
  Updated result
mysql-test/suite/rpl/t/rpl_incident-master.opt:
  Not needed anymore
mysql-test/suite/rpl/t/rpl_incident.test:
  Fixed that test can be run with --debug
sql/handler.cc:
  More DBUG_PRINT
sql/log.cc:
  More comments
sql/log_event.cc:
  Added DBUG statements
  do_shall_skip(), continue_group() now takes rpl_group_info param
  Use is_begin(), is_commit() and is_rollback() functions instead of inspecting query string
  We don't have set slaves temporary tables 'in_use' as this is now done when tables are opened.
  Removed IN_STMT flag setting. This is now done in update_state_of_relay_log()
  Use IN_TRANSACTION flag to test state of relay log.
  In rows_event_stmt_cleanup() reset thd->transaction.all.modified_non_trans_table if we had set this before.
sql/log_event.h:
  do_shall_skip(), continue_group() now takes rpl_group_info param
  Added is_part_of_group() to mark events that are part of the next event. This replaces setting IN_STMT when events are executed.
  Added is_begin(), is_commit() and is_rollback() functions to Query_log_event to simplify code.
sql/log_event_old.cc:
  Removed IN_STMT flag setting. This is now done in update_state_of_relay_log()
  do_shall_skip(), continue_group() now takes rpl_group_info param
sql/log_event_old.h:
  Added is_part_of_group() to mark events that are part of the next event.
  do_shall_skip(), continue_group() now takes rpl_group_info param
sql/mysqld.cc:
  Changed slave_open_temp_tables to uint32 to be able to use atomic operators on it.
  Relay_log_info::sleep_lock -> Rpl_group_info::sleep_lock
  Relay_log_info::sleep_cond -> Rpl_group_info::sleep_cond
sql/mysqld.h:
  Updated types and names
sql/rpl_gtid.cc:
  More DBUG
sql/rpl_parallel.cc:
  Updated TODO section
  Set thd for event that is execution
  Use new  is_begin(), is_commit() and is_rollback() functions.
  More comments
sql/rpl_rli.cc:
  sql_thd -> sql_driver_thd
  Relay_log_info::sleep_lock -> rpl_group_info::sleep_lock
  Relay_log_info::sleep_cond -> rpl_group_info::sleep_cond
  Clear IN_STMT and IN_TRANSACTION in init_relay_log_pos() and Relay_log_info::cleanup_context() to ensure the flags doesn't survive slave restarts.
  Reset table->in_use for temporary tables as the table may have been used by another THD.
  Use IN_TRANSACTION instead of OPTION_BEGIN to check state of relay log.
  Removed IN_STMT flag setting. This is now done in update_state_of_relay_log()
sql/rpl_rli.h:
  Changed relay log state flags to bit masks instead of bit positions (most other code we have uses bit masks)
  Added IN_TRANSACTION to mark if we are in a BEGIN ... COMMIT section.
  save_temporary_tables is now thread safe
  Relay_log_info::sleep_lock -> rpl_group_info::sleep_lock
  Relay_log_info::sleep_cond -> rpl_group_info::sleep_cond
  Relay_log_info->sql_thd renamed to Relay_log_info->sql_driver_thd to avoid wrong usage for merged code
  is_in_group() is now independent of state of executed transaction.
sql/slave.cc:
  Simplifed arguments to io_salve_killed(), sql_slave_killed() and check_io_slave_killed(); No reason to supply THD as this is part of the given structure.
  set_thd_in_use_temporary_tables() removed as in_use is set on usage in sql_base.cc
  sql_thd -> sql_driver_thd
  More DBUG
  Added update_state_of_relay_log() which will calculate the IN_STMT and IN_TRANSACTION state of the relay log after the current element is executed.
  If slave_skip_counter is set run things in single threaded mode.
  Simplifed arguments to io_salve_killed(), check_io_slave_killed() and sql_slave_killed(); No reason to supply THD as this is part of the given structure.
  Added information to thd_proc_info() which thread is waiting for slave mutex to exit.
  Disabled not used function rpl_connect_master()
  Updated argument to next_event()
sql/sql_base.cc:
  Added mutex around usage of slave's temporary tables. The active list is always kept up to date in sql->rgi_slave->save_temporary_tables.
  Clear thd->temporary_tables after query (safety)
  More DBUG
  When using temporary table, set table->in_use to current thd as the THD may be different for slave threads.
  Some code is ifdef:ed with REMOVE_AFTER_MERGE_WITH_10 as the given code in 10.0 is not yet in this tree.
  In open_table() reuse code from find_temporary_table()
sql/sql_binlog.cc:
  rli->sql_thd -> rli->sql_driver_thd
  Remove duplicate setting of rgi->rli
sql/sql_class.cc:
  Added helper functions rgi_lock_temporary_tables() and rgi_unlock_temporary_tables()
  Would have been nicer to have these inline, but there was no easy way to do that
sql/sql_class.h:
  Added functions to protect slaves temporary tables
sql/sql_parse.cc:
  Added DBUG_PRINT
sql/transaction.cc:
  Added comment
2013-10-14 00:24:05 +03:00
Sergey Petrunya
e5d13c1567 Merge 10.0-base -> 10.0 2013-10-16 13:38:42 +04:00
Sergey Petrunya
fedf769f0b MDEV-3798: EXPLAIN UPDATE/DELETE
- Address review feedback: rename nearly any name used by the new EXPLAIN code.
2013-10-05 09:58:22 +04:00
unknown
5d1d20e409 MDEV-4506: parallel replication.
Remove some unnecessary mutex locking.
2013-09-23 10:22:46 +02:00
Sergey Petrunya
d998a1635f MDEV-5045: Server crashes in QPF_query::print_explain with log_slow_verbosity='explain'
- Don't print a plan when the statement didn't produce it
- Also, add first testcase. We can't check the EXPLAIN from the slow log itself, though.
2013-09-20 17:45:24 +04:00
Sergei Golubchik
9af177042e 10.0-base merge.
Partitioning/InnoDB changes are *not* merged (they'll come from 5.6)
TokuDB does not compile (not updated to 10.0 SE API)
2013-09-21 10:14:42 +02:00
unknown
36df41ed95 MDEV-4506: Parallel replication.
Add another test case.
2013-09-19 12:45:59 +02:00
Sergey Petrunya
2add402891 MDEV-407: Print EXPLAIN [ANALYZE] in the slow query log
- Initial implementation.
2013-09-19 08:33:58 +04:00
unknown
39794dc72c MDEV-4506: Parallel replication.
Add another test case, using DEBUG_SYNC.
Fix one bug found.
2013-09-17 14:07:21 +02:00
unknown
7781cdb79b MDEV-4506: parallel replication.
Add comments explaining tricky memory barrier semantics and
suggestions for future changes.
2013-09-17 11:33:29 +02:00
unknown
d107bdaa01 MDEV-4506, parallel replication.
Some after-review fixes.
2013-09-13 15:09:57 +02:00
unknown
ada15c7a0f Fix various places where code would work incorrectly if the common_header_len of events is different on master and slave
Patch developed with the help of Pavel Ivanov.

Also fix an uninitialised variable in queue_event().
2013-09-04 12:22:09 +02:00
unknown
f9c2b402f4 MDEV-26: Global transaction ID.
Implement @@gtid_binlog_state. This is the internal state of the binlog
(most recent GTID logged for every domain_id and server_id). This allows
to save the state before RESET MASTER and restore it afterwards.
2013-08-23 14:02:13 +02:00
Sergei Golubchik
b7b5f6f1ab 10.0-monty merge
includes:
* remove some remnants of "Bug#14521864: MYSQL 5.1 TO 5.5 BUGS PARTITIONING"
* introduce LOCK_share, now LOCK_ha_data is strictly for engines
* rea_create_table() always creates .par file (even in "frm-only" mode)
* fix a 5.6 bug, temp file leak on dummy ALTER TABLE
2013-07-21 16:39:19 +02:00
Sergei Golubchik
5f6380adde 10.0-base merge 2013-07-18 16:46:57 +02:00
Sergei Golubchik
97e640b9ae 5.5 merge 2013-07-17 21:24:29 +02:00
Sergei Golubchik
005c7e5421 mysql-5.5.32 merge 2013-07-16 19:09:54 +02:00
unknown
e654be3865 MDEV-4506: Parallel replication: Intermediate commit.
Impement options --binlog-commit-wait-count and
--binlog-commit-wait-usec.

These options permit the DBA to deliberately increase latency
of an individual commit to get more transactions in each
binlog group commit. This increases the opportunity for
parallel replication on the slave, and can also decrease I/O
load on the master.

The options also make it easier to test the parallel
replication with mysql-test-run.
2013-07-05 00:26:15 +02:00
Sergey Petrunya
77584dd6d7 MDEV-4756: 10.0-monty tree: log_state.test fails
- make the test output stable
- make Log_to_csv_event_handler::log_slow() to write the value 
  of thd->thread_id (it didn't, and so 0 was always logged).
2013-07-04 15:45:58 +04:00
Ashish Agarwal
e879caf845 WL#7076: Backporting wl6715 to support both formats in 5.5, 5.6, 5.7
Backporting wl6715 to mysql-5.5
2013-07-02 11:58:39 +05:30
Ashish Agarwal
f5b5e6b951 WL#7076: Backporting wl6715 to support both formats in 5.5, 5.6, 5.7
Backporting wl6715 to mysql-5.5
2013-07-02 11:58:39 +05:30
unknown
7e5dc4f074 MDEV-4506: Parallel replication. Intermediate commit.
Implement facility for the commit in one thread to wait for the commit of
another to complete first. The wait is done in a way that does not hinder
that a waiter and a waitee can group commit together with a single fsync()
in both binlog and InnoDB. The wait is done efficiently with respect to
locking.

The patch was originally made to support TaoBao parallel replication with
in-order commit; now it will be adapted to also be used for parallel
replication of group-committed transactions.

A waiter THD registers itself with a prior waitee THD. The waiter will then
complete its commit at the earliest in the same group commit of the waitee
(when using binlog). The wait can also be done explicitly by the waitee.
2013-06-26 12:10:35 +02:00
unknown
26a9fbc416 MDEV-4506: Parallel replication of group-committed transactions: Intermediate commit
First very rough sketch. We spawn and retire a pool of slave threads.
Test main.alias works, most likely not much else does.
2013-06-24 10:50:25 +02:00
Vladislav Vaintroub
c61c1d7ce1 MDEV-4576 : Aria storage engine's temporary files might not be deleted (Errcode : 13)
See also MySQL Bug #39750  and similar ones.

Fix my_delete() on Windows, to safely remvoe files on Windows, including files  that are opened by another threads in the same process, antiviruses and  backup applications. If file to be deleted is  also  opened by another thread, the file is renamed to unique name prior to deletion - this makes it possible to create file with the same name right after deletion.
With this patch my_delete_allow_opened() becomes obsolete and is replaced with my_delete().

This patch is rework of the patch  http://lists.mysql.com/commits/59327  for MySQL bug#39750.
2013-06-16 22:13:26 +02:00
Michael Widenius
5f1f2fc0e4 Applied all changes from Igor and Sanja 2013-06-15 18:32:08 +03:00
Sergei Golubchik
72ba95873a 10.0-base merge
(without InnoDB - all InnoDB changes were ignored)
2013-06-06 21:32:29 +02:00
Sergei Golubchik
4749d40c63 5.5 merge 2013-06-06 17:51:28 +02:00
unknown
a0fd7382bc Merge 10.0-base -> 10.0 2013-05-28 15:39:56 +02:00
unknown
ee2b7db3f8 MDEV-4478: Implement GTID "strict mode"
When @@GLOBAL.gtid_strict_mode=1, then certain operations result
in error that would otherwise result in out-of-order binlog files
between servers.

GTID sequence numbers are now allocated independently per domain;
this results in less/no holes in GTID sequences, increasing the
likelyhood that diverging binlogs will be caught by the slave when
GTID strict mode is enabled.
2013-05-28 13:28:31 +02:00
unknown
1cd6eb5f94 MDEV-26: Global transaction ID.
Change of user interface to be more logical and more in line with expectations
to work similar to old-style replication.

User can now explicitly choose in CHANGE MASTER whether binlog position is
taken into account (master_gtid_pos=current_pos) or not (master_gtid_pos=
slave_pos) when slave connects to master.

@@gtid_pos is replaced by three separate variables @@gtid_slave_pos (can
be set by user, replicated GTIDs only), @@gtid_binlog_pos (read only), and
@@gtid_current_pos (a combination of the two, most recent GTID within each
domain). mysql.rpl_slave_state is renamed to mysql.gtid_slave_pos to match.

This fixes MDEV-4474.
2013-05-22 17:36:48 +02:00
unknown
9fae993024 MDEV-26: Global transaction ID.
Implement START SLAVE UNTIL master_gtid_pos = "<GTID position>".

Add test cases, including a test showing how to use this to promote
a new master among a set of slaves.
2013-05-15 19:52:21 +02:00
unknown
e5a0daae5a Fix big problem in previous push. (Relay log cleanup would nuke binlog state) 2013-05-06 14:35:34 +02:00
Michael Widenius
5333dafa84 Fixed errors and compiler warnings found by buildbot
Solaris fixes:
- Fixed that wait_timeout_func and wait_timeout tests works on solaris
- We have to compile without NO_ALARM on Solaris as Solaris doesn't support timeouts on sockets with setsockopt(.. SO_RCVTIMEO).
- Fixed that compile-solaris-amd64-debug works (before that we got a wrong ELF class: ELFCLASS64 on linkage)
- Added missing sync_with_master
Other bug fixes:
- Free memory for rpl_global_gtid_binlog_state before exit() to avoid 'accessing uninitalized mutex' error.



BUILD/FINISH.sh:
  Fixed issues on Solaris with ksh
BUILD/compile-solaris-amd64-debug:
  Added missing -m64 flag
configure.cmake:
  We have to compile without NO_ALARM on Solaris as Solaris doesn't support timeouts on sockets with setsockopt(.. SO_RCVTIMEO)
mysql-test/suite/rpl/t/rpl_gtid_mdev4473.test:
  - Added missing sync_with_master (fix by knielsen)
sql-common/client.c:
  Added () to get rid of compiler warning
sql/item_strfunc.cc:
  Fixed compiler warning
sql/log.cc:
  Free memory for static variable rpl_global_gtid_binlog_state before exit()
  - If we are compiling with safemalloc, we would try to call sf_free() for some members after sf_terminate() was called, which would result of trying to access the uninitalized mutex 'sf_mutex'
sql/multi_range_read.cc:
  Fixed compiler warnings of converting double to ulong.
sql/opt_range.cc:
  Fixed compiler warnings of converting double to ulong or uint
  - Better to have all variables that can be number of rows as 'ha_rows'
sql/rpl_gtid.cc:
  Added rpl_binlog_state::free() to be able to free memory for static objects before exit()
sql/rpl_gtid.h:
  Added rpl_binlog_state::free() to be able to free memory for static objects before exit()
sql/set_var.cc:
  Fixed compiler warning
sql/sql_join_cache.cc:
  Fixed compiler warnings of converting double to uint
sql/sql_show.cc:
  Added cast to get rid of compiler warning
sql/sql_statistics.cc:
  Remove code that didn't do anything.
  (store_record() with record[0] is a no-op)
storage/xtradb/os/os0file.c:
  Added  __attribute__ ((unused))
support-files/compiler_warnings.supp:
  Ignore warnings from atomic_add_64_nv
  (was not able to fix this with a cast as the macro is a bit different between systems)
vio/viosocket.c:
  Added more DBUG_PRINT
2013-05-05 21:39:31 +03:00
Sergei Golubchik
07315d3603 strmake_buf(X,Y) helper, equivalent to strmake(X,Y,sizeof(X)-1)
with a bit of lame protection against abuse.
2013-04-17 19:42:34 +02:00
unknown
0e7410a154 Merge 10.0-base -> 10.0 (GTID). 2013-04-17 15:17:01 +02:00
Sergei Golubchik
a9035be5b7 10.0-base merge 2013-04-15 15:09:22 +02:00
unknown
665a31af2b MDEV-26: Global transaction ID. First alpha release.
Merge of 10.0-mdev26 feature tree into 10.0-base.

Global transaction ID is prepended to each event group in the binlog.

Slave connect can request to start from GTID position instead of specifying
file name/offset of master binlog. This facilitates easy switch to a new
master.

Slave GTID state is stored in a table mysql.rpl_slave_state, which can be
InnoDB to get crash-safe slave state.

GTID includes a replication domain ID, allowing to keep track of distinct
positions for each of multiple masters.
2013-04-15 10:55:27 +02:00
Sergei Golubchik
f57ecb7786 5.5 merge 2013-04-14 10:04:07 +02:00
Sergei Golubchik
bbbd7cedf5 my_dir() cleanup
* replace pointer acrobatics with a struct
* make sorting explicit: MY_DONT_SORT -> MY_WANT_SORT
(if you want something to be done - say it. fixes all places where
my_dir() was used without thinking)
* typo s/number_off_files/number_of_files/
* directory_file_name() doesn't need to be extern
* remove #ifdef __BORLANDC__
* ignore '.' and '..' entries
2013-04-07 15:19:45 +02:00
Sergei Golubchik
2901497b18 MDEV-4243 Warnings/errors while compiling with clang 2013-03-28 20:04:14 +01:00
unknown
b0389850a5 MDEV-26: Global transaction ID.
Test crashing the master, check that it recovers the binlog state.

Fix one bug introduced by previous commit (crash-recoved binlog state was
overwritten by loading stale binlog state file).

Fix Windows build error.
2013-03-27 19:29:59 +01:00
unknown
0fdbdde474 MDEV-26: Global transaction ID.
Implement test case rpl_gtid_stop_start.test to test normal stop and restart
of master and slave mysqld servers.

Fix a couple bugs found with the test:

 - When InnoDB is disabled (no XA), the binlog state was not read when master
   mysqld starts.

 - Remove old code that puts a bogus D-S-0 into the initial binlog state, it
   is not correct in current design.

 - Fix memory leak in gtid_find_binlog_file().
2013-03-27 16:06:45 +01:00
Michael Widenius
068c61978e Temporary commit of 10.0-merge 2013-03-26 00:03:13 +02:00
unknown
22f91eddb1 MDEV-4322: Race in binlog checkpointing during server shutdown.
During server shutdown, we need to wait for binlog checkpointing to
finish in the binlog background thread before closing the binlog.

This was not done, so we could get assert and failure to finish the
final binlog checkpoint if shutdown happened in the middle.
2013-03-25 12:05:27 +01:00
unknown
9bb989a9d1 MDEV-26: Global transaction ID.
Fix MDEV-4275 - I/O thread restart duplicates events in the relay log.
The first time we connect to master after CHANGE MASTER or restart, we connect
from the GTID position. But then subsequent reconnects or IO thread restarts
reconnect with the old-style file/offset binlog pos from where it left off at
last disconnect. This is necessary to avoid duplicate events in the relay
logs, as there is nothing that synchronises the SQL thread update of GTID
state (multiple threads in case of multi-source) with IO thread reconnects.

Test cases.

Some small cleanups and fixes.
2013-03-21 11:03:31 +01:00
Murthy Narkedimilli
8afe262ae5 Fix for Bug 16395495 - OLD FSF ADDRESS IN GPL HEADER 2013-03-19 15:53:48 +01:00
Murthy Narkedimilli
d978016d93 Fix for Bug 16395495 - OLD FSF ADDRESS IN GPL HEADER 2013-03-19 15:53:48 +01:00
unknown
9d9ddad759 MDEV-26: Global transaction ID.
Fix things so that a master can switch with MASTER_GTID_POS=AUTO to a slave
that was previously running with log_slave_updates=0, by looking into the
slave replication state on the master when the slave requests something not
present in the binlog.

Be a bit more strict about what position the slave can ask for, to avoid some
easy-to-hit misconfiguration errors.

Start over with seq_no counter when RESET MASTER.
2013-03-18 15:09:36 +01:00
unknown
2cf3d61fc2 MDEV-26: Global Transaction ID
- Fix skipping initial MyISAM DML when connecting using GTID.

 - Fix RESET MASTER not clearing in-memory binlog state.

 - Fix not reading standalone flag in Gtid_log_event::peek().

 - Fix skipping DDL that the slave has already seen when using GTID position.
2013-02-22 12:31:55 +01:00