mariadb

mirror of https://github.com/MariaDB/server.git synced 2025-12-24 11:21:21 +03:00

Author	SHA1	Message	Date
Andrei Elkin	c8ae357341	MDEV-742 XA PREPAREd transaction survive disconnect/server restart Lifted long standing limitation to the XA of rolling it back at the transaction's connection close even if the XA is prepared. Prepared XA-transaction is made to sustain connection close or server restart. The patch consists of - binary logging extension to write prepared XA part of transaction signified with its XID in a new XA_prepare_log_event. The conclusion part - with Commit or Rollback decision - is logged separately as Query_log_event. That is in the binlog the XA consists of two separate groups of events. That makes the whole XA possibly interweaving in binlog with other XA:s or regular transaction but with no harm to replication and data consistency. Gtid_log_event receives two more flags to identify which of the two XA phases of the transaction it represents. With either flag set also XID info is added to the event. When binlog is ON on the server XID::formatID is constrained to 4 bytes. - engines are made aware of the server policy to keep up user prepared XA:s so they (Innodb, rocksdb) don't roll them back anymore at their disconnect methods. - slave applier is refined to cope with two phase logged XA:s including parallel modes of execution. This patch does not address crash-safe logging of the new events which is being addressed by MDEV-21469. CORNER CASES: read-only, pure myisam, binlog-, @@skip_log_bin, etc Are addressed along the following policies. 1. The read-only at reconnect marks XID to fail for future completion with ER_XA_RBROLLBACK. 2. binlog- filtered XA when it changes engine data is regarded as loggable even when nothing got cached for binlog. An empty XA-prepare group is recorded. Consequent Commit-or-Rollback succeeds in the Engine(s) as well as recorded into binlog. 3. The same applies to the non-transactional engine XA. 4. @@skip_log_bin=OFF does not record anything at XA-prepare (obviously), but the completion event is recorded into binlog to admit inconsistency with slave. The following actions are taken by the patch. At XA-prepare: when empty binlog cache - don't do anything to binlog if RO, otherwise write empty XA_prepare (assert(binlog-filter case)). At Disconnect: when Prepared && RO (=> no binlogging was done) set Xid_cache_element::error := ER_XA_RBROLLBACK keep XID in the cache, and rollback the transaction. At XA-"complete": Discover the error, if any don't binlog the "complete", return the error to the user. Kudos ----- Alexey Botchkov took to drive this work initially. Sergei Golubchik, Sergei Petrunja, Marko Mäkelä provided a number of good recommendations. Sergei Voitovich made a magnificent review and improvements to the code. They all deserve a bunch of thanks for making this work done!	2020-03-14 22:45:48 +02:00
Oleksandr Byelkin	fad47df995	Merge branch '10.4' into 10.5	2020-03-11 17:52:49 +01:00
Oleksandr Byelkin	b8c0e49670	Merge commit '10.3' into 10.4	2020-03-11 13:27:10 +01:00
Alexander Barkov	a1e330de5a	MDEV-21743 Split up SUPER privilege to smaller privileges	2020-03-10 23:49:47 +04:00
Sergei Golubchik	7af733a5a2	perfschema compilation, test and misc fixes	2020-03-10 19:24:23 +01:00
Sergei Golubchik	6ded554fc2	perfschema thread instrumentation related changes	2020-03-10 19:24:23 +01:00
Sergei Golubchik	7c58e97bf6	perfschema memory related instrumentation changes	2020-03-10 19:24:22 +01:00
Oleksandr Byelkin	440452628d	Merge branch '10.2' into 10.3	2020-03-06 23:28:26 +01:00
seppo	4618c974e4	MDEV-21723 Async slave thread BF abort and replaying fixes (#1448) If async replication slave thread conflicts with cluster replication, then the async slave transaction should be BF aborted, and depending on the state of async slave transaction execution, potentially also replayed. There were problems in such BF abort implementation and the replaying was not started. This pull request contains fixes which make sure that if async slave thread is marked to abort and replay, it will completely carry out the rollback and release all locks and resources before starting the replaying. After replaying, the async slave transaction is treated as successful, so the slave thread will continue as usual, handling the next replication event. There is also a new mtr test: galera.galera_slave_replay, which stresses both a certification failure for async slave thread and a successful BF abort followed by replaying.	2020-02-23 10:29:42 +02:00
Marko Mäkelä	a983b24407	Merge 10.4 into 10.5	2020-01-28 14:17:09 +02:00
Alexander Barkov	f1e13fdc8d	MDEV-21581 Helper functions and methods for CHARSET_INFO	2020-01-28 12:29:23 +04:00
Oleksandr Byelkin	6918157e98	Merge branch '10.3' into 10.4	2020-01-21 23:15:02 +01:00
Andrei Elkin	5cd21ac202	MDEV-20821 parallel slave server shutdown hang Parallel slave server shutdown found to be hanging in close_connections() triggered by shutdown due to a slave worker thread would not be notified to exit in case the worker was sitting idle. Fixed with destroying the worker pool earlier that is in slave_prepare_for_shutdown() when all their driver threads have already left. A test file is added to simulate the bug condition as well as check multi-sourced and not-idle worker cases.	2020-01-21 16:11:52 +02:00
Oleksandr Byelkin	ade89fc898	Merge branch '10.2' into 10.3	2020-01-21 09:11:14 +01:00
Oleksandr Byelkin	3a1716a7e7	Merge branch '10.1' into 10.2	2020-01-20 16:15:05 +01:00
Oleksandr Byelkin	f31bf6f094	Merge branch '5.5' into 10.1	2020-01-19 12:22:12 +01:00
Markus Mäkelä	5683c113b8	Use get_ident_len in heartbeat event error messages The string doesn't appear to be null-terminated when binlog checksums are enabled. This causes a corrupt binlog name in the error message when a slave is ahead of the master.	2020-01-13 14:50:02 +02:00
Marko Mäkelä	28c89b7151	Merge 10.4 into 10.5	2019-12-16 07:47:17 +02:00
Oleksandr Byelkin	a15234bf4b	Merge branch '10.3' into 10.4	2019-12-09 15:09:41 +01:00
Oleksandr Byelkin	008ee867a4	Merge branch '10.2' into 10.3	2019-12-04 17:46:28 +01:00
Jan Lindström	c9b9eb3315	MDEV-18497 : CTAS async replication from mariadb master crashes galera nodes (#1410) In MariaDB 10.2 master could have been configured so that there are extra annotate events. When we peek at the next event type for CTAS we need to skip annotate events.	2019-12-04 11:46:37 +02:00
Oleksandr Byelkin	f8b5e147da	Merge branch '10.1' into 10.2	2019-12-03 14:45:06 +01:00
seppo	38839854b7	MDEV-19572 async slave node fails to apply MyISAM only writes (#1418) The problem happens when MariaDB master replicates writes for only non InnoDB tables (e.g. writes to MyISAM table(s)). Async slave node, in Galera cluster, can apply these writes successfully, but it will, in the end, write gtid position in mysql.gtid_slave_pos table. mysql.gtid_slave_pos table is InnoDB engine, and this write makes innodb handlerton part of the replicated "transaction". Note that wsrep patch identifies that write to gtid_slave_pos should not be replicated and skips appending wsrep keys for these writes. However, as InnoDB was present in the transaction, and there are replication events (for MyISAM table) in transaction cache, but there are no appended keys, wsrep raises an error, and this makes the slave thread stop. The fix is simply to not treat it as an error if async slave tries to replicate a write set with binlog events, but no keys. We just skip wsrep replication and return successfully. This commit also contains a mtr test which forces mysql.gtid_slave_pos table to be of InnoDB engine, and executes MyISAM only write through async replication. There is additional fix for declaring IO and background slave threads as non wsrep. These threads should not write anything for wsrep replication, and this is just a safeguard to make sure nothing leaks into cluster from these slave threads.	2019-11-26 08:49:50 +02:00
seppo	5c68343db7	MDEV-18497 CTAS async replication from mariadb master crashes galera nodes (#1410) This PR contains a mtr test for reproducing a failure with replicating create table as select statement (CTAS) through asynchronous mariadb replication to mariadb galera cluster. The problem happens when CTAS replication contains both create table statement followed by row events for populating the table. In such situation, the galera node operating as mariadb replication slave, will first replicate only the create table part into the cluster, and then perform another replication containing both the create table and row events. This will lead all other nodes to fail for duplicate table create attempt, and crash due to this failure. PR contains also a fix, which identifies the situation when CTAS has been replicated, and makes further scan in async replication stream to see if there are following row events. The slave node will replicate either single TOI in case the CTAS table is empty, or if CTAS table contains rows, then single bundled write set with create table and row events is replicated to galera cluster. This fix should keep master server's GTID's for CTAS replication in sync with GTID's in galera cluster.	2019-11-18 15:18:00 +02:00
Marko Mäkelä	d04f2de80a	Merge 10.4 into 10.5	2019-10-11 08:41:36 +03:00
Marko Mäkelä	c11e5cdd12	Merge 10.3 into 10.4	2019-10-10 11:19:25 +03:00
Marko Mäkelä	892378fb9d	Merge 10.2 into 10.3	2019-10-09 13:25:11 +03:00
Sachin Setiya	27664ef29d	MDEV-20574 Position of events reported by mysqlbinlog is wrong with encrypted binlogs, SHOW BINLOG EVENTS reports the correct one. Analysis Mysqlbinlog output for encrypted binary log #Q> insert into tab1 values (3,'row 003') #190912 17:36:35 server id 10221 end_log_pos 980 CRC32 0x53bcb3d3 Table_map: `test`.`tab1` mapped to number 19 # at 940 #190912 17:36:35 server id 10221 end_log_pos 1026 CRC32 0xf2ae5136 Write_rows: table id 19 flags: STMT_END_F Here we can see Table_map_log_event ends at 980 but Next event starts at 940. And the reason for that is we do not send START_ENCRYPTION_EVENT to the slave Solution:- Send Start_encryption_log_event as Ignorable_log_event to slave(mysqlbinlog), So that mysqlbinlog can update its log_pos. Since Slave can request multiple FORMAT_DESCRIPTION_EVENT while master does not have so We only update slave master pos when master actually have the FORMAT_DESCRIPTION_EVENT. Similar logic should be applied for START_ENCRYPTION_EVENT. Also added the test case when new server reads the data from old server which does not send START_ENCRYPTION_EVENT to slave. Master Slave Upgrade Scenario. When Slave is updated first, Slave will have extra logic of handling START_ENCRYPTION_EVENT But master will not be sending START_ENCRYPTION_EVENT. So there will be no issue. When Master is updated first, It will send START_ENCRYPTION_EVENT to slave, But slave will ignore this event in queue_event.	2019-10-08 14:35:34 +05:30
sachin	1e0f09cacb	MDEV-16239 Many test in rpl suite fails Fix rpl_skip_error test. We can't reset Slave_skipped_errors (even with FLUSH STATUS), So instead of absolute slave_skipped_errors we look for delta of slave_skipped_errors Fix rpl.rpl_binlog_errors and binlog_encryption.rpl_binlog_errors We create the $load_file and $load_file2 but we never remove them. Fix rpl_000011.test Instead of real value use delta value, Since flush status won't flush LONGLONG variable. Fix rpl_row_find_row_debug Instead of searching whole log_error_ file we will use search_pattern_in_file which runs pattern search only on latest test run, instead of full file. Fix rpl_ip_mix rpl_ip_mix2 We should call reset slave all because we also want to reset master_host otherwise show slave status won't be empty and making repeat N a failure. Fix rpl_rotate_logs First we have to remove master.info file (cleanup) and second we have to call reset slave all because if we do not call reset slave all then we won't read master.info file because we already have master config in memory. And this makes start slave to pass, which should fail because its permission is 000 Fix circular_serverid0 test The reason is that ++dbug_rows_event_count == 2 in queue_event does not take --repeat into account. So I have reset the dbug_rows_event_count in if body.	2019-10-08 13:34:25 +05:30
Marko Mäkelä	1333da90b5	Merge 10.4 into 10.5	2019-09-24 10:07:56 +03:00
Marko Mäkelä	5a92ccbaea	Merge 10.3 into 10.4 Disable MDEV-20576 assertions until MDEV-20595 has been fixed.	2019-09-23 17:35:29 +03:00
Sujatha	90a9c4cae7	MDEV-20217: Semi_sync: Last_IO_Error: Fatal error: Failed to run 'after_queue_event' hook Fix: === Implemented upstream fix. commit `7d3d0fc303` Author: He Zhenxing <zhenxing.he@sun.com> Backport Bug#45852 Semisynch: Last_IO_Error: Fatal error: Failed to run 'after_queue_event' hook Errors when send reply to master should never cause the IO thread to stop, because master can fall back to async replication if it does not get reply from slave. The problem is fixed by deliberately ignoring the return value of slave_reply.	2019-09-16 15:45:24 +05:30
Marko Mäkelä	624dd71b94	Merge 10.4 into 10.5	2019-08-13 18:57:00 +03:00
Sujatha	2a8ae4bdce	MDEV-19855: Create "Sql_cmd_show_slave_status" class for "SHOW SLAVE STATUS" command. Create "Sql_cmd_show_slave_status" class for "SHOW SLAVE STATUS" command.	2019-07-01 19:25:12 +05:30
Kentoku SHIBA	1635ea9474	MDEV-17402 slave_transaction_retry_errors="12701" won't be enabled (#1346) error code 12701 is already included in default value, but other plugin specific error codes are ignored because of checking with ER_ERROR_LAST. ER_ERROR_LAST does not include plugin specific error codes. So I just removed it for fixing this issue.	2019-06-29 00:05:34 +09:00
Eugene Kosov	d36c107a6b	improve clang build cmake -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ -DCMAKE_BUILD_TYPE=Debug Maintainer mode makes all warnings errors. This patch fixes warnings. Mostly about deprecated `register` keyword. Too many warnings came from Mroonga and I gave up on it.	2019-06-25 13:21:36 +03:00
Oleksandr Byelkin	c07325f932	Merge branch '10.3' into 10.4	2019-05-19 20:55:37 +02:00
Marko Mäkelä	be85d3e61b	Merge 10.2 into 10.3	2019-05-14 17:18:46 +03:00
Marko Mäkelä	26a14ee130	Merge 10.1 into 10.2	2019-05-13 17:54:04 +03:00
Vicențiu Ciorbaru	cb248f8806	Merge branch '5.5' into 10.1	2019-05-11 22:19:05 +03:00
Vicențiu Ciorbaru	5543b75550	Update FSF Address * Update wrong zip-code	2019-05-11 21:29:06 +03:00
Kentoku SHIBA	857310c218	MDEV-16543 Replicating to spider is fragile without retries (#1259)	2019-04-12 22:58:37 +09:00
Marko Mäkelä	5c3ff5cb93	Merge 10.3 into 10.4	2019-04-02 11:04:54 +03:00
Sergei Golubchik	4e1d3f83b7	Merge branch '10.2' into 10.3	2019-03-29 19:41:41 +01:00
Sujatha Sivakumar	e42192d7b3	MDEV-13895: GTID and Master_Delay causes excessive initial delay Problem: ======== When attempting to delay a Slave attached with GTID, there appears to be an extra delay applied initially. For example, this output reflects a Slave that is already delayed by 43200 seconds. When switching to GTID replication, replication is paused until SQL_Remaining_Delay counts down to 0: CHANGE MASTER TO master_use_gtid=current_pos; CHANGE MASTER TO MASTER_DELAY=43200; Seconds_Behind_Master: 44847 Using_Gtid: Current_Pos SQL_Delay: 43200 SQL_Remaining_Delay: 43089 Slave_SQL_Running_State: Waiting until MASTER_DELAY seconds after master executed event Analysis: ========= When slave initiates a GTID based connection request to master, the master sends two GTID_LIST events. The first one is actual GTID_LIST event and the second one is a fake GTID_LIST event. This is sent by master to provide its current binary log file position. The fake GTID_LIST events will have their ev->when=0. 'when' (the timestamp) is set to 0 so that slave could distinguish between real and fake Rotate events. On slave side when MASTER_DELAY is configured to "X" the applier will ensure that there is a time delay of "X" seconds before the event is applied. General behaviour of MASTER_DELAY example:- Master timestamp of event e1=10 timestamp of event e2=11 On slave MASTER_DELAY=5 Event e1 will be applied at = 15 e2 will be applied at =16 In bug scenario:- On Master: With GTIDs timestamp of event e1=10 timestamp of event e2=0 On Slave: e1 will be applied at = 10 + 5 =15 For e2, since "e2->when=0" e2->when is set to current timestamp. i.e since the e2->when and current timestamp on slave is the same applier waits for additional master_delay=5 seconds. the ev->when contributes to "rli->last_master_timestamp". rli->last_master_timestamp= ev->when + (time_t) ev->exec_time; Fake events should not update the "ev->when" to "current timestamp" on slave. Fix: === Remove the assignment of current timestamp to "ev->when" when "ev->when=0".	2019-03-28 20:35:39 +05:30
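The delay arithmetic in the MDEV-13895 message above can be replayed with a small model. This is only a sketch under stated assumptions, not the server's code: `scheduled_apply_time` and its `stamp_fake_events` flag are hypothetical names, and treating a zero timestamp as "apply immediately" after the fix is an assumption about the intended effect, not something the commit message spells out.

```python
def scheduled_apply_time(event_when, master_delay, now, stamp_fake_events):
    """Toy model of when a delayed applier would run an event.

    event_when        -- master timestamp of the event (0 for a fake event)
    master_delay      -- MASTER_DELAY in seconds
    now               -- current time on the slave
    stamp_fake_events -- hypothetical flag: True models the pre-fix behaviour,
                         where a zero timestamp is overwritten with `now`
    """
    when = event_when
    if when == 0 and stamp_fake_events:
        # Pre-fix behaviour described in the commit: the fake event inherits
        # the slave's current time and is then delayed like a real event.
        when = now
    if when == 0:
        # Assumed post-fix behaviour: no master timestamp, so no extra delay.
        return now
    return max(now, when + master_delay)


# Numbers taken from the commit message: MASTER_DELAY=5, e1=10, e2=11.
print(scheduled_apply_time(10, 5, now=10, stamp_fake_events=True))
print(scheduled_apply_time(11, 5, now=11, stamp_fake_events=True))
# Fake event (when=0) arriving once e1 has been applied at t=15.
print(scheduled_apply_time(0, 5, now=15, stamp_fake_events=True))
print(scheduled_apply_time(0, 5, now=15, stamp_fake_events=False))
```

With the flag set, the fake event is scheduled a further MASTER_DELAY seconds after the current time, which corresponds to the extra initial wait reported in the bug; with the flag cleared it adds no wait in this model.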
Sergey Vojtovich	88d89ee0ba	Less abort_loop references Removed redundant initialisation in unireg_init(): already done by mysql_init_variables(). Slave threads already check THD::killed, which eliminates the need to check abort_loop. Removed unused wsrep_kill_mysql().	2019-03-09 20:22:24 +04:00
Sergey Vojtovich	9824ec81aa	Removed redundant service_thread_count In contrast to thread_count, which is decremented by THD destructor, this one was most probably intended to be decremented after all THD destructors are done. THD_count class was added to achieve similar effect with thread_count. Aim is to reduce usage of LOCK_thread_count and COND_thread_count. Part of MDEV-15135.	2019-01-28 17:39:08 +04:00
Sergey Vojtovich	3503fbbebf	Move THD list handling to THD_list Implemented and integrated THD_list as a replacement for the global thread list. It uses own mutex instead of LOCK_thread_count for THD list protection. Removed unused first_global_thread() and next_global_thread(). delayed_insert_threads is now protected by LOCK_delayed_insert. Although this patch doesn't fix very wrong synchronization of this variable. After this patch there are only 2 legitimate uses of LOCK_thread_count left, both in mysqld.cc: thread_count and ready_to_exit. Aim is to reduce usage of LOCK_thread_count and COND_thread_count. Part of MDEV-15135.	2019-01-28 17:39:07 +04:00
Brave Galera Crew	36a2a185fe	Galera4	2019-01-23 15:30:00 +04:00
Sergey Vojtovich	d2bdd78915	Master_info counters transition to Atomic_counter	2018-12-29 14:09:15 +04:00


2551 Commits