mariadb

mirror of https://github.com/MariaDB/server.git synced 2025-08-08 11:22:35 +03:00

Author	SHA1	Message	Date
Nikita Malyavin	d5e59c983f	MDEV-31646 Online alter applies binlog cache limit to cache writes 1. Make online disk writes unlimited, same as filesort does. 2. Make proper error handling -- in 32-bit build IO_CACHE capacity limit is 4GB, so it is quite possible to overfill there. 3. Event_log::write_cache is complicated by event reparsing and, as was proven by QA, contains some mistakes. The rewrite introduces a simpler and much faster version that does not reparse and therefore copies a whole buffer at once. This also disables checksums and crypto. 4. Handle read_log_event errors correctly: error returned is -1 (eof signal for alter table), and my_error is not called. Call my_error and always return 1. There's no test for this, since it shouldn't happen, see the next bullet. 5. An event could be written partially in case of error, if it's bigger than the IO_CACHE buffer. Restore the position where it was before the error was emitted. As a result, online alter is untied from several binlog variables, which was a second aim of this patch.	2023-08-15 13:59:07 +02:00
Nikita Malyavin	ecb9db4c3d	MDEV-30949 Direct leak in binlog_online_alter_end_trans when committing a big transaction, online_alter_cache_log creates a cache file. It wasn't properly closed, which was spotted by a memory leak from my_register_filename. A temporary file also remained open. Binlog wasn't affected by this, since it features its own file management. A proper closing is calling close_cached_file. It deinits io_cache and closes the underlying file. After closing, the file is expected to be deleted automagically.	2023-08-15 10:16:13 +02:00
Nikita Malyavin	8f6f219a68	control Cache_flip_event_log lifetime with reference count If online alter fails, TABLE_SHARE can be freed while concurrent transactions still have row events in their online_alter_cache_data. On commit they'll try to flush them, writing to TABLE_SHARE's Cache_flip_event_log, which is already freed. This causes a crash in main.alter_table_online_debug test	2023-08-15 10:16:12 +02:00
Sergei Golubchik	64b55151f4	separate online_alter_cache_data from binlog_cache_data	2023-08-15 10:16:12 +02:00
Nikita Malyavin	5a867d847c	Online alter: savepoints	2023-08-15 10:16:11 +02:00
Sergei Golubchik	332f41aae3	don't copy stmt IO_CACHE to trx IO_CACHE at the stmt end instead use only one (trx) IO_CACHE and truncate it if the statement is rolled back. don't use binlog_cache_mngr to accumulate the data, use binlog_cache_data instead. (binlog_cache_data owns one IO_CACHE, binlog_cache_mngr owns two binlog_cache_data's, trx and stmt).	2023-08-15 10:16:11 +02:00
Sergei Golubchik	0b67af5a81	cleanup no functional changes here	2023-08-15 10:16:11 +02:00
Nikita Malyavin	ab4bfad206	MDEV-16329 [5/5] ALTER ONLINE TABLE * Log rows in online_alter_binlog. * Table online data is replicated within dedicated binlog file * Cached data is written on commit. * Versioning is fully supported. * Works both with and without binlog enabled. * For now savepoints setup is forbidden while ONLINE ALTER goes on. Extra support is required. We can simply log the SAVEPOINT query events and replicate them together with row events. But it's not implemented for now. * Cache flipping: We want to care for the possible bottleneck in the online alter binlog reading/writing in advance. IO_CACHE does not provide anything better than sequential access, besides, only a single write is mutex-protected, which is not suitable, since we should write a transaction atomically. To solve this, a special layer on top of Event_log is implemented. There are two IO_CACHE files underneath: one for reading, and one for writing. Once the read cache is empty, an exclusive lock is acquired (we can wait for a currently active transaction to finish writing), and flip() is emitted, i.e. the write cache is reopened for read, and the read cache is emptied, and reopened for writing. This resembles a buffer flip that happens in accelerated graphics (DirectX/OpenGL/etc). Cache_flip_event_log is considered non-blocking for a single reader and a single writer in this sense, with the only lock held by reader during flip. An alternative approach by implementing a fair concurrent circular buffer is described in MDEV-24676. * Cache managers: We have two cache sinks: statement and transactional. It is important that the changes are first cached per-statement and per-transaction. If a statement fails, then only statement data is rolled back. The transaction moves along, however. Turns out, there's no guarantee that TABLE will persist in thd->open_tables to the transaction commit moment. If an error occurs, tables from statement are purged. Therefore, we can't store the caches in TABLE.
Ideally, it should be handlerton, but we cut the corner and store it in THD in a list.	2023-08-15 10:16:11 +02:00
Nikita Malyavin	d2d0995cf2	MDEV-16329 [4/5] Refactor MYSQL_BIN_LOG: extract Event_log ancestor Event_log is supposed to be a basic logging class that can write events in a single file. MYSQL_BIN_LOG in comparison will have: * rotation support * index files * purging * gtid and transactional information handling. * is dedicated for a general-purpose binlog	2023-08-15 10:16:11 +02:00
Nikita Malyavin	6427e343cf	MDEV-16329 [3/5] use binlog_cache_data directly in most places * Eliminate most usages of THD::use_trans_table. Only 3 left, and they are at quite high levels, and really essential. * Eliminate is_transactional argument when possible. Lots of places are left though, because of some WSREP error handling in MYSQL_BIN_LOG::set_write_error. * Remove junk binlog functions from THD * binlog_prepare_pending_rows_event is moved to log.cc inside MYSQL_BIN_LOG and is no longer a template. Instead it accepts an event factory with a type code, and a callback to a constructing function in it.	2023-08-15 10:16:11 +02:00
Nikita Malyavin	429f635f30	MDEV-16329 [2/5] refactor binlog and cache_mngr pump up binlog and cache manager to level of binlog_log_row_internal	2023-08-15 10:16:11 +02:00
Marko Mäkelä	dbab3e8d90	Merge 10.6 into 10.8	2023-02-10 13:43:53 +02:00
Marko Mäkelä	6aec87544c	Merge 10.5 into 10.6	2023-02-10 13:03:01 +02:00
Marko Mäkelä	c41c79650a	Merge 10.4 into 10.5	2023-02-10 12:02:11 +02:00
Vicențiu Ciorbaru	08c852026d	Apply clang-tidy to remove empty constructors / destructors This patch is the result of running run-clang-tidy -fix -header-filter=.* -checks='-*,modernize-use-equals-default' . Code style changes have been done on top. The result of this change leads to the following improvements: 1. Binary size reduction. For a -DBUILD_CONFIG=mysql_release build, the binary size is reduced by ~400kb. * A raw -DCMAKE_BUILD_TYPE=Release reduces the binary size by ~1.4kb. 2. Compiler can better understand the intent of the code, thus it leads to more optimization possibilities. Additionally it enabled detecting unused variables that had an empty default constructor but were not marked so explicitly. A particular change was required following this patch in sql/opt_range.cc: result_keys, an unused variable of template class Bitmap, now correctly issues unused variable warnings. Setting Bitmap template class constructor to default allows the compiler to identify that there are no side-effects when instantiating the class. Previously the compiler could not issue the warning as it assumed Bitmap class (being a template) would not be performing a NO-OP for its default constructor. This prevented the "unused variable warning".	2023-02-09 16:09:08 +02:00
Oleksandr Byelkin	2f70784c2a	Merge branch '10.7' into 10.8	2022-10-04 11:42:37 +02:00
Marko Mäkelä	829e8111c7	Merge 10.5 into 10.6	2022-09-26 14:34:43 +03:00
Marko Mäkelä	6286a05d80	Merge 10.4 into 10.5	2022-09-26 13:34:38 +03:00
Marko Mäkelä	a69cf6f07e	MDEV-29613 Improve WITH_DBUG_TRACE=OFF In commit `28325b0863` a compile-time option was introduced to disable the macros DBUG_ENTER and DBUG_RETURN or DBUG_VOID_RETURN. The parameter name WITH_DBUG_TRACE would hint that it also covers DBUG_PRINT statements. Let us do that: WITH_DBUG_TRACE=OFF shall disable DBUG_PRINT() as well. A few InnoDB recovery tests used to check that some output from DBUG_PRINT("ib_log", ...) is present. We can live without those checks. Reviewed by: Vladislav Vaintroub	2022-09-23 13:40:42 +03:00
Jan Lindström	dee24f3155	Merge 10.7 into 10.8	2022-09-05 15:59:56 +03:00
Jan Lindström	9fefd440b5	Merge 10.5 into 10.6	2022-09-05 14:05:30 +03:00
Jan Lindström	ba987a46c9	Merge 10.4 into 10.5	2022-09-05 13:28:56 +03:00
Daniele Sciascia	2917bd0d2c	Reduce compilation dependencies on wsrep_mysqld.h Making changes to wsrep_mysqld.h causes large parts of server code to be recompiled. The reason is that wsrep_mysqld.h is included by sql_class.h, even though very little of wsrep_mysqld.h is needed in sql_class.h. This commit introduces a new header file, wsrep_on.h, which is meant to be included from sql_class.h, and contains only macros and variable declarations used to determine whether wsrep is enabled. Also, header wsrep.h should only contain definitions that are also used outside of sql/. Therefore, move WSREP_TO_ISOLATION* and WSREP_SYNC_WAIT macros to wsrep_mysqld.h. Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>	2022-08-31 11:05:23 +03:00
Marko Mäkelä	f79cebb4d0	Merge 10.7 into 10.8	2022-07-28 10:33:26 +03:00
Andrei	8d238d4726	MDEV-28609 refine gtid-strict-mode to ignore same server-id gtid from the past ... on semisync slave To provide semisync master crash-recovery, same server-id transactions were made acceptable for execution on the semisync slave under the strict gtid mode (see MDEV-27760). That however caused an out-of-order error on a master's transaction server of the circular setup. The error was fair in the sense of the gtid strict mode rule, as indeed under the condition of the circular setup the replicated transaction already exists in the local binlog. This is fixed by the commit to ignore, on the gtid strict mode semisync slave, those gtids that exist in the slave's binlog, which effectively restores the default same-server-id ignore policy. At the same time the fix complies with MDEV-21117 semisync slave recovery to accept the same server-id transactions that do not exist in local binlog.	2022-07-26 16:01:14 +03:00
Marko Mäkelä	57d4a242da	Merge 10.7 into 10.8	2022-06-06 16:22:09 +03:00
Marko Mäkelä	2f8d0af883	Merge 10.5 into 10.6	2022-06-02 17:39:13 +03:00
Marko Mäkelä	4b3c3e526e	Merge 10.4 into 10.5	2022-06-02 16:51:13 +03:00
mkaruza	ebbd5ef6e2	MDEV-27862 Galera should replicate nextval()-related changes in sequences with INCREMENT <> 0, at least NOCACHE ones with engine=InnoDB Sequence storage engine is not transactional so cache will be written in stmt_cache that is not replicated in cluster. To fix this replicate what is available in both trans_cache and stmt_cache. Sequences will only work when NOCACHE keyword is used when sequence is created. If WSREP is enabled and we don't have this keyword report error indicating that sequence will not work correctly in cluster. When binlog is enabled statement cache will be cleared in transaction before COMMIT so cache generated from sequence will not be replicated. We need to keep cache until replication. Tests are re-recorded because of replication changes that were introduced with this PR. Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>	2022-05-30 12:43:52 +03:00
Marko Mäkelä	133c2129cd	Merge 10.7 into 10.8	2022-04-27 10:43:00 +03:00
Marko Mäkelä	fae0ccad6e	Merge 10.5 into 10.6	2022-04-21 17:46:40 +03:00
Marko Mäkelä	620c55e708	Merge 10.4 into 10.5	2022-04-21 15:33:50 +03:00
Marko Mäkelä	394784095e	Merge 10.3 into 10.4	2022-04-21 11:33:59 +03:00
Sergei Golubchik	bbdec04d59	MDEV-24317 Data race in LOGGER::init_error_log at sql/log.cc:1443 and in LOGGER::error_log_print at sql/log.cc:1181 don't initialize error_log_handler_list in set_handlers() * error_log_handler_list is initialized to LOG_FILE early, in init_base() * set_handlers always reinitializes it to LOG_FILE, so it's pointless * after init_base() concurrent threads start using sql_log_warning, so following set_handlers() shouldn't modify error_log_handler_list without some protection	2022-04-12 13:07:20 +02:00
Sachin	0c5d1342ae	MDEV-11675 Lag Free Alter On Slave This commit implements two phase binloggable ALTER. When a new @@session.binlog_alter_two_phase = YES, an ALTER query gets logged in two parts, the START ALTER and the COMMIT or ROLLBACK ALTER. START ALTER is written in binlog as soon as necessary locks have been acquired for the table. The timing is such that any concurrent DML:s that update the same table are either committed, thus logged into binary log having done work on the old version of the table, or will be queued for execution on its new version. The "COMPLETE" COMMIT or ROLLBACK ALTER are written at the very point of a normal "single-piece" ALTER, that is after most of the query work is done. When its result is positive COMMIT ALTER is written, otherwise ROLLBACK ALTER is written with the specific error that happened after the START ALTER phase. Replication of two-phase binloggable ALTER is cross-version safe. Specifically the OLD slave merely does not recognize the start alter part, still being able to process and memorize its gtid. Two phase logged ALTER is read from binlog by mysqlbinlog to produce BINLOG 'string', where 'string' contains base64 encoded Query_log_event containing either the start part of ALTER, or a completion part. The Query details can be displayed with `-v` flag, similarly to ROW format events. Notice, mysqlbinlog output containing parts of two-phase binloggable ALTER is processable correctly only by binlog_alter_two_phase server. @@log_warnings > 2 can reveal details of binlogging and slave side processing of the ALTER parts. The current commit also carries fixes to the following list of reported bugs: MDEV-27511, MDEV-27471, MDEV-27349, MDEV-27628, MDEV-27528.
Thanks to all people involved in early discussion of the feature including Kristian Nielsen, those who helped to design, implement and test: Sergei Golubchik, Andrei Elkin who took the burden of the implementation completion, Sujatha Sivakumar, Brandon Nesterenko, Alice Sherepa, Ramesh Sivaraman, Jan Lindstrom.	2022-01-27 21:25:07 +02:00
Marko Mäkelä	73f5cbd0b6	Merge 10.5 into 10.6	2021-10-21 16:06:34 +03:00
Marko Mäkelä	5f8561a6bc	Merge 10.4 into 10.5	2021-10-21 15:26:25 +03:00
Marko Mäkelä	489ef007be	Merge 10.3 into 10.4	2021-10-21 14:57:00 +03:00
Marko Mäkelä	e4a7c15dd6	Merge 10.2 into 10.3	2021-10-21 13:41:04 +03:00
Brandon Nesterenko	2291f8ef73	MDEV-25284: Assertion `info->type == READ_CACHE || info->type == WRITE_CACHE' failed Problem: ======== This patch addresses two issues. First, if a CHANGE MASTER command is issued and an error happens while locating the replica’s relay logs, the logs can be put into an invalid state where future updates fail and future CHANGE MASTER calls crash the server. More specifically, right before a replica purges the relay logs (part of the `CHANGE MASTER TO` logic), the relay log is temporarily closed with state LOG_TO_BE_OPENED. If the server errors in-between the temporary log closure and purge, i.e. during the function find_log_pos, the log should be closed. MDEV-25284 reveals the log is not properly closed. Second, upon issuing a RESET SLAVE ALL command, a slave’s GTID filters are not cleared (DO_DOMAIN_IDS, IGNORE_DOMAIN_IDS, IGNORE_SERVER_IDS). MySQL had a similar bug report, Bug #18816897, which fixed this issue to clear IGNORE_SERVER_IDS after issuing RESET SLAVE ALL in version 5.7. Solution: ========= To fix the first problem, the CHANGE MASTER error handling logic was extended to transition the relay log state to LOG_CLOSED from LOG_TO_BE_OPENED. To fix the second problem, the RESET SLAVE ALL logic is extended to clear the domain_id filter and ignore_server_ids. Reviewed By: ============ Andrei Elkin <andrei.elkin@mariadb.com>	2021-10-18 10:43:51 -06:00
Marko Mäkelä	f3fcf5f45c	Merge 10.5 to 10.6	2021-08-19 12:25:00 +03:00
Marko Mäkelä	4a25957274	Merge 10.4 into 10.5	2021-08-18 18:22:35 +03:00
Brandon Nesterenko	46c3e7e353	MDEV-20215: binlog.show_concurrent_rotate failed in buildbot with wrong result Problem: ======= There are two issues that are addressed in this patch: 1) SHOW BINARY LOGS uses caching to store the binary logs that exist in the log directory; however, if new events are written to the logs, the caching strategy is unaware. This is okay for users, as it is okay for SHOW to return slightly old data. The test, however, can result in inconsistent data. It runs two connections concurrently, where one shows the logs, and the other adds a new file. The output of SHOW BINARY LOGS then depends on when the cache is built, with respect to the time that the second connection rotates the logs. 2) There is a race condition between RESET MASTER and SHOW BINARY LOGS. More specifically, where they both need the binary log lock to begin, SHOW BINARY LOGS only needs the lock to build its cache. If RESET MASTER is issued after SHOW BINARY LOGS has built its cache and before it has returned the results, the presented data may be incorrect. Solution: ======== 1) As it is okay for users to see stale data, to make the test consistent, use DEBUG_SYNC to force the race condition (problem 2) to make SHOW BINARY LOGS build a cache before RESET MASTER is called. Then, use additional logic from the next part of the solution to rebuild the cache. 2) Use an Atomic_counter to keep track of the number of times RESET MASTER has been called. If the value of the counter changes after building the cache, the cache should be rebuilt and the analysis should be restarted. Reviewed By: ============ Andrei Elkin: <andrei.elkin@mariadb.com>	2021-08-13 10:53:19 -06:00
Andrei Elkin	79a2dbc879	MDEV-21117 post-push fixes 1. work around MDEV-25912 to not apply assert at wsrep running time; 2. handle wsrep mode of the server recovery 3. convert hton calls to static binlog_commit ones. 4. satisfy MSAN complaint on uninitialized std::pair	2021-06-15 19:18:11 +03:00
Sujatha	6c39eaeb12	MDEV-21117: refine the server binlog-based recovery for semisync Problem: ======= When the semisync master is crashed and restarted as slave it could recover transactions that former slaves may never have seen. A known method to clear out all prepared transactions, --tc-heuristic-recover=rollback, does not care to adjust binlog accordingly. Fix: === The binlog-based recovery is made aware of the slave semisync role of the post-crash restarted server. No change in behavior is made to the "normal" binlogging server and the semisync master. When the restarted server is configured with --rpl-semi-sync-slave-enabled=1 the refined recovery attempts to roll back prepared transactions and truncate binlog accordingly. In case of a partially committed (that is committed at least in one of the engine participants) transaction, such a transaction gets committed. It's guaranteed no (partially as well) committed transactions exist beyond the truncate position. In case there exists a non-transactional replication event (being in a way a committed transaction) past the computed truncate position the recovery ends with an error. After master crash and failover to slave, the demoted-to-slave ex-master must be ready to face and accept its own (generated by) events, without the generally necessary --replicate-same-server-id. So the acceptance conditions are relaxed for the semisync slave to accept own events without that option. While gtid_strict_mode ON ensures no duplicate transaction can be (re-)executed, the master_use_gtid=none slave has to be configured with --replicate-same-server-id. NOTE for reviewers. This patch does not handle the user XA which is done in next git commit.	2021-06-11 19:49:39 +03:00
Marko Mäkelä	d7a5824899	Merge 10.4 into 10.5	2020-11-13 21:54:21 +02:00
Sujatha	b2029c0300	Merge branch '10.3' into 10.4	2020-11-12 15:39:02 +05:30
Sujatha	bafb011a82	Merge branch '10.2' into 10.3	2020-11-12 14:10:05 +05:30
Sujatha	984a06db2c	MDEV-4633: multi_source.simple test fails sporadically Analysis: ======== Writes to 'rli->log_space_total' need to be synchronized, otherwise both SQL_THREAD and IO_THREAD can try to modify the variable simultaneously resulting in incorrect rli->log_space_total. In the current test scenario SQL_THREAD is trying to decrement 'rli->log_space_total' in 'purge_first_log' and IO_THREAD is trying to increment the 'rli->log_space_total' in 'queue_event' simultaneously. Hence test occasionally fails with result mismatch. Fix: === Convert 'rli->log_space_total' variable to atomic type.	2020-11-12 13:04:39 +05:30
Monty	2682458128	Use larger buffer when reading binary and relay logs - Should speed up replication	2020-07-23 10:54:32 +03:00
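
The "cache flipping" scheme described in commit ab4bfad206 (two buffers, one being read and one being written, swapped under an exclusive lock once the read side is drained) can be sketched roughly as below. This is an illustrative sketch only, not the server's Cache_flip_event_log: the class name FlipLog, its methods, and the use of in-memory std::vector buffers in place of IO_CACHE files are all assumptions made for the example, and writers here simply share the same mutex as the flip.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical illustration of a two-buffer "flip" log.
// Not MariaDB code; in-memory vectors stand in for the two IO_CACHE files.
class FlipLog {
public:
  // Writer side: append a whole transaction under one lock, so its events
  // are never interleaved with events of another transaction.
  void write_transaction(const std::vector<std::string> &events) {
    std::lock_guard<std::mutex> guard(flip_lock_);
    write_buf_.insert(write_buf_.end(), events.begin(), events.end());
  }

  // Reader side (a single reader is assumed): return the next event.
  // When the read buffer is drained, flip: the write buffer becomes the
  // read buffer, and the emptied read buffer becomes the write buffer.
  std::optional<std::string> read_next() {
    assert(read_pos_ <= read_buf_.size());
    if (read_pos_ == read_buf_.size()) {
      std::lock_guard<std::mutex> guard(flip_lock_);
      read_buf_.clear();
      std::swap(read_buf_, write_buf_);
      read_pos_ = 0;
    }
    if (read_pos_ == read_buf_.size())
      return std::nullopt;  // nothing has been written since the last flip
    return read_buf_[read_pos_++];
  }

private:
  std::mutex flip_lock_;                // guards write_buf_ and the flip
  std::vector<std::string> write_buf_;  // currently being appended to
  std::vector<std::string> read_buf_;   // touched only by the single reader
  std::size_t read_pos_ = 0;            // next unread index in read_buf_
};
```

In this sketch the reader takes the lock only at flip time and reads the drained-into buffer without locking, which mirrors the commit's remark that the only lock held by the reader is during the flip.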

475 Commits