mariadb

mirror of https://github.com/MariaDB/server.git synced 2025-08-08 11:22:35 +03:00

Author	SHA1	Message	Date
Kristian Nielsen	7a67f72979	Binlog-in-engine: Also binlog non-innodb event groups Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>	2025-07-23 16:19:50 +02:00
Kristian Nielsen	84da20e658	MDEV-34705: Binlog-in-engine: Protect against concurrent RESET MASTER and dump threads This is actually an existing problem in the old binlog implementation, and this patch is applicable to old binlog also. The problem is that RESET MASTER can run concurrently with binlog dump threads / connected slaves. This will remove the binlog from under the feet of the reader, which can cause all sorts of strange behaviour. This patch fixes the problem by disallowing to run RESET MASTER when dump threads (or other RESET MASTER or SHOW BINARY LOGS) are running. An error is thrown in this case, user must stop slaves and/or kill dump threads to make the RESET MASTER go through. A slave that connects in the middle of RESET MASTER will wait for it to complete. Fix a lot of test cases to kill any lingering dump threads before doing RESET MASTER, mostly just by sourcing include/kill_binlog_dump_threads.inc. Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>	2025-07-23 16:19:50 +02:00
Kristian Nielsen	d26851a575	MDEV-34705: Binlog-in-engine: Crash-safe slave This patch makes replication crash-safe with the new binlog implementation, even when --innodb-flush-log-at-trx-commit=0\|2. The point is to not send any binlog events to the slave until they have become durable on master, thus avoiding that a slave may replicate a transaction that is lost during master recovery, diverging the slave from the master. Keep track of which point in the binlog has been durably synced to disk (meaning the corresponding LSN has been durably synced to disk in the InnoDB redo log). Each write to the binlog inserts an entry with offset and corresponding LSN in a FIFO. Dump threads will first read only up to the durable point in the binlog. A dump thread will then check the LSN fifo, and do an InnoDB redo log sync if anything is pending. Then the FIFO is emptied of any LSNs that have now become durable, and the durable point in the binlog is updated and reading the binlog can continue. Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>	2025-07-23 16:19:50 +02:00
Kristian Nielsen	d496e5278d	MDEV-34705: Binlog-in-engine: Integration with server-layer code Mostly various fixes to avoid initializing or creating any data or files for the legacy binlog. A possible later refinement could be to sub-class the binlog class differently for legacy and in-engine binlogs, writing separate virtual functions for behaviour that differ, extracting common functionality into sub-methods. This could remove some if (opt_binlog_engine_hton) conditionals. Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>	2025-04-10 19:16:55 +02:00
Kristian Nielsen	0671add213	MDEV-34705: Binlog-in-engine: Implement PURGE BINARY LOGS Still ToDo: is to restrict auto-purge so that it does not purge any binlog file with out-of-band data that might still be needed by a connected slave. Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>	2025-04-06 10:01:50 +02:00
Kristian Nielsen	c67b014c9c	MDEV-34705: Binlog-in-engine: Implement RESET MASTER Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>	2025-04-06 10:01:50 +02:00
Kristian Nielsen	6889c8e4cf	MDEV-34705: Binlog-in-engine: Implement FLUSH BINARY LOGS No DELETE_DOMAIN_ID supported yet, will come in a later commit, after PURGE is implemented. Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>	2025-04-06 10:01:50 +02:00
Kristian Nielsen	586ed18fe9	MDEV-34705: Code to restore binlog GTID state at restart To restore the binlog state, after finding the position in the old binlog to continue from, read the full gtid state saved at the start of the binlog file as well as the most recent differentioal gtid state written shortly before the starting position. Then construct a binlog reader to read the remaining few events (if any), and update with any GTIDs read to obtain the final restored GTID binlog state. Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>	2025-04-06 10:00:17 +02:00
Kristian Nielsen	18b9ec637e	MDEV-34705: Binlog in Engine: Searchability for GTID position Every N bytes (hardcoded at 64k for now, to become a configurable setting), write the binlog GTID state into the binlog tablespace. This allows to quickly find a given GTID position by binary search to the prior GTID state in the tablespace and then a small linear scan from that point. The full binlog state is dumped at the start of the binlog file; remaining states dumped are differential states containing only the changed (domain_id, server_id) pairs, to save space if binlog space is large. This commit only implements the writing of the binlog state to the tablespace at regular intervals. The binary search to be implemented in a subsequent commit. Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>	2025-04-06 10:00:17 +02:00
Kristian Nielsen	094c772213	MDEV-34705: Binlog in Engine: Also binlog standalone (eg. DDL) in the engine Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>	2025-04-06 10:00:17 +02:00
Kristian Nielsen	1db620338d	MDEV-34705: Binlog in Engine: Early draft, first binlogging of DML to InnoDB tablespace The option --innodb-in-engine now causes InnoDB DML commits to include binlogging in the same mtr. Binlog group commit now skips binlogging to old file-based binlog and passes events to InnoDB instead. Many things unfinished still, like allocating new tablespaces when the first one is filled, writing large event groups out-of-band to not bloat the InnoDB commit record in the redo log and exceed max mtr size, writing DDL and all other events to the InnoDB binlog, skipping the creation of the old-style binlog, reading the new style binlog from InnoDB, etc. etc. Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>	2025-04-06 10:00:16 +02:00
Monty	dd99780967	MDEV-34504 PURGE BINARY LOGS not working anymore PURGE BINARY LOGS did not always purge binary logs. This commit fixes some of the issues and adds notifications if a binary log cannot be purged. User visible changes: - 'PURGE BINARY LOG TO log_name' and 'PURGE BINARY LOGS BEFORE date' worked differently. 'TO' ignored 'slave_connections_needed_for_purge' while 'BEFORE' did not. Now both versions ignores the 'slave_connections_needed_for_purge variable'. - 'PURGE BINARY LOG..' commands now returns 'note' if a binary log cannot be deleted like Note 1375 Binary log 'master-bin.000004' is not purged because it is the current active binlog - Automatic binary log purges, based on date or size, will write a note to the error log if a binary log matching the size or date cannot yet be deleted. - If 'slave_connections_needed_for_purge' is set from a config or command line, it is set to 0 if Galera is enabled and 1 otherwise (old default). This ensures that automatic binary log purge works with Galera as before the addition of 'slave_connections_needed_for_purge'. If the variable is changed to 0, a warning will be printed to the error log. Code changes: - Added THD argument to several purge_logs related functions that needed THD. - Added 'interactive' options to purge_logs functions. This allowed me to remove testing of sql_command == SQLCOM_PURGE. - Changed purge_logs_before_date() to first check if log is applicable before calling can_purge_logs(). This ensures we do not get a notification for logs that does not match the remove criteria. - MYSQL_BIN_LOG::can_purge_log() will write notifications to the user or error log if a log cannot yet be removed. - log_in_use() will return reason why a binary log cannot be removed. Changes to keep code consistent: - Moved checking of binlog_format for Galera to be after Galera is initialized (The old check never worked). If Galera is enabled we now change the binlog_format to ROW, with a warning, instead of aborting the server. If this change happens a warning will be printed to the error log. - Print a warning if Galera or FLASHBACK changes the binlog_format to ROW. Before it was done silently. Reviewed by: Sergei Golubchik <serg@mariadb.com>, Kristian Nielsen <knielsen@knielsen-hq.org>	2024-07-10 18:50:08 +03:00
Alexander Barkov	5fb07d942b	Merge remote-tracking branch 'origin/11.2' into 11.4	2024-07-09 21:45:37 +04:00
Alexander Barkov	8aad19ddfc	Merge remote-tracking branch 'origin/11.1' into 11.2	2024-07-09 14:04:11 +04:00
Marko Mäkelä	27a3366663	Merge 10.6 into 10.11	2024-06-27 10:26:09 +03:00
Marko Mäkelä	0076eb3d4e	Merge 10.5 into 10.6	2024-06-24 13:09:47 +03:00
Dave Gosselin	db0c28eff8	MDEV-33746 Supply missing override markings Find and fix missing virtual override markings. Updates cmake maintainer flags to include -Wsuggest-override and -Winconsistent-missing-override.	2024-06-20 11:32:13 -04:00
Alexander Barkov	c4bf4ce948	Merge remote-tracking branch 'origin/11.2' into 11.4	2024-06-17 15:46:39 +04:00
Marko Mäkelä	a21e49cbcc	Merge 11.1 into 11.2	2024-06-17 12:02:03 +03:00
Marko Mäkelä	22ba7e4ff8	Merge 10.6 into 10.11	2024-05-30 16:04:00 +03:00
Marko Mäkelä	5ba542e9ee	Merge 10.5 into 10.6	2024-05-30 14:27:07 +03:00
Oleksandr Byelkin	99b370e023	Merge branch '11.2' into 11.4	2024-05-21 19:38:51 +02:00
Robin Newhouse	dc38d8ea80	Minimize unsafe C functions with safe_strcpy() Similar to #2480. `567b681` introduced safe_strcpy() to minimize the use of C with potentially unsafe memory overflow with strcpy() whose use is discouraged. Replace instances of strcpy() with safe_strcpy() where possible, limited here to files in the `sql/` directory. All new code of the whole pull request, including one or several files that are either new files or modified ones, are contributed under the BSD-new license. I am contributing on behalf of my employer Amazon Web Services, Inc.	2024-05-17 13:33:16 +01:00
Oleksandr Byelkin	cd28b2479c	Merge branch '11.1' into 11.2	2024-04-09 12:12:33 +02:00
Marko Mäkelä	788953463d	Merge 10.6 into 10.11 Some fixes related to commit `f838b2d799` and Rows_log_event::do_apply_event() and Update_rows_log_event::do_exec_row() for system-versioned tables were provided by Nikita Malyavin. This was required by test versioning.rpl,trx_id,row.	2024-03-28 09:16:57 +02:00
Marko Mäkelä	50715bd2ed	Merge 10.5 into 10.6	2024-03-18 17:07:32 +02:00
Marko Mäkelä	09d991d01c	MDEV-33478: Tests massively fail with clang-18 -fsanitize=memory Starting with clang-16, MemorySanitizer appears to check that uninitialized values not be passed by value nor returned. Previously, it was allowed to copy uninitialized data in such cases. get_foreign_key_info(): Remove a local variable that was passed uninitialized to a function. DsMrr_impl: Initialize key_buffer, because DsMrr_impl::dsmrr_init() is reading it. test_bind_result_ext1(): MYSQL_TYPE_LONG is 32 bits, hence we must use a 32-bit type, such as int. sizeof(long) differs between LP64 and LLP64 targets.	2024-03-18 16:01:29 +02:00
Oleksandr Byelkin	fa69b085b1	Merge branch '11.3' into 11.4	2024-02-15 13:53:21 +01:00
Monty	18dfcfdecf	MDEV-31404 Implement binlog_space_limit binlog_space_limit is a variable in Percona server used to limit the total size of all binary logs. This implementation is based on code from Percona server 5.7. In MariaDB we decided to call the variable max-binlog-total-size to be similar to max-binlog-size. This makes it easier to find in the output from 'mariadbd --help --verbose'). MariaDB will also support binlog_space_limit for compatibility with Percona. Some internal notes to explain implementation notes: - When running MariaDB does not delete binary logs that are either used by slaves or have active xid that are not yet committed. Some implementation notes: - max-binlog-total-size is by default 0 (no limit). - max-binlog-total-size can be changed without server restart. - Binlog file sizes are checked on startup, or if max-binlog-total-size is set to a value > 0, not for every log write. The total size of all binary logs is cached and dynamically updated when updating the binary log on binary log rotation. - max-binlog-total-size is checked against existing log files during serverstart, binlog rotation, FLUSH LOGS, when writing to binary log or when max-binlog-total-size changes value. - Option --slave-connections-needed-for-purge with 1 as default added. This allows one to ensure that we do not delete binary logs if there is less than 'slave-connections-needed-for-purge' connected. Without this option max-binlog-total-size would potentially delete binlogs needed by slaves on server startup or when a slave disconnects as there are then no connected slaves to protect active binlogs. - PURGE BINARY LOGS TO ... will be executed as if slave-connectitons-needed-for-purge would be zero. In other words it will do the purge even if there is no slaves connected. If there are connected slaves working on the logs, these will be protected. - If binary log is on and max-binlog-total_size <> 0 then the status variable 'Binlog_disk_use' shows the current size of all old binary logs + the state of the current one. - Removed test of strcmp(log_file_name, log_info.log_file_name) in purge_logs_before_date() as this is tested in can_purge_logs() - To avoid expensive calls of log_in_use() we cache the result for the last log that is in use by a slave. Future calls to can_purge_logs() for this binary log will be quickly detected and false will be returned until a slave starts working on a new log. - Note that after a binary log rotation caused by max_binlog_size, the last log will not be purged directly as it is still in use internally. The next binary log write will purge binlogs if needed. Reviewer:Kristian Nielsen <knielsen@knielsen-hq.org>	2024-02-14 15:02:21 +01:00
Sergei Golubchik	79580f4f96	Merge branch '11.1' into 11.2	2024-02-02 17:43:57 +01:00
Kristian Nielsen	d039346a7a	MDEV-4991: GTID binlog indexing Improve the performance of slave connect using B+-Tree indexes on each binlog file. The index allows fast lookup of a GTID position to the corresponding offset in the binlog file, as well as lookup of a position to find the corresponding GTID position. This eliminates a costly sequential scan of the starting binlog file to find the GTID starting position when a slave connects. This is especially costly if the binlog file is not cached in memory (IO cost), or if it is encrypted or a lot of slaves connect simultaneously (CPU cost). The size of the index files is generally less than 1% of the binlog data, so not expected to be an issue. Most of the work writing the index is done as a background task, in the binlog background thread. This minimises the performance impact on transaction commit. A simple global mutex is used to protect index reads and (background) index writes; this is fine as slave connect is a relatively infrequent operation. Here are the user-visible options and status variables. The feature is on by default and is expected to need no tuning or configuration for most users. binlog_gtid_index On by default. Can be used to disable the indexes for testing purposes. binlog_gtid_index_page_size (default 4096) Page size to use for the binlog GTID index. This is the size of the nodes in the B+-tree used internally in the index. A very small page-size (64 is the minimum) will be less efficient, but can be used to stress the BTree-code during testing. binlog_gtid_index_span_min (default 65536) Control sparseness of the binlog GTID index. If set to N, at most one index record will be added for every N bytes of binlog file written. This can be used to reduce the number of records in the index, at the cost only of having to scan a few more events in the binlog file before finding the target position Two status variables are available to monitor the use of the GTID indexes: Binlog_gtid_index_hit Binlog_gtid_index_miss The "hit" status increments for each successful lookup in a GTID index. The "miss" increments when a lookup is not possible. This indicates that the index file is missing (eg. binlog written by old server version without GTID index support), or corrupt. Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>	2024-01-27 12:09:54 +01:00
Marko Mäkelä	7ee16b1e29	Merge 11.3 into 11.4	2024-01-05 14:53:03 +02:00
Marko Mäkelä	f6d21a8855	Merge 11.1 into 11.2	2024-01-05 13:06:56 +02:00
Sergei Golubchik	c154aafe1a	Merge remote-tracking branch '11.3' into 11.4	2023-12-21 15:40:55 +01:00
Marko Mäkelä	2b8dc7668a	Merge 10.6 into 10.11	2023-12-21 13:19:17 +02:00
Marko Mäkelä	a81a138aab	Merge 10.5 into 10.6	2023-12-21 12:58:11 +02:00
Marko Mäkelä	a3dd7ea09f	Merge 10.4 into 10.5	2023-12-21 11:30:32 +02:00
Sergei Golubchik	fef31a26f3	Merge branch '11.1' into 11.2	2023-12-20 23:43:05 +01:00
Daniele Sciascia	0e1f4bd661	MDEV-31272 Statement rollback causes empty writeset replication This patch fixes cases where a transaction caused empty writeset to be replicated. This could happen in the case where a transaction executes a statement that initially manages to modify some data and therefore appended keys some for certification. The statement is however rolled back at some later stage due to some error (for example, a duplicate key error). After statement rollback the transaction is still alive, has no other changes. When committing such transaction, an empty writeset was replicated through Galera. The fix is to avoid calling into commit hook only when transaction has appended one or keys for certification and has some data in binlog cache to replicate. Otherwise, the commit is considered empty, and goes through usual empty commit path. Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>	2023-12-20 12:31:17 +01:00
Sergei Golubchik	fd0b47f9d6	Merge branch '10.6' into 10.11	2023-12-18 11:19:04 +01:00
Sergei Golubchik	e95bba9c58	Merge branch '10.5' into 10.6	2023-12-17 11:20:43 +01:00
Sergei Golubchik	98a39b0c91	Merge branch '10.4' into 10.5	2023-12-02 01:02:50 +01:00
Nikita Malyavin	a569515a9d	online alter: rework savepoints Use standard handlerton functions for savepoint add/rollback. To identify the savepoint, the pointer passed is used. Every table that has online alter in progress maintains a list of savepoints independently. Also this removes setting a value to a global variable savepoint_alloc_size without any protection, which was a race condition bug.	2023-11-02 22:58:03 +04:00
Nikita Malyavin	cb52174693	online alter: extract the source to a separate file Move all the functions dedicated to online alter to a newly created online_alter.cc. With that, make many functions static and simplify the static functions naming. Also, rename binlog_log_row_online_alter -> online_alter_log_row.	2023-11-02 22:58:03 +04:00
Nikita Malyavin	830bdfccbd	MDEV-32126 Assertion fails upon online ALTER and binary log enabled Assertion `!writer.checksum_len \|\| writer.remains == 0' fails upon concurrent online ALTER and transactions with failing statements and binary log enabled. Also another assertion, `pos != (~(my_off_t) 0)', fails in my_seek, upon reinit_io_cache, on a simplified test. This means that IO_CACHE wasn't properly initialized, or had an error before. The overall problem is a deep interference with the effect of an installed binlog_hton: the assumption about that thd->binlog_get_cache_mngr() is, sufficiently, NULL, when we shouldn't run the binlog part of binlog_commit/binlog_rollback, is wrong: as turns out, sometimes the binlog handlerton can be not installed in current thd, but binlog_commit can be called on behalf of binlog, as in the bug reported. One separate condition found is XA recovery of the orphaned transaction, when binlog_commit is also called, but it has nothing to do with online alter. Solution: Extract online alter operations into a separate handlerton.	2023-11-02 22:58:03 +04:00
Kristian Nielsen	6fa69ad747	MDEV-27436: binlog corruption (/tmp no space left on device at the same moment) This commit fixes several bugs in error handling around disk full when writing the statement/transaction binlog caches: 1. If the error occurs during a non-transactional statement, the code attempts to binlog the partially executed statement (as it cannot roll back). The stmt_cache->error was still set from the disk full error. This caused MYSQL_BIN_LOG::write_cache() to get an error while trying to read the cache to copy it to the binlog. This was then wrongly interpreted as a disk full error writing to the binlog file. As a result, a partial event group containing just a GTID event (no query or commit) was binlogged. Fixed by checking if an error is set in the statement cache, and if so binlog an INCIDENT event instead of a corrupt event group, as for other errors. 2. For LOAD DATA LOCAL INFILE, if a disk full error occured while writing to the statement cache, the code would attempt to abort and read-and-discard any remaining data sent by the client. The discard code would however continue trying to write data to the statement cache, and wrongly interpret another disk full error as end-of-file from the client. This left the client connection with extra data which corrupts the communication for the next command, as well as again causing an corrupt/incomplete event to be binlogged. Fixed by restoring the default read function before reading any remaining data from the client connection. Reviewed-by: Andrei Elkin <andrei.elkin@mariadb.com> Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>	2023-10-31 11:48:00 +01:00
Kristian Nielsen	b8f9f796ff	MDEV-31273: Precompute binlog checksums Compute binlog checksums (when enabled) already when writing events into the statement or transaction caches, where before it was done when the caches are copied to the real binlog file. This moves the checksum computation outside of holding LOCK_log, improving scalabitily. At stmt/trx cache write time, the final end_log_pos values are not known, so with this patch these will be set to 0. Events that are written directly to the binlog file (not through stmt/trx cache) keep the correct end_log_pos value. The GTID and COMMIT/XID events at the start and end of event groups are written directly, so the zero end_log_pos is only for events in the middle of event groups, which do not negatively affect replication. An option --binlog-legacy-event-pos, off by default, is provided to disable this behavior to provide backwards compatibility with any external applications that might rely on end_log_pos in events in the middle of event groups. Checksums cannot be pre-computed when binlog encryption is enabled, as encryption relies on correct end_log_pos to provide part of the nonce/IV. Checksum pre-computation is also disabled for WSREP/Galera, as it uses events differently in its write-sets and so on. Extending pre-computation of checksums to Galera where it makes sense could be added in a future patch. The current --binlog-checksum configuration is saved in binlog_cache_data at transaction start and used to pre-compute checksums in cache, if applicable. When the cache is later copied to the binlog, a check is made if the saved value still matches the configured global value; if so, the events are block-copied directly into the binlog file. If --binlog-checksum was changed during the transaction, events are re-written to the binlog file one-by-one and the checksums recomputed/discarded as appropriate. Reviewed-by: Monty <monty@mariadb.org> Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>	2023-10-27 19:57:43 +02:00
Kristian Nielsen	8eee9806fb	MDEV-31273: Eliminate Log_event::checksum_alg This is a preparatory commit for pre-computing checksums outside of holding LOCK_log, no functional changes. Which checksum algorithm is used (if any) when writing an event does not belong in the event, it is a property of the log being written to. Instead decide the checksum algorithm when constructing the Log_event_writer object, and store it there. Introduce a client-only Log_event::read_checksum_alg to be able to print the checksum read, and a Format_description_log_event::source_checksum_alg which is the checksum algorithm (if any) to use when reading events from a log. Also eliminate some redundant `enum` keywords on the enum_binlog_checksum_alg type. Reviewed-by: Monty <monty@mariadb.org> Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>	2023-10-26 20:45:35 +02:00
Sergei Golubchik	872ed5342d	fix a sporadic failure of main.alter_table_online_debug on windows first seen in `daca468c68`	2023-09-30 11:13:08 +02:00
Alexey Botchkov	daca468c68	MDEV-32243 Make older compilers happy with log.h. Fix the Cache_flip_event_log constructor.	2023-09-25 14:42:10 +04:00

1 2 3 4 5 ...

475 Commits