mariadb

mirror of https://github.com/MariaDB/server.git synced 2025-07-01 03:26:54 +03:00

Author	SHA1	Message	Date
Monty	dd99780967	MDEV-34504 PURGE BINARY LOGS not working anymore PURGE BINARY LOGS did not always purge binary logs. This commit fixes some of the issues and adds notifications if a binary log cannot be purged. User visible changes: - 'PURGE BINARY LOGS TO log_name' and 'PURGE BINARY LOGS BEFORE date' worked differently. 'TO' ignored 'slave_connections_needed_for_purge' while 'BEFORE' did not. Now both versions ignore the 'slave_connections_needed_for_purge' variable. - 'PURGE BINARY LOGS..' commands now return a 'note' if a binary log cannot be deleted, like Note 1375 Binary log 'master-bin.000004' is not purged because it is the current active binlog - Automatic binary log purges, based on date or size, will write a note to the error log if a binary log matching the size or date cannot yet be deleted. - If 'slave_connections_needed_for_purge' is set from a config or command line, it is set to 0 if Galera is enabled and 1 otherwise (old default). This ensures that automatic binary log purge works with Galera as before the addition of 'slave_connections_needed_for_purge'. If the variable is changed to 0, a warning will be printed to the error log. Code changes: - Added THD argument to several purge_logs related functions that needed THD. - Added 'interactive' options to purge_logs functions. This allowed me to remove testing of sql_command == SQLCOM_PURGE. - Changed purge_logs_before_date() to first check if log is applicable before calling can_purge_logs(). This ensures we do not get a notification for logs that do not match the remove criteria. - MYSQL_BIN_LOG::can_purge_log() will write notifications to the user or error log if a log cannot yet be removed. - log_in_use() will return reason why a binary log cannot be removed. Changes to keep code consistent: - Moved checking of binlog_format for Galera to be after Galera is initialized (The old check never worked). 
If Galera is enabled we now change the binlog_format to ROW, with a warning, instead of aborting the server. If this change happens a warning will be printed to the error log. - Print a warning if Galera or FLASHBACK changes the binlog_format to ROW. Before it was done silently. Reviewed by: Sergei Golubchik <serg@mariadb.com>, Kristian Nielsen <knielsen@knielsen-hq.org>	2024-07-10 18:50:08 +03:00
Alexander Barkov	5fb07d942b	Merge remote-tracking branch 'origin/11.2' into 11.4	2024-07-09 21:45:37 +04:00
Alexander Barkov	8aad19ddfc	Merge remote-tracking branch 'origin/11.1' into 11.2	2024-07-09 14:04:11 +04:00
Marko Mäkelä	27a3366663	Merge 10.6 into 10.11	2024-06-27 10:26:09 +03:00
Marko Mäkelä	0076eb3d4e	Merge 10.5 into 10.6	2024-06-24 13:09:47 +03:00
Dave Gosselin	db0c28eff8	MDEV-33746 Supply missing override markings Find and fix missing virtual override markings. Updates cmake maintainer flags to include -Wsuggest-override and -Winconsistent-missing-override.	2024-06-20 11:32:13 -04:00
Alexander Barkov	c4bf4ce948	Merge remote-tracking branch 'origin/11.2' into 11.4	2024-06-17 15:46:39 +04:00
Marko Mäkelä	a21e49cbcc	Merge 11.1 into 11.2	2024-06-17 12:02:03 +03:00
Marko Mäkelä	22ba7e4ff8	Merge 10.6 into 10.11	2024-05-30 16:04:00 +03:00
Marko Mäkelä	5ba542e9ee	Merge 10.5 into 10.6	2024-05-30 14:27:07 +03:00
Oleksandr Byelkin	99b370e023	Merge branch '11.2' into 11.4	2024-05-21 19:38:51 +02:00
Robin Newhouse	dc38d8ea80	Minimize unsafe C functions with safe_strcpy() Similar to #2480. `567b681` introduced safe_strcpy() to minimize the use of C with potentially unsafe memory overflow with strcpy() whose use is discouraged. Replace instances of strcpy() with safe_strcpy() where possible, limited here to files in the `sql/` directory. All new code of the whole pull request, including one or several files that are either new files or modified ones, are contributed under the BSD-new license. I am contributing on behalf of my employer Amazon Web Services, Inc.	2024-05-17 13:33:16 +01:00
Oleksandr Byelkin	cd28b2479c	Merge branch '11.1' into 11.2	2024-04-09 12:12:33 +02:00
Marko Mäkelä	788953463d	Merge 10.6 into 10.11 Some fixes related to commit `f838b2d799` and Rows_log_event::do_apply_event() and Update_rows_log_event::do_exec_row() for system-versioned tables were provided by Nikita Malyavin. This was required by test versioning.rpl,trx_id,row.	2024-03-28 09:16:57 +02:00
Marko Mäkelä	50715bd2ed	Merge 10.5 into 10.6	2024-03-18 17:07:32 +02:00
Marko Mäkelä	09d991d01c	MDEV-33478: Tests massively fail with clang-18 -fsanitize=memory Starting with clang-16, MemorySanitizer appears to check that uninitialized values not be passed by value nor returned. Previously, it was allowed to copy uninitialized data in such cases. get_foreign_key_info(): Remove a local variable that was passed uninitialized to a function. DsMrr_impl: Initialize key_buffer, because DsMrr_impl::dsmrr_init() is reading it. test_bind_result_ext1(): MYSQL_TYPE_LONG is 32 bits, hence we must use a 32-bit type, such as int. sizeof(long) differs between LP64 and LLP64 targets.	2024-03-18 16:01:29 +02:00
Oleksandr Byelkin	fa69b085b1	Merge branch '11.3' into 11.4	2024-02-15 13:53:21 +01:00
Monty	18dfcfdecf	MDEV-31404 Implement binlog_space_limit binlog_space_limit is a variable in Percona server used to limit the total size of all binary logs. This implementation is based on code from Percona server 5.7. In MariaDB we decided to call the variable max-binlog-total-size to be similar to max-binlog-size. This makes it easier to find in the output from 'mariadbd --help --verbose'. MariaDB will also support binlog_space_limit for compatibility with Percona. Some internal notes to explain the implementation: - When running, MariaDB does not delete binary logs that are either used by slaves or have active xid that are not yet committed. Some implementation notes: - max-binlog-total-size is by default 0 (no limit). - max-binlog-total-size can be changed without server restart. - Binlog file sizes are checked on startup, or if max-binlog-total-size is set to a value > 0, not for every log write. The total size of all binary logs is cached and dynamically updated when updating the binary log on binary log rotation. - max-binlog-total-size is checked against existing log files during server start, binlog rotation, FLUSH LOGS, when writing to binary log or when max-binlog-total-size changes value. - Option --slave-connections-needed-for-purge with 1 as default added. This allows one to ensure that we do not delete binary logs if fewer than 'slave-connections-needed-for-purge' slaves are connected. Without this option max-binlog-total-size would potentially delete binlogs needed by slaves on server startup or when a slave disconnects as there are then no connected slaves to protect active binlogs. - PURGE BINARY LOGS TO ... will be executed as if slave-connections-needed-for-purge would be zero. In other words it will do the purge even if there are no slaves connected. If there are connected slaves working on the logs, these will be protected. 
- If binary log is on and max-binlog-total-size <> 0 then the status variable 'Binlog_disk_use' shows the current size of all old binary logs + the state of the current one. - Removed test of strcmp(log_file_name, log_info.log_file_name) in purge_logs_before_date() as this is tested in can_purge_logs() - To avoid expensive calls of log_in_use() we cache the result for the last log that is in use by a slave. Future calls to can_purge_logs() for this binary log will be quickly detected and false will be returned until a slave starts working on a new log. - Note that after a binary log rotation caused by max_binlog_size, the last log will not be purged directly as it is still in use internally. The next binary log write will purge binlogs if needed. Reviewer:Kristian Nielsen <knielsen@knielsen-hq.org>	2024-02-14 15:02:21 +01:00
Sergei Golubchik	79580f4f96	Merge branch '11.1' into 11.2	2024-02-02 17:43:57 +01:00
Kristian Nielsen	d039346a7a	MDEV-4991: GTID binlog indexing Improve the performance of slave connect using B+-Tree indexes on each binlog file. The index allows fast lookup of a GTID position to the corresponding offset in the binlog file, as well as lookup of a position to find the corresponding GTID position. This eliminates a costly sequential scan of the starting binlog file to find the GTID starting position when a slave connects. This is especially costly if the binlog file is not cached in memory (IO cost), or if it is encrypted or a lot of slaves connect simultaneously (CPU cost). The size of the index files is generally less than 1% of the binlog data, so not expected to be an issue. Most of the work writing the index is done as a background task, in the binlog background thread. This minimises the performance impact on transaction commit. A simple global mutex is used to protect index reads and (background) index writes; this is fine as slave connect is a relatively infrequent operation. Here are the user-visible options and status variables. The feature is on by default and is expected to need no tuning or configuration for most users. binlog_gtid_index On by default. Can be used to disable the indexes for testing purposes. binlog_gtid_index_page_size (default 4096) Page size to use for the binlog GTID index. This is the size of the nodes in the B+-tree used internally in the index. A very small page-size (64 is the minimum) will be less efficient, but can be used to stress the BTree-code during testing. binlog_gtid_index_span_min (default 65536) Control sparseness of the binlog GTID index. If set to N, at most one index record will be added for every N bytes of binlog file written. 
This can be used to reduce the number of records in the index, at the cost only of having to scan a few more events in the binlog file before finding the target position. Two status variables are available to monitor the use of the GTID indexes: Binlog_gtid_index_hit Binlog_gtid_index_miss The "hit" status increments for each successful lookup in a GTID index. The "miss" increments when a lookup is not possible. This indicates that the index file is missing (e.g. binlog written by old server version without GTID index support), or corrupt. Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>	2024-01-27 12:09:54 +01:00
Marko Mäkelä	7ee16b1e29	Merge 11.3 into 11.4	2024-01-05 14:53:03 +02:00
Marko Mäkelä	f6d21a8855	Merge 11.1 into 11.2	2024-01-05 13:06:56 +02:00
Sergei Golubchik	c154aafe1a	Merge remote-tracking branch '11.3' into 11.4	2023-12-21 15:40:55 +01:00
Marko Mäkelä	2b8dc7668a	Merge 10.6 into 10.11	2023-12-21 13:19:17 +02:00
Marko Mäkelä	a81a138aab	Merge 10.5 into 10.6	2023-12-21 12:58:11 +02:00
Marko Mäkelä	a3dd7ea09f	Merge 10.4 into 10.5	2023-12-21 11:30:32 +02:00
Sergei Golubchik	fef31a26f3	Merge branch '11.1' into 11.2	2023-12-20 23:43:05 +01:00
Daniele Sciascia	0e1f4bd661	MDEV-31272 Statement rollback causes empty writeset replication This patch fixes cases where a transaction caused an empty writeset to be replicated. This could happen in the case where a transaction executes a statement that initially manages to modify some data and therefore appended some keys for certification. The statement is however rolled back at some later stage due to some error (for example, a duplicate key error). After statement rollback the transaction is still alive, but has no other changes. When committing such a transaction, an empty writeset was replicated through Galera. The fix is to call into the commit hook only when the transaction has appended one or more keys for certification and has some data in the binlog cache to replicate. Otherwise, the commit is considered empty, and goes through the usual empty commit path. Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>	2023-12-20 12:31:17 +01:00
Sergei Golubchik	fd0b47f9d6	Merge branch '10.6' into 10.11	2023-12-18 11:19:04 +01:00
Sergei Golubchik	e95bba9c58	Merge branch '10.5' into 10.6	2023-12-17 11:20:43 +01:00
Sergei Golubchik	98a39b0c91	Merge branch '10.4' into 10.5	2023-12-02 01:02:50 +01:00
Nikita Malyavin	a569515a9d	online alter: rework savepoints Use standard handlerton functions for savepoint add/rollback. To identify the savepoint, the pointer passed is used. Every table that has online alter in progress maintains a list of savepoints independently. Also this removes setting a value to a global variable savepoint_alloc_size without any protection, which was a race condition bug.	2023-11-02 22:58:03 +04:00
Nikita Malyavin	cb52174693	online alter: extract the source to a separate file Move all the functions dedicated to online alter to a newly created online_alter.cc. With that, make many functions static and simplify the static functions naming. Also, rename binlog_log_row_online_alter -> online_alter_log_row.	2023-11-02 22:58:03 +04:00
Nikita Malyavin	830bdfccbd	MDEV-32126 Assertion fails upon online ALTER and binary log enabled Assertion `!writer.checksum_len || writer.remains == 0' fails upon concurrent online ALTER and transactions with failing statements and binary log enabled. Also another assertion, `pos != (~(my_off_t) 0)', fails in my_seek, upon reinit_io_cache, on a simplified test. This means that IO_CACHE wasn't properly initialized, or had an error before. The overall problem is a deep interference with the effect of an installed binlog_hton: the assumption that thd->binlog_get_cache_mngr() being NULL is sufficient to tell when we shouldn't run the binlog part of binlog_commit/binlog_rollback is wrong: as it turns out, sometimes the binlog handlerton can be not installed in current thd, but binlog_commit can be called on behalf of binlog, as in the bug reported. One separate condition found is XA recovery of the orphaned transaction, when binlog_commit is also called, but it has nothing to do with online alter. Solution: Extract online alter operations into a separate handlerton.	2023-11-02 22:58:03 +04:00
Kristian Nielsen	6fa69ad747	MDEV-27436: binlog corruption (/tmp no space left on device at the same moment) This commit fixes several bugs in error handling around disk full when writing the statement/transaction binlog caches: 1. If the error occurs during a non-transactional statement, the code attempts to binlog the partially executed statement (as it cannot roll back). The stmt_cache->error was still set from the disk full error. This caused MYSQL_BIN_LOG::write_cache() to get an error while trying to read the cache to copy it to the binlog. This was then wrongly interpreted as a disk full error writing to the binlog file. As a result, a partial event group containing just a GTID event (no query or commit) was binlogged. Fixed by checking if an error is set in the statement cache, and if so binlog an INCIDENT event instead of a corrupt event group, as for other errors. 2. For LOAD DATA LOCAL INFILE, if a disk full error occurred while writing to the statement cache, the code would attempt to abort and read-and-discard any remaining data sent by the client. The discard code would however continue trying to write data to the statement cache, and wrongly interpret another disk full error as end-of-file from the client. This left the client connection with extra data which corrupts the communication for the next command, as well as again causing a corrupt/incomplete event to be binlogged. Fixed by restoring the default read function before reading any remaining data from the client connection. Reviewed-by: Andrei Elkin <andrei.elkin@mariadb.com> Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>	2023-10-31 11:48:00 +01:00
Kristian Nielsen	b8f9f796ff	MDEV-31273: Precompute binlog checksums Compute binlog checksums (when enabled) already when writing events into the statement or transaction caches, where before it was done when the caches are copied to the real binlog file. This moves the checksum computation outside of holding LOCK_log, improving scalability. At stmt/trx cache write time, the final end_log_pos values are not known, so with this patch these will be set to 0. Events that are written directly to the binlog file (not through stmt/trx cache) keep the correct end_log_pos value. The GTID and COMMIT/XID events at the start and end of event groups are written directly, so the zero end_log_pos is only for events in the middle of event groups, which do not negatively affect replication. An option --binlog-legacy-event-pos, off by default, is provided to disable this behavior to provide backwards compatibility with any external applications that might rely on end_log_pos in events in the middle of event groups. Checksums cannot be pre-computed when binlog encryption is enabled, as encryption relies on correct end_log_pos to provide part of the nonce/IV. Checksum pre-computation is also disabled for WSREP/Galera, as it uses events differently in its write-sets and so on. Extending pre-computation of checksums to Galera where it makes sense could be added in a future patch. The current --binlog-checksum configuration is saved in binlog_cache_data at transaction start and used to pre-compute checksums in cache, if applicable. When the cache is later copied to the binlog, a check is made if the saved value still matches the configured global value; if so, the events are block-copied directly into the binlog file. If --binlog-checksum was changed during the transaction, events are re-written to the binlog file one-by-one and the checksums recomputed/discarded as appropriate. 
Reviewed-by: Monty <monty@mariadb.org> Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>	2023-10-27 19:57:43 +02:00
Kristian Nielsen	8eee9806fb	MDEV-31273: Eliminate Log_event::checksum_alg This is a preparatory commit for pre-computing checksums outside of holding LOCK_log, no functional changes. Which checksum algorithm is used (if any) when writing an event does not belong in the event, it is a property of the log being written to. Instead decide the checksum algorithm when constructing the Log_event_writer object, and store it there. Introduce a client-only Log_event::read_checksum_alg to be able to print the checksum read, and a Format_description_log_event::source_checksum_alg which is the checksum algorithm (if any) to use when reading events from a log. Also eliminate some redundant `enum` keywords on the enum_binlog_checksum_alg type. Reviewed-by: Monty <monty@mariadb.org> Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>	2023-10-26 20:45:35 +02:00
Sergei Golubchik	872ed5342d	fix a sporadic failure of main.alter_table_online_debug on windows first seen in `daca468c68`	2023-09-30 11:13:08 +02:00
Alexey Botchkov	daca468c68	MDEV-32243 Make older compilers happy with log.h. Fix the Cache_flip_event_log constructor.	2023-09-25 14:42:10 +04:00
Nikita Malyavin	d5e59c983f	MDEV-31646 Online alter applies binlog cache limit to cache writes 1. Make online disk writes unlimited, same as filesort does. 2. Make proper error handling -- in 32-bit build IO_CACHE capacity limit is 4GB, so it is quite possible to overfill there. 3. Event_log::write_cache is complicated by event reparsing and, as was proven by QA, contains some mistakes. Rewrite it, introducing a simpler and much faster version, not featuring reparsing and therefore copying a whole buffer at once. This also disables checksums and crypto. 4. Handle read_log_event errors correctly: error returned is -1 (eof signal for alter table), and my_error is not called. Call my_error and always return 1. There's no test for this, since it shouldn't happen, see the next bullet. 5. An event could be written partially in case of error, if it's bigger than the IO_CACHE buffer. Restore the position where it was before the error was emitted. As a result, online alter is untied from several binlog variables, which was a second aim of this patch.	2023-08-15 13:59:07 +02:00
Nikita Malyavin	ecb9db4c3d	MDEV-30949 Direct leak in binlog_online_alter_end_trans when committing a big transaction, online_alter_cache_log creates a cache file. It wasn't properly closed, which was spotted by a memory leak from my_register_filename. A temporary file also remained open. Binlog wasn't affected by this, since it features its own file management. A proper closing is calling close_cached_file. It deinits io_cache and closes the underlying file. After closing, the file is expected to be deleted automagically.	2023-08-15 10:16:13 +02:00
Nikita Malyavin	8f6f219a68	control Cache_flip_event_log lifetime with reference count If online alter fails, TABLE_SHARE can be freed while concurrent transactions still have row events in their online_alter_cache_data. On commit they will try to flush them, writing to TABLE_SHARE's Cache_flip_event_log, which is already freed. This causes a crash in main.alter_table_online_debug test	2023-08-15 10:16:12 +02:00
Sergei Golubchik	64b55151f4	separate online_alter_cache_data from binlog_cache_data	2023-08-15 10:16:12 +02:00
Nikita Malyavin	5a867d847c	Online alter: savepoints	2023-08-15 10:16:11 +02:00
Sergei Golubchik	332f41aae3	don't copy stmt IO_CACHE to trx IO_CACHE at the stmt end instead use only one (trx) IO_CACHE and truncate it if the statement is rolled back. don't use binlog_cache_mngr to accumulate the data, use binlog_cache_data instead. (binlog_cache_data owns one IO_CACHE, binlog_cache_mngr owns two binlog_cache_data's, trx and stmt).	2023-08-15 10:16:11 +02:00
Sergei Golubchik	0b67af5a81	cleanup no functional changes here	2023-08-15 10:16:11 +02:00
Nikita Malyavin	ab4bfad206	MDEV-16329 [5/5] ALTER ONLINE TABLE * Log rows in online_alter_binlog. * Table online data is replicated within dedicated binlog file * Cached data is written on commit. * Versioning is fully supported. * Works both with and without binlog enabled. * For now savepoints setup is forbidden while ONLINE ALTER goes on. Extra support is required. We can simply log the SAVEPOINT query events and replicate them together with row events. But it's not implemented for now. * Cache flipping: We want to care for the possible bottleneck in the online alter binlog reading/writing in advance. IO_CACHE does not provide anything better than sequential access, besides, only a single write is mutex-protected, which is not suitable, since we should write a transaction atomically. To solve this, a special layer on top of Event_log is implemented. There are two IO_CACHE files underneath: one for reading, and one for writing. Once the read cache is empty, an exclusive lock is acquired (we can wait for a currently active transaction to finish writing), and flip() is emitted, i.e. the write cache is reopened for read, and the read cache is emptied, and reopened for writing. This resembles a buffer flip that happens in accelerated graphics (DirectX/OpenGL/etc). Cache_flip_event_log is considered non-blocking for a single reader and a single writer in this sense, with the only lock held by reader during flip. An alternative approach by implementing a fair concurrent circular buffer is described in MDEV-24676. * Cache managers: We have two cache sinks: statement and transactional. It is important that the changes are first cached per-statement and per-transaction. If a statement fails, then only statement data is rolled back. The transaction moves along, however. Turns out, there's no guarantee that TABLE will persist in thd->open_tables to the transaction commit moment. If an error occurs, tables from statement are purged. Therefore, we can't store the caches in TABLE. 
Ideally, it should be handlerton, but we cut the corner and store it in THD in a list.	2023-08-15 10:16:11 +02:00
Nikita Malyavin	d2d0995cf2	MDEV-16329 [4/5] Refactor MYSQL_BIN_LOG: extract Event_log ancestor Event_log is supposed to be a basic logging class that can write events in a single file. MYSQL_BIN_LOG in comparison will have: * rotation support * index files * purging * gtid and transactional information handling. * is dedicated for a general-purpose binlog	2023-08-15 10:16:11 +02:00
Nikita Malyavin	6427e343cf	MDEV-16329 [3/5] use binlog_cache_data directly in most places * Eliminate most usages of THD::use_trans_table. Only 3 left, and they are at quite high levels, and really essential. * Eliminate is_transactional argument when possible. Lots of places are left though, because of some WSREP error handling in MYSQL_BIN_LOG::set_write_error. * Remove junk binlog functions from THD * binlog_prepare_pending_rows_event is moved to log.cc inside MYSQL_BIN_LOG and is no longer a template. Instead it accepts an event factory with a type code, and a callback to a constructing function in it.	2023-08-15 10:16:11 +02:00
Nikita Malyavin	429f635f30	MDEV-16329 [2/5] refactor binlog and cache_mngr pump up binlog and cache manager to level of binlog_log_row_internal	2023-08-15 10:16:11 +02:00


464 Commits