mariadb

mirror of https://github.com/MariaDB/server.git synced 2025-08-08 11:22:35 +03:00

Author	SHA1	Message	Date
Oleksandr Byelkin	f1102da37a	Merge branch '11.8' into 12.0	2025-05-22 09:22:55 +02:00
Monty	7728b90a0d	Removed possible deadlock betwen LOCK_log and LOCK_global_system_variables The lock order of the mutex must be LOCK_log followed by LOCK_global_system_variables as InnoDB can lock LOCK_global_system_variables during a transaction commit when LOCK_log is hold. Fix is to temporarly unlock LOCK_global_system_variables when setting global binlog variables that needs to use LOCK_log.	2025-04-28 12:59:39 +03:00
Sergei Golubchik	237e24497b	Merge remote-tracking branch 'github/bb-11.4-release' into bb-11.8-serg	2025-04-27 19:40:00 +02:00
Oleksandr Byelkin	a8d4642375	Merge branch '10.11' into 11.4	2025-04-26 10:53:02 +02:00
Andrei Elkin	a0b77eb806	MDEV-36685 CREATE-SELECT may lose in binlog side-effects of stored-routine When the SELECT sub-statement executes a stored function that is defined to modify a non-transactional table, like delimiter \|; create function f_ia(arg int) returns integer begin insert into ti_pk set a=1; insert into ta set a=1; insert into ti_pk set a=arg; return 1; end \| delimiter ;\| any modified records that the function has succeeded on must be binlogged as a "side effect" of CREATE-SELECT. It is expected that a failing CREATE-SELECT like --error ER_DUP_ENTRY set statement binlog_format = ROW for create table t_y (a int) engine=aria select f_ia(1 /* err in Innodb after Aria stmt is done /) as a; leaves upon itself the following state: include/show_binlog_events.inc Log_name Pos Event_type Server_id End_log_pos Info master-bin.000001 # Gtid # # BEGIN GTID #-#-# master-bin.000001 # Table_map # # table_id: # (test. ta) master-bin.000001 # Write_rows_v1 # # table_id: # flags: STMT_END_F master-bin.000001 # Query # # COMMIT select from ta; a 1 select count() = 0 from ti_pk; true However it's not so for the binlog part. The reason is that prior to MDEV-34150 fixes the CREATE-SELECT's errored phase leaves the binlog caches intact (the file:pos from 10.11 `c06c36218a`) to defer their reset to the rollback phase of the top-level / the statement cache gets binlogged / where the side-effect changes gets binlogged. MDEV-34150 fixes harmed (+#4 line) the statement cache in particular in the error phase (file:pos are from `395db6f1d5` the current 11.8 ) / The caches incl the statement cache are gone / / 'cos of MDEV-34150 */ +#4 0x00005d75f9b6a92e in THD::binlog_remove_rows_events (this=0x52c000240288) at log.cc:579 Apparently it should not have been there, as proper emptying (either with reset for the transactional cache or flush and then reset for the statement cache) is (must be) always done via binlog_rollback of the top-level statement. To observe the above requirement the case is fixed with the removal of thd->binlog_remove_rows_events() and its definition. Tested with rpl.rpl_create_select_row. Reviewed-by Brandon Nesterenko.	2025-04-25 21:26:35 +03:00
Sergei Golubchik	9b824e62d4	Merge branch '11.8' into main	2025-04-18 17:11:01 +02:00
Andrei Elkin	1b4efbeb8c	MDEV-35207 ignored error at binlogging by CREATE-TABLE-SELECT leads to assert MDEV-35499 Errored-out CREATE-or-REPLACE-SELECT does not log DROP table into binlog MDEV-35502 Failed at ROW-format binlogging CREATE-TABLE-SELECT should not generate Incident event When a CREATE TABLE .. SELECT errors while inserting data, a user would expect that all changes are rolled back and the table would not exist after executing the query. However CREATE-TABLE-SELECT can face an error near the end of its execution select_create::send_eof() so that the error was never checked which led to various assert inside binlogging path that should not be attended at all. Specifically when binlog_commit() of ha_commit_one_phase() that CREATE-TABLE-SELECT employs errored out because of a limited cache size (binlog_commit may try writing to a transactional cache) the cache was not flushed to binlog. The missed error check allowed further execution down to trans_commit_implicit() in whose stack DBUG_ASSERT(!(entry->using_trx_cache && !mngr->trx_cache.empty() && mngr->get_binlog_cache_log(TRUE)->error)); fired. In a non-debug build that table remains created/populated inconsistently with binlog. The fixes need and install the error checking in select_create::send_eof(). That prevents from any further execution when ha_commit_one_phase() fails for any reason (typically due to binlog_commit()). This commit also covers CREATE-or-REPLACE-SELECT that additionally had a specific issue in that DROP TABLE was not logged the binary log, MDEV-35499. See changes select_create::abort_result_set(). The current commit also corrects an unnecessary Incident event logging when CREATE-TABLE-SELECT encounters a binloging issue, MDEV-35502. The Incident was actually only harmful in this case as the table was never going to be created, therefore replicated, in such a case. In "normal" cases when the SELECT phase errors due to binlogging, an internal incident flag gets reset inside select_create::abort_result_set(). A hunk in select_insert::prepare_eof() addresses a specific kind of this issue that deals with incorrect computation of the binlog cache type. Because of that in the OLD version execution was allowed to proceed along ha_commit_trans()..binlog_commit() while a Pending event was not flushed to the transactional cache. That might lead to the unnecessary binlogged Incident despite the select_create::abort_result_set() measures. However now with the corrected cache type any binlogging error to flush the Pending event is covered according to the normal case. non-transaction table, updates to the non-transactional table NOTE the commit contains few tests overlapping with unfixed yet MDEV-36027. Thanks to Brandon Nesterenko and Kristian Nielsen for thorough review, and Kristian additionally for ideas to simplify the patch and some code contribution.	2025-04-03 19:00:02 +03:00
Marko Mäkelä	bb1d88b6dc	Merge 11.4 into 11.8	2025-04-02 14:07:01 +03:00
Marko Mäkelä	3ae8f114e2	Merge 10.11 into 11.4	2025-04-02 10:15:08 +03:00
Julius Goryavsky	74f0b99edf	Merge branch '10.6' into '10.11'	2025-04-02 06:33:39 +02:00
Julius Goryavsky	03c31ab099	Merge branch '10.5' into '10.6'	2025-04-02 04:43:24 +02:00
Jan Lindström	25737dbab7	MDEV-33850 : For Galera, create sequence with low cache got signal 6 error: [ERROR] WSREP: FSM: no such a transition REPLICATING -> COMMITTED Problem was that transacton was BF-aborted after certification succeeded and transaction tried to rollback and during rollback binlog stmt cache containing sequence value reservations was written into binlog. Transaction must replay because certification succeeded but transaction must not be written into binlog yet, it will be done during commit after the replay. Fix is to skip binlog write if transaction must replay and in replay we need to reset binlog stmt cache. Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>	2025-04-02 04:29:40 +02:00
Daniele Sciascia	d698b784c8	MDEV-35658 Assertion `commit_trx' failed in test galera_as_master The test issues a simple INSERT statement, while sql_log_bin = 0. This option disables writes to binlog. However, since MDEV-7205, the option does not affect Galera, so changes are still replicated. So sql_log_bin=off, "partially" disabled the binlog and the INSERT will involve both binlog and innodb, thus requiring internal 2 phase commit (2PC). In 2PC INSERT is first prepared, which will make it transition to PREPARED state in innodb, and later committed which causes the new assertion from MDEV-24035 to fail. Running the same test with sql_log_bin enabled also results in 2PC, but the execution has one more step for ordered commit, between prepare and commit. Ordered commit causes the transaction state to transition back to TRX_STATE_NOT_STARTED. Thus avoiding the assertion. This patch makes sure that when sql_log_bin=off, the ordered commit step is not skipped, thus going through the expected state transitions in the storage engine. Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>	2025-04-02 04:29:40 +02:00
Vasilii Lakhin	717c12de0e	Fix typos in C comments inside sql/	2025-03-14 12:08:56 +04:00
Sergei Golubchik	ba01c2aaf0	Merge branch '11.4' into 11.7 * rpl.rpl_system_versioning_partitions updated for MDEV-32188 * innodb.row_size_error_log_warnings_3 changed error for MDEV-33658 (checks are done in a different order)	2025-02-06 16:46:36 +01:00
Sergei Golubchik	7d657fda64	Merge branch '10.11 into 11.4	2025-01-30 12:01:11 +01:00
Sergei Golubchik	e69f8cae1a	Merge branch '10.6' into 10.11	2025-01-30 11:55:13 +01:00
Sergei Golubchik	066e8d6aea	Merge branch '10.5' into 10.6	2025-01-29 11:17:38 +01:00
Oleksandr Byelkin	47f87c5f88	MDEV-20281 "[ERROR] Failed to write to mysql.slow_log:" without error reason Add "backup" (in case of absence issued by error) reasons for failed logging.	2025-01-25 20:37:51 +01:00
Kristian Nielsen	72e1cc8f52	MDEV-35806: Error in read_log_event() corrupts relay log writer, crashes server In Log_event::read_log_event(), don't use IO_CACHE::error of the relay log's IO_CACHE to signal an error back to the caller. When reading the active relay log, this flag is also being used by the IO thread, and setting it can randomly cause the IO thread to wrongly detect IO error on writing and permanently disable the relay log. This was seen sporadically in test case rpl.rpl_from_mysql80. The read error set by the SQL thread in the IO_CACHE would be interpreted as a write error by the IO thread, which would cause it to throw a fatal error and close the relay log. And this would later cause CHANGE MASTER to try to purge a closed relay log, resulting in nullptr crash. SQL thread is not able to parse an event read from the relay log. This can happen like here when replicating unknown events from a MySQL master, potentially also for other reasons. Also fix a mistake in my_b_flush_io_cache() introduced back in 2001 (`fa09f2cd7e`) where my_b_flush_io_cache() could wrongly return an error set in IO_CACHE::error, even if the flush operation itself succeeded. Also fix another sporadic failure in rpl.rpl_from_mysql80 where the outout of MASTER_POS_WAIT() depended on timing of SQL and IO thread. Reviewed-by: Monty <monty@mariadb.org> Reviewed-by: Andrei Elkin <andrei.elkin@mariadb.com> Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>	2025-01-24 09:15:20 +00:00
Sergei Golubchik	f1a7693bc0	Merge branch '10.11' into 11.4	2025-01-14 23:45:41 +01:00
ParadoxV5	8b9c8631a4	MDEV-35818: Fix `replace_binlog_file` info message `ATTRIBUITE_FORMAT` from #3360 uncovers issues on `my_snprintf` uses. This commit fixes the one in `Binlog_commit_by_rotate::` `replace_binlog_file()` about “required size too big”. All I found is that it’s not present in 11.4 (after I prepared previous batches for all maintained branches), for GitHub blame can’t process a file with over 10K lines.	2025-01-13 13:17:55 +11:00
Sergei Golubchik	221aa5e08f	Merge branch '10.6' into 10.11	2025-01-10 13:14:42 +01:00
Marko Mäkelä	15700f54c2	Merge 11.4 into 11.7	2025-01-09 09:41:38 +02:00
Marko Mäkelä	17f01186f5	Merge 10.11 into 11.4	2025-01-09 07:58:08 +02:00
Kristian Nielsen	39f93b6eab	MDEV-29744: Fix incorrect locking order of LOCK_log/LOCK_commit_ordered and LOCK_global_system_variables The LOCK_global_system_variables must not be held when taking mutexes such as LOCK_commit_ordered and LOCK_log, as this causes inconsistent mutex locking order that can theoretically cause the server to deadlock. To avoid this, temporarily release LOCK_global_system_variables in two system variable update functions, like it is done in many other places. Enforce the correct locking order at server startup, to more easily catch (in debug builds) any remaining wrong orders that may be hidden elsewhere in the code. Note that when this is merged to 11.4, similar unlock/lock of LOCK_global_system_variables must be added in update_binlog_space_limit() as is done in binlog_checksum_update() and fix_max_binlog_size(), as this is a new function added in 11.4 that also needs the same fix. Tests will fail with wrong mutex order until this is done. Reviewed-by: Sergei Golubchik <serg@mariadb.org> Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>	2025-01-08 17:52:34 +01:00
Marko Mäkelä	a54d151fc1	Merge 10.6 into 10.11	2024-12-19 15:38:53 +02:00
Julius Goryavsky	155203c352	Merge branch '10.5' into '10.6'	2024-12-13 01:45:35 +01:00
Alexander Barkov	ab9182470d	MDEV-31366 Assertion `thd->start_time' failed in bool LOGGER::slow_log_print(THD, const char, size_t, ulonglong) Fixing a wrong DBUG_ASSERT. thd->start_time and thd->start_time_sec_part cannot be 0 at the same time. But thd->start_time can be 0 when thd->start_time_sec_part is not 0, e.g. after: SET timestamp=0.99;	2024-12-12 20:32:56 +01:00
Marko Mäkelä	33907f9ec6	Merge 11.4 into 11.7	2024-12-02 17:51:17 +02:00
Marko Mäkelä	2719cc4925	Merge 10.11 into 11.4	2024-12-02 11:35:34 +02:00
Marko Mäkelä	3d23adb766	Merge 10.6 into 10.11	2024-11-29 13:43:17 +02:00
ParadoxV5	d5f16d6305	Extract some of #3360 fixes to 10.6.x That PR uncovered countless issues on `my_snprintf` uses. This commit backports a squashed subset of their fixes (excludes #3485).	2024-11-18 13:29:04 +11:00
Oleksandr Byelkin	b12ff287ec	Merge branch '11.6' into 11.7	2024-11-10 19:22:21 +01:00
Oleksandr Byelkin	9e1fb104a3	Merge tag '11.4' into 11.6 MariaDB 11.4.4 release	2024-11-08 07:17:00 +01:00
Sergei Golubchik	7feec30939	relax the XA recovery error it's just a suggestion anyway, not a bullet-proof check, let's not act as if it is	2024-11-05 14:00:52 -08:00
Sergei Golubchik	aed5928207	cleanup: extract transaction-related part of handlerton into a separate transaction_participant structure handlerton inherits it, so handlerton itself doesn't change. but entities that only need to participate in a transaction, like binlog or online alter log, use a transaction_participant and no longer need to pretend to be a full-blown but invisible storage engine which doesn't support create table.	2024-11-05 14:00:50 -08:00
Sergei Golubchik	4f4c5a2ba9	fix a typo and an old bug in prefschema.transaction test	2024-11-05 14:00:47 -08:00
Brandon Nesterenko	b07258a0d5	MDEV-35109: Semi-sync Replication stalling Primary using wait point=AFTER_SYNC For a primary configured with wait_point=AFTER_SYNC, if two threads T1 (binlogging through MYSQL_BIN_LOG::write()) and T2 were binlogging at the same time, T1 could accidentally wait for its semi-sync ACK using the binlog coordinates of T2. Prior to MDEV-33551, this only resulted in delayed transactions, because all transactions shared the same condition variable for ACK signaling. However, with the MDEV-33551 changes, each thread has its own condition variable to signal. So T1 could wait indefinitely when either: 1) T1's ACK is received but not T2's when T1 goes into wait_after_sync(), because the ACK receiver thread has already notified about the T1 ACK, but T1 was _actually_ waiting on T2's ACK, and therefore tries to wait (in vain). 2) T1 goes to wait_after_sync() before any ACKs have arrived. When T1's ACK comes in, T1 is woken up; however, sees it needs to wait more (because it was actually waiting on T2's ACK), and goes to wait again (this time, in vain). Note that the actual cause of T1 waiting on T2's binlog coordinates is when MYSQL_BIN_LOG::write() would call Repl_semisync_master::wait_after_sync(), the binlog offset parameter was read as the end of MYSQL_BIN_LOG::log_file, which is shared among transactions. So if T2 had updated the binary log _after_ T1 had released LOCK_log, but not yet invoked wait_after_sync(), it would use the end of the binary log file as the binlog offset, which was that of T2 (or any future transaction). The fix in this patch ensures consistency between the binary log coordinates a transaction uses between report_binlog_update() and wait_after_sync(). Reviewed By ============ Kristian Nielsen <knielsen@knielsen-hq.org> Andrei Elkin <andrei.elkin@mariadb.com>	2024-11-04 10:45:58 -07:00
Oleksandr Byelkin	c770bce898	Merge branch '11.2' into 11.4	2024-10-30 15:11:17 +01:00
Oleksandr Byelkin	69d033d165	Merge branch '10.11' into 11.2	2024-10-29 16:42:46 +01:00
Oleksandr Byelkin	3d0fb15028	Merge branch '10.6' into 10.11	2024-10-29 15:24:38 +01:00
Oleksandr Byelkin	f00711bba2	Merge branch '10.5' into 10.6	2024-10-29 14:20:03 +01:00
Libing Song	72cc58bb71	MDEV-32014 Rename binlog cache temporary file to binlog file for large transaction Description =========== When a transaction commits, it copies the binlog events from binlog cache to binlog file. Very large transactions (eg. gigabytes) can stall other transactions for a long time because the data is copied while holding LOCK_log, which blocks other commits from binlogging. The solution in this patch is to rename the binlog cache file to a binlog file instead of copy, if the commiting transaction has large binlog cache. Rename is a very fast operation, it doesn't block other transactions a long time. Design ====== * binlog_large_commit_threshold type: ulonglong scope: global dynamic: yes default: 128MB Only the binlog cache temporary files large than 128MB are renamed to binlog file. * #binlog_cache_files directory To support rename, all binlog cache temporary files are managed as normal files now. `#binlog_cache_files` directory is in the same directory with binlog files. It is created at server startup if it doesn't exist. Otherwise, all files in the directory is deleted at startup. The temporary files are named with ML_ prefix and the memorary address of the binlog_cache_data object which guarantees it is unique. * Reserve space To supprot rename feature, It must reserve enough space at the begin of the binlog cache file. The space is required for Format description, Gtid list, checkpoint and Gtid events when renaming it to a binlog file. Since binlog_cache_data's cache_log is directly accessed by binlog log, online alter and wsrep. It is not easy to update all the code. Thus binlog cache will not reserve space if it is not session binlog cache or wsrep session is enabled. - m_file_reserved_bytes Stores the bytes reserved at the begin of the cache file. It is initialized in write_prepare() and cleared by reset(). The reserved file header is hide to callers. Thus there is no change for callers. E.g. - get_byte_position() still get the length of binlog data written to the cache, but not the file length. - truncate(0) will truncate the file to m_file_reserved_bytes but not 0. - write_prepare() write_prepare() is called everytime when anything is being written into the cache. It will call init_file_reserved_bytes() to create the cache file (if it doesn't exist) and reserve suitable space if the data written exceeds buffer's size. * Binlog_commit_by_rotate It is used to encapsulate the code for remaing a binlog cache tempoary file to binlog file. - should_commit_by_rotate() it is called by write_transaction_to_binlog_events() to check if a binlog cache should be rename to a binlog file. - commit() That is the entry to rename a binlog cache and commit the transaction. Both rename and commit are protected by LOCK_log, Thus not other transactions can write anything into the renamed binlog before it. Rename happens in a rotation. After the new binlog file is generated, replace_binlog_file() is called to: - copy data from the new binlog file to its binlog cache file. - write gtid event. - rename the binlog cache file to binlog file. After that the rotation will continue to succeed. Then the transaction is committed in a seperated group itself. Its cache file will be detached and cache log will be reset before calling trx_group_commit_with_engines(). Thus only Xid event be written.	2024-10-17 07:53:59 -06:00
Monty	bddbef3573	MDEV-34533 asan error about stack overflow when writing record in Aria The problem was that when using clang + asan, we do not get a correct value for the thread stack as some local variables are not allocated at the normal stack. It looks like that for example clang 18.1.3, when compiling with -O2 -fsanitize=addressan it puts local variables and things allocated by alloca() in other areas than on the stack. The following code shows the issue Thread 6 "mariadbd" hit Breakpoint 3, do_handle_one_connection (connect=0x5080000027b8, put_in_cache=<optimized out>) at sql/sql_connect.cc:1399 THD thd; 1399 thd->thread_stack= (char) &thd; (gdb) p &thd (THD *) 0x7fffedee7060 (gdb) p $sp (void ) 0x7fffef4e7bc0 The address of thd is 24M away from the stack pointer (gdb) info reg ... rsp 0x7fffef4e7bc0 0x7fffef4e7bc0 ... r13 0x7fffedee7060 140737185214560 r13 is pointing to the address of the thd. Probably some kind of "local stack" used by the sanitizer I have verified this with gdb on a recursive call that calls alloca() in a loop. In this case all objects was stored in a local heap, not on the stack. To solve this issue in a portable way, I have added two functions: my_get_stack_pointer() returns the address of the current stack pointer. The code is using asm instructions for intel 32/64 bit, powerpc, arm 32/64 bit and sparc 32/64 bit. Supported compilers are gcc, clang and MSVC. For MSVC 64 bit we are using _AddressOfReturnAddress() As a fallback for other compilers/arch we use the address of a local variable. my_get_stack_bounds() that will return the address of the base stack and stack size using pthread_attr_getstack() or NtCurrentTed() with fallback to using the address of a local variable and user provided stack size. Server changes are: - Moving setting of thread_stack to THD::store_globals() using my_get_stack_bounds(). - Removing setting of thd->thread_stack, except in functions that allocates a lot on the stack before calling store_globals(). When using estimates for stack start, we reduce stack_size with MY_STACK_SAFE_MARGIN (8192) to take into account the stack used before calling store_globals(). I also added a unittest, stack_allocation-t, to verify the new code. Reviewed-by: Sergei Golubchik <serg@mariadb.org>	2024-10-16 17:24:46 +03:00
Marko Mäkelä	f493e46494	Merge 11.6 into 11.7	2024-10-03 18:15:13 +03:00
Yuchen Pei	ba7088d462	Merge '11.4' into 11.6	2024-10-03 15:59:20 +10:00
ParadoxV5	d8d80bd503	Fix a couple of `my_snprintf` arg mismatches This commit backports two fixes from #3360’s vast discovery.	2024-10-01 09:53:13 +01:00
Kristian Nielsen	35c732cdde	MDEV-25611: RESET MASTER causes the server to hang RESET MASTER waits for storage engines to reply to a binlog checkpoint requests. If this response is delayed for a long time for some reason, then RESET MASTER can hang. Fix this by forcing a log sync in all engines just before waiting for the checkpoint reply. (Waiting for old checkpoint responses is needed to preserve durability of any commits that were synced to disk in the to-be-deleted binlog but not yet synced in the engine.) Reviewed-by: Andrei Elkin <andrei.elkin@mariadb.com> Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>	2024-09-27 15:10:06 +02:00
Brandon Nesterenko	9811d23b6d	MDEV-33756: Deprecate binlog_optimize_thread_scheduling The option binlog_optimize_thread_scheduling was initially added to provide a safe alternative for the newly added binlog group commit logic, such that when 0, it would disable a leader thread from performing the binlog write for all transactions that are a part of the group commit. Any problems related to the binlog group commit optimization should be sorted out by now, so we can deprecate-to-eventually-remove the option altogether. This commit performs the deprecation, and the removal is tracked by MDEV-33745. Note, as the option is only able to be provided via configuration at startup time, users will not see a deprecation message unless looking through the CLI help message. Reviewed By ============ Kristian Nielsen <knielsen@knielsen-hq.org> Sergei Golubchik <serg@mariadb.org>	2024-08-28 06:02:25 -06:00

1 2 3 4 5 ...

2974 Commits