mariadb

mirror of https://github.com/MariaDB/server.git synced 2025-11-09 11:41:36 +03:00

Author	SHA1	Message	Date
Teemu Ollakka	f307160218	MDEV-29293 MariaDB stuck on starting commit state This commit contains a merge from 10.5-MDEV-29293-squash into 10.6. Although the bug MDEV-29293 was not reproducible with 10.6, the fix contains several improvements for wsrep KILL query and BF abort handling, and addresses the following issues: * MDEV-30307 KILL command issued inside a transaction is problematic for galera replication: This commit will remove KILL TOI replication, so Galera side transaction context is not lost during KILL. * MDEV-21075 KILL QUERY maintains nodes data consistency but breaks GTID sequence: This is fixed as well as KILL does not use TOI, and thus does not change GTID state. * MDEV-30372 Assertion in wsrep-lib state: This was caused by BF abort or KILL when local transaction was in the middle of group commit. This commit disables THD::killed handling during commit, so the problem is avoided. * MDEV-30963 Assertion failure !lock.was_chosen_as_deadlock_victim in trx0trx.h:1065: The assertion happened when the victim was BF aborted via MDL while it was committing. This commit changes MDL BF aborts so that transactions which are committing cannot be BF aborted via MDL. The RQG grammar attached in the issue could not reproduce the crash anymore. Original commit message from 10.5 fix: MDEV-29293 MariaDB stuck on starting commit state The problem seems to be a deadlock between KILL command execution and BF abort issued by an applier, where: * KILL has locked victim's LOCK_thd_kill and LOCK_thd_data. * Applier has innodb side global lock mutex and victim trx mutex. * KILL is calling innobase_kill_query, and is blocked by innodb global lock mutex. * Applier is in wsrep_innobase_kill_one_trx and is blocked by victim's LOCK_thd_kill. The fix in this commit removes the TOI replication of KILL command and makes KILL execution less intrusive operation. Aborting the victim happens now by using awake_no_mutex() and ha_abort_transaction(). 
If the KILL happens when the transaction is committing, the KILL operation is postponed to happen after the statement has completed in order to prevent KILL from interrupting commit processing. Notable changes in this commit: * wsrep client connection's error state may remain sticky after client connection is closed. This error message will then pop up for the next client session issuing first SQL statement. This problem surfaced with test galera.galera_bf_kill. The fix is to reset wsrep client error state, before a THD is reused for next connection. * Release THD locks in wsrep_abort_transaction when locking innodb mutexes. This guarantees same locking order as with applier BF aborting. * BF abort from MDL was changed to do BF abort on server/wsrep-lib side first, and only then do the BF abort on InnoDB side. This removes the need to call back from InnoDB for BF aborts which originate from MDL and simplifies the locking. * Removed wsrep_thd_set_wsrep_aborter() from service_wsrep.h. The manipulation of the wsrep_aborter can be done solely on server side. Moreover, it is now a debug-only variable and could be excluded from optimized builds. * Remove LOCK_thd_kill from wsrep_thd_LOCK/UNLOCK to allow more fine grained locking for SR BF abort which may require locking of victim LOCK_thd_kill. Added explicit call for wsrep_thd_kill_LOCK/UNLOCK where appropriate. * Wsrep-lib was updated to version which allows external locking for BF abort calls. Changes to MTR tests: * Disable galera_bf_abort_group_commit. This test is going to be removed (MDEV-30855). * Make galera_var_retry_autocommit result more readable by echoing cases and expectations into result. Only one expected result for reap to verify that server returns expected status for query. * Record galera_gcache_recover_manytrx as result file was incomplete. Trivial change. * Make galera_create_table_as_select more deterministic: Wait until CTAS execution has reached MDL wait for multi-master conflict case. 
Expected error from multi-master conflict is ER_QUERY_INTERRUPTED. This is because CTAS does not yet have open wsrep transaction when it is waiting for MDL, query gets interrupted instead of BF aborted. This should be addressed in separate task. * A new test galera_bf_abort_registering to check that registering trx gets BF aborted through MDL. * A new test galera_kill_group_commit to verify correct behavior when KILL is executed while the transaction is committing. Co-authored-by: Seppo Jaakola <seppo.jaakola@iki.fi> Co-authored-by: Jan Lindström <jan.lindstrom@galeracluster.com> Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>	2023-05-22 00:42:05 +02:00
Marko Mäkelä	085d0ac238	Merge 10.5 into 10.6	2023-02-28 16:05:21 +02:00
Monty	57c526ffb8	Added detection of memory overwrite with multi_malloc This patch also fixes some bugs detected by valgrind after this patch: - Not enough copy_func elements were allocated by Create_tmp_table() which causes a memory overwrite in Create_tmp_table::add_fields(). I added an ASSERT() to be able to detect this also without valgrind. The bug was that TMP_TABLE_PARAM::copy_fields was not correctly set when calling create_tmp_table(). - Aria::empty_bits is not allocated if there are no varchar/char/blob fields in the table. Fixed code to take this into account. This cannot cause any issues as this is just a memory access into other Aria memory and the content of the memory would not be used. - Aria::last_key_buff was not allocated big enough. This may have caused issues with rtrees and ma_extra(HA_EXTRA_REMEMBER_POS) as they would use the same memory area. - Aria and MyISAM didn't take extended key parts into account, which caused problems when copying rec_per_key from engine to sql level. - Mark asan builds with 'asan' in version string to detect these in not_valgrind_build.inc. This is needed to not have main.sp-no-valgrind fail with asan.	2023-02-27 19:25:44 +02:00
Marko Mäkelä	6aec87544c	Merge 10.5 into 10.6	2023-02-10 13:03:01 +02:00
Marko Mäkelä	c41c79650a	Merge 10.4 into 10.5	2023-02-10 12:02:11 +02:00
Vicențiu Ciorbaru	08c852026d	Apply clang-tidy to remove empty constructors / destructors This patch is the result of running run-clang-tidy -fix -header-filter=.* -checks='-,modernize-use-equals-default' . Code style changes have been done on top. The result of this change leads to the following improvements: 1. Binary size reduction. For a -DBUILD_CONFIG=mysql_release build, the binary size is reduced by ~400kb. * A raw -DCMAKE_BUILD_TYPE=Release reduces the binary size by ~1.4kb. 2. Compiler can better understand the intent of the code, thus it leads to more optimization possibilities. Additionally it enabled detecting unused variables that had an empty default constructor but not marked so explicitly. Particular change required following this patch in sql/opt_range.cc result_keys, an unused template class Bitmap now correctly issues unused variable warnings. Setting Bitmap template class constructor to default allows the compiler to identify that there are no side-effects when instantiating the class. Previously the compiler could not issue the warning as it assumed Bitmap class (being a template) would not be performing a NO-OP for its default constructor. This prevented the "unused variable warning".	2023-02-09 16:09:08 +02:00
Oleksandr Byelkin	c3a5cf2b5b	Merge branch '10.5' into 10.6	2023-01-31 09:31:42 +01:00
Oleksandr Byelkin	7fa02f5c0b	Merge branch '10.4' into 10.5	2023-01-27 13:54:14 +01:00
Alexander Barkov	284ac6f2b7	MDEV-27653 long uniques don't work with unicode collations	2023-01-19 20:33:03 +04:00
Marko Mäkelä	a8c5635cf1	Merge 10.5 into 10.6	2023-01-17 20:02:29 +02:00
Jan Lindström	179c283372	Merge branch 10.4 into 10.5	2023-01-14 08:25:57 +02:00
sjaakola	a44d896f98	10.4-MDEV-29684 Fixes for cluster wide write conflict resolving If two high priority threads have lock conflict, we look at the order of these transactions and honor the earlier transaction. for_locking parameter in lock_rec_has_to_wait() has become obsolete and it is now removed from the code. Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>	2023-01-14 07:50:04 +02:00
sjaakola	0ff7f33c7b	10.4-MDEV-29684 Fixes for cluster wide write conflict resolving The rather recent thd_need_ordering_with() function does not take high priority transactions' order into consideration. Changed this function to also compare transaction seqnos and favor earlier transaction. Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>	2023-01-13 13:11:03 +02:00
Oleksandr Byelkin	e5aa58190f	Merge branch '10.5' into 10.6	2022-11-02 14:33:20 +01:00
Oleksandr Byelkin	177d858e38	Merge branch '10.4' into 10.5	2022-11-02 13:14:54 +01:00
Sergei Golubchik	1a3859fff0	MDEV-29924 Assertion `(((nr) % (1LL << 24)) % (int) log_10_int[6 - dec]) == 0' failed in my_time_packed_to_binary on SELECT when using TIME field when assigning the cached item to the Item_cache for the first time make sure to use Item_cache::setup(), not Item_cache::store(). Because the former copies the metadata (and allocates memory, in case of Item_cache_row), and Item_cache::decimal must be set for comparisons to work correctly.	2022-11-01 13:22:34 +01:00
Vladislav Vaintroub	b7fe6179e8	MDEV-29843 Do not use asynchronous log_write_upto() for system THDs Non-blocking log_write_upto (MDEV-24341) was only designed for the client connections. Fix it so that it is not triggered for any system THD. Previously, an incomplete solution only excluded Innodb purge THDs, but not the slave for example. The hang in MDEV still remains somewhat a mystery though, it is not immediately clear how exactly condition variable can become corrupted. But it is clear that it can be avoided.	2022-10-25 19:40:44 +02:00
Sergei Golubchik	900d7bf360	Merge branch '10.5' into 10.6	2022-10-02 22:14:21 +02:00
Sergei Golubchik	3a2116241b	Merge branch '10.4' into 10.5	2022-10-02 14:38:13 +02:00
Sergei Golubchik	d4f6d2f08f	Merge branch '10.3' into 10.4	2022-10-01 23:07:26 +02:00
Sergei Golubchik	fa6d7e4e98	compilation error extended initializers are only allowed since c++11	2022-10-01 17:45:23 +02:00
Sergei Golubchik	194cc36805	Merge branch '10.5' into 10.6	2022-09-30 12:29:24 +02:00
Oleksandr Byelkin	f65ba9aeb7	MDEV-17124: mariadb 10.1.34, views and prepared statements: ERROR 1615 (HY000): Prepared statement needs to be re-prepared The problem is that if table definition cache (TDC) is full of real tables which are in tables cache, view definition can not stay there so will be removed by its own underlying tables. In situation above old mechanism of detection matching definition in PS and current version always require reprepare and so prevent executing the PS. One work around is to increase TDC, other - improve version check for views/triggers (which is done here). Now in suspicious cases we check: - timestamp (microseconds) of the view to be sure that version really have changed; - time (microseconds) of creation of a trigger related to time (microseconds) of statement preparation.	2022-09-30 12:11:37 +02:00
Sergei Golubchik	6b685ea7b0	correctness assert thd_get_ha_data() can be used without a lock, but only from the current thd thread, when calling from another thread it must be protected by thd->LOCK_thd_data * fix group commit code to take thd->LOCK_thd_data * remove innobase_close_connection() from the innodb background thread, it's not needed after `87775402cd` and was failing the assert with current_thd==0	2022-09-29 10:44:39 +02:00
Sergei Golubchik	de130323b4	MDEV-29368 Assertion `trx->mysql_thd == thd' failed in innobase_kill_query from process_timers/timer_handler and use-after-poison in innobase_kill_query This is a 10.5 version of `9b750dcbd8`, fix for MDEV-23536 Race condition between KILL and transaction commit InnoDB needs to remove trx from thd before destroying it (trx), otherwise a concurrent KILL might get a pointer from thd to a destroyed trx. ha_close_connection() should allow engines to clear ha_data in hton->on close_connection(). To prevent the engine from being unloaded while hton->close_connection() is running, we remove the lock from ha_data and unlock the plugin manually.	2022-09-29 00:11:02 +02:00
Marko Mäkelä	829e8111c7	Merge 10.5 into 10.6	2022-09-26 14:34:43 +03:00
Marko Mäkelä	6286a05d80	Merge 10.4 into 10.5	2022-09-26 13:34:38 +03:00
Marko Mäkelä	a69cf6f07e	MDEV-29613 Improve WITH_DBUG_TRACE=OFF In commit `28325b0863` a compile-time option was introduced to disable the macros DBUG_ENTER and DBUG_RETURN or DBUG_VOID_RETURN. The parameter name WITH_DBUG_TRACE would hint that it also covers DBUG_PRINT statements. Let us do that: WITH_DBUG_TRACE=OFF shall disable DBUG_PRINT() as well. A few InnoDB recovery tests used to check that some output from DBUG_PRINT("ib_log", ...) is present. We can live without those checks. Reviewed by: Vladislav Vaintroub	2022-09-23 13:40:42 +03:00
Jan Lindström	9fefd440b5	Merge 10.5 into 10.6	2022-09-05 14:05:30 +03:00
Jan Lindström	ba987a46c9	Merge 10.4 into 10.5	2022-09-05 13:28:56 +03:00
Daniele Sciascia	2917bd0d2c	Reduce compilation dependencies on wsrep_mysqld.h Making changes to wsrep_mysqld.h causes large parts of server code to be recompiled. The reason is that wsrep_mysqld.h is included by sql_class.h, even though very little of wsrep_mysqld.h is needed in sql_class.h. This commit introduces a new header file, wsrep_on.h, which is meant to be included from sql_class.h, and contains only macros and variable declarations used to determine whether wsrep is enabled. Also, header wsrep.h should only contain definitions that are also used outside of sql/. Therefore, move WSREP_TO_ISOLATION* and WSREP_SYNC_WAIT macros to wsrep_mysqld.h. Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>	2022-08-31 11:05:23 +03:00
Marko Mäkelä	fbb2b1f55f	Merge 10.5 into 10.6	2022-08-23 08:47:21 +03:00
Marko Mäkelä	3b656ac8c1	Merge 10.4 into 10.5	2022-08-22 19:49:56 +03:00
Alexander Barkov	316847eab7	MDEV-27101 Subquery using the ALL keyword on TIMESTAMP columns produces a wrong result TIMESTAMP columns were compared as strings in ALL/ANY comparison, which did not work well near DST time change. Changing ALL/ANY comparison to use "Native" representation to compare TIMESTAMP columns, like simple comparison does.	2022-08-22 14:27:22 +04:00
Marko Mäkelä	30914389fe	Merge 10.5 into 10.6	2022-07-27 17:52:37 +03:00
Marko Mäkelä	098c0f2634	Merge 10.4 into 10.5	2022-07-27 17:17:24 +03:00
Oleksandr Byelkin	3bb36e9495	Merge branch '10.3' into 10.4	2022-07-27 11:02:57 +02:00
haomi123	7b0e68b8a2	fix DBUG_ENTER awake_no_mutex	2022-07-22 14:59:18 +02:00
Sergei Golubchik	3bc98a4ec4	Merge branch '10.5' into 10.6	2022-05-10 14:01:23 +02:00
Sergei Golubchik	ef781162ff	Merge branch '10.4' into 10.5	2022-05-09 22:04:06 +02:00
Sergei Golubchik	a70a1cf3f4	Merge branch '10.3' into 10.4	2022-05-08 23:03:08 +02:00
Sergei Golubchik	6f741eb6e4	Merge branch '10.2' into 10.3	2022-05-07 11:48:15 +02:00
Andrei	a5dc12eefd	MDEV-28310 Missing binlog data for INSERT .. ON DUPLICATE KEY UPDATE MDEV-21810 MBR: Unexpected "Unsafe statement" warning for unsafe IODKU MDEV-17614 fixes to replication unsafety for INSERT ON DUP KEY UPDATE on two or more unique key table left a flaw. The fixes checked the safety condition per each inserted record with the idea to catch a user-created value to an autoincrement column and when that succeeds the autoincrement column would become the source of unsafety too. It was not expected that after a duplicate error the next record's write_set may become different and the unsafe decision for that specific record will be computed to screw the Query's binlogging state and when @@binlog_format is MIXED nothing gets bin-logged. This case has been already fixed in 10.5.2 by `91ab42a823` that relocated/optimized THD::decide_logging_format_low() out of the record insert loop. The safety decision is computed once and at the right time. Pertinent parts of the commit are cherry-picked. Also a spurious warning about unsafety is removed when MIXED @@binlog_format; original MDEV-17614 test result corrected. The original test of MDEV-17614 is extended and made more readable.	2022-05-06 22:16:42 +03:00
Oleksandr Byelkin	9614fde1aa	Merge branch '10.2' into 10.3	2022-05-03 10:59:54 +02:00
Brandon Nesterenko	a83c7ab1ea	MDEV-11853: semisync thread can be killed after sync binlog but before ACK in the sync state Problem: ======== If a primary is shutdown during an active semi-sync connection during the period when the primary is awaiting an ACK, the primary hard kills the active communication thread and does not ensure the transaction was received by a replica. This can lead to an inconsistent replication state. Solution: ======== During shutdown, the primary should wait for an ACK or timeout before hard killing a thread which is awaiting a communication. We extend the `SHUTDOWN WAIT FOR SLAVES` logic to identify and ignore any threads waiting for a semi-sync ACK in phase 1. Then, before stopping the ack receiver thread, the shutdown is delayed until all waiting semi-sync connections receive an ACK or time out. The connections are then killed in phase 2. Notes: 1) There remains an unresolved corner case that affects this patch. MDEV-28141: Slave crashes with Packets out of order when connecting to a shutting down master. Specifically, If a slave is connecting to a master which is actively shutting down, the slave can crash with a "Packets out of order" assertion error. To get around this issue in the MTR tests, the primary will wait a small amount of time before phase 1 killing threads to let the replicas safely stop (if applicable). 2) This patch also fixes MDEV-28114: Semi-sync Master ACK Receiver Thread Can Error on COM_QUIT Reviewed By ============ Andrei Elkin <andrei.elkin@mariadb.com>	2022-04-22 12:59:54 -06:00
Alexander Barkov	2be617d869	MDEV-25243 ASAN heap-use-after-free in Item_func_sp::execute_impl upon concurrent view DDL and I_S query with view and function	2022-04-21 09:51:11 +04:00
Marko Mäkelä	b242c3141f	Merge 10.5 into 10.6	2022-03-29 16:16:21 +03:00
Marko Mäkelä	d62b0368ca	Merge 10.4 into 10.5	2022-03-29 12:59:18 +03:00
sjaakola	97582f1c06	MDEV-27649 PS conflict handling causing node crash Handling BF abort for prepared statement execution so that EXECUTE processing will continue until parameter setup is complete, before BF abort bails out the statement execution. THD class has new boolean member: wsrep_delayed_BF_abort, which is set if BF abort is observed in do_command() right after reading client's packet, and if the client has sent PS execute command. In such case, the deadlock error is not returned immediately back to client, but the PS execution will be started. However, the PS execution loop, will now check if wsrep_delayed_BF_abort is set, and stop the PS execution after the type information has been assigned for the PS. With this, the PS protocol type information, which is present in the first PS EXECUTE command, is not lost even if the first PS EXECUTE command was marked to abort. Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>	2022-03-18 08:30:25 +02:00
Daniel Black	065f995e6d	Merge branch 10.5 into 10.6	2022-03-18 12:17:11 +11:00

3745 Commits