1
0
mirror of https://github.com/MariaDB/server.git synced 2025-07-18 23:03:28 +03:00
Commit Graph

4465 Commits

Author SHA1 Message Date
0cf2176b79 MDEV-34033 Exchange partition with virtual columns fails
MDEV-28127 did is_equal() which compared vcol expressions
literally. But another table vcol expression is not equal because of
different table name.

We implement another comparison method is_identical() which respects
different table name in vcol comparison. If any field item points to
table_A and compared field item points to table_B, such items are
treated as equal in (table_A, table_B) comparison. This is done by
cloning table_B expression and renaming any table_B entries to table_A
in it.
2025-01-14 18:56:13 +03:00
0d35fe6e57 MDEV-35326: Memory Leak in init_io_cache_ext upon SHUTDOWN
The problems were that:
1) resources was freed "asimetric" normal execution in send_eof,
 in case of error in destructor.
2) destructor was not called in case of SP for result objects.
(so if the last SP execution ended with error resorces was not
freeded on reinit before execution (cleanup() called before next
execution) and destructor also was not called due to lack of
delete call for the object)

Result cleanup() renamed to reset_for_next_ps_execution() to better
reflect function().

All result method revised and freeing resources made "symetric".

Destructor of result object called for SP.

Added skipped invalidation in case of error in insert.

Removed misleading naming of reset(thd) (could be mixed with
with reset()).
2025-01-13 10:04:27 +01:00
dbfee9fc2b MDEV-34348: Consolidate cmp function declarations
Partial commit of the greater MDEV-34348 scope.
MDEV-34348: MariaDB is violating clang-16 -Wcast-function-type-strict

The functions queue_compare, qsort2_cmp, and qsort_cmp2
all had similar interfaces, and were used interchangable
and unsafely cast to one another.

This patch consolidates the functions all into the
qsort_cmp2 interface.

Reviewed By:
============
Marko Mäkelä <marko.makela@mariadb.com>
2024-11-23 08:14:22 -07:00
cf2d49ddcf Extract some of #3360 fixes to 10.5.x
That PR uncovered countless issues on `my_snprintf` uses.
This commit backports a squashed subset of their fixes.
2024-11-21 22:43:56 +11:00
3cd706b107 MDEV-35236 Assertion `(mem_root->flags & 4) == 0' failed in safe_lexcstrdup_root
Post-fix for MDEV-35144.

Cannot allocate options values on the statement arena, because
HA_CREATE_INFO is shallow-copied for every execution, so if the
option_list was initially empty, it will be reset for every execution
and any values allocated on the  statement arena will be lost.

Cannot allocate option values on the execution arena, because
HA_CREATE_INFO is shallow-copied for every execution, so if the
option_list was  initially NOT empty, any values appended to the
end will be preserved and if they're on the execution arena their
content will be destroyed.

Let's use thd->change_item_tree() to save and restore necessary pointers
for every execution.

followup for 3da565c41d
2024-10-23 14:58:57 +02:00
bddbef3573 MDEV-34533 asan error about stack overflow when writing record in Aria
The problem was that when using clang + asan, we do not get a correct value
for the thread stack as some local variables are not allocated at the
normal stack.

It looks like that for example clang 18.1.3, when compiling with
-O2 -fsanitize=addressan it puts local variables and things allocated by
alloca() in other areas than on the stack.

The following code shows the issue

Thread 6 "mariadbd" hit Breakpoint 3, do_handle_one_connection
    (connect=0x5080000027b8,
    put_in_cache=<optimized out>) at sql/sql_connect.cc:1399

THD *thd;
1399      thd->thread_stack= (char*) &thd;
(gdb) p &thd
(THD **) 0x7fffedee7060
(gdb) p $sp
(void *) 0x7fffef4e7bc0

The address of thd is 24M away from the stack pointer

(gdb) info reg
...
rsp            0x7fffef4e7bc0      0x7fffef4e7bc0
...
r13            0x7fffedee7060      140737185214560

r13 is pointing to the address of the thd. Probably some kind of
"local stack" used by the sanitizer

I have verified this with gdb on a recursive call that calls alloca()
in a loop. In this case all objects was stored in a local heap,
not on the stack.

To solve this issue in a portable way, I have added two functions:

my_get_stack_pointer() returns the address of the current stack pointer.
The code is using asm instructions for intel 32/64 bit, powerpc,
arm 32/64 bit and sparc 32/64 bit.
Supported compilers are gcc, clang and MSVC.
For MSVC 64 bit we are using _AddressOfReturnAddress()

As a fallback for other compilers/arch we use the address of a local
variable.

my_get_stack_bounds() that will return the address of the base stack
and stack size using pthread_attr_getstack() or NtCurrentTed() with
fallback to using the address of a local variable and user provided
stack size.

Server changes are:

- Moving setting of thread_stack to THD::store_globals() using
  my_get_stack_bounds().
- Removing setting of thd->thread_stack, except in functions that
  allocates a lot on the stack before calling store_globals().  When
  using estimates for stack start, we reduce stack_size with
  MY_STACK_SAFE_MARGIN (8192) to take into account the stack used
  before calling store_globals().

I also added a unittest, stack_allocation-t, to verify the new code.

Reviewed-by: Sergei Golubchik <serg@mariadb.org>
2024-10-16 17:24:46 +03:00
5ebda30ccc Revert "MDEV-35019 Provide a way to enable "rollback XA on disconnect" behavior we had before 10.5.2"
This reverts commit 8ae462a220.
2024-10-16 13:23:47 +02:00
8ae462a220 MDEV-35019 Provide a way to enable "rollback XA on disconnect" behavior we had before 10.5.2
Implement variable legacy_xa_rollback_at_disconnect to support
backwards compatibility for applications that rely on the pre-10.5
behavior for connection disconnect, which is to rollback the
transaction (in violation of the XA specification).

Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
2024-10-16 10:18:36 +02:00
3ea71a2c8e MDEV-16699 heap-use-after-free in group_concat with compressed or GIS columns
Field_blob::store() has special code for GROUP_CONCAT temporary table
(to store blob values in Blob_mem_storage - this prevents them
from being freed/overwritten when a next row is read).

Field_geom and Field_blob_compressed inherit from Field_blob but they
have their own ::store() method without this special Blob_mem_storage
support.

Considering that non-grouping CONCAT() of such fields converts
them to plain BLOB, let's do the same for GROUP_CONCAT. To do it,
Item_func_group_concat::setup will signal that it's creating
a temporary table for GROUP_CONCAT, and Field_blog::make_new_field()
override will create base Field_blob when under group concat.
2024-10-08 15:31:02 +02:00
1cda4726ca MDEV-34993, part2: backport optimizer_adjust_secondary_key_costs
...and make the fix for MDEV-34993 switchable. It is enabled by default
and controlled with @optimizer_adjust_secondary_key_costs=fix_card_multiplier
2024-10-02 10:52:09 +03:00
20f57a8529 MDEV-33373 part 1: Unexpected ER_FILE_NOT_FOUND upon reading from logging table after crash recovery
We have found that my_errno can be "passed" to the next commad in some cases.

It is practically impossible to check/fix all cases of my_errno in the server,
plugins and engines so we will reset it as we reset other errors.

The test case will be fixed by CSV engine fix so will be added with it
(see part2).
2024-09-30 12:53:07 +02:00
aebd2397cc MDEV-34404 Use safe_str in spider udfs to avoid passing NULL str 2024-06-25 13:45:04 +08:00
db0c28eff8 MDEV-33746 Supply missing override markings
Find and fix missing virtual override markings.  Updates cmake
maintainer flags to include -Wsuggest-override and
-Winconsistent-missing-override.
2024-06-20 11:32:13 -04:00
310fd6ff69 Backporting bugs fixes fixed by MDEV-31340 from 11.5
The patch for MDEV-31340 fixed the following bugs:

MDEV-33084 LASTVAL(t1) and LASTVAL(T1) do not work well with lower-case-table-names=0
MDEV-33085 Tables T1 and t1 do not work well with ENGINE=CSV and lower-case-table-names=0
MDEV-33086 SHOW OPEN TABLES IN DB1 -- is case insensitive with lower-case-table-names=0
MDEV-33088 Cannot create triggers in the database `MYSQL`
MDEV-33103 LOCK TABLE t1 AS t2 -- alias is not case sensitive with lower-case-table-names=0
MDEV-33108 TABLE_STATISTICS and INDEX_STATISTICS are case insensitive with lower-case-table-names=0
MDEV-33109 DROP DATABASE MYSQL -- does not drop SP with lower-case-table-names=0
MDEV-33110 HANDLER commands are case insensitive with lower-case-table-names=0
MDEV-33119 User is case insensitive in INFORMATION_SCHEMA.VIEWS
MDEV-33120 System log table names are case insensitive with lower-cast-table-names=0

Backporting the fixes from 11.5 to 10.5
2024-05-21 14:58:01 +04:00
16aa4b5f59 Merge from 10.4 to 10.5
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
2024-04-15 17:46:49 +02:00
d695e2de54 MDEV-33506 Show original IP in the "aborted" message.
Add "real ip:<ip_or_localhost>" part to the aborted message
Only for proxy-protocoled connection, so it does not  not to cause
confusion to normal users.
2024-03-26 13:10:36 +01:00
d7758debae MDEV-33218: Assertion `active_arena->is_stmt_prepare_or_first_stmt_execute() || active_arena->state == Query_arena::STMT_SP_QUERY_ARGUMENTS' failed in st_select_lex::fix_prepare_information
In case there is a view that queried from a stored routine or
a prepared statement and this temporary table is dropped between
executions of SP/PS, then it leads to hitting an assertion
at the SELECT_LEX::fix_prepare_information. The fired assertion
 was added by the commit 85f2e4f8e8
(MDEV-32466: Potential memory leak on executing of create view statement).
Firing of this assertion means memory leaking on execution of SP/PS.
Moreover, if the added assert be commented out, different result sets
can be produced by the statement SELECT * FROM the hidden table.

Both hitting the assertion and different result sets have the same root
cause. This cause is usage of temporary table's metadata after the table
itself has been dropped. To fix the issue, reload the cache of stored
routines. To do it  cache of stored routines is reset at the end of
execution of the function dispatch_command(). Next time any stored routine
be called it will be loaded from the table mysql.proc. This happens inside
the method Sp_handler::sp_cache_routine where loading of a stored routine
is performed in case it missed in cache. Loading is performed unconditionally
while previously it was controlled by the parameter lookup_only. By that
reason the signature of the method Sroutine_hash_entry::sp_cache_routine
was changed by removing unused parameter lookup_only.

Clearing of sp caches affects the test main.lock_sync since it forces
opening and locking the table mysql.proc but the test assumes that each
statement locks its tables once during its execution. To keep this invariant
the debug sync points with names "before_lock_tables_takes_lock" and
"after_lock_tables_takes_lock" are not activated on handling the table
mysql.proc
2024-03-14 15:43:03 +07:00
85517f609a MDEV-33393 audit plugin do not report user did the action..
The '<replication_slave>' user is assigned to the slave replication
thread so this name appears in the auditing logs.
2024-02-14 00:02:29 +04:00
fa3171df08 MDEV-27666 User variable not parsed as geometry variable in geometry function
Adding GEOMETRY type user variables.
2024-01-16 18:53:23 +04:00
3a3a4f044f Merge 10.4 into 10.5 2024-01-03 12:07:51 +02:00
9695974e4b MDEV-33019 The database part is not case sensitive in SP names
Problem:

sp_cache erroneously looked up fully qualified SP names (e.g. `DB`.`SP`),
in case insensitive style. It was wrong, because only the "name"
part is always case insensitive, while the "db" part should be compared
according to lower_case_table_names (case sensitively for 0,
case insensitively for 1 and 2).

Fix:

Adding a "casedn_name" parameter make_qname() to tell
if the name part should be lower cased:
  `DB1`.`SP` -> "DB1.SP"  (when casedn_name=false)
  `DB1`.`SP` -> "DB1.sp"  (when casedn_name=true)
and using make_qname() with casedn_name=true when creating
sp_cache hash lookup keys.

Details:

As a result, it now works as follows:
- sp_head::m_db is converted to lower case if lower_case_table_names>0
  during the sp_name initialization phase. So when make_qname() is called,
  sp_head::m_db is already normalized. There are no changes in here.

- The initialization phase of sp_head when creating sp_head::m_qname
  now calls make_qname() with casedn_name=true,
  so sp_head::m_name gets written to sp_head::m_qname in lower case.

- sp_cache_lookup() now also calls make_qname() with casedn_name=true,
  so sp_head::m_name gets written to the temporary lookup key in lower case.

- sp_cache::m_hashtable now uses case sensitive comparison
2023-12-27 13:41:42 +04:00
916caac2a5 MDEV-33019 The database part is not case sensitive in SP names
Part#1 A non-functional change

Changing the signature of Identifier_chain2::make_qname() from

  bool make_qname(MEM_ROOT *mem_root, LEX_CSTRING *dst) const;

to

  LEX_CSTRING make_qname(MEM_ROOT *mem_root) const;

Now the result is returned as LEX_CSTRING from the function rather than
is passed as a parameter.
The return value {NULL,0} means "EOM".
2023-12-27 13:22:49 +04:00
371bf4abc6 A 11.3->10.4 backport for MDEV-31991 Split class Database_qualified_name
This is a requirement step to fix and merge easier
  MDEV-33019 The database part is not case sensitive in SP names

The original MDEV-31991 commit commend:

- Moving some of Database_qualified_name methods into a new class
  Identifier_chain2.

- Changing the data type of the following variables from
  Database_qualified_name to Identifier_chain2:

  * q_pkg_proc in LEX::call_statement_start()
  * q_pkg_func in LEX::make_item_func_call_generic()

Rationale:

The data type of Database_qualified_name::m_db will be changed
to Lex_ident_db soon. So Database_qualified_name won't be able
to store the `pkg.routine` part of `db.pkg.routine` any more,
because `pkg` must not depend on lower-case-table-names.
2023-12-27 13:02:58 +04:00
98a39b0c91 Merge branch '10.4' into 10.5 2023-12-02 01:02:50 +01:00
dc1165419a Do not use MEM_ROOT in set_killed_no_mutex()
The reason for this change are the following:
- If we call set_killed() from one thread to kill another thread with
  a message, there may be concurrent usage of the MEM_ROOT which is
  not supported (this could cause memory corruption).
  We do not currently have code that does this, but the API allows this
  and it is better to be fix the issue before it happens.
- The per thread memory tracking does not work if one thread uses
  another threads MEM_ROOT.
- set_killed() can be called if a MEM_ROOT allocation fails.  In this case
  it is not good to try to allocate more memory from potentially the same
  MEM_ROOT.

Fix is to use my_malloc() instead of mem_root for killed messages.
2023-11-27 19:08:14 +02:00
5064750fbf MDEV-32466: Potential memory leak on executing of create view statement
This patch is actually follow-up for the task
  MDEV-23902: MariaDB crash on calling function
to use correct query arena for a statement. In case invocation of
a function is in progress use its call arena, else use current
query arena that can be either a statement or a regular query arena.
2023-11-24 16:26:12 +07:00
6cfd2ba397 Merge branch '10.4' into 10.5 2023-11-08 12:59:00 +01:00
2447172afb Ensure that process "State" is properly cleaned after query execution
In some cases "SHOW PROCESSLIST" could show "Reset for next command"
as State, even if the previous query had finished properly.

Fixed by clearing State after end of command and also setting the State
for the "Connect" command.

Other things:
- Changed usage of 'thd->set_command(COM_SLEEP)' to
  'thd->mark_connection_idle()'.
- Changed thread_state_info() to return "" instead of NULL. This is
  just a safety measurement and in line with the logic of the
  rest of the function.
2023-11-07 10:07:30 +02:00
3a8eb405e7 MDEV-27832 disable binary logging for SQL SERVICE.
Binary logging is now disabled for the queries run by SQL SERVICE.
The binlogging can be turned on with the 'SET SQL_LOG_BIN=On' query.

Conflicts:
	sql/sql_prepare.cc

Conflicts:
	sql/sql_prepare.cc
2023-11-05 23:35:31 +04:00
1fa196a559 MDEV-27595 Backport SQL service, introduced by MDEV-19275.
The SQL SERVICE backported into the 10.4.
2023-11-05 23:35:31 +04:00
df93b4f259 Fix MDEV-30820 problem found by Monty 2023-11-02 07:03:32 +01:00
d2d657e722 MDEV-31187 Add class Sql_mode_save_for_frm_handling 2023-10-23 13:44:31 +04:00
f293b2b211 cleanup 2023-10-17 14:32:05 +02:00
f57deb314f MDEV-31660 : Assertion `client_state.transaction().active() in wsrep_append_key
At the moment we cannot support
wsrep_forced_binlog_format=[MIXED|STATEMENT]
during CREATE TABLE AS SELECT.
Statement will use ROW instead and give
a warning.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2023-09-29 12:54:04 +02:00
6c05edfdcd Merge 10.4 into 10.5 2023-09-19 10:20:09 +03:00
68353dc92a MDEV-23902: MariaDB crash on calling function
On creation of a VIEW that depends on a stored routine an instance of
the class Item_func_sp is allocated on a memory root of SP statement.
It happens since mysql_make_view() calls the method
  THD::activate_stmt_arena_if_needed()
before parsing definition of the view.
On the other hand, when sp_head's rcontext is created an instance of
the class Field referenced by the data member
  Item_func_sp::result_field
is allocated on the Item_func_sp's Query_arena (call arena) that set up
inside the method
  Item_sp::execute_impl
just before calling the method
  sp_head::execute_function()

On return from the method sp_head::execute_function() all items allocated
on the Item_func_sp's Query_arena are released and its memory root is freed
(see implementation of the method Item_sp::execute_impl). As a consequence,
the pointer
  Item_func_sp::result_field
references to the deallocated memory. Later, when the method
  sp_head::execute
cleans up items allocated for just executed SP instruction the method
  Item_func_sp::cleanup is invoked and tries to delete an object referenced
by data member Item_func_sp::result_field that points to already deallocated
memory, that results in a server abnormal termination.

To fix the issue the current active arena shouldn't be switched to
a statement arena inside the function mysql_make_view() that invoked indirectly
by the method sp_head::rcontext_create. It is implemented by introducing
the new Query_arena's state STMT_SP_QUERY_ARGUMENTS that is set when explicit
Query_arena is created for placing SP arguments and other caller's side items
used during SP execution. Then the method THD::activate_stmt_arena_if_needed()
checks Query_arena's state and returns immediately without switching to
statement's arena.
2023-09-19 08:57:36 +07:00
f8f7d9de2c Merge 10.4 into 10.5 2023-09-11 11:29:31 +03:00
e937a64d46 MDEV-10356: rpl.rpl_parallel_temptable failure due to incorrect commit optimization of temptables
The problem was that parallel replication of temporary tables using
statement-based binlogging could overlap the COMMIT in one thread with a DML
or DROP TEMPORARY TABLE in another thread using the same temporary table.
Temporary tables are not safe for concurrent access, so this caused
reference to freed memory and possibly other nastiness.

The fix is to disable the optimisation with overlapping commits of one
transaction with the start of a later transaction, when temporary tables are
in use. Then the following event groups will be blocked from starting until
the one using temporary tables is completed.

This also fixes occasional test failures of rpl.rpl_parallel_temptable seen
in Buildbot.

Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
2023-09-07 14:40:05 +02:00
ab1191c039 cleanup: key->key_create_info.check_for_duplicate_indexes -> key->old
mark old keys in the ALTER TABLE with the `old` flag, not with
the `key_create_info.check_for_duplicate_indexes`.

This allows to mark old foreign keys too.
2023-08-01 22:43:16 +02:00
b8233b38da cleanup: put db/table_name into Alter_info
also, prefer Lex_table_name and Lex_ident over LEX_CSTRING
2023-08-01 22:43:16 +02:00
383baa812e cleanup: invert return code 2023-08-01 22:42:24 +02:00
7564be1352 Merge branch '10.4' into 10.5 2023-07-26 16:02:57 +02:00
400c101332 MDEV-30662 SQL/PL package body does not appear in I_S.ROUTINES.ROUTINE_DEFINITION
- Moving the code from a public function trim_whitespaces()
  to the class Lex_cstring as methods. This code may
  be useful in other contexts, and also this code becomes
  visible inside sql_class.h

- Adding a helper method THD::strmake_lex_cstring_trim_whitespaces()

- Unifying the way how CREATE PROCEDURE/CREATE FUNCTION and
  CREATE PACKAGE/CREATE PACKAGE BODY work:

  a) Now CREATE PACKAGE/CREATE PACKAGE BODY also calls
  Lex->sphead->set_body_start() to remember the cpp body start inside
  an sp_head member.

  b) adding a "const char *cpp_body_end" parameter to
  sp_head::set_stmt_end().

  These changes made it possible to reuse sp_head::set_stmt_end() inside
  LEX::create_package_finalize() and remove the duplucate code.

- Renaming sp_head::m_body_begin to m_cpp_body_begin and adding a comment
  to make it clear that this member is used only during parsing, and
  points to a fragment inside the cpp buffer.

- Changed sp_head::set_body_start() and sp_head::set_stmt_end()
  to skip the calls related to "body_utf8" in cases when m_parent is not NULL.
  A non-NULL m_parent means that we're inside a package routine.
  "body_utf8" in such case belongs not to the current sphead itself,
  but to parent (the package) sphead.
  So an sphead instance of a package routine should neither initialize,
  nor finalize, nor change in any other ways the "body_utf8" related
  members of Lex_input_stream, and should not take over or copy "body_utf8"
  data from Lex_input_stream to "this".
2023-07-14 13:26:26 +04:00
5d61442c85 MDEV-31448: Killing a replica thread awaiting its GCO can hang/crash a parallel replica
The problem is that when a worker thread is (user) killed in
wait_for_prior_commit, the event group may complete out-of-order since the
wait for prior commit was aborted by the kill.

This fix ensures that event groups will always complete in-order, even
in the error case. This is done in finish_event_group() by doing an
extra wait_for_prior_commit(), if necessary, that ignores kills.

This fix supersedes the fix for MDEV-30780, so the earlier fix for
that is reverted in this patch.

Also fix that an error from wait_for_prior_commit() inside
finish_event_group() would not signal the error to
wakeup_subsequent_commits().

Based on earlier work by Brandon Nesterenko and Andrei Elkin, with
some changes to simplify the semantics of wait_for_prior_commit() and
make the code more robust to future changes.

Reviewed-by: Andrei Elkin <andrei.elkin@mariadb.com>
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
2023-07-12 09:41:32 +02:00
9808ebe195 MDEV-30978: On slave XA COMMIT/XA ROLLBACK fail to return an error in read-only mode
Where a read-only server permits writes through replication, it
should not permit user connections to commit/rollback XA
transactions prepared via replication. The bug reported in
MDEV-30978 shows that this can happen. This is because there is no
read only check in the XA transaction logic, the most relevant one
occurs in ha_commit_trans() for normal statements/transactions.

This patch extends the XA transaction logic to check the read only
status of the server before performing an XA COMMIT or ROLLBACK.

Reviewed By:
Andrei Elkin <andrei.elkin@mariadb.com>
2023-07-11 07:49:44 -06:00
8ed88e3455 Revert "MDEV-13915: STOP SLAVE takes very long time on a busy system"
This reverts commit 0a99d457b3
because it should go into only 10.5+
2023-06-06 08:11:38 -06:00
0a99d457b3 MDEV-13915: STOP SLAVE takes very long time on a busy system
The problem is that a parallel replica would not immediately stop
running/queued transactions when issued STOP SLAVE. That is, it
allowed the current group of transactions to run, and sometimes the
transactions which belong to the next group could be started and run
through commit after STOP SLAVE was issued too, if the last group
had started committing. This would lead to long periods to wait for
all waiting transactions to finish.

This patch updates a parallel replica to try and abort immediately
and roll-back any ongoing transactions. The exception to this is any
transactions which are non-transactional (e.g. those modifying
sequences or non-transactional tables), and any prior transactions,
will be run to completion.

The specifics are as follows:

 1. A new stage was added to SHOW PROCESSLIST output for the SQL
Thread when it is waiting for a replica thread to either rollback or
finish its transaction before stopping. This stage presents as
“Waiting for worker thread to stop”

 2. Worker threads which error or are killed no longer perform GCO
cleanup if there is a concurrently running prior transaction. This
is because a worker thread scheduled to run in a future GCO could be
killed and incorrectly perform cleanup of the active GCO.

 3. Refined cases when the FL_TRANSACTIONAL flag is added to GTID
binlog events to disallow adding it to transactions which modify
both transactional and non-transactional engines when the binlogging
configuration allow the modifications to exist in the same event,
i.e. when using binlog_direct_non_trans_update == 0 and
binlog_format == statement.

 4. A few existing MTR tests relied on the completion of certain
transactions after issuing STOP SLAVE, and were re-recorded
(potentially with added synchronizations) under the new rollback
behavior.

Reviewed By
===========
Andrei Elkin <andrei.elkin@mariadb.com>
2023-06-05 10:03:06 -06:00
3f59bbeeae MDEV-29293 MariaDB stuck on starting commit state
The problem seems to be a deadlock between KILL command execution
and BF abort issued by an applier, where:
* KILL has locked victim's LOCK_thd_kill and LOCK_thd_data.
* Applier has innodb side global lock mutex and victim trx mutex.
* KILL is calling innobase_kill_query, and is blocked by innodb
  global lock mutex.
* Applier is in wsrep_innobase_kill_one_trx and is blocked by
  victim's LOCK_thd_kill.

The fix in this commit removes the TOI replication of KILL command
and makes KILL execution less intrusive operation. Aborting the
victim happens now by using awake_no_mutex() and ha_abort_transaction().
If the KILL happens when the transaction is committing, the
KILL operation is postponed to happen after the statement
has completed in order to avoid KILL to interrupt commit
processing.

Notable changes in this commit:
* wsrep client connections's error state may remain sticky after
  client connection is closed. This error message will then pop
  up for the next client session issuing first SQL statement.
  This problem raised with test galera.galera_bf_kill.
  The fix is to reset wsrep client error state, before a THD is
  reused for next connetion.
* Release THD locks in wsrep_abort_transaction when locking
  innodb mutexes. This guarantees same locking order as with applier
  BF aborting.
* BF abort from MDL was changed to do BF abort on server/wsrep-lib
  side first, and only then do the BF abort on InnoDB side. This
  removes the need to call back from InnoDB for BF aborts which originate
  from MDL and simplifies the locking.
* Removed wsrep_thd_set_wsrep_aborter() from service_wsrep.h.
  The manipulation of the wsrep_aborter can be done solely on
  server side. Moreover, it is now debug only variable and
  could be excluded from optimized builds.
* Remove LOCK_thd_kill from wsrep_thd_LOCK/UNLOCK to allow more
  fine grained locking for SR BF abort which may require locking
  of victim LOCK_thd_kill. Added explicit call for
  wsrep_thd_kill_LOCK/UNLOCK where appropriate.
* Wsrep-lib was updated to version which allows external
  locking for BF abort calls.

Changes to MTR tests:
* Disable galera_bf_abort_group_commit. This test is going to
  be removed (MDEV-30855).
* Record galera_gcache_recover_manytrx as result file was incomplete.
  Trivial change.
* Make galera_create_table_as_select more deterministic:
  Wait until CTAS execution has reached MDL wait for multi-master
  conflict case. Expected error from multi-master conflict is
  ER_QUERY_INTERRUPTED. This is because CTAS does not yet have open
  wsrep transaction when it is waiting for MDL, query gets interrupted
  instead of BF aborted. This should be addressed in separate task.
* A new test galera_kill_group_commit to verify correct behavior
  when KILL is executed while the transaction is committing.

Co-authored-by: Seppo Jaakola <seppo.jaakola@iki.fi>
Co-authored-by: Jan Lindström <jan.lindstrom@galeracluster.com>
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2023-05-22 00:39:43 +02:00
6966d7fe4b MDEV-29293 MariaDB stuck on starting commit state
This is a backport from 10.5.

The problem seems to be a deadlock between KILL command execution
and BF abort issued by an applier, where:
* KILL has locked victim's LOCK_thd_kill and LOCK_thd_data.
* Applier has innodb side global lock mutex and victim trx mutex.
* KILL is calling innobase_kill_query, and is blocked by innodb
  global lock mutex.
* Applier is in wsrep_innobase_kill_one_trx and is blocked by
  victim's LOCK_thd_kill.

The fix in this commit removes the TOI replication of KILL command
and makes KILL execution less intrusive operation. Aborting the
victim happens now by using awake_no_mutex() and ha_abort_transaction().
If the KILL happens when the transaction is committing, the
KILL operation is postponed to happen after the statement
has completed in order to avoid KILL to interrupt commit
processing.

Notable changes in this commit:
* wsrep client connections's error state may remain sticky after
  client connection is closed. This error message will then pop
  up for the next client session issuing first SQL statement.
  This problem raised with test galera.galera_bf_kill.
  The fix is to reset wsrep client error state, before a THD is
  reused for next connetion.
* Release THD locks in wsrep_abort_transaction when locking
  innodb mutexes. This guarantees same locking order as with applier
  BF aborting.
* BF abort from MDL was changed to do BF abort on server/wsrep-lib
  side first, and only then do the BF abort on InnoDB side. This
  removes the need to call back from InnoDB for BF aborts which originate
  from MDL and simplifies the locking.
* Removed wsrep_thd_set_wsrep_aborter() from service_wsrep.h.
  The manipulation of the wsrep_aborter can be done solely on
  server side. Moreover, it is now debug only variable and
  could be excluded from optimized builds.
* Remove LOCK_thd_kill from wsrep_thd_LOCK/UNLOCK to allow more
  fine grained locking for SR BF abort which may require locking
  of victim LOCK_thd_kill. Added explicit call for
  wsrep_thd_kill_LOCK/UNLOCK where appropriate.
* Wsrep-lib was updated to version which allows external
  locking for BF abort calls.

Changes to MTR tests:
* Disable galera_bf_abort_group_commit. This test is going to
  be removed (MDEV-30855).
* Record galera_gcache_recover_manytrx as result file was incomplete.
  Trivial change.
* Make galera_create_table_as_select more deterministic:
  Wait until CTAS execution has reached MDL wait for multi-master
  conflict case. Expected error from multi-master conflict is
  ER_QUERY_INTERRUPTED. This is because CTAS does not yet have open
  wsrep transaction when it is waiting for MDL, query gets interrupted
  instead of BF aborted. This should be addressed in separate task.
* A new test galera_kill_group_commit to verify correct behavior
  when KILL is executed while the transaction is committing.

Co-authored-by: Seppo Jaakola <seppo.jaakola@iki.fi>
Co-authored-by: Jan Lindström <jan.lindstrom@galeracluster.com>
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2023-05-22 00:33:37 +02:00
ac5a534a4c Merge remote-tracking branch '10.4' into 10.5 2023-03-31 21:32:41 +02:00