MDEV-28127 did is_equal() which compared vcol expressions
literally. But another table vcol expression is not equal because of
different table name.
We implement another comparison method is_identical() which respects
different table name in vcol comparison. If any field item points to
table_A and compared field item points to table_B, such items are
treated as equal in (table_A, table_B) comparison. This is done by
cloning table_B expression and renaming any table_B entries to table_A
in it.
The problems were that:
1) resources was freed "asimetric" normal execution in send_eof,
in case of error in destructor.
2) destructor was not called in case of SP for result objects.
(so if the last SP execution ended with error resorces was not
freeded on reinit before execution (cleanup() called before next
execution) and destructor also was not called due to lack of
delete call for the object)
Result cleanup() renamed to reset_for_next_ps_execution() to better
reflect function().
All result method revised and freeing resources made "symetric".
Destructor of result object called for SP.
Added skipped invalidation in case of error in insert.
Removed misleading naming of reset(thd) (could be mixed with
with reset()).
Partial commit of the greater MDEV-34348 scope.
MDEV-34348: MariaDB is violating clang-16 -Wcast-function-type-strict
The functions queue_compare, qsort2_cmp, and qsort_cmp2
all had similar interfaces, and were used interchangable
and unsafely cast to one another.
This patch consolidates the functions all into the
qsort_cmp2 interface.
Reviewed By:
============
Marko Mäkelä <marko.makela@mariadb.com>
Post-fix for MDEV-35144.
Cannot allocate options values on the statement arena, because
HA_CREATE_INFO is shallow-copied for every execution, so if the
option_list was initially empty, it will be reset for every execution
and any values allocated on the statement arena will be lost.
Cannot allocate option values on the execution arena, because
HA_CREATE_INFO is shallow-copied for every execution, so if the
option_list was initially NOT empty, any values appended to the
end will be preserved and if they're on the execution arena their
content will be destroyed.
Let's use thd->change_item_tree() to save and restore necessary pointers
for every execution.
followup for 3da565c41d
The problem was that when using clang + asan, we do not get a correct value
for the thread stack as some local variables are not allocated at the
normal stack.
It looks like that for example clang 18.1.3, when compiling with
-O2 -fsanitize=addressan it puts local variables and things allocated by
alloca() in other areas than on the stack.
The following code shows the issue
Thread 6 "mariadbd" hit Breakpoint 3, do_handle_one_connection
(connect=0x5080000027b8,
put_in_cache=<optimized out>) at sql/sql_connect.cc:1399
THD *thd;
1399 thd->thread_stack= (char*) &thd;
(gdb) p &thd
(THD **) 0x7fffedee7060
(gdb) p $sp
(void *) 0x7fffef4e7bc0
The address of thd is 24M away from the stack pointer
(gdb) info reg
...
rsp 0x7fffef4e7bc0 0x7fffef4e7bc0
...
r13 0x7fffedee7060 140737185214560
r13 is pointing to the address of the thd. Probably some kind of
"local stack" used by the sanitizer
I have verified this with gdb on a recursive call that calls alloca()
in a loop. In this case all objects was stored in a local heap,
not on the stack.
To solve this issue in a portable way, I have added two functions:
my_get_stack_pointer() returns the address of the current stack pointer.
The code is using asm instructions for intel 32/64 bit, powerpc,
arm 32/64 bit and sparc 32/64 bit.
Supported compilers are gcc, clang and MSVC.
For MSVC 64 bit we are using _AddressOfReturnAddress()
As a fallback for other compilers/arch we use the address of a local
variable.
my_get_stack_bounds() that will return the address of the base stack
and stack size using pthread_attr_getstack() or NtCurrentTed() with
fallback to using the address of a local variable and user provided
stack size.
Server changes are:
- Moving setting of thread_stack to THD::store_globals() using
my_get_stack_bounds().
- Removing setting of thd->thread_stack, except in functions that
allocates a lot on the stack before calling store_globals(). When
using estimates for stack start, we reduce stack_size with
MY_STACK_SAFE_MARGIN (8192) to take into account the stack used
before calling store_globals().
I also added a unittest, stack_allocation-t, to verify the new code.
Reviewed-by: Sergei Golubchik <serg@mariadb.org>
Implement variable legacy_xa_rollback_at_disconnect to support
backwards compatibility for applications that rely on the pre-10.5
behavior for connection disconnect, which is to rollback the
transaction (in violation of the XA specification).
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
Field_blob::store() has special code for GROUP_CONCAT temporary table
(to store blob values in Blob_mem_storage - this prevents them
from being freed/overwritten when a next row is read).
Field_geom and Field_blob_compressed inherit from Field_blob but they
have their own ::store() method without this special Blob_mem_storage
support.
Considering that non-grouping CONCAT() of such fields converts
them to plain BLOB, let's do the same for GROUP_CONCAT. To do it,
Item_func_group_concat::setup will signal that it's creating
a temporary table for GROUP_CONCAT, and Field_blog::make_new_field()
override will create base Field_blob when under group concat.
We have found that my_errno can be "passed" to the next commad in some cases.
It is practically impossible to check/fix all cases of my_errno in the server,
plugins and engines so we will reset it as we reset other errors.
The test case will be fixed by CSV engine fix so will be added with it
(see part2).
The patch for MDEV-31340 fixed the following bugs:
MDEV-33084 LASTVAL(t1) and LASTVAL(T1) do not work well with lower-case-table-names=0
MDEV-33085 Tables T1 and t1 do not work well with ENGINE=CSV and lower-case-table-names=0
MDEV-33086 SHOW OPEN TABLES IN DB1 -- is case insensitive with lower-case-table-names=0
MDEV-33088 Cannot create triggers in the database `MYSQL`
MDEV-33103 LOCK TABLE t1 AS t2 -- alias is not case sensitive with lower-case-table-names=0
MDEV-33108 TABLE_STATISTICS and INDEX_STATISTICS are case insensitive with lower-case-table-names=0
MDEV-33109 DROP DATABASE MYSQL -- does not drop SP with lower-case-table-names=0
MDEV-33110 HANDLER commands are case insensitive with lower-case-table-names=0
MDEV-33119 User is case insensitive in INFORMATION_SCHEMA.VIEWS
MDEV-33120 System log table names are case insensitive with lower-cast-table-names=0
Backporting the fixes from 11.5 to 10.5
Add "real ip:<ip_or_localhost>" part to the aborted message
Only for proxy-protocoled connection, so it does not not to cause
confusion to normal users.
In case there is a view that queried from a stored routine or
a prepared statement and this temporary table is dropped between
executions of SP/PS, then it leads to hitting an assertion
at the SELECT_LEX::fix_prepare_information. The fired assertion
was added by the commit 85f2e4f8e8
(MDEV-32466: Potential memory leak on executing of create view statement).
Firing of this assertion means memory leaking on execution of SP/PS.
Moreover, if the added assert be commented out, different result sets
can be produced by the statement SELECT * FROM the hidden table.
Both hitting the assertion and different result sets have the same root
cause. This cause is usage of temporary table's metadata after the table
itself has been dropped. To fix the issue, reload the cache of stored
routines. To do it cache of stored routines is reset at the end of
execution of the function dispatch_command(). Next time any stored routine
be called it will be loaded from the table mysql.proc. This happens inside
the method Sp_handler::sp_cache_routine where loading of a stored routine
is performed in case it missed in cache. Loading is performed unconditionally
while previously it was controlled by the parameter lookup_only. By that
reason the signature of the method Sroutine_hash_entry::sp_cache_routine
was changed by removing unused parameter lookup_only.
Clearing of sp caches affects the test main.lock_sync since it forces
opening and locking the table mysql.proc but the test assumes that each
statement locks its tables once during its execution. To keep this invariant
the debug sync points with names "before_lock_tables_takes_lock" and
"after_lock_tables_takes_lock" are not activated on handling the table
mysql.proc
Problem:
sp_cache erroneously looked up fully qualified SP names (e.g. `DB`.`SP`),
in case insensitive style. It was wrong, because only the "name"
part is always case insensitive, while the "db" part should be compared
according to lower_case_table_names (case sensitively for 0,
case insensitively for 1 and 2).
Fix:
Adding a "casedn_name" parameter make_qname() to tell
if the name part should be lower cased:
`DB1`.`SP` -> "DB1.SP" (when casedn_name=false)
`DB1`.`SP` -> "DB1.sp" (when casedn_name=true)
and using make_qname() with casedn_name=true when creating
sp_cache hash lookup keys.
Details:
As a result, it now works as follows:
- sp_head::m_db is converted to lower case if lower_case_table_names>0
during the sp_name initialization phase. So when make_qname() is called,
sp_head::m_db is already normalized. There are no changes in here.
- The initialization phase of sp_head when creating sp_head::m_qname
now calls make_qname() with casedn_name=true,
so sp_head::m_name gets written to sp_head::m_qname in lower case.
- sp_cache_lookup() now also calls make_qname() with casedn_name=true,
so sp_head::m_name gets written to the temporary lookup key in lower case.
- sp_cache::m_hashtable now uses case sensitive comparison
Part#1 A non-functional change
Changing the signature of Identifier_chain2::make_qname() from
bool make_qname(MEM_ROOT *mem_root, LEX_CSTRING *dst) const;
to
LEX_CSTRING make_qname(MEM_ROOT *mem_root) const;
Now the result is returned as LEX_CSTRING from the function rather than
is passed as a parameter.
The return value {NULL,0} means "EOM".
This is a requirement step to fix and merge easier
MDEV-33019 The database part is not case sensitive in SP names
The original MDEV-31991 commit commend:
- Moving some of Database_qualified_name methods into a new class
Identifier_chain2.
- Changing the data type of the following variables from
Database_qualified_name to Identifier_chain2:
* q_pkg_proc in LEX::call_statement_start()
* q_pkg_func in LEX::make_item_func_call_generic()
Rationale:
The data type of Database_qualified_name::m_db will be changed
to Lex_ident_db soon. So Database_qualified_name won't be able
to store the `pkg.routine` part of `db.pkg.routine` any more,
because `pkg` must not depend on lower-case-table-names.
The reason for this change are the following:
- If we call set_killed() from one thread to kill another thread with
a message, there may be concurrent usage of the MEM_ROOT which is
not supported (this could cause memory corruption).
We do not currently have code that does this, but the API allows this
and it is better to be fix the issue before it happens.
- The per thread memory tracking does not work if one thread uses
another threads MEM_ROOT.
- set_killed() can be called if a MEM_ROOT allocation fails. In this case
it is not good to try to allocate more memory from potentially the same
MEM_ROOT.
Fix is to use my_malloc() instead of mem_root for killed messages.
This patch is actually follow-up for the task
MDEV-23902: MariaDB crash on calling function
to use correct query arena for a statement. In case invocation of
a function is in progress use its call arena, else use current
query arena that can be either a statement or a regular query arena.
In some cases "SHOW PROCESSLIST" could show "Reset for next command"
as State, even if the previous query had finished properly.
Fixed by clearing State after end of command and also setting the State
for the "Connect" command.
Other things:
- Changed usage of 'thd->set_command(COM_SLEEP)' to
'thd->mark_connection_idle()'.
- Changed thread_state_info() to return "" instead of NULL. This is
just a safety measurement and in line with the logic of the
rest of the function.
Binary logging is now disabled for the queries run by SQL SERVICE.
The binlogging can be turned on with the 'SET SQL_LOG_BIN=On' query.
Conflicts:
sql/sql_prepare.cc
Conflicts:
sql/sql_prepare.cc
At the moment we cannot support
wsrep_forced_binlog_format=[MIXED|STATEMENT]
during CREATE TABLE AS SELECT.
Statement will use ROW instead and give
a warning.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
On creation of a VIEW that depends on a stored routine an instance of
the class Item_func_sp is allocated on a memory root of SP statement.
It happens since mysql_make_view() calls the method
THD::activate_stmt_arena_if_needed()
before parsing definition of the view.
On the other hand, when sp_head's rcontext is created an instance of
the class Field referenced by the data member
Item_func_sp::result_field
is allocated on the Item_func_sp's Query_arena (call arena) that set up
inside the method
Item_sp::execute_impl
just before calling the method
sp_head::execute_function()
On return from the method sp_head::execute_function() all items allocated
on the Item_func_sp's Query_arena are released and its memory root is freed
(see implementation of the method Item_sp::execute_impl). As a consequence,
the pointer
Item_func_sp::result_field
references to the deallocated memory. Later, when the method
sp_head::execute
cleans up items allocated for just executed SP instruction the method
Item_func_sp::cleanup is invoked and tries to delete an object referenced
by data member Item_func_sp::result_field that points to already deallocated
memory, that results in a server abnormal termination.
To fix the issue the current active arena shouldn't be switched to
a statement arena inside the function mysql_make_view() that invoked indirectly
by the method sp_head::rcontext_create. It is implemented by introducing
the new Query_arena's state STMT_SP_QUERY_ARGUMENTS that is set when explicit
Query_arena is created for placing SP arguments and other caller's side items
used during SP execution. Then the method THD::activate_stmt_arena_if_needed()
checks Query_arena's state and returns immediately without switching to
statement's arena.
The problem was that parallel replication of temporary tables using
statement-based binlogging could overlap the COMMIT in one thread with a DML
or DROP TEMPORARY TABLE in another thread using the same temporary table.
Temporary tables are not safe for concurrent access, so this caused
reference to freed memory and possibly other nastiness.
The fix is to disable the optimisation with overlapping commits of one
transaction with the start of a later transaction, when temporary tables are
in use. Then the following event groups will be blocked from starting until
the one using temporary tables is completed.
This also fixes occasional test failures of rpl.rpl_parallel_temptable seen
in Buildbot.
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
mark old keys in the ALTER TABLE with the `old` flag, not with
the `key_create_info.check_for_duplicate_indexes`.
This allows to mark old foreign keys too.
- Moving the code from a public function trim_whitespaces()
to the class Lex_cstring as methods. This code may
be useful in other contexts, and also this code becomes
visible inside sql_class.h
- Adding a helper method THD::strmake_lex_cstring_trim_whitespaces()
- Unifying the way how CREATE PROCEDURE/CREATE FUNCTION and
CREATE PACKAGE/CREATE PACKAGE BODY work:
a) Now CREATE PACKAGE/CREATE PACKAGE BODY also calls
Lex->sphead->set_body_start() to remember the cpp body start inside
an sp_head member.
b) adding a "const char *cpp_body_end" parameter to
sp_head::set_stmt_end().
These changes made it possible to reuse sp_head::set_stmt_end() inside
LEX::create_package_finalize() and remove the duplucate code.
- Renaming sp_head::m_body_begin to m_cpp_body_begin and adding a comment
to make it clear that this member is used only during parsing, and
points to a fragment inside the cpp buffer.
- Changed sp_head::set_body_start() and sp_head::set_stmt_end()
to skip the calls related to "body_utf8" in cases when m_parent is not NULL.
A non-NULL m_parent means that we're inside a package routine.
"body_utf8" in such case belongs not to the current sphead itself,
but to parent (the package) sphead.
So an sphead instance of a package routine should neither initialize,
nor finalize, nor change in any other ways the "body_utf8" related
members of Lex_input_stream, and should not take over or copy "body_utf8"
data from Lex_input_stream to "this".
The problem is that when a worker thread is (user) killed in
wait_for_prior_commit, the event group may complete out-of-order since the
wait for prior commit was aborted by the kill.
This fix ensures that event groups will always complete in-order, even
in the error case. This is done in finish_event_group() by doing an
extra wait_for_prior_commit(), if necessary, that ignores kills.
This fix supersedes the fix for MDEV-30780, so the earlier fix for
that is reverted in this patch.
Also fix that an error from wait_for_prior_commit() inside
finish_event_group() would not signal the error to
wakeup_subsequent_commits().
Based on earlier work by Brandon Nesterenko and Andrei Elkin, with
some changes to simplify the semantics of wait_for_prior_commit() and
make the code more robust to future changes.
Reviewed-by: Andrei Elkin <andrei.elkin@mariadb.com>
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
Where a read-only server permits writes through replication, it
should not permit user connections to commit/rollback XA
transactions prepared via replication. The bug reported in
MDEV-30978 shows that this can happen. This is because there is no
read only check in the XA transaction logic, the most relevant one
occurs in ha_commit_trans() for normal statements/transactions.
This patch extends the XA transaction logic to check the read only
status of the server before performing an XA COMMIT or ROLLBACK.
Reviewed By:
Andrei Elkin <andrei.elkin@mariadb.com>
The problem is that a parallel replica would not immediately stop
running/queued transactions when issued STOP SLAVE. That is, it
allowed the current group of transactions to run, and sometimes the
transactions which belong to the next group could be started and run
through commit after STOP SLAVE was issued too, if the last group
had started committing. This would lead to long periods to wait for
all waiting transactions to finish.
This patch updates a parallel replica to try and abort immediately
and roll-back any ongoing transactions. The exception to this is any
transactions which are non-transactional (e.g. those modifying
sequences or non-transactional tables), and any prior transactions,
will be run to completion.
The specifics are as follows:
1. A new stage was added to SHOW PROCESSLIST output for the SQL
Thread when it is waiting for a replica thread to either rollback or
finish its transaction before stopping. This stage presents as
“Waiting for worker thread to stop”
2. Worker threads which error or are killed no longer perform GCO
cleanup if there is a concurrently running prior transaction. This
is because a worker thread scheduled to run in a future GCO could be
killed and incorrectly perform cleanup of the active GCO.
3. Refined cases when the FL_TRANSACTIONAL flag is added to GTID
binlog events to disallow adding it to transactions which modify
both transactional and non-transactional engines when the binlogging
configuration allow the modifications to exist in the same event,
i.e. when using binlog_direct_non_trans_update == 0 and
binlog_format == statement.
4. A few existing MTR tests relied on the completion of certain
transactions after issuing STOP SLAVE, and were re-recorded
(potentially with added synchronizations) under the new rollback
behavior.
Reviewed By
===========
Andrei Elkin <andrei.elkin@mariadb.com>
The problem seems to be a deadlock between KILL command execution
and BF abort issued by an applier, where:
* KILL has locked victim's LOCK_thd_kill and LOCK_thd_data.
* Applier has innodb side global lock mutex and victim trx mutex.
* KILL is calling innobase_kill_query, and is blocked by innodb
global lock mutex.
* Applier is in wsrep_innobase_kill_one_trx and is blocked by
victim's LOCK_thd_kill.
The fix in this commit removes the TOI replication of KILL command
and makes KILL execution less intrusive operation. Aborting the
victim happens now by using awake_no_mutex() and ha_abort_transaction().
If the KILL happens when the transaction is committing, the
KILL operation is postponed to happen after the statement
has completed in order to avoid KILL to interrupt commit
processing.
Notable changes in this commit:
* wsrep client connections's error state may remain sticky after
client connection is closed. This error message will then pop
up for the next client session issuing first SQL statement.
This problem raised with test galera.galera_bf_kill.
The fix is to reset wsrep client error state, before a THD is
reused for next connetion.
* Release THD locks in wsrep_abort_transaction when locking
innodb mutexes. This guarantees same locking order as with applier
BF aborting.
* BF abort from MDL was changed to do BF abort on server/wsrep-lib
side first, and only then do the BF abort on InnoDB side. This
removes the need to call back from InnoDB for BF aborts which originate
from MDL and simplifies the locking.
* Removed wsrep_thd_set_wsrep_aborter() from service_wsrep.h.
The manipulation of the wsrep_aborter can be done solely on
server side. Moreover, it is now debug only variable and
could be excluded from optimized builds.
* Remove LOCK_thd_kill from wsrep_thd_LOCK/UNLOCK to allow more
fine grained locking for SR BF abort which may require locking
of victim LOCK_thd_kill. Added explicit call for
wsrep_thd_kill_LOCK/UNLOCK where appropriate.
* Wsrep-lib was updated to version which allows external
locking for BF abort calls.
Changes to MTR tests:
* Disable galera_bf_abort_group_commit. This test is going to
be removed (MDEV-30855).
* Record galera_gcache_recover_manytrx as result file was incomplete.
Trivial change.
* Make galera_create_table_as_select more deterministic:
Wait until CTAS execution has reached MDL wait for multi-master
conflict case. Expected error from multi-master conflict is
ER_QUERY_INTERRUPTED. This is because CTAS does not yet have open
wsrep transaction when it is waiting for MDL, query gets interrupted
instead of BF aborted. This should be addressed in separate task.
* A new test galera_kill_group_commit to verify correct behavior
when KILL is executed while the transaction is committing.
Co-authored-by: Seppo Jaakola <seppo.jaakola@iki.fi>
Co-authored-by: Jan Lindström <jan.lindstrom@galeracluster.com>
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
This is a backport from 10.5.
The problem seems to be a deadlock between KILL command execution
and BF abort issued by an applier, where:
* KILL has locked victim's LOCK_thd_kill and LOCK_thd_data.
* Applier has innodb side global lock mutex and victim trx mutex.
* KILL is calling innobase_kill_query, and is blocked by innodb
global lock mutex.
* Applier is in wsrep_innobase_kill_one_trx and is blocked by
victim's LOCK_thd_kill.
The fix in this commit removes the TOI replication of KILL command
and makes KILL execution less intrusive operation. Aborting the
victim happens now by using awake_no_mutex() and ha_abort_transaction().
If the KILL happens when the transaction is committing, the
KILL operation is postponed to happen after the statement
has completed in order to avoid KILL to interrupt commit
processing.
Notable changes in this commit:
* wsrep client connections's error state may remain sticky after
client connection is closed. This error message will then pop
up for the next client session issuing first SQL statement.
This problem raised with test galera.galera_bf_kill.
The fix is to reset wsrep client error state, before a THD is
reused for next connetion.
* Release THD locks in wsrep_abort_transaction when locking
innodb mutexes. This guarantees same locking order as with applier
BF aborting.
* BF abort from MDL was changed to do BF abort on server/wsrep-lib
side first, and only then do the BF abort on InnoDB side. This
removes the need to call back from InnoDB for BF aborts which originate
from MDL and simplifies the locking.
* Removed wsrep_thd_set_wsrep_aborter() from service_wsrep.h.
The manipulation of the wsrep_aborter can be done solely on
server side. Moreover, it is now debug only variable and
could be excluded from optimized builds.
* Remove LOCK_thd_kill from wsrep_thd_LOCK/UNLOCK to allow more
fine grained locking for SR BF abort which may require locking
of victim LOCK_thd_kill. Added explicit call for
wsrep_thd_kill_LOCK/UNLOCK where appropriate.
* Wsrep-lib was updated to version which allows external
locking for BF abort calls.
Changes to MTR tests:
* Disable galera_bf_abort_group_commit. This test is going to
be removed (MDEV-30855).
* Record galera_gcache_recover_manytrx as result file was incomplete.
Trivial change.
* Make galera_create_table_as_select more deterministic:
Wait until CTAS execution has reached MDL wait for multi-master
conflict case. Expected error from multi-master conflict is
ER_QUERY_INTERRUPTED. This is because CTAS does not yet have open
wsrep transaction when it is waiting for MDL, query gets interrupted
instead of BF aborted. This should be addressed in separate task.
* A new test galera_kill_group_commit to verify correct behavior
when KILL is executed while the transaction is committing.
Co-authored-by: Seppo Jaakola <seppo.jaakola@iki.fi>
Co-authored-by: Jan Lindström <jan.lindstrom@galeracluster.com>
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>