1
0
mirror of https://github.com/MariaDB/server.git synced 2025-11-09 11:41:36 +03:00
Commit Graph

4077 Commits

Author SHA1 Message Date
Sergei Golubchik
aed5928207 cleanup: extract transaction-related part of handlerton
into a separate transaction_participant structure

handlerton inherits it, so handlerton itself doesn't change.
but entities that only need to participate in a transaction,
like binlog or online alter log, use a transaction_participant
and no longer need to pretend to be a full-blown but invisible
storage engine which doesn't support create table.
2024-11-05 14:00:50 -08:00
Sergei Golubchik
44c6328cbb cleanup: thd->alloc<>() and thd->calloc<>()
create templates

  thd->alloc<X>(n) to use instead of (X*)thd->alloc(sizeof(X)*n)

and the same for thd->calloc(). By the default the type is char,
so old usage of thd->alloc(size) works too.
2024-11-05 14:00:48 -08:00
Sergei Golubchik
32e6f8ff2e cleanup: remove unconditional #ifdef's 2024-11-05 14:00:47 -08:00
Oleg Smirnov
a914087fab MDEV-35307 Unexpected error WARN_SORTING_ON_TRUNCATED_LENGTH or assertion failure in diagnostics area #2
When strict mode is enabled, all warnings during `INSERT` are
converted to errors regardless of their actual severity.
`WARN_SORTING_ON_TRUNCATED_LENGTH` is not considered severe enough
to be elevated to the ERROR level, and this commit fixes that
2024-11-05 14:52:20 +07:00
Oleksandr Byelkin
c770bce898 Merge branch '11.2' into 11.4 2024-10-30 15:11:17 +01:00
Oleksandr Byelkin
69d033d165 Merge branch '10.11' into 11.2 2024-10-29 16:42:46 +01:00
Oleksandr Byelkin
3d0fb15028 Merge branch '10.6' into 10.11 2024-10-29 15:24:38 +01:00
Oleksandr Byelkin
f00711bba2 Merge branch '10.5' into 10.6 2024-10-29 14:20:03 +01:00
Vlad Lesin
8c7786e7d5 MDEV-34690 lock_rec_unlock_unmodified() causes deadlock
lock_rec_unlock_unmodified() is executed either under lock_sys.wr_lock()
or under a combination of lock_sys.rd_lock() + record locks hash table
cell latch. It also requests page latch to check if locked records were
changed by the current transaction or not.

Usually InnoDB requests page latch to find the certain record on the
page, and then requests lock_sys and/or record lock hash cell latch to
request record lock. lock_rec_unlock_unmodified() requests the latches
in the opposite order, what causes deadlocks. One of the possible
scenario for the deadlock is the following:

thread 1 - lock_rec_unlock_unmodified() is invoked under locks hash table
           cell latch, the latch is acquired;
thread 2 - purge thread acquires page latch and tries to remove
           delete-marked record, it invokes lock_update_delete(), which
           requests locks hash table cell latch, held by thread 1;
thread 1 - requests page latch, held by thread 2.

To fix it we need to release lock_sys.latch and/or lock hash cell latch,
acquire page latch and re-acquire lock_sys related latches.

When lock_sys.latch and/or lock hash cell latch are released in
lock_release_on_prepare() and lock_release_on_prepare_try(), the page on
which the current lock is held, can be merged. In this case the bitmap
of the current lock must be cleared, and the new lock must be added to
the end of trx->lock.trx_locks list, or bitmap of already existing lock
must be changed.

The new field trx_lock_t::set_nth_bit_calls indicates if new locks
(bits in existing lock bitmaps or new lock objects) were created during
the period when lock_sys was released in trx->lock.trx_locks list
iteration loop in lock_release_on_prepare() or
lock_release_on_prepare_try(). And, if so, we traverse the list again.

The block can be freed during pages merging, what causes assertion
failure in buf_page_get_gen(), as btr_block_get() passes BUF_GET as page
get mode to it. That's why page_get_mode parameter was added to
btr_block_get() to pass BUF_GET_POSSIBLY_FREED from
lock_release_on_prepare() and lock_release_on_prepare_try() to
buf_page_get_gen().

As searching for id of trx, which modified secondary index record, is
quite expensive operation, restrict its usage for master. System variable
was added to remove the restriction for testing simplifying. The
variable exists only either for debug build or for build with
-DINNODB_ENABLE_XAP_UNLOCK_UNMODIFIED_FOR_PRIMARY option to increase the
probability of catching bugs for release build with RQG.

Note that the code, which does primary index lookup to find out what
transaction modified secondary index record, is necessary only when
there is no primary key and no unique secondary key on replica with row
based replication, because only in this case extra X locks on unmodified
records can be set during scan phase.

Reviewed by Marko Mäkelä.
2024-10-23 12:36:17 +03:00
Oleg Smirnov
fd87e01f38 MDEV-27277 Add a warning when max_sort_length is reached
During a query execution some sorting and grouping operations
on strings may be involved. System variable max_sort_length defines
the maximum number of bytes to use when comparing strings during
sorting/grouping. Thus, the comparable parts of strings may be less
than their actual size, so the results of the query may be not
sorted/grouped properly.
To indicate that some comparisons were done on a truncated lengths,
a new warning has been introduced with this commit.
2024-10-22 22:39:36 +07:00
Brandon Nesterenko
1ed30e08af MDEV-34122: Assertion `entry' failed in Active_tranx::assert_thd_is_waiter
If semi-sync is switched off then on while a transaction is
in-between binlogging and waiting for an ACK, the semi-sync state of
the transaction is removed, leading to a debug assertion that
indicates the transaction tried to wait, but cannot receive an ACK
signal. More specifically, when semi-sync is switched off, the
Active_tranx list is cleared (where a transaction adds an entry to
this list during binlogging), and each entry in this list saves the
thread which will wait for an ACK, and the thread has the COND
variable to signal to wake itself. So if the entry is lost, the
Ack_receiver thread won’t be able to find the thread to wake up when
an ACK comes in

The fix is to ensure that the entry exists before awaiting the ACK,
and if there is no entry, skip the wait. In debug builds, an
informative message is written explaining that the transaction is
skipping its wait. Additional debug-build only logic is added to
ensure that the cause of the missing entry is due to semi-sync being
turned off and on

Reviewed By:
============
Kristian Nielsen <knielsen@knielsen-hq.org>
2024-10-21 15:35:54 -06:00
Sergei Golubchik
3a1cf2c85b MDEV-34679 ER_BAD_FIELD uses non-localizable substrings 2024-10-17 21:37:37 +02:00
Monty
bddbef3573 MDEV-34533 asan error about stack overflow when writing record in Aria
The problem was that when using clang + asan, we do not get a correct value
for the thread stack as some local variables are not allocated at the
normal stack.

It looks like that for example clang 18.1.3, when compiling with
-O2 -fsanitize=addressan it puts local variables and things allocated by
alloca() in other areas than on the stack.

The following code shows the issue

Thread 6 "mariadbd" hit Breakpoint 3, do_handle_one_connection
    (connect=0x5080000027b8,
    put_in_cache=<optimized out>) at sql/sql_connect.cc:1399

THD *thd;
1399      thd->thread_stack= (char*) &thd;
(gdb) p &thd
(THD **) 0x7fffedee7060
(gdb) p $sp
(void *) 0x7fffef4e7bc0

The address of thd is 24M away from the stack pointer

(gdb) info reg
...
rsp            0x7fffef4e7bc0      0x7fffef4e7bc0
...
r13            0x7fffedee7060      140737185214560

r13 is pointing to the address of the thd. Probably some kind of
"local stack" used by the sanitizer

I have verified this with gdb on a recursive call that calls alloca()
in a loop. In this case all objects was stored in a local heap,
not on the stack.

To solve this issue in a portable way, I have added two functions:

my_get_stack_pointer() returns the address of the current stack pointer.
The code is using asm instructions for intel 32/64 bit, powerpc,
arm 32/64 bit and sparc 32/64 bit.
Supported compilers are gcc, clang and MSVC.
For MSVC 64 bit we are using _AddressOfReturnAddress()

As a fallback for other compilers/arch we use the address of a local
variable.

my_get_stack_bounds() that will return the address of the base stack
and stack size using pthread_attr_getstack() or NtCurrentTed() with
fallback to using the address of a local variable and user provided
stack size.

Server changes are:

- Moving setting of thread_stack to THD::store_globals() using
  my_get_stack_bounds().
- Removing setting of thd->thread_stack, except in functions that
  allocates a lot on the stack before calling store_globals().  When
  using estimates for stack start, we reduce stack_size with
  MY_STACK_SAFE_MARGIN (8192) to take into account the stack used
  before calling store_globals().

I also added a unittest, stack_allocation-t, to verify the new code.

Reviewed-by: Sergei Golubchik <serg@mariadb.org>
2024-10-16 17:24:46 +03:00
Sergei Golubchik
5ebda30ccc Revert "MDEV-35019 Provide a way to enable "rollback XA on disconnect" behavior we had before 10.5.2"
This reverts commit 8ae462a220.
2024-10-16 13:23:47 +02:00
Kristian Nielsen
8ae462a220 MDEV-35019 Provide a way to enable "rollback XA on disconnect" behavior we had before 10.5.2
Implement variable legacy_xa_rollback_at_disconnect to support
backwards compatibility for applications that rely on the pre-10.5
behavior for connection disconnect, which is to rollback the
transaction (in violation of the XA specification).

Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
2024-10-16 10:18:36 +02:00
Monty
6f6c1911dc MDEV-34251 Conditional jump or move depends on uninitialised value in ha_handler_stats::has_stats
Fixed by checking handler_stats if it's active instead of
thd->variables.log_slow_verbosity & LOG_SLOW_VERBOSITY_ENGINE.

Reviewed-by: Sergei Petrunia <sergey@mariadb.com>
2024-10-03 13:45:26 +03:00
Kristian Nielsen
db5d1cde45 MDEV-34857: Implement --slave-abort-blocking-timeout
If a slave replicating an event has waited for more than
@@slave_abort_blocking_timeout for a conflicting metadata lock held by a
non-replication thread, the blocking query is killed to allow replication to
proceed and not be blocked indefinitely by a user query.

Reviewed-by: Monty <monty@mariadb.org>
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
2024-09-04 11:44:14 +02:00
Oleksandr Byelkin
492a7c2430 Merge branch '11.5' into 11.6 2024-08-21 15:13:47 +02:00
Oleksandr Byelkin
342fa29615 Merge branch '11.4' into 11.5 2024-08-21 11:52:54 +02:00
Oleksandr Byelkin
eb70e0d6e2 Merge branch '11.2' into 11.4 2024-08-21 09:30:54 +02:00
Oleksandr Byelkin
6197e6abc4 Merge branch '10.11' into 11.2 2024-08-21 07:58:46 +02:00
Marko Mäkelä
62bfcfd8b2 Merge 10.6 into 10.11 2024-08-14 11:36:52 +03:00
Marko Mäkelä
757c368139 Merge 10.5 into 10.6 2024-08-14 10:56:11 +03:00
Jan Lindström
cd8b8bb964 MDEV-34594 : Assertion `client_state.transaction().active()' failed in
int wsrep_thd_append_key(THD*, const wsrep_key*, int, Wsrep_service_key_type)

CREATE TABLE [SELECT|REPLACE SELECT] is CTAS and idea was that
we force ROW format. However, it was not correctly enforced
and keys were appended before wsrep transaction was started.

At THD::decide_logging_format we should force used stmt binlog
format to ROW in CTAS case and produce a warning if used
binlog format was not ROW.

At ha_innodb::update_row we should not append keys similarly
as in ha_innodb::write_row if sql_command is SQLCOM_CREATE_TABLE.
Improved error logging on ::write_row, ::update_row and ::delete_row
if wsrep key append fails.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-08-12 23:54:30 +02:00
Oleksandr Byelkin
d6444022ca Merge branch 'bb-11.5-release' into bb-11.6-release 2024-08-06 17:28:38 +02:00
Oleksandr Byelkin
ea75a0b600 Merge branch '11.4' into 11.5 2024-08-05 17:50:18 +02:00
Oleksandr Byelkin
1640c9b06e Merge branch '11.2' into 11.4 2024-08-04 17:27:48 +02:00
Oleksandr Byelkin
dced6cbdb6 Merge branch '11.1' into 11.2 2024-08-03 09:50:16 +02:00
Oleksandr Byelkin
80abd847da Merge branch '10.11' into 11.1 2024-08-03 09:32:42 +02:00
Oleksandr Byelkin
0e8fb977b0 Merge branch '10.6' into 10.11 2024-08-03 09:15:40 +02:00
Monty
4bf7c966b3 MDEV-34664: Add an option to fix InnoDB's doubling of secondary index cardinalities
(With trivial fixes by sergey@mariadb.com)
Added option fix_innodb_cardinality to optimizer_adjust_secondary_key_costs

Using fix_innodb_cardinality disables the 'divide by 2' of rec_per_key_int
in InnoDB that in effect doubles the Cardinality for secondary keys.
This has the biggest effect for indexes where a few rows has the same key
value. Using this may also cause table scans for very small tables (which
in some cases may be better than an index scan).

The user visible effect is that 'SHOW INDEX FROM table_name' will for
InnoDB show the true Cardinality (and not 2x the real value). It will
also allow the optimizer to chose a better index in some cases as the
division by 2 could have a bad effect for tables with 2-5 identical values
per key.

A few notes about using fix_innodb_cardinality:
- It has direct affect for SHOW INDEX FROM table_name. SHOW INDEX
  will also update the statistics in table share.
- The effect of fix_innodb_cardinality for query plans or EXPLAIN
  is only visible after first open of the table. This is why one must
  do a flush tables or use SHOW INDEX for the option to take effect.
- Using fix_innodb_cardinality can thus affect all user in their query
  plans if they are using the same tables.

Because of this, it is strongly recommended that one uses
optimizer_adjust_secondary_key_costs=fix_innodb_cardinality mainly
in configuration files to not cause issues for other users.
2024-07-29 16:40:53 +03:00
Oleksandr Byelkin
0fe39d368a Merge branch '10.6' into 10.11 2024-07-22 15:14:50 +02:00
Dave Gosselin
02e38e2ece MDEV-33971 NAME_CONST in WHERE clause replaced by inner item
Improve performance of queries like
  SELECT * FROM t1 WHERE field = NAME_CONST('a', 4);
by, in this example, replacing the WHERE clause with field = 4
in the case of ref access.

The rewrite is done during fix_fields and we disambiguate this
case from other cases of NAME_CONST by inspecting where we are
in parsing.  We rely on THD::where to accomplish this.  To
improve performance there, we change the type of THD::where to
be an enumeration, so we can avoid string comparisons during
Item_name_const::fix_fields.  Consequently, this patch also
changes all usages of THD::where to conform likewise.
2024-07-10 17:23:43 -04:00
Alexander Barkov
a2a5ba14a8 Merge remote-tracking branch 'origin/11.5' into 11.6 2024-07-10 19:18:52 +04:00
Alexander Barkov
4e805aed85 Merge remote-tracking branch 'origin/11.4' into 11.5 2024-07-10 12:17:09 +04:00
Alexander Barkov
5fb07d942b Merge remote-tracking branch 'origin/11.2' into 11.4 2024-07-09 21:45:37 +04:00
Alexander Barkov
8aad19ddfc Merge remote-tracking branch 'origin/11.1' into 11.2 2024-07-09 14:04:11 +04:00
Oleksandr Byelkin
2447dda2c0 Merge branch '10.11' into 11.1 2024-07-08 22:40:16 +02:00
Alexander Barkov
027f137741 Merge remote-tracking branch 'origin/11.5' into 11.6 2024-07-08 14:15:04 +04:00
Alexander Barkov
8f4ec79d09 Merge remote-tracking branch 'origin/11.4' into 11.5 2024-07-08 12:25:04 +04:00
Oleksandr Byelkin
034a175982 Merge branch '10.6' into 10.11 2024-07-04 11:52:07 +02:00
Denis Protivensky
cfbd57dfb7 MDEV-33064: Sync trx->wsrep state from THD on trx start
InnoDB transactions may be reused after committed:
- when taken from the transaction pool
- during a DDL operation execution

In this case wsrep flag on trx object is cleared, which may cause wrong
execution logic afterwards (wsrep-related hooks are not run).

Make trx->wsrep flag initialize from THD object only once on InnoDB transaction
start and don't change it throughout the transaction's lifetime.
The flag is reset at commit time as before.

Unconditionally set wsrep=OFF for THD objects that represent InnoDB background
threads.

Make Wsrep_schema::store_view() operate in its own transaction.

Fix streaming replication transactions' fragments rollback to not switch
THD->wsrep value during transaction's execution
(use THD->wsrep_ignore_table as a workaround).

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-07-01 13:07:39 +02:00
Vladislav Vaintroub
7c5fdc9b6a MDEV-33748 get rid of pthread_(get_/set_)specific, use thread_local
Apart from better performance when accessing thread local variables,
we'll get rid of things that depend on initialization/cleanup of
pthread_key_t variables.

Where appropriate, use compiler-dependent pre-C++11 thread-local
equivalents, where it makes sense, to avoid initialization check overhead
that non-static thread_local can suffer from.
2024-06-21 13:46:41 +02:00
Alexander Barkov
c4bf4ce948 Merge remote-tracking branch 'origin/11.2' into 11.4 2024-06-17 15:46:39 +04:00
Marko Mäkelä
a21e49cbcc Merge 11.1 into 11.2 2024-06-17 12:02:03 +03:00
Yuchen Pei
2d3e2c58b6 Merge branch '10.11' into 11.1 2024-05-31 10:54:31 +10:00
Marko Mäkelä
22ba7e4ff8 Merge 10.6 into 10.11 2024-05-30 16:04:00 +03:00
Marko Mäkelä
5ba542e9ee Merge 10.5 into 10.6 2024-05-30 14:27:07 +03:00
Sergei Golubchik
aebd16201f don't use session locale for the error log 2024-05-27 12:39:04 +02:00
Monty
381e9adb6c MDEV-34150 Assertion failure in Diagnostics_area::set_error_status upon binary logging hitting tmp space limit
- Moved writing to binlog_cache from close_thread_tables() to
  binlog_commit().
- In select_create() delete cached row events instead of flushing them
  to disk. This was done to avoid possible disk write error in this code.
2024-05-27 12:39:04 +02:00