Commit Graph

3067 Commits

Author SHA1 Message Date
Sergei Golubchik
f230c0ff6e Merge branch '10.11' into 11.4 2025-11-04 13:44:16 +01:00
Sergei Golubchik
2bf1d089cf MDEV-37906 Server crash or UBSAN errors in Item_func_nextval::update_table upon INSERT DELAYED
sequences, just as triggers or check constraints, disable DELAYED
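
A minimal repro sketch (the sequence and table names are illustrative,
not from the original report):

  CREATE SEQUENCE s1;
  CREATE TABLE t1 (a INT DEFAULT NEXTVAL(s1), b INT) ENGINE=MyISAM;
  -- previously could crash in Item_func_nextval::update_table;
  -- with the fix, DELAYED silently degrades to a plain INSERT
  INSERT DELAYED INTO t1 (b) VALUES (1);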
2025-10-28 10:49:42 +01:00
Marko Mäkelä
e8ef8c0055 Merge 10.11 into 11.4 2025-09-24 13:40:09 +03:00
Aleksey Midenkov
1e7d451e99 MDEV-37404 InnoDB: Failing assertion: node->pcur->rel_pos == BTR_PCUR_ON
Caused by an optimization done in 2e2b2a0469.

We cannot use lookup_handler in the default branch of
locate_dup_record(), as the InnoDB update depends on a positioned
record and the update is done in the table's main handler.

The patch reverts some non-pure (behavior-affecting) changes done by
2e2b2a0469 back to the original logic from 72429cad. There was no
long_unique_table condition to init the search on table->file, so we
got into the default branch with a long unique key and the
table->file search uninitialized.

Calling ha_rnd_init_with_error() on demand in the HA_DUPLICATE_POS
branch was part of the original logic as well.

More info: 2e2b2a0469 reverts 5e345281e3, but that seems to be OK as
the MDEV-3888 test case passes. mysql-5.6.13 has the original code
with HA_WHOLE_KEY as well.
2025-09-19 12:44:08 +03:00
Nikita Malyavin
0108664a8a Merge branch 10.11 into 11.4
# Conflicts:
#	sql/handler.h
#	sql/log_event.h
#	sql/log_event_server.cc
2025-09-02 15:58:39 +02:00
Nikita Malyavin
e56f6c585e MDEV-15990 handle timestamp-based collisions as well
Timestamp-versioned row deletion was exposed to a collision problem:
if the current timestamp hadn't changed, a sequence of row
delete+insert could get a duplicate-key error: the row delete would
find a conflicting history row and fail.

This is true for both REPLACE and DELETE statements; however, in
REPLACE the "optimized" path is usually taken, especially in the
tests. There, a single versioned row update is substituted for
delete+insert. In the end, both paths end up as ha_update_row +
ha_write_row.

The solution is to handle the history collision explicitly.

From the design perspective, the user shouldn't experience loss of
history rows unless there's a technical limitation.

By contrast, trxid-based changes should never generate history for
the same transaction, see MDEV-15427.

If two operations on the same row happen so quickly that they get the
same timestamp, the history row shouldn't be lost. We can still write
a history row, though it will have row_start == row_end.

We cannot store more than one such historical row, as that would
violate the unique constraint on row_end. So we have to physically
delete the row if such a history row already exists.

In this commit:
1. Improve TABLE::delete_row to handle the history collision: if the
   update results in a duplicate error, delete the row for real.
2. Use TABLE::delete_row in the non-optimized path of REPLACE, where
   the system-versioned case now belongs entirely (see the sketch
   below).
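
A hedged sketch of the collision (illustrative schema; assumes both
operations land on the same timestamp tick):

  CREATE TABLE t1 (a INT PRIMARY KEY, b INT) WITH SYSTEM VERSIONING;
  INSERT INTO t1 VALUES (1, 10);
  -- same-timestamp delete+insert: the history row is now written
  -- with row_start == row_end; if such a history row already
  -- exists, the old row is physically deleted instead of erroring
  DELETE FROM t1 WHERE a = 1;
  INSERT INTO t1 VALUES (1, 20);
  REPLACE INTO t1 VALUES (1, 30);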
2025-08-04 17:44:05 +02:00
Nikita Malyavin
6353a80ef5 MDEV-15990 REPLACE on a precise-versioned table returns ER_DUP_ENTRY
We had protection against it, by allowing versioned delete only if:
trx->id != table->vers_start_id()

For REPLACE this check fails: replace calls ha_delete_row(record[2]),
but table->vers_start_id() returns the value from record[0], which is
irrelevant.

The same problem hits Field::is_max, which may have checked the wrong
record.

Fix:
* Refactor Field::is_max to optionally accept a pointer as an argument.
* Refactor vers_start_id and vers_end_id to always accept a pointer
to the record. The difference from is_max is that is_max accepts a
pointer to the field data, rather than to the record.

Refactoring the val_int() method to accept the argument would take
too much effort, so instead the value in the record is fetched
directly, as is done in Field_longlong.
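
An illustrative repro, assuming trx-id (precise) versioning; the
schema is hypothetical:

  CREATE TABLE t1 (
    a INT PRIMARY KEY,
    row_start BIGINT UNSIGNED GENERATED ALWAYS AS ROW START,
    row_end BIGINT UNSIGNED GENERATED ALWAYS AS ROW END,
    PERIOD FOR SYSTEM_TIME (row_start, row_end)
  ) ENGINE=InnoDB WITH SYSTEM VERSIONING;
  INSERT INTO t1 (a) VALUES (1);
  -- the delete inside REPLACE compared vers_start_id() from the
  -- wrong record and returned ER_DUP_ENTRY before the fix
  REPLACE INTO t1 (a) VALUES (1);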
2025-08-04 17:44:05 +02:00
Nikita Malyavin
2e2b2a0469 MDEV-15990 Refactor write_record and fix idempotent replication
See also MDEV-30046.

Idempotent write_row works the same as REPLACE: if there is a
duplicate record in the table, it is deleted and re-inserted, with
the same update optimization.

The code in Rows_log_event::write_row was basically copy-pasted from
write_record.

What's done:
The REPLACE operation was unified across replication and SQL. It is
now represented as a Write_record class that holds the whole state
and allows re-using some resources between the row writes.

The Replace, IODKU and single-insert implementations are split across
different methods, resulting in much cleaner code.

The entry point is preserved as a single Write_record::write_record()
call. The implementation to call is chosen at construction time.

This allowed several optimizations to be done:
1. The table key list is not iterated for every row. We find the last
unique key in the order of checking once and preserve it across the
rows. See last_uniq_key().
2. ib_handler::referenced_by_foreign_key acquires a global lock, and
this call was done per row as well. Now all the table configuration
that allows optimized replace is folded into a single boolean field,
can_optimize. All the fields to check even fit in a single register
on a 64-bit platform.
3. DUP_REPLACE and DUP_UPDATE cases now have one less level of
indirection.
4. modified_non_trans_tables is checked and set only when it's really
needed.
5. Obsolete bitmap manipulations are removed.

Also:
* Unify the replace initialization step across implementations:
  add prepare_for_replace and finalize_replace.
* alloca() is removed in favor of mem_root allocation. This memory is
  reused across the rows.
* An rpl-related callback is added to the replace branch, meaning that
an extra check is made per replaced row even for the common case. It
can be avoided with templates if considered a problem.
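
For reference, the three statement forms now dispatched through the
single Write_record entry point (the table is illustrative):

  CREATE TABLE t1 (a INT PRIMARY KEY, b CHAR(1));
  -- single insert: errors on a duplicate key
  INSERT INTO t1 VALUES (1, 'a');
  -- IODKU: the DUP_UPDATE path
  INSERT INTO t1 VALUES (1, 'b')
    ON DUPLICATE KEY UPDATE b = VALUES(b);
  -- REPLACE: the DUP_REPLACE path, delete+re-insert or, when
  -- can_optimize allows, a single update
  REPLACE INTO t1 VALUES (1, 'c');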
2025-08-04 17:44:05 +02:00
Sergei Golubchik
c4ed889b74 Merge branch '10.11' into 11.4 2025-07-28 19:40:10 +02:00
Sergei Golubchik
053f9bcb5b Merge branch '10.6' into 10.11 2025-07-28 18:06:31 +02:00
Sergei Golubchik
c27d78beb5 MDEV-36870 Spurious unrelated permission error when selecting from table with default that uses nextval(sequence)
Lots of different cases: SELECT, SELECT DEFAULT(),
UPDATE t SET x=DEFAULT, prepared statements,
opening of a table for the I_S, prelocking (so TL_WRITE),
insert with a subquery (so SQLCOM_SELECT), etc.

Don't check NEXTVAL privileges in fix_fields() anymore; it cannot
possibly handle all the cases correctly. Add a special method
Item_func_nextval::check_access() for that and invoke it from

* fix_fields on explicit SELECT NEXTVAL()
  (but not if NEXTVAL() is used in a DEFAULT clause)
* when the DEFAULT bareword is used in, say, UPDATE t SET x=DEFAULT
  (but not if DEFAULT() itself is used in a DEFAULT clause)
* in CREATE TABLE
* in ALTER TABLE ALGORITHM=INPLACE (which doesn't take the CREATE
  TABLE path)
* on INSERT

Helpers:
* Virtual_column_info::check_access() to walk the item tree and invoke
  Item::check_access()
* TABLE::check_sequence_privileges() to iterate default expressions
  and invoke Virtual_column_info::check_access()

Also, single-table UPDATE in prepared statements now associates value
items with fields just as multi-update already did; this fixes the
case of PREPARE s "UPDATE t SET x=?"; EXECUTE s USING DEFAULT (see
the sketch below).
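
A sketch of the prepared-statement case that now works (the sequence
and table names are illustrative):

  CREATE SEQUENCE s1;
  CREATE TABLE t1 (x INT DEFAULT NEXTVAL(s1));
  PREPARE s FROM 'UPDATE t1 SET x=?';
  -- DEFAULT is now associated with field x, so the NEXTVAL
  -- privilege check (and the update itself) works as in multi-update
  EXECUTE s USING DEFAULT;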
2025-07-09 18:04:46 +02:00
Oleksandr Byelkin
a8d4642375 Merge branch '10.11' into 11.4 2025-04-26 10:53:02 +02:00
Oleksandr Byelkin
4d41ec081e Merge branch '10.6' into 10.11 2025-04-26 10:47:03 +02:00
Oleksandr Byelkin
19644f6821 Merge branch '10.5' into 10.6 2025-04-26 10:41:52 +02:00
Oleksandr Byelkin
4fc9dc84b0 MDEV-32086 (part 2) Server crash when inserting from derived table containing insert target table
Get rid of the need for materialization for the usual INSERT (cache
results in Item_cache* if needed):

- subqueries in VALUES do not see new records in the table we are
  inserting into
- subqueries in RETURNING are prohibited from using the table we are
  inserting into (an illustrative statement follows below)
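
An illustrative statement of the kind this fixes (the table is
hypothetical):

  CREATE TABLE t1 (a INT);
  INSERT INTO t1 VALUES (1), (2);
  -- the subquery reads the insert target; its result is cached in
  -- Item_cache*, so it does not see the rows this INSERT adds
  INSERT INTO t1 VALUES ((SELECT MAX(a) FROM t1));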
2025-04-25 15:10:36 +02:00
Oleksandr Byelkin
9b313d2de1 MDEV-32086 Server crash when inserting from derived table containing insert target table
Use the original solution for INSERT ... SELECT: buffering of the
select result.

Also fixes MDEV-36447 and MDEV-33139.
2025-04-25 12:58:24 +02:00
Thirunarayanan Balathandayuthapani
f388222d49 MDEV-36504 Memory leak after CREATE TABLE..SELECT
Problem:
========
- Since commit cc8eefb0dc (MDEV-33087), InnoDB uses the bulk insert
operation for ALTER TABLE..ALGORITHM=COPY and CREATE TABLE..SELECT as
well. InnoDB fails to clear the bulk buffer when it encounters an
error during CREATE..SELECT. The problem is that during transaction
cleanup, InnoDB fails to identify the bulk insert as belonging to a
DDL operation.

Fix:
====
- Represent bulk_insert in trx by 2 bits. By doing that, InnoDB
can distinguish between TRX_DML_BULK and TRX_DDL_BULK. During DDL,
set the transaction's bulk insert value to TRX_DDL_BULK.

- Introduce a parameter HA_EXTRA_ABORT_ALTER_COPY which rolls back
only TRX_DDL_BULK transactions.

- bulk_insert_apply() for a TRX_DDL_BULK transaction happens only
during the HA_EXTRA_END_ALTER_COPY extra() call.
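
A hedged sketch of the failing scenario (hypothetical table; the
duplicate key aborts the statement mid-insert, which previously
leaked the bulk buffer):

  CREATE TABLE t1 (a INT PRIMARY KEY) ENGINE=InnoDB
    SELECT 1 AS a UNION ALL SELECT 1;
  -- fails with ER_DUP_ENTRY; rollback must now find and free the
  -- TRX_DDL_BULK state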
2025-04-17 12:04:09 +05:30
Andrei Elkin
1b4efbeb8c MDEV-35207 ignored error at binlogging by CREATE-TABLE-SELECT leads to assert
MDEV-35499 Errored-out CREATE-or-REPLACE-SELECT does not log DROP table into binlog
MDEV-35502 Failed at ROW-format binlogging CREATE-TABLE-SELECT should
           not generate Incident event

When a CREATE TABLE .. SELECT errors while inserting data, a user
would expect that all changes are rolled back
and the table would not exist after executing the query.

However, CREATE-TABLE-SELECT can face an error near the end of its
execution, in select_create::send_eof(), and that error was never
checked, which led to various asserts inside the binlogging path that
should not be reached at all.
Specifically, when binlog_commit() of the ha_commit_one_phase() that
CREATE-TABLE-SELECT employs errored out because of a limited cache
size (binlog_commit may try writing to a transactional cache), the
cache was not flushed to the binlog. The missed error check allowed
further execution down to trans_commit_implicit(), in whose stack

  DBUG_ASSERT(!(entry->using_trx_cache && !mngr->trx_cache.empty() &&
                mngr->get_binlog_cache_log(TRUE)->error));

fired. In a non-debug build the table remained created/populated
inconsistently with the binlog.

The fix installs error checking in select_create::send_eof(). That
prevents any further execution when ha_commit_one_phase() fails for
any reason (typically due to binlog_commit()).

This commit also covers CREATE-or-REPLACE-SELECT, which additionally
had a specific issue in that DROP TABLE was not logged to the binary
log (MDEV-35499). See the changes in
select_create::abort_result_set().

The current commit also corrects unnecessary Incident event logging
when CREATE-TABLE-SELECT encounters a binlogging issue (MDEV-35502).
The Incident was actually only harmful in this case, as the table was
never going to be created, and therefore never replicated.
In "normal" cases, when the SELECT phase errors due to binlogging, an
internal incident flag gets reset inside
select_create::abort_result_set().

A hunk in select_insert::prepare_eof() addresses a specific kind of
this issue that deals with an incorrect computation of the binlog
cache type. Because of that, in the OLD version execution was allowed
to proceed along ha_commit_trans()..binlog_commit() while a Pending
event was not flushed to the transactional cache. That might lead to
an unnecessary binlogged Incident despite the
select_create::abort_result_set() measures. However, now with the
corrected cache type, any binlogging error while flushing the Pending
event is covered as in the normal case.

NOTE: the commit contains a few tests overlapping with the
yet-unfixed MDEV-36027.

Thanks to Brandon Nesterenko and Kristian Nielsen for thorough review,
and Kristian additionally for ideas to simplify the patch and some
code contribution.
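
An illustrative way to hit the binlog-cache error path (a
hypothetical setup: ROW binlog format, a transactional target table,
and an arbitrary small cache size):

  SET GLOBAL max_binlog_cache_size = 4096;
  -- fails at binlog commit; with the fix the error is propagated,
  -- the table does not survive, and no Incident event is logged
  CREATE OR REPLACE TABLE t2 ENGINE=InnoDB SELECT * FROM t1;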
2025-04-03 19:00:02 +03:00
Marko Mäkelä
f5bd250f5b Merge 10.11 into 11.4 2025-03-28 13:55:21 +02:00
Marko Mäkelä
ab0f2a00b6 Merge 10.6 into 10.11 2025-03-27 08:01:47 +02:00
Julius Goryavsky
e3d7d5ca26 Merge branch '10.5' into '10.6' 2025-02-27 04:02:33 +01:00
Sergei Golubchik
cd7513995d MDEV-36026 Problem with INSERT SELECT on NOT NULL columns while having BEFORE UPDATE trigger
MDEV-8605 and MDEV-19761 didn't handle INSERT (columns) SELECT

followup for a69da0c31e
2025-02-10 15:55:00 +01:00
Sergei Golubchik
7d657fda64 Merge branch '10.11' into 11.4 2025-01-30 12:01:11 +01:00
Sergei Golubchik
e69f8cae1a Merge branch '10.6' into 10.11 2025-01-30 11:55:13 +01:00
Sergei Golubchik
066e8d6aea Merge branch '10.5' into 10.6 2025-01-29 11:17:38 +01:00
Sergei Golubchik
9a0ac0cdf7 MDEV-35911 Assertion `marked_for_write_or_computed()' failed in bool Field_new_decimal::store_value(const my_decimal*, int*)
Disable the assert.

Also, use the same check in check_that_all_fields_are_given_values()
as is used in not_null_fields_have_null_values(), to avoid issuing
the same warning twice.
2025-01-28 19:29:54 +01:00
Jan Lindström
43c36b3c88 MDEV-35852 : ASAN heap-use-after-free in WSREP_DEBUG after INSERT DELAYED
The problem was that in the case of INSERT DELAYED, thd->query() was
freed before we called trans_rollback, where WSREP_DEBUG could access
thd->query() via wsrep_thd_query().

The fix is to reset thd->query() to NULL in the delayed_insert
destructor after it is freed. There is already a null guard in
wsrep_thd_query().

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2025-01-20 12:19:31 +01:00
Marko Mäkelä
98dbe3bfaf Merge 10.5 into 10.6 2025-01-20 09:57:37 +02:00
Sergei Golubchik
a69da0c31e MDEV-19761 Before Trigger not processed for Not Null Columns if no explicit value and no DEFAULT
It's incorrect to zero out table->triggers->extra_null_bitmap before
a statement, because if an insert uses an explicit field list and
omits a field that has no default value, the field should get NULL
implicitly. So extra_null_bitmap should have 1s for all fields that
have no defaults.

* create extra_null_bitmap_init and initialize it as above
* copy extra_null_bitmap_init to extra_null_bitmap for inserts
* still zero out extra_null_bitmap for updates/deletes where
  all fields definitely have a value
* make not_null_fields_have_null_values() send
  ER_NO_DEFAULT_FOR_FIELD for fields with no default and no value;
  otherwise the creation of a trigger with an empty body would change
  the error message (see the sketch below)
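
A hedged sketch (hypothetical table and trigger) of the intended
behavior:

  CREATE TABLE t1 (a INT NOT NULL, b INT);
  CREATE TRIGGER tr BEFORE INSERT ON t1 FOR EACH ROW
    SET NEW.a = COALESCE(NEW.a, 42);
  -- a is omitted and has no default: NEW.a reads as NULL in the
  -- trigger, which then supplies the value; with an empty trigger
  -- body, ER_NO_DEFAULT_FOR_FIELD is still produced
  INSERT INTO t1 (b) VALUES (1);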
2025-01-17 23:42:56 +01:00
Aleksey Midenkov
78c192644c MDEV-35343 unexpected replace behaviour when long unique index on
system versioned table

For a versioned table, REPLACE first tries to insert a row; if it
gets a duplicate key error and optimization is possible, it does
UPDATE + INSERT history. If optimization is not possible, it takes
the normal branch for UPDATE to history and repeats the INSERT cycle.

The failure was in the normal branch, when we tried UPDATE to history
but such history already existed from previous cycles. There are no
such failures in the optimized branch, because
vers_insert_history_row() already ignores duplicates.

The fix ignores duplicate errors for UPDATE to history and does DELETE
instead.
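
An illustrative repro (hypothetical schema; TEXT UNIQUE creates a
long unique hash index):

  CREATE TABLE t1 (a INT PRIMARY KEY, b TEXT UNIQUE)
    WITH SYSTEM VERSIONING;
  INSERT INTO t1 VALUES (1, 'x');
  -- the second REPLACE used to fail in the normal branch when an
  -- identical history row already existed; now it deletes instead
  REPLACE INTO t1 VALUES (1, 'x');
  REPLACE INTO t1 VALUES (1, 'x');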
2025-01-14 18:56:13 +03:00
Aleksey Midenkov
92383f8db1 MDEV-26891 Segfault in Field::register_field_in_read_map upon INSERT
DELAYED with virtual columns

The segfault was caused by two different copies of the same Field
instance in a prepared delayed insert. One was made by
Delayed_insert::get_local_table() (see make_new_field()). That copy
went through parse_vcol_defs() and received a new vcol_info->expr.

Another one was made by copy_keys_from_share() by this code:
Another one was made by copy_keys_from_share() by this code:

        /*
          We are using only a prefix of the column as a key:
          Create a new field for the key part that matches the index
        */
        field= key_part->field=field->make_new_field(root, outparam, 0);
        field->field_length= key_part->length;

So key_part and table got different objects for the same field, and
the crash happened because key_part->field->vcol_info->expr was NULL.

The fix adds update_keypart_vcol_info() to update vcol_info->expr in
key_part->field.

Cleanup: memdup_vcol() is now a static inline function instead of a
macro, and checks for OOM.
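
A hedged sketch of the setup (hypothetical schema; a prefix key over
a computed column makes copy_keys_from_share() clone the field):

  CREATE TABLE t1 (
    a VARCHAR(100),
    v VARCHAR(100) AS (a) PERSISTENT,
    KEY (v(10))
  ) ENGINE=MyISAM;
  -- the cloned key_part field had vcol_info->expr == NULL and
  -- crashed in Field::register_field_in_read_map before the fix
  INSERT DELAYED INTO t1 (a) VALUES ('hello');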
2025-01-14 18:56:13 +03:00
Oleksandr Byelkin
0d35fe6e57 MDEV-35326: Memory Leak in init_io_cache_ext upon SHUTDOWN
The problems were that:
1) resources were freed asymmetrically: on normal execution in
 send_eof, but in case of error in the destructor;
2) the destructor was not called for result objects in the case of an
SP (so if the last SP execution ended with an error, resources were
not freed on re-init before execution (cleanup() is called before the
next execution), and the destructor was also not called due to the
lack of a delete call for the object).

The result cleanup() was renamed to reset_for_next_ps_execution() to
better reflect its function.

All result methods were revised and the freeing of resources made
symmetric.

The destructor of the result object is now called for SPs.

Added previously-skipped invalidation in case of an error in insert.

Removed the misleading naming of reset(thd) (it could be confused
with reset()).
2025-01-13 10:04:27 +01:00
Marko Mäkelä
17f01186f5 Merge 10.11 into 11.4 2025-01-09 07:58:08 +02:00
Marko Mäkelä
3f914afd3a Merge 10.6 into 10.11 2025-01-02 12:39:56 +02:00
Yuchen Pei
671f80c738 Merge branch '10.5' into 10.6 2024-12-17 11:06:09 +11:00
Oleg Smirnov
d98ac8511e MDEV-26247 MariaDB Server SEGV on INSERT .. SELECT
This problem occurred for statements like `INSERT INTO t1 SELECT 1`,
which do not have tables in the SELECT part. In such scenarios
SELECT_LEX::insert_tables was not properly set at `setup_tables()`,
and this led to either incorrect execution or a crash.
Reviewer: Oleksandr Byelkin <sanja@mariadb.com>
2024-12-14 14:04:21 +07:00
Oleg Smirnov
e640373389 Revert "MDEV-26427 MariaDB Server SEGV on INSERT .. SELECT"
This reverts commit 49e14000ee,
as it introduced the regression MDEV-29935 and has to be reconsidered
in general.
2024-12-14 13:08:17 +07:00
Dmitry Shulga
54c1031b74 MDEV-34958: after Trigger doesn't work correctly with bulk insert
This bug has the same nature as the issues
  MDEV-34718: Trigger doesn't work correctly with bulk update
  MDEV-24411: Trigger doesn't work correctly with bulk insert

To fix the issue for all use cases, the temporary resetting of
thd->bulk_param to nullptr before invoking triggers, and the
restoring of its original value when trigger execution finishes, is
moved to the method
  Table_triggers_list::process_triggers
which is ultimately invoked for any kind of trigger.
2024-12-13 16:19:39 +07:00
Oleksandr Byelkin
c770bce898 Merge branch '11.2' into 11.4 2024-10-30 15:11:17 +01:00
Oleksandr Byelkin
3d0fb15028 Merge branch '10.6' into 10.11 2024-10-29 15:24:38 +01:00
Oleksandr Byelkin
f00711bba2 Merge branch '10.5' into 10.6 2024-10-29 14:20:03 +01:00
Sergei Golubchik
3a1cf2c85b MDEV-34679 ER_BAD_FIELD uses non-localizable substrings 2024-10-17 21:37:37 +02:00
Monty
bddbef3573 MDEV-34533 asan error about stack overflow when writing record in Aria
The problem was that when using clang + asan, we do not get a correct
value for the thread stack, as some local variables are not allocated
on the normal stack.

It looks like, for example, that clang 18.1.3, when compiling with
-O2 -fsanitize=address, puts local variables and things allocated by
alloca() in areas other than the stack.

The following gdb session shows the issue:

Thread 6 "mariadbd" hit Breakpoint 3, do_handle_one_connection
    (connect=0x5080000027b8,
    put_in_cache=<optimized out>) at sql/sql_connect.cc:1399

THD *thd;
1399      thd->thread_stack= (char*) &thd;
(gdb) p &thd
(THD **) 0x7fffedee7060
(gdb) p $sp
(void *) 0x7fffef4e7bc0

The address of thd is 24M away from the stack pointer.

(gdb) info reg
...
rsp            0x7fffef4e7bc0      0x7fffef4e7bc0
...
r13            0x7fffedee7060      140737185214560

r13 is pointing to the address of thd, probably some kind of
"local stack" used by the sanitizer.

I have verified this with gdb on a recursive call that calls alloca()
in a loop. In this case all objects were stored in a local heap,
not on the stack.

To solve this issue in a portable way, I have added two functions:

my_get_stack_pointer() returns the address of the current stack
pointer. The code uses asm instructions for Intel 32/64-bit, PowerPC,
ARM 32/64-bit and SPARC 32/64-bit.
Supported compilers are gcc, clang and MSVC.
For 64-bit MSVC we are using _AddressOfReturnAddress().

As a fallback for other compilers/architectures we use the address of
a local variable.

my_get_stack_bounds() returns the address of the stack base and the
stack size using pthread_attr_getstack() or NtCurrentTeb(), with a
fallback to using the address of a local variable and a user-provided
stack size.

Server changes are:

- Moving the setting of thread_stack to THD::store_globals(), using
  my_get_stack_bounds().
- Removing the setting of thd->thread_stack, except in functions that
  allocate a lot on the stack before calling store_globals().  When
  using estimates for the stack start, we reduce stack_size by
  MY_STACK_SAFE_MARGIN (8192) to take into account the stack used
  before calling store_globals().

I also added a unittest, stack_allocation-t, to verify the new code.

Reviewed-by: Sergei Golubchik <serg@mariadb.org>
2024-10-16 17:24:46 +03:00
Oleksandr Byelkin
1d0e94c55f Merge branch '10.5' into 10.6 2024-10-09 08:38:48 +02:00
Sergei Golubchik
3ea71a2c8e MDEV-16699 heap-use-after-free in group_concat with compressed or GIS columns
Field_blob::store() has special code for the GROUP_CONCAT temporary
table (to store blob values in Blob_mem_storage - this prevents them
from being freed/overwritten when the next row is read).

Field_geom and Field_blob_compressed inherit from Field_blob, but
they have their own ::store() methods without this special
Blob_mem_storage support.

Considering that a non-grouping CONCAT() of such fields converts them
to plain BLOB, let's do the same for GROUP_CONCAT. To do it,
Item_func_group_concat::setup signals that it's creating a temporary
table for GROUP_CONCAT, and the Field_blob::make_new_field() override
creates a base Field_blob when under GROUP_CONCAT.
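
A minimal illustration (hypothetical table) of the two affected field
types:

  CREATE TABLE t1 (a INT, b TEXT COMPRESSED, g POINT);
  INSERT INTO t1 VALUES
    (1, REPEAT('x', 100), POINT(1, 1)),
    (1, REPEAT('y', 100), POINT(2, 2));
  -- both columns are now converted to plain BLOB fields in the
  -- GROUP_CONCAT temporary table, avoiding the use-after-free
  SELECT GROUP_CONCAT(b), GROUP_CONCAT(g) FROM t1 GROUP BY a;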
2024-10-08 15:31:02 +02:00
Aleksey Midenkov
706a8bcb5b MDEV-33470 Unique hash index is broken on DML for system-versioned table
The hash index is a vcol-based wrapper (MDEV-371). row_end is added
to the unique index, so when row_end is updated, the unique hash
index must be recalculated via vcol_update_fields(). DELETE did not
update virtual fields, so DELETE HISTORY was getting a wrong hash
value.

The fix calls update_virtual_fields() in vers_update_end(), so in
every case where row_end is updated, the virtual fields are updated
as well.
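
An illustrative DML sequence (hypothetical schema; the long unique on
b becomes a hash over (b, row_end)):

  CREATE TABLE t1 (a INT, b TEXT UNIQUE) WITH SYSTEM VERSIONING;
  INSERT INTO t1 VALUES (1, 'x');
  -- the delete updates row_end and now recalculates the hash too
  DELETE FROM t1;
  -- previously looked up history rows by a stale hash value
  DELETE HISTORY FROM t1;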
2024-10-08 13:08:10 +03:00
Aleksey Midenkov
d37bb140b1 MDEV-31297 Create table as select on system versioned tables do not
work consistently on replication

Row-based replication does not execute CREATE .. SELECT but instead
CREATE TABLE. CREATE .. SELECT creates the implicit system fields in
an unusual place: in between the declared fields and the select
fields. That was done because the select_field_pos logic requires the
select fields to go last in create_list.

So, CREATE .. SELECT on the master and CREATE TABLE on the slave
create system fields at different positions, and replication gets a
field mismatch.

To fix this, we've changed CREATE .. SELECT to create the implicit
system fields in the usual place at the end, and updated
select_field_pos to handle this case.
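
A sketch of the statement shape (hypothetical tables): with the fix,
the implicit row_start/row_end go after the select fields, matching
what the slave-side CREATE TABLE produces:

  CREATE TABLE t2 (id INT AUTO_INCREMENT PRIMARY KEY)
    WITH SYSTEM VERSIONING
    SELECT a, b FROM t1;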
2024-10-08 13:08:10 +03:00
Marko Mäkelä
7ea9e1358f Merge 11.2 into 11.4 2024-09-18 08:07:22 +03:00
Julius Goryavsky
f176248d4b Merge branch '10.6' into '10.11' 2024-09-17 06:23:10 +02:00
Julius Goryavsky
80fff4c6b1 Merge branch '10.5' into '10.6' 2024-09-16 16:39:59 +02:00