1
0
mirror of https://github.com/MariaDB/server.git synced 2025-11-09 11:41:36 +03:00
Commit Graph

3043 Commits

Author SHA1 Message Date
Monty
882f6fa3aa Fixed typos
- Removed duplicate words, like "the the" and "to to"
- Removed duplicate lines (one double sort line found in mysql.cc)
- Fixed some typos found while searching for duplicate words.

Command used to find duplicate words:
egrep -rI "\s([a-zA-Z]+)\s+\1\s" | grep -v param

Thanks to Artjoms Rimdjonoks for the command and pointing out the
spelling errors.
2025-09-04 18:08:39 +03:00
Nikita Malyavin
6353a80ef5 MDEV-15990 REPLACE on a precise-versioned table returns ER_DUP_ENTRY
We had a protection against it, by allowing versioned delete if:
trx->id != table->vers_start_id()

For replace this check fails: replace calls ha_delete_row(record[2]), but
table->vers_start_id() returns the value from record[0], which is irrelevant.

The same problem hits Field::is_max, which may have checked the wrong record.

Fix:
* Refactor Field::is_max to optionally accept a pointer as an argument.
* Refactor vers_start_id and vers_end_id to always accept a pointer to the
record. there is a difference with is_max is that is_max accepts the pointer to
the
field data, rather than to the record.

Method val_int() would be too effortful to refactor to accept the argument, so
instead the value in record is fetched directly, like it is done in
Field_longlong.
2025-08-04 17:44:05 +02:00
Nikita Malyavin
2e2b2a0469 MDEV-15990 Refactor write_record and fix idempotent replication
See also MDEV-30046.

Idempotent write_row works same as REPLACE: if there is a duplicating
record in the table, then it will be deleted and re-inserted, with the
same update optimization.

The code in Rows:log_event::write_row was basically copy-pasted from
write_record.

What's done:
REPLACE operation was unified across replication and sql. It is now
representred as a Write_record class, that holds the whole state, and allows
re-using some resources in between the row writes.

Replace, IODKU and single insert implementations are split across different
methods, reluting in a much cleaner code.

The entry point is preserved as a single Write_record::write_record() call.
The implementation to call is chosen on the constructor stage.

This allowed several optimizations to be done:
1. The table key list is not iterated for every row. We find last unique key in
the order of checking once and preserve it across the rows. See last_uniq_key().
2. ib_handler::referenced_by_foreign_key acquires a global lock. This call was
done per row as well. Not all the table config that allows optimized replace is
folded into a single boolean field can_optimize. All the fields to check are
even stored in a single register on a 64-bit platform.
3. DUP_REPLACE and DUP_UPDATE cases now have one less level of indirection
4. modified_non_trans_tables is checked and set only when it's really needed.
5. Obsolete bitmap manipulations are removed.

Also:
* Unify replace initialization step across implementations:
  add prepare_for_replace and finalize_replace
* alloca is removed in favor of mem_root allocation. This memory is reused
  across the rows.
* An rpl-related callback is added to the replace branch, meaning that an extra
check is made per row replace even for the common case. It can be avoided with
templates if considered a problem.
2025-08-04 17:44:05 +02:00
Sergei Golubchik
053f9bcb5b Merge branch '10.6' into 10.11 2025-07-28 18:06:31 +02:00
Sergei Golubchik
633417308f MDEV-37312 ASAN errors or assertion failure upon attempt to UPDATE FOR PORTION violating long unique under READ COMMITTED
in case of a long unique conflict ha_write_row() used delete_row()
to remove the newly inserted row, and it used rnd_pos() to position
the cursor before deletion. This rnd_pos() was freeing and reallocating
blobs in record[0]. So when the code for FOR PORTION OF did
  store_record(record[2]);
  ha_write_row()
  restore_record(record[2]);
it ended up with blob pointers to a freed memory.

Let's use lookup_handler for deletion.
2025-07-26 10:54:28 +02:00
Sergei Golubchik
fb2f324f85 MDEV-37310 Non-debug failing assertion node->pcur->rel_pos == BTR_PCUR_ON upon violating long unique under READ-COMMITTED
let's disallow UPDATE IGNORE in READ COMMITTED with the table
has UNIQUE constraint that is USING HASH or is WITHOUT OVERLAPS

This rarely-used combination should not block a release, with be
fixed in MDEV-37233
2025-07-25 12:28:30 +02:00
Sergei Golubchik
5622f3f5e8 MDEV-37268 HA_ERR_KEY_NOT_FOUND upon UPDATE or partitioned table with unique hash under READ-COMMITTED
followup for 9703c90712 (MDEV-37199 UNIQUE KEY USING HASH accepting duplicate records)

ha_partition can return HA_ERR_KEY_NOT_FOUND even in the middle
of the index scan
2025-07-21 08:59:08 +02:00
Sergei Golubchik
2b11a0e991 MDEV-37268 assert upon UPDATE or partitioned table with unique hash under READ-COMMITTED
followup for 9703c90712 (MDEV-37199 UNIQUE KEY USING HASH accepting duplicate records)

maintain the invariant, that handler::ha_update_row()
is always invoked as handler::ha_update_row(record[0], record[1])
2025-07-21 08:59:08 +02:00
Sergei Golubchik
774039e410 MDEV-37268 ER_DUP_ENTRY upon REPLACE into table with unique hash under READ-COMMITTED
followup for 9703c90712 (MDEV-37199 UNIQUE KEY USING HASH accepting duplicate records)

when looking for long unique duplicates and the new row is already
inserted, we cannot simply "skip one conflict" we must skip exactly
the new row and find a conflict which isn't a new row - otherwise
table->file->dup_ref can be set incorrectly and REPLACE won't work.
2025-07-21 08:59:08 +02:00
Sergei Golubchik
3a2e1f87a1 MDEV-37268 ER_NOT_KEYFILE or assertion failure upon REPLACE into table with unique hash under READ-COMMITTED
followup for 9703c90712 (MDEV-37199 UNIQUE KEY USING HASH accepting duplicate records)

don't forget to rnd_init()/rnd_end() around rnd_pos()
2025-07-21 08:59:08 +02:00
Sergei Golubchik
cb7978a12d MDEV-36720 Possible memory leak on updating table with index without overlaps
when closing a lookup_handler, don't forget to close it in PSI too
2025-07-17 09:18:17 +02:00
Sergei Golubchik
9703c90712 MDEV-37199 UNIQUE KEY USING HASH accepting duplicate records
Server-level UNIQUE constraints (namely, WITHOUT OVERLAPS and USING HASH)
only worked with InnoDB in REPEATABLE READ isolation mode, when the
constraint was checked first and then the row was inserted or updated.
Gap locks prevented race conditions when a concurrent connection
could've also checked the constraint and inserted/updated a row
at the same time.

In READ COMMITTED there are no gap locks. To avoid race conditions,
we now check the constraint *after* the row operation. This is
enabled by the HA_CHECK_UNIQUE_AFTER_WRITE table flag that InnoDB
sets in the READ COMMITTED transactions.

Checking the constraint after the row operation is more complex.
First, the constraint will see the current (inserted/updated) row,
and needs to skip it. Second, IGNORE operations become tricky,
as we need to revert the insert/update and continue statement execution.

write_row() (INSERT IGNORE) is reverted with delete_row(). Conveniently
it deletes the current row, that is, the last inserted row.

update_row(a,b) (UPDATE IGNORE) is reverted with a reversed update,
update_row(b,a). Conveniently, it updates the current row too.

Except in InnoDB when the PK is updated - in this case InnoDB internally
performs delete+insert, but does not move the cursor, so the "current"
row is the deleted one and the reverse update doesn't work.
This combination now throws an "unsupported" error and will
be fixed in MDEV-37233
2025-07-16 13:02:44 +02:00
Sergei Golubchik
2746c19a9c MDEV-37203 UBSAN: applying zero offset to null pointer in strings/ctype-uca.inl | my_uca_strnncollsp_onelevel_utf8mb4 | handler::check_duplicate_long_entries_update 2025-07-16 13:02:44 +02:00
Sergei Golubchik
d8c2362912 cleanup: long unique checks
consolidate and unify long unique checks.

fix a bug where an update of a long unique blob
was ignoring the prefix length
2025-07-16 13:02:44 +02:00
Sergei Golubchik
c27d78beb5 MDEV-36870 Spurious unrelated permission error when selecting from table with default that uses nextval(sequence)
Lots of different cases, SELECT, SELECT DEFAULT(),
UPDATE t SET x=DEFAULT, prepares statements,
opening of a table for the I_S, prelocking (so TL_WRITE),
insert with subquery (so SQLCOM_SELECT), etc.

Don't check NEXTVAL privileges in fix_fields() anymore, it cannot
possibly handle all the cases correctly. Make a special method
Item_func_nextval::check_access() for that and invoke it from

* fix_fields on explicit SELECT NEXTVAL()
  (but not if NEXTVAL() is used in a DEFAULT clause)
* when DEFAULT bareword in used in, say, UPDATE t SET x=DEFAULT
  (but not if DEFAULT() itself is used in a DEFAULT clause)
* in CREATE TABLE
* in ALTER TABLE ALGORITHM=INPLACE (that doesn't go CREATE TABLE path)
* on INSERT

helpers
* Virtual_column_info::check_access() to walk the item tree and invoke
  Item::check_access()
* TABLE::check_sequence_privileges() to iterate default expressions
  and invoke Virtual_column_info::check_access()

also, single-table UPDATE in prepared statements now associates
value items with fields just as multi-update already did, fixes the
case of PREPARE s "UPDATE t SET x=?"; EXECUTE s USING DEFAULT.
2025-07-09 18:04:46 +02:00
Monty
2d5dfc47a9 Define error message for HA_ERR_INCOMPATIBLE_DEFINITION 2025-06-30 18:34:47 +03:00
Oleksandr Byelkin
28d6530571 Merge branch '10.6' into 10.11 2025-06-04 14:09:23 +02:00
Monty
ce4f83e6b9 MDEV-29157 SELECT using ror_merged scan fails with s3 tables
handler::clone() call did not work with read only tables like S3.
It gave a wrong error message (out of memory instead of a permission
error) and aborted the query.

The issue was that the clone call had a wrong parameter to ha_open().
This now fixed. I also changed the clone call to provide the correct
error message if things fails.

This patch fixes an 'out of memory' error when using the S3 engine
for queries that could use multiple indexes together to find the matching
rows, like the following:
SELECT * FROM t1 WHERE key1 = 99 OR key2 = 2
2025-06-02 14:02:53 +03:00
Monty
22024da64e MDEV-36143 Row event replication with Aria does not honour BLOCK_COMMIT
This commit fixes a bug where Aria tables are used in
(master->slave1->slave2) and a backup is taken on slave2. In this case
it is possible that the replication position in the backup, stored in
mysql.gtid_slave_pos, will be wrong. This will lead to replication
errors if one is trying to use the backup as a new slave.

Analyze:
Replicated row events are committed with trans_commit_stmt() and
thd->transaction->all.ha_list != 0.
This means that backup_commit_lock is not taken for Aria tables,
which means the rows are committed and binary logged on the slave
under BLOCK_COMMIT which should not happen.

This issue does not occur on the master as thd->transaction->all.ha_list
is == 0 under AUTO_COMMIT, which sets 'is_real_trans' and 'rw_trans'
which in turn causes backup_commit_lock to be taken.

Fixed by checking in ha_check_and_coalesce_trx_read_only() if all handlers
supports rollback and if not, then wait for BLOCK_COMMIT also for
statement commit.
2025-06-02 14:02:53 +03:00
Yuchen Pei
6f8ef26885 MDEV-36032 Check whether a table can be a sequence when ALTERed with SEQUENCE=1
To check the rows, the table needs to be opened. To that end, and like
MDEV-36038, we force COPY algorithm on ALTER TABLE ... SEQUENCE=1.
This also results in checking the sequence state / metadata.

The table structure was already validated before this patch.
2025-04-29 16:28:01 +10:00
Sergei Golubchik
63a69ab936 cleanup: remote automatic conversion char* -> Lex_ident
considered harmful, see e.g. changes in check_period_fields()
2025-04-22 12:03:05 +02:00
Julius Goryavsky
1a013cea95 Merge branch '10.6' into '10.11' 2025-04-16 03:34:40 +02:00
Julius Goryavsky
88dfa6bcee Merge branch '10.5' into '10.6' 2025-04-15 01:49:48 +02:00
Nikita Malyavin
e6ea5d568c MDEV-36507 fix dbug_print_row concurrent access
7544fd4cae had to make use of a static array to avoid memory
use-after-free or leak.

Instead, let us make a function returning String, this is the only way
to automatically manage the memory after the function returned.

To make it all correct, move constructor is added. Normally, it is
expected, that the constructor will be elided upon return of an object
by value, but if something goes different, or -fno-elide-constructors is
used, we can have a problem. So this was a move constructor avoids
copy elision-related UB.

dbug_print_row returning char* is still there for convenient use in a
debugger.
2025-04-11 13:42:53 +02:00
Andrei Elkin
c06c36218a MDEV-35506 commit policy of one-phase-commit even at errored-out binlogging leads to assert
Currently execution of commit in one phase proceeds to commit by
engines when binlog_commit() does not succeed.

There are two issues with that:

1. absence of binlog_rollback() or lower-level
`binlog_cache_data::reset()` along the following execution of the
failing statement eventually will raise an assert on non-empty binlog
cache, find in the MDEV description

  # --error assert(sql/log.cc:1712(binlog_close_connection))
  # --disconnect default

2. engines, including ones that are rollback capable, commit in this
particular error situation.

Both effects can be observed with a new mtr test that would fail when run on
a BASE of this commit.
The BASE has to include MDEV-35207 et all fixes because the test is written
with CREATE-TABLE-SELECTs.

A new test file verifies the new behaviour to rollback including
cases with a side effect of modified non-transactional engine which
expose another MDEV-36027 (TODO: fix).
2025-04-03 20:13:10 +03:00
Julius Goryavsky
74f0b99edf Merge branch '10.6' into '10.11' 2025-04-02 06:33:39 +02:00
Julius Goryavsky
b983a911e9 galera mtr tests: synchronization between branches and editions 2025-04-02 04:50:11 +02:00
Julius Goryavsky
03c31ab099 Merge branch '10.5' into '10.6' 2025-04-02 04:43:24 +02:00
Julius Goryavsky
41565615c5 galera: synchronization changes to stop random test failures 2025-04-02 04:29:34 +02:00
Julius Goryavsky
c61345169a galera tests: synchronization after merge 2025-03-28 02:53:59 +01:00
Marko Mäkelä
ab0f2a00b6 Merge 10.6 into 10.11 2025-03-27 08:01:47 +02:00
Monty
cc4d9200c4 MDEV-33813 ERROR 1021 (HY000): Disk full (./org/test1.MAI); waiting for someone to free some space... (errno: 28 "No space left on device")
The problem with MariaDB waiting was fixed earlier.
However the server still gives the old error,in case of disk full,
that includes "waiting for someone to free some space" even if
there is now wait.

This commit changes the error message for the non waiting case to:
Disk got full writing 'db.table' (Errcode: 28 "No space left on device")

Disk got full writing 'test.t1' (Errcode: 28 "No space left on device")Disk got full writing 'test.t1' (Errcode: 28 "No space left on device")Disk got full writing 'test.t1' (Errcode: 28 "No space left on device")
2025-03-06 09:40:55 +02:00
Julius Goryavsky
15139c88a8 Merge branch '10.5' into '10.6' 2025-03-05 01:54:40 +01:00
Julius Goryavsky
3a4c0295ae galera: synchronization between branches and editions 2025-03-05 01:47:15 +01:00
Julius Goryavsky
e3d7d5ca26 Merge branch '10.5' into '10.6' 2025-02-27 04:02:33 +01:00
Jan Lindström
b167730499 MDEV-34891 : SST failure occurs when gtid_strict_mode is enabled
Problem was that initial GTID was set on wsrep_before_prepare
out-of-order. In practice GTID was set to same as previous
executed transaction GTID. In recovery valid GTID was found
from prepared transaction and this transaction is committed
leading to fact that same GTID was executed twice.

This is fixed by setting invalid GTID at wsrep_before_prepare
and later in wsrep_before_commit actual correct GTID is set
and this setting is done while we are in commit monitor i.e.
assigment is done in order of replication.

In recovery if prepared transaction is found we check its
GTID, if it is invalid transaction will be rolled back
and if it is valid it will be committed.

Initialize gtid seqno from recovered seqno when
bootstrapping a new cluster.

Added two test cases for both mariabackup and rsync SST methods
to show that GTIDs remain consistent on cluster and that
all expected rows are in the table.

Added tests for wsrep GTID recovery with binlog on and off.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2025-02-18 19:30:04 +01:00
Sergei Golubchik
e69f8cae1a Merge branch '10.6' into 10.11 2025-01-30 11:55:13 +01:00
Sergei Golubchik
066e8d6aea Merge branch '10.5' into 10.6 2025-01-29 11:17:38 +01:00
Daniele Sciascia
0018df2b55 galera fix: Assertion WSREP(thd) failed in wsrep_restore_kill_after_commit()
Wsrep_commit_empty happens too early when wsrep is disabled. Let the
cleanup happen at end of statement.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2025-01-27 19:05:27 +01:00
Julius Goryavsky
862d1be2e6 MDEV-25718 addendum: stabilization of test success (especially for 11.4+)
Added DEBUG_SYNC_С("ha_write_row_end") in the WSREP branch,
and added a new status to the list of pending statuses in
the mtr test.
2025-01-27 19:05:26 +01:00
Sergey Vojtovich
b730abda09 MDEV-33285 - Assertion `m_table' failed in ha_perfschema::rnd_end on CHECKSUM TABLE
CHECKSUM TABLE causes variety of crashes when killed. This bug it not
specific to PERFORMANCE_SCHEMA.

Removed duplicate handler::ha_rnd_end() call.
2025-01-22 15:28:44 +01:00
Julius Goryavsky
d32ec7d48e MDEV-35852 : ASAN heap-use-after-free in WSREP_DEBUG after INSERT DELAYED
Post-fix: remove unnecessary warning messages when wrep is not used.
2025-01-20 12:19:37 +01:00
Jan Lindström
43c36b3c88 MDEV-35852 : ASAN heap-use-after-free in WSREP_DEBUG after INSERT DELAYED
Problem was that in case of INSERT DELAYED thd->query() is
freed before we call trans_rollback where WSREP_DEBUG
could access thd->query() in wsrep_thd_query().

Fix is to reset thd->query() to NULL in delayed_insert
destructor after it is freed. There is already
null guard at wsrep_thd_query().

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2025-01-20 12:19:31 +01:00
Marko Mäkelä
98dbe3bfaf Merge 10.5 into 10.6 2025-01-20 09:57:37 +02:00
Aleksey Midenkov
e1e1e50bba MDEV-35343 DML debug logging
Usage:

mtr --mysqld=--debug=d,dml,query:i:o,/tmp/dml.log

Example output:

T@6    : dispatch_command: query: insert into t1 values ('a')
T@6    : handler::ha_write_row: exit: INSERT: t1(a) = 0
T@6    : dispatch_command: query: alter ignore table t1 add unique index (data)
T@6    : handler::ha_write_row: exit: INSERT: t1(a) = 0
T@6    : dispatch_command: query: alter ignore table t1 add unique index (data)
T@6    : handler::ha_write_row: exit: INSERT: t1(a) = 0

T@6    : dispatch_command: query: replace into t1 values ('b'), ('c'), ('a'), ('b')
T@6    : handler::ha_write_row: exit: INSERT: t1(b) = 0
T@6    : handler::ha_write_row: exit: INSERT: t1(c) = 0
T@6    : handler::ha_write_row: exit: INSERT: t1(a) = 121
T@6    : write_record: exit: DELETE: t1(a) = 0
T@6    : handler::ha_write_row: exit: INSERT: t1(a) = 0
T@6    : handler::ha_write_row: exit: INSERT: t1(b) = 121
T@6    : write_record: exit: DELETE: t1(b) = 0
T@6    : handler::ha_write_row: exit: INSERT: t1(b) = 0
2025-01-14 18:56:13 +03:00
Denis Protivensky
901c6c7ab6 MDEV-33064: Sync trx->wsrep state from THD on trx start
InnoDB transactions may be reused after committed:
- when taken from the transaction pool
- during a DDL operation execution

In this case wsrep flag on trx object is cleared, which may cause wrong
execution logic afterwards (wsrep-related hooks are not run).

Make trx->wsrep flag initialize from THD object only once on InnoDB transaction
start and don't change it throughout the transaction's lifetime.
The flag is reset at commit time as before.

Unconditionally set wsrep=OFF for THD objects that represent InnoDB background
threads.

Make Wsrep_schema::store_view() operate in its own transaction.

Fix streaming replication transactions' fragments rollback to not switch
THD->wsrep value during transaction's execution
(use THD->wsrep_ignore_table as a workaround).

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2025-01-14 02:17:22 +01:00
Marko Mäkelä
a54d151fc1 Merge 10.6 into 10.11 2024-12-19 15:38:53 +02:00
Marko Mäkelä
ddd7d5d8e3 MDEV-24035 Failing assertion: UT_LIST_GET_LEN(lock.trx_locks) == 0 causing disruption and replication failure
Under unknown circumstances, the SQL layer may wrongly disregard an
invocation of thd_mark_transaction_to_rollback() when an InnoDB
transaction had been aborted (rolled back) due to one of the following errors:
* HA_ERR_LOCK_DEADLOCK
* HA_ERR_RECORD_CHANGED (if innodb_snapshot_isolation=ON)
* HA_ERR_LOCK_WAIT_TIMEOUT (if innodb_rollback_on_timeout=ON)

Such an error used to cause a crash of InnoDB during transaction commit.
These changes aim to catch and report the error earlier, so that not only
this crash can be avoided but also the original root cause be found and
fixed more easily later.

The idea of this fix is from Michael 'Monty' Widenius.

HA_ERR_ROLLBACK: A new error code that will be translated into
ER_ROLLBACK_ONLY, signalling that the current transaction
has been aborted and the only allowed action is ROLLBACK.

trx_t::state: Add TRX_STATE_ABORTED that is like
TRX_STATE_NOT_STARTED, but noting that the transaction had been
rolled back and aborted.

trx_t::is_started(): Replaces trx_is_started().

ha_innobase: Check the transaction state in various places.
Simplify the logic around SAVEPOINT.

ha_innobase::is_valid_trx(): Replaces ha_innobase::is_read_only().

The InnoDB logic around transaction savepoints, commit, and rollback
was unnecessarily complex and might have contributed to this
inconsistency. So, we are simplifying that logic as well.

trx_savept_t: Replace with const undo_no_t*. When we rollback to
a savepoint, all we need to know is the number of undo log records
that must survive.

trx_named_savept_t, DB_NO_SAVEPOINT: Remove. We can store undo_no_t
directly in the space allocated at innobase_hton->savepoint_offset.

fts_trx_create(): Do not copy previous savepoints.

fts_savepoint_rollback(): If a savepoint was not found, roll back
everything after the default savepoint of fts_trx_create().
The test innodb_fts.savepoint is extended to cover this code.

Reviewed by: Vladislav Lesin
Tested by: Matthias Leich
2024-12-12 18:02:00 +02:00
Marko Mäkelä
3d23adb766 Merge 10.6 into 10.11 2024-11-29 13:43:17 +02:00
Marko Mäkelä
7d4077cc11 Merge 10.5 into 10.6 2024-11-29 12:37:46 +02:00