See also MDEV-30046.
Idempotent write_row works the same as REPLACE: if there is a duplicate
record in the table, it is deleted and re-inserted, with the
same update optimization.
The code in Rows_log_event::write_row was basically copy-pasted from
write_record.
What's done:
The REPLACE operation was unified across replication and SQL. It is now
represented as a Write_record class that holds the whole state and allows
re-using some resources between row writes.
Replace, IODKU and single insert implementations are split across different
methods, resulting in much cleaner code.
The entry point is preserved as a single Write_record::write_record() call.
The implementation to call is chosen at the constructor stage.
This allowed several optimizations to be done:
1. The table key list is not iterated for every row. We find the last unique key in
the order of checking once and preserve it across the rows. See last_uniq_key().
2. ib_handler::referenced_by_foreign_key acquires a global lock. This call was
done per row as well. Now all the table config that allows optimized replace is
folded into a single boolean field, can_optimize. All the fields to check are
even stored in a single register on a 64-bit platform.
3. DUP_REPLACE and DUP_UPDATE cases now have one less level of indirection.
4. modified_non_trans_tables is checked and set only when it's really needed.
5. Obsolete bitmap manipulations are removed.
Also:
* Unify replace initialization step across implementations:
add prepare_for_replace and finalize_replace
* alloca is removed in favor of mem_root allocation. This memory is reused
across the rows.
* An rpl-related callback is added to the replace branch, meaning that an extra
check is made per row replace even for the common case. It can be avoided with
templates if considered a problem.
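The structure described above can be sketched roughly as follows. This is a minimal illustration, not the server code: WriteRecordSketch, DupMode, KeySketch and the placeholder return codes are all hypothetical, and only the two ideas from the text are modelled (the implementation is picked once in the constructor, and the last unique key is computed once instead of per row).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-ins; these are not the server's real types.
enum class DupMode { Error, Replace, Update };

struct KeySketch { bool unique; };

class WriteRecordSketch {
public:
  WriteRecordSketch(DupMode mode, const std::vector<KeySketch> &keys)
    : impl_(pick(mode)), last_uniq_key_(find_last_unique(keys)) {}

  // Single entry point; dispatches to the implementation chosen once.
  int write_record() { return (this->*impl_)(); }

  // Index of the last unique key, or -1 if the table has none.
  int last_uniq_key() const { return last_uniq_key_; }

private:
  using Impl = int (WriteRecordSketch::*)();

  // Chosen in the constructor, so no per-row switch on the mode is needed.
  static Impl pick(DupMode mode) {
    switch (mode) {
    case DupMode::Replace: return &WriteRecordSketch::replace_row;
    case DupMode::Update:  return &WriteRecordSketch::insert_on_duplicate_update;
    case DupMode::Error:   break;
    }
    return &WriteRecordSketch::single_insert;
  }

  // Computed once per statement and reused for every row.
  static int find_last_unique(const std::vector<KeySketch> &keys) {
    for (std::size_t i = keys.size(); i-- > 0; )
      if (keys[i].unique)
        return static_cast<int>(i);
    return -1;
  }

  // Placeholder bodies: each returns a tag identifying the chosen path.
  int single_insert() { return 1; }
  int replace_row() { return 2; }
  int insert_on_duplicate_update() { return 3; }

  Impl impl_;
  int last_uniq_key_;
};
```

A member-function pointer is used here only to keep the sketch short; the commit text itself mentions templates as another way to avoid per-row checks.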
MDEV-33658 part 1’s refactoring ecaedbe299
introduced a new function init_key_info which (in part) aims to
calculate the total key length; however, it doesn’t account for the
key already having been initialized (as happens when called via
ALTER TABLE .. CONVERT PARTITION .. TO TABLE). This leads to crashes
when this key is later iterated over, because the iterator will try
to iterate over additional key parts which don’t exist because the
length reports as longer than the actual memory owned. The crash
reported by MDEV-36906 highlights this in function key_copy.
To explain how the keys already have been initialized, init_key_info
is called multiple times. That is, init_key_info is called from
mysql_prepare_create_table, which prepares a table and its key
structures for table creation, which is in turn called by
mysql_write_frm when using flags MFRM_WRITE_SHADOW and
MFRM_WRITE_CONVERTED_TO. The
ALTER TABLE .. CONVERT PARTITION .. TO TABLE use case (see function
fast_alter_partition_table) calls mysql_write_frm multiple times, once
with each of these flags (first with MFRM_WRITE_CONVERTED_TO and then
with MFRM_WRITE_SHADOW).
Raising it up a level, mysql_prepare_create_table doesn't need to be
called again after it has already been invoked when just writing frms.
init_key_info is the only place in that function which leads to side
effects, but the rest is redundant and can be skipped on the second
call (i.e. when writing the shadow).
The patch fixes this by skipping the call to mysql_prepare_create_table
in mysql_write_frm in the MFRM_WRITE_SHADOW block when it has already
been called previously. To track whether or not it has been previously
called, we add a new flag for the mysql_write_frm function,
MFRM_ALTER_INFO_PREPARED, which is hard-coded into the function call on
the later invocation.
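The double-initialization problem and the flag-based fix can be illustrated with a small sketch. Everything here is hypothetical: the SKETCH_* flag values, KeySketch, prepare_sketch() and write_frm_sketch() are stand-ins, and the only behaviour modelled is that the preparation step accumulates a key length and therefore must not run twice on the same key.

```cpp
#include <cassert>

// Hypothetical flag values, chosen only for this sketch.
constexpr unsigned SKETCH_WRITE_SHADOW        = 1u << 0;
constexpr unsigned SKETCH_WRITE_CONVERTED_TO  = 1u << 1;
constexpr unsigned SKETCH_ALTER_INFO_PREPARED = 1u << 2;

struct KeySketch {
  unsigned part_length[2];  // lengths of the real key parts
  unsigned key_length;      // accumulated by prepare_sketch()
};

// Stand-in for the non-idempotent step: it adds to key_length
// without checking whether the key was already initialized.
void prepare_sketch(KeySketch &key) {
  for (unsigned len : key.part_length)
    key.key_length += len;
}

void write_frm_sketch(KeySketch &key, unsigned flags) {
  if (flags & SKETCH_WRITE_CONVERTED_TO)
    prepare_sketch(key);
  // The shadow block skips preparation when the caller says it was done.
  if ((flags & SKETCH_WRITE_SHADOW) && !(flags & SKETCH_ALTER_INFO_PREPARED))
    prepare_sketch(key);
  // Writing the frm itself is omitted from this sketch.
}
```

With part lengths 4 and 8, calling the sketch first with SKETCH_WRITE_CONVERTED_TO and then with SKETCH_WRITE_SHADOW alone leaves key_length at 24, twice the real size; adding SKETCH_ALTER_INFO_PREPARED to the second call keeps it at 12.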
Test case based on work by Elena Stepanova <elenst@mariadb.com>
Reviewed By:
============
Sergei Golubchik <serg@mariadb.org>
Nikita Malyavin <nikita.malyavin@mariadb.com>
In locked_tables_mode, when a table is opened without the
MYSQL_OPEN_GET_NEW_TABLE flag, it is taken from the pre-opened and locked
tables. In that case we upgrade its MDL ticket to MDL_EXCLUSIVE before
the operation and downgrade it after the operation.
Lots of different cases, SELECT, SELECT DEFAULT(),
UPDATE t SET x=DEFAULT, prepared statements,
opening of a table for the I_S, prelocking (so TL_WRITE),
insert with subquery (so SQLCOM_SELECT), etc.
Don't check NEXTVAL privileges in fix_fields() anymore, it cannot
possibly handle all the cases correctly. Make a special method
Item_func_nextval::check_access() for that and invoke it from
* fix_fields on explicit SELECT NEXTVAL()
(but not if NEXTVAL() is used in a DEFAULT clause)
* when DEFAULT bareword is used in, say, UPDATE t SET x=DEFAULT
(but not if DEFAULT() itself is used in a DEFAULT clause)
* in CREATE TABLE
* in ALTER TABLE ALGORITHM=INPLACE (that doesn't go CREATE TABLE path)
* on INSERT
helpers
* Virtual_column_info::check_access() to walk the item tree and invoke
Item::check_access()
* TABLE::check_sequence_privileges() to iterate default expressions
and invoke Virtual_column_info::check_access()
also, single-table UPDATE in prepared statements now associates
value items with fields just as multi-update already did, fixes the
case of PREPARE s "UPDATE t SET x=?"; EXECUTE s USING DEFAULT.
@@enforce_storage_engine is a local setting and there is no
knowledge of how other nodes are configured. The statement
CREATE TABLE xxx ENGINE=yyy is replicated as is, and
if the required engine != the enforced engine it could lead to
inconsistent storage engines being used in the cluster.
The fix is to return an error and a warning if the required engine is not
the same as the enforced engine.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
Log tables cannot work with transactional InnoDB or Aria; ALTER TABLE
checks this and reports ER_UNSUPORTED_LOG_ENGINE. But it was
possible to circumvent this check with CREATE TABLE. The patch makes
the check of supported engine common for ALTER TABLE and CREATE TABLE.
Two new error codes ER_SEQUENCE_TABLE_HAS_TOO_FEW_ROWS and
ER_SEQUENCE_TABLE_HAS_TOO_MANY_ROWS were introduced in MDEV-36032 in
both 10.11 and, as part of MDEV-22491, 12.0. Here we remove them from
10.11, but they should remain in 12.0.
Ensure that backup_reset_alter_copy_lock() is called in case of rollback
or error in mysql_inplace_alter_table() or copy_data_between_tables().
Other things:
- Improved error from mariabackup when an unexpected DDL operation is
encountered.
- Added assert if backup_ddl_log() is called in the wrong context.
To check the rows, the table needs to be opened. To that end, and like
MDEV-36038, we force COPY algorithm on ALTER TABLE ... SEQUENCE=1.
This also results in checking the sequence state / metadata.
The table structure was already validated before this patch.
Problem:
========
- After commit cc8eefb0dc (MDEV-33087),
InnoDB uses the bulk insert operation for ALTER TABLE.. ALGORITHM=COPY
and CREATE TABLE..SELECT as well. InnoDB fails to clear the bulk
buffer when it encounters an error during CREATE..SELECT. The problem
is that during transaction cleanup, InnoDB fails to identify
the bulk insert for the DDL operation.
Fix:
====
- Represent bulk_insert in trx by 2 bits. By doing that, InnoDB
can distinguish between TRX_DML_BULK and TRX_DDL_BULK. During DDL,
set the bulk insert value for the transaction to TRX_DDL_BULK.
- Introduce a parameter HA_EXTRA_ABORT_ALTER_COPY which rolls back
only a TRX_DDL_BULK transaction.
- bulk_insert_apply() for a TRX_DDL_BULK transaction happens
only during the HA_EXTRA_END_ALTER_COPY extra() call.
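A rough sketch of the two-bit state and the DDL-only handling follows. The names loosely mirror the text, but the numeric values, SKETCH_NO_BULK, TrxSketch, the buffered_rows and applied_rows counters and extra_sketch() are assumptions made for illustration; the real InnoDB structures are not reproduced here.

```cpp
#include <cassert>

// Three states fit in two bits. The numeric values are assumptions.
enum BulkSketch { SKETCH_NO_BULK = 0, SKETCH_DML_BULK = 1, SKETCH_DDL_BULK = 2 };

struct TrxSketch {
  unsigned bulk_insert : 2;  // holds a BulkSketch value
  unsigned buffered_rows;    // rows waiting in the (imaginary) bulk buffer
  unsigned applied_rows;     // rows already written to the table
};

enum ExtraSketch { SKETCH_END_ALTER_COPY, SKETCH_ABORT_ALTER_COPY };

// Returns true if the call acted on the transaction. Only a DDL bulk
// transaction is touched; a DML bulk transaction is left alone.
bool extra_sketch(TrxSketch &trx, ExtraSketch op) {
  if (trx.bulk_insert != SKETCH_DDL_BULK)
    return false;
  if (op == SKETCH_END_ALTER_COPY)
    trx.applied_rows += trx.buffered_rows;  // apply the buffer at the end
  trx.buffered_rows = 0;                    // on abort the buffer is dropped
  trx.bulk_insert = SKETCH_NO_BULK;
  return true;
}
```

The point of the extra state is visible in the first branch: without a way to tell DDL bulk from DML bulk, the cleanup path cannot know whether the buffer belongs to the failed CREATE..SELECT.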
ha_innobase::extra(): Conditionally avoid a log write that had been
added in commit e5b9dc1536 (MDEV-25910)
because it may be invoked as part of select_insert::prepare_eof()
and not only during DDL operations.
Reviewed by: Sergei Golubchik
mysql_alter_table(): Consider ha_sequence::storage_ht() when determining
if the storage engine changed.
ha_sequence::check_if_supported_inplace_alter(): A new function, to
ensure that ha_innobase::check_if_supported_inplace_alter() will be
called on ALTER TABLE name_of_sequence SEQUENCE=0.
ha_innobase::check_if_supported_inplace_alter(): For any change of
the SEQUENCE attribute, always return HA_ALTER_INPLACE_NOT_SUPPORTED,
forcing ALGORITHM=COPY.
The problem was incorrect handling of partitioned tables:
because db_type == DB_TYPE_PARTITION_DB,
wsrep_should_replicate_ddl incorrectly marked
DDL as not replicatable. However, for partitioned
tables we should check the implementing storage engine
from table->file->partition_ht() if available, because
if the partition handler is InnoDB all DDL should be allowed
even with wsrep_strict_ddl. For other storage engines
DDL should not be allowed and an error should be issued.
This is the 10.5 version of the fix.
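The decision described above might look roughly like this. EngineSketch, TableSketch and should_replicate_ddl_sketch() are hypothetical names; the sketch also assumes that without strict DDL checking everything is allowed, which the text does not state explicitly.

```cpp
#include <cassert>

// Hypothetical, simplified engine identifiers for this sketch only.
enum class EngineSketch { InnoDB, MyISAM, Partition };

struct TableSketch {
  EngineSketch db_type;                  // what the table reports directly
  const EngineSketch *partition_engine;  // engine behind the partitions, or nullptr
};

// Returns true when DDL on the table may be replicated.
bool should_replicate_ddl_sketch(const TableSketch &t, bool strict_ddl) {
  if (!strict_ddl)
    return true;  // assumption: no restriction without strict DDL
  // Look through the partition wrapper to the implementing engine.
  EngineSketch effective = t.db_type;
  if (t.db_type == EngineSketch::Partition && t.partition_engine)
    effective = *t.partition_engine;
  return effective == EngineSketch::InnoDB;
}
```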
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
The problem was an incorrect condition in wsrep_check_sequence
when ENGINE!=InnoDB.
The fix is to not use DB_TYPE_XXX because it is not correct
for dynamic storage engines. Instead, the storage engine
name is looked up from thd->lex->m_sql_cmd->option_storage_engine_name.
For CREATE TABLE allow anything except ENGINE=SEQUENCE.
For CREATE SEQUENCE only ENGINE=InnoDB is supported.
For ALTER TABLE, if the original table contains sequence information,
only ENGINE=InnoDB is supported.
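The three rules above can be written down as a small name-based check. This is only a sketch: StmtSketch, same_engine() and sequence_engine_allowed() are hypothetical, the engine name is assumed to be already resolved to a non-empty string, and names are assumed to compare case-insensitively.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

enum class StmtSketch { CreateTable, CreateSequence, AlterTable };

// Case-insensitive comparison of engine names.
bool same_engine(const std::string &a, const std::string &b) {
  return a.size() == b.size() &&
         std::equal(a.begin(), a.end(), b.begin(),
                    [](unsigned char x, unsigned char y) {
                      return std::tolower(x) == std::tolower(y);
                    });
}

// Decides by engine name rather than by a numeric engine type.
bool sequence_engine_allowed(StmtSketch stmt, const std::string &engine,
                             bool table_has_sequence) {
  switch (stmt) {
  case StmtSketch::CreateTable:
    return !same_engine(engine, "SEQUENCE");
  case StmtSketch::CreateSequence:
    return same_engine(engine, "InnoDB");
  case StmtSketch::AlterTable:
    return !table_has_sequence || same_engine(engine, "InnoDB");
  }
  return false;
}
```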
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
mysql_prepare_create_table: Extract a Key initialization part that
relates to length calculation and long unique index designation.
The append_system_key_parts call also moves there.
Move this initialization before the duplicate elimination.
Extract the WITHOUT OVERLAPS check into a separate function. It had to be moved
earlier in the code to preserve the order of the error checks, as in the tests.
partition_engine_name was not reset when looping over tables in
mysql_rm_table_no_locks.
This could cause mariabackup to think that a normally dropped
table was partitioned.
This issue was discovered in 11.8 as part of atomic create and replace,
and only the fix was backported.
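The bug class is a per-iteration variable that keeps its value from the previous table. A minimal sketch, with hypothetical names (DropSketch, logged_as_partitioned) and a boolean switch that toggles between the buggy and the fixed behaviour:

```cpp
#include <cassert>
#include <string>
#include <vector>

struct DropSketch {
  std::string name;
  std::string partition_engine;  // empty for a non-partitioned table
};

// For each dropped table, reports whether it would be logged as partitioned.
// With reset_each_iteration == false the stale value from an earlier table
// leaks into later ones, which is the behaviour described above.
std::vector<bool> logged_as_partitioned(const std::vector<DropSketch> &tables,
                                        bool reset_each_iteration) {
  std::vector<bool> out;
  std::string partition_engine_name;
  for (const DropSketch &t : tables) {
    if (reset_each_iteration)
      partition_engine_name.clear();
    if (!t.partition_engine.empty())
      partition_engine_name = t.partition_engine;
    out.push_back(!partition_engine_name.empty());
  }
  return out;
}
```

Dropping a partitioned table followed by a plain one shows the difference: without the reset, the second table is also reported as partitioned.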
- MDEV-34392 (commit cc810e64d4) adds
a check for the nullability of a foreign key column when the foreign key
relation is ON UPDATE CASCADE or ON UPDATE SET NULL. This check
makes DDL fail when it violates foreign key nullability.
This patch performs the nullability check for the foreign key
column only in strict SQL mode.
mysql_compare_tables() failed because the prepared tmp_create_info has no
generated fields for long hash indexes. In the comparison we should skip
these fields in the original table by:
1. skipping INVISIBLE_SYSTEM fields;
2. getting key_info from table->s instead of table as TABLE_SHARE
contains unwrapped key_info.
`limit >= trx_id' failed in purge_node_t::skip
For fast alter partition, ALTER lost the hash fields in the frm field
count. mysql_prepare_create_table() did not call add_hash_field()
because the logic of ALTER-ing field types implies automatic
promotion/demotion to/from hash index. So we don't pass hash algorithm
to mysql_prepare_create_table() and let it decide itself, but it
cannot decide it correctly for fast alter partition.
So now mysql_prepare_alter_table() is a bit more sophisticated about what
to pass in the algorithm. If no fields were changed, it forces
mysql_prepare_create_table() to re-add hash fields by setting
HA_KEY_ALG_HASH.
The problem with the original logic is mysql_prepare_alter_table()
does not care 100% about hash property so the decision is blurred
between mysql_prepare_alter_table() and mysql_prepare_create_table().
MDEV-28127 used is_equal(), which compares vcol expressions
literally. But a vcol expression from another table does not compare equal
because of the different table name.
We implement another comparison method, is_identical(), which accounts for
the different table name in vcol comparison. If any field item points to
table_A and compared field item points to table_B, such items are
treated as equal in (table_A, table_B) comparison. This is done by
cloning table_B expression and renaming any table_B entries to table_A
in it.
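The clone-and-rename idea can be sketched on a toy expression type. FieldRefSketch, ExprSketch and both comparison functions are hypothetical; a real vcol expression is an item tree, which is flattened here to a list of field references to keep the example short.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model: an expression is just the ordered list of fields it references.
struct FieldRefSketch {
  std::string table;
  std::string column;
  bool operator==(const FieldRefSketch &o) const {
    return table == o.table && column == o.column;
  }
};
using ExprSketch = std::vector<FieldRefSketch>;

// Literal comparison: table names must match exactly.
bool is_equal_sketch(const ExprSketch &a, const ExprSketch &b) {
  return a == b;
}

// b is taken by value, so the caller's expression is never modified:
// the copy is the "clone", and references to table_b in it are renamed.
bool is_identical_sketch(const ExprSketch &a, ExprSketch b,
                         const std::string &table_a,
                         const std::string &table_b) {
  for (FieldRefSketch &ref : b)
    if (ref.table == table_b)
      ref.table = table_a;
  return a == b;
}
```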
Partial commit of the greater MDEV-34348 scope.
MDEV-34348: MariaDB is violating clang-16 -Wcast-function-type-strict
The functions queue_compare, qsort2_cmp, and qsort_cmp2
all had similar interfaces, and were used interchangeably
and unsafely cast to one another.
This patch consolidates the functions all into the
qsort_cmp2 interface.
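As an illustration of why one shared interface removes the need for function-pointer casts, here is a small sketch. The type cmp2_sketch and both functions are hypothetical and only resemble a three-argument comparator; the exact server typedefs are not reproduced. The comparator is written against the shared signature and converts its arguments internally, so it can be passed without any cast.

```cpp
#include <cassert>
#include <cstddef>

// One comparator signature shared by all users (hypothetical name).
using cmp2_sketch = int (*)(void *arg, const void *a, const void *b);

// Written directly against the shared signature. The conversions happen
// on the data pointers inside the body, not on the function pointer.
int cmp_int_sketch(void *arg, const void *a, const void *b) {
  const bool descending = arg && *static_cast<const bool *>(arg);
  const int x = *static_cast<const int *>(a);
  const int y = *static_cast<const int *>(b);
  const int r = (x > y) - (x < y);
  return descending ? -r : r;
}

// A tiny sort that accepts any comparator of the shared type.
void insertion_sort_sketch(int *v, std::size_t n, cmp2_sketch cmp, void *arg) {
  for (std::size_t i = 1; i < n; ++i)
    for (std::size_t j = i; j > 0 && cmp(arg, &v[j - 1], &v[j]) > 0; --j) {
      const int tmp = v[j];
      v[j] = v[j - 1];
      v[j - 1] = tmp;
    }
}
```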
Reviewed By:
============
Marko Mäkelä <marko.makela@mariadb.com>
That PR uncovered countless issues on `my_snprintf` uses.
This commit backports a squashed subset of their fixes.
(Excludes previous parts #3485 and #3493)
Also fixes
MDEV-35392 Assertion `!__asan_region_is_poisoned((void*) dest,templ->mysql_col_len)' failed in void row_sel_field_store_in_mysql_format_func(byte *, const mysql_row_templ_t *, const byte *, ulint)
Conversion from CHAR to VARCHAR must be done before the call
for create_length_to_internal_length_string().
Moving the conversion code from Column_definition::prepare_blob_field()
to Column_definition::prepare_stage1_string().
In commit 1c55b845e0 (MDEV-32932) the
test mariabackup.innodb_ddl_on_intermediate_table was introduced but
disabled.
xb_load_single_table_tablespace(): Properly handle missing FTS_ tables.
backup_file_op_fail(): Properly handle FILE_DELETE records.
Post-fix for MDEV-35144.
Cannot allocate options values on the statement arena, because
HA_CREATE_INFO is shallow-copied for every execution, so if the
option_list was initially empty, it will be reset for every execution
and any values allocated on the statement arena will be lost.
Cannot allocate option values on the execution arena, because
HA_CREATE_INFO is shallow-copied for every execution, so if the
option_list was initially NOT empty, any values appended to the
end will be preserved and if they're on the execution arena their
content will be destroyed.
Let's use thd->change_item_tree() to save and restore necessary pointers
for every execution.
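The save-and-restore approach can be sketched as a small change log. This only approximates the idea of registering a pointer change so it can be undone after the execution; OptionSketch and ChangeLogSketch are hypothetical and do not reproduce the signature or internals of thd->change_item_tree().

```cpp
#include <cassert>
#include <vector>

struct OptionSketch {
  int value;
  OptionSketch *next;
};

// Records every pointer overwrite so it can be undone afterwards.
class ChangeLogSketch {
public:
  // Remember the old pointer, then install the new one.
  void change(OptionSketch **place, OptionSketch *new_value) {
    changes_.push_back(Change{place, *place});
    *place = new_value;
  }

  // Undo in reverse order, restoring the state from before the execution.
  void rollback() {
    for (auto it = changes_.rbegin(); it != changes_.rend(); ++it)
      *it->place = it->old_value;
    changes_.clear();
  }

private:
  struct Change {
    OptionSketch **place;
    OptionSketch *old_value;
  };
  std::vector<Change> changes_;
};
```

In this model an option appended for one execution is linked in through change() and unlinked by rollback(), so the long-lived list never keeps a pointer to memory that is freed when the execution ends.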
followup for 3da565c41d