The problem was that thd->lex->m_sql_cmd is not always set,
especially when the user has not provided ENGINE=xxx, so
requesting option_storage_engine_name from it is not safe.
Fixed by accessing thd->lex->m_sql_cmd only when the user
has used ENGINE=, and otherwise using ha_default_handlerton
and requesting the engine name from it.
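A minimal sketch of the described lookup order, with illustrative names
(not the real server symbols):

  // Minimal sketch: thd->lex->m_sql_cmd is only guaranteed to be set
  // when the user wrote ENGINE=xxx; without that, fall back to the
  // default handlerton instead of dereferencing m_sql_cmd.
  static const char *resolved_engine_name(bool user_specified_engine,
                                          const char *lex_engine_name,
                                          const char *default_engine_name)
  {
    return user_specified_engine ? lex_engine_name : default_engine_name;
  }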
See also MDEV-30046.
Idempotent write_row works the same as REPLACE: if there is a duplicate
record in the table, then it will be deleted and re-inserted, with the
same update optimization.
The code in Rows_log_event::write_row was basically copy-pasted from
write_record.
What's done:
The REPLACE operation was unified across replication and SQL. It is now
represented as a Write_record class that holds the whole state and allows
re-using some resources in between the row writes.
The REPLACE, IODKU and single-insert implementations are split across
different methods, resulting in much cleaner code.
The entry point is preserved as a single Write_record::write_record() call.
The implementation to call is chosen at the construction stage, roughly as
sketched below.
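A minimal sketch of that dispatch, with assumed member and method names
(the real class holds much more state):

  // Sketch only: the constructor picks the row-write implementation
  // once; write_record() then just dispatches through the stored
  // member-function pointer for every row.
  class Write_record
  {
    int (Write_record::*impl)();
    int write_replace()       { /* REPLACE path */ return 0; }
    int write_iodku()         { /* INSERT .. ON DUPLICATE KEY UPDATE */ return 0; }
    int write_single_insert() { /* plain INSERT */ return 0; }
  public:
    enum dup_mode { DUP_ERROR, DUP_REPLACE, DUP_UPDATE };
    explicit Write_record(dup_mode mode)
      : impl(mode == DUP_REPLACE ? &Write_record::write_replace :
             mode == DUP_UPDATE  ? &Write_record::write_iodku :
                                   &Write_record::write_single_insert)
    {}
    int write_record() { return (this->*impl)(); }  // single entry point
  };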
This allowed several optimizations to be done:
1. The table key list is not iterated for every row. We find the last
unique key in the order of checking once and preserve it across the rows.
See last_uniq_key().
2. ib_handler::referenced_by_foreign_key acquires a global lock. This call
was done per row as well. Now all the table config that allows an optimized
replace is folded into a single boolean field, can_optimize. All the fields
to check are even stored in a single register on a 64-bit platform (see the
sketch after this list).
3. DUP_REPLACE and DUP_UPDATE cases now have one less level of indirection.
4. modified_non_trans_tables is checked and set only when it's really needed.
5. Obsolete bitmap manipulations are removed.
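A sketch of items 1-2, again with illustrative names: everything needed
to pick the optimized path is computed once per statement, so the per-row
loop only reads cached members.

  // Sketch only: the expensive per-statement checks (key-list scan,
  // referenced_by_foreign_key() with its global lock) run once; the
  // per-row decision reads two cached members.
  struct Replace_state
  {
    unsigned last_unique_key;  // last unique key in checking order (item 1)
    bool     can_optimize;     // folded table config, including the
                               // referenced_by_foreign_key() answer (item 2)

    Replace_state(unsigned last_key, bool fk_referenced,
                  bool has_delete_triggers)
      : last_unique_key(last_key),
        can_optimize(!fk_referenced && !has_delete_triggers)
    {}

    bool use_fast_replace(unsigned dup_key_nr) const
    {
      return can_optimize && dup_key_nr == last_unique_key;
    }
  };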
Also:
* Unify replace initialization step across implementations:
add prepare_for_replace and finalize_replace
* alloca is removed in favor of mem_root allocation. This memory is reused
across the rows.
* An rpl-related callback is added to the replace branch, meaning that an extra
check is made per row replace even for the common case. It can be avoided with
templates if considered a problem.
- Removed duplicate words, like "the the" and "to to"
- Removed duplicate lines (one double sort line found in mysql.cc)
- Fixed some typos found while searching for duplicate words.
Command used to find duplicate words:
egrep -rI "\s([a-zA-Z]+)\s+\1\s" | grep -v param
Thanks to Artjoms Rimdjonoks for the command and pointing out the
spelling errors.
Ensure that Annotate_rows is always written directly after the GTID
information,
before any table_map events.
Before this patch, the following problems existed when mixing
transactional and non-transactional tables in the same statement:
- Annotate_rows could be written after row events or in the next GTID
event.
- See rpl_row_mixing_engines
- Annotate_rows was not always written to the binary log in case of an
error with a transactional table (rolled back) while a non-transactional
table was updated.
- See sp_trans_log, binlog_row_mix_innodb_myisam
Fixed by writing the Annotate_rows event into the non-transactional
cache if any non-transactional tables are used; otherwise, the event is
written into the transactional cache, as sketched below.
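A minimal sketch of that cache choice, with assumed names; the point is
that the non-transactional (statement) cache is flushed to the binary log
first, so Annotate_rows stays right after the GTID event and before any
table_map events even on rollback:

  enum binlog_cache { TRANSACTIONAL_CACHE, NON_TRANSACTIONAL_CACHE };

  // Assumed-name sketch of the decision described above.
  static enum binlog_cache annotate_rows_cache(bool uses_non_trans_table)
  {
    return uses_non_trans_table ? NON_TRANSACTIONAL_CACHE
                                : TRANSACTIONAL_CACHE;
  }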
main/statistics_json.result is updated for f8ba5ced55 (MDEV-36099)
The test uses 'delete from t1' in many places and then populates
the table again. The natural order of rows in a MyISAM table is well
defined and the test was implicitly relying on that.
Before f8ba5ced55, the delete was deleting rows one by one, using
ha_myisam::delete_row(), because the connection was stuck in RBR mode.
This caused rows to be shown in the reverse insertion order (because of
the deleted-records link list).
MDEV-36099 fixes this bug and the server now correctly uses
ha_myisam::delete_all_rows(). This makes rows be shown in the
insertion order, as expected.
The problem is that a few constructs are missing from the atomic CREATE
TABLE patch, which is not yet committed:
- table_creation_was_logged is now set to 1 when logging CREATE TEMPORARY.
- Testing for BINLOG_FORMAT_STMT was changed to binlog_create_tmp_table()
to take the create_tmp_table_binlog_formats value into account.
- When logging the creation of temporary tables to the binlog, the
source table was used instead of the newly created temporary table.
MDEV-33658 part 1's refactoring ecaedbe299
introduced a new function init_key_info which (in part) aims to
calculate the total key length; however, it doesn't account for the
key already having been initialized (as happens when called via
ALTER TABLE .. CONVERT PARTITION .. TO TABLE). This leads to crashes
when this key is later iterated over, because the iterator tries to
iterate over additional key parts which don't exist, since the
length reports as longer than the actual memory owned. The crash
reported by MDEV-36906 highlights this in function key_copy.
To explain how the keys have already been initialized: init_key_info
is called multiple times. That is, init_key_info is called from
mysql_prepare_create_table, which prepares a table and its key
structures for table creation, which is in turn called by
mysql_write_frm when using flags MFRM_WRITE_SHADOW and
MFRM_WRITE_CONVERTED_TO. The
ALTER TABLE .. CONVERT PARTITION .. TO TABLE use case (see function
fast_alter_partition_table), calls mysql_write_frm multiple times with
both of these flags set (first with MFRM_WRITE_CONVERTED_TO and then
with MFRM_WRITE_SHADOW).
Raising it up a level, mysql_prepare_create_table doesn't need to be
called again after it has already been invoked when just writing frms.
init_key_info is the only place in that function which leads to side
effects; the rest is redundant and can be skipped on the second
call (i.e. when writing the shadow frm).
The patch fixes this by skipping the call to mysql_prepare_create_table
in mysql_write_frm in the MFRM_WRITE_SHADOW block when it has already
been called previously. To track whether it has been previously
called, we add a new flag for the mysql_write_frm function,
MFRM_ALTER_INFO_PREPARED, which is hard-coded into the function call on
the later invocation, as sketched below.
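A rough sketch of the added guard; the flag values and call shape are
simplified assumptions, not the real sql_table.cc code:

  static const unsigned MFRM_WRITE_SHADOW        = 1U << 0;
  static const unsigned MFRM_ALTER_INFO_PREPARED = 1U << 2;  // new flag

  static void write_frm_sketch(unsigned flags)
  {
    if ((flags & MFRM_WRITE_SHADOW) && !(flags & MFRM_ALTER_INFO_PREPARED))
    {
      // First invocation: mysql_prepare_create_table(...) would run
      // here. The later invocation passes MFRM_ALTER_INFO_PREPARED,
      // so it is skipped and init_key_info is not re-run on keys that
      // are already initialized.
    }
  }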
Test case based on work by Elena Stepanova <elenst@mariadb.com>
Reviewed By:
============
Sergei Golubchik <serg@mariadb.org>
Nikita Malyavin <nikita.malyavin@mariadb.com>
In locked_tables_mode, when a table is opened without the
MYSQL_OPEN_GET_NEW_TABLE flag, it is taken from the pre-opened and locked
tables. In that case we upgrade its MDL ticket to MDL_EXCLUSIVE before
the operation and downgrade it after the operation.
Lots of different cases: SELECT, SELECT DEFAULT(),
UPDATE t SET x=DEFAULT, prepared statements,
opening of a table for the I_S, prelocking (so TL_WRITE),
INSERT with a subquery (so SQLCOM_SELECT), etc.
Don't check NEXTVAL privileges in fix_fields() anymore, as it cannot
possibly handle all the cases correctly. Make a special method
Item_func_nextval::check_access() for that and invoke it from
* fix_fields on an explicit SELECT NEXTVAL()
(but not if NEXTVAL() is used in a DEFAULT clause)
* when the DEFAULT bareword is used in, say, UPDATE t SET x=DEFAULT
(but not if DEFAULT() itself is used in a DEFAULT clause)
* in CREATE TABLE
* in ALTER TABLE ALGORITHM=INPLACE (which doesn't go through the CREATE
TABLE path)
* on INSERT
helpers (see the sketch below)
* Virtual_column_info::check_access() to walk the item tree and invoke
Item::check_access()
* TABLE::check_sequence_privileges() to iterate default expressions
and invoke Virtual_column_info::check_access()
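A minimal sketch of how these helpers chain together, with assumed
signatures:

  // Assumed-signature sketch: iterate default expressions, walk each
  // item tree, and let every item check its own privileges;
  // Item_func_nextval overrides check_access() to check the sequence.
  // check_access() returns true on a privilege error.
  struct Item
  {
    virtual ~Item() {}
    virtual bool check_access() { return false; }  // no-op for most items
  };
  struct Vcol_info
  {
    Item *expr;
    bool check_access() { return expr && expr->check_access(); }
  };
  struct Table_sketch
  {
    Vcol_info *defaults;
    unsigned   n_defaults;
    bool check_sequence_privileges()
    {
      for (unsigned i= 0; i < n_defaults; i++)
        if (defaults[i].check_access())
          return true;  // privilege check failed
      return false;
    }
  };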
also, single-table UPDATE in prepared statements now associates
value items with fields just as multi-update already did; this fixes the
case of PREPARE s FROM "UPDATE t SET x=?"; EXECUTE s USING DEFAULT.
@@enforce_storage_engine is a local setting and there is no
knowledge of how other nodes are configured. The statement
CREATE TABLE xxx ENGINE=yyy is replicated as is, and
if the required engine != the enforced engine, it could lead to
inconsistent storage engines being used in the cluster.
The fix is to return an error and a warning if the required engine is
not the same as the enforced engine.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
Check stmt_da::is_error before calling stmt_da::sql_errno (which is what
the assertion ensures).
The bug did not have any negative effects in optimized builds.
Log tables cannot work with transactional InnoDB or Aria; that is
checked by ALTER TABLE, which raises ER_UNSUPORTED_LOG_ENGINE. But it was
possible to circumvent this check with CREATE TABLE. The patch makes
the check of supported engines common to ALTER TABLE and CREATE TABLE.
To check the rows, the table needs to be opened. To that end, and like
MDEV-36038, we force COPY algorithm on ALTER TABLE ... SEQUENCE=1.
This also results in checking the sequence state / metadata.
The table structure was already validated before this patch.
(cherry picked from commit 6f8ef26885)
Two new error codes ER_SEQUENCE_TABLE_HAS_TOO_FEW_ROWS and
ER_SEQUENCE_TABLE_HAS_TOO_MANY_ROWS were introduced in MDEV-36032 in
both 10.11 and, as part of MDEV-22491, 12.0. Here we remove them from
10.11, but they should remain in 12.0.
REPAIR of InnoDB tables was logging both ALTER TABLE and REPAIR to the
ddl log. The ALTER TABLE entry contained the new tableid and the REPAIR
entry, wrongly, contained the old rowid.
Now only REPAIR is logged.
ddl.log changes:
REPAIR TABLE and OPTIMIZE TABLE that are done through ALTER TABLE will
now contain the old and new table id. If not done through ALTER TABLE,
only the current rowid will be shown (as before).
Remove the no longer used
bool ddl_log_write_execute_entry(uint first_entry,
                                 DDL_LOG_MEMORY_ENTRY **active_entry)
Simple transformations, no logic changes.
There is a need in MDEV-25292 to have both C_ALTER_TABLE and
select_field_count in one call. Semantically, creation mode and field
count are two different things. Packing the creation mode as negative
constants and the field count as a positive variable into one parameter
seems to be a lazy hack for not adding a second parameter.
select_field_count does not make sense without alter_info->create_list, so
the natural way is to hold it in Alter_info too. It is now stored in the
member select_field_count, as sketched below.
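A sketch of the separation, with illustrative member names:

  // Sketch only: creation mode and field count become independent
  // members of Alter_info instead of one signed parameter that mixed
  // negative mode constants with a positive field count.
  struct Alter_info_sketch
  {
    int      create_mode;         // e.g. C_ALTER_TABLE, a mode on its own
    unsigned select_field_count;  // fields from the SELECT part of
                                  // CREATE ... SELECT; 0 if none
  };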
Merged and updated by: Monty
MDEV-36563 Assertion `!mysql_bin_log.is_open()' failed in
THD::mark_tmp_table_as_free_for_reuse
The purpose of this commit is to ensure that creation and changes of
temporary tables are properly and predictably logged to the binary
log. It also fixes some bugs where ROW logging was used in MIXED mode
when STATEMENT would be a better (and expected) choice.
In this commit message, STATEMENT stands for logging to the binary log
in STATEMENT format, MIXED stands for the MIXED binlog format and ROW
for the ROW binlog format.
New rules for logging of temporary tables
- CREATE of temporary tables is now by default binlogged only if
STATEMENT binlog format is used. If it is binlogged, 1 is stored in
TABLE_SHARE->table_creation_was_logged. The user can change this
behavior by setting create_temporary_table_binlog_formats to
MIXED,STATEMENT, in which case the create is logged in statement
format also in MIXED mode (as before).
- Changes to temporary tables are binlogged if and only if
the CREATE was logged. The logging happens under STATEMENT or MIXED.
If binlog_format=ROW, temporary table changes are not binlogged. A
temporary table that is changed under ROW is marked as 'not up to
date in binlog' and no future row changes are logged. Any future
statement that uses such a temporary table will force row logging
of the other tables it uses.
- DROP TEMPORARY is binlogged only if the CREATE was binlogged.
Changes done:
- Row logging is forced for any statement using temporary tables that
are not up to date in the binary log.
(Before, row logging was forced if the user had any temporary table.)
- If there are any changes to the temporary table that are not binlogged,
the table is marked as not up to date.
- TABLE_SHARE->table_creation_was_logged has a new definition for
temporary tables (see the sketch after this list):
0 Table creation was not logged to the binary log
1 Table creation was logged to the binary log and the table is up to date.
2 Table creation was logged to the binary log but some changes were
not logged to the binary log.
'Not up to date in the binary log' is defined as value 0 or 2.
- If a multi-table-update or multi-table-delete fails, then
all updated temporary tables are marked as not up to date.
- When dropping temporary tables, use IF EXISTS. This ensures
that the slave will not stop if it has crashed and lost its
temporary tables.
- Removed the comment and version from the DROP /*!4000 TEMPORARY..
statement generated when a connection with open temporary tables
closes. Added 'generated by server' at the end of the DROP.
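A sketch of that tri-state, with illustrative enumerator names:

  enum tmp_table_binlog_state
  {
    TMP_CREATE_NOT_LOGGED = 0,  // CREATE never written to the binlog
    TMP_LOGGED_UP_TO_DATE = 1,  // CREATE logged, all changes logged
    TMP_LOGGED_STALE      = 2   // CREATE logged, some changes missing
  };

  // "Not up to date in the binary log" as defined above:
  static bool tmp_table_not_up_to_date(enum tmp_table_binlog_state s)
  {
    return s == TMP_CREATE_NOT_LOGGED || s == TMP_LOGGED_STALE;
  }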
Bugs fixed:
- When using temporary tables with commands that forced row-based logging,
like INSERT INTO temporary_table VALUES (UUID()), the statement was never
logged, which caused the temporary table to become inconsistent on
master and slave.
- The binlog format used is now clearly defined. It depends only
on the current binlog_format and the tables used.
Before, it depended on whether the user had ANY temporary tables and on
the state of 'current_stmt_binlog_format' set by previous queries.
This also caused temporary tables to be logged to the binary log in
some cases.
- CREATE TABLE t1 LIKE not_logged_temporary_table caused replication
to stop.
- Renames of non-binlogged temporary tables were binlogged, which
caused replication to stop.
Changes in behavior:
- By default create_temporary_table_binlog_formats=STATEMENT, which
means that CREATE TEMPORARY is not logged to the binary log under MIXED
binary logging. This can be changed by setting
create_temporary_table_binlog_formats to MIXED,STATEMENT.
- Using temporary tables that were not logged to the binary log will
cause any query using them to update other tables to be logged in
ROW format. Before, all queries were logged in ROW format if the user had
any temporary tables, even if they were not used by the query.
- The generated DROP TEMPORARY TABLE now always uses IF EXISTS and
has a "generated by server" comment in the binary log.
The consequence of the above is that manipulating a lot of rows
through temporary tables will by default be slower in mixed mode.
For example:
BEGIN;
CREATE TEMPORARY TABLE tmp AS SELECT a, b, c FROM
large_table1 JOIN large_table2 ON ...;
INSERT INTO other_table SELECT b, c FROM tmp WHERE a <100;
DROP TEMPORARY TABLE tmp;
COMMIT;
By default this will create a huge entry in the binary log, compared
to just a few hundred bytes in statement mode. However, the change in
this commit makes the usage of temporary tables more reliable and
predictable and is thus worth it. Using statement mode or
create_temporary_table_binlog_formats can be used to avoid this issue.
Ensure that backup_reset_alter_copy_lock() is called in case of rollback
or error in mysql_inplace_alter_table() or copy_data_between_tables().
Other things:
- Improved the error from mariabackup when an unexpected DDL operation
is encountered.
- Added an assert for when backup_ddl_log() is called in the wrong
context.