@@new_mode is a set of flags to control introduced features.
Flags are by default off. Setting a flag in @@new_mode will introduce
a new different server behaviour and/or set of features.
We also introduce a new option 'hidden_values' into some system variable
types to hide options that we do not wish to show as options.
- Don't print hidden values in mysqld --help output.
- Make get_options() use the same logic as check_new_mode_value() does.
- Setting @@new_mode=ALL shouldn't give warnings.
(cherry picked from commit f7387cb13d)
In the BASE of this patch when a slave parallel worker proceeds from
the wait-for-prior-commit stage into retrying it may have its
backup-lock related sub-state, specifically `THD::backup_commit_lock`,
not reset, that is the pointer dangling.
That caused segfault at the pointer's dereferencing in the worker retrying.
The reason THD::backup_commit_lock is left dangling was unexpected
state of THD having non-NULL of `THD::backup_commit_lock` and NULL of `mdl_backup.ticket`.
This combination turns out possible when the slave worker is killed
for retry *and* few instruction later it does not succeed
to (re-)acquire the Backup MDL at exiting from
`MYSQL_BIN_LOG::queue_for_group_commit()`.
While it did not succeed it also did not expose that fact with
timing out from `MDL_context::acquire_lock` and it did not have to,
as it before to start waiting it found itself killed.
The bug is fixed with amendment of `backup_commit_lock` reset condition
at the end of `ha_commit_trans()`. The amended reset remains careful to
affect only the stack allocated lock.
A test is added to confirm the fixes with reproducing all stages
described above. In the patch BASE it causes the segfault.
@@new_mode is a set of flags to control introduced features.
Flags are by default off. Setting a flag in @@new_mode will introduce
a new different server behaviour and/or set of features.
We also introduce a new option 'hidden_values' into some system variable
types to hide options that we do not wish to show as options.
- Don't print hidden values in mysqld --help output.
- Make get_options() use the same logic as check_new_mode_value() does.
- Setting @@new_mode=ALL shouldn't give warnings.
We had a protection against it, by allowing versioned delete if:
trx->id != table->vers_start_id()
For replace this check fails: replace calls ha_delete_row(record[2]), but
table->vers_start_id() returns the value from record[0], which is irrelevant.
The same problem hits Field::is_max, which may have checked the wrong record.
Fix:
* Refactor Field::is_max to optionally accept a pointer as an argument.
* Refactor vers_start_id and vers_end_id to always accept a pointer to the
record. there is a difference with is_max is that is_max accepts the pointer to
the
field data, rather than to the record.
Method val_int() would be too effortful to refactor to accept the argument, so
instead the value in record is fetched directly, like it is done in
Field_longlong.
See also MDEV-30046.
Idempotent write_row works same as REPLACE: if there is a duplicating
record in the table, then it will be deleted and re-inserted, with the
same update optimization.
The code in Rows:log_event::write_row was basically copy-pasted from
write_record.
What's done:
REPLACE operation was unified across replication and sql. It is now
representred as a Write_record class, that holds the whole state, and allows
re-using some resources in between the row writes.
Replace, IODKU and single insert implementations are split across different
methods, reluting in a much cleaner code.
The entry point is preserved as a single Write_record::write_record() call.
The implementation to call is chosen on the constructor stage.
This allowed several optimizations to be done:
1. The table key list is not iterated for every row. We find last unique key in
the order of checking once and preserve it across the rows. See last_uniq_key().
2. ib_handler::referenced_by_foreign_key acquires a global lock. This call was
done per row as well. Not all the table config that allows optimized replace is
folded into a single boolean field can_optimize. All the fields to check are
even stored in a single register on a 64-bit platform.
3. DUP_REPLACE and DUP_UPDATE cases now have one less level of indirection
4. modified_non_trans_tables is checked and set only when it's really needed.
5. Obsolete bitmap manipulations are removed.
Also:
* Unify replace initialization step across implementations:
add prepare_for_replace and finalize_replace
* alloca is removed in favor of mem_root allocation. This memory is reused
across the rows.
* An rpl-related callback is added to the replace branch, meaning that an extra
check is made per row replace even for the common case. It can be avoided with
templates if considered a problem.
- Removed duplicate words, like "the the" and "to to"
- Removed duplicate lines (one double sort line found in mysql.cc)
- Fixed some typos found while searching for duplicate words.
Command used to find duplicate words:
egrep -rI "\s([a-zA-Z]+)\s+\1\s" | grep -v param
Thanks to Artjoms Rimdjonoks for the command and pointing out the
spelling errors.
We had a protection against it, by allowing versioned delete if:
trx->id != table->vers_start_id()
For replace this check fails: replace calls ha_delete_row(record[2]), but
table->vers_start_id() returns the value from record[0], which is irrelevant.
The same problem hits Field::is_max, which may have checked the wrong record.
Fix:
* Refactor Field::is_max to optionally accept a pointer as an argument.
* Refactor vers_start_id and vers_end_id to always accept a pointer to the
record. there is a difference with is_max is that is_max accepts the pointer to
the
field data, rather than to the record.
Method val_int() would be too effortful to refactor to accept the argument, so
instead the value in record is fetched directly, like it is done in
Field_longlong.
See also MDEV-30046.
Idempotent write_row works same as REPLACE: if there is a duplicating
record in the table, then it will be deleted and re-inserted, with the
same update optimization.
The code in Rows:log_event::write_row was basically copy-pasted from
write_record.
What's done:
REPLACE operation was unified across replication and sql. It is now
representred as a Write_record class, that holds the whole state, and allows
re-using some resources in between the row writes.
Replace, IODKU and single insert implementations are split across different
methods, reluting in a much cleaner code.
The entry point is preserved as a single Write_record::write_record() call.
The implementation to call is chosen on the constructor stage.
This allowed several optimizations to be done:
1. The table key list is not iterated for every row. We find last unique key in
the order of checking once and preserve it across the rows. See last_uniq_key().
2. ib_handler::referenced_by_foreign_key acquires a global lock. This call was
done per row as well. Not all the table config that allows optimized replace is
folded into a single boolean field can_optimize. All the fields to check are
even stored in a single register on a 64-bit platform.
3. DUP_REPLACE and DUP_UPDATE cases now have one less level of indirection
4. modified_non_trans_tables is checked and set only when it's really needed.
5. Obsolete bitmap manipulations are removed.
Also:
* Unify replace initialization step across implementations:
add prepare_for_replace and finalize_replace
* alloca is removed in favor of mem_root allocation. This memory is reused
across the rows.
* An rpl-related callback is added to the replace branch, meaning that an extra
check is made per row replace even for the common case. It can be avoided with
templates if considered a problem.
main/statistics_json.result is updated for f8ba5ced55 (MDEV-36099)
The test uses 'delete from t1' in many places and then populates
the table again. The natural order of rows in a MyISAM table is well
defined and the test was implicitly relying on that.
before f8ba5ced55 delete was deleting rows one by one, using
ha_myisam::delete_row() because the connection was stuck in rbr mode.
This caused rows to be shown in the reverse insertion order (because of
the delete link list).
MDEV-36099 fixes this bug and the server now correctly uses
ha_myisam::delete_all_rows(). This makes rows to be shown in the
insertion order as expected.
in case of a long unique conflict ha_write_row() used delete_row()
to remove the newly inserted row, and it used rnd_pos() to position
the cursor before deletion. This rnd_pos() was freeing and reallocating
blobs in record[0]. So when the code for FOR PORTION OF did
store_record(record[2]);
ha_write_row()
restore_record(record[2]);
it ended up with blob pointers to a freed memory.
Let's use lookup_handler for deletion.
let's disallow UPDATE IGNORE in READ COMMITTED with the table
has UNIQUE constraint that is USING HASH or is WITHOUT OVERLAPS
This rarely-used combination should not block a release, with be
fixed in MDEV-37233
mark creating an index as an SQLCOM_CREATE_INDEX operation.
currently the only effect of this is to tell InnoDB that it's not
a "CREATE TABLE" as such, so not affected by innodb_force_primary_key
followup for 9703c90712 (MDEV-37199 UNIQUE KEY USING HASH accepting duplicate records)
ha_partition can return HA_ERR_KEY_NOT_FOUND even in the middle
of the index scan
followup for 9703c90712 (MDEV-37199 UNIQUE KEY USING HASH accepting duplicate records)
maintain the invariant, that handler::ha_update_row()
is always invoked as handler::ha_update_row(record[0], record[1])
followup for 9703c90712 (MDEV-37199 UNIQUE KEY USING HASH accepting duplicate records)
when looking for long unique duplicates and the new row is already
inserted, we cannot simply "skip one conflict" we must skip exactly
the new row and find a conflict which isn't a new row - otherwise
table->file->dup_ref can be set incorrectly and REPLACE won't work.
Server-level UNIQUE constraints (namely, WITHOUT OVERLAPS and USING HASH)
only worked with InnoDB in REPEATABLE READ isolation mode, when the
constraint was checked first and then the row was inserted or updated.
Gap locks prevented race conditions when a concurrent connection
could've also checked the constraint and inserted/updated a row
at the same time.
In READ COMMITTED there are no gap locks. To avoid race conditions,
we now check the constraint *after* the row operation. This is
enabled by the HA_CHECK_UNIQUE_AFTER_WRITE table flag that InnoDB
sets in the READ COMMITTED transactions.
Checking the constraint after the row operation is more complex.
First, the constraint will see the current (inserted/updated) row,
and needs to skip it. Second, IGNORE operations become tricky,
as we need to revert the insert/update and continue statement execution.
write_row() (INSERT IGNORE) is reverted with delete_row(). Conveniently
it deletes the current row, that is, the last inserted row.
update_row(a,b) (UPDATE IGNORE) is reverted with a reversed update,
update_row(b,a). Conveniently, it updates the current row too.
Except in InnoDB when the PK is updated - in this case InnoDB internally
performs delete+insert, but does not move the cursor, so the "current"
row is the deleted one and the reverse update doesn't work.
This combination now throws an "unsupported" error and will
be fixed in MDEV-37233
Lots of different cases, SELECT, SELECT DEFAULT(),
UPDATE t SET x=DEFAULT, prepares statements,
opening of a table for the I_S, prelocking (so TL_WRITE),
insert with subquery (so SQLCOM_SELECT), etc.
Don't check NEXTVAL privileges in fix_fields() anymore, it cannot
possibly handle all the cases correctly. Make a special method
Item_func_nextval::check_access() for that and invoke it from
* fix_fields on explicit SELECT NEXTVAL()
(but not if NEXTVAL() is used in a DEFAULT clause)
* when DEFAULT bareword in used in, say, UPDATE t SET x=DEFAULT
(but not if DEFAULT() itself is used in a DEFAULT clause)
* in CREATE TABLE
* in ALTER TABLE ALGORITHM=INPLACE (that doesn't go CREATE TABLE path)
* on INSERT
helpers
* Virtual_column_info::check_access() to walk the item tree and invoke
Item::check_access()
* TABLE::check_sequence_privileges() to iterate default expressions
and invoke Virtual_column_info::check_access()
also, single-table UPDATE in prepared statements now associates
value items with fields just as multi-update already did, fixes the
case of PREPARE s "UPDATE t SET x=?"; EXECUTE s USING DEFAULT.
handler::clone() call did not work with read only tables like S3.
It gave a wrong error message (out of memory instead of a permission
error) and aborted the query.
The issue was that the clone call had a wrong parameter to ha_open().
This now fixed. I also changed the clone call to provide the correct
error message if things fails.
This patch fixes an 'out of memory' error when using the S3 engine
for queries that could use multiple indexes together to find the matching
rows, like the following:
SELECT * FROM t1 WHERE key1 = 99 OR key2 = 2
This commit fixes a bug where Aria tables are used in
(master->slave1->slave2) and a backup is taken on slave2. In this case
it is possible that the replication position in the backup, stored in
mysql.gtid_slave_pos, will be wrong. This will lead to replication
errors if one is trying to use the backup as a new slave.
Analyze:
Replicated row events are committed with trans_commit_stmt() and
thd->transaction->all.ha_list != 0.
This means that backup_commit_lock is not taken for Aria tables,
which means the rows are committed and binary logged on the slave
under BLOCK_COMMIT which should not happen.
This issue does not occur on the master as thd->transaction->all.ha_list
is == 0 under AUTO_COMMIT, which sets 'is_real_trans' and 'rw_trans'
which in turn causes backup_commit_lock to be taken.
Fixed by checking in ha_check_and_coalesce_trx_read_only() if all handlers
supports rollback and if not, then wait for BLOCK_COMMIT also for
statement commit.
The main purpose of this allow one to use the --read-only
option to ensure that no one can issue a query that can
block replication.
The --read-only option can now take 4 different values:
0 No read only (as before).
1 Blocks changes for users without the 'READ ONLY ADMIN'
privilege (as before).
2 Blocks in addition LOCK TABLES and SELECT IN SHARE MODE
for not 'READ ONLY ADMIN' users.
3 Blocks in addition 'READ_ONLY_ADMIN' users for all the
previous statements.
read_only is changed to an enum and one can use the following
names for the lock levels:
OFF, ON, NO_LOCK, NO_LOCK_NO_ADMIN
Too keep things compatible with older versions config files, one can
still use values FALSE and TRUE, which are mapped to OFF and ON.
The main visible changes are:
- 'show variables like "read_only"' now returns a string
instead of a number.
- Error messages related to read_only violations now contains
the current value off readonly.
Other things:
- is_read_only_ctx() renamed to check_read_only_with_error()
- Moved TL_READ_SKIP_LOCKED to it's logical place
Reviewed by: Sergei Golubchik <serg@mariadb.org>
There is a need in MDEV-25292 to have both C_ALTER_TABLE and
select_field_count in one call. Semantically creation mode and field
count are two different things. Making creation mode negative
constants and field count positive variable into one parameter seems
to be a lazy hack for not making the second parameter.
select_count does not make sense without alter_info->create_list, so
the natural way is to hold it in Alter_info too. select_count is now
stored in member select_field_count.
Merged and updated by: Monty