Issue:
Commit 0584846 'Add TL_FIRST_WRITE in SQL layer for determining R/W'
limits INSERT statements on write nodes so that they no longer acquire
MDL locks on all child tables (and thus the corresponding wsrep
certification keys are not added), but applier nodes do acquire MDL
locks for all child tables. This can result in an MDL BF-BF conflict
on an applier node when transactions referring to parent and child
tables are executed concurrently. For example:
Tables with foreign keys: t1<-t2<-t3<-t4
Conflicting transactions: INSERT t1 and DROP TABLE t4
Wsrep certification keys taken on write node:
- for INSERT t1: t1 and t2
- for DROP TABLE t4: t4
On the applier node an MDL BF-BF conflict happened between the two
transactions because MDL locks on t1, t2, t3 and t4 were taken for
INSERT t1, which conflicted with the MDL lock on t4 taken by DROP TABLE t4.
Wsrep certification keys help resolve this MDL BF-BF conflict by
prioritizing and scheduling the concurrent transactions. But to generate
the certification keys, the node needs to open and take MDL locks on all
the child tables.
Commit 0584846 stops MDL locks from being taken on all child tables for
read-only FK checks (INSERT t1). But this doesn't work on applier nodes,
because the Write_rows_log_event logged for INSERT is also used to record
updates (see Write_rows_log_event::get_trg_event_map()), and therefore
MDL locks are taken for all the child tables on the applier node for
both update and insert events.
Solution:
Additional keys for the referenced/foreign tables need to be added
to avoid potential MDL conflicts with concurrent updates and DDLs.
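For illustration, a minimal SQL sketch of the scenario above (the table
definitions are assumptions, not taken from an actual test case):

  CREATE TABLE t1 (a INT PRIMARY KEY) ENGINE=InnoDB;
  CREATE TABLE t2 (a INT PRIMARY KEY, FOREIGN KEY (a) REFERENCES t1 (a)) ENGINE=InnoDB;
  CREATE TABLE t3 (a INT PRIMARY KEY, FOREIGN KEY (a) REFERENCES t2 (a)) ENGINE=InnoDB;
  CREATE TABLE t4 (a INT PRIMARY KEY, FOREIGN KEY (a) REFERENCES t3 (a)) ENGINE=InnoDB;

  -- Executed concurrently, e.g. on two different write nodes:
  INSERT INTO t1 VALUES (1);  -- write-node certification keys: t1, t2
  DROP TABLE t4;              -- write-node certification key: t4
  -- On the applier node the INSERT takes MDL locks on t1..t4, so it can
  -- BF-BF conflict with the MDL lock on t4 taken by the DROP TABLE.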
- Removed duplicate words, like "the the" and "to to"
- Removed duplicate lines (one double sort line found in mysql.cc)
- Fixed some typos found while searching for duplicate words.
Command used to find duplicate words:
egrep -rI "\s([a-zA-Z]+)\s+\1\s" | grep -v param
Thanks to Artjoms Rimdjonoks for the command and pointing out the
spelling errors.
Ensure that Annotate_rows is always written directly after the GTID
information, before any table_map events.
Before this patch, the following problems existed when mixing
transactional and non-transactional tables in the same statement:
- Annotate_rows could be written after row events or in the next GTID
  event.
  - See rpl_row_mixing_engines
- Annotate_rows was not always written to the binary log in case of an
  error with a transactional table (rolled back) while a non-transactional
  table was updated.
  - See sp_trans_log, binlog_row_mix_innodb_myisam
Fixed by writing the Annotate_rows event into the non-transactional
cache if any non-transactional tables are used. Otherwise the event is
written into the transactional cache.
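For illustration, a sketch of a statement that mixes a transactional and
a non-transactional table (an assumed shape, not the exact test case from
rpl_row_mixing_engines):

  SET SESSION binlog_format = ROW;
  SET SESSION binlog_annotate_row_events = 1;
  CREATE TABLE ti (a INT) ENGINE=InnoDB;  -- transactional
  CREATE TABLE tm (a INT) ENGINE=MyISAM;  -- non-transactional

  -- One statement touching both engines; with this patch the Annotate_rows
  -- event goes into the non-transactional cache and is written directly
  -- after the GTID event, before any table_map events.
  UPDATE ti, tm SET ti.a = 1, tm.a = 1;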
Timestamp-versioned row deletion was exposed to a collision problem: if
the current timestamp hadn't changed, a sequence of row delete+insert
could get a duplicate-key error. The row delete would find another,
conflicting history row and return an error.
This is true for both REPLACE and DELETE statements; however, in REPLACE
the "optimized" path is usually taken, especially in the tests. There,
the delete+insert is replaced by a single versioned row update. In the
end, both paths end up as ha_update_row + ha_write_row.
The solution is to handle the history collision.
From a design perspective, the user shouldn't experience loss of history
rows unless there's a technical limitation.
By contrast, trxid-based changes should never generate history for the
same transaction; see MDEV-15427.
If two operations on the same row happen so quickly that they get the
same timestamp, the history row shouldn't be lost. We can still write a
history row, though it'll have row_start == row_end.
We cannot store more than one such history row, as that would violate
the unique constraint on row_end. So we have to physically delete the
row if such a history row already exists.
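A sketch of the collision on a timestamp-versioned table (assuming both
REPLACE statements get the same versioning timestamp; SET timestamp is
used here only to freeze the clock for the example):

  CREATE TABLE t (a INT PRIMARY KEY, b INT) WITH SYSTEM VERSIONING;
  SET timestamp = UNIX_TIMESTAMP('2024-01-01 00:00:00');
  INSERT INTO t VALUES (1, 1);
  REPLACE INTO t VALUES (1, 2);  -- writes a history row ending at this timestamp
  REPLACE INTO t VALUES (1, 3);  -- before the fix: duplicate-key error on row_end;
                                 -- with the fix: the row is deleted for real instead
                                 -- of adding a second history row with the same row_end
  SET timestamp = DEFAULT;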
In this commit:
1. Improve TABLE::delete_row to handle the history collision: if the
   update results in a duplicate-key error, delete the row for real.
2. Use TABLE::delete_row in the non-optimized path of REPLACE, where the
   system-versioned case now belongs entirely.
We had a protection against this, by allowing a versioned delete only if:
  trx->id != table->vers_start_id()
For REPLACE this check fails: REPLACE calls ha_delete_row(record[2]), but
table->vers_start_id() returns the value from record[0], which is
irrelevant. The same problem hits Field::is_max, which may have checked
the wrong record.
Fix:
* Refactor Field::is_max to optionally accept a pointer as an argument.
* Refactor vers_start_id and vers_end_id to always accept a pointer to
  the record. The difference from is_max is that is_max accepts a pointer
  to the field data, rather than to the record.
Method val_int() would be too effortful to refactor to accept the
argument, so instead the value in the record is fetched directly, as is
done in Field_longlong.
There are lots of different cases: SELECT, SELECT DEFAULT(),
UPDATE t SET x=DEFAULT, prepared statements,
opening a table for I_S, prelocking (so TL_WRITE),
INSERT with a subquery (so SQLCOM_SELECT), etc.
Don't check NEXTVAL privileges in fix_fields() anymore, it cannot
possibly handle all the cases correctly. Make a special method
Item_func_nextval::check_access() for that and invoke it from
* fix_fields on explicit SELECT NEXTVAL()
(but not if NEXTVAL() is used in a DEFAULT clause)
* when the DEFAULT bareword is used in, say, UPDATE t SET x=DEFAULT
(but not if DEFAULT() itself is used in a DEFAULT clause)
* in CREATE TABLE
* in ALTER TABLE ALGORITHM=INPLACE (which doesn't go through the CREATE TABLE path)
* on INSERT
Helpers:
* Virtual_column_info::check_access() to walk the item tree and invoke
Item::check_access()
* TABLE::check_sequence_privileges() to iterate default expressions
and invoke Virtual_column_info::check_access()
Also, single-table UPDATE in prepared statements now associates
value items with fields just as multi-table UPDATE already did; this
fixes the case of PREPARE s "UPDATE t SET x=?"; EXECUTE s USING DEFAULT.
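A sketch of statement shapes that now route through
Item_func_nextval::check_access() (the sequence and table names are made up):

  CREATE SEQUENCE s1;
  CREATE TABLE t (x BIGINT DEFAULT (NEXTVAL(s1)), y INT);

  SELECT NEXTVAL(s1);             -- explicit NEXTVAL in a SELECT
  INSERT INTO t (y) VALUES (1);   -- the DEFAULT expression fires on INSERT
  UPDATE t SET x = DEFAULT;       -- DEFAULT bareword
  PREPARE s FROM 'UPDATE t SET x = ?';
  EXECUTE s USING DEFAULT;        -- the prepared-statement case fixed here
  -- Each of these must verify the caller's privileges on s1 via
  -- Item_func_nextval::check_access() rather than in fix_fields().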
temporary table
A compressed field cannot be part of a key by its nature: there is no
data to order, only the compressed data.
For the optimizer temporary table we create an uncompressed substitute.
In all other cases (MDEV-16808) we don't use a key: add_keyuse() is
skipped by the !field->compression_method() condition.
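For illustration, a sketch of a query shape where a compressed column
ends up in an optimizer temporary table (assumed names, not a test case
from the patch):

  CREATE TABLE t1 (a VARCHAR(1000) COMPRESSED, b INT);
  INSERT INTO t1 VALUES (REPEAT('x', 500), 1), (REPEAT('y', 500), 2);

  -- The derived table dt is materialized into an optimizer temporary
  -- table, where the compressed column gets an uncompressed substitute;
  -- no key is generated over the compressed field itself, since
  -- add_keyuse() is skipped for it.
  SELECT * FROM t1 JOIN (SELECT a FROM t1 GROUP BY a) dt ON t1.a = dt.a;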
replication problems
DELETE HISTORY did not process a parameterized PS properly, as the
history expression was checked at prepare stage, when the parameters
were not yet substituted. In that case check_units() succeeded, as there
was no invalid type: Item_param has type_handler_null, which is
inherited from the string type, and that is a valid type for a history
expression. The warning was thrown when the expression was evaluated
for comparison at delete execution (when the parameter was already
substituted).
The fix postpones check_units() until the first PS execution. We have
to postpone WHERE condition processing until the first execution and
update select_lex.where on every execution, as it is reset to the state
after prepare.
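A sketch of the parameterized statement (table name and the bound value
are illustrative):

  CREATE TABLE t (a INT) WITH SYSTEM VERSIONING;
  PREPARE stmt FROM 'DELETE HISTORY FROM t BEFORE SYSTEM_TIME ?';
  -- At prepare time the placeholder is an Item_param with
  -- type_handler_null, so check_units() cannot validate the history
  -- expression yet; with the fix the check runs at the first EXECUTE,
  -- when '?' has a real value.
  EXECUTE stmt USING '2024-01-01 00:00:00';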
Get rid of the need for materialization for a usual INSERT (cache
results in Item_cache* if needed):
- subqueries in VALUES do not see new records in the table we are
  inserting into
- subqueries in RETURNING are prohibited from using the table we are
  inserting into
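A sketch of the two rules above (assumed table names):

  CREATE TABLE t1 (a INT);
  CREATE TABLE t2 (a INT);
  INSERT INTO t1 VALUES (1), (2);

  -- A subquery in VALUES sees only the records that existed in t1
  -- before this INSERT started:
  INSERT INTO t1 VALUES ((SELECT MAX(a) FROM t1));

  -- A subquery in RETURNING must not use the table being inserted into:
  INSERT INTO t1 VALUES (10) RETURNING (SELECT COUNT(*) FROM t2);  -- allowed
  INSERT INTO t1 VALUES (11) RETURNING (SELECT COUNT(*) FROM t1);  -- rejected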
If one of the selected fields is a MIN or MAX and it has been optimized
into a constant, it is not added to the temp table used by a group by
handler (GBH). The GBH therefore cannot store results into this missing
field.
On the other hand, when SELECTing from a view or a derived table,
TMP_TABLE_ALL_COLUMNS is set. If the query has no GROUP BY or ORDER
BY, an Item_temptable_field is created for this MIN/MAX field and
added to the JOIN. Since the GBH could not store results into the
corresponding field in the temp table, the value of this
Item_temptable_field remains NULL. The NULL value is then passed to the
record, then to the temp row, and finally output as the (wrong) result.
To fix this, we opt not to create a Spider GBH when a view or
derived table is involved.
This fixes spider/bugfix.mdev_26345 for --view-protocol
Also fixed a comment:
TABLE_LIST::belong_to_derived is NULL if the table belongs to a
derived table that has non-MERGE type.
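A sketch of the triggering query shape (in the actual scenario tbl_a is a
Spider table; a plain table is used here only to keep the sketch
self-contained, and the Spider server/connection setup is omitted):

  CREATE TABLE tbl_a (a INT, b INT, KEY (a));
  CREATE VIEW v1 AS SELECT a, b FROM tbl_a;
  -- MIN(a) may be optimized into a constant and is then not added to the
  -- GBH temp table; selecting through the view sets TMP_TABLE_ALL_COLUMNS.
  -- Before the fix the Spider GBH left the field NULL and the query could
  -- return NULL; with the fix no GBH is created for views/derived tables.
  SELECT MIN(a) FROM v1;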
DELAYED with virtual columns
The segfault was caused by two different copies of the same Field
instance in a prepared delayed insert. One was made by
Delayed_insert::get_local_table() (see make_new_field()). That copy
went through parse_vcol_defs() and received a new vcol_info->expr.
Another one was made by copy_keys_from_share() by this code:
/*
We are using only a prefix of the column as a key:
Create a new field for the key part that matches the index
*/
field= key_part->field=field->make_new_field(root, outparam, 0);
field->field_length= key_part->length;
So key_part and table got different objects for the same field, and the
crash happened because key_part->field->vcol_info->expr was NULL.
The fix adds update_keypart_vcol_info() to update vcol_info->expr in
key_part->field.
Cleanup: memdup_vcol() is now a static inline function instead of a
macro, and it checks for OOM.
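A sketch of the crashing scenario shape (assuming a MyISAM table, since
INSERT DELAYED needs a non-transactional engine; the exact test case may
differ):

  CREATE TABLE t1 (
    a VARCHAR(100),
    v VARCHAR(100) AS (a) PERSISTENT,
    KEY (v(10))            -- prefix key: copy_keys_from_share() clones the field
  ) ENGINE=MyISAM;
  PREPARE s FROM 'INSERT DELAYED INTO t1 (a) VALUES (?)';
  EXECUTE s USING 'x';     -- crashed when the cloned key_part field had
                           -- vcol_info->expr == NULL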
Field_blob::store() has special code for the GROUP_CONCAT temporary
table (to store blob values in Blob_mem_storage - this prevents them
from being freed/overwritten when the next row is read).
Field_geom and Field_blob_compressed inherit from Field_blob, but they
have their own ::store() methods without this special Blob_mem_storage
support.
Considering that non-grouping CONCAT() of such fields converts
them to plain BLOB, let's do the same for GROUP_CONCAT. To do it,
Item_func_group_concat::setup will signal that it's creating
a temporary table for GROUP_CONCAT, and the Field_blob::make_new_field()
override will create a base Field_blob when under GROUP_CONCAT.
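For illustration, a sketch with a geometry column and a compressed blob
column (assumed table, not the test case itself):

  CREATE TABLE t1 (grp INT, g POINT NOT NULL, c TEXT COMPRESSED);
  INSERT INTO t1 VALUES (1, POINT(1, 1), 'abc'), (1, POINT(2, 2), 'def');

  -- The GROUP_CONCAT temporary table now uses a plain Field_blob for g
  -- and c (the same conversion non-grouping CONCAT() does), so the values
  -- are kept in Blob_mem_storage instead of being freed/overwritten
  -- between rows.
  SELECT grp, GROUP_CONCAT(c), GROUP_CONCAT(g) FROM t1 GROUP BY grp;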
Don't allow the referencing key column to change from NULL to NOT NULL
when:
1) the foreign key constraint type is ON UPDATE SET NULL,
2) the foreign key constraint type is ON DELETE SET NULL, or
3) the foreign key constraint type is ON UPDATE CASCADE and the
   referenced column is declared as NULL.
Don't allow the referenced key column to change from NOT NULL to NULL
when the foreign key constraint type is ON UPDATE CASCADE
and the referencing key columns don't allow NULL values.
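A sketch of rule 2) above (assumed table names); the checks listed below
enforce this for both the COPY and INPLACE algorithms:

  CREATE TABLE parent (a INT PRIMARY KEY) ENGINE=InnoDB;
  CREATE TABLE child (
    a INT,
    FOREIGN KEY (a) REFERENCES parent (a) ON DELETE SET NULL
  ) ENGINE=InnoDB;

  -- The referencing column must stay nullable so that ON DELETE SET NULL
  -- can be applied; changing it to NOT NULL is now rejected:
  ALTER TABLE child MODIFY a INT NOT NULL;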
get_foreign_key_info(): InnoDB sends the information about the
nullability of the foreign key fields and the referenced key fields.
fk_check_column_changes(): Enforces the above rules for the COPY
algorithm.
innobase_check_foreign_drop_col(): Checks whether the dropped
column exists in an existing foreign key relation.
innobase_check_foreign_low(): Enforces the above rules for the
INPLACE algorithm.
dict_foreign_t::check_fk_constraint_valid(): This is used
by the CREATE TABLE statement to check nullability for a foreign
key relation.