Issue:
Commit 0584846 'Add TL_FIRST_WRITE in SQL layer for determining R/W'
limits the MDL locks that INSERT statements acquire on child tables on
write nodes, and thereby the wsrep certification keys that are added, but
on applier nodes MDL locks are still acquired for all child tables. This
can result in an MDL BF-BF conflict on an applier node when transactions
referring to parent and child tables are executed concurrently. For example:
Tables with foreign keys: t1<-t2<-t3<-t4
Conflicting transactions: INSERT t1 and DROP TABLE t4
Wsrep certification keys taken on write node:
- for INSERT t1: t1 and t2
- for DROP TABLE t4: t4
On the applier node an MDL BF-BF conflict happened between the two
transactions because MDL locks on t1, t2, t3 and t4 were taken for
INSERT t1, which conflicted with the MDL lock on t4 taken by DROP TABLE t4.
The wsrep certification keys help resolve this MDL BF-BF conflict by
prioritizing and scheduling concurrent transactions. But generating wsrep
certification keys requires opening and taking MDL locks on all the child
tables. The change in commit 0584846 limits the MDL locks taken on child
tables for read-only FK checks (INSERT t1). But this doesn't work on
applier nodes because the Write_rows_log_event logged for INSERT is also
used to record updates (check Write_rows_log_event::get_trg_event_map()),
and therefore MDL locks are taken for all the child tables on the applier
node for both update and insert events.
Solution:
Additional keys for the referenced/foreign tables need to be added
to avoid potential MDL conflicts with concurrent updates and DDLs.
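The t1<-t2<-t3<-t4 example can be modelled with plain sets of table names. This is only a toy sketch, not server code: the helper names (direct_children_closure, full_closure, overlaps, example_chain) are hypothetical, and treating "certification keys" and "MDL locks" as sets of table names is a simplifying assumption. It shows why the keys taken on the write node (t1, t2) do not overlap with DROP TABLE t4, while the MDL locks taken on the applier (t1..t4) do.

```cpp
#include <cassert>   // only needed if you add checks like the ones described below
#include <map>
#include <set>
#include <string>
#include <vector>

using TableSet = std::set<std::string>;
// Hypothetical FK graph: parent table -> its direct child tables.
using FkGraph = std::map<std::string, std::vector<std::string>>;

// The table itself plus its direct children only
// (models the keys taken on the write node: t1 and t2).
TableSet direct_children_closure(const FkGraph &fk, const std::string &table)
{
  TableSet out{table};
  auto it = fk.find(table);
  if (it != fk.end())
    out.insert(it->second.begin(), it->second.end());
  return out;
}

// The table itself plus all transitive children
// (models the MDL locks taken on the applier: t1, t2, t3, t4).
TableSet full_closure(const FkGraph &fk, const std::string &table)
{
  TableSet out;
  std::vector<std::string> todo{table};
  while (!todo.empty())
  {
    std::string cur = todo.back();
    todo.pop_back();
    if (!out.insert(cur).second)
      continue;                      // already visited
    auto it = fk.find(cur);
    if (it != fk.end())
      todo.insert(todo.end(), it->second.begin(), it->second.end());
  }
  return out;
}

// True if the two sets share at least one table name.
bool overlaps(const TableSet &a, const TableSet &b)
{
  for (const auto &t : a)
    if (b.count(t))
      return true;
  return false;
}

// The FK chain from the example: t1<-t2<-t3<-t4.
FkGraph example_chain()
{
  return {{"t1", {"t2"}}, {"t2", {"t3"}}, {"t3", {"t4"}}};
}
```

In this model, overlaps(direct_children_closure(fk, "t1"), {"t4"}) is false, so certification sees no dependency, whereas overlaps(full_closure(fk, "t1"), {"t4"}) is true, which corresponds to the MDL conflict on the applier. Adding keys for the remaining tables makes the two sets agree.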
close_thread_tables() would not flush pending row events to the binlog cache
in certain conditions if LOCK TABLES was active. This could result in the
row events being binlogged without STMT_END_F flag, and eventually leave the
THD in an invalid state that triggered assertions later.
Reviewed-by: Monty <monty@mariadb.org>
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
The issue was that unpack_vcol_info_from_frm() wrongly linked the used
sequence tables into tables->internal_tables when more than one sequence
table was used.
Other things:
- Fixed internal_table_exists() to take db into account.
(This makes the code easier to read. As we were comparing
pointers, the old code also worked.)
The TABLE_LIST parsed from procedure code is transferred into the tables
to lock for INSERT. The procedure code is CREATE VIEW, so its TABLE_LIST
is parsed as TL_IGNORE, but the same view already exists, and when the
existing view is opened, mysql_make_view() uses the same TABLE_LIST that
was initialized from CREATE VIEW and then added as part of the prelocking
context. So the existing view is opened and its table is assigned
TL_IGNORE from the prelocking context. Finally, INSERT ends up with a
duplicated TABLE_LIST: one parsed from the INSERT itself, and another
that came from procedure prelocking, whose lock_type came from the
procedure code and whose real table was found via the existing view.
The sequence of execution:
1. Procedure p is compiled as part of open_and_process_routine(), its
code is parsed and create_or_alter_view_finalize() initializes v
TABLE_LIST as TL_IGNORE;
2. Procedure p prelocking adds v to prelocking_ctx with TL_IGNORE;
3. DML prelocking adds v from prelocking_ctx;
4. View is opened, mysql_make_view() assigns t lock_type from v;
5. open_and_lock_tables() attempts to lock t with TL_IGNORE.
The fix skips TL_IGNORE at step 2, when the table list parsed by the
procedure is added for prelocking:
    if (my_hash_insert(&m_sptabs, (uchar *)tab))
      return FALSE;
m_sptabs is designated strictly for prelocking:
    /**
      Multi-set representing optimized list of tables to be locked by this
      routine. Does not include tables which are used by invoked routines.
      @note
      For prelocking-free SPs this multiset is constructed too.
      We do so because the same instance of sp_head may be called both
      in prelocked mode and in non-prelocked mode.
    */
    HASH m_sptabs;
The fix was proposed by Sergei Golubchik <serg@mariadb.org>.
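The idea of skipping TL_IGNORE entries at step 2 can be illustrated with a small standalone sketch. Everything here is a simplified stand-in: LockType, TableRef and collect_for_prelocking() are hypothetical names, not the server's thr_lock_type, TABLE_LIST or sp_head code, and a std::vector replaces the m_sptabs hash.

```cpp
#include <cassert>   // only needed if you add checks like the ones described below
#include <string>
#include <vector>

// Simplified stand-in for the server's lock types (assumption).
enum LockType { LOCK_IGNORE, LOCK_READ, LOCK_WRITE };

// Simplified stand-in for a parsed table reference (assumption).
struct TableRef
{
  std::string name;
  LockType lock_type;
};

// Build the list of tables a routine wants prelocked, skipping every
// entry parsed with the "ignore" lock type (e.g. the view named in
// CREATE VIEW inside the routine body).
std::vector<TableRef> collect_for_prelocking(const std::vector<TableRef> &parsed)
{
  std::vector<TableRef> out;
  for (const TableRef &t : parsed)
  {
    if (t.lock_type == LOCK_IGNORE)
      continue;               // never hand an "ignore" entry to prelocking
    out.push_back(t);
  }
  return out;
}
```

With a parsed list containing v (LOCK_IGNORE) and t2 (LOCK_WRITE), only t2 is returned, so the view never reaches the prelocking context with an unusable lock type.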
Only do trigger prelocking for tables that are going to be
modified (with a write lock). A table can cause prelocking
if its DEFAULT value is used (because DEFAULT can be NEXTVAL),
even if the table itself is only used for reads. Don't process
triggers for such a table.
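A minimal sketch of that rule, under stated assumptions: PrelockEntry and tables_needing_trigger_prelocking() are hypothetical names invented for illustration, and the two booleans stand in for the real lock type and trigger list checks in the server.

```cpp
#include <cassert>   // only needed if you add checks like the ones described below
#include <string>
#include <vector>

// Simplified stand-in for a table in the prelocking list (assumption).
struct PrelockEntry
{
  std::string name;
  bool write_locked;   // the table is going to be modified
  bool has_triggers;   // the table defines at least one trigger
};

// Names of the tables whose triggers should be processed for prelocking:
// only tables that are both write-locked and have triggers qualify.
std::vector<std::string>
tables_needing_trigger_prelocking(const std::vector<PrelockEntry> &entries)
{
  std::vector<std::string> out;
  for (const PrelockEntry &e : entries)
    if (e.write_locked && e.has_triggers)
      out.push_back(e.name);
  return out;
}
```

A table that is only read because its DEFAULT value is used has write_locked set to false here, so its triggers are left out even though the table itself is prelocked.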
XAER_RMFAIL means the admin statement was not allowed. It is not a
per-table message, so it must fail the whole statement. And the statement
must not roll back, obviously: rollback is not allowed after prepare.
Also, remove duplicate XAER_RMFAIL in open_tables(),
check_has_uncommitted_xa() already issues it.
Issue:
MariaDB acquires additional MDL locks on UPDATE/INSERT/DELETE statements
on tables with foreign keys. For example, if table t1 references t2, an
UPDATE to t1 will MDL lock t2 in addition to t1.
A replica may deliver an ALTER t1 and an UPDATE t2 concurrently for
applying. Then the UPDATE may acquire an MDL lock on t1, followed by a
conflict when the ALTER attempts to MDL lock t1, causing a BF-BF
conflict.
Solution:
Additional keys for the referenced/foreign tables need to be added
to avoid potential MDL conflicts with concurrent updates and DDLs.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
Valgrind is single threaded and only changes threads as part of
system calls or waits.
Some busy loops were identified and fixed where the server assumes
that some other thread will change the state, which will not happen
with valgrind.
Based on patch by Monty. Original patch introduced VALGRIND_YIELD,
which emits pthread_yield() only in valgrind builds. However it was
agreed that it is a good idea to emit yield() unconditionally, such
that other affected schedulers (like SCHED_FIFO) benefit from this
change. Also avoid pthread_yield() in favour of standard
std::this_thread::yield().
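The pattern can be sketched as a busy loop that yields on every iteration. This is an illustrative example, not code from the patch: wait_until_set() and run_yield_demo() are hypothetical names, and the flag stands in for whatever state the server loop is waiting on.

```cpp
#include <atomic>
#include <cassert>   // only needed if you add checks like the ones described below
#include <thread>

// Spin until `flag` becomes true. Yielding on each iteration gives the
// thread that is expected to set the flag a chance to run, which matters
// when the scheduler will not preempt the spinning thread on its own.
void wait_until_set(const std::atomic<bool> &flag)
{
  while (!flag.load(std::memory_order_acquire))
    std::this_thread::yield();
}

// Start a thread that sets the flag, wait for it with the yielding loop,
// and return the final flag value.
bool run_yield_demo()
{
  std::atomic<bool> flag{false};
  std::thread setter([&flag] { flag.store(true, std::memory_order_release); });
  wait_until_set(flag);
  setter.join();
  return flag.load();
}
```

Calling run_yield_demo() returns true once the setter thread has run. Without the yield, the same loop relies on the scheduler to switch threads, which is the assumption that does not hold under valgrind.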
Get rid of the need for materialization for a usual INSERT (cache results
in Item_cache* if needed)
- subqueries in VALUES do not see new records in the table we are
inserting to
- subqueries in RETURNING are prohibited from using the table we are
inserting to
"t1 JOIN t2 USING(col1,...)" calls mark_common_columns() to mark the
listed columns as used in both used tables, t1 and t2.
Due to a typo bug, it would mark the wrong column in the second table
(t2): instead of t2.col1 it would mark the last column in t2.
The harmful effects included JOIN_TAB(t2)->covering_keys not being
set correctly. This changed the cost to access the table and then
caused different query plans depending on which table was the second
in the JOIN ... USING syntax.
let's unlock and relock hlindexes under LOCK TABLES.
an alternative would be to do start_stmt for them too,
but this is simpler and keeps them locked only for statements
that actually use them
Newer gcc reports:
error: 'rfield' may be used uninitialized [-Werror=maybe-uninitialized]
9041 | unwind_stored_field_offsets(fields, rfield);
After investigation, it turned out to be an impossible case:
1. The only way it could be broken is if the
if (!(field= fld->field_for_view_update()))
check succeeded on the first iteration.
2. Subsequent checks initialize rfield.
fld may return NULL in field_for_view_update() only for views.
3. Before fill_record, UPDATE first calls check_fields, where
field_for_view_update() result is already checked. INSERT calls
check_view_insertability that checks that all view fields are
updateable.
It all means that field_for_view_update() cannot be NULL in fill_record,
so the if can be converted to DBUG_ASSERT.
This essentially shifts the responsibility for the preliminary
field_for_view_update() check to the caller.
In this patch:
1. convert field_for_view_update() check to DBUG_ASSERT
2. harden unwind_stored_field_offsets function so that it can be used
even if field_for_view_update() is NULL
3. As a consequence, `field` is passed instead of `rfield` as a
terminator.
4. Initialize `field` to NULL to bypass a false-positive warning!
MDEV-34171 disabled the removal of indirect routines/tables after
recover_from_failed_open() for the auto-create partition case. Now we
go further and keep them for any failed table reopen.
MDEV-34171 did not correctly handle open_and_process_routine() after
skipping sp_remove_not_own_routines(). This is now fixed by using
sroutine_to_open correctly.