specific temporary errors
The optimistic parallel slave's worker thread could face a run-time error due to
the algorithm's specifics which allows for conflicts like the reported
"Can't find record in 'table'".
A typical stack is like
{noformat}
#0 handler::print_error (this=0x61c00008f8a0, error=149, errflag=0) at handler.cc:3650
#1 0x0000555555e95361 in write_record (thd=thd@entry=0x62a0000a2208, table=table@entry=0x61f00008ce88, info=info@entry=0x7fffdee356d0) at sql_insert.cc:1944
#2 0x0000555555ea7767 in mysql_insert (thd=thd@entry=0x62a0000a2208, table_list=0x61b00012ada0, fields=..., values_list=..., update_fields=..., update_values=..., duplic=<optimized out>, ignore=<optimized out>) at sql_insert.cc:1039
#3 0x0000555555efda90 in mysql_execute_command (thd=thd@entry=0x62a0000a2208) at sql_parse.cc:3927
#4 0x0000555555f0cc50 in mysql_parse (thd=0x62a0000a2208, rawbuf=<optimized out>, length=<optimized out>, parser_state=<optimized out>) at sql_parse.cc:7449
#5 0x00005555566d4444 in Query_log_event::do_apply_event (this=0x61200005b9c8, rgi=<optimized out>, query_arg=<optimized out>, q_len_arg=<optimized out>) at log_event.cc:4508
#6 0x00005555566d639e in Query_log_event::do_apply_event (this=<optimized out>, rgi=<optimized out>) at log_event.cc:4185
#7 0x0000555555d738cf in Log_event::apply_event (rgi=0x61d0001ea080, this=0x61200005b9c8) at log_event.h:1343
#8 apply_event_and_update_pos_apply (ev=ev@entry=0x61200005b9c8, thd=thd@entry=0x62a0000a2208, rgi=rgi@entry=0x61d0001ea080, reason=<optimized out>) at slave.cc:3479
#9 0x0000555555d8596b in apply_event_and_update_pos_for_parallel (ev=ev@entry=0x61200005b9c8, thd=thd@entry=0x62a0000a2208, rgi=rgi@entry=0x61d0001ea080) at slave.cc:3623
#10 0x00005555562aca83 in rpt_handle_event (qev=qev@entry=0x6190000fa088, rpt=rpt@entry=0x62200002bd68) at rpl_parallel.cc:50
#11 0x00005555562bd04e in handle_rpl_parallel_thread (arg=arg@entry=0x62200002bd68) at rpl_parallel.cc:1258
{noformat}
Here {{handler::print_error}} computes whether to error log the
current error when --log-warnings > 1. The decision flag is consulted
bu {{my_message_sql()}} which can be eventually called.
In the bug case the decision is to log.
However in the optimistic mode slave applier case any conflict is
attempted to resolve with rollback and retry to success. Hence the
logging is at least extraneous.
The case is fixed with adding a new flag {{ME_LOG_AS_WARN}} which
{{handler::print_error}} may propagate further on through {{my_error}}
when the error comes from an optimistically running slave worker thread.
The new flag effectively requests the warning level for the errlog record,
while the thread's DA records the actual error (which is regarded as temporary one
by the parallel slave error handler).
virtual Item_null_result::get_date() was not overridden.
It used the inherited Item::get_date(), which tests field_type(),
which in case of Item_null_result calls result_field->field_type(),
and result_field is not really always set (e.g. it's not set in the
test case from the bug report).
Overriding Item_null::get_date() like it's done for other val_xxx() methods.
This make the code more symmetric across data types.
In the new reduction, get_date() immediately returns NULL without entering
into any data type specific code.
subquery when ORDER BY is present
Currently for setting r_limit we divide with the number of iterations we invoke the dependent subquery.
This is not needed for the case of limit. For varying limits we produce the output that the limit varies with
execution.
Also there is a type for filtered , we forgot to multiply by 100 as it is represented as a percent.
materialization scan over materialization lookup
For non-mergeable semi-joins we don't store the estimates of the IN subquery in table->file->stats.records.
In the function TABLE_LIST::fetch_number_of_rows, we store the number of rows in the tables
(estimates in case of derived table/views).
Currently we don't store the estimates for non-mergeable semi-joins, which leads to a problem of selecting
materialization scan over materialization lookup.
Fixed this by storing these estimated appropriately
For non-semi-join subquery optimization we do a cost based decision between
Materialisation and IN -> EXIST transformation. The issue in this case is that for IN->EXIST transformation
we run JOIN::reoptimize with the IN->EXISt conditions and we come up with a new query plan. But when we compare
the cost with Materialization, we make the decision to chose Materialization so we need to restore the query plan
for Materilization.
The saving and restoring for keyuse array and join_tab keyuse is only done when we have atleast
one element in the keyuse_array , we are now changing to do it even for 0 elements to main the generality.
Crash happened because in discover, table->work_part_info was not properly
reset before execution.
Fixed by resetting before calling execute alter table, create table or
mysql_create_frm_image.
MDEV-16123 ASAN heap-use-after-free handler::ha_index_or_rnd_end
MDEV-13828 Segmentation fault on RENAME TABLE
Problem was that destructor called methods for closed table.
Fixed by removing code in destructor.
Problem was that detection of temporary tables was all wrong for
RENAME TABLE.
(Temporary tables where opened by top level call to
open_temporary_tables(), which can't detect if a temporary table
was renamed to something and then reused).
Fixed by adding proper parsing of rename list to check against
the current name of a table at each rename stage.
Also change do_rename_temporary() to check against the current
state of temporary tables, not according to the state of start
of RENAME TABLE.
MDEV-10130 Assertion `share->in_trans == 0' failed in storage/maria/ma_close.c
MDEV-10378 Assertion `trn' failed in virtual int ha_maria::start_stmt
The problem was that maria_handler->trn was not properly reset
at commit/rollback and ha_maria::exernal_lock() could get confused
because.
There was some old code in ha_maria::implicit_commit() that tried
to take care of this, but it was not bullet proof.
Fixed by adding list of all tables that is part of the maria transaction to
TRN.
A nice side effect was of the fix is that loops in
ha_maria::implict_commit() got to be much simpler.
Other things:
- Fixed a bug in mysql_admin_table() where argument open_for_modify
was wrongly reset for the next table in the chain
- rollback admin command also in case of fatal error.
- Split _ma_set_trn_for_table() to three version to simplify code
and debugging.
- Several new asserts to detect the original problem (that file was
not properly removed from trn before calling ma_close())
upon SELECT .. LIMIT 0
The code must differentiate between a SELECT with contradictory
WHERE/HAVING and one with LIMIT 0.
Also for the latter printed 'Zero limit' instead of 'Impossible where'
in the EXPLAIN output.
order with Galera and encrypt-tmp-files=1
Problem:- If trans_cache (IO_CACHE) uses encrypted tmp file
then on next DML server will crash.
Case:-
Lets take a case , we have a table t1 , We try to do 2 inserts in t1
1. A really long insert so that trans_cache has to use temp_file
2. Just a small insert
Analysis:- Actually server crashes from inside of galera
library.
/lib64/libc.so.6(abort+0x175)[0x7fb5ba779dc5]
/usr/lib64/galera/libgalera_smm.so(_ZN6galera3FSMINS_9TrxHandle5State...
mysys/stacktrace.c:247(my_print_stacktrace)[0x7fb5a714940e]
sql/signal_handler.cc:160(handle_fatal_signal)[0x7fb5a715c1bd]
sql/wsrep_hton.cc:257(wsrep_rollback)[0x7fb5bcce923a]
sql/wsrep_hton.cc:268(wsrep_rollback)[0x7fb5bcce9368]
sql/handler.cc:1658(ha_rollback_trans(THD*, bool))[0x7fb5bcd4f41a]
sql/handler.cc:1483(ha_commit_trans(THD*, bool))[0x7fb5bcd4f804]
but actual issue is not in galera but in mariadb, because for 2nd
insert we should never call rollback. We are calling rollback because
log_and_order fails it fails because write_cache fails , It fails
because after reinit_io_cache(trans_cache) , my_b_bytes_in_cache says 0
so we look into tmp_file for data , which is obviously wrong since temp
was used for previous insert and it no longer exist.
wsrep_write_cache_inc() reads the IO_CACHE in a loop, filling it with
my_b_fill() until it returns "0 bytes read". Later
MYSQL_BIN_LOG::write_cache() does the same. wsrep_write_cache_inc()
assumes that reading a zero bytes past EOF leaves the old data in the
cache
Solution:- There is two issue in my_b_encr_read
1st we should never equal read_end to info->buffer. I mean this
does not make sense read_end should always point to end of buffer.
2nd For most of the case(apart from async IO_CACHE) info->pos_in_file
should be equal to info->buffer position wrt to temp file , since
in this case we are not changing info->buffer it should remain
unchanged.
Fixed by extending unique_table() with a flag to not allow usage of
the replaced table.
I also cleaned up find_dup_table() to not use goto next.
I also added more comments to the code in find_dup_table()
Problem was that if copy_data_between_tables() didn't do proper
clean up in case of failures:
- copy object was not properly freed
- end_bulk_insert() was not called
- mysql_trans_prepare_alter_copy_data() set THD->transaction.on to
false which was not properly restored
The last part caused a crash in Aria as Aria depends on that THD
is correct.
Other things:
- Reset info->switched_transactional after usage (safety)
- Reset bulk_insert_single_undo (safety)
The test was not deterministic and would occasionally fail, due to the
use of `sleep`.
This patch is a complete rewrite of the test using proper sync points.
multiple times with different arguments.
If the ON expression of an outer join is an OR formula with one
of the disjunct being a constant formula then the expression
cannot be null-rejected if the constant formula is true. Otherwise
it can be null-rejected and if so the outer join can be converted
into inner join. This optimization was added in the patch for
mdev-4817. Yet the code had a defect: if the query was used in
a stored procedure with parameters and the constant item contained
some of them then the value of this constant item depended on the
values of the parameters. With some parameters it may be true,
for others not. The validity of conversion to inner join is checked
only once and it happens only for the first call of procedure.
So if the parameters in the first call allowed the conversion it
was done and next calls used the transformed query though there
could be calls whose parameters made the conversion invalid.
Fixed by cheking whether the constant disjunct in the ON expression
originally contained an SP parameter. If so the expression is not
considered as null-rejected. For this check a new item's attribute
was intruduced: Item::with_param. It is calculated for each item
by fix fields() functions.
Also moved the call of optimize_constant_subqueries() in
JOIN::optimize after the call of simplify_joins(). The reason
for this is that after the optimization introduced by the patch
for mdev-4817 simplify_joins() can use the results of execution
of non-expensive constant subqueries and this is not valid.
The problem was that SJ (semi-join) used secondary list (array) of subquery select list. The items there was prepared once then cleaned up (but not really freed from memory because it was made in statement memory).
Original list was not prepared after first execution because select was removed by conversion to SJ.
The solution is to use original list but prepare it first.
These test can sporadically show mutex deadlock warnings between LOCK_wsrep_thd
and LOCK_thd_data mutexes. This means that these mutexes can be locked in opposite
order by different threads, and thus result in deadlock situation.
To fix such issue, the locking policy of these mutexes should be revised and
enforced to be uniform. However, a quick code review shows that the number of
lock/unlock operations for these mutexes combined is between 100-200, and all these
mutex invocations should be checked/fixed.
On the other hand, it turns out that LOCK_wsrep_thd is used for protecting access to
wsrep variables of THD (wsrep_conflict_state, wsrep_query_state), whereas LOCK_thd_data
protects query, db and mysys_var variables in THD. Extending LOCK_thd_data to protect
also wsrep variables looks like a viable solution, as there should not be a use case
where separate threads need simultaneous access to wsrep variables and THD data variables.
In this commit LOCK_wsrep_thd mutex is refactored to be replaced by LOCK_thd_data.
By bluntly replacing LOCK_wsrep_thd by LOCK_thd_data, will result in double locking
of LOCK_thd_data, and some adjustements have been performed to fix such situations.
followup for a3c980b381
same change in Locked_tables_list::unlink_from_list(), otherwise
thd->locked_tables_list will keep pointers to a free'd TABLE if
prelocked under lock tables.
This fixes a crash in main.create_or_replace on debug Win builds
after bcb36ee21e