Don't let SET PASSWORD to set the password, if auth_string is set.
Now SET PASSWORD always sets the plugin/auth_string fields and clears
the password field (on pre-plugin mysql.user table it works as before).
specific temporary errors
The optimistic parallel slave's worker thread could face a run-time error due to
the algorithm's specifics which allows for conflicts like the reported
"Can't find record in 'table'".
A typical stack is like
{noformat}
#0 handler::print_error (this=0x61c00008f8a0, error=149, errflag=0) at handler.cc:3650
#1 0x0000555555e95361 in write_record (thd=thd@entry=0x62a0000a2208, table=table@entry=0x61f00008ce88, info=info@entry=0x7fffdee356d0) at sql_insert.cc:1944
#2 0x0000555555ea7767 in mysql_insert (thd=thd@entry=0x62a0000a2208, table_list=0x61b00012ada0, fields=..., values_list=..., update_fields=..., update_values=..., duplic=<optimized out>, ignore=<optimized out>) at sql_insert.cc:1039
#3 0x0000555555efda90 in mysql_execute_command (thd=thd@entry=0x62a0000a2208) at sql_parse.cc:3927
#4 0x0000555555f0cc50 in mysql_parse (thd=0x62a0000a2208, rawbuf=<optimized out>, length=<optimized out>, parser_state=<optimized out>) at sql_parse.cc:7449
#5 0x00005555566d4444 in Query_log_event::do_apply_event (this=0x61200005b9c8, rgi=<optimized out>, query_arg=<optimized out>, q_len_arg=<optimized out>) at log_event.cc:4508
#6 0x00005555566d639e in Query_log_event::do_apply_event (this=<optimized out>, rgi=<optimized out>) at log_event.cc:4185
#7 0x0000555555d738cf in Log_event::apply_event (rgi=0x61d0001ea080, this=0x61200005b9c8) at log_event.h:1343
#8 apply_event_and_update_pos_apply (ev=ev@entry=0x61200005b9c8, thd=thd@entry=0x62a0000a2208, rgi=rgi@entry=0x61d0001ea080, reason=<optimized out>) at slave.cc:3479
#9 0x0000555555d8596b in apply_event_and_update_pos_for_parallel (ev=ev@entry=0x61200005b9c8, thd=thd@entry=0x62a0000a2208, rgi=rgi@entry=0x61d0001ea080) at slave.cc:3623
#10 0x00005555562aca83 in rpt_handle_event (qev=qev@entry=0x6190000fa088, rpt=rpt@entry=0x62200002bd68) at rpl_parallel.cc:50
#11 0x00005555562bd04e in handle_rpl_parallel_thread (arg=arg@entry=0x62200002bd68) at rpl_parallel.cc:1258
{noformat}
Here {{handler::print_error}} computes whether to error log the
current error when --log-warnings > 1. The decision flag is consulted
bu {{my_message_sql()}} which can be eventually called.
In the bug case the decision is to log.
However in the optimistic mode slave applier case any conflict is
attempted to resolve with rollback and retry to success. Hence the
logging is at least extraneous.
The case is fixed with adding a new flag {{ME_LOG_AS_WARN}} which
{{handler::print_error}} may propagate further on through {{my_error}}
when the error comes from an optimistically running slave worker thread.
The new flag effectively requests the warning level for the errlog record,
while the thread's DA records the actual error (which is regarded as temporary one
by the parallel slave error handler).
Problem was that detection of temporary tables was all wrong for
RENAME TABLE.
(Temporary tables where opened by top level call to
open_temporary_tables(), which can't detect if a temporary table
was renamed to something and then reused).
Fixed by adding proper parsing of rename list to check against
the current name of a table at each rename stage.
Also change do_rename_temporary() to check against the current
state of temporary tables, not according to the state of start
of RENAME TABLE.
Replicated transaction extra gtid statement on slave failed to specify
an engine gtid_slave_pos name correctly. In case lower-case-table-names > 0
the InnoDB table name was generated to reproduce the lower-case-table-names=0 version
which is of mixed cases.
In rpl.rpl_mdev12179 test run this triggered a failure to DROP table which
was due to the innodb table handle was not closed:
InnoDB: Waited XYZ seconds for ref-count on table: `mysql`.`gtid_slave_pos_innodb`
on windows.
The closing issue was caused by having the table registered twice in the table cache,
for its lower- and mixed- case name versions. The DROP-table handler closed only
only one of the cache item to leave the 2nd one active.
(On Linux a failure occurs earlier at attempt to open an expected lower-cased table:
Last_Error: Error during XID COMMIT: failed to update GTID state in mysql.gtid_slave_pos: 1146: Table 'mysql.gtid_slave_pos_InnoDB' doesn't exist
but the table's name as the message shows is not in the right case).
Fixed with consulting lower-case-table-names when the engine gtid-slave-pos table
is created.
Note the lower-case-table-names=a-value created table will not recognized when next
the lower case option changes to a different value.
In 10.4 a follow-up patch is going to lowercase gtid-slave-pos autocreated table
at once at their origination, and a warning is issued in the 10.3 current patch.
For some reason, two replication tests started failing intermittently
after commit 442a6e6b25
changed the InnoDB version number from 5.7.21 to 10.3.7.
There is no obvious reference to INNODB_VERSION or
PLUGIN_AUTH_VERSION or PLUGIN_LIBRARY_VERSION in the replication tests.
The code in Type_handler_blob****::make_conversion_table_field()
erroneously assumed that row format replication uses
MYSQL_TYPE_TINYBLOB, MYSQL_TYPE_BLOB, MYSQL_TYPE_MEDIUMBLOB,
MYSQL_TYPE_LONGBLOB type codes to tranfer BLOB variations.
In fact, all BLOB variations use MYSQL_TYPE_BLOB as the type
code, while the BLOB packlength (1,2,3 or 4) it tranferred
in metadata.
The bug was introduced by aee068085d
(MDEV-9238 Wrap create_virtual_tmp_table() into a class, split into different steps)
out of order at retry
The test failures were of two sorts. One is that the number of retries
what the slave thought as a temporary error exceeded
the default value of the slave retry option.
The 2nd issue was an out of order commit by transactions that
were supposed to error out instead.
Both issues are caused by the same reason that the post-temporary-error
retry did not check possibly already existing error status.
This is mended with refining conditions to retry. Specifically, a retrying
worker checks `rpl_parallel_entry::stop_on_error_sub_id` that
a potential failing predecessor could set to its own sub id.
Now should the member be set the retrying follower errors out with
ER_PRIOR_COMMIT_FAILED.
replicate_events_marked_for_skip=FILTER_ON_MASTER
[Note this is a cherry-pick from 10.2 branch.]
When events of a big transaction are binlogged offsetting over 2GB from
the beginning of the log the semisync master's dump thread
lost such events.
The events were skipped by the Dump thread that found their skipping
status erroneously.
The current fixes make sure the skipping status is computed correctly.
The test verifies them simulating the 2GB offset.
The test was used to result in mismatch due to unaccounted specifics
of the master-slave handshake protocol that sets the Slave_IO_Running
status to true while the semisync master status is set to active a bit later.
The test is refined to expect that.
- Galera tests that was not updated with connection change
messages
- Test where out of memory error was changed (We are now using the
standard out of memory error in most places)
- Removed tokudb tests that uses include files that doesn't exist
in MariaDB
- Removed not supported mariadb startup option from option file
replicate_events_marked_for_skip=FILTER_ON_MASTER
When events of a big transaction are binlogged offsetting over 2GB from
the beginning of the log the semisync master's dump thread
lost such events.
The events were skipped by the Dump thread that found their skipping
status erroneously.
The current fixes make sure the skipping status is computed correctly.
The test verifies them simulating the 2GB offset.
Problems --------
The slave io thread did not conduct integrity check
for a group of row-based events. Specifically it tolerates missed
terminal block event that must be flagged with STMT_END. Failure to
react on its loss can confuse the applier thread in various ways.
Another potential issue was that there were no check of impossible
second in row Gtid-log-event while the slave io thread is receiving
to be skipped events after reconnect.
Fixes
-----
The slave io thread is made by this patch to track the rows event
STMT_END status.
Whenever at next event reading the IO thread finds out that a preceding
Rows event did not actually had the flag, an
explicit error is issued.
Replication can be resumed after the source of failure is eliminated,
see a provided test.
Note that currently the row-based group integrity check excludes
the compressed version 2 Rows events (which are not generated by MariaDB
master).
Its uncompressed counterpart is manually tested.
The 2nd issue is covered to produce an error in case the io thread
receives a successive Gtid_log_event while it is post-reconnect
skipping.
trx_undo_page_report_modify(): For SPATIAL INDEX, keep logging
updated off-page columns twice, so that
the minimum bounding rectangle (MBR) will be logged.
Avoiding the redundant logging would require larger changes
to the undo log format.
row_build_index_entry_low(): Handle SPATIAL_UNKNOWN more robustly,
by refusing to purge the record from the spatial index.
We can get this code when processing old undo log from 10.2.10 or
10.2.11 (the releases affected by MDEV-14799, which was a regression
from MDEV-14051).
* don't use Env module in tests, use $ENV{xxx} instead
* collateral changes:
** $file in the error message was unset
** $file in the other error message was unset too :)
** source file arguments are conventionally upper-cased
** abort the test (die) on error, don't just echo/exit
one) and affected result files.
Specifically to rpl_semi_sync_after_sync*, the changed results reflect
a fact that thanks to fixes in the dump thread functionality
there's no longer zombie thread to kill neither such thread represent
a semisync client (so the counter drops).
and specifically the ack receiving functionality.
Semisync is turned to be static instead of plugin so its functions
are invoked at the same points as RUN_HOOKS.
The RUN_HOOKS and the observer interface remain to be removed by later
patch.
Todo:
React on killed status by repl_semisync_master.wait_after_sync(). Currently
Repl_semi_sync_master::commit_trx does not check the killed status.
There were few bugfixes found that are present in mysql and its unclear
whether/how they are covered. Those include:
Bug#15985893: GTID SKIPPED EVENTS ON MASTER CAUSE SEMI SYNC TIME-OUTS
Bug#17932935 CALLING IS_SEMI_SYNC_SLAVE() IN EACH FUNCTION CALL
HAS BAD PERFORMANCE
Bug#20574628: SEMI-SYNC REPLICATION PERFORMANCE DEGRADES WITH A HIGH NUMBER OF THREADS
Part of MDEV-13073 AliSQL Optimize performance of semisync
Did the following renames to match other similar variables
key_ss_mutex_LOCK_binlog_ > key_LOCK_bing
key_ss_cond_COND_binlog_send_ -> key_COND_binlog_send
COND_binlog_send_ -> COND_binlog_send
LOCK_binlog_ -> LOCK_binlog
debian/mariadb-server-10.2.install does not install semisync libs.