Do not allow setting wsrep_sst_donor to NULL, as it is an
invalid value. Users can use the value '' (the default), which has
the same meaning as NULL. Setting wsrep_cluster_address to NULL is
already handled correctly.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
- Description:
- Before 10.3.8 semisync was a plugin; with MDEV-13073 it was built
into the server, starting with commit cbc71485e2.
There are still some usages of `rpl_semi_sync_master` in mtr.
Note:
- To be recognized in the `dump_thread`, the replica creates a
local variable `rpl_semi_sync_slave` (the plugin keyword) in
function `request_transmit`, which is caught by the primary in
`is_semi_sync_slave()`. This is a user variable and as such is not
related to the obsolete plugin.
- Usage of the plugins `rpl_semi_sync_master` and `rpl_semi_sync_slave`
was found in the `sys_vars.all_vars` and `rpl_semi_sync_wait_point` tests.
The former test is disabled by default (`sys_vars/disabled.def`)
and marked as `obsolete`; this patch nevertheless removes the queries.
- Add cosmetic fixes to semisync codebase
Reviewer: <brandon.nesterenko@mariadb.com>
Closes PR #2528, PR #2380
Let us disable Valgrind on tests that would fail because a
server shutdown or a STOP SLAVE command would take longer,
causing the test harness to forcibly and silently kill the server
due to an exceeded timeout.
There are separate flags DBUG_OFF for disabling the DBUG facility
and ENABLED_DEBUG_SYNC for enabling the DEBUG_SYNC facility.
Let us allow debug builds without DEBUG_SYNC.
Note: For CMAKE_BUILD_TYPE=Debug, CMakeLists.txt will continue to
define ENABLED_DEBUG_SYNC.
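The independence of the two flags can be sketched as follows. This is only an illustration: the constants `kDbugEnabled` and `kDebugSyncEnabled` and the function `build_flavor()` are hypothetical names that do not exist in the server sources, which test the macros directly.

```cpp
// Hypothetical illustration: DBUG_OFF and ENABLED_DEBUG_SYNC are tested
// independently, so each facility can be enabled or disabled on its own.
#ifdef DBUG_OFF
constexpr bool kDbugEnabled = false;
#else
constexpr bool kDbugEnabled = true;
#endif

#ifdef ENABLED_DEBUG_SYNC
constexpr bool kDebugSyncEnabled = true;
#else
constexpr bool kDebugSyncEnabled = false;
#endif

// Describes the combination selected at compile time.
constexpr const char* build_flavor()
{
  return kDbugEnabled
    ? (kDebugSyncEnabled ? "DBUG with DEBUG_SYNC" : "DBUG without DEBUG_SYNC")
    : (kDebugSyncEnabled ? "DEBUG_SYNC without DBUG" : "neither facility");
}
```

Compiling this sketch with neither macro defined selects the "DBUG without DEBUG_SYNC" combination, which is the kind of debug build this change allows.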
Test fixes:
Since the fix for CONC-603 (wrong error handling in TLS read/write), the
client no longer always returns error 2013 (lost connection to server)
on a read/write error, so we also need to check for error 2026
(TLS/SSL error) and 5014 (write error).
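As a rough sketch of the widened check, the accepted set of client errors could be expressed like this. The constant and function names are hypothetical and chosen for this illustration only; just the numeric codes come from the text above.

```cpp
// Illustrative constants; only the numeric values come from the commit text.
constexpr int kErrConnection = 2013;
constexpr int kErrTls        = 2026;
constexpr int kErrNetWrite   = 5014;

// Returns true if `code` is one of the errors a test should accept
// after a read/write failure on the connection.
constexpr bool is_expected_io_error(int code)
{
  return code == kErrConnection ||
         code == kErrTls ||
         code == kErrNetWrite;
}

static_assert(is_expected_io_error(2026), "TLS/SSL error is accepted");
static_assert(!is_expected_io_error(0), "success is not an accepted error");
```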
`m_status == DA_ERROR' failed on SELECT after setting tmp_disk_table_size.
Analysis: The mismatch between "194 warnings" and "64 rows in set"
is due to the max_error_count variable, which has a default
value of 64.
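The arithmetic behind the mismatch can be sketched as a simple cap. The function name `stored_diagnostics` is hypothetical; it only models the effect of the limit, not the server's diagnostics area.

```cpp
#include <cstddef>

// Number of diagnostics kept for display, given how many were raised
// and the max_error_count limit (a simplified model, not server code).
constexpr std::size_t stored_diagnostics(std::size_t raised,
                                         std::size_t max_error_count)
{
  return raised < max_error_count ? raised : max_error_count;
}

// 194 warnings are counted, but only 64 can be listed with the default limit.
static_assert(stored_diagnostics(194, 64) == 64, "capped at the limit");
```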
As for the corrupted tables: the error caused by an insufficient
tmp_disk_table_size is not reported correctly, and we continue to
execute the statement. Because the earlier error (about the table being
full) is not reported correctly, it moves up the stack and is
wrongly reported as a parsing error later on, while parsing the frm file
of one of the information schema tables. This parsing error produces the
corrupted table error.
As for the InnoDB error, it occurs even when tmp_disk_table_size is
sufficient (the default), but the internal error handler takes care of it
and the error is not shown. When tmp_disk_table_size is insufficient,
the fatal error that was not reported correctly moves up the stack, so the
internal error handler is not called and the errors are shown.
Fix: Report the error correctly.
We will remove the parameter innodb_disallow_writes because it is badly
designed and implemented. The parameter was never allowed at startup.
It was only internally used by Galera snapshot transfer.
If a user executed
SET GLOBAL innodb_disallow_writes=ON;
the server could hang even on subsequent read operations.
During Galera snapshot transfer, we will block writes
to implement an rsync friendly snapshot, as follows:
sst_flush_tables() will acquire a global lock by executing
FLUSH TABLES WITH READ LOCK, which will block any writes
at the high level.
sst_disable_innodb_writes(), invoked via ha_disable_internal_writes(true),
will suspend or disable InnoDB background tasks or threads that could
initiate writes. As part of this, log_make_checkpoint() will be invoked
to ensure that anything in the InnoDB buf_pool.flush_list will be written
to the data files. This has the nice side effect that the Galera joiner
will avoid crash recovery.
The changes to sql/wsrep.cc and to the tests are based on a prototype
that was developed by Jan Lindström.
Reviewed by: Jan Lindström
- Make innodb_ft_cache_size and innodb_ft_total_cache_size dynamic
variables, increase the maximum value of innodb_ft_cache_size to
512MB on 32-bit systems and 1 TB on 64-bit systems, and set the
maximum value of innodb_ft_total_cache_size to 1 TB on 64-bit systems.
- Print a warning if the FTS cache exceeds innodb_ft_cache_size,
and unlock the cache once FTS cache memory drops below
innodb_ft_cache_size.
The easiest way to compile and test the server with UBSAN is to run:
./BUILD/compile-pentium64-ubsan
and then run mysql-test-run.
After this commit, one should be able to run this without any UBSAN
warnings. There are still a few compiler warnings that should be fixed
at some point, but these do not expose any real bugs.
The 'special' cases where we disable, suppress or circumvent UBSAN are:
- ref10 source (as here we intentionally do some shifts that UBSAN
complains about).
- x86 version of optimized int#korr() methods. UBSAN does not like unaligned
memory access of integers. Fixed by using byte_order_generic.h when
compiling with UBSAN.
- We use a smaller thread stack with ASAN and UBSAN, which forced me to
disable a few tests that print the thread stack size.
- Verifying class types does not work for shared libraries. I added
suppression in mysql-test-run.pl for this case.
- Added '#ifdef WITH_UBSAN' when using integer arithmetic where it is
safe to have overflows (two cases, in item_func.cc).
Things fixed:
- Don't left shift signed values
(byte_order_generic.h, mysqltest.c, item_sum.cc and many more)
- Don't assign non-existing values to enum variables.
- Ensure that bool and enum values are properly initialized in
constructors. This was needed as UBSAN checks that these types have
correct values when one copies an object.
(gcalc_tools.h, ha_partition.cc, item_sum.cc, partition_element.h ...)
- Ensure we do not call handler functions on unallocated objects or
deleted objects.
(events.cc, sql_acl.cc).
- Fixed bugs in Item_sp::Item_sp() where we did not call the constructor
on the Query_arena object.
- Fixed several casts of objects to an incompatible class!
(Item.cc, Item_buff.cc, item_timefunc.cc, opt_subselect.cc, sql_acl.cc,
sql_select.cc ...)
- Ensure we do not do integer arithmetic that causes overflows or
underflows. This also includes ++ and -- of integers.
(Item_func.cc, Item_strfunc.cc, item_timefunc.cc, sql_base.cc ...)
- Added JSON_VALUE_UNITIALIZED to json_value_types and ensure that
value_type is initialized to this instead of to -1, which is not a valid
enum value for json_value_types.
- Ensure we do not call memcpy() when second argument could be null.
- Fixed that Item_func_str::make_empty_result() creates an empty string
instead of a null string (safer as it ensures we do not do arithmetic
on null strings).
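The "don't left shift signed values" item can be illustrated with a minimal sketch. This is not the code from byte_order_generic.h; `load_le32` is a hypothetical helper that shows the general pattern of widening to an unsigned type before shifting.

```cpp
#include <cstdint>

// Combine four bytes (least significant first) into a 32-bit value.
// Each byte is converted to an unsigned 32-bit type before shifting.
// Shifting the promoted int instead (for example 0xFF << 24) would not
// fit in a signed 32-bit int, which is the kind of shift UBSAN reports.
constexpr std::uint32_t load_le32(std::uint8_t b0, std::uint8_t b1,
                                  std::uint8_t b2, std::uint8_t b3)
{
  return  static_cast<std::uint32_t>(b0)        |
         (static_cast<std::uint32_t>(b1) << 8)  |
         (static_cast<std::uint32_t>(b2) << 16) |
         (static_cast<std::uint32_t>(b3) << 24);
}

static_assert(load_le32(0x78, 0x56, 0x34, 0x12) == 0x12345678u,
              "bytes are combined least significant first");
```

Because the function is `constexpr`, the compiler itself rejects undefined shifts when it is evaluated in a `static_assert`, which makes the unsigned casts easy to verify.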
Other things:
- Changed struct st_position to a class and added an initialization
function to it to ensure that we do not copy or use uninitialized
members. The change to a class was also motivated by the fact that we
used "struct st_position" and POSITION randomly throughout the code,
which was confusing.
- Notably big rewrite in sql_acl.cc to avoid using deleted objects.
- Changed sql_partition to use '^' instead of '-'. This is safe as
the second operand is either 0 or 0x8000000000000000ULL.
- Added check for select_nr < INT_MAX in JOIN::build_explain() to
avoid bug when get_select() could return NULL.
- Reordered elements in POSITION for better alignment.
- Changed sql_test.cc::print_plan() to use pointers instead of objects.
- Fixed bug in find_set() where we could execute '1 << -1'.
- Added variable have_sanitizer, used by mtr. (Previously this variable
existed only in 10.5 and up.) It can now have one of two values:
ASAN or UBSAN.
- Moved ~Archive_share() from ha_archive.cc to ha_archive.h and marked
it virtual. This was an effort to get UBSAN to work with loaded storage
engines. I kept the change as the new place is better.
- Added in CONNECT engine COLBLK::SetName(), to get around a wrong cast
in tabutil.cpp.
- Added HAVE_REPLICATION around usage of rgi_slave, to get embedded
server to compile with UBSAN. (Patch from Marko).
- Added #ifdef for powerpc64 to avoid a bug in old gcc versions related
to integer arithmetic.
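The claim in the sql_partition item, that '^' can stand in for '-' when the second operand is 0 or 0x8000000000000000ULL, can be checked with a small sketch. The helper names are hypothetical; the comparison is done on unsigned 64-bit values, where subtraction wraps modulo 2^64 and is therefore well defined.

```cpp
#include <cstdint>

constexpr std::uint64_t kSignBit = 0x8000000000000000ULL;

// Toggle the top bit with XOR; no overflow is possible.
constexpr std::uint64_t adjust_with_xor(std::uint64_t v, std::uint64_t mask)
{
  return v ^ mask;
}

// The same adjustment written as a subtraction. On unsigned values this
// wraps modulo 2^64; on a signed 64-bit value it could overflow.
constexpr std::uint64_t adjust_with_sub(std::uint64_t v, std::uint64_t mask)
{
  return v - mask;
}

static_assert(adjust_with_xor(5, 0) == adjust_with_sub(5, 0),
              "a zero operand leaves the value unchanged either way");
static_assert(adjust_with_xor(5, kSignBit) == adjust_with_sub(5, kSignBit),
              "both forms toggle only the top bit");
```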
Changes that should not be needed but had to be done to suppress warnings
from UBSAN:
- Added static_cast<uint16_t> around shift to get rid of a LOT of
compiler warnings when using UBSAN.
- Had to change some divisions ('/') by powers of 2 into shifts to get
rid of some compile-time warnings.
Reviewed by:
- Json changes: Alexey Botchkov
- Charset changes in ctype-uca.c: Alexander Barkov
- InnoDB changes & Embedded server: Marko Mäkelä
- sql_acl.cc changes: Vicențiu Ciorbaru
- build_explain() changes: Sergey Petrunia
* Disallow setting wsrep_on = 1 if wsrep_provider is unset. Also, move
wsrep_on_basic from sys_vars to wsrep suite: this test now requires
wsrep_provider to be set
* Disallow setting @@session.wsrep_on = 1 when @@global.wsrep_on = 0
* Handle the case where a new connection turns @@global.wsrep_on from
off to on. In this case we would miss a call to wsrep_open, causing
unexpected states in wsrep::client_state (causing assertions).
* Disable wsrep.MDEV-22443 because it is no longer possible to enable
wsrep_on, if server is started with wsrep_provider='none'
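The first two rules above can be modelled with a minimal sketch. `WsrepState` and both functions are hypothetical and exist only in this illustration; the real checks live in the server's system variable code and consult its own state.

```cpp
// Hypothetical model of the state the checks depend on.
struct WsrepState
{
  bool provider_set;  // wsrep_provider is set to an actual provider
  bool global_on;     // current value of @@global.wsrep_on
};

// wsrep_on may only be turned on globally if a provider is set.
constexpr bool may_set_global_wsrep_on(const WsrepState& s, bool new_value)
{
  return !new_value || s.provider_set;
}

// A session may only turn wsrep_on on if it is on globally.
constexpr bool may_set_session_wsrep_on(const WsrepState& s, bool new_value)
{
  return !new_value || s.global_on;
}

static_assert(!may_set_global_wsrep_on(WsrepState{false, false}, true),
              "no provider: enabling wsrep_on is rejected");
```

Turning the variable off is always allowed in this model; only the transition to ON is guarded.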
Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
The debug parameter innodb_simulate_comp_failures injected compression
failures for ROW_FORMAT=COMPRESSED tables, breaking the pre-existing
logic that I had implemented in the InnoDB Plugin for MySQL 5.1 to prevent
compressed page overflows. A much better check is already achieved by
defining UNIV_ZIP_COPY at the compilation time.
(Only UNIV_ZIP_DEBUG is part of cmake -DWITH_INNODB_EXTRA_DEBUG=ON.)
There were multiple problems here:
* wsrep_trx_fragment_size should not be set when wsrep is disabled or the provider is not loaded
* wsrep_trx_fragment_unit should not be set when wsrep is disabled or the provider is not loaded
* wsrep_debug has no effect if wsrep is disabled or the provider is not loaded
* wsrep_start_position should not be set to any value other than the default when wsrep is disabled or the provider is not loaded
* wsrep_start_position should be changed only when we are joiner or initialized
* wsrep_start_position should only accept a value that exists, thus
we need to add error handling to wsrep_sst_complete
The actual assertion mentioned in the MDEV seems to be already fixed, but
setting seqno to -2 will trigger a different assertion:
mysqld: /home/jan/mysql/10.4-bugs/wsrep-lib/src/server_state.cpp:702: void wsrep::server_state::sst_received(wsrep::client_service&, int): Assertion `state_ == s_joiner || state_ == s_initialized' failed.
Fixed this by not allowing the user to set seqno < -1 (-1 is a special
seqno meaning undefined, and seqno is initialized to it). MariaDB
releases 10.2 and 10.3 already disallow setting seqno < -1.
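The range rule for seqno amounts to a one-line check, sketched below. The names `kSeqnoUndefined` and `is_acceptable_seqno` are hypothetical and not taken from the server or wsrep-lib sources.

```cpp
// -1 is the special "undefined" seqno that the value is initialized to.
constexpr long long kSeqnoUndefined = -1;

// Accept the undefined marker and any non-negative seqno; reject
// everything below -1.
constexpr bool is_acceptable_seqno(long long seqno)
{
  return seqno >= kSeqnoUndefined;
}

static_assert(is_acceptable_seqno(-1), "undefined seqno is allowed");
static_assert(!is_acceptable_seqno(-2), "values below -1 are rejected");
```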
session_track_system_variables and max_relay_log_size.
Lock LOCK_global_system_variables around the get_one_variable() call
in Session_sysvars_tracker::store_variable().
Problem:
========
mysqltest: In included file "./include/assert.inc":
included from mysql-test/suite/sys_vars/t/rpl_init_slave_func.test at line 69:
Assertion text: '@@global.max_connections = @start_max_connections'
Assertion result: '0'
mysqltest: In included file "./include/assert.inc":
included from mysql-test/suite/sys_vars/t/rpl_init_slave_func.test at line 86:
Assertion text: '@@global.max_connections = @start_max_connections + 1'
Assertion result: '0'
Analysis:
=========
A slave SQL thread sets its Running state to Yes very early in its
initialisation, before the majority of initialisation actions, including
executing the init_slave command, are done. Thus the testcase has a race
condition where the initial replication setup might finish executing later
than the testcase SET GLOBAL init_slave, making the testcase see its effect
where it checks for its absence.
Fix:
====
Include 'sync_slave_sql_with_master.inc' at the beginning of the test to
ensure that the slave applier has completed the execution of the 'init_slave' command
and proceeded to event application. Replace the apparently needless RESET
MASTER / RESET SLAVE etc.
Patch is based on:
b91e2e6f90
Author: laurynas-biveinis
Changes to be committed:
modified: mysql-test/suite/sys_vars/r/wsrep_cluster_address_basic.result
modified: mysql-test/suite/sys_vars/t/wsrep_cluster_address_basic.test
The setting innodb_lock_schedule_algorithm=VATS that was introduced
in MDEV-11039 (commit 021212b525)
causes conflicting exclusive locks to be incorrectly granted to
two transactions. Specifically, in lock_rec_insert_by_trx_age()
the predicate !lock_rec_has_to_wait_in_queue(in_lock) would hold even
though an active transaction is already holding an exclusive lock.
This was observed between two DELETE of the same clustered index record.
The HASH_DELETE invocation in lock_rec_enqueue_waiting() may be related.
Due to lack of progress in diagnosing the problem, we will deprecate the
option and issue a warning that using it may corrupt data. The unsafe
option was enabled between
commit 0c15d1a6ff (MariaDB 10.2.3)
and the parent of
commit 1cc1d0429d (MariaDB 10.2.17, 10.3.9).
Example of the failure:
http://buildbot.askmonty.org/buildbot/builders/bld-p9-rhel7/builds/4417/steps/mtr/logs/stdio
```
main.mysqld--help 'unix' w17 [ fail ]
Test ended at 2020-06-20 18:51:45
CURRENT_TEST: main.mysqld--help
--- /opt/buildbot-slave/bld-p9-rhel7/build/mysql-test/main/mysqld--help.result 2020-06-20 16:06:49.903604179 +0300
+++ /opt/buildbot-slave/bld-p9-rhel7/build/mysql-test/main/mysqld--help.reject 2020-06-20 18:51:44.886766820 +0300
@@ -1797,10 +1797,10 @@
sync-relay-log-info 10000
sysdate-is-now FALSE
system-versioning-alter-history ERROR
-table-cache 421
+table-cache 2000
table-definition-cache 400
-table-open-cache 421
-table-open-cache-instances 1
+table-open-cache 2000
+table-open-cache-instances 8
tc-heuristic-recover OFF
tcp-keepalive-interval 0
tcp-keepalive-probes 0
mysqltest: Result length mismatch
```
mtr: table_open_cache_basic autosized:
Let's assume that >400 are available and that
we can set the result back to the start value.
All of these system variables are autosized and can
generate MTR output differences.
Closes #1527