New mtr option --skip-not-found makes mtr report tests that are not
found as skipped
main.a [ skipped ] not found
(but only if the test was specified with the suite name)
instead of erroring out early with
mysql-test-run: *** ERROR: Could not find 'a' in 'main' suite
This is useful in buildbot, on builders that generate the list
of tests dynamically.
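The behaviour described above can be sketched roughly as follows. This is only an illustration in Python, not the MTR code: resolve_test(), the suites mapping and the returned tuples are hypothetical names, and the error text for a test given without a suite name is an assumption.

```python
def resolve_test(spec, suites, skip_not_found=False):
    """Return (status, text) for one requested test, or raise if it is missing."""
    suite, dot, name = spec.partition(".")
    if not dot:
        # No suite name given: search every suite, never report as skipped.
        if any(spec in tests for tests in suites.values()):
            return ("run", spec)
        raise LookupError(f"Could not find '{spec}'")  # assumed wording
    if name in suites.get(suite, ()):
        return ("run", spec)
    if skip_not_found:
        # Only reached when the suite name was part of the test name.
        return ("skipped", f"{spec} [ skipped ] not found")
    raise LookupError(f"Could not find '{name}' in '{suite}' suite")


suites = {"main": {"b"}}
skipped = resolve_test("main.a", suites, skip_not_found=True)
```

A builder that generates its test list dynamically would then see the missing test in the report instead of an aborted run.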
The problem is in manager/worker communication when a worker sends
WARNINGS and then TESTRESULT. If the manager has not yet read the
WARNINGS response, both responses end up in the same buffer. can_read()
will then indicate that we have data only once, so we must read all the
data from the socket at once. Otherwise the TESTRESULT response is lost
and the manager waits for it forever.
The fix now reads the socket in a loop instead of reading a single
line. But if there is only one response in the buffer, the second read
will block until new data arrives. That can be overcome with
blocking(0), which sets the handle into non-blocking mode. If there is
no data, the second read just returns undef.
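A minimal sketch of the same idea, written in Python rather than Perl and assuming a Unix socket pair: the reader switches the socket to non-blocking mode and keeps reading until nothing is buffered. read_all_lines() is a hypothetical helper, and the two one-word messages stand in for the real responses.

```python
import socket


def read_all_lines(sock):
    """Drain everything currently buffered on sock without blocking."""
    sock.setblocking(False)  # Python counterpart of Perl's blocking(0)
    chunks = []
    while True:
        try:
            data = sock.recv(4096)
        except BlockingIOError:
            break  # nothing more is buffered right now
        if not data:
            break  # peer closed the connection
        chunks.append(data)
    return b"".join(chunks).decode().splitlines()


worker, manager = socket.socketpair()
worker.sendall(b"WARNINGS\n")
worker.sendall(b"TESTRESULT\n")  # arrives before the manager has read anything
lines = read_all_lines(manager)
worker.close()
manager.close()
```

Both responses are returned by a single call, so the second one is not left behind in the buffer after the readiness notification has been consumed.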
The problem is that non-blocking mode is not supported by all Perl
flavors on Windows. Strawberry and ActiveState do not support it;
Cygwin and MSYS2 do. There is an ioctl() hack that was known to "work",
but it does not do what is expected (it does not return data when
there is data). So on Windows, unless it is Cygwin, we disable the fix.
Cygwin is more Unix-oriented. It does not treat \n as \r\n in regexps
(fixed by \R), and it supplies Unix-style paths (fixed by
mixed_path()). It does some cleanup on paths when running an exe, so
the path will differ in the exe output (e.g. for $exe_mysqld,
comparing basename() is enough).
Cygwin installation
1. Just install latest perl version (only base package) and
patchutils from cygwin-setup;
2. Don't forget to add c:\cygwin64\bin into system path
before any other perl flavors;
3. There is a path-style conflict (see below), so you must replace
c:\cygwin64\bin\sh.exe with the wrapper. Run MTR with
--cygwin-subshell-fix=do for that. Make sure you are running Cygwin
perl for the option to work.
4. Restart buildbot via net stop buildbot; net start buildbot
Path-style conflict of Cygwin-ish Perl
Some exe paths are passed to mysqltest, which executes them by a
native call. This requires native-style paths (\-style). These exe
paths are also executed by Perl itself: either by MTR, which is not so
critical, or by the tests' --perl blocks, which are impossible to
change. If Perl detects shell expansion or uses a pipe command, it
passes the exe path to /bin/sh, which is a Cygwin-compiled bash that
cannot work with \-style paths (at least in -c processing). Thus we
require \-style in some parts of MTR execution and /-style in others.
The examples of tests which cover these different parts are:
main.mysqlbinlog_row_compressed \
main.sp_trans_log
It would be great to force Perl to use something other than /bin/sh,
but unfortunately /bin/sh is compiled into the binary. So the only
solution left is to overwrite /bin/sh with a wrapper script that
passes the command to cmd.exe instead of bash.
See "Path-style conflict" in "MDEV-30836 MTR Cygwin fix" for explanation.
To install subshell fix use --cygwin-subshell-fix=do
To uninstall use --cygwin-subshell-fix=remove
This works only from a Cygwin environment. As long as the perl on PATH
is from Cygwin, you are in a Cygwin environment. Check it with
perl --version
This is perl 5, version 36, subversion 1 (v5.36.1) built for
x86_64-cygwin-threads-multi
run_test_server() is actually the manager main loop. We move this
function out into the Manager package and split it into run() and
parse_protocol(). The latter is needed for the fix. Moving it into a
separate package helps to make common some variables that were local
to run_test_server().
Functions from the main package are now prefixed with main:: (should be
reorganized somehow later or auto-imported).
Adds new parameter $restart_bindir for restart_mysqld.inc.
Example:
let $restart_bindir= /home/midenok/src/mariadb/10.3b/build;
--source include/restart_mysqld.inc
It is good to restore the original server before check_mysqld is
run at the test end:
let $restart_bindir=;
--source include/restart_mysqld.inc
Passing $opt_parallel as $childs is wrong: a child can be killed before
it connects, and then $childs is never decremented for it.
Another problem (and the cause of this bug): a child can be killed
without ever closing the server socket. This can happen, for example,
after an unmaskable KILL signal. In that case the socket is closed by
reaping the child, but that never happens inside the socket-reading
loop in run_test_server().
The proper design is a waitless reap of children inside the socket
loop; if there are no more children, we finish the socket loop. Since
there is the Windows variation where we don't control the children via
waitpid(), all the clients must normally close the socket, and only
this can finish the socket loop. For the Unix variation we treat that
case as all children having closed the socket but not all of them
having died yet, and for that we do a final blocking waitpid() (this
was done before the patch as well).
To be more complete, we now handle 3 end-of-game scenarios in Unix:
1. all children closed socket, all children died: everything is
handled by the socket loop;
2. all children closed socket, not all yet died: we wait for alive
children to die after exiting the socket loop;
3. not all children closed socket, all children died: everything is
handled by the socket loop.
For Windows there is only one end-of-game scenario:
All children close the socket.
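The Unix scenarios above can be sketched like this. It is a simplified Python illustration, not the MTR implementation: pipes stand in for the manager/worker sockets, run_manager() is a hypothetical name, and Popen.poll() plays the role of a non-blocking waitpid().

```python
import os
import select
import subprocess
import sys


def run_manager(children, poll_interval=0.1):
    """Read from all children until the end of game; return their output."""
    alive = list(children)
    open_pipes = {child.stdout.fileno(): child for child in children}
    output = {child.pid: b"" for child in children}

    while open_pipes:
        # Waitless reap inside the read loop: poll() never blocks.
        for child in list(alive):
            if child.poll() is not None:
                alive.remove(child)
        # Dead children cannot send anything new, so do not wait for them.
        timeout = poll_interval if alive else 0
        readable, _, _ = select.select(list(open_pipes), [], [], timeout)
        if not readable and not alive:
            break  # scenario 3: everybody died, some pipe never reached EOF
        for fd in readable:
            data = os.read(fd, 4096)
            if data:
                output[open_pipes[fd].pid] += data
            else:
                del open_pipes[fd]  # EOF: this child closed its end

    # Scenario 2: all pipes are closed but some children are still alive.
    for child in alive:
        child.wait()
    return output


children = [
    subprocess.Popen([sys.executable, "-c", f"print('done {i}')"],
                     stdout=subprocess.PIPE)
    for i in range(2)
]
output = run_manager(children)
results = sorted(text.decode().strip() for text in output.values())
for child in children:
    child.stdout.close()
```

Reaping before each select() call means that once the last child is gone, the loop only drains what is already buffered and then stops, instead of waiting for a close that will never come.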
66832e3a introduced a change that prints core dumps in a very detailed
format. That is not user-friendly at all, but it serves as a measure
for debugging hard-to-reproduce bugs.
The proper way to implement this:
1. it must be controlled by command-line and environment variable;
2. detailed traces must be default for buildbots only, for user
invocations normal stack traces should be printed.
Options for control are: MTR_PRINT_CORE and --print-core, which accept
the following values:
  no             Don't print core
  short          Print stack trace of failed thread
  medium         Print stack traces of all threads
  detailed       Print all stack traces with debug context
  custom:<code>  Use debugger commands <code> to print stack trace
Default setting is: short (see env_or_default() call in pre_setup())
For the environment variable, wrong values are silently ignored (it
falls back to the default setting, see env_or_default()).
The command-line option --print-core (or -C) overrides the environment
variable. Its default value is 'short' if not specified explicitly
(same env_or_default() call in pre_setup()). Explicit values are
checked for validity.
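The precedence rules can be sketched as follows. This is a Python illustration under stated assumptions: the env_or_default() here only mirrors the name used in MTR and its signature is made up, resolve_print_core() is hypothetical, and treating any value that starts with "custom:" as valid is a guess.

```python
VALID_VALUES = ("no", "short", "medium", "detailed")


def is_valid(value):
    # Assumption: every "custom:<code>" value is accepted as-is.
    return value in VALID_VALUES or value.startswith("custom:")


def env_or_default(default, name, environ):
    """Return the environment value, silently ignoring wrong values."""
    value = environ.get(name)
    if value is None or not is_valid(value):
        return default
    return value


def resolve_print_core(cli_value, environ):
    """Command line overrides environment; explicit values are validated."""
    if cli_value is None:  # option not given on the command line
        return env_or_default("short", "MTR_PRINT_CORE", environ)
    if not is_valid(cli_value):
        raise ValueError(f"Wrong value for --print-core: {cli_value}")
    return cli_value


from_bad_env = resolve_print_core(None, {"MTR_PRINT_CORE": "bogus"})
from_cli = resolve_print_core("medium", {"MTR_PRINT_CORE": "detailed"})
```

With these inputs a wrong environment value falls back to the default, while the command-line value wins over a valid environment value.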
The --print-method option specifies which debugger is used to print
cores. For Windows there is only one choice: cdb. For Unix the values
are: gdb, dbx, lldb, auto. Default value is: auto
In 'auto' mode we try all possible debuggers until one succeeds.
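The 'auto' mode amounts to a simple fallback chain. The Python sketch below uses stand-in callables instead of real debuggers; print_core_with() is a hypothetical name, and the order in which debuggers are tried is an assumption taken from the order of the list above.

```python
def print_core_with(method, debuggers, core):
    """Return (debugger name, trace) from the first debugger that succeeds."""
    candidates = list(debuggers) if method == "auto" else [method]
    for name in candidates:
        trace = debuggers[name](core)  # stand-in for running the debugger
        if trace:
            return name, trace
    return None, None


# Fake debuggers: gdb "fails" (returns nothing), the others are never needed.
debuggers = {
    "gdb": lambda core: None,
    "dbx": lambda core: f"dbx trace of {core}",
    "lldb": lambda core: f"lldb trace of {core}",
}
used, trace = print_core_with("auto", debuggers, "core.1234")
```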
I changed `exit;` to `exit(1);` in the function `usage()`.
When mtr is run with a wrong option, `usage()` is called with the wrong option as its argument. In this case, because the function called a bare `exit` in its first if statement, the exit status was 0.
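The same pattern in Python terms, as a rough analogue only: this usage() is a hypothetical stand-in and its messages are invented. The point is that the error branch must pass a non-zero status explicitly, because an exit call without arguments reports success.

```python
import sys


def usage(message=None):
    """Print help; exit non-zero when called because of a wrong option."""
    if message:
        print(f"error: {message}", file=sys.stderr)
        sys.exit(1)  # a bare sys.exit() here would yield status 0
    print("usage: (option list omitted)")
    sys.exit(0)


try:
    usage("unknown option --bogus")
except SystemExit as exc:
    status = exc.code
```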
mtr is checking the wrong path for the embedded executable
on out of tree builds.
The is_embedded.inc tests are also checking the version rather
than the MTR MYSQL_EMBEDDED environment variable.
As a result, a few tests are out of date in the result recordings.
The expect file is always removed before starting a server.
So if it exists here, it means the server started successfully,
mysqltest continued executing the test, created a new expect file,
and shut down the server. All while we were waiting for the server
to start.
In other words, if the expect file exists, the server did actually start.
Even if it isn't running now.
This fixes occasional failures of innodb.log_corruption (in 10.6)
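The reasoning about the expect file can be sketched as a small decision function. This is a Python illustration with hypothetical names (classify_missing_server() and the file name are made up); it models only the check made when the wait for startup ends without finding a running server.

```python
import os
import tempfile


def classify_missing_server(expect_file):
    """Called when waiting for startup ended and no server is running."""
    # The file was removed before the start, so it can only exist if the
    # server came up, the test recreated the file and shut the server down.
    if os.path.exists(expect_file):
        return "started"
    return "failed to start"


with tempfile.TemporaryDirectory() as tmpdir:
    expect_file = os.path.join(tmpdir, "mysqld.1.expect")
    before = classify_missing_server(expect_file)  # file was removed
    with open(expect_file, "w") as handle:
        handle.write("wait\n")  # the test recreated the file meanwhile
    after = classify_missing_server(expect_file)
```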
* return a success/failure value from mysqld_start()
and don't error out / exit in mysqld_start(), the caller will do that
* pass the correct $mysqld object into check_expected_crash_and_restart()
instead of searching for it inside. Search in the caller
* so that when a failed restart changes $mysqld->{proc}, mtr would
still detect it as a failed mysqld (by updating $proc to match)
also: log the server command line into the server error log
The easiest way to compile and test the server with UBSAN is to run:
./BUILD/compile-pentium64-ubsan
and then run mysql-test-run.
After this commit, one should be able to run this without any UBSAN
warnings. There are still a few compiler warnings that should be fixed
at some point, but these do not expose any real bugs.
The 'special' cases where we disable, suppress or circumvent UBSAN are:
- ref10 source (as here we intentionally do some shifts that UBSAN
complains about).
- x86 version of optimized int#korr() methods. UBSAN does not like
unaligned memory access of integers. Fixed by using
byte_order_generic.h when compiling with UBSAN.
- We use a smaller thread stack with ASAN and UBSAN, which forced me to
disable a few tests that print the thread stack size.
- Verifying class types does not work for shared libraries. I added
suppression in mysql-test-run.pl for this case.
- Added '#ifdef WITH_UBSAN' when using integer arithmetic where it is
safe to have overflows (two cases, in item_func.cc).
Things fixed:
- Don't left shift signed values
(byte_order_generic.h, mysqltest.c, item_sum.cc and many more)
- Don't assign non-existing values to enum variables.
- Ensure that bool and enum values are properly initialized in
constructors. This was needed as UBSAN checks that these types have
correct values when one copies an object.
(gcalc_tools.h, ha_partition.cc, item_sum.cc, partition_element.h ...)
- Ensure we do not call handler functions on unallocated objects or
deleted objects.
(events.cc, sql_acl.cc).
- Fixed bugs in Item_sp::Item_sp() where we did not call constructor
on Query_arena object.
- Fixed several casts of objects to an incompatible class!
(Item.cc, Item_buff.cc, item_timefunc.cc, opt_subselect.cc, sql_acl.cc,
sql_select.cc ...)
- Ensure we do not do integer arithmetic that causes overflows or underflows.
This includes also ++ and -- of integers.
(Item_func.cc, Item_strfunc.cc, item_timefunc.cc, sql_base.cc ...)
- Added JSON_VALUE_UNITIALIZED to json_value_types and ensure that
value_type is initialized to this instead of to -1, which is not a valid
enum value for json_value_types.
- Ensure we do not call memcpy() when second argument could be null.
- Fixed that Item_func_str::make_empty_result() creates an empty string
instead of a null string (safer as it ensures we do not do arithmetic
on null strings).
Other things:
- Changed struct st_position to an OBJECT and added an initialization
function to it to ensure that we do not copy or use uninitialized
members. The change to a class was also motivated by the fact that we
used "struct st_position" and POSITION randomly throughout the code,
which was confusing.
- Notably big rewrite in sql_acl.cc to avoid using deleted objects.
- Changed in sql_partition to use '^' instead of '-'. This is safe as
the operator is either 0 or 0x8000000000000000ULL.
- Added check for select_nr < INT_MAX in JOIN::build_explain() to
avoid bug when get_select() could return NULL.
- Reordered elements in POSITION for better alignment.
- Changed sql_test.cc::print_plan() to use pointers instead of objects.
- Fixed bug in find_set() where we could execute '1 << -1'.
- Added variable have_sanitizer, used by mtr. (This variable previously
existed only in 10.5 and up). It can now have one of two values:
ASAN or UBSAN.
- Moved ~Archive_share() from ha_archive.cc to ha_archive.h and marked
it virtual. This was an effort to get UBSAN to work with loaded storage
engines. I kept the change as the new place is better.
- Added in CONNECT engine COLBLK::SetName(), to get around a wrong cast
in tabutil.cpp.
- Added HAVE_REPLICATION around usage of rgi_slave, to get embedded
server to compile with UBSAN. (Patch from Marko).
- Added #ifdef for powerpc64 to avoid a bug in old gcc versions related
to integer arithmetic.
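The claim about using '^' instead of '-' in sql_partition can be checked with a little arithmetic. The Python sketch below assumes 64-bit wraparound (unsigned) arithmetic and does not reproduce the actual expression from the source; sub_u64() and xor_u64() are hypothetical helpers.

```python
MASK64 = (1 << 64) - 1
SIGN_BIT = 0x8000000000000000


def sub_u64(value, operand):
    """Subtraction with 64-bit wraparound, as for an unsigned 64-bit type."""
    return (value - operand) & MASK64


def xor_u64(value, operand):
    return value ^ operand


samples = [0, 1, SIGN_BIT - 1, SIGN_BIT, SIGN_BIT + 1, MASK64]
# For the two allowed operands both operations give the same result.
agree = all(sub_u64(v, op) == xor_u64(v, op)
            for v in samples for op in (0, SIGN_BIT))
# For any other operand the replacement would not be safe in general.
differs = sub_u64(0, 1) != xor_u64(0, 1)
```

Subtracting 2^63 modulo 2^64 only flips the top bit, which is exactly what the XOR does, while the XOR itself can never overflow.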
Changes that should not be needed but had to be done to suppress warnings
from UBSAN:
- Added static_cast<uint16_t> around shift to get rid of a LOT of
compiler warnings when using UBSAN.
- Had to change some '/' by power-of-2 integers to shifts to get rid of
some compile time warnings.
Reviewed by:
- Json changes: Alexey Botchkov
- Charset changes in ctype-uca.c: Alexander Barkov
- InnoDB changes & Embedded server: Marko Mäkelä
- sql_acl.cc changes: Vicențiu Ciorbaru
- build_explain() changes: Sergey Petrunia