MDL wait consists of short 1 second waits (this is not configurable)
repeated until lock_wait_timeout is reached. The stage is changed
to Waiting and back every second. To have predictable result in the
test the query should filter all sequences of X, "Waiting for MDL", X,
leaving just X.
On Windows systems, occurrences of ERROR_SHARING_VIOLATION due to
conflicting share modes between processes accessing the same file can
result in CreateFile failures.
mysys' my_open() already incorporates a workaround by implementing
wait/retry logic on Windows.
But this does not help if files are opened using shell redirection like
mysqltest traditionally did it, i.e via
--echo exec "some text" > output_file
In such cases, it is cmd.exe, that opens the output_file, and it
won't do any sharing-violation retries.
This commit addresses the issue by introducing a new built-in command,
'write_line', in mysqltest. This new command serves as a brief alternative
to 'write_file', with a single line output, that also resolves variables
like "exec" would.
Internally, this command will use my_open(), and therefore retry-on-error
logic.
Hopefully this will eliminate the very sporadic "can't open file because
it is used by another process" error on CI.
the value of 200 isn't enough for some tests anymore, this causes
some random threads to become not instrumented and any table operations
there are not reflected in the perfschema. If, say, a DROP TABLE
doesn't change perfschema state, perfschema tables might show
ghost tables that no longer exist in the server
Some fixes related to commit f838b2d799 and
Rows_log_event::do_apply_event() and Update_rows_log_event::do_exec_row()
for system-versioned tables were provided by Nikita Malyavin.
This was required by test versioning.rpl,trx_id,row.
perfschema aggregation, like SHOW STATUS, is only statistically correct.
It doesn't use atomics for performance reasons and might miss individual
increments, particularly when two connections are disconnecting at the
same time.
To have stable results tests should avoid doing it.
This will makes it easier to find out what replication workers are
doing and what they are waiting for.
Things changed in processlist:
- Slave_SQL time was not consistent. Now time for state "Slave has
read all relay log; waiting for more updates" shows how long it has
waited for getting the next event.
- Slave_worker threads did often show "Closing tables" for a long
time. Now the state is reverted to the previous state after
"Closing tables" is done.
- Commit and Rollback states where not shown for replication (and some
other threads). Now Commit and Rollback states are always shown and
the state is reverted to previous state when the Commit/Rollback
have finished.
Code changes:
- Added thd->set_time_for_next_stage() for parallel replication when
when starting to wait for prior transactions to commit, group commit,
and FTWRL and for free space in thread pool.
Before we reset the time only after the above events.
- Moved THD_STAGE_INFO(stage_rollback) and THD_STAGE_INFO(stage_commit)
from sql_parse.cc to transaction.cc to ensure this is done for
all commits and not only 'normal connection queries'.
Test case changes:
- close_thread_tables() reverting stage to previous stage caused the
counter in performance_schema to be increased. In many case it is
the 'sql/starting' stage that was effected.
- We only change to "Commit" stage if there is a need for a commit.
This caused some "Commit" stages to disapper from perfschema reports.
TODO in 11.#:
- Slave_IO always showes "Waiting for master to send event" and the time is
from SLAVE START. We should in 11.# change this to be the time since
reading the last event.
Under terms of MDEV 27490 we'll add support for non-BMP identifiers
and upgrade casefolding information to Unicode version 14.0.0.
In Unicode-14.0.0 conversion to lower and upper cases can increase octet length
of the string, so conversion won't be possible in-place any more.
This patch removes virtual functions performing in-place casefolding:
- my_charset_handler_st::casedn_str()
- my_charset_handler_st::caseup_str()
and fixes the code to use the non-inplace functions instead:
- my_charset_handler_st::casedn()
- my_charset_handler_st::caseup()
binlog_space_limit is a variable in Percona server used to limit the total
size of all binary logs.
This implementation is based on code from Percona server 5.7.
In MariaDB we decided to call the variable max-binlog-total-size to be
similar to max-binlog-size. This makes it easier to find in the output
from 'mariadbd --help --verbose'). MariaDB will also support
binlog_space_limit for compatibility with Percona.
Some internal notes to explain implementation notes:
- When running MariaDB does not delete binary logs that are either
used by slaves or have active xid that are not yet committed.
Some implementation notes:
- max-binlog-total-size is by default 0 (no limit).
- max-binlog-total-size can be changed without server restart.
- Binlog file sizes are checked on startup, or if
max-binlog-total-size is set to a value > 0, not for every log write.
The total size of all binary logs is cached and dynamically updated
when updating the binary log on binary log rotation.
- max-binlog-total-size is checked against existing log files during
serverstart, binlog rotation, FLUSH LOGS, when writing to binary log
or when max-binlog-total-size changes value.
- Option --slave-connections-needed-for-purge with 1 as default added.
This allows one to ensure that we do not delete binary logs if there
is less than 'slave-connections-needed-for-purge' connected.
Without this option max-binlog-total-size would potentially delete
binlogs needed by slaves on server startup or when a slave disconnects
as there are then no connected slaves to protect active binlogs.
- PURGE BINARY LOGS TO ... will be executed as if
slave-connectitons-needed-for-purge would be zero. In other words
it will do the purge even if there is no slaves connected. If there
are connected slaves working on the logs, these will be protected.
- If binary log is on and max-binlog-total_size <> 0 then the status
variable 'Binlog_disk_use' shows the current size of all old binary
logs + the state of the current one.
- Removed test of strcmp(log_file_name, log_info.log_file_name) in
purge_logs_before_date() as this is tested in can_purge_logs()
- To avoid expensive calls of log_in_use() we cache the result for the
last log that is in use by a slave. Future calls to can_purge_logs()
for this binary log will be quickly detected and false will be returned
until a slave starts working on a new log.
- Note that after a binary log rotation caused by max_binlog_size,
the last log will not be purged directly as it is still in use
internally. The next binary log write will purge binlogs if needed.
Reviewer:Kristian Nielsen <knielsen@knielsen-hq.org>
Two new information_schema views are added:
* PERIOD table -- columns TABLE_CATALOG, TABLE_SCHEMA, TABLE_NAME,
PERIOD_NAME, START_COLUMN_NAME, END_COLUMN_NAME.
* KEY_PERIOD_USAGE -- works similar to KEY_COLUMN_USAGE, but for periods.
Columns CONSTRAINT_CATALOG, CONSTRAINT_SCHEMA, CONSTRAINT_NAME,
TABLE_CATALOG, TABLE_SCHEMA, TABLE_NAME, PERIOD_NAME
Two new columns are added to the COLUMNS view:
IS_SYSTEM_TIME_PERIOD_START, IS_SYSTEM_TIME_PERIOD_END - contain YES/NO.
not default_mysqld.cnf. The latter has only server settings,
it misses mtr-specific client configuration
Except for spider, that doesn't use mysqld.1 server
and default_my.cnf starts it automatically.
Spider tests have to include both default_mysqld.cnf and
default_client.cnf
Improve the performance of slave connect using B+-Tree indexes on each binlog
file. The index allows fast lookup of a GTID position to the corresponding
offset in the binlog file, as well as lookup of a position to find the
corresponding GTID position.
This eliminates a costly sequential scan of the starting binlog file
to find the GTID starting position when a slave connects. This is
especially costly if the binlog file is not cached in memory (IO
cost), or if it is encrypted or a lot of slaves connect simultaneously
(CPU cost).
The size of the index files is generally less than 1% of the binlog data, so
not expected to be an issue.
Most of the work writing the index is done as a background task, in
the binlog background thread. This minimises the performance impact on
transaction commit. A simple global mutex is used to protect index
reads and (background) index writes; this is fine as slave connect is
a relatively infrequent operation.
Here are the user-visible options and status variables. The feature is on by
default and is expected to need no tuning or configuration for most users.
binlog_gtid_index
On by default. Can be used to disable the indexes for testing purposes.
binlog_gtid_index_page_size (default 4096)
Page size to use for the binlog GTID index. This is the size of the nodes
in the B+-tree used internally in the index. A very small page-size (64 is
the minimum) will be less efficient, but can be used to stress the
BTree-code during testing.
binlog_gtid_index_span_min (default 65536)
Control sparseness of the binlog GTID index. If set to N, at most one
index record will be added for every N bytes of binlog file written.
This can be used to reduce the number of records in the index, at
the cost only of having to scan a few more events in the binlog file
before finding the target position
Two status variables are available to monitor the use of the GTID indexes:
Binlog_gtid_index_hit
Binlog_gtid_index_miss
The "hit" status increments for each successful lookup in a GTID index.
The "miss" increments when a lookup is not possible. This indicates that the
index file is missing (eg. binlog written by old server version
without GTID index support), or corrupt.
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
The IDENT_sys doesn't include keywords, so the function with the
keyword name can be created, but cannot be called.
Moving keywords to new rules keyword_func_sp_var_and_label and
keyword_func_sp_var_not_label so the functions with these
names are allowed.
perfschema thread walker needs to take thread's LOCK_thd_kill to prevent
the thread from disappearing why it's being looked at.
But there's no need to lock it for the current thread.
In fact, it was harmful as some code down the stack might take
LOCK_thd_kill (e.g. set_killed() does it, and my_malloc_size_cb_func()
calls set_killed()). And it caused a bunch of mutexes being locked under
LOCK_thd_kill, which created problems later when my_malloc_size_cb_func()
called set_killed() at some unspecified point under some
random mutexes.
need to protect access to thread-local cache_mngr with LOCK_thd_data
technically only access from different threads has to be protected,
but this is the SHOW STATUS code path, so the difference is neglectable