mariadb

mirror of https://github.com/MariaDB/server.git synced 2025-11-30 05:23:50 +03:00

Marko Mäkelä 03ca6495df MDEV-24142: Replace InnoDB rw_lock_t with sux_lock

InnoDB buffer pool block and index tree latches depend on a
special kind of read-update-write lock that allows reentrant
(recursive) acquisition of the 'update' and 'write' locks
as well as an upgrade from 'update' lock to 'write' lock.
The 'update' lock allows any number of reader locks from
other threads, but no concurrent 'update' or 'write' lock.
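
The compatibility rules in the paragraph above can be written down as a small predicate. This is only a sketch: the `Mode` enum and the `compatible()` function are hypothetical names made up for illustration and are not part of the InnoDB source. It covers requests from a different thread only; re-entrant acquisition by the same thread is a separate matter.

```cpp
#include <cassert>

// Toy model of the three lock modes described above; not InnoDB code.
enum class Mode { S, U, X };

// Returns true if a lock in mode `requested` could be granted to one thread
// while a DIFFERENT thread already holds the lock in mode `held`.
constexpr bool compatible(Mode held, Mode requested)
{
  // A 'write' (X) lock excludes every other holder.
  if (held == Mode::X || requested == Mode::X)
    return false;
  // An 'update' (U) lock admits readers, but no second U lock.
  if (held == Mode::U && requested == Mode::U)
    return false;
  // Remaining pairs are S/S, S/U and U/S, which may coexist.
  return true;
}
```

For example, `compatible(Mode::U, Mode::S)` is true, while `compatible(Mode::U, Mode::U)` and `compatible(Mode::X, Mode::S)` are false.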

If there were no requirement to support an upgrade from 'update'
to 'write', we could compose the lock out of two srw_lock
(implemented as any type of native rw-lock, such as SRWLOCK on
Microsoft Windows). Removing this requirement is very difficult,
so in commit f7e7f487d4b06695f91f6fbeb0396b9d87fc7bbf we
added an 'update' mode to our srw_lock.

Re-entrant or recursive locking is mostly needed when writing or
freeing BLOB pages, but also in crash recovery or when merging
buffered changes to an index page. The re-entrancy allows us to
attach a previously acquired page to a sub-mini-transaction that
will be committed before whatever else is holding the page latch.

The SUX lock supports Shared ('read'), Update, and eXclusive ('write')
locking modes. The S latches are not re-entrant, but a single S latch
may be acquired even if the thread already holds a U latch.

The idea of the U latch is to allow a write of something that concurrent
readers do not care about (such as the contents of BTR_SEG_LEAF,
BTR_SEG_TOP and other page allocation metadata structures, or
the MDEV-6076 PAGE_ROOT_AUTO_INC). (The PAGE_ROOT_AUTO_INC field
is only updated when a dict_table_t for the table exists, and only
read when a dict_table_t for the table is being added to dict_sys.)

block_lock::u_lock_try(bool for_io=true) is used in buf_flush_page()
to allow concurrent readers but no concurrent modifications while the
page is being written to the data file. That latch will be released
by buf_page_write_complete() in a different thread. Hence, we use
the special lock owner value FOR_IO.
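
The hand-over of the latch between threads can be illustrated with a toy owner field. This is a sketch under stated assumptions, not the block_lock implementation: `ToyBlockLock`, `ToyThreadId`, `NO_OWNER` and the explicit `self` parameter are hypothetical, the real u_lock_try() takes only the for_io flag, and the model ignores readers and never blocks. Only the names u_lock_try and FOR_IO come from the text above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical thread identifier; real code would use the OS thread id.
using ToyThreadId = std::uintptr_t;
constexpr ToyThreadId NO_OWNER = 0;                // lock is free
constexpr ToyThreadId FOR_IO = ~ToyThreadId{0};    // "owned by a pending write"

class ToyBlockLock
{
  ToyThreadId owner = NO_OWNER;

public:
  // Non-blocking U acquisition by thread `self` (must be neither NO_OWNER
  // nor FOR_IO). With for_io=true the owner is recorded as FOR_IO instead
  // of `self`, so that another thread may release the latch later.
  bool u_lock_try(ToyThreadId self, bool for_io)
  {
    if (owner != NO_OWNER)
      return false;
    owner = for_io ? FOR_IO : self;
    return true;
  }

  // Release by thread `self`: accepted if `self` is the recorded owner,
  // or if the latch was acquired for I/O and thus belongs to no thread.
  bool u_unlock(ToyThreadId self)
  {
    if (owner != self && owner != FOR_IO)
      return false;
    owner = NO_OWNER;
    return true;
  }

  bool is_locked() const { return owner != NO_OWNER; }
};
```

In this model, a latch taken with `u_lock_try(1, true)` can be released by `u_unlock(2)`, which mirrors the flushing thread acquiring the latch and the write completion releasing it elsewhere. A latch taken with `for_io=false` can only be released by the thread that acquired it.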

The index_lock::u_lock() improves concurrency on operations that
involve non-leaf index pages.

The interface has been cleaned up a little. We will use
x_lock_recursive() instead of x_lock() when we know that a
lock is already held by the current thread. Similarly,
a lock upgrade from U to X is only allowed via u_x_upgrade()
or x_lock_upgraded() but not via x_lock().
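
The division of labour between these calls can be sketched as bookkeeping for a single owner thread. The method names x_lock, x_lock_recursive and u_x_upgrade are taken from the text above; u_lock appears earlier in this message. Everything else (the class name, the counters, x_unlock, and returning false where the real interface would presumably treat the call as a programming error) is an assumption made for illustration. The model never blocks and does not cover S latches, contention, or x_lock_upgraded().

```cpp
#include <cassert>

// Toy bookkeeping for ONE owner thread; not the sux_lock implementation.
class ToySuxOwnerState
{
  unsigned u_holds = 0;  // U latches currently held by the owner
  unsigned x_holds = 0;  // X latches currently held by the owner

public:
  bool have_u_or_x() const { return u_holds != 0 || x_holds != 0; }
  bool have_x() const { return x_holds != 0; }

  // First acquisition only: rejected if the thread already holds U or X,
  // so x_lock() can serve neither as a recursive lock nor as an upgrade.
  bool x_lock()
  {
    if (have_u_or_x())
      return false;
    x_holds = 1;
    return true;
  }

  // First U acquisition, with the same restriction (toy assumption).
  bool u_lock()
  {
    if (have_u_or_x())
      return false;
    u_holds = 1;
    return true;
  }

  // Re-entrant acquisition: in this toy, valid only when X is already held.
  bool x_lock_recursive()
  {
    if (!have_x())
      return false;
    ++x_holds;
    return true;
  }

  // Explicit upgrade: every U hold becomes an X hold (toy assumption).
  bool u_x_upgrade()
  {
    if (u_holds == 0 || x_holds != 0)
      return false;
    x_holds = u_holds;
    u_holds = 0;
    return true;
  }

  // Releases one X hold.
  bool x_unlock()
  {
    if (x_holds == 0)
      return false;
    --x_holds;
    return true;
  }
};
```

A thread that starts with `u_lock()` cannot reach X through `x_lock()` or `x_lock_recursive()` in this model; it has to call `u_x_upgrade()` first, after which `x_lock_recursive()` succeeds. Making each transition a separate call keeps the caller's intent explicit at the call site.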

We will disable the LatchDebug and sync_array interfaces to
InnoDB rw-locks.

The SEMAPHORES section of SHOW ENGINE INNODB STATUS output
will no longer include any information about InnoDB rw-locks,
only TTASEventMutex (cmake -DMUTEXTYPE=event) waits.
This will make a part of the 'innotop' script dead code.

The block_lock buf_block_t::lock will not be covered by any
PERFORMANCE_SCHEMA instrumentation.

SHOW ENGINE INNODB MUTEX and INFORMATION_SCHEMA.INNODB_MUTEXES
will no longer output source code file names or line numbers.
The dict_index_t::lock will be identified by index and table names,
which should be much more useful. PERFORMANCE_SCHEMA is lumping
information about all dict_index_t::lock together as
event_name='wait/synch/sxlock/innodb/index_tree_rw_lock'.

buf_page_free(): Remove the file,line parameters. The sux_lock will
not store such diagnostic information.

buf_block_dbg_add_level(): Define as empty macro, to be removed
in a subsequent commit.

Unless the build was configured with cmake -DPLUGIN_PERFSCHEMA=NO,
the index_lock dict_index_t::lock will be instrumented via
PERFORMANCE_SCHEMA. Similar to commit 1669c8890c, we will distinguish
lock waits by registering shared_lock,exclusive_lock events instead of
try_shared_lock,try_exclusive_lock. Actual 'try' operations will not
be instrumented at all.

rw_lock_list: Remove. After MDEV-24167, this only covered
buf_block_t::lock and dict_index_t::lock. We will output their
information by traversing buf_pool or dict_sys.

2020-12-03 15:19:49 +02:00

collections

…

include

Merge 10.5 into 10.6

2020-11-12 15:54:08 +02:00

lib

Merge 10.4 into 10.5

2020-12-02 18:29:49 +02:00

main

Merge 10.5 into 10.6

2020-12-03 08:12:47 +02:00

std_data

Merge 10.5 into 10.6

2020-11-02 12:49:19 +02:00

suite

MDEV-24142: Replace InnoDB rw_lock_t with sux_lock

2020-12-03 15:19:49 +02:00

asan.supp

…

CMakeLists.txt

…

dgcov.pl

…

lsan.supp

…

mtr.out-of-source

…

mysql-stress-test.pl

…

mysql-test-run.pl

Merge 10.4 into 10.5

2020-11-13 21:54:21 +02:00

purify.supp

…

README

…

README-gcov

…

README.stress

…

suite.pm

…

unstable-tests

Merge 10.3 into 10.4

2020-11-03 14:49:17 +02:00

valgrind.supp

…

README

This directory contains test suites for the MariaDB server. To run
currently existing test cases, execute ./mysql-test-run in this directory.

Some tests are known to fail on some platforms or be otherwise unreliable.
The file "unstable-tests" contains the list of such tests along with
a comment for every test.
To exclude them from the test run, execute
# ./mysql-test-run --skip-test-list=unstable-tests

In general you do not have to do "make install", and you can have
a co-existing MariaDB installation; the tests will not conflict with it.
To run the tests in a source directory, you must do "make" first.

In Red Hat distributions, you should run the script as user "mysql".
The user is created with nologin shell, so the best bet is something like
# su -
# cd /usr/share/mysql-test
# su -s /bin/bash mysql -c "./mysql-test-run --skip-test-list=unstable-tests"

This will use the installed MariaDB executables, but will run a private
copy of the server process (using data files within /usr/share/mysql-test),
so you need not start the mysqld service beforehand.

You can omit the --skip-test-list option if you want to check whether
the listed failures occur for you.

To clean up afterwards, remove the created "var" subdirectory, e.g.
# su -s /bin/bash - mysql -c "rm -rf /usr/share/mysql-test/var"

If one or more tests fail on your system for reasons other than those
listed in the list of unstable tests, please read the following manual
section for instructions on how to report the problem:

https://mariadb.com/kb/en/reporting-bugs

If you want to use an already running MySQL server for specific tests,
use the --extern option to mysql-test-run. Please note that in this mode,
you are expected to provide names of the tests to run.

For example, here is the command to run the "alias" and "analyze" tests
with an external server:

# mysql-test-run --extern socket=/tmp/mysql.sock alias analyze

To match your setup, you might need to provide other relevant options.

With no test names on the command line, mysql-test-run will attempt
to execute the default set of tests, which will certainly fail, because
many tests cannot run with an external server (they need to control the
options with which the server is started, restart the server during
execution, etc.).

You can create your own test cases. To create a test case, create a new
file in the main subdirectory using a text editor. The file should have a .test
extension. For example:

# xemacs t/test_case_name.test

In the file, put a set of SQL statements that create some tables,
load test data, and run some queries to manipulate it.

Your test should begin by dropping the tables you are going to create and
end by dropping them again. This ensures that you can run the test over
and over again.

If you are using mysqltest commands in your test case, you should create
the result file in one of the following ways:

# mysql-test-run --record test_case_name

# mysqltest --record < t/test_case_name.test

If you only have a simple test case consisting of SQL statements and
comments, you can create the result file in one of the following ways:

# mysql-test-run --record test_case_name

# mysql test < t/test_case_name.test > r/test_case_name.result

# mysqltest --record --database test --result-file=r/test_case_name.result < t/test_case_name.test

When this is done, take a look at r/test_case_name.result.
If the result is incorrect, you have found a bug. In this case, you should
edit the result file so that it contains the correct results; this lets us
verify that the bug is corrected in future releases.

If you want to submit your test case you can send it
to maria-developers@lists.launchpad.net or attach it to a bug report on
https://mariadb.org/jira/.

If the test case is really big or if it contains 'not public' data,
then put your .test file and .result file(s) into a tar.gz archive,
add a README that explains the problem, ftp the archive to
ftp://ftp.askmonty.org/private and submit a report to
https://mariadb.org/jira about it.

The latest information about mysql-test-run can be found at:
https://mariadb.com/kb/en/mariadb/mysqltest/

If you want to create .rdiff files, check
https://mariadb.com/kb/en/mariadb/mysql-test-auxiliary-files/