mariadb

mirror of https://github.com/MariaDB/server.git synced 2025-09-13 13:47:59 +03:00

Author	SHA1	Message	Date
Nayuta Yanagisawa	cbf9d8a8d5	Merge 10.7 into 10.8	2022-04-13 17:52:27 +09:00
Marko Mäkelä	aa3a9d1ef5	Merge 10.6 into 10.7	2022-04-12 16:11:29 +03:00
Marko Mäkelä	ca3bbf4c0c	Merge 10.5 into 10.6	2022-04-12 09:26:02 +03:00
Sergei Golubchik	cfdb621243	MDEV-28255 "Error" instead of NULL in P_S.THREADS_CONNECTION_TYPE for background threads use vio_type_names[] values as in MySQL	2022-04-09 10:46:10 +02:00
Daniel Black	88ce8a3d8b	Merge 10.7 into 10.8	2022-03-25 15:06:56 +11:00
Daniel Black	8b92e346b1	Merge 10.6 into 10.7	2022-03-25 14:31:59 +11:00
Daniel Black	ec62f46a61	Merge 10.5 to 10.6	2022-03-25 11:31:49 +11:00
Marko Mäkelä	b101f19d29	MDEV-23974 fixup: rpl.rpl_gtid_stop_start fails The call mtr.add_suppression() that was added in commit `75b7cd680b` for MemorySanitizer and Valgrind runs is causing a result difference for the test rpl.rpl_gtid_stop_start. Let us disable the binlog for executing that statement. Also, the test perfschema.statement_program_lost_inst would fail due to the changes to have_innodb.inc in this commit. To compensate for that, we will make more --suite=perfschema tests run without InnoDB, and explicitly enable InnoDB in those tests that depend on a transactional storage engine.	2022-03-24 13:43:58 +02:00
Marko Mäkelä	32d741b5b0	Merge 10.7 into 10.8	2022-02-25 16:24:13 +02:00
Marko Mäkelä	3d88f9f34c	Merge 10.6 into 10.7	2022-02-25 16:09:16 +02:00
Marko Mäkelä	e04b5eaa79	Merge 10.5 into 10.6	2022-02-25 12:01:21 +02:00
Krunal Bauskar	83212632e4	MDEV-27935: Enable performance_schema profiling for trx_rseg_t latch - In 10.6, trx_rseg_t mutex was ported to use latch. As part of this porting profiling of the patch was removed. This patch reenables it given that the said latch continues to occupy the top-slots in the contention list.	2022-02-24 19:48:51 +08:00
Daniel Black	863c1a0206	MDEV-27932: perfschema.dml_file_instances mtr failure Order by results in MTR test to make it predictable.	2022-02-24 16:39:12 +11:00
Marko Mäkelä	a635c40648	MDEV-27774 Reduce scalability bottlenecks in mtr_t::commit() A prominent bottleneck in mtr_t::commit() is log_sys.mutex between log_sys.append_prepare() and log_close(). User-visible change: The minimum innodb_log_file_size will be increased from 1MiB to 4MiB so that some conditions can be trivially satisfied. log_sys.latch (log_latch): Replaces log_sys.mutex and log_sys.flush_order_mutex. Copying mtr_t::m_log to log_sys.buf is protected by a shared log_sys.latch. Writes from log_sys.buf to the file system will be protected by an exclusive log_sys.latch. log_sys.lsn_lock: Protects the allocation of log buffer in log_sys.append_prepare(). sspin_lock: A simple spin lock, for log_sys.lsn_lock. Thanks to Vladislav Vaintroub for suggesting this idea, and for reviewing these changes. mariadb-backup: Replace some use of log_sys.mutex with recv_sys.mutex. buf_pool_t::insert_into_flush_list(): Implement sorting of flush_list because ordering is otherwise no longer guaranteed. Ordering by LSN is needed for the proper operation of redo log checkpoints. log_sys.append_prepare(): Advance log_sys.lsn and log_sys.buf_free by the length, and return the old values. Also increment write_to_buf, which was previously done in log_close(). mtr_t::finish_write(): Obtain the buffer pointer from log_sys.append_prepare(). log_sys.buf_free: Make the field Atomic_relaxed, to simplify log_flush_margin(). Use only loads and stores to avoid costly read-modify-write atomic operations. buf_pool.flush_list_requests: Replaces export_vars.innodb_buffer_pool_write_requests and srv_stats.buf_pool_write_requests. Protected by buf_pool.flush_list_mutex. buf_pool_t::insert_into_flush_list(): Do not invoke page_cleaner_wakeup(). Let the caller do that after a batch of calls. recv_recover_page(): Invoke a minimal part of buf_pool.insert_into_flush_list(). ReleaseBlocks::modified: A number of pages added to buf_pool.flush_list. 
ReleaseBlocks::operator(): Merge buf_flush_note_modification() here. log_t::set_capacity(): Renamed from log_set_capacity().	2022-02-10 16:37:12 +02:00
Oleksandr Byelkin	4fb2cb1a30	Merge branch '10.7' into 10.8	2022-02-04 14:50:25 +01:00
Oleksandr Byelkin	9ed8deb656	Merge branch '10.6' into 10.7	2022-02-04 14:11:46 +01:00
Oleksandr Byelkin	f5c5f8e41e	Merge branch '10.5' into 10.6	2022-02-03 17:01:31 +01:00
Oleksandr Byelkin	880d543554	Merge branch 'merge-perfschema-5.7' into 10.5	2022-01-28 11:57:52 +01:00
Oleksandr Byelkin	157e66273b	5.7.37	2022-01-25 11:13:39 +01:00
Marko Mäkelä	685d958e38	MDEV-14425 Improve the redo log for concurrency The InnoDB redo log used to be formatted in blocks of 512 bytes. The log blocks were encrypted and the checksum was calculated while holding log_sys.mutex, creating a serious scalability bottleneck. We remove the fixed-size redo log block structure altogether and essentially turn every mini-transaction into a log block of its own. This allows encryption and checksum calculations to be performed on local mtr_t::m_log buffers, before acquiring log_sys.mutex. The mutex only protects a memcpy() of the data to the shared log_sys.buf, as well as the padding of the log, in case the to-be-written part of the log would not end in a block boundary of the underlying storage. For now, the "padding" consists of writing a single NUL byte, to allow recovery and mariadb-backup to detect the end of the circular log faster. Like the previous implementation, we will overwrite the last log block over and over again, until it has been completely filled. It would be possible to write only up to the last completed block (if no more recent write was requested), or to write dummy FILE_CHECKPOINT records to fill the incomplete block, by invoking the currently disabled function log_pad(). This would require adjustments to some logic around log checkpoints, page flushing, and shutdown. An upgrade after a crash of any previous version is not supported. Logically empty log files from a previous version will be upgraded. An attempt to start up InnoDB without a valid ib_logfile0 will be refused. Previously, the redo log used to be created automatically if it was missing. Only with innodb_force_recovery=6, it is possible to start InnoDB in read-only mode even if the log file does not exist. This allows the contents of a possibly corrupted database to be dumped. 
Because a prepared backup from an earlier version of mariadb-backup will create a 0-sized log file, we will allow an upgrade from such log files, provided that the FIL_PAGE_FILE_FLUSH_LSN in the system tablespace looks valid. The 512-byte log checkpoint blocks at 0x200 and 0x600 will be replaced with 64-byte log checkpoint blocks at 0x1000 and 0x2000. The start of log records will move from 0x800 to 0x3000. This allows us to use 4096-byte aligned blocks for all I/O in a future revision. We extend the MDEV-12353 redo log record format as follows. (1) Empty mini-transactions or extra NUL bytes will not be allowed. (2) The end-of-minitransaction marker (a NUL byte) will be replaced with a 1-bit sequence number, which will be toggled each time when the circular log file wraps back to the beginning. (3) After the sequence bit, a CRC-32C checksum of all data (excluding the sequence bit) will be written. (4) If the log is encrypted, 8 bytes will be written before the checksum and included in it. This is part of the initialization vector (IV) of encrypted log data. (5) File names, page numbers, and checkpoint information will not be encrypted. Only the payload bytes of page-level log will be encrypted. The tablespace ID and page number will form part of the IV. (6) For padding, arbitrary-length FILE_CHECKPOINT records may be written, with all-zero payload, and with the normal end marker and checksum. The minimum size is 7 bytes, or 7+8 with innodb_encrypt_log=ON. In mariadb-backup and in Galera snapshot transfer (SST) scripts, we will no longer remove ib_logfile0 or create an empty ib_logfile0. Server startup will require a valid log file. When resizing the log, we will create a logically empty ib_logfile101 at the current LSN and use an atomic rename to replace ib_logfile0 with it. See the test innodb.log_file_size. Because there is no mandatory padding in the log file, we are able to create a dummy log file as of an arbitrary log sequence number. 
See the test mariabackup.huge_lsn. The parameter innodb_log_write_ahead_size and the INFORMATION_SCHEMA.INNODB_METRICS counter log_padded will be removed. The minimum value of innodb_log_buffer_size will be increased to 2MiB (because log_sys.buf will replace recv_sys.buf) and the increment adjusted to 4096 bytes (the maximum log block size). The following INFORMATION_SCHEMA.INNODB_METRICS counters will be removed: os_log_fsyncs os_log_pending_fsyncs log_pending_log_flushes log_pending_checkpoint_writes The following status variables will be removed: Innodb_os_log_fsyncs (this is included in Innodb_data_fsyncs) Innodb_os_log_pending_fsyncs (this was limited to at most 1 by design) log_sys.get_block_size(): Return the physical block size of the log file. This is only implemented on Linux and Microsoft Windows for now, and for the power-of-2 block sizes between 64 and 4096 bytes (the minimum and maximum size of a checkpoint block). If the block size is anything else, the traditional 512-byte size will be used via normal file system buffering. If the file system buffers can be bypassed, a message like the following will be issued: InnoDB: File system buffers for log disabled (block size=512 bytes) InnoDB: File system buffers for log disabled (block size=4096 bytes) This has been tested on Linux and Microsoft Windows with both sizes. On Linux, only enable O_DIRECT on the log for innodb_flush_method=O_DSYNC. Tests in 3 different environments where the log is stored in a device with a physical block size of 512 bytes are yielding better throughput without O_DIRECT. This could be due to the fact that in the event the last log block is being overwritten (if multiple transactions would become durable at the same time, and each of them will write a small number of bytes to the last log block), it should be faster to re-copy data from log_sys.buf or log_sys.flush_buf to the kernel buffer, to be finally written at fdatasync() time. 
The parameter innodb_flush_method=O_DSYNC will imply O_DIRECT for data files. This option will enable O_DIRECT on the log file on Linux. It may be unsafe to use when the storage device does not support FUA (Force Unit Access) mode. When the server is compiled WITH_PMEM=ON, we will use memory-mapped I/O for the log file if the log resides on a "mount -o dax" device. We will identify PMEM in a start-up message: InnoDB: log sequence number 0 (memory-mapped); transaction id 3 On Linux, we will also invoke mmap() on any ib_logfile0 that resides in /dev/shm, effectively treating the log file as persistent memory. This should speed up "./mtr --mem" and increase the test coverage of PMEM on non-PMEM hardware. It also allows users to estimate how much the performance would be improved by installing persistent memory. On other tmpfs file systems such as /run, we will not use mmap(). mariadb-backup: Eliminated several variables. We will refer directly to recv_sys and log_sys. backup_wait_for_lsn(): Detect non-progress of xtrabackup_copy_logfile(). In this new log format with arbitrary-sized blocks, we can only detect log file overrun indirectly, by observing that the scanned log sequence number is not advancing. xtrabackup_copy_logfile(): On PMEM, do not modify the sequence bit, because we are not allowed to modify the server's log file, and our memory mapping is read-only. trx_flush_log_if_needed_low(): Do not use the callback on pmem. Using neither flush_lock nor write_lock around PMEM writes seems to yield the best performance. The pmem_persist() calls may still be somewhat slower than the pwrite() and fdatasync() based interface (PMEM mounted without -o dax). recv_sys_t::buf: Remove. We will use log_sys.buf for parsing. recv_sys_t::MTR_SIZE_MAX: Replaces RECV_SCAN_SIZE. recv_sys_t::file_checkpoint: Renamed from mlog_checkpoint_lsn. recv_sys_t, log_sys_t: Removed many data members. recv_sys.lsn: Renamed from recv_sys.recovered_lsn. 
recv_sys.offset: Renamed from recv_sys.recovered_offset. log_sys.buf_size: Replaces srv_log_buffer_size. recv_buf: A smart pointer that wraps log_sys.buf[recv_sys.offset] when the buffer is being allocated from the memory heap. recv_ring: A smart pointer that wraps a circular log_sys.buf[] that is backed by ib_logfile0. The pointer will wrap from recv_sys.len (log_sys.file_size) to log_sys.START_OFFSET. For the record that wraps around, we may copy file name or record payload data to the auxiliary buffer decrypt_buf in order to have a contiguous block of memory. The maximum size of a record is less than innodb_page_size bytes. recv_sys_t::parse(): Take the smart pointer as a template parameter. Do not temporarily add a trailing NUL byte to FILE_ records, because we are not supposed to modify the memory-mapped log file. (It is attached in read-write mode already during recovery.) recv_sys_t::parse_mtr(): Wrapper for recv_sys_t::parse(). recv_sys_t::parse_pmem(): Like parse_mtr(), but if PREMATURE_EOF would be returned on PMEM, use recv_ring to wrap around the buffer to the start. mtr_t::finish_write(), log_close(): Do not enforce log_sys.max_buf_free on PMEM, because it has no meaning on the mmap-based log. log_sys.write_to_buf: Count writes to log_sys.buf. Replaces srv_stats.log_write_requests and export_vars.innodb_log_write_requests. Protected by log_sys.mutex. Updated consistently in log_close(). Previously, mtr_t::commit() conditionally updated the count, which was inconsistent. log_sys.write_to_log: Count swaps of log_sys.buf and log_sys.flush_buf, for writing to log_sys.log (the ib_logfile0). Replaces srv_stats.log_writes and export_vars.innodb_log_writes. Protected by log_sys.mutex. log_sys.waits: Count waits in append_prepare(). Replaces srv_stats.log_waits and export_vars.innodb_log_waits. recv_recover_page(): Do not unnecessarily acquire log_sys.flush_order_mutex. 
We are inserting the blocks in arbitrary order anyway, to be adjusted in recv_sys.apply(true). We will change the definition of flush_lock and write_lock to avoid potential false sharing. Depending on sizeof(log_sys) and CPU_LEVEL1_DCACHE_LINESIZE, the flush_lock and write_lock could share a cache line with each other or with the last data members of log_sys. Thanks to Matthias Leich for providing https://rr-project.org traces for various failures during the development, and to Thirunarayanan Balathandayuthapani for his help in debugging some of the recovery code. And thanks to the developers of the rr debugger for a tool without which extensive changes to InnoDB would be very challenging to get right. Thanks to Vladislav Vaintroub for useful feedback and to him, Axel Schwenke and Krunal Bauskar for testing the performance.	2022-01-21 16:03:47 +02:00
Marko Mäkelä	7dfaded962	Merge 10.6 into 10.7	2022-01-04 09:55:58 +02:00
Marko Mäkelä	3f5726768f	Merge 10.5 into 10.6	2022-01-04 09:26:38 +02:00
Marko Mäkelä	88b339805d	Fix a test for cmake -DPLUGIN_PERFSCHEMA=NO	2021-12-11 15:27:14 +02:00
Sergei Golubchik	ef77c05126	Merge branch '10.6' into 10.7	2021-12-08 10:33:36 +01:00
Sergei Golubchik	186c1fa250	Merge branch '10.5' into 10.6	2021-12-07 22:11:30 +01:00
Sergei Golubchik	62ea1b4407	BUG#31761802 STATISTICS ANY QUERIES USING VIEWS ARE SUMMARIZED TOGETHER WITH THE VIEW DEFINITION SELECT test case only	2021-12-07 21:31:27 +01:00
Marko Mäkelä	4489a89c71	MDEV-27094 Debug builds include useless InnoDB "disabled" options The following options were introduced in commit `2e814d4702` (mariadb-10.2.2) and have little use: innodb_disable_resize_buffer_pool_debug had no effect even in MariaDB 10.2.2 or MySQL 5.7.9. It was introduced in mysql/mysql-server@5c4094cf49 to work around a problem that was fixed in mysql/mysql-server@2957ae4f99 (but the parameter was not removed). innodb_page_cleaner_disabled_debug and innodb_master_thread_disabled_debug are only used by the test innodb.redo_log_during_checkpoint that will be removed as part of this commit. innodb_dict_stats_disabled_debug is only used by that test, and it is redundant because one could simply use innodb_stats_persistent=OFF or the STATS_PERSISTENT=0 attribute of the table in the test to achieve the same effect.	2021-11-19 17:46:16 +02:00
Rucha Deodhar	d5e606c605	MDEV-26611: ERROR_INDEX isn't intuitively clear Fixup for MDEV-10075 Analysis: ERROR_INDEX implemented in MDEV-10075 was not intuitively clear. Fix: changed parser to use ROW_NUMBER instead of ERROR_INDEX. Removed ERROR_INDEX and ERROR_INDEX_SYM from related files. Changed m_error_index to m_row_number.	2021-10-05 12:44:55 +05:30
Marko Mäkelä	2255649939	Merge 10.6 into 10.7	2021-09-17 20:23:17 +03:00
Marko Mäkelä	03c09837fc	Merge 10.5 into 10.6	2021-09-16 20:17:12 +03:00
Monty	b4f24c745a	Merge branch '10.4' into 10.5 Fixed also an error in suite/perfschema/t/transaction_nested_events-master.opt	2021-09-15 20:23:07 +03:00
Sergei Golubchik	9d65d2f9d0	fix tests after `ea06c67a49`	2021-09-15 12:21:25 +02:00
Monty	8d08971c84	Removed CREATE/DROP TABLESPACE and related commands - DISCARD/IMPORT TABLESPACE are the only tablespace commands left - TABLESPACE arguments for CREATE TABLE and ALTER ... ADD PARTITION are ignored. - Tablespace names are not shown anymore in .frm and not shown in information schema Other things - Removed end spaces from sql/CMakeList.txt	2021-09-14 18:04:09 +03:00
Monty	267a07e846	MDEV-26307 multi-source-replication support mysql syntax(for channel) Author: woqutech Reviewer: monty@mariadb.org	2021-09-14 17:57:27 +03:00
Marko Mäkelä	15139964d5	Merge 10.5 into 10.6	2021-09-11 17:55:27 +03:00
Sergei Golubchik	40b743f99e	remove redundant select in the perfschema.show_aggregate test instead, include handler_rollback in the following per-connection selects	2021-09-11 12:10:23 +02:00
Vicențiu Ciorbaru	8fe927e6de	Expand performance_schema tables definitions with column comments Cover all columns that did not have comments. Adjust docs based off of MariaDB implementation.	2021-09-10 17:16:50 +03:00
Haidong Ji	cc71dc0b61	MDEV-25325 built-in documentation for performance_schema tables Improve documentation of performance_schema tables by appending COLUMN comments to tables. Additionally improve test coverage and update corresponding tests. This is part of the patch covering newer columns and tables in 10.5.	2021-09-10 17:16:40 +03:00
Vicențiu Ciorbaru	7c33ecb665	Merge remote-tracking branch 'upstream/10.4' into 10.5	2021-09-10 17:16:18 +03:00
Vicențiu Ciorbaru	de7e027d5e	Merge remote-tracking branch 'upstream/10.3' into 10.4	2021-09-09 09:23:35 +03:00
Vicențiu Ciorbaru	b85b8348e7	Merge branch '10.2' into 10.3	2021-09-07 16:32:35 +03:00
Haidong Ji	528abc749e	MDEV-25325 built-in documentation for performance_schema tables Improve documentation of performance_schema tables by appending COLUMN comments to tables. Additionally improve test coverage and update corresponding tests.	2021-09-07 08:45:19 +03:00
Marko Mäkelä	82b7c561b7	MDEV-24258 Merge dict_sys.mutex into dict_sys.latch In the parent commit, dict_sys.latch could theoretically have been replaced with a mutex. But, we can do better and merge dict_sys.mutex into dict_sys.latch. Generally, every occurrence of dict_sys.mutex_lock() will be replaced with dict_sys.lock(). The PERFORMANCE_SCHEMA instrumentation for dict_sys_mutex will be removed along with dict_sys.mutex. The dict_sys.latch will remain instrumented as dict_operation_lock. Some use of dict_sys.lock() will be replaced with dict_sys.freeze(), which we will reintroduce for the new shared mode. Most notably, concurrent table lookups are possible as long as the tables are present in the dict_sys cache. In particular, this will allow more concurrency among InnoDB purge workers. Because dict_sys.mutex will no longer 'throttle' the threads that purge InnoDB transaction history, a performance degradation may be observed unless innodb_purge_threads=1. The table cache eviction policy will become FIFO-like, similar to what happened to fil_system.LRU in commit `45ed9dd957`. The name of the list dict_sys.table_LRU will become somewhat misleading; that list contains tables that may be evicted, even though the eviction policy no longer is least-recently-used but first-in-first-out. (Note: Tables can never be evicted as long as locks exist on them or the tables are in use by some thread.) As demonstrated by the test perfschema.sxlock_func, there will be less contention on dict_sys.latch, because some previous use of exclusive latches will be replaced with shared latches. fts_parse_sql_no_dict_lock(): Replaced with pars_sql(). fts_get_table_name_prefix(): Merged to fts_optimize_create(). dict_stats_update_transient_for_index(): Deduplicated some code. 
ha_innobase::info_low(), dict_stats_stop_bg(): Use a combination of dict_sys.latch and table->stats_mutex_lock() to cover the changes of BG_STAT_SHOULD_QUIT, because the flag is being read in dict_stats_update_persistent() while not holding dict_sys.latch. row_discard_tablespace_for_mysql(): Protect stats_bg_flag by exclusive dict_sys.latch, like most other code does. row_quiesce_table_has_fts_index(): Remove unnecessary mutex acquisition. FLUSH TABLES...FOR EXPORT is protected by MDL. row_import::set_root_by_heuristic(): Remove unnecessary mutex acquisition. ALTER TABLE...IMPORT TABLESPACE is protected by MDL. row_ins_sec_index_entry_low(): Replace a call to dict_set_corrupted_index_cache_only(). Reads of index->type were not really protected by dict_sys.mutex, and writes (flagging an index corrupted) should be extremely rare. dict_stats_process_entry_from_defrag_pool(): Only freeze the dictionary, do not lock it exclusively. dict_stats_wait_bg_to_stop_using_table(), DICT_BG_YIELD: Remove trx. We can simply invoke dict_sys.unlock() and dict_sys.lock() directly. dict_acquire_mdl_shared()<trylock=false>: Assert that dict_sys.latch is only held in shared mode, not exclusive mode. Only acquire it in exclusive mode if the table needs to be loaded to the cache. dict_sys_t::acquire(): Remove. Relocating elements in dict_sys.table_LRU would require holding an exclusive latch, which we want to avoid for performance reasons. dict_sys_t::allow_eviction(): Add the table first to dict_sys.table_LRU, to compensate for the removal of dict_sys_t::acquire(). This function is only invoked by INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS. dict_table_open_on_id(), dict_table_open_on_name(): If dict_locked=false, try to acquire dict_sys.latch in shared mode. Only acquire the latch in exclusive mode if the table is not found in the cache. Reviewed by: Thirunarayanan Balathandayuthapani	2021-08-31 13:51:35 +03:00
Marko Mäkelä	eb9a28478f	Merge 10.5 into 10.6	2021-07-20 10:54:17 +03:00
Marko Mäkelä	b4ec3313f6	Merge 10.4 into 10.5	2021-07-20 09:32:11 +03:00
Sergei Golubchik	bd3ac6758a	fix perfschema.sizing_* tests to run still cannot be enabled permanently, but at least they could be run manually, if needed	2021-07-16 19:14:28 +02:00
Dmitry Shulga	97e8d27bed	MDEV-16708: fix in test failures(added --enable_prepared_warnings/--disable_prepared_warnings)	2021-06-17 19:30:24 +02:00
Dmitry Shulga	ccb0504fb0	MDEV-16708: fix in test failures caused by missing warnings received in prepare response packet	2021-06-17 19:30:24 +02:00
Sergei Golubchik	f4943b4ace	cleanup perfschema.short_options_1 test the test tests whether short options work on the server command line * remove 'show variables' for variables not affected by short options * remove options, that are not short * remove options, that cannot be tested from SQL * in particular, -T12 doesn't affect the test output, but causes a ~30sec delay on shutdown * use -W1 as -W2 is the default, so doesn't affect the test output	2021-06-11 13:02:55 +02:00
Marko Mäkelä	860e754349	Merge 10.5 into 10.6	2021-05-26 11:22:40 +03:00
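
Commit a635c40648 above describes sspin_lock as "a simple spin lock, for log_sys.lsn_lock". The following is a minimal sketch of what a simple spin lock can look like in C++. It is not the code from that commit: the names SimpleSpinLock and count_with_lock are hypothetical, and the choice of std::atomic<bool> with a yield in the wait loop is an assumption made only for illustration.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical illustration only; not the sspin_lock from the MariaDB source.
class SimpleSpinLock {
  std::atomic<bool> locked_{false};

public:
  void lock() noexcept {
    // exchange() returns the previous value; we own the lock once we are
    // the thread that changed it from false to true.
    while (locked_.exchange(true, std::memory_order_acquire)) {
      // Wait with plain loads so that waiters do not keep writing to the
      // cache line while the lock is held by someone else.
      while (locked_.load(std::memory_order_relaxed))
        std::this_thread::yield();
    }
  }

  void unlock() noexcept {
    locked_.store(false, std::memory_order_release);
  }
};

// Increments a shared counter from several threads, holding the lock only
// around the increment, and returns the final count.
unsigned long count_with_lock(unsigned threads, unsigned long per_thread) {
  SimpleSpinLock lock;
  unsigned long counter = 0;
  std::vector<std::thread> workers;
  for (unsigned i = 0; i < threads; ++i) {
    workers.emplace_back([&] {
      for (unsigned long n = 0; n < per_thread; ++n) {
        lock.lock();
        ++counter;
        lock.unlock();
      }
    });
  }
  for (auto &t : workers)
    t.join();
  return counter;
}
```

A lock of this kind only suits very short critical sections, such as the log buffer allocation that the commit message says log_sys.lsn_lock protects; for longer waits a blocking mutex is usually the better choice.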


892 Commits