mariadb

mirror of https://github.com/MariaDB/server.git synced 2025-12-07 17:42:39 +03:00

Author	SHA1	Message	Date
Marko Mäkelä	4ca355d863	MDEV-33894: Resurrect innodb_log_write_ahead_size As part of commit `685d958e38` (MDEV-14425) the parameter innodb_log_write_ahead_size was removed, because it was thought that determining the physical block size would be a sufficient replacement. However, we can only determine the physical block size on Linux or Microsoft Windows. On some file systems, the physical block size is not relevant. For example, XFS uses a block size of 4096 bytes even if the underlying block size may be smaller. On Linux, we failed to determine the physical block size if innodb_log_file_buffered=OFF was not requested or possible. This will be fixed. log_sys.write_size: The value of the reintroduced parameter innodb_log_write_ahead_size. To keep it simple, this is read-only and a power of two between 512 and 4096 bytes, so that the previous alignment guarantees are fulfilled. This will replace the previous log_sys.get_block_size(). log_sys.block_size, log_t::get_block_size(): Remove. log_t::set_block_size(): Ensure that write_size will not be less than the physical block size. There is no point to invoke this function with 512 or less, because that is the minimum value of write_size. innodb_params_adjust(): Add some disabled code for adjusting the minimum value and default value of innodb_log_write_ahead_size to reflect the log_sys.write_size. log_t::set_recovered(): Mark the recovery completed. This is the place to adjust some things if we want to allow write_size>4096. log_t::resize_write_buf(): Refer to write_size. log_t::resize_start(): Refer to write_size instead of get_block_size(). log_write_buf(): Simplify some arithmetics and remove a goto. log_t::write_buf(): Refer to write_size. If we are writing less than that, do not switch buffers, but keep writing to the same buffer. Move some code to improve the locality of reference. recv_scan_log(): Refer to write_size instead of get_block_size(). os_file_create_func(): For type==OS_LOG_FILE on Linux, always invoke os_file_log_maybe_unbuffered(), so that log_sys.set_block_size() will be invoked even if we are not attempting to use O_DIRECT. recv_sys_t::find_checkpoint(): Read the entire log header in a single 12 KiB request into log_sys.buf. Tested with: ./mtr --loose-innodb-log-write-ahead-size=4096 ./mtr --loose-innodb-log-write-ahead-size=2048	2024-06-27 16:38:08 +03:00
Alexey Yurchenko	a1e5a284fc	MDEV-31809 Automatic SST user account management Implement automatic creation of temporary accounts for SST and pass account credentials to SST script via socket as opposed to environment variables. Delete the user after the SST script returns, Respect wsrep_sst_auth set by the adminitrator in case some additional privilege grants are needed for particular SST method. mysqldump SST requires significant change to make use of the new automatic user generation facility. For now just make it compatible by ignoring automatically generated user and rely only on wsrep_sst_auth setting on the joiner node to keep backward compatibility. Adapt mysqldump SST to automatic SST user generation changes: - disable special treatment for mysqldump SST on donor - make mysqldump SST script compatible with the new SST script interface. Differentiate user privileges for different SST methods: - grant minimum required privileges for clone and xtrabackup SST accounts - grant all privileges to custom SST accounts as it is not known what is needed. - disable SST account generation for rsync SST since it is not needed. MTR tests: - add MTR tests for clone and xtrabackup SSTs without wsrep_sst_auth, - add MTR test for testing masking of wsrep_sst_auth. - don't attmept to restore original wsrep_sst_auth in MTR tests as it is always masked. Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>	2024-06-10 23:29:05 +02:00
Sergei Golubchik	26f01f8be5	11.6 branch	2024-05-28 10:07:05 +02:00
Monty	b9f5793176	MDEV-9101 Limit size of created disk temporary files and tables Two new variables added: - max_tmp_space_usage : Limits the the temporary space allowance per user - max_total_tmp_space_usage: Limits the temporary space allowance for all users. New status variables: tmp_space_used & max_tmp_space_used New field in information_schema.process_list: TMP_SPACE_USED The temporary space is counted for: - All SQL level temporary files. This includes files for filesort, transaction temporary space, analyze, binlog_stmt_cache etc. It does not include engine internal temporary files used for repair, alter table, index pre sorting etc. - All internal on disk temporary tables created as part of resolving a SELECT, multi-source update etc. Special cases: - When doing a commit, the last flush of the binlog_stmt_cache will not cause an error even if the temporary space limit is exceeded. This is to avoid giving errors on commit. This means that a user can temporary go over the limit with up to binlog_stmt_cache_size. Noteworthy issue: - One has to be careful when using small values for max_tmp_space_limit together with binary logging and with non transactional tables. If a the binary log entry for the query is bigger than binlog_stmt_cache_size and one hits the limit of max_tmp_space_limit when flushing the entry to disk, the query will abort and the binary log will not contain the last changes to the table. This will also stop the slave! This is also true for all Aria tables as Aria cannot do rollback (except in case of crashes)! One way to avoid it is to use @@binlog_format=statement for queries that updates a lot of rows. Implementation: - All writes to temporary files or internal temporary tables, that increases the file size, are routed through temp_file_size_cb_func() which updates and checks the temp space usage. - Most of the temporary file monitoring is done inside IO_CACHE. Temporary file monitoring is done inside the Aria engine. - MY_TRACK and MY_TRACK_WITH_LIMIT are new flags for ini_io_cache(). MY_TRACK means that we track the file usage. TRACK_WITH_LIMIT means that we track the file usage and we give an error if the limit is breached. This is used to not give an error on commit when binlog_stmp_cache is flushed. - global_tmp_space_used contains the total tmp space used so far. This is needed quickly check against max_total_tmp_space_usage. - Temporary space errors are using EE_LOCAL_TMP_SPACE_FULL and handler errors are using HA_ERR_LOCAL_TMP_SPACE_FULL. This is needed until we move general errors to it's own error space so that they cannot conflict with system error numbers. - Return value of my_chsize() and mysql_file_chsize() has changed so that -1 is returned in the case my_chsize() could not decrease the file size (very unlikely and will not happen on modern systems). All calls to _chsize() are updated to check for > 0 as the error condition. - At the destruction of THD we check that THD::tmp_file_space == 0 - At server end we check that global_tmp_space_used == 0 - As a precaution against errors in the tmp_space_used code, one can set max_tmp_space_usage and max_total_tmp_space_usage to 0 to disable the tmp space quota errors. - truncate_io_cache() function added. - Aria tables using static or dynamic row length are registered in 8K increments to avoid some calls to update_tmp_file_size(). Other things: - Ensure that all handler errors are registered. Before, some engine errors could be printed as "Unknown error". - Fixed bug in filesort() that causes a assert if there was an error when writing to the temporay file. - Fixed that compute_window_func() now takes into account write errors. - In case of parallel replication, rpl_group_info::cleanup_context() could call trans_rollback() with thd->error set, which would cause an assert. Fixed by resetting the error before calling trans_rollback(). - Fixed bug in subselect3.inc which caused following test to use heap tables with low value for max_heap_table_size - Fixed bug in sql_expression_cache where it did not overflow heap table to Aria table. - Added Max_tmp_disk_space_used to slow query log. - Fixed some bugs in log_slow_innodb.test	2024-05-27 12:39:04 +02:00
Sergei Golubchik	9293d40fa7	MDEV-33145 support for old-mode=OLD_FLUSH_STATUS add old-mode that restores inconsistent legacy behavior for FLUSH STATUS. It doesn't affect FLUSH { SESSION \| GLOBAL } STATUS.	2024-05-27 12:39:03 +02:00
Monty	775cba4d0f	MDEV-33145 Add FLUSH GLOBAL STATUS - FLUSH GLOBAL STATUS now resets most global_status_vars. At this stage, this is mainly to be used for testing. - FLUSH SESSION STATUS added as an alias for FLUSH STATUS. - FLUSH STATUS does not require any privilege (before required RELOAD). - FLUSH GLOBAL STATUS requires RELOAD privilege. - All global status reset moved to FLUSH GLOBAL STATUS. - Replication semisync status variables are now reset by FLUSH GLOBAL STATUS. - In test cases, the only changes are: - Replace FLUSH STATUS with FLUSH GLOBAL STATUS - Replace FLUSH STATUS with FLUSH STATUS; FLUSH GLOBAL STATUS. This was only done in a few tests where the test was using SHOW STATUS for both local and global variables. - Uptime_since_flush_status is now always provided, independent if ENABLED_PROFILING is enabled when compiling MariaDB. - @@global.Uptime_since_flush_status is reset on FLUSH GLOBAL STATUS and @@session.Uptime_since_flush_status is reset on FLUSH SESSION STATUS. - When connected, @@session.Uptime_since_flush_status is set to 0.	2024-05-27 12:39:03 +02:00
Sergei Golubchik	cc758332ba	ER_VARIABLE_DELETED fix typos, adjust wording, fix plugins. plugins can have unused variables too. If they use a literal "Unused" string a compiler might or might not merge two identical strings into one (-fmerge-constants) and depending on that the server will or will not issue a "variable is ignored" warning.	2024-05-27 12:39:03 +02:00
Monty	2464ee758a	MDEV-33655 Remove alter_algorithm Remove alter_algorithm but keep the variable as no-op (with a warning). The reasons for removing alter_algorithm are: - alter_algorithm was introduced as a replacement for the old_alter_table that was used to force the usage of the original alter table algorithm (copy) in the cases where the new alter algorithm did not work. The new option was added as a way to force the usage of a specific algorithm when it should instead have made it possible to disable algorithms that would not work for some reason. - alter_algorithm introduced some cases where ALTER TABLE would not work without specifying the ALGORITHM=XXX option together with ALTER TABLE. - Having different values of alter_algorithm on master and slave could cause slave to stop unexpectedly. - ALTER TABLE FORCE, as used by mariadb-upgrade, would not always work if alter_algorithm was set for the server. - As part of the MDEV-33449 "improving repair of tables" it become clear that alter- algorithm made it harder to provide a better and more consistent ALTER TABLE FORCE and REPAIR TABLE and it would be better to remove it.	2024-05-27 12:39:03 +02:00
Monty	8af7a99443	Fixed warnings when using deprecated variables Also fixed that all unused variables are using the same variable comment. The warning will be tested with the next commit that deprecates the variable alter_algorithm.	2024-05-27 12:39:02 +02:00
Monty	dfdedd46e4	MDEV-32188 make TIMESTAMP use whole 32-bit unsigned range This patch extends the timestamp from 2038-01-19 03:14:07.999999 to 2106-02-07 06:28:15.999999 for 64 bit hardware and OS where 'long' is 64 bits. This is true for 64 bit Linux but not for Windows. This is done by treating the 32 bit stored int as unsigned instead of signed. This is safe as MariaDB has never accepted dates before the epoch (1970). The benefit of this approach that for normal timestamp the storage is compatible with earlier version. However for tables using system versioning we before stored a timestamp with the year 2038 as the 'max timestamp', which is used to detect current values. This patch stores the new 2106 year max value as the max timestamp. This means that old tables using system versioning needs to be updated with mariadb-upgrade when moving them to 11.4. That will be done in a separate commit.	2024-05-27 12:39:02 +02:00
Sergei Golubchik	5296f908ed	MDEV-28671 post-testing fixes Various help message improvements: * MySQL->MariaDB, mysqld->mariadbd, "mysqld daemon" -> "mariadbd process" * typos * don't specify defaults directly in the help message * don't say that an option is deprecated, mark is as such * missing spaces in the middle of the text etc	2024-05-27 12:39:02 +02:00
Sergei Golubchik	df10a945fc	MDEV-28671 post-merge fixes * use new deprecated printer for all deprecated server options * restore alphabetic option sorting order * move deprecated printer from mysqld.cc to my_getopt.c * in --help print deprecation message at the end of the option help * move 'ALL' help text where it belongs - to other SET options, and with a correct indentation. * consistently end all or none command-line option help strings with a dot - my_print_help() needs that. It's about 50/50 now, so let's do none, less line wraps in --help * remove trailing spaces from command-line option help strings	2024-05-27 12:39:02 +02:00
Oleksandr Byelkin	6c323c7a03	Fix version	2024-05-23 21:54:29 +02:00
Oleksandr Byelkin	dd7d9d7fb1	Merge branch '11.4' into 11.5	2024-05-23 17:01:43 +02:00
Sergei Golubchik	19f7edf420	mysqltest: support MARIADB_OPT_RESTRICTED_AUTH C/C 3.4 disables mysql_old_password by default, so add an option for the `connect` command to support specifying allowed authentication plugins (MARIADB_OPT_RESTRICTED_AUTH). use it to enable mysql_old_password when needed for testing	2024-05-21 19:40:03 +02:00
Oleksandr Byelkin	99b370e023	Merge branch '11.2' into 11.4	2024-05-21 19:38:51 +02:00
Sergei Golubchik	bf5da43e50	Merge branch '11.1' into 11.2	2024-05-13 10:00:26 +02:00
Sergei Golubchik	f8621f2a16	remove redundant slow tests	2024-05-13 09:55:28 +02:00
Sergei Golubchik	f0a5412037	Merge branch '11.0' into 11.1	2024-05-13 09:52:30 +02:00
Sergei Golubchik	f9807aadef	Merge branch '10.11' into 11.0	2024-05-12 12:18:28 +02:00
Yuchen Pei	b86a2f03b6	MDEV-32640 Reset thd->lex->mi.connection_name.str towards the end of mysql_execute_command Reset the connection_name to contain a null string, if the pointer points to the same space as that of the system variable default_master_connection. We do this because the system variable may be updated which could free the pointer and create a new one, causing use-after-free for re-execution of prepared statements and stored procedures where the LEX may be reused. This allows connection_name to be set again be to the system variable pointer in the next call of this function (see earlier in this function), after any possible updates to the system variable.	2024-05-07 14:54:13 +10:00
Sergei Golubchik	018d537ec1	Merge branch '10.6' into 10.11	2024-04-22 15:23:10 +02:00
Sergei Golubchik	3f9182126c	mysqltest: support MARIADB_OPT_RESTRICTED_AUTH C/C 3.4 disables mysql_old_password by default, so add an option for the `connect` command to support specifying allowed authentication plugins (MARIADB_OPT_RESTRICTED_AUTH). use it to enable mysql_old_password when needed for testing	2024-04-22 14:59:05 +02:00
Sergei Golubchik	41296a07c8	Merge branch '10.5' into 10.6	2024-04-11 13:58:22 +02:00
Alexander Barkov	9fb8881ef8	MDEV-28366 GLOBAL debug_dbug setting affected by collation_connection=utf16... When the system variables @@debug_dbug was assigned to some expression, Sys_debug_dbug::do_check() did not properly convert the value from the expression character set to utf8. So the value was erroneously re-interpretted as utf8 without conversion. In case of a tricky expression character set (e.g. utf16le), this led to unexpected results. Fix: Re-using Sys_var_charptr::do_string_check() in Sys_debug_dbug::do_check().	2024-04-10 06:09:45 +04:00
Yuchen Pei	e0b6db2de7	MDEV-31609 Send initial values of system variables in first OK packet Values of all session tracking system variables will be sent in the first ok packet upon connection after successful authentication. Also updated mtr to print session track info on connection (h/t Sergei Golubchik) so that we can write mtr tests for this change.	2024-04-10 11:13:46 +10:00
Oleksandr Byelkin	cd28b2479c	Merge branch '11.1' into 11.2	2024-04-09 12:12:33 +02:00
Marko Mäkelä	0892e6d028	MDEV-33585 The maximum innodb_log_buffer_size is too large On Microsoft Windows, ReadFile() as well as WriteFile() limit the size of the request to DWORD, which is 32 bits (at most 4 GiB - 1) also on 64-bit systems. On FreeBSD, sysctl debug.iosize_max_clamp could limit the size of a write request to INT_MAX. The size of a read request is always limited to INT_MAX. This would allow the request size to be 4095 bytes more than the Linux limit (0x7ffff000 according to "man 2 read" and "man 2 write"). On OpenBSD, Solaris and possibly NetBSD, the read request size is limited to SSIZE_T_MAX, which would be half the current maximum innodb_log_buffer_size. This should be not much of an issue anyway, because on contemporary 64-bit platforms, the virtual addresses are limited to 48 bits. IBM AIX documentation mentions OFF_MAX which would apply when a 64-bit application is running on a 32-bit kernel. Let us declare innodb_log_buffer_size as 32-bit unsigned and make the maximum 0x7ffff000, to be compatible with the least common denominator (Linux). The maximum innodb_sort_buffer_size already was 64 MiB, which is not a problem. SyncFileIO::execute(): Assert that the size of a synchronous read or write request is limited to the maximum. Reviewed by: Vladislav Vaintroub	2024-04-09 09:32:47 +03:00
Marko Mäkelä	1122ac978e	MDEV-33545: Improve innodb_doublewrite to cover NO_FSYNC In commit `24648768b4` (MDEV-30136) the parameter innodb_flush_method was deprecated, with no direct replacement for innodb_flush_method=O_DIRECT_NO_FSYNC. Let us change innodb_doublewrite from Boolean to ENUM that can be changed while the server is running: OFF: Assume that writes of innodb_page_size are atomic ON: Prevent torn writes (the default) fast: Like ON, but avoid synchronizing writes to data files The deprecated start-up parameter innodb_flush_method=NO_FSYNC will cause innodb_doublewrite=ON to be changed to innodb_doublewrite=fast, which will prevent InnoDB from making any durable writes to data files. This would normally be done right before the log checkpoint LSN is updated. Depending on the file systems being used and their configuration, this may or may not be safe. The value innodb_doublewrite=fast differs from the previous combination of innodb_doublewrite=ON and innodb_flush_method=O_DIRECT_NO_FSYNC by always invoking os_file_flush() on the doublewrite buffer itself in buf_dblwr_t::flush_buffered_writes_completed(). This should be safer when there are multiple doublewrite batches between checkpoints. Typically, once per second, buf_flush_page_cleaner() would write out up to innodb_io_capacity pages and advance the log checkpoint. Also typically, innodb_io_capacity>128, which is the size of the doublewrite buffer in pages. Should os_file_flush_func() not be invoked between doublewrite batches, writes could be reordered in an unsafe way. The setting innodb_doublewrite=fast could be safe when the doublewrite buffer (the first file of the system tablespace) and the data files reside in the same file system. This was tested by running "./mtr --rr innodb.alter_kill". On the first server startup, with innodb_doublewrite=fast, os_file_flush_func() would only be invoked on the ibdata1 file and possibly ib_logfile0. On subsequent startups with innodb_doublewrite=OFF, os_file_flush_func() will be invoked on the individual data files during log_checkpoint(). Note: The setting debug_no_sync (in the code, my_disable_sync) would disable all durable writes to InnoDB files, which would be much less safe. IORequest::Type: Introduce special values WRITE_DBL and PUNCH_DBL for asynchronous writes that are submitted via the doublewrite buffer. In this way, fil_space_t::use_doublewrite() or buf_dblwr.in_use() will only be consulted during buf_page_t::flush() and the doublewrite buffer can be enabled or disabled without any fear of inconsistency. buf_dblwr_t::block_size: Replaces block_size(). buf_dblwr_t::flush_buffered_writes(): If !in_use() and the doublewrite buffer is empty, just invoke fil_flush_file_spaces() and return. The doublewrite buffer could have been disabled while a batch was in progress. innodb_init_params(): If innodb_flush_method=O_DIRECT_NO_FSYNC, set innodb_doublewrite=fast or innodb_doublewrite=fearless. Thanks to Mark Callaghan for reporting this, and Vladislav Vaintroub for feedback.	2024-04-04 08:12:54 +03:00
Marko Mäkelä	683fbced6b	Merge 11.0 into 11.1	2024-03-28 12:15:36 +02:00
Marko Mäkelä	fec2fd6add	Merge 10.11 into 11.0	2024-03-28 10:51:36 +02:00
Marko Mäkelä	788953463d	Merge 10.6 into 10.11 Some fixes related to commit `f838b2d799` and Rows_log_event::do_apply_event() and Update_rows_log_event::do_exec_row() for system-versioned tables were provided by Nikita Malyavin. This was required by test versioning.rpl,trx_id,row.	2024-03-28 09:16:57 +02:00
Sergei Golubchik	f50694c52b	remove pointless test	2024-03-27 16:14:55 +01:00
Marko Mäkelä	bf0b82d24b	MDEV-33515 log_sys.lsn_lock causes excessive context switching The log_sys.lsn_lock is a very contended resource with a small critical section in log_sys.append_prepare(). On many processor microarchitectures, replacing the system call based log_sys.lsn_lock with a pure spin lock would fare worse during high concurrency workloads, wasting a significant amount of CPU cycles in the spin loop. On other microarchitectures, we would see a significant amount of time being spent in native_queued_spin_lock_slowpath() in the Linux kernel, plus context switching between user and kernel address space. This was pointed out by Steve Shaw from Intel Corporation. Depending on the workload and the hardware implementation, it may be useful to use a pure spin lock in log_sys.append_prepare(). We will introduce a parameter. The statement SET GLOBAL INNODB_LOG_SPIN_WAIT_DELAY=50; would enable a spin lock that will execute that many MY_RELAX_CPU() operations (such as the x86 PAUSE instruction) between successive attempts of acquiring the spin lock. The use of a system call based log_sys.lsn_lock (which is the default setting) can be enabled by SET GLOBAL INNODB_LOG_SPIN_WAIT_DELAY=0; This patch will also introduce #ifdef LOG_LATCH_DEBUG (part of cmake -DWITH_INNODB_EXTRA_DEBUG=ON) for more accurate tracking of log_sys.latch ownership and reorganize the fields of log_sys to improve the locality of reference and to reduce the chances of false sharing. When a spin lock is being used, it will be maintained in the most significant bit of log_sys.buf_free. This is useful, because that is one of the fields that is covered by the lock. For IA-32 or AMD64, we implement the spin lock specially via log_t::lsn_lock_bts(), employing the i386 LOCK BTS instruction. A straightforward std::atomic::fetch_or() would translate into an inefficient loop around LOCK CMPXCHG. mtr_t::spin_wait_delay: The value of innodb_log_spin_wait_delay. mtr_t::finisher: Pointer to the currently used mtr_t::finish_write() implementation. This allows to avoid introducing conditional branches. We no longer invoke log_sys.is_pmem() at the mini-transaction level, but we would do that in log_write_up_to(). mtr_t::finisher_update(): Update finisher when spin_wait_delay is changed from or to 0 (the spin lock is changed to log_sys.lsn_lock or vice versa).	2024-03-22 12:29:01 +02:00
Marko Mäkelä	b8a6719889	MDEV-26642/MDEV-26643/MDEV-32898 Implement innodb_snapshot_isolation https://jepsen.io/analyses/mysql-8.0.34 highlights that the transaction isolation levels in the InnoDB storage engine do not correspond to any widely accepted definitions, such as "Generalized Isolation Level Definitions" https://pmg.csail.mit.edu/papers/icde00.pdf (PL-1 = READ UNCOMMITTED, PL-2 = READ COMMITTED, PL-2.99 = REPEATABLE READ, PL-3 = SERIALIZABLE). Only READ UNCOMMITTED in InnoDB seems to match the above definition. The issue is that InnoDB does not detect write/write conflicts (Section 4.4.3, Definition 6) in the above. It appears that as soon as we implement write/write conflict detection (SET SESSION innodb_snapshot_isolation=ON), the default isolation level (SET TRANSACTION ISOLATION LEVEL REPEATABLE READ) will become Snapshot Isolation (similar to Postgres), as defined in Section 4.2 of "A Critique of ANSI SQL Isolation Levels", MSR-TR-95-51, June 1995 https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/tr-95-51.pdf Locking reads inside InnoDB used to read the latest committed version, ignoring what should actually be visible to the transaction. The added test innodb.lock_isolation illustrates this. The statement UPDATE t SET a=3 WHERE b=2; is executed in a transaction that was started before a read view or a snapshot of the current transaction was created, and committed before the current transaction attempts to execute UPDATE t SET b=3; If SET innodb_snapshot_isolation=ON is in effect when the second transaction was started, the second transaction will be aborted with the error ER_CHECKREAD. By default (innodb_snapshot_isolation=OFF), the second transaction would execute inconsistently, displaying an incorrect SELECT COUNT(*) FROM t in its read view. If innodb_snapshot_isolation=ON, if an attempt to acquire a lock on a record that does not exist in the current read view is made, an error DB_RECORD_CHANGED (HA_ERR_RECORD_CHANGED, ER_CHECKREAD) will be raised. This error will be treated in the same way as a deadlock: the transaction will be rolled back. lock_clust_rec_read_check_and_lock(): If the current transaction has a read view where the record is not visible and innodb_snapshot_isolation=ON, fail before trying to acquire the lock. row_sel_build_committed_vers_for_mysql(): If innodb_snapshot_isolation=ON, disable the "semi-consistent read" logic that had been implemented by myself on the directions of Heikki Tuuri in order to address https://bugs.mysql.com/bug.php?id=3300 that was motivated by a customer wanting UPDATE to skip locked rows that do not match the WHERE condition. It looks like my changes were included in the MySQL 5.1.5 commit ad126d90e019f223470e73e1b2b528f9007c4532; at that time, employees of Innobase Oy (a recent acquisition of Oracle) had lost write access to the repository. The only reason why we set innodb_snapshot_isolation=OFF by default is backward compatibility with applications, such as the one that motivated the implementation of "semi-consistent read" back in 2005. In a later major release, we can default to innodb_snapshot_isolation=ON. Thanks to Peter Alvaro, Kyle Kingsbury and Alexey Gotsman for their work on https://github.com/jepsen-io/ and to Kyle and Alexey for explanations and some testing of this fix. Thanks to Vladislav Lesin for the initial test for MDEV-26643, as well as reviewing these changes.	2024-03-20 09:48:03 +02:00
Alexander Barkov	929c2e06aa	MDEV-31531 Remove my_casedn_str() and my_caseup_str() Under terms of MDEV 27490 we'll add support for non-BMP identifiers and upgrade casefolding information to Unicode version 14.0.0. In Unicode-14.0.0 conversion to lower and upper cases can increase octet length of the string, so conversion won't be possible in-place any more. This patch removes virtual functions performing in-place casefolding: - my_charset_handler_st::casedn_str() - my_charset_handler_st::caseup_str() and fixes the code to use the non-inplace functions instead: - my_charset_handler_st::casedn() - my_charset_handler_st::caseup()	2024-02-28 22:20:29 +04:00
Marko Mäkelä	d73baa402a	Merge 10.11 into 11.0	2024-02-20 12:02:01 +02:00
Sergei Golubchik	eeba940311	remove deprecated since 10.4	2024-02-17 17:10:25 +01:00
Sergei Golubchik	04f0504831	11.5 branch	2024-02-17 17:10:25 +01:00
Daniel Bartholomew	9b6e267bfd	bump the VERSION	2024-02-17 15:30:50 +01:00
Oleksandr Byelkin	fa69b085b1	Merge branch '11.3' into 11.4	2024-02-15 13:53:21 +01:00
Sergei Golubchik	3ae6680eec	update 32bit rdiffs	2024-02-15 00:03:27 +01:00
Marko Mäkelä	64cce8d5bf	Merge 10.6 into 10.11	2024-02-14 16:12:53 +02:00
Monty	18dfcfdecf	MDEV-31404 Implement binlog_space_limit binlog_space_limit is a variable in Percona server used to limit the total size of all binary logs. This implementation is based on code from Percona server 5.7. In MariaDB we decided to call the variable max-binlog-total-size to be similar to max-binlog-size. This makes it easier to find in the output from 'mariadbd --help --verbose'). MariaDB will also support binlog_space_limit for compatibility with Percona. Some internal notes to explain implementation notes: - When running MariaDB does not delete binary logs that are either used by slaves or have active xid that are not yet committed. Some implementation notes: - max-binlog-total-size is by default 0 (no limit). - max-binlog-total-size can be changed without server restart. - Binlog file sizes are checked on startup, or if max-binlog-total-size is set to a value > 0, not for every log write. The total size of all binary logs is cached and dynamically updated when updating the binary log on binary log rotation. - max-binlog-total-size is checked against existing log files during serverstart, binlog rotation, FLUSH LOGS, when writing to binary log or when max-binlog-total-size changes value. - Option --slave-connections-needed-for-purge with 1 as default added. This allows one to ensure that we do not delete binary logs if there is less than 'slave-connections-needed-for-purge' connected. Without this option max-binlog-total-size would potentially delete binlogs needed by slaves on server startup or when a slave disconnects as there are then no connected slaves to protect active binlogs. - PURGE BINARY LOGS TO ... will be executed as if slave-connectitons-needed-for-purge would be zero. In other words it will do the purge even if there is no slaves connected. If there are connected slaves working on the logs, these will be protected. - If binary log is on and max-binlog-total_size <> 0 then the status variable 'Binlog_disk_use' shows the current size of all old binary logs + the state of the current one. - Removed test of strcmp(log_file_name, log_info.log_file_name) in purge_logs_before_date() as this is tested in can_purge_logs() - To avoid expensive calls of log_in_use() we cache the result for the last log that is in use by a slave. Future calls to can_purge_logs() for this binary log will be quickly detected and false will be returned until a slave starts working on a new log. - Note that after a binary log rotation caused by max_binlog_size, the last log will not be purged directly as it is still in use internally. The next binary log write will purge binlogs if needed. Reviewer:Kristian Nielsen <knielsen@knielsen-hq.org>	2024-02-14 15:02:21 +01:00
Monty	3907345e22	MDEV-33306 Optimizer choosing incorrect index in 10.6, 10.5 but not in 10.4 In MariaDB up to 10.11, the test_if_cheaper_ordering() code (that tries to optimizer how GROUP BY is executed) assumes that if a table scan is used then if there is any index usable by GROUP BY it will be used. The reason MySQL 10.4 provides a better plan is because of two differences: - Plans using 'ref' has a cost of 1/10 of what it should be (as a protection against table scans). This is why 'ref' is used in 10.4 and not in 10.5. - When 'ref' is used, then GROUP BY will not use an index for GROUP BY. In MariaDB 10.5 the chosen plan is a table scan (as it calculated to be faster) but as 'ref' is not used, the test_if_cheaper_ordering() optimizer phase decides (as ref is not usd) to use an index for GROUP BY, which has bad performance. Description of fix: - All new code is protected by the "optimizer_adjust_secondary_key_costs" variable, which is now a bit map, and is only executed if the option "disable_forced_index_in_group_by" set. - Corrects GROUP BY handling in test_if_cheaper_ordering() by making the choise of using and index with GROUP BY cost based instead of rule based. - Adds TIME_FOR_COMPARE to all costs, when using group by, to make read_time, index_scan_time and range_cost comparable. Other things: - Made optimizer_adjust_secondary_key_costs a bit map (compatible with old code). Notes: Current code ignores costs for the algorithm used when doing GROUP BY on the first table: - Create an in-memory temporary table for handling group by and doing a filesort of the result file We can probably in 10.6 continue to ignore this cost. This patch should NOT be merged to 11.0 series (not needed in 11.0).	2024-02-12 16:43:00 +02:00
Marko Mäkelä	86c2c89743	Merge 10.6 into 10.11	2024-02-08 15:04:46 +02:00
Marko Mäkelä	91a2192bf2	Merge 10.5 into 10.6	2024-02-07 13:51:03 +02:00
Thirunarayanan Balathandayuthapani	c31b1ee26a	MDEV-33341 innodb.undo_space_dblwr test case fails with Unknown Storage Engine InnoDB - Failed to reset the innodb_fil_make_page_dirty_debug variable in innodb_saved_page_number_debug_basic test case.	2024-02-07 12:35:18 +02:00
Oleksandr Byelkin	d21cb43db1	Merge branch '11.2' into 11.3	2024-02-04 16:42:31 +01:00
Sergei Golubchik	75bfb4b8a3	deprecate SQL_NOTES variable in favor of NOTE_VERBOSITY as suggested by Monty	2024-02-03 11:22:20 +01:00

1 2 3 4 5 ...

2103 Commits