mariadb

mirror of https://github.com/MariaDB/server.git synced 2025-09-08 06:27:57 +03:00

Author	SHA1	Message	Date
Yuchen Pei	ba7088d462	Merge '11.4' into 11.6	2024-10-03 15:59:20 +10:00
Sergei Petrunia	2c3b298337	Merge 11.2 into 11.4	2024-09-09 14:40:02 +03:00
Sergei Petrunia	abd98336d2	Merge 10.11 -> 11.2	2024-09-09 13:50:38 +03:00
Marko Mäkelä	2da4839bb6	Merge 10.6 into 10.11	2024-09-06 14:45:22 +03:00
Sergei Petrunia	c8d040938a	MDEV-34720: Poor plan choice for large JOIN with ORDER BY and small LIMIT (Variant 2b: call greedy_search() twice, correct handling for limited search_depth) Modify the join optimizer to specifically try to produce join orders that can short-cut their execution for ORDER BY..LIMIT clause. The optimization is controlled by @@optimizer_join_limit_pref_ratio. Default value 0 means don't construct short-cutting join orders. Other value means construct short-cutting join order, and prefer it only if it promises speedup of more than #value times. In Optimizer Trace, look for these names: * join_limit_shortcut_is_applicable * join_limit_shortcut_plan_search * join_limit_shortcut_choice	2024-09-02 16:37:18 +03:00
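The preference rule that c8d040938a describes for @@optimizer_join_limit_pref_ratio can be sketched as below. This is a minimal illustration and not the server's code: `choose_join_order` and its two cost arguments are hypothetical names, and the sketch assumes both plans have already been costed.

```python
def choose_join_order(regular_cost: float, shortcut_cost: float, pref_ratio: int) -> str:
    """Pick a plan using a ratio rule like the one in the commit message.

    pref_ratio == 0 means short-cutting join orders are not considered at all.
    Otherwise the short-cutting plan wins only if it promises a speedup of
    more than pref_ratio times. All names here are illustrative assumptions.
    """
    if pref_ratio == 0:
        return "regular"
    # Compare by multiplication to avoid dividing by a zero cost.
    if regular_cost > pref_ratio * shortcut_cost:
        return "shortcut"
    return "regular"


print(choose_join_order(100.0, 5.0, 10))
print(choose_join_order(100.0, 20.0, 10))
```

With a ratio of 10, a plan that is 20 times cheaper is preferred, while one that is only 5 times cheaper is not.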
Oleksandr Byelkin	ea75a0b600	Merge branch '11.4' into 11.5	2024-08-05 17:50:18 +02:00
Oleksandr Byelkin	0e8fb977b0	Merge branch '10.6' into 10.11	2024-08-03 09:15:40 +02:00
Monty	4bf7c966b3	MDEV-34664: Add an option to fix InnoDB's doubling of secondary index cardinalities (With trivial fixes by sergey@mariadb.com) Added option fix_innodb_cardinality to optimizer_adjust_secondary_key_costs. Using fix_innodb_cardinality disables the 'divide by 2' of rec_per_key_int in InnoDB that in effect doubles the Cardinality for secondary keys. This has the biggest effect for indexes where a few rows have the same key value. Using this may also cause table scans for very small tables (which in some cases may be better than an index scan). The user visible effect is that 'SHOW INDEX FROM table_name' will for InnoDB show the true Cardinality (and not 2x the real value). It will also allow the optimizer to choose a better index in some cases, as the division by 2 could have a bad effect for tables with 2-5 identical values per key. A few notes about using fix_innodb_cardinality: - It has a direct effect on SHOW INDEX FROM table_name. SHOW INDEX will also update the statistics in the table share. - The effect of fix_innodb_cardinality on query plans or EXPLAIN is only visible after the first open of the table. This is why one must do a FLUSH TABLES or use SHOW INDEX for the option to take effect. - Using fix_innodb_cardinality can thus affect the query plans of all users who are using the same tables. Because of this, it is strongly recommended that one uses optimizer_adjust_secondary_key_costs=fix_innodb_cardinality mainly in configuration files to not cause issues for other users.	2024-07-29 16:40:53 +03:00
Monty	dd99780967	MDEV-34504 PURGE BINARY LOGS not working anymore. PURGE BINARY LOGS did not always purge binary logs. This commit fixes some of the issues and adds notifications if a binary log cannot be purged. User visible changes: - 'PURGE BINARY LOG TO log_name' and 'PURGE BINARY LOGS BEFORE date' worked differently. 'TO' ignored 'slave_connections_needed_for_purge' while 'BEFORE' did not. Now both versions ignore the 'slave_connections_needed_for_purge' variable. - 'PURGE BINARY LOG..' commands now return a 'note' if a binary log cannot be deleted, such as: Note 1375 Binary log 'master-bin.000004' is not purged because it is the current active binlog - Automatic binary log purges, based on date or size, will write a note to the error log if a binary log matching the size or date cannot yet be deleted. - If 'slave_connections_needed_for_purge' is not set from a config or command line, it is set to 0 if Galera is enabled and 1 otherwise (old default). This ensures that automatic binary log purge works with Galera as before the addition of 'slave_connections_needed_for_purge'. If the variable is changed to 0, a warning will be printed to the error log. Code changes: - Added THD argument to several purge_logs related functions that needed THD. - Added 'interactive' options to purge_logs functions. This allowed me to remove testing of sql_command == SQLCOM_PURGE. - Changed purge_logs_before_date() to first check if log is applicable before calling can_purge_logs(). This ensures we do not get a notification for logs that do not match the removal criteria. - MYSQL_BIN_LOG::can_purge_log() will write notifications to the user or error log if a log cannot yet be removed. - log_in_use() will return the reason why a binary log cannot be removed. Changes to keep code consistent: - Moved checking of binlog_format for Galera to be after Galera is initialized (The old check never worked). 
If Galera is enabled we now change the binlog_format to ROW, with a warning, instead of aborting the server. If this change happens a warning will be printed to the error log. - Print a warning if Galera or FLASHBACK changes the binlog_format to ROW. Before it was done silently. Reviewed by: Sergei Golubchik <serg@mariadb.com>, Kristian Nielsen <knielsen@knielsen-hq.org>	2024-07-10 18:50:08 +03:00
Alexander Barkov	4e805aed85	Merge remote-tracking branch 'origin/11.4' into 11.5	2024-07-10 12:17:09 +04:00
Alexander Barkov	5fb07d942b	Merge remote-tracking branch 'origin/11.2' into 11.4	2024-07-09 21:45:37 +04:00
Alexander Barkov	8aad19ddfc	Merge remote-tracking branch 'origin/11.1' into 11.2	2024-07-09 14:04:11 +04:00
Oleksandr Byelkin	2447dda2c0	Merge branch '10.11' into 11.1	2024-07-08 22:40:16 +02:00
Oleksandr Byelkin	034a175982	Merge branch '10.6' into 10.11	2024-07-04 11:52:07 +02:00
Oleksandr Byelkin	dcd8a64892	Merge branch '10.5' into 10.6	2024-07-03 13:27:23 +02:00
Monty	2739b5f5f8	MDEV-34494 Add server_uid global variable and add it to error log at startup. The feedback plugin server_uid variable and the calculate_server_uid() function are moved from feedback/utils.cc to sql/mysqld.cc. server_uid is added as a global variable (shown in 'show variables') and is written to the error log on server startup together with server version and server commit id.	2024-07-02 11:26:13 +03:00
Monty	d8c9c5ead6	MDEV-34491 Setting log_slow_admin="" at startup should be converted to log_slow_admin=ALL We have an issue if a user has the following in a configuration file: log_slow_filter="" # Log everything to slow query log log_queries_not_using_indexes=ON This sets log_slow_filter to 'not_using_index', which disables slow query logging of most queries. In effect, one should never use log_slow_filter="" in config files but instead use log_slow_filter=ALL. Fixed by changing log_slow_filter="" that comes either from a configuration file or from the command line, when starting the server, to log_slow_filter=ALL. A warning will be printed when this happens. Other things: - One can now use =ALL for any 'set' variable to set all options at once. (backported from 10.6)	2024-07-02 11:26:13 +03:00
Daniel Black	e7b76f87c4	MDEV-34437 restrict port and extra-port to tcp valid values extra_port and port are 16 bit numbers and not 32 bit as they are tcp ports. Restrict their value.	2024-07-01 17:43:12 +10:00
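The restriction in e7b76f87c4 follows from TCP port numbers being unsigned 16-bit values. A minimal sketch of such a range check, assuming a hypothetical helper name `fits_tcp_port` (how the server reacts to an out-of-range value is not stated in the commit message and is not modelled here):

```python
TCP_PORT_MAX = 0xFFFF  # largest value representable in 16 unsigned bits (65535)


def fits_tcp_port(value: int) -> bool:
    """Return True if value can be stored as a 16-bit TCP port number."""
    return 0 <= value <= TCP_PORT_MAX


print(fits_tcp_port(3306))
print(fits_tcp_port(70000))
```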
Monty	b9f5793176	MDEV-9101 Limit size of created disk temporary files and tables. Two new variables added: - max_tmp_space_usage: Limits the temporary space allowance per user - max_total_tmp_space_usage: Limits the temporary space allowance for all users. New status variables: tmp_space_used & max_tmp_space_used New field in information_schema.process_list: TMP_SPACE_USED The temporary space is counted for: - All SQL level temporary files. This includes files for filesort, transaction temporary space, analyze, binlog_stmt_cache etc. It does not include engine internal temporary files used for repair, alter table, index pre sorting etc. - All internal on disk temporary tables created as part of resolving a SELECT, multi-source update etc. Special cases: - When doing a commit, the last flush of the binlog_stmt_cache will not cause an error even if the temporary space limit is exceeded. This is to avoid giving errors on commit. This means that a user can temporarily go over the limit by up to binlog_stmt_cache_size. Noteworthy issue: - One has to be careful when using small values for max_tmp_space_limit together with binary logging and with non transactional tables. If the binary log entry for the query is bigger than binlog_stmt_cache_size and one hits the limit of max_tmp_space_limit when flushing the entry to disk, the query will abort and the binary log will not contain the last changes to the table. This will also stop the slave! This is also true for all Aria tables as Aria cannot do rollback (except in case of crashes)! One way to avoid it is to use @@binlog_format=statement for queries that update a lot of rows. Implementation: - All writes to temporary files or internal temporary tables that increase the file size are routed through temp_file_size_cb_func(), which updates and checks the temp space usage. - Most of the temporary file monitoring is done inside IO_CACHE. Temporary file monitoring is done inside the Aria engine. 
- MY_TRACK and MY_TRACK_WITH_LIMIT are new flags for init_io_cache(). MY_TRACK means that we track the file usage. MY_TRACK_WITH_LIMIT means that we track the file usage and we give an error if the limit is breached. This is used to not give an error on commit when binlog_stmt_cache is flushed. - global_tmp_space_used contains the total tmp space used so far. This is needed to quickly check against max_total_tmp_space_usage. - Temporary space errors are using EE_LOCAL_TMP_SPACE_FULL and handler errors are using HA_ERR_LOCAL_TMP_SPACE_FULL. This is needed until we move general errors to their own error space so that they cannot conflict with system error numbers. - Return value of my_chsize() and mysql_file_chsize() has changed so that -1 is returned in the case my_chsize() could not decrease the file size (very unlikely and will not happen on modern systems). All calls to _chsize() are updated to check for > 0 as the error condition. - At the destruction of THD we check that THD::tmp_file_space == 0 - At server end we check that global_tmp_space_used == 0 - As a precaution against errors in the tmp_space_used code, one can set max_tmp_space_usage and max_total_tmp_space_usage to 0 to disable the tmp space quota errors. - truncate_io_cache() function added. - Aria tables using static or dynamic row length are registered in 8K increments to avoid some calls to update_tmp_file_size(). Other things: - Ensure that all handler errors are registered. Before, some engine errors could be printed as "Unknown error". - Fixed bug in filesort() that caused an assert if there was an error when writing to the temporary file. - Fixed that compute_window_func() now takes into account write errors. - In case of parallel replication, rpl_group_info::cleanup_context() could call trans_rollback() with thd->error set, which would cause an assert. Fixed by resetting the error before calling trans_rollback(). 
- Fixed bug in subselect3.inc which caused the following test to use heap tables with a low value for max_heap_table_size - Fixed bug in sql_expression_cache where it did not overflow heap table to Aria table. - Added Max_tmp_disk_space_used to slow query log. - Fixed some bugs in log_slow_innodb.test	2024-05-27 12:39:04 +02:00
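The accounting described in b9f5793176 (every growth of a temporary file goes through a callback that updates and checks usage, a limit of 0 disables the quota, and tracking without a limit is used for the final binlog_stmt_cache flush) can be sketched as below. This is a simplified model under stated assumptions, not the IO_CACHE implementation: `TmpSpaceTracker`, `grow`, `shrink` and `TmpSpaceFull` are hypothetical names, and "per user" follows the wording of the commit message.

```python
class TmpSpaceFull(Exception):
    """Raised when a tracked write would exceed a configured limit."""


class TmpSpaceTracker:
    def __init__(self, max_tmp_space_usage: int, max_total_tmp_space_usage: int):
        # A limit of 0 disables the corresponding quota check.
        self.per_user_limit = max_tmp_space_usage
        self.total_limit = max_total_tmp_space_usage
        self.used_by_user = {}
        self.total_used = 0

    def grow(self, user: str, nbytes: int, with_limit: bool = True) -> None:
        """Account for a file growing by nbytes.

        with_limit=False models track-only accounting (no error raised).
        """
        new_user = self.used_by_user.get(user, 0) + nbytes
        new_total = self.total_used + nbytes
        if with_limit:
            if self.per_user_limit and new_user > self.per_user_limit:
                raise TmpSpaceFull(f"{user} exceeded per-user limit")
            if self.total_limit and new_total > self.total_limit:
                raise TmpSpaceFull("total temporary space limit exceeded")
        self.used_by_user[user] = new_user
        self.total_used = new_total

    def shrink(self, user: str, nbytes: int) -> None:
        """Account for a file being truncated or deleted."""
        self.used_by_user[user] -= nbytes
        self.total_used -= nbytes


tracker = TmpSpaceTracker(max_tmp_space_usage=1024, max_total_tmp_space_usage=4096)
tracker.grow("alice", 1000)
tracker.grow("alice", 100, with_limit=False)  # track-only: may exceed the limit
print(tracker.total_used)
```

The usage lines show how a track-only write lets a user go over the limit temporarily, which is the behaviour the commit message describes for the last flush on commit.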
Sergei Golubchik	9293d40fa7	MDEV-33145 support for old-mode=OLD_FLUSH_STATUS add old-mode that restores inconsistent legacy behavior for FLUSH STATUS. It doesn't affect FLUSH { SESSION | GLOBAL } STATUS.	2024-05-27 12:39:03 +02:00
Monty	2464ee758a	MDEV-33655 Remove alter_algorithm. Remove alter_algorithm but keep the variable as no-op (with a warning). The reasons for removing alter_algorithm are: - alter_algorithm was introduced as a replacement for the old_alter_table that was used to force the usage of the original alter table algorithm (copy) in the cases where the new alter algorithm did not work. The new option was added as a way to force the usage of a specific algorithm when it should instead have made it possible to disable algorithms that would not work for some reason. - alter_algorithm introduced some cases where ALTER TABLE would not work without specifying the ALGORITHM=XXX option together with ALTER TABLE. - Having different values of alter_algorithm on master and slave could cause slave to stop unexpectedly. - ALTER TABLE FORCE, as used by mariadb-upgrade, would not always work if alter_algorithm was set for the server. - As part of the MDEV-33449 "improving repair of tables" it became clear that alter_algorithm made it harder to provide a better and more consistent ALTER TABLE FORCE and REPAIR TABLE and it would be better to remove it.	2024-05-27 12:39:03 +02:00
Monty	8af7a99443	Fixed warnings when using deprecated variables Also fixed that all unused variables are using the same variable comment. The warning will be tested with the next commit that deprecates the variable alter_algorithm.	2024-05-27 12:39:02 +02:00
Monty	dfdedd46e4	MDEV-32188 make TIMESTAMP use whole 32-bit unsigned range. This patch extends the timestamp from 2038-01-19 03:14:07.999999 to 2106-02-07 06:28:15.999999 for 64 bit hardware and OS where 'long' is 64 bits. This is true for 64 bit Linux but not for Windows. This is done by treating the 32 bit stored int as unsigned instead of signed. This is safe as MariaDB has never accepted dates before the epoch (1970). The benefit of this approach is that for normal timestamps the storage is compatible with earlier versions. However, for tables using system versioning we previously stored a timestamp with the year 2038 as the 'max timestamp', which is used to detect current values. This patch stores the new 2106 year max value as the max timestamp. This means that old tables using system versioning need to be updated with mariadb-upgrade when moving them to 11.4. That will be done in a separate commit.	2024-05-27 12:39:02 +02:00
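The two limits quoted in dfdedd46e4 follow directly from reading the same 32-bit seconds counter as signed or unsigned. A short check of that arithmetic (whole seconds only, in UTC; `max_timestamp` is an illustrative helper, not a server function):

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def max_timestamp(unsigned: bool) -> datetime:
    """Largest instant representable by a 32-bit count of seconds since the epoch."""
    max_seconds = 2**32 - 1 if unsigned else 2**31 - 1
    return EPOCH + timedelta(seconds=max_seconds)


print(max_timestamp(unsigned=False))  # 2038-01-19 03:14:07+00:00
print(max_timestamp(unsigned=True))   # 2106-02-07 06:28:15+00:00
```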
Sergei Golubchik	5296f908ed	MDEV-28671 post-testing fixes Various help message improvements: * MySQL->MariaDB, mysqld->mariadbd, "mysqld daemon" -> "mariadbd process" * typos * don't specify defaults directly in the help message * don't say that an option is deprecated, mark it as such * missing spaces in the middle of the text etc	2024-05-27 12:39:02 +02:00
Sergei Golubchik	df10a945fc	MDEV-28671 post-merge fixes * use new deprecated printer for all deprecated server options * restore alphabetic option sorting order * move deprecated printer from mysqld.cc to my_getopt.c * in --help print deprecation message at the end of the option help * move 'ALL' help text where it belongs - to other SET options, and with a correct indentation. * consistently end all or none command-line option help strings with a dot - my_print_help() needs that. It's about 50/50 now, so let's do none, less line wraps in --help * remove trailing spaces from command-line option help strings	2024-05-27 12:39:02 +02:00
Oleksandr Byelkin	fa69b085b1	Merge branch '11.3' into 11.4	2024-02-15 13:53:21 +01:00
Marko Mäkelä	64cce8d5bf	Merge 10.6 into 10.11	2024-02-14 16:12:53 +02:00
Monty	18dfcfdecf	MDEV-31404 Implement binlog_space_limit. binlog_space_limit is a variable in Percona server used to limit the total size of all binary logs. This implementation is based on code from Percona server 5.7. In MariaDB we decided to call the variable max-binlog-total-size to be similar to max-binlog-size. This makes it easier to find in the output from 'mariadbd --help --verbose'. MariaDB will also support binlog_space_limit for compatibility with Percona. Some notes on internal behavior: - When running, MariaDB does not delete binary logs that are either used by slaves or have active xids that are not yet committed. Some implementation notes: - max-binlog-total-size is by default 0 (no limit). - max-binlog-total-size can be changed without server restart. - Binlog file sizes are checked on startup, or if max-binlog-total-size is set to a value > 0, not for every log write. The total size of all binary logs is cached and dynamically updated when updating the binary log on binary log rotation. - max-binlog-total-size is checked against existing log files during server start, binlog rotation, FLUSH LOGS, when writing to binary log or when max-binlog-total-size changes value. - Option --slave-connections-needed-for-purge with 1 as default added. This allows one to ensure that we do not delete binary logs if there are fewer than 'slave-connections-needed-for-purge' slaves connected. Without this option max-binlog-total-size would potentially delete binlogs needed by slaves on server startup or when a slave disconnects as there are then no connected slaves to protect active binlogs. - PURGE BINARY LOGS TO ... will be executed as if slave-connections-needed-for-purge would be zero. In other words it will do the purge even if there are no slaves connected. If there are connected slaves working on the logs, these will be protected. 
- If binary log is on and max-binlog-total-size <> 0 then the status variable 'Binlog_disk_use' shows the current size of all old binary logs + the state of the current one. - Removed test of strcmp(log_file_name, log_info.log_file_name) in purge_logs_before_date() as this is tested in can_purge_logs() - To avoid expensive calls of log_in_use() we cache the result for the last log that is in use by a slave. Future calls to can_purge_logs() for this binary log will be quickly detected and false will be returned until a slave starts working on a new log. - Note that after a binary log rotation caused by max_binlog_size, the last log will not be purged directly as it is still in use internally. The next binary log write will purge binlogs if needed. Reviewer: Kristian Nielsen <knielsen@knielsen-hq.org>	2024-02-14 15:02:21 +01:00
Monty	3907345e22	MDEV-33306 Optimizer choosing incorrect index in 10.6, 10.5 but not in 10.4. In MariaDB up to 10.11, the test_if_cheaper_ordering() code (that tries to optimize how GROUP BY is executed) assumes that if a table scan is used then if there is any index usable by GROUP BY it will be used. The reason MariaDB 10.4 provides a better plan is because of two differences: - Plans using 'ref' have a cost of 1/10 of what it should be (as a protection against table scans). This is why 'ref' is used in 10.4 and not in 10.5. - When 'ref' is used, then GROUP BY will not use an index for GROUP BY. In MariaDB 10.5 the chosen plan is a table scan (as it is calculated to be faster) but as 'ref' is not used, the test_if_cheaper_ordering() optimizer phase decides (as ref is not used) to use an index for GROUP BY, which has bad performance. Description of fix: - All new code is protected by the "optimizer_adjust_secondary_key_costs" variable, which is now a bit map, and is only executed if the option "disable_forced_index_in_group_by" is set. - Corrects GROUP BY handling in test_if_cheaper_ordering() by making the choice of using an index with GROUP BY cost based instead of rule based. - Adds TIME_FOR_COMPARE to all costs, when using group by, to make read_time, index_scan_time and range_cost comparable. Other things: - Made optimizer_adjust_secondary_key_costs a bit map (compatible with old code). Notes: Current code ignores costs for the algorithm used when doing GROUP BY on the first table: - Create an in-memory temporary table for handling group by and doing a filesort of the result file. We can probably in 10.6 continue to ignore this cost. This patch should NOT be merged to 11.0 series (not needed in 11.0).	2024-02-12 16:43:00 +02:00
Oleksandr Byelkin	d21cb43db1	Merge branch '11.2' into 11.3	2024-02-04 16:42:31 +01:00
Sergei Golubchik	79580f4f96	Merge branch '11.1' into 11.2	2024-02-02 17:43:57 +01:00
Sergei Golubchik	b6680e0101	Merge branch '11.0' into 11.1	2024-02-02 11:30:47 +01:00
Sergei Golubchik	87e13722a9	Merge branch '10.6' into 10.11	2024-02-01 18:36:14 +01:00
Oleksandr Byelkin	fe490f85bb	Merge branch '10.11' into 11.0	2024-01-30 08:54:10 +01:00
Oleksandr Byelkin	14d930db5d	Merge branch '10.6' into 10.11	2024-01-30 08:17:58 +01:00
Kristian Nielsen	d039346a7a	MDEV-4991: GTID binlog indexing Improve the performance of slave connect using B+-Tree indexes on each binlog file. The index allows fast lookup of a GTID position to the corresponding offset in the binlog file, as well as lookup of a position to find the corresponding GTID position. This eliminates a costly sequential scan of the starting binlog file to find the GTID starting position when a slave connects. This is especially costly if the binlog file is not cached in memory (IO cost), or if it is encrypted or a lot of slaves connect simultaneously (CPU cost). The size of the index files is generally less than 1% of the binlog data, so not expected to be an issue. Most of the work writing the index is done as a background task, in the binlog background thread. This minimises the performance impact on transaction commit. A simple global mutex is used to protect index reads and (background) index writes; this is fine as slave connect is a relatively infrequent operation. Here are the user-visible options and status variables. The feature is on by default and is expected to need no tuning or configuration for most users. binlog_gtid_index On by default. Can be used to disable the indexes for testing purposes. binlog_gtid_index_page_size (default 4096) Page size to use for the binlog GTID index. This is the size of the nodes in the B+-tree used internally in the index. A very small page-size (64 is the minimum) will be less efficient, but can be used to stress the BTree-code during testing. binlog_gtid_index_span_min (default 65536) Control sparseness of the binlog GTID index. If set to N, at most one index record will be added for every N bytes of binlog file written. 
This can be used to reduce the number of records in the index, at the cost only of having to scan a few more events in the binlog file before finding the target position. Two status variables are available to monitor the use of the GTID indexes: Binlog_gtid_index_hit Binlog_gtid_index_miss The "hit" status increments for each successful lookup in a GTID index. The "miss" increments when a lookup is not possible. This indicates that the index file is missing (e.g. binlog written by old server version without GTID index support), or corrupt. Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>	2024-01-27 12:09:54 +01:00
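The sparseness trade-off described for binlog_gtid_index_span_min in d039346a7a can be illustrated with a toy model. This sketch makes several simplifying assumptions: it uses a sorted Python list instead of a B+-tree, indexes plain byte offsets rather than GTID positions, and the function names are hypothetical.

```python
import bisect


def build_sparse_index(event_offsets, span_min):
    """Keep at most one index record per span_min bytes of binlog written."""
    index = []
    last_indexed = None
    for offset in event_offsets:
        if last_indexed is None or offset - last_indexed >= span_min:
            index.append(offset)
            last_indexed = offset
    return index


def scan_start(index, target_offset):
    """Nearest indexed offset at or before the target; scanning starts there."""
    pos = bisect.bisect_right(index, target_offset)
    return index[pos - 1] if pos else None


offsets = [0, 100, 300, 70000, 70100, 140000]
index = build_sparse_index(offsets, span_min=65536)
print(index)                     # [0, 70000, 140000]
print(scan_start(index, 70100))  # 70000
```

A larger span keeps fewer records, at the cost of scanning a few more events from the returned offset to the actual target.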
Monty	6f65e08277	MDEV-33118 optimizer_adjust_secondary_key_costs variable. optimizer_adjust_secondary_key_costs is added to provide 2 small adjustments to the 10.x optimizer cost model. This can be used in the case where the optimizer wrongly uses a secondary key instead of a clustered primary key. The reason behind this change is that MariaDB 10.x does not take into account that, for engines like InnoDB, scanning a primary key can be up to 7x faster than scanning a secondary key + reading the row data through the primary key. The different values for optimizer_adjust_secondary_key_costs are: optimizer_adjust_secondary_key_costs=0 - No changes to current model optimizer_adjust_secondary_key_costs=1 - Ensure that the cost of secondary indexes is at least 5x the cost of a clustered primary key (if one exists). This disables part of the worst_seek optimization described below. optimizer_adjust_secondary_key_costs=2 - Disable "worst_seek optimization" and adjust filter cost slightly (add cost of 1 if filter is used). The idea behind 'worst_seek optimization' is that we limit the cost for all non clustered ref access to the least of: - best-rows-by-range (or all rows if no range found) / 10 - scan-time-table (roughly number of file blocks to scan table) * 3 In addition we also do not try to use rowid_filter if number of rows estimated for 'ref' access is less than the worst_seek limitation. The idea is that worst_seek is trying to take into account that if we do a lot of accesses through a key, this is likely to be cached. However it only does this for secondary keys, and not for clustered keys or index only reads. The effects of worst_seek are: - In some cases 'ref' will have a much lower cost than range or using a clustered key. - Some possible rowid filters for secondary keys will be ignored. 
When implementing optimizer_adjust_secondary_key_costs=2, I noticed that there are slightly different costs for how ref+filter and range+filter are calculated. This caused a lot of range and range+filter to change to ref+filter, which is not good as range+filter provides the optimizer a better estimate of how many accepted rows there will be in the result set. Adding an extra small cost (1 seek) when using filter mitigated the above problems in almost all cases. This patch should not be applied to MariaDB 11.0 as worst_seeks is removed in 11.0 and the cost calculation for clustered keys, secondary keys, index scan and filter is more exact. Test case changes for --optimizer-adjust_secondary_key_costs=1 (Fix secondary key costs to be 5x of primary key): - stat_tables_innodb: - Complex change (probably ok as number of rows are really small) - ref over 1 row changed to range over 10 rows with join buffer - ref over 5 rows changed to eq_ref - secondary ref over 1 row changed to ref of primary key over 4 rows - Change of key to use longer key with index pushdown (a little bit worse but not significant). - Change to use secondary (1 row) -> primary (4 rows) - rowid_filter_innodb: - index_merge (2 rows) & ref (1) -> all (23 rows) -> primary eq_ref. Test case changes for --optimizer-adjust_secondary_key_costs=2 (remove of worst_seeks & adjust filter cost): - stat_tables_innodb: - Join order change (probably ok as number of rows are really small) - ref (5 rows) & ref(1 row) changed to range (10 rows & join buffer) & eq_ref. - selectivity_innodb: - ref -> ref|filter (ok) - rowid_filter_innodb: - ref -> ref|filter (ok) - range|filter (64 rows) changed to ref|filter (128 rows). ok as ref|filter outputs wrong number of rows in explain. 
- range, range_mrr_icp: - ref (500 rows) -> ALL (1000 rows) (ok) - select_pkeycache, select, select_jcl6: - ref|filter (2 rows) -> ref (2 rows) (ok) - selectivity: - ref -> ref_filter (ok) - range: - Change of 'filtered' but no stat or plan change (ok) - selectivity: - ref -> ref+filter (ok) - Change of filtered but no plan change (ok) - join_nested_jcl6: - range -> ref|filter (ok as only 2 rows) - subselect3, subselect3_jcl6: - ref_or_null (4 rows) -> ALL (10 rows) (ok) - Index_subquery (4 rows) -> ALL (10 rows) (ok) - partition_mrr_myisam, partition_mrr_aria and partition_mrr_innodb: - Uses ALL instead of REF for a key value that is the same for > 50% of rows. (good) - order_by_innodb: - range (200 rows) -> ref (20 rows)+filesort (ok) - subselect_sj2_mat: - One test changed. One ALL removed and replaced with eq_ref. Likely to be better. - join_cache: - Changed ref over 60% of the rows to use hash join (ok) - opt_tvc: - Changed to use eq_ref instead of ref with plan change (probably ok) - opt_trace: - No worst/max seeks clipping (good). - Almost double range_scan_time and index_scan_time (ok). - rowid_filter: - ref -> ref|filtered (ok) - range|filter (77 rows) changed to ref|filter (151 rows). Probably ok as ref|filter outputs wrong number of rows in explain. Reviewer: Sergei Petrunia <sergey@mariadb.com>	2024-01-23 13:03:11 +02:00
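The 'worst_seek' clipping described in 6f65e08277 (limit the cost of non clustered ref access to the least of best-rows-by-range / 10 and scan-time-table * 3) can be written out as a small formula. This sketch only models the clipping being on or off; the partial behaviour of optimizer_adjust_secondary_key_costs=1 is not modelled, and the function and parameter names are illustrative rather than taken from the server source.

```python
def worst_seek_limit(best_rows_by_range: float, table_scan_time: float) -> float:
    """Upper bound applied to non clustered ref access cost."""
    return min(best_rows_by_range / 10, table_scan_time * 3)


def ref_access_cost(raw_cost: float, best_rows_by_range: float,
                    table_scan_time: float, worst_seek_enabled: bool = True) -> float:
    """Clip the raw ref cost unless the worst_seek optimization is disabled."""
    if not worst_seek_enabled:
        return raw_cost
    return min(raw_cost, worst_seek_limit(best_rows_by_range, table_scan_time))


# 1000 rows by range and a scan time of 20 give a limit of min(100, 60) = 60.
print(ref_access_cost(500.0, 1000.0, 20.0))                            # 60.0
print(ref_access_cost(500.0, 1000.0, 20.0, worst_seek_enabled=False))  # 500.0
```

The example shows why 'ref' can look much cheaper than a range or clustered key access while the clipping is active.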
Michael Widenius	7af50e4df4	MDEV-32551: "Read semi-sync reply magic number error" warnings on master. rpl_semi_sync_slave_enabled_consistent.test and the first part of the commit message comes from Brandon Nesterenko. A test to show how to induce the "Read semi-sync reply magic number error" message on a primary. In short, if semi-sync is turned on during the hand-shake process between a primary and replica, but later a user negates the rpl_semi_sync_slave_enabled variable while the replica's IO thread is running; if the io thread exits, the replica can skip a necessary call to kill_connection() in repl_semisync_slave.slave_stop() due to its reliance on a global variable. Then, the replica will send a COM_QUIT packet to the primary on an active semi-sync connection, causing the magic number error. The test in this patch exits the IO thread by forcing an error; though note a call to STOP SLAVE could also do this, but it ends up needing more synchronization. That is, the STOP SLAVE command also tries to kill the VIO of the replica, which makes a race with the IO thread to try and send the COM_QUIT before this happens (which would need more debug_sync to get around). See THD::awake_no_mutex for details as to the killing of the replica’s vio. Notes: - The MariaDB documentation does not make it clear that when one enables semi-sync replication it does not matter if one enables it first in the master or slave. Any order works. Changes done: - The rpl_semi_sync_slave_enabled variable is now a default value for when semisync is started. The variable no longer affects semisync if it is already running. This fixes the original reported bug. Internally we now use repl_semisync_slave.get_slave_enabled() instead of rpl_semi_sync_slave_enabled. To check if semisync is active one should check the @@rpl_semi_sync_slave_status variable (as before). 
- The semisync protocol conflicts with the way that the original MySQL/MariaDB client-server protocol was designed (client-server send and reply packets are strictly ordered and include a packet number to allow one to check if a packet is lost). When using semi-sync the master and slave can send packets at 'any time', so packet numbering does not work. The 'solution' has been that each communication starts with packet number 1, but in some cases there is still a chance that the packet number check can fail. Fixed by adding a flag (pkt_nr_can_be_reset) in the NET struct that one can use to signal that packet number checking should not be done. This flag is set when semi-sync is used. - Added Master_info::semi_sync_reply_enabled to allow one to configure some slaves with semisync and other slaves without semisync. Removed global variable semi_sync_need_reply that would not work with multi-master. - Repl_semi_sync_master::report_reply_packet() can now recognize the COM_QUIT packet from semisync slave and not give a "Read semi-sync reply magic number error" error for this case. The slave will be removed from the Ack listener. - On Windows, don't stop semisync Ack listener just because one slave connection is using socket_id > FD_SETSIZE. - Removed busy loop in Ack_receiver::run() by using "Self-pipe trick" to signal new slave and stop Ack_receiver. - Changed some Repl_semi_sync_slave functions that always return 0 from int to void. - Added Repl_semi_sync_slave::slave_reconnect(). - Removed dummy_function Repl_semi_sync_slave::reset_slave(). - Removed some duplicate semisync notes from the error log. - Add test of "if (get_slave_enabled() && semi_sync_need_reply)" before calling Repl_semi_sync_slave::slave_reply(). (Speeds up the code as we can skip all initializations). - If repl_semisync_slave.slave_reply() fails, we disable semisync for that connection. - We do not call semisync.switch_off() if there are no active slaves. 
Instead we check in Repl_semi_sync_master::commit_trx() if there are no active threads. This simplifies the code. - Changed assert() to DBUG_ASSERT() to ensure that the DBUG log is flushed in case of asserts. - Removed the internal rpl_semi_sync_slave_status as it is not needed anymore. The @@rpl_semi_sync_slave_status status variable is now mapped to rpl_semi_sync_enabled. - Removed rpl_semi_sync_slave_enabled as it is not needed anymore. Repl_semi_sync_slave::get_slave_enabled() contains the active status. - Added checking that we do not add a slave twice with Ack_receiver::add_slave(). This could happen with old code. - Removed Repl_semi_sync_master::check_and_switch() as it is not needed anymore. - Ensure that when we call Ack_receiver::remove_slave() that the slave is removed from the listener before function returns. - Call listener.listen_on_sockets() outside of mutex for better performance and less contested mutex. - Ensure that listening is ignoring newly added slaves when checking for responses. - Fixed so that the master ack_receiver listener is not killed if there are no connected slaves (killing it would stop semisync handling of future connections). This could happen if all slave sockets were marked as unreliable. - Added unlink() to base_ilist_iterator and remove() to I_List_iterator. This enables us to remove 'dead' slaves in Ack_receiver::run(). - kill_zombie_dump_threads() now does killing of dump threads properly. - It can now kill several threads (should be impossible but could happen if IO slaves reconnect very fast). - We now wait until the dump thread is done before starting the dump. - Added an error if kill_zombie_dump_threads() fails. - Set thd->variables.server_id before calling kill_zombie_dump_threads(). This simplifies the code. - Added a lot of comments both in code and tests. - Removed DBUG_EVALUATE_IF "failed_slave_start" as it is not used. Test changes: - rpl.rpl_session_var2 added which runs rpl.rpl_session_var test with semisync enabled. 
- Some timings changed slightly during slave startup, which caused rpl_binlog_dump_slave_gtid_state_info.test to fail as it checked the error log file before the slave had started properly. Fixed by adding wait_for_pattern_in_file.inc, which allows waiting for the pattern to appear in the log file. - Tests have been updated so that we first set rpl_semi_sync_master_enabled on the master and then set rpl_semi_sync_slave_enabled on the slaves (this follows how the MariaDB documentation describes setting up semi-sync). - Error text "Master server does not have semi-sync enabled" has been replaced with "Master server does not support semi-sync" for the case when the master supports semi-sync but semi-sync is not enabled. Other things: - Some trivial cleanups in Repl_semi_sync_master::update_sync_header(). - In 11.3 we should change the default value for rpl-semi-sync-master-wait-no-slave from TRUE to FALSE, as TRUE does not make much sense as the default. The main difference with using FALSE is that we do not wait for a semisync Ack if there are no slave threads. In the case of TRUE we wait once, which did not bring any notable benefits except slower startup of a master configured for using semisync. Co-author: Brandon Nesterenko <brandon.nesterenko@mariadb.com> This solves the problem reported in MDEV-32960 where a new slave may not be registered in time and the master disables semi sync because of that.	2024-01-23 13:03:11 +02:00
Sergei Golubchik	c154aafe1a	Merge remote-tracking branch '11.3' into 11.4	2023-12-21 15:40:55 +01:00
Sergei Golubchik	7f0094aac8	Merge branch '11.2' into 11.3	2023-12-21 02:14:59 +01:00
Sergei Golubchik	fef31a26f3	Merge branch '11.1' into 11.2	2023-12-20 23:43:05 +01:00
Sergei Golubchik	7a5448f8da	Merge branch '11.0' into 11.1	2023-12-19 20:11:54 +01:00
Sergei Golubchik	8c8bce05d2	Merge branch '10.11' into 11.0	2023-12-19 15:53:18 +01:00
Sergei Golubchik	fd0b47f9d6	Merge branch '10.6' into 10.11	2023-12-18 11:19:04 +01:00
Sergei Golubchik	e95bba9c58	Merge branch '10.5' into 10.6	2023-12-17 11:20:43 +01:00
Sergei Golubchik	98a39b0c91	Merge branch '10.4' into 10.5	2023-12-02 01:02:50 +01:00
Vladislav Vaintroub	96250c8269	Merge 11.1 into 11.2 Fix old_mode flags conflict between OLD_MODE_NO_NULL_COLLATION_IDS and OLD_MODE_LOCK_ALTER_TABLE_COPY. Both flags used to be 1 << 6, now OLD_MODE_LOCK_ALTER_TABLE_COPY changed to be 1 << 7	2023-11-30 22:12:31 +01:00
Vladislav Vaintroub	2b40f8d2ca	Merge branch '11.0' into 11.1	2023-11-30 19:13:30 +01:00
Vladislav Vaintroub	b42f318996	Merge 10.11 into 11.0	2023-11-30 19:12:01 +01:00
Vladislav Vaintroub	9d07b0520c	MDEV-31608 - Connector/NET fails to connect since 10.10 Connector/NET does not expect collation IDs returned by "show collations" to be NULL, and runs into an exception. The fix is to detect Connector/NET using its connection attributes, then make sure "show collations" does not output NULL IDs. The patch introduces a new old_mode, NO_NULL_COLLATION_IDs, that is automatically set once a MySQL Connector/NET connection is detected. A test was added that uses MySql.Data from PowerShell; it only works if MySql.Data is installed into the GAC (i.e. with the C/NET MSI package)	2023-11-30 13:53:45 +01:00

426 Commits