mariadb

mirror of https://github.com/MariaDB/server.git synced 2025-08-07 00:04:31 +03:00

Author	SHA1	Message	Date
Marko Mäkelä	49a6baec56	Merge 10.11 into 11.4	2025-03-03 11:07:56 +02:00
Oleg Smirnov	733852d4c3	BKA join cache buffer is employed despite join_cache_level=3 (flat BNLH) In the `check_join_cache_usage()` function there is a branching issue where an accidental fall-through to BKA/BKAH buffers may occur, even when the join_cache_level setting does not permit their use. This patch corrects the condition to ensure that BKA/BKAH join caching is only enabled when explicitly allowed by join_cache_level Reviewer: Sergei Petrunia <sergey@mariadb.com>	2025-02-27 16:47:25 +07:00
Oleksandr Byelkin	69d033d165	Merge branch '10.11' into 11.2	2024-10-29 16:42:46 +01:00
Oleksandr Byelkin	3d0fb15028	Merge branch '10.6' into 10.11	2024-10-29 15:24:38 +01:00
Sergei Petrunia	0540eac05c	MDEV-35180: ref_to_range rewrite causes poor query plan (Variant 2: only allow rewrite for ref(const)) make_join_select() has a "ref_to_range" rewrite: it would rewrite any ref access to a range access on the same index if the latter uses more keyparts. It seems, he initial intent of this was to fix poor query plan choice in cases like t.keypart1=const AND t.keypart2 < 'foo' Due to deficiency in cost model, ref access could be picked while range would enumerate fewer rows and be cheaper. However, the condition also forces a rewrite in cases like: t.keypart1=prev_table.col AND t.keypart1<='foo' AND t.keypart2<'bar' Here, it can be that * keypart1=prev_table.col is highly selective * (keypart1, keypart2) <= ('foo', 'bar') is not at all selective. Still, the rewrite would be made and poor query plan chosen. Fixed this by only doing the rewrite if ref access was ref(const) so we can be certain that quick select also used these restrictions and will scan a subset of rows that ref access would scan.	2024-10-18 13:37:04 +03:00
Oleksandr Byelkin	80abd847da	Merge branch '10.11' into 11.1	2024-08-03 09:32:42 +02:00
Oleksandr Byelkin	0e8fb977b0	Merge branch '10.6' into 10.11	2024-08-03 09:15:40 +02:00
Oleksandr Byelkin	8f020508c8	Merge branch '10.5' into 10.6	2024-08-03 09:04:24 +02:00
Sergei Petrunia	fdda8171b2	MDEV-34580: Assertion `(key_part->key_part_flag & 4) == 0' failed key_hashnr Remove an assert added by fix for MDEV-34417. BNL-H join can be used with prefix keys. This happens when there are real prefix indexes on the equi-join columns (although it probably doesn't make a lot of sense). Anyway, remove the assert. The code receives properly truncated key values for hashing/comparison so it can handle them just fine.	2024-07-30 17:49:09 +03:00
Oleksandr Byelkin	2447dda2c0	Merge branch '10.11' into 11.1	2024-07-08 22:40:16 +02:00
Oleksandr Byelkin	034a175982	Merge branch '10.6' into 10.11	2024-07-04 11:52:07 +02:00
Oleksandr Byelkin	dcd8a64892	Merge branch '10.5' into 10.6	2024-07-03 13:27:23 +02:00
Lena Startseva	9e74a7f4f3	Removing MDEV-27871 from tastcases because it is not a bug	2024-06-28 16:45:50 +07:00
Sergei Golubchik	f9807aadef	Merge branch '10.11' into 11.0	2024-05-12 12:18:28 +02:00
Sergei Golubchik	018d537ec1	Merge branch '10.6' into 10.11	2024-04-22 15:23:10 +02:00
Marko Mäkelä	829cb1a49c	Merge 10.5 into 10.6	2024-04-17 14:14:58 +03:00
Oleksandr Byelkin	9b18275623	Merge branch '10.4' into 10.5	2024-04-16 11:04:14 +02:00
Sergei Petrunia	8cc36fb743	MDEV-21102: Server crashes in JOIN_CACHE::write_record_data upon EXPLAIN with subqueries JOIN_CACHE has a light-weight initialization mode that's targeted at EXPLAINs. In that mode, JOIN_CACHE objects are not able to execute. Light-weight mode was used whenever the statement was an EXPLAIN. However the EXPLAIN can execute subqueries, provided they enumerate less than @@expensive_subquery_limit rows. Make sure we use light-weight initialization mode only when the select is more expensive @@expensive_subquery_limit. Also add an assert into JOIN_CACHE::put_record() which prevents its use if it was initialized for EXPLAIN only.	2024-04-04 10:30:09 +03:00
Oleksandr Byelkin	48af85db21	Merge branch '10.11' into 11.0	2023-11-08 17:09:44 +01:00
Oleksandr Byelkin	fecd78b837	Merge branch '10.10' into 10.11	2023-11-08 16:46:47 +01:00
Oleksandr Byelkin	04d9a46c41	Merge branch '10.6' into 10.10	2023-11-08 16:23:30 +01:00
Oleksandr Byelkin	b83c379420	Merge branch '10.5' into 10.6	2023-11-08 15:57:05 +01:00
Oleksandr Byelkin	6cfd2ba397	Merge branch '10.4' into 10.5	2023-11-08 12:59:00 +01:00
Igor Babaev	954a6decd4	MDEV-32351 Significant slowdown for query with many outer joins This patch fixes a performance regression introduced in the patch for the bug MDEV-21104. The performance regression could affect queries for which join buffer was used for an outer join such that its on expression from which a conjunctive condition depended only on outer tables can be extracted. If the number of records in the join buffer for which this condition was false greatly exceeded the number of other records the slowdown could be significant. If there is a conjunctive condition extracted from the ON expression depending only on outer tables this condition is evaluated when interesting fields of each survived record of outer tables are put into the join buffer. Each such set of fields for any join operation is supplied with a match flag field used to generate null complemented rows. If the result of the evaluation of the condition is false the flag is set to MATCH_IMPOSSIBLE. When looking in the join buffer for records matching a record of the right operand of the outer join operation the records with such flags are not needed to be unpacked into record buffers for evaluation of on expressions. The patch for MDEV-21104 fixing some problem of wrong results when 'not exists' optimization by mistake broke the code that allowed to ignore records with the match flag set to MATCH_IMPOSSIBLE when looking for matching records. As a result such records were unpacked for each record of the right operand of the outer join operation. This caused significant execution penalty in some cases. One of the test cases added in the patch can be used only for demonstration of the restored performance for the reported query. The second test case is needed to demonstrate the validity of the fix.	2023-10-27 15:44:46 +02:00
Oleksandr Byelkin	adf84c827b	Merge branch '10.11' into 11.0	2023-08-10 21:21:53 +02:00
Oleksandr Byelkin	587d0b944f	Merge branch '10.10' into 10.11	2023-08-10 21:21:22 +02:00
Oleksandr Byelkin	cce155cc90	Merge branch '10.9' into 10.10	2023-08-10 21:20:38 +02:00
Oleksandr Byelkin	3e0009dc3a	Merge branch '10.6' into 10.9	2023-08-10 21:19:03 +02:00
Oleksandr Byelkin	0d16eb35bc	Merge branch '10.5' into 10.6	2023-08-10 21:18:25 +02:00
Oleksandr Byelkin	7e650253dc	Merge branch '10.4' into 10.5	2023-08-10 21:17:44 +02:00
Monty	2aea938749	MDEV-31893 Valgrind reports issues in main.join_cache_notasan This is also related to MDEV-31348 Assertion `last_key_entry >= end_pos' failed in virtual bool JOIN_CACHE_HASHED::put_record() Valgrind exposed a problem with the join_cache for hash joins: =25636== Conditional jump or move depends on uninitialised value(s) ==25636== at 0xA8FF4E: JOIN_CACHE_HASHED::init_hash_table() (sql_join_cache.cc:2901) The reason for this was that avg_record_length contained a random value if one had used SET optimizer_switch='optimize_join_buffer_size=off'. This causes either 'random size' memory to be allocated (up to join_buffer_size) which can increase memory usage or, if avg_record_length is less than the row size, memory overwrites in thd->mem_root, which is bad. Fixed by setting avg_record_length in JOIN_CACHE_HASHED::init() before it's used. There is no test case for MDEV-31893 as valgrind of join_cache_notasan checks that. I added a test case for MDEV-31348.	2023-08-10 20:57:42 +02:00
Monty	e9333ff03c	MDEV-31893 Valgrind reports issues in main.join_cache_notasan This is also related to MDEV-31348 Assertion `last_key_entry >= end_pos' failed in virtual bool JOIN_CACHE_HASHED::put_record() Valgrind exposed a problem with the join_cache for hash joins: =25636== Conditional jump or move depends on uninitialised value(s) ==25636== at 0xA8FF4E: JOIN_CACHE_HASHED::init_hash_table() (sql_join_cache.cc:2901) The reason for this was that avg_record_length contained a random value if one had used SET optimizer_switch='optimize_join_buffer_size=off'. This causes either 'random size' memory to be allocated (up to join_buffer_size) which can increase memory usage or, if avg_record_length is less than the row size, memory overwrites in thd->mem_root, which is bad. Fixed by setting avg_record_length in JOIN_CACHE_HASHED::init() before it's used. There is no test case for MDEV-31893 as valgrind of join_cache_notasan checks that. I added a test case for MDEV-31348.	2023-08-10 17:35:37 +03:00
Oleksandr Byelkin	51f9d62005	Merge branch '10.11' into 11.0	2023-08-09 07:53:48 +02:00
Oleksandr Byelkin	036df5f970	Merge branch '10.10' into 10.11	2023-08-08 14:57:31 +02:00
Oleksandr Byelkin	ced243a099	Merge branch '10.9' into 10.10	2023-08-05 20:34:09 +02:00
Oleksandr Byelkin	34a8e78581	Merge branch '10.6' into 10.9	2023-08-04 08:01:06 +02:00
Oleksandr Byelkin	6bf8483cac	Merge branch '10.5' into 10.6	2023-08-01 15:08:52 +02:00
Oleksandr Byelkin	f52954ef42	Merge commit '10.4' into 10.5	2023-07-20 11:54:52 +02:00
Monty	d657f18ea7	MDEV-31226 Server crash or assertion failure with row size close to join_buffer_size The problem was that JOIN_CACHE::alloc_buffer() did not check if the given join_buffer_value is less than the query require. Added a check for this and disabled join cache if it cannot be used.	2023-05-27 16:55:39 +03:00
Monty	84b9fc25a2	Fixed wrong test cases (embedded and ASAN) - main.selectivity failed because one test produced different result with embedded (missing feature). Fixed by moving the failing part to selectivity_notembedded. - Disabled maria.encrypt-no-key for embedded as embedded does not support encryption - Moved test from join_cache to join_cache_notasan that tried to alloc() a buffer bigger than available memory.	2023-05-05 13:15:14 +03:00
Monty	5fd46be5a7	Fixed calculation of JOIN_CACHE::max_records The old code did set max_records to either number_of_rows (partial_join_cardinality) or memory size (join_buffer_space_limit) which did not make sense. Fixed by setting max_records to number of rows that fits into join_buffer_size. Other things: - Initialize buffer cache values in JOIN_CACHE constructors (safety) Reviewer: Sergei Petrunia <sergey@mariadb.com>	2023-05-05 12:50:05 +03:00
Monty	08a4732860	MDEV-28217 Incorrect Join Execution When Controlling Join Buffer Size The problem was that join_buffer_size conflicted with join_buffer_space_limit, which caused the query to be run without join buffer. However this caused wrong results as the optimizer assumed that hash+join buffer would ensure that the equi-join condition would be satisfied, and didn't check it itself. Fixed by not using join_buffer_space_limit when optimize_join_buffer_size=off. This matches the documentation at https://mariadb.com/kb/en/block-based-join-algorithms Other things: - Removed not used variable JOIN_TAB::join_buffer_size_limit - Give an error if we cannot allocate a join buffer. This can only happen if the join_buffer variables are wrongly configured or we are running out of memory. In the future, instead of returning an error, we could properly convert the query plan that uses BNL-H join into one that doesn't use join buffering: make sure the equi-join condition is checked where appropriate. Reviewer: Sergei Petrunia <sergey@mariadb.com>	2023-05-04 18:40:28 +03:00
Monty	3bdc5542dc	MDEV-30812: Improve output cardinality estimates for hash join Introduces @@optimizer_switch flag: hash_join_cardinality When this option is on, use EITS statistics to produce tighter bounds for hash join output cardinality. This patch is an extension / replacement to a similar patch in 10.6 New features: - optimizer_switch hash_join_cardinality is on by default - records_out is set to fanout when HASH is used - Fixed bug in is_eits_usable: The function did not work with views	2023-05-03 21:44:57 +03:00
Monty	15e889c300	MDEV-30699: Updated prev_record_reads() to be more exact The old code in prev_record_reads() did give wrong estimates when a join_buffer was used or if the table was depending on more than one other tables. When join_cache is used, it will cause a re-order of row combinations, which causes more calls to the engine for tables that are depending on tables before the join_cached one. The new prev_records_read() code provides more exact estimates and should never give a 'too low estimate', assuming that the data to the function is correct The definition of prev_record_read() is also updated. The new definition is: "Estimate the number of engine ha_index_read_calls for EQ_REF tables when taking into account the one-row-cache in join_read_always_key()" The cost of using prev_record_reads() value is changed. The value is now used similar as before to calculate the cost of the storage engine calls. However the cost of the WHERE cost is changed to take into account the total number of row combinations as the WHERE has to be checked even if the one-row-cache is used. This makes the cost slightly higher than before (for the same prev_record_reads() value). Other things: - Cached return value of prev_record_read() in best_access_path() to avoid some function calls. - Fixed bug where position[].use_join_buffer was set in best_acess_path() when join buffer was not used. This confused the semi join optimizer to try to reoptimize plans that did not need to be reoptimized. The effect of the bug fix is that we avoid doing some re-optimziations with semi-joins when join_buffer is not used. In these cases the value shown for the 'Filtering' column in EXPLAIN EXTENDED may change. - Added 'prev_record.cc' that was used to verify the logic in prev_record_reads(). Changes in test suite: - EQ_REF tables are moved up to be earlier. This is because either the higher WHERE cost when EQ_REF is used with more row combination or change of cost when using join_cache. - Filtered has changed (to the better) for some cases using semi-joins subselect_sj.test subselect_sj_jcl6.test	2023-02-21 15:36:39 +03:00
Marko Mäkelä	2e431ff7e6	Merge 10.11 into 11.0	2023-02-16 13:34:45 +02:00
Monty	3fa99f0c0e	Change cost for REF to take into account cost for 1 extra key read_next The main difference in code path between EQ_REF and REF is that for REF we have to do an extra read_next on the index to check that there is no more matching rows. Before this patch we added a preference of EQ_REF by ensuring that REF would always estimate to find at least 2 rows. This patch adds the cost of the extra key read_next to REF access and removes the code that limited REF to at least 2 rows. For some queries this can have a big effect as the total estimated rows will be halved for each REF table with 1 rows. multi_range cost calculations are also changed to take into account the difference between EQ_REF and REF. The effect of the patch to the test suite: - About 80 test case changed - Almost all changes where for EXPLAIN where estimated rows for REF where changed from 2 to 1. - A few test cases using explain extended had a change of 'filtered'. This is because of the estimated rows are now closer to the calculated selectivity. - A very few test had a change of table order. This is because the change of estimated rows from 2 to 1 or the small cost change for REF (main.subselect_sj_jcl6, main.group_by, main.dervied_cond_pushdown, main.distinct, main.join_nested, main.order_by, main.join_cache) - No key statistics and the estimated rows are now smaller which cased estimated filtering to be lower. (main.subselect_sj_mat) - The number of total rows are halved. (main.derived_cond_pushdown) - Plans with 1 row changed to use RANGE instead of REF. (main.group_min_max) - ALL changed to REF (main.key_diff) - Key changed from ref + index_only to PRIMARY key for InnoDB, as OPTIMIZER_ROW_LOOKUP_COST + OPTIMIZER_ROW_NEXT_FIND_COST is smaller than OPTIMIZER_KEY_LOOKUP_COST + OPTIMIZER_KEY_NEXT_FIND_COST. (main.join_outer_innodb) - Cost changes printouts (main.opt_trace*) - Result order change (innodb_gis.rtree)	2023-02-10 12:58:50 +02:00
Daniel Black	483ddb5684	MDEV-30621: Türkiye is the correct current country naming As requested to the UN the country formerly known as Turkey is to be refered to as Türkiye.	2023-02-10 08:44:14 +11:00
Sergei Petrunia	6c4076fac4	MDEV-30032: EXPLAIN FORMAT=JSON output: part #2 : print 'loops'.	2023-02-03 11:22:17 +03:00
Sergei Petrunia	ffe0beca25	MDEV-30032: EXPLAIN FORMAT=JSON output: print costs Basic printout for join and table execution costs.	2023-02-03 11:01:24 +03:00
Monty	727491b72a	Added test cases for preceding test This includes all test changes from "Changing all cost calculation to be given in milliseconds" and forwards. Some of the things that caused changes in the result files: - As part of fixing tests, I added 'echo' to some comments to be able to easier find out where things where wrong. - MATERIALIZED has now a higher cost compared to X than before. Because of this some MATERIALIZED types have changed to DEPENDEND SUBQUERY. - Some test cases that required MATERIALIZED to repeat a bug was changed by adding more rows to force MATERIALIZED to happen. - 'Filtered' in SHOW EXPLAIN has in many case changed from 100.00 to something smaller. This is because now filtered also takes into account the smallest possible ref access and filters, even if they where not used. Another reason for 'Filtered' being smaller is that we now also take into account implicit filtering done for subqueries using FIRSTMATCH. (main.subselect_no_exists_to_in) This is caluculated in best_access_path() and stored in records_out. - Table orders has changed because more accurate costs. - 'index' and 'ALL' for small tables has changed to use 'range' or 'ref' because of optimizer_scan_setup_cost. - index can be changed to 'range' as 'range' optimizer assumes we don't have to read the blocks from disk that range optimizer has already read. This can be confusing in the case where there is no obvious where clause but instead there is a hidden 'key_column > NULL' added by the optimizer. (main.subselect_no_exists_to_in) - Scan on primary clustered key does not report 'Using Index' anymore (It's a table scan, not an index scan). - For derived tables, the number of rows is now 100 instead of 2, which can be seen in EXPLAIN. - More tests have "Using index for group by" as the cost of this optimization is now more correct (lower). - A primary key could be preferred for a normal key, even if it would access more rows, as it's faster to do 1 lokoup and 3 'index_next' on a clustered primary key than one lookup trough a secondary. (main.stat_tables_innodb) Notes: - There was a 4.7% more calls to best_extension_by_limited_search() in the main.greedy_optimizer test. However examining the test results it looked that the plans where slightly better (eq_ref where more chained together) so I assume this is ok. - I have verified a few test cases where there was notable/unexpected changes in the plan and in all cases the new optimizer plans where faster. (main.greedy_optimizer and some others)	2023-02-03 00:00:35 +03:00

1 2 3

101 Commits