mirror of https://github.com/MariaDB/server.git synced 2025-08-08 11:22:35 +03:00
Commit Graph

383 Commits

Author SHA1 Message Date
Alexander Barkov
f738cc9876 MDEV-29095 REGEXP_REPLACE treats empty strings differently than REPLACE in ORACLE mode
Turning REGEXP_REPLACE into two schema-qualified functions:
- mariadb_schema.regexp_replace()
- oracle_schema.regexp_replace()

Fixing oracle_schema.regexp_replace(subj,pattern,replacement) to treat
NULL in "replacement" as an empty string.

Adding new classes implementing oracle_schema.regexp_replace():
- Item_func_regexp_replace_oracle
- Create_func_regexp_replace_oracle

Adding helper methods:
- String *Item::val_str_null_to_empty(String *to)
- String *Item::val_str_null_to_empty(String *to, bool null_to_empty)

and reusing these methods in both Item_func_replace and
Item_func_regexp_replace.
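
For example (hypothetical session; the result values are an assumption
based on the rule above):

    SET sql_mode=ORACLE;
    SELECT REGEXP_REPLACE('abc','b',NULL);  -- 'ac': NULL replacement acts as ''
    SET sql_mode='';
    SELECT REGEXP_REPLACE('abc','b',NULL);  -- NULL: default MariaDB semantics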
2024-01-24 10:59:17 +04:00
Alexey Botchkov
9d88c5b8b4 MDEV-31616 Problems with a stored function EMPTY() on upgrade to 10.6.
The IDENT_sys rule doesn't include keywords, so a function with a
keyword name can be created but cannot be called.
Moving keywords to the new rules keyword_func_sp_var_and_label and
keyword_func_sp_var_not_label so that functions with these
names are allowed.
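
A hypothetical repro sketch (a function named after the EMPTY keyword):

    CREATE FUNCTION empty(t TEXT) RETURNS INT RETURN LENGTH(t) = 0;
    SELECT empty('');  -- the call failed to parse before this fix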
2024-01-24 09:59:55 +04:00
Marko Mäkelä
9374772ecd Merge 10.11 into 11.0 2024-01-19 09:07:48 +02:00
Marko Mäkelä
9d20853c74 Merge 10.6 into 10.11 2024-01-18 19:22:23 +02:00
Marko Mäkelä
3a96eba25f Merge 10.5 into 10.6 2024-01-17 13:35:05 +02:00
Kristian Nielsen
5b0a4159ef Fix test failures on s390x in test following main.column_compression_rpl
The problem is that the test is skipped after sourcing include/master-slave.inc.
This leaves the slave threads running, causing a following test to fail
during rpl setup.

Also rename have_normal_bzip.inc to the more appropriate _zlib.

Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
2024-01-12 17:22:08 +01:00
Sergei Golubchik
c154aafe1a Merge remote-tracking branch '11.3' into 11.4 2023-12-21 15:40:55 +01:00
Sergei Golubchik
8c8bce05d2 Merge branch '10.11' into 11.0 2023-12-19 15:53:18 +01:00
Sergei Golubchik
fd0b47f9d6 Merge branch '10.6' into 10.11 2023-12-18 11:19:04 +01:00
Alexander Barkov
aed9c656a9 MDEV-32101 CREATE PACKAGE [BODY] for sql_mode=DEFAULT
This patch adds PACKAGE support with the SQL/PSM dialect for sql_mode=DEFAULT:

- CREATE PACKAGE
- DROP PACKAGE
- CREATE PACKAGE BODY
- DROP PACKAGE BODY
- Package function and procedure invocation from outside of the package:
    -- using two-step identifiers
    SELECT pkg.f1();
    CALL pkg.p1();

    -- using three-step identifiers
    SELECT db.pkg.f1();
    CALL db.pkg.p1();

This is a non-standard MariaDB extension.

However, later this code can be used to implement
the SQL Standard and DB2 dialects of CREATE MODULE.
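
A minimal sketch of the new syntax (hypothetical package; the exact
body syntax here is an assumption):

    DELIMITER $$
    CREATE PACKAGE pkg
      FUNCTION f1() RETURNS INT;
      PROCEDURE p1();
    END$$
    CREATE PACKAGE BODY pkg
      FUNCTION f1() RETURNS INT BEGIN RETURN 42; END;
      PROCEDURE p1() BEGIN SELECT pkg.f1(); END;
    END$$
    DELIMITER ;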
2023-12-18 13:34:55 +04:00
Sergei Golubchik
e95bba9c58 Merge branch '10.5' into 10.6 2023-12-17 11:20:43 +01:00
Sergei Golubchik
98a39b0c91 Merge branch '10.4' into 10.5 2023-12-02 01:02:50 +01:00
Oleksandr Byelkin
48af85db21 Merge branch '10.11' into 11.0 2023-11-08 17:09:44 +01:00
Oleksandr Byelkin
fecd78b837 Merge branch '10.10' into 10.11 2023-11-08 16:46:47 +01:00
Oleksandr Byelkin
04d9a46c41 Merge branch '10.6' into 10.10 2023-11-08 16:23:30 +01:00
Oleksandr Byelkin
b83c379420 Merge branch '10.5' into 10.6 2023-11-08 15:57:05 +01:00
Oleksandr Byelkin
6cfd2ba397 Merge branch '10.4' into 10.5 2023-11-08 12:59:00 +01:00
Alexander Barkov
2b6d241ee4 MDEV-27744 LPAD in vcol created in ORACLE mode makes table corrupted in non-ORACLE
The crash happened with an indexed virtual column whose
value is evaluated using a function that has a different meaning
in sql_mode='' vs sql_mode=ORACLE:

- DECODE()
- LTRIM()
- RTRIM()
- LPAD()
- RPAD()
- REPLACE()
- SUBSTR()

For example:

CREATE TABLE t1 (
  b VARCHAR(1),
  g CHAR(1) GENERATED ALWAYS AS (SUBSTR(b,0,0)) VIRTUAL,
  KEY g(g)
);

So far we had replacement XXX_ORACLE() functions for all mentioned functions,
e.g. SUBSTR_ORACLE() for SUBSTR(). So it was possible to correctly re-parse
SUBSTR_ORACLE() even in sql_mode=''.

But it was not possible to re-parse the MariaDB version of SUBSTR()
after switching to sql_mode=ORACLE. It was erroneously interpreted
as SUBSTR_ORACLE().

As a result, this combination worked fine:

SET sql_mode=ORACLE;
CREATE TABLE t1 ... g CHAR(1) GENERATED ALWAYS AS (SUBSTR(b,0,0)) VIRTUAL, ...;
INSERT ...
FLUSH TABLES;
SET sql_mode='';
INSERT ...

But the other way around it crashed:

SET sql_mode='';
CREATE TABLE t1 ... g CHAR(1) GENERATED ALWAYS AS (SUBSTR(b,0,0)) VIRTUAL, ...;
INSERT ...
FLUSH TABLES;
SET sql_mode=ORACLE;
INSERT ...

At CREATE time, SUBSTR was instantiated as Item_func_substr and printed
in the FRM file as substr(). At re-open time with sql_mode=ORACLE, "substr()"
was erroneously instantiated as Item_func_substr_oracle.

Fix:

The fix proposes a symmetric solution. It provides a way to reliably
re-parse all sql_mode-dependent functions to their original CREATE TABLE
time meaning, no matter what the open-time sql_mode is.

We take advantage of the same idea we previously used to resolve
sql_mode-dependent data types.

Now all sql_mode-dependent functions are printed by SHOW using a schema
qualifier when the current sql_mode differs from the function's sql_mode:

SET sql_mode='';
CREATE TABLE t1 ... SUBSTR(a,b,c) ..;
SET sql_mode=ORACLE;
SHOW CREATE TABLE t1;   ->   mariadb_schema.substr(a,b,c)

SET sql_mode=ORACLE;
CREATE TABLE t2 ... SUBSTR(a,b,c) ..;
SET sql_mode='';
SHOW CREATE TABLE t2;   ->   oracle_schema.substr(a,b,c)

Old replacement names like substr_oracle() are still understood for
backward compatibility and used in FRM files (for downgrade compatibility),
but they are not printed by SHOW any more.
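
A hypothetical illustration of calling the qualified names explicitly:

    SET sql_mode=ORACLE;
    SELECT mariadb_schema.substr('abc',0,0);  -- MariaDB semantics, any sql_mode
    SELECT oracle_schema.substr('abc',0,0);   -- Oracle semantics, any sql_mode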
2023-11-08 15:01:20 +04:00
Alexander Barkov
09e237088c MDEV-31184 Remove parser tokens DECODE_MARIADB_SYM and DECODE_ORACLE_SYM
Changing the code handling the sql_mode-dependent function DECODE():

- removing parser tokens DECODE_MARIADB_SYM and DECODE_ORACLE_SYM
- removing the DECODE() related code from sql_yacc.yy/sql_yacc_ora.yy
- adding handling of DECODE() with the help of the new Create_func_func_decode
2023-10-24 01:45:47 +04:00
Marko Mäkelä
be24e75229 Merge 10.11 into 11.0 2023-10-19 08:12:16 +03:00
Marko Mäkelä
2ecc0443ec Merge 10.10 into 10.11 2023-10-17 16:04:21 +03:00
Marko Mäkelä
d5e15424d8 Merge 10.6 into 10.10
The MDEV-29693 conflict resolution is from Monty, as is
a bug fix where ANALYZE TABLE wrongly built histograms for a
single-column PRIMARY KEY.
Also includes a fix for safe_malloc error reporting.

Other things:
- Copied main.log_slow from 10.4 to avoid an mtr issue

Disabled test:
- spider/bugfix.mdev_27239 because we started to get
  +Error	1429 Unable to connect to foreign data source: localhost
  -Error	1158 Got an error reading communication packets
- main.delayed
  - Bug#54332 Deadlock with two connections doing LOCK TABLE+INSERT DELAYED
    This part is disabled for now as it fails randomly with different
    warnings/errors (no corruption).
2023-10-14 13:36:11 +03:00
Marko Mäkelä
0dd25f28f7 Merge 10.5 into 10.6 2023-09-11 14:46:39 +03:00
Monty
b08474435f Fix compression tests for s390x
The problem is that s390x is not using the default zlib library we use
on other platforms, which causes compressed string lengths to be different
from what the mtr tests expect.

Fixed by:
- Added have_normal_bzip.inc, which checks if compress() returns the
  expected length.
- Adjust the results to match the expected ones
  - main.func_compress.test & archive.archive
- Don't print lengths that depend on the compression library
  - mysqlbinlog compress tests & connect.zip
- Don't print DATA_LENGTH for SET column_compression_zlib_level=1
  - main.column_compression
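
A rough sketch of the kind of check have_normal_bzip.inc performs (the
expected length below is illustrative, not the real value):

    if (`SELECT LENGTH(COMPRESS(REPEAT('a', 1000))) <> 21`)
    {
      --skip Test requires the default zlib compression library
    }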
2023-09-05 12:34:39 +03:00
Oleksandr Byelkin
51f9d62005 Merge branch '10.11' into 11.0 2023-08-09 07:53:48 +02:00
Oleksandr Byelkin
036df5f970 Merge branch '10.10' into 10.11 2023-08-08 14:57:31 +02:00
Oleksandr Byelkin
ced243a099 Merge branch '10.9' into 10.10 2023-08-05 20:34:09 +02:00
Oleksandr Byelkin
34a8e78581 Merge branch '10.6' into 10.9 2023-08-04 08:01:06 +02:00
Oleksandr Byelkin
6bf8483cac Merge branch '10.5' into 10.6 2023-08-01 15:08:52 +02:00
Oleksandr Byelkin
f291c3df2c Merge branch '10.4' into 10.5 2023-07-27 15:43:21 +02:00
Marko Mäkelä
f2b4972bd4 Merge 10.11 into 11.0 2023-07-26 15:13:06 +03:00
Marko Mäkelä
bce3ee704f Merge 10.10 into 10.11 2023-07-26 14:44:43 +03:00
Marko Mäkelä
b1b47264d2 Merge 10.9 into 10.10 2023-07-26 14:17:36 +03:00
Marko Mäkelä
864bbd4d09 Merge 10.6 into 10.9 2023-07-26 13:42:23 +03:00
Lena Startseva
9854fb6fa7 MDEV-31003: Second execution for ps-protocol
This patch adds a second execution of SELECT queries for
"--ps-protocol".
It also adds the ability to disable/enable
(--disable_ps2_protocol/--enable_ps2_protocol) the second
execution for "--ps-protocol" in test cases.
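
A sketch of how a test case can opt out of the second execution:

    --disable_ps2_protocol
    SELECT COUNT(*) FROM t1;  # executed only once even with --ps-protocol
    --enable_ps2_protocol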
2023-07-26 17:15:00 +07:00
Alexander Barkov
1a5c4c2d9b MDEV-26186 280 Bytes lost in mysys/array.c, mysys/hash.c, sql/sp.cc, sql/sp.cc, sql/item_create.cc, sql/item_create.cc, sql/sql_yacc.yy:10748 when using oracle sql_mode
There was a memory leak under these conditions:
- YYABORT was called in the end-of-rule action of a rule containing expr_lex
- This expr_lex was not bound to any sp_lex_keeper

Bison did not call %destructor <expr_lex> in this case, because its stack
already contained a reduced upper-level rule.

Fixing rules starting with RETURN, CONTINUE, EXIT keywords:

Turning end-of-rule actions with YYABORT into mid-rule actions
by adding an empty trailing { } block. This prevents the upper-level
rule from being reduced without calling %destructor <expr_lex>.

In the other rules expr_lex is not used immediately before the last
end-of-rule { } block, so they don't need changes.
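
A schematic sketch of the change (hypothetical rule and action, not the
actual grammar):

    /* before: YYABORT in the end-of-rule action; %destructor <expr_lex>
       could be skipped because the upper-level rule was already reduced */
    stmt: RETURN_SYM expr_lex
          { if (check_failed()) YYABORT; }
        ;

    /* after: the same action is now a mid-rule action, followed by an
       empty trailing { } block */
    stmt: RETURN_SYM expr_lex
          { if (check_failed()) YYABORT; }
          { }
        ;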
2023-07-18 12:19:16 +04:00
Alexander Barkov
400c101332 MDEV-30662 SQL/PL package body does not appear in I_S.ROUTINES.ROUTINE_DEFINITION
- Moving the code from the public function trim_whitespaces()
  to the class Lex_cstring as methods. This code may
  be useful in other contexts, and it also becomes
  visible inside sql_class.h

- Adding a helper method THD::strmake_lex_cstring_trim_whitespaces()

- Unifying the way how CREATE PROCEDURE/CREATE FUNCTION and
  CREATE PACKAGE/CREATE PACKAGE BODY work:

  a) Now CREATE PACKAGE/CREATE PACKAGE BODY also calls
  Lex->sphead->set_body_start() to remember the cpp body start inside
  an sp_head member.

  b) adding a "const char *cpp_body_end" parameter to
  sp_head::set_stmt_end().

  These changes made it possible to reuse sp_head::set_stmt_end() inside
  LEX::create_package_finalize() and remove the duplicate code.

- Renaming sp_head::m_body_begin to m_cpp_body_begin and adding a comment
  to make it clear that this member is used only during parsing, and
  points to a fragment inside the cpp buffer.

- Changed sp_head::set_body_start() and sp_head::set_stmt_end()
  to skip the calls related to "body_utf8" in cases when m_parent is not NULL.
  A non-NULL m_parent means that we're inside a package routine.
  "body_utf8" in that case belongs not to the current sphead itself,
  but to the parent (package) sphead.
  So an sphead instance of a package routine should neither initialize,
  nor finalize, nor change in any other ways the "body_utf8" related
  members of Lex_input_stream, and should not take over or copy "body_utf8"
  data from Lex_input_stream to "this".
2023-07-14 13:26:26 +04:00
Sergei Petrunia
feaeb27b69 MDEV-29152: Assertion failed ... upon TO_CHAR with wrong argument
Item_func_tochar::check_arguments() didn't check if its arguments
each had one column. Failing to make this check and proceeding would
eventually cause either an assertion failure or reach
"MY_ASSERT_UNREACHABLE();", which would produce a crash with
a misleading stack trace.

* Fixed Item_func_tochar::check_arguments() to do the required check.

* Also fixed MY_ASSERT_UNREACHABLE() to terminate the program. Just
"executing" __builtin_unreachable() used to cause "undefined results",
which in my experience was a crash with a corrupted stack trace.
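
A hypothetical repro (a row argument with two columns):

    SELECT TO_CHAR((1,2));  -- now raises an error instead of crashing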
2023-07-12 12:05:59 +03:00
Monty
3fa99f0c0e Change cost for REF to take into account cost for 1 extra key read_next
The main difference in code path between EQ_REF and REF is that for
REF we have to do an extra read_next on the index to check that there
are no more matching rows.

Before this patch we added a preference for EQ_REF by ensuring that REF
would always be estimated to find at least 2 rows.

This patch adds the cost of the extra key read_next to REF access and
removes the code that limited REF to at least 2 rows. For some queries
this can have a big effect, as the total estimated rows will be halved
for each REF table with 1 row.
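
For example (hypothetical join of three REF tables, each matching exactly
one row): previously each table's estimate was floored at 2 rows, giving
2 * 2 * 2 = 8 estimated row combinations; now the estimate is 1 * 1 * 1 = 1.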

multi_range cost calculations are also changed to take into account
the difference between EQ_REF and REF.

The effect of the patch to the test suite:
- About 80 test cases changed
- Almost all changes were in EXPLAIN output, where the estimated rows
  for REF changed from 2 to 1.
- A few test cases using EXPLAIN EXTENDED had a change of 'filtered'.
  This is because the estimated rows are now closer to the
  calculated selectivity.
- A very few tests had a change of table order.
  This is because of the change of estimated rows from 2 to 1 or the
  small cost change for REF
  (main.subselect_sj_jcl6, main.group_by, main.derived_cond_pushdown,
  main.distinct, main.join_nested, main.order_by, main.join_cache)
- With no key statistics, the estimated rows are now smaller, which
  caused the estimated filtering to be lower.
  (main.subselect_sj_mat)
- The total number of rows is halved.
  (main.derived_cond_pushdown)
- Plans with 1 row changed to use RANGE instead of REF.
  (main.group_min_max)
- ALL changed to REF
  (main.key_diff)
- Key changed from ref + index_only to PRIMARY key for InnoDB, as
  OPTIMIZER_ROW_LOOKUP_COST + OPTIMIZER_ROW_NEXT_FIND_COST is smaller than
  OPTIMIZER_KEY_LOOKUP_COST + OPTIMIZER_KEY_NEXT_FIND_COST.
  (main.join_outer_innodb)
- Cost changes printouts
  (main.opt_trace*)
- Result order change
  (innodb_gis.rtree)
2023-02-10 12:58:50 +02:00
Monty
1f4a9f086a Removed "<select expression> INTO <destination>" deprecation.
This was done after discussions with Igor, Sanja and Bar.

The main reason for removing the deprecation was to ensure that MariaDB
is always backward compatible whenever possible.

Other things:
- Added statistics counters, mainly for the feedback plugin.
  - INTO OUTFILE
  - INTO variable
  - If INTO is using the old syntax (end of query)
2023-02-03 11:57:50 +03:00
Monty
5e5a8eda16 Derived tables and union can now create distinct keys
The idea is that instead of marking all select_lex's with DISTINCT, we
only mark those that really need a distinct result.

Benefits of this change:
- Temporary tables used with derived tables, UNION, IN are now smaller
  as duplicates are removed already on the insert phase.
- The optimizer can now produce better plans with EQ_REF. This can be
  seen from the tests, where several queries no longer materialize
  derived tables twice.
- Queries affected by 'in_predicate_conversion_threshold', where large IN
  lists are converted to subqueries, produce better plans.

Other things:
- Removed a duplicate call to sel->init_select() in
  LEX::add_primary_to_query_expression_body()
- I moved the testing of
  tab->table->pos_in_table_list->is_materialized_derived()
  in join_read_const_table() to the caller as it caused problems for
  derived tables that could be proven to be const tables.
  This is also likely to fix some bugs: if join_read_const_table()
  was aborted, the table was left marked as JT_CONST, which cannot
  be good.  I added an ASSERT there for now that can be removed when
  the code has been properly tested.
2023-02-02 22:32:57 +03:00
Monty
b6215b9b20 Update row and key fetch cost models to take into account data copy costs
Before this patch, when calculating the cost of fetching and using a
row/key from the engine, we took into account the cost of finding a
row or key from the engine, but did not consistently take into account
index-only accesses, clustered keys, or covered keys for all access
paths.

The cost of the WHERE clause (TIME_FOR_COMPARE) was not consistently
considered in best_access_path().  TIME_FOR_COMPARE was used in
calculations in other places, like greedy_search(), but was in some
cases (like scans) applied to a different number of rows than were
accessed.

The cost calculation of row and index scans didn't take into account
the number of rows that were accessed, only the number of accepted
rows.

When using a filter, the cost of index-only reads and the cost of
accessing and disregarding 'filtered rows' were not taken into
account, which made filters cost less than they actually do.

To remedy the above, the following key & row fetch related costs
have been added:

- The cost of fetching and using a row is now split into different costs:
  - key + row fetch cost (as before), but multiplied by the variable
  'optimizer_cache_cost' (defaults to 0.5). This allows the user to
  tell the optimizer the likelihood of finding the key and row in the
  engine cache.
- ROW_COPY_COST, the cost of copying a row from the engine to the
  sql layer or creating a row from the join_cache to the record
  buffer. Mostly affects table scan costs.
- ROW_LOOKUP_COST, the cost of fetching a row by rowid.
- KEY_COPY_COST, the cost of finding the next key and copying it from
  the engine to the SQL layer. This is used when we calculate the cost
  of index-only reads. It makes index scans more expensive than before
  if they cover a lot of rows. (main.index_merge_myisam)
- KEY_LOOKUP_COST, the cost of finding the first key in a range.
  This replaces the old define IDX_LOOKUP_COST, but with a higher cost.
- KEY_NEXT_FIND_COST, the cost of finding the next key (and rowid)
  when doing an index scan and comparing the rowid to the filter.
  Before, this cost was assumed to be 0.

All of the above constants/variables are now tuned to be somewhat in
proportion to each other's execution complexity.  There is tuning
needed for these in the future, but that can wait until the above are
made user variables, as that will make tuning much easier.

To make the usage of the above easy, there are new (non-virtual)
cost calculation functions in handler (a simplified sketch follows
the list below):
- ha_read_time(), like read_time(), but takes optimizer_cache_cost into
  account.
- ha_read_and_copy_time(), like ha_read_time(), but takes
  ROW_COPY_COST into account.
- ha_read_and_compare_time(), like ha_read_and_copy_time(), but takes
  TIME_FOR_COMPARE into account.
- ha_rnd_pos_time(), reads a row by rowid, taking ROW_COPY_COST
  into account.  This is used with filesort, where we don't need
  to execute the WHERE clause again.
- ha_keyread_time(), like keyread_time(), but takes
  optimizer_cache_cost into account.
- ha_keyread_and_copy_time(), like ha_keyread_time(), but adds
  KEY_COPY_COST.
- ha_key_scan_time(), like key_scan_time(), but takes
  optimizer_cache_cost into account.
- ha_key_scan_and_compare_time(), like ha_key_scan_time(), but adds
  KEY_COPY_COST & TIME_FOR_COMPARE.
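
A simplified, hypothetical model of how these wrappers layer on top of
each other (the real signatures and constant values in sql/handler.h
differ):

    #include <cstdio>

    /* hypothetical constant values, for illustration only */
    static const double ROW_COPY_COST    = 0.05;
    static const double TIME_FOR_COMPARE = 0.02;

    /* engine read cost scaled by the expected cache hit rate */
    static double ha_read_time(double engine_cost, double cache_cost)
    { return engine_cost * cache_cost; }

    /* plus the cost of copying each row to the SQL layer */
    static double ha_read_and_copy_time(double engine_cost, double cache_cost,
                                        double rows)
    { return ha_read_time(engine_cost, cache_cost) + rows * ROW_COPY_COST; }

    /* plus the cost of checking the WHERE clause per row */
    static double ha_read_and_compare_time(double engine_cost, double cache_cost,
                                           double rows)
    { return ha_read_and_copy_time(engine_cost, cache_cost, rows)
             + rows * TIME_FOR_COMPARE; }

    int main()
    { std::printf("%g\n", ha_read_and_compare_time(10.0, 0.5, 100.0)); }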

I also added some setup costs for doing different types of scans and
creating temporary tables (on disk and in memory). This encourages
the optimizer not to use these for simple 'a few rows' lookups if
there are adequate key lookup strategies.
- TABLE_SCAN_SETUP_COST, cost of starting a table scan.
- INDEX_SCAN_SETUP_COST, cost of starting an index scan.
- HEAP_TEMPTABLE_CREATE_COST, cost of creating an in-memory
  temporary table.
- DISK_TEMPTABLE_CREATE_COST, cost of creating an on disk temporary
  table.

When calculating the cost of fetching ranges, we had a cost of
IDX_LOOKUP_COST (0.125) for doing a key dive for a new range. This is
now replaced with 'io_cost * KEY_LOOKUP_COST (1.0) *
optimizer_cache_cost', which matches the cost we use for 'ref' and
other key lookups. The effect is that the cost is now a bit higher
when we have many ranges for a key.
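
For example, with the defaults above (KEY_LOOKUP_COST = 1.0,
optimizer_cache_cost = 0.5) and assuming io_cost = 1.0 for illustration,
each additional range now contributes 1.0 * 1.0 * 0.5 = 0.5 instead of
the old 0.125.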

Almost all calculations with TIME_FOR_COMPARE are now done in
best_access_path(). 'JOIN::read_time' now includes the full
cost of finding the rows in the table.

In the result files, many of the changes are now again close to what
they were before the "Update cost for hash and cached joins" commit,
as that commit didn't fix the filter cost (too complex to do
everything in one commit).

The above changes exposed a lot of inconsistencies in optimizer cost
calculation. The main objective of the other changes was to do the
calculations as similarly (and accurately) as possible and to make
different plans more comparable.

Detailed list of changes:

- Calculate index_only_cost consistently and correctly for all scan
  and ref accesses. The row fetch_cost and index_only_cost now
  take into account clustered keys, covered keys and index-only
  accesses.
- cost_for_index_read now returns both full cost and index_only_cost
- Fixed cost calculation of get_sweep_read_cost() to match other
  similar costs. This is based on the assumption that data is more
  often stored on an SSD than on a hard disk.
- Replaced constant 2.0 with new define TABLE_SCAN_SETUP_COST.
- Some scan cost estimates did not take into account
  TIME_FOR_COMPARE. Now all scan costs take this into
  account. (main.show_explain)
- Added session variable optimizer_cache_hit_ratio (default 50%). By
  adjusting this one can reduce or increase the cost of index or direct
  record lookups. The effect of the default is that key lookups are now
  a bit cheaper than before. See usage of 'optimizer_cache_cost' in
  handler.h.
- JOIN_TAB::scan_time() did not take into account index-only scans,
  which produced a wrong cost when an index scan was used. Changed
  JOIN_TAB::scan_time() to take into consideration clustered and
  covered keys. The values are now cached and we only have to call
  this function once. Other calls are changed to use the cached
  values.  Function renamed to JOIN_TAB::estimate_scan_time().
- Fixed that most index cost calculations are done the same way and
  closer to 'range' calculations. The cost is now lower than
  before for small data sets and higher for large data sets as we take
  into account how many keys are read (main.opt_trace_selectivity,
  main.limit_rows_examined).
- Ensured that index_scan_cost() ==
  range(scan_of_all_rows_in_table_using_one_range) +
  MULTI_RANGE_READ_INFO_CONST. One effect of this is that if there
  is choice of doing a full index scan and a range-index scan over
  almost the whole table then index scan will be preferred (no
  range-read setup cost).  (innodb.innodb, main.show_explain,
  main.range)
  - Fixed that EQ_REF and REF take into account clustered and covered
    keys.  This changes some plans to use covered or clustered indexes
    as these are much cheaper.  (main.subselect_mat_cost,
    main.state_tables_innodb, main.limit_rows_examined)
  - Rowid filter setup cost and filter compare cost now take into
    account fetching and checking the rowid (KEY_NEXT_FIND_COST).
    (main.partition_pruning heap.heap_btree main.log_state)
  - Added KEY_NEXT_FIND_COST to
    Range_rowid_filter_cost_info::lookup_cost to account for the time
    to find and check the next key value against the container.
  - Introduced ha_keyread_time(rows) that takes into account finding
    the next row and copying the key value to 'record'
    (KEY_COPY_COST).
  - Introduced ha_key_scan_time() for calculating an index scan over
    all rows.
  - Added IDX_LOOKUP_COST to keyread_time() as a startup cost.
  - Added index_only_fetch_cost() as a convenience function to
    OPT_RANGE.
  - keyread_time() cost is slightly reduced to prefer shorter keys.
    (main.index_merge_myisam)
  - All of the above caused some index_merge combinations to be
    rejected because of cost (main.index_intersect). In some cases
    'ref' was replaced with index_merge because of the low
    cost calculation of get_sweep_read_cost().
  - Some index usage moved from PRIMARY to a covering index.
    (main.subselect_innodb)
- Changed cost calculation of filter to take KEY_LOOKUP_COST and
  TIME_FOR_COMPARE into account.  See sql_select.cc::apply_filter().
  Filter parameters and costs are now written to optimizer_trace.
- Don't use matchings_records_in_range() to try to estimate the number
  of filtered rows for ranges. The reason is that we want to ensure
  that 'range' is calculated similarly to 'ref'. There is also more
  work needed to calculate the selectivity when using ranges and
  filtering.  This causes the filtering column in EXPLAIN EXTENDED to
  be 100.00 for some cases where range cannot use filtering.
  (main.rowid_filter)
- Introduced ha_scan_time() that takes into account the CPU cost of
  finding the next row and copying the row from the engine to
  'record'. This causes the cost of a table scan to slightly increase,
  and some tests changed their plan from ALL to RANGE or from ALL to
  ref.  (innodb.innodb_mysql, main.select_pkeycache)
  In a few cases where the scan time of very small tables has a lower
  cost than a ref or range, things changed from ref/range to ALL.
  (main.myisam, main.func_group, main.limit_rows_examined,
  main.subselect2)
- Introduced ha_scan_and_compare_time() which is like ha_scan_time()
  but also adds the cost of the where clause (TIME_FOR_COMPARE).
- Added small cost for creating temporary table for
  materialization. This causes some very small tables to use scan
  instead of materialization.
- Added checking of the WHERE clause (TIME_FOR_COMPARE) of the
  accepted rows to ROR costs in get_best_ror_intersect()
- Removed '- 0.001' from 'join->best_read' and optimize_straight_join()
  to ensure that the 'Last_query_cost' status variable contains the
  same value as the one that was calculated by the optimizer.
- Take avg_io_cost() into account in handler::keyread_time() and
  handler::read_time(). This should have no effect as it's 1.0 by
  default, except for heap, which overrides these functions.
- Some 'ref_or_null' accesses changed to 'range' because of cost
  adjustments (main.order_by)
- Added scan type "scan_with_join_cache" for optimizer_trace. This is
  just to show in the trace what kind of scan was used.
- When using 'scan_with_join_cache', take into account the number of
  preceding tables (as we have to restore all fields for every previous
  table combination when checking the where clause).
  The new cost added is:
  (row_combinations * ROW_COPY_COST * number_of_cached_tables).
  This increases the cost of join buffering in proportion to the
  number of tables in the join buffer. One effect is that full scans
  are now done earlier, as the cost is then smaller.
  (main.join_outer_innodb, main.greedy_optimizer)
- Removed the usage of 'worst_seeks' in cost_for_index_read as it
  caused wrong plans to be created; it preferred JT_EQ_REF even if it
  would be much more expensive than a full table scan. A related
  issue was that worst_seeks applied only to full lookups, not to
  clustered or index-only lookups, which was not consistent. This
  caused some plans to use an index scan instead of eq_ref (main.union)
- Changed federated block size from 4096 to 1500, which is the
  typical size of an IO packet.
- Added costs for reading rows to Federated. Needed as there is no
  caching of rows in the federated engine.
- Added ha_innobase::rnd_pos_time() cost function.
- A lot of extra things added to optimizer trace
  - More costs, especially for materialization and index_merge.
  - Made labels more uniform
  - Fixed a lot of minor bugs
  - Added 'trace_started()' around a lot of trace blocks.
- When calculating the ORDER BY with LIMIT cost for using an index,
  the cost did not take into account the number of row retrievals
  that have to be done or the cost of comparing the rows with the
  WHERE clause. The cost calculated would be just a fraction of
  the real cost. Now we calculate the cost as we do for ranges
  and 'ref'.
- 'Using index for group-by' is used a bit more than before, as we
  now take into account the WHERE clause cost when comparing
  with 'ref' and prefer the method with fewer row combinations.
  (main.group_min_max).

Bugs fixed:
- Fixed that we don't calculate TIME_FOR_COMPARE twice for some plans,
  like in optimize_straight_join() and greedy_search()
- Fixed bug in save_explain_data where we could test for the wrong
  index when displaying 'Using index'. This caused some old plans to
  incorrectly show 'Using index'.  (main.subselect_innodb, main.subselect2)
- Fixed bug in get_best_ror_intersect() where 'min_cost' was not
  updated, and the cost we compared with was not the one that was
  used.
- Fixed very wrong cost calculation for priority queues in
  check_if_pq_applicable(). (main.order_by now correctly uses priority
  queue)
- When calculating the cost of EQ_REF or REF, we added the cost of
  comparing the WHERE clause with the found rows, not with all row
  combinations. This made ref and eq_ref be regarded as way too cheap
  compared to other access methods.
- FORCE INDEX cost calculation didn't take into account clustered or
  covered indexes.
- JT_EQ_REF cost was estimated as avg_io_cost(), which is half the
  cost of a JT_REF key. This may be true for an InnoDB primary key, but
  not for other unique keys or other engines. Now we use a handler
  function to calculate the cost, which allows us to handle
  clustered, covered, and non-covered keys consistently.
- ha_start_keyread() didn't call extra_opt() if keyread was already
  enabled but still changed the 'keyread' variable (which is wrong).
  Fixed by not doing anything if keyread is already enabled.
- multi_range_read_info_cost() didn't take into account io_cost when
  calculating the cost of ranges.
- fix_semijoin_strategies_for_picked_join_order() used the wrong
  record_count when calling best_access_path() for SJ_OPT_FIRST_MATCH
  and SJ_OPT_LOOSE_SCAN.
- Hash joins didn't provide a correct best_cost to the upper level,
  which means that the cost for hash joins was more expensive than
  calculated in best_access_path (a difference of 10x *
  TIME_FOR_COMPARE).  This is fixed in the new code because we now
  include the TIME_FOR_COMPARE cost in 'read_time'.

Other things:
- Added some 'if (thd->trace_started())' to speed up code
- Removed not used function Cost_estimate::is_zero()
- Simplified testing of HA_POS_ERROR in get_best_ror_intersect().
  (No cost changes)
- Moved ha_start_keyread() from join_read_const_table() to join_read_const()
  to enable keyread for all types of JT_CONST tables.
- Made a few very short functions inline in handler.h

Notes:
- In main.rowid_filter the join order of order and lineitem is swapped.
  This is because the cost of doing a range fetch of lineitem (98 rows)
  is almost as big as the whole join of order, lineitem. The filtering
  will also ensure that we only have to do very small key fetches of
  the rows in lineitem.
- main.index_merge_myisam had a few changes where we are now using
  fewer keys for index_merge. This is because index scans are now more
  expensive than before.
- handler->optimizer_cache_cost is updated in ha_external_lock().
  This ensures that it is up to date per statement.
  Not an optimal solution (for locked tables), but should be ok for now.
- 'DELETE FROM t1 WHERE t1.a > 0 ORDER BY t1.a' does not take cost of
  filesort into consideration when table scan is chosen.
  (main.myisam_explain_non_select_all)
- perfschema.table_aggregate_global_* has changed because an update
  on a table with 1 row will now use table scan instead of key lookup.

TODO in upcoming commits:
- Fix selectivity calculation for ranges with and without filtering and
  when there is a ref access but scan is chosen.
  For this we have to store the lowest known value for
  'accepted_records' in the OPT_RANGE structure.
- Change that records_read does not include filtered rows.
- test_if_cheaper_ordering() needs to be updated to properly calculate
  costs. This will fix tests like main.order_by_innodb,
  main.single_delete_update
- Extend get_range_limit_read_cost() to take into consideration
  cost_for_index_read() if there were no quick keys. This will reduce
  the computed cost for ORDER BY with LIMIT in some cases.
  (main.innodb_ext_key)
- Fix that we take into account selectivity when counting the number
  of rows we have to read when considering using an index table scan
  to resolve ORDER BY.
- Add new calculation for rnd_pos_time() where we take into account the
  benefit of reading multiple rows from the same page.
2023-02-02 21:43:30 +03:00
Marko Mäkelä
7933367a27 Merge 10.10 into 10.11 2022-11-21 10:51:10 +02:00
Marko Mäkelä
bebe193979 Merge 10.9 into 10.10 2022-11-21 10:32:08 +02:00
Marko Mäkelä
f46efb4476 Merge 10.7 into 10.8 2022-11-17 21:35:12 +02:00
Marko Mäkelä
d5332086d7 Merge 10.6 into 10.7 2022-11-17 09:19:32 +02:00
Marko Mäkelä
9aea7d83c8 Merge 10.5 into 10.6 2022-11-17 08:37:35 +02:00
Alexander Barkov
72c728feba MDEV-29370 Functions in packages are slow and seem to ignore deterministic 2022-11-15 11:34:00 +04:00
Luis Eduardo Oliveira Lizardo
ad7631bdce MDEV-28926 Add time spent on query optimizer to JSON ANALYZE (#2193)
* Add query optimizer timer to ANALYZE FORMAT=JSON

* Adapt tests and results

* Change logic to always close the writer after printing query blocks
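
A hypothetical illustration (the exact field names are an assumption):

    ANALYZE FORMAT=JSON SELECT * FROM t1 WHERE a = 1;
    -- the JSON output now includes a block like:
    --   "query_optimization": { "r_total_time_ms": 0.0352 }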
2022-10-26 09:18:29 +03:00
Oleksandr Byelkin
bb76dcbec7 Merge branch '10.9' into 10.10 2022-10-04 13:32:38 +02:00