This patch also fixes
MDEV-31391 Assertion `((best.records_out) == 0.0 ... failed
Cost changes caused by this change:
- range queries with join buffer now have a notable smaller cost.
- range ranges are bit more expensive as the MULTI_RANGE_COST is now
properly applied to it in all cases (this extra cost is equal to a
key lookup).
- table scan cost is slight smaller as we now assume data is cached in
the engine after the first scan pass. (We did this before for range
scans and other access methods).
- partition tables had wrong values for max_row_blocks and
max_index_blocks. Correcting this, causes range access on
partitioned tables to have slightly higher cost because of the
increased estimated IO.
- Using first match + join buffer caused 'filtered' to be calcualted
wrong. (Only affected EXPLAIN, not query costs).
- Added cost_without_join_buffer to optimizer_trace.
- check_quick_select() adjusted the number of rows according to persistent
statistics, but did not adjust cost. Now fixed.
The big change in the patch are:
- In best_access_path(), where we now are using storing the cost in
'ALL_READ_COST cost' and only converting it to a double at the end.
This allows us to more exactly calculate the effect of the join_cache.
- In JOIN_TAB::estimate_scan_time(), store the cost also in a
ALL_READ_COST object.
One of effect if this change is that when joining very small tables:
t1 some_access_method
t2 range
t3 ALL Use join buffer
This is swiched to
t1 some_access_method
t3 ALL
t2 range use join buffer
Both plans has the same cost, but as table scan in this case has less
cost than rang, the table scan will be considered first and thus have
precidence.
Test case changes:
- optimizer_trace - Addition of cost_without_join_buffer
- subselect_mat_cost_bugs - Small tables and scan versus range
- range & range_mrr_icp - Range + join_cache is faster than ref
- optimizer_trace - cost_without_join_buffer, smaller scan cost,
range setup cost.
- mrr - range+join_buffer used as smaller cost
The main difference in code path between EQ_REF and REF is that for
REF we have to do an extra read_next on the index to check that there
is no more matching rows.
Before this patch we added a preference of EQ_REF by ensuring that REF
would always estimate to find at least 2 rows.
This patch adds the cost of the extra key read_next to REF access and
removes the code that limited REF to at least 2 rows. For some queries
this can have a big effect as the total estimated rows will be halved
for each REF table with 1 rows.
multi_range cost calculations are also changed to take into account
the difference between EQ_REF and REF.
The effect of the patch to the test suite:
- About 80 test case changed
- Almost all changes where for EXPLAIN where estimated rows for REF
where changed from 2 to 1.
- A few test cases using explain extended had a change of 'filtered'.
This is because of the estimated rows are now closer to the
calculated selectivity.
- A very few test had a change of table order.
This is because the change of estimated rows from 2 to 1 or the small
cost change for REF
(main.subselect_sj_jcl6, main.group_by, main.dervied_cond_pushdown,
main.distinct, main.join_nested, main.order_by, main.join_cache)
- No key statistics and the estimated rows are now smaller which cased
estimated filtering to be lower.
(main.subselect_sj_mat)
- The number of total rows are halved.
(main.derived_cond_pushdown)
- Plans with 1 row changed to use RANGE instead of REF.
(main.group_min_max)
- ALL changed to REF
(main.key_diff)
- Key changed from ref + index_only to PRIMARY key for InnoDB, as
OPTIMIZER_ROW_LOOKUP_COST + OPTIMIZER_ROW_NEXT_FIND_COST is smaller than
OPTIMIZER_KEY_LOOKUP_COST + OPTIMIZER_KEY_NEXT_FIND_COST.
(main.join_outer_innodb)
- Cost changes printouts
(main.opt_trace*)
- Result order change
(innodb_gis.rtree)
This includes all test changes from
"Changing all cost calculation to be given in milliseconds"
and forwards.
Some of the things that caused changes in the result files:
- As part of fixing tests, I added 'echo' to some comments to be able to
easier find out where things where wrong.
- MATERIALIZED has now a higher cost compared to X than before. Because
of this some MATERIALIZED types have changed to DEPENDEND SUBQUERY.
- Some test cases that required MATERIALIZED to repeat a bug was
changed by adding more rows to force MATERIALIZED to happen.
- 'Filtered' in SHOW EXPLAIN has in many case changed from 100.00 to
something smaller. This is because now filtered also takes into
account the smallest possible ref access and filters, even if they
where not used. Another reason for 'Filtered' being smaller is that
we now also take into account implicit filtering done for subqueries
using FIRSTMATCH.
(main.subselect_no_exists_to_in)
This is caluculated in best_access_path() and stored in records_out.
- Table orders has changed because more accurate costs.
- 'index' and 'ALL' for small tables has changed to use 'range' or
'ref' because of optimizer_scan_setup_cost.
- index can be changed to 'range' as 'range' optimizer assumes we don't
have to read the blocks from disk that range optimizer has already read.
This can be confusing in the case where there is no obvious where clause
but instead there is a hidden 'key_column > NULL' added by the optimizer.
(main.subselect_no_exists_to_in)
- Scan on primary clustered key does not report 'Using Index' anymore
(It's a table scan, not an index scan).
- For derived tables, the number of rows is now 100 instead of 2,
which can be seen in EXPLAIN.
- More tests have "Using index for group by" as the cost of this
optimization is now more correct (lower).
- A primary key could be preferred for a normal key, even if it would
access more rows, as it's faster to do 1 lokoup and 3 'index_next' on a
clustered primary key than one lookup trough a secondary.
(main.stat_tables_innodb)
Notes:
- There was a 4.7% more calls to best_extension_by_limited_search() in
the main.greedy_optimizer test. However examining the test results
it looked that the plans where slightly better (eq_ref where more
chained together) so I assume this is ok.
- I have verified a few test cases where there was notable/unexpected
changes in the plan and in all cases the new optimizer plans where
faster. (main.greedy_optimizer and some others)
The old code did not't correctly add TIME_FOR_COMPARE to rows that are
part of the scan that will be compared with the attached where clause.
Now the cost calculation for hash join and full join cache join are
identical except for HASH_FANOUT (10%)
The cost for a join with keys is now also uniform.
The total cost for a using a key for lookup is calculated in one place as:
(cost_of_finding_rows_through_key(records) + records/TIME_FOR_COMPARE)*
record_count_of_previous_row_combinations + startup_cost
startup_cost is the cost of a creating a temporary table (if needed)
Best_cost now includes the cost of comparing all WHERE clauses and also
cost of joining with previous row combinations.
Other things:
- Optimizer trace is now printing the total costs, including testing the
WHERE clause (TIME_FOR_COMPARE) and comparing with all previous rows.
- In optimizer trace, include also total cost of query together with the
final join order. This makes it easier to find out where the cost was
calculated.
- Old code used filter even if the cost for it was higher than not using a
filter. This is not corrected.
- When rebasing on 10.11, I noticed some changes to access_cost_factor
calculation. These changes was not picked as the coming changes
to filtering will make that code obsolete.
Fixed also that the 'with_found_constraint parameter' to
matching_candidates_in_table() is as documented: It is now true only
if there is a reference to a previous table in the WHERE condition for
the current examined table (as it was originally documented)
Changes in test results:
- Filtered was 25% smaller for some queries (expected).
- Some join order changed (probably because the tables had very few rows).
- Some more table scans, probably because there would be fewer returned
rows.
- Some tests exposes a bug that if there is more filtered rows, then the
cost for table scan will be higher. This will be fixed in a later commit.
* be strict in CREATE TABLE, just like in ALTER TABLE, because
CREATE TABLE, just like ALTER TABLE, can be rolled back for any engine
* but don't auto-convert warnings into errors for engine warnings
(handler::create) - this matches ALTER TABLE behavior
* and not when creating a default record, these errors are handled
specially (and replaced with ER_INVALID_DEFAULT)
* always issue a Note when a non-unique key is truncated, because it's
not a Warning that can be converted to an Error. Before this commit
it was a Note for blobs and a Warning for all other data types.
Remove usage of deprecated variable storage_engine. It was deprecated in 5.5 but
it never issued a deprecation warning. Make it issue a warning in 10.5.1.
Replaced with default_storage_engine.
Limit increased from 1000 to 2000.
Avoiding stack overflow by only storing keys and pages on the stack in
recursive functions if there is plenty of space on it.
Other things:
- Use less stack space for b-tree operations as we now only allocate as
much space as needed instead of always allocating HA_MAX_KEY_LENGTH.
- Replaced most usage of my_safe_alloca() in Aria with the stack_alloc
interface.
- Moved my_setstacksize() to mysys/my_pthread.c
So to push index condition for each join tab we have calculate the index condition that can be pushed and then
remove this index condition from the original condition. This is done through the function make_cond_remainder.
The problem is the function make_cond_remainder does not remove index condition when there is an OR operator.
Fixed this by making the function make_cond_remainder to keep in mind of the OR operator.
Also updated results for multiple test files which were incorrectly updated by the commit e0c1b3f242
code which was supposed to remove the condition present in the index
condition was not getting executed when the condition had OR operator, with AND the pushed
index condition was getting removed from where.
This problem affects all versions starting from 5.5 but this is a performance improvement, so fixing it in 10.4
create table t1 (a smallint primary key auto_increment);
insert into t1 values(32767);
insert into t1 values(NULL);
ERROR 1062 (23000): Duplicate entry '32767' for key 'PRIMARY
Now on always gets error HA_ERR_AUTOINC_RANGE=167 "Out of range value for column", independent of
store engine, SQL Mode or number of inserted rows. This is an unique error that is easier to test for in replication.
Another bug fix is that we now get an error when trying to insert a too big auto-generated value, even in non-strict mode.
Before one get insted the max column value inserted.
This patch also fixes some issues with inserting negative numbers in an auto-increment column.
Fixed the ER_DUP_ENTRY and HA_ERR_AUTOINC_ERANGE are compared the same between master and slave.
This ensures that replication works between an old server to a new slave for auto-increment overflow errors.
Added SQLSTATE errors for handler errors
Smaller bug fixes:
* Added warnings for duplicate key errors when using INSERT IGNORE
* Fixed bug when using --skip-log-bin followed by --log-bin, which did set log-bin to "0"
* Allow one to see how cmake is called by using --just-print --just-configure
BUILD/FINISH.sh:
--just-print --just-configure now shows how cmake would be invoked. Good for understanding parameters to cmake.
cmake/configure.pl:
--just-print --just-configure now shows how cmake would be invoked. Good for understanding parameters to cmake.
include/CMakeLists.txt:
Added handler_state.h
include/handler_state.h:
SQLSTATE for handler error messages.
Required for HA_ERR_AUTOINC_ERANGE, but solves also some other cases.
mysql-test/extra/binlog_tests/binlog.test:
Fixed old wrong behaviour
Added more tests
mysql-test/extra/binlog_tests/binlog_insert_delayed.test:
Reset binary log to only print what's necessary in show_binlog_events
mysql-test/extra/rpl_tests/rpl_auto_increment.test:
Update to new error codes
mysql-test/extra/rpl_tests/rpl_insert_delayed.test:
Ignore warnings as this depends on how the test is run
mysql-test/include/strict_autoinc.inc:
On now gets an error on overflow
mysql-test/r/auto_increment.result:
Update results after fixing error message
mysql-test/r/auto_increment_ranges_innodb.result:
Test new behaviour
mysql-test/r/auto_increment_ranges_myisam.result:
Test new behaviour
mysql-test/r/commit_1innodb.result:
Added warnings for duplicate key error
mysql-test/r/create.result:
Added warnings for duplicate key error
mysql-test/r/insert.result:
Added warnings for duplicate key error
mysql-test/r/insert_select.result:
Added warnings for duplicate key error
mysql-test/r/insert_update.result:
Added warnings for duplicate key error
mysql-test/r/mix2_myisam.result:
Added warnings for duplicate key error
mysql-test/r/myisam_mrr.result:
Added warnings for duplicate key error
mysql-test/r/null_key.result:
Added warnings for duplicate key error
mysql-test/r/replace.result:
Update to new error codes
mysql-test/r/strict_autoinc_1myisam.result:
Update to new error codes
mysql-test/r/strict_autoinc_2innodb.result:
Update to new error codes
mysql-test/r/strict_autoinc_3heap.result:
Update to new error codes
mysql-test/r/trigger.result:
Added warnings for duplicate key error
mysql-test/r/xtradb_mrr.result:
Added warnings for duplicate key error
mysql-test/suite/binlog/r/binlog_innodb_row.result:
Updated result
mysql-test/suite/binlog/r/binlog_row_binlog.result:
Out of range data for auto-increment is not inserted anymore
mysql-test/suite/binlog/r/binlog_statement_insert_delayed.result:
Updated result
mysql-test/suite/binlog/r/binlog_stm_binlog.result:
Out of range data for auto-increment is not inserted anymore
mysql-test/suite/binlog/r/binlog_unsafe.result:
Updated result
mysql-test/suite/innodb/r/innodb-autoinc.result:
Update to new error codes
mysql-test/suite/innodb/r/innodb-lock.result:
Updated results
mysql-test/suite/innodb/r/innodb.result:
Updated results
mysql-test/suite/innodb/r/innodb_bug56947.result:
Updated results
mysql-test/suite/innodb/r/innodb_mysql.result:
Updated results
mysql-test/suite/innodb/t/innodb-autoinc.test:
Update to new error codes
mysql-test/suite/maria/maria3.result:
Updated result
mysql-test/suite/maria/mrr.result:
Updated result
mysql-test/suite/optimizer_unfixed_bugs/r/bug43617.result:
Updated result
mysql-test/suite/rpl/r/rpl_auto_increment.result:
Update to new error codes
mysql-test/suite/rpl/r/rpl_insert_delayed,stmt.rdiff:
Updated results
mysql-test/suite/rpl/r/rpl_loaddatalocal.result:
Updated results
mysql-test/t/auto_increment.test:
Update to new error codes
mysql-test/t/auto_increment_ranges.inc:
Test new behaviour
mysql-test/t/auto_increment_ranges_innodb.test:
Test new behaviour
mysql-test/t/auto_increment_ranges_myisam.test:
Test new behaviour
mysql-test/t/replace.test:
Update to new error codes
mysys/my_getopt.c:
Fixed bug when using --skip-log-bin followed by --log-bin, which did set log-bin to "0"
sql/handler.cc:
Ignore negative values for signed auto-increment columns
Always give an error if we get an overflow for an auto-increment-column (instead of inserting the max value)
Ensure that the row number is correct for the out-of-range-value error message.
******
Fixed wrong printing of column namn for "Out of range value" errors
Fixed that INSERT_ID is correctly replicated also for out-of-range autoincrement values
Fixed that print_keydup_error() can also be used to generate warnings
******
Return HA_ERR_AUTOINC_ERANGE (167) instead of ER_WARN_DATA_OUT_OF_RANGE for auto-increment overflow
sql/handler.h:
Allow INSERT IGNORE to continue also after out-of-range inserts.
Fixed that print_keydup_error() can also be used to generate warnings
sql/log_event.cc:
Added DBUG_PRINT
Fixed the ER_AUTOINC_READ_FAILED, ER_DUP_ENTRY and HA_ERR_AUTOINC_ERANGE are compared the same between master and slave.
This ensures that replication works between an old server to a new slave for auto-increment overflow errors.
sql/sql_insert.cc:
Add warnings for duplicate key errors when using INSERT IGNORE
sql/sql_state.c:
Added handler errors
sql/sql_table.cc:
Update call to print_keydup_error()
storage/innobase/handler/ha_innodb.cc:
Fixed increment handling of auto-increment columns to be consistent with rest of MariaDB.
storage/xtradb/handler/ha_innodb.cc:
Fixed increment handling of auto-increment columns to be consistent with rest of MariaDB.