Trivial batch, using the handler statistics already collected for
the slow query log.
The reason for the changes in test cases was mainly to change to use
select TABLE_SCHEMA ... from information_schema.table_statistics instead
of 'show table_statistics' to avoid future changes to test results
if we add more columns to table_statistics.
Columns added to TABLE_STATISTICS
- ROWS_INSERTED, ROWS_DELETED, ROWS_UPDATED, KEY_READ_HITS and
KEY_READ_MISSES.
Columns added to CLIENT_STATISTICS and USER_STATISTICS:
- KEY_READ_HITS and KEY_READ_MISSES.
User visible changes (except new columns):
- CLIENT_STATISTICS and USER_STATISTICS has columns KEY_READ_HITS and
KEY_READ_MISSES added after column ROWS_UPDATED before SELECT_COMMANDS.
Other changes:
- Do not collect table statistics for system tables like index_stats
table_stats, performance_schema, information_schema etc as the user
has no control of these and the generate noice in the statistics.
- All row variables that are part of user_stats are moved to
'struct rows_stats' to make it easy to clear all of them at once.
- ha_read_key_misses added to STATUS_VAR
Notes:
- userstat.result has a change of numbers of rows for handler_read_key.
This is because use-stat-tables is now disabled for the test.
This includes all test changes from
"Changing all cost calculation to be given in milliseconds"
and forwards.
Some of the things that caused changes in the result files:
- As part of fixing tests, I added 'echo' to some comments to be able to
easier find out where things where wrong.
- MATERIALIZED has now a higher cost compared to X than before. Because
of this some MATERIALIZED types have changed to DEPENDEND SUBQUERY.
- Some test cases that required MATERIALIZED to repeat a bug was
changed by adding more rows to force MATERIALIZED to happen.
- 'Filtered' in SHOW EXPLAIN has in many case changed from 100.00 to
something smaller. This is because now filtered also takes into
account the smallest possible ref access and filters, even if they
where not used. Another reason for 'Filtered' being smaller is that
we now also take into account implicit filtering done for subqueries
using FIRSTMATCH.
(main.subselect_no_exists_to_in)
This is caluculated in best_access_path() and stored in records_out.
- Table orders has changed because more accurate costs.
- 'index' and 'ALL' for small tables has changed to use 'range' or
'ref' because of optimizer_scan_setup_cost.
- index can be changed to 'range' as 'range' optimizer assumes we don't
have to read the blocks from disk that range optimizer has already read.
This can be confusing in the case where there is no obvious where clause
but instead there is a hidden 'key_column > NULL' added by the optimizer.
(main.subselect_no_exists_to_in)
- Scan on primary clustered key does not report 'Using Index' anymore
(It's a table scan, not an index scan).
- For derived tables, the number of rows is now 100 instead of 2,
which can be seen in EXPLAIN.
- More tests have "Using index for group by" as the cost of this
optimization is now more correct (lower).
- A primary key could be preferred for a normal key, even if it would
access more rows, as it's faster to do 1 lokoup and 3 'index_next' on a
clustered primary key than one lookup trough a secondary.
(main.stat_tables_innodb)
Notes:
- There was a 4.7% more calls to best_extension_by_limited_search() in
the main.greedy_optimizer test. However examining the test results
it looked that the plans where slightly better (eq_ref where more
chained together) so I assume this is ok.
- I have verified a few test cases where there was notable/unexpected
changes in the plan and in all cases the new optimizer plans where
faster. (main.greedy_optimizer and some others)