1
0
mirror of https://github.com/MariaDB/server.git synced 2025-11-08 00:28:29 +03:00
Commit Graph

155 Commits

Author SHA1 Message Date
Monty
b66cdbd1ea Changing all cost calculation to be given in milliseconds
This makes it easier to compare different costs and also allows
the optimizer to optimizer different storage engines more reliably.

- Added tests/check_costs.pl, a tool to verify optimizer cost calculations.
  - Most engine costs has been found with this program. All steps to
    calculate the new costs are documented in Docs/optimizer_costs.txt

- User optimizer_cost variables are given in microseconds (as individual
  costs can be very small). Internally they are stored in ms.
- Changed DISK_READ_COST (was DISK_SEEK_BASE_COST) from a hard disk cost
  (9 ms) to common SSD cost (400MB/sec).
- Removed cost calculations for hard disks (rotation etc).
- Changed the following handler functions to return IO_AND_CPU_COST.
  This makes it easy to apply different cost modifiers in ha_..time()
  functions for io and cpu costs.
  - scan_time()
  - rnd_pos_time() & rnd_pos_call_time()
  - keyread_time()
- Enhanched keyread_time() to calculate the full cost of reading of a set
  of keys with a given number of ranges and optional number of blocks that
  need to be accessed.
- Removed read_time() as keyread_time() + rnd_pos_time() can do the same
  thing and more.
- Tuned cost for: heap, myisam, Aria, InnoDB, archive and MyRocks.
  Used heap table costs for json_table. The rest are using default engine
  costs.
- Added the following new optimizer variables:
  - optimizer_disk_read_ratio
  - optimizer_disk_read_cost
  - optimizer_key_lookup_cost
  - optimizer_row_lookup_cost
  - optimizer_row_next_find_cost
  - optimizer_scan_cost
- Moved all engine specific cost to OPTIMIZER_COSTS structure.
- Changed costs to use 'records_out' instead of 'records_read' when
  recalculating costs.
- Split optimizer_costs.h to optimizer_costs.h and optimizer_defaults.h.
  This allows one to change costs without having to compile a lot of
  files.
- Updated costs for filter lookup.
- Use a better cost estimate in best_extension_by_limited_search()
  for the sorting cost.
- Fixed previous issues with 'filtered' explain column as we are now
  using 'records_out' (min rows seen for table) to calculate filtering.
  This greatly simplifies the filtering code in
  JOIN_TAB::save_explain_data().

This change caused a lot of queries to be optimized differently than
before, which exposed different issues in the optimizer that needs to
be fixed.  These fixes are in the following commits.  To not have to
change the same test case over and over again, the changes in the test
cases are done in a single commit after all the critical change sets
are done.

InnoDB changes:
- Updated InnoDB to not divide big range cost with 2.
- Added cost for InnoDB (innobase_update_optimizer_costs()).
- Don't mark clustered primary key with HA_KEYREAD_ONLY. This will
  prevent that the optimizer is trying to use index-only scans on
  the clustered key.
- Disabled ha_innobase::scan_time() and ha_innobase::read_time() and
  ha_innobase::rnd_pos_time() as the default engine cost functions now
  works good for InnoDB.

Other things:
- Added  --show-query-costs (\Q) option to mysql.cc to show the query
  cost after each query (good when working with query costs).
- Extended my_getopt with GET_ADJUSTED_VALUE which allows one to adjust
  the value that user is given. This is used to change cost from
  microseconds (user input) to milliseconds (what the server is
  internally using).
- Added include/my_tracker.h  ; Useful include file to quickly test
  costs of a function.
- Use handler::set_table() in all places instead of 'table= arg'.
- Added SHOW_OPTIMIZER_COSTS to sys variables. These are input and
  shown in microseconds for the user but stored as milliseconds.
  This is to make the numbers easier to read for the user (less
  pre-zeros).  Implemented in 'Sys_var_optimizer_cost' class.
- In test_quick_select() do not use index scans if 'no_keyread' is set
  for the table. This is what we do in other places of the server.
- Added THD parameter to Unique::get_use_cost() and
  check_index_intersect_extension() and similar functions to be able
  to provide costs to called functions.
- Changed 'records' to 'rows' in optimizer_trace.
- Write more information to optimizer_trace.
- Added INDEX_BLOCK_FILL_FACTOR_MUL (4) and INDEX_BLOCK_FILL_FACTOR_DIV (3)
  to calculate usage space of keys in b-trees. (Before we used numeric
  constants).
- Removed code that assumed that b-trees has similar costs as binary
  trees. Replaced with engine calls that returns the cost.
- Added Bitmap::find_first_bit()
- Added timings to join_cache for ANALYZE table (patch by Sergei Petrunia).
- Added records_init and records_after_filter to POSITION to remember
  more of what best_access_patch() calculates.
- table_after_join_selectivity() changed to recalculate 'records_out'
  based on the new fields from best_access_patch()

Bug fixes:
- Some queries did not update last_query_cost (was 0). Fixed by moving
  setting thd->...last_query_cost in JOIN::optimize().
- Write '0' as number of rows for const tables with a matching row.

Some internals:
- Engine cost are stored in OPTIMIZER_COSTS structure.  When a
  handlerton is created, we also created a new cost variable for the
  handlerton. We also create a new variable if the user changes a
  optimizer cost for a not yet loaded handlerton either with command
  line arguments or with SET
  @@global.engine.optimizer_cost_variable=xx.
- There are 3 global OPTIMIZER_COSTS variables:
  default_optimizer_costs   The default costs + changes from the
                            command line without an engine specifier.
  heap_optimizer_costs      Heap table costs, used for temporary tables
  tmp_table_optimizer_costs The cost for the default on disk internal
                            temporary table (MyISAM or Aria)
- The engine cost for a table is stored in table_share. To speed up
  accesses the handler has a pointer to this. The cost is copied
  to the table on first access. If one wants to change the cost one
  must first update the global engine cost and then do a FLUSH TABLES.
  This was done to be able to access the costs for an open table
  without any locks.
- When a handlerton is created, the cost are updated the following way:
  See sql/keycaches.cc for details:
  - Use 'default_optimizer_costs' as a base
  - Call hton->update_optimizer_costs() to override with the engines
    default costs.
  - Override the costs that the user has specified for the engine.
  - One handler open, copy the engine cost from handlerton to TABLE_SHARE.
  - Call handler::update_optimizer_costs() to allow the engine to update
    cost for this particular table.
  - There are two costs stored in THD. These are copied to the handler
    when the table is used in a query:
    - optimizer_where_cost
    - optimizer_scan_setup_cost
- Simply code in best_access_path() by storing all cost result in a
  structure. (Idea/Suggestion by Igor)
2023-02-02 23:54:45 +03:00
Luis Eduardo Oliveira Lizardo
ad7631bdce MDEV-28926 Add time spent on query optimizer to JSON ANALYZE (#2193)
* Add query optimizer timer to ANALYZE FORMAT=JSON

* Adapt tests and results

* Change logic to always close the writer after printing query blocks
2022-10-26 09:18:29 +03:00
Sergei Petrunia
9841a8080b Fix comment. 2022-04-29 17:29:47 +03:00
Sergei Petrunia
3f68c2169e MDEV-28201: Server crashes upon SHOW ANALYZE/EXPLAIN FORMAT=JSON
- Describe the lifetime of EXPLAIN data structures in
  sql_explain.h:ExplainDataStructureLifetime.

- Make Item_field::set_field() call set_refers_to_temp_table()
  when it refers to a temp. table.
- Introduce QT_DONT_ACCESS_TMP_TABLES flag for Item::print.
  It directs Item_field::print to not try access its the
  temp table.
- Introduce Explain_query::notify_tables_are_closed()
  and call it right before the query closes its tables.
- Make Explain data stuctures' print_explain_json() methods
  accept "no_tmp_tbl" parameter which means pass
  QT_DONT_ACCESS_TMP_TABLES when printing items.
- Make Show_explain_request::call_in_target_thread() not call
  set_current_thd(). This wasn't needed as the code inside
  lex->print_explain() uses output->thd anyway. output->thd
  refers to the SHOW command's THD object.
2022-04-29 10:48:26 +03:00
Oleg Smirnov
02c3babdec MDEV-28124 Server crashes in Explain_aggr_filesort::print_json_members
SHOW EXPLAIN/ANALYZE FORMAT=JSON tries to access items that have already been
freed by a call to free_items() during THD::cleanup_after_query().
The solution is to disallow APC calls including SHOW EXPLAIN/ANALYZE
just before the call to free_items().
2022-04-29 10:48:25 +03:00
Oleg Smirnov
a0475cb9ca MDEV-27021 Add explicit indication of SHOW EXPLAIN/ANALYZE.
1. Add explicit indication that the output is produced by
SHOW EXPLAIN/ANALYZE FORMAT=JSON command.
2. Remove useless "r_total_time_ms" field from SHOW ANALYZE FORMAT=JSON
output when there is no timed statistics gathered.
3. Add "r_query_time_in_progress_ms" to the output of SHOW ANALYZE FORMAT=JSON.
2022-04-29 10:48:25 +03:00
Oleg Smirnov
328684833b MDEV-10000 Add EXPLAIN [FORMAT=JSON] FOR CONNECTION syntax support
EXPLAIN FOR CONNECTION is a MySQL-compatible syntax for SHOW EXPLAIN.
This commit also adds support for FORMAT=JSON to SHOW EXPLAIN,
so the possible options to get JSON-formatted output are:
- SHOW EXPLAIN FORMAT=JSON FOR $con
- EXPLAIN FORMAT=JSON FOR CONNECTION $con
2022-04-29 10:47:03 +03:00
Sergei Krivonos
c9fcea14e9 MDEV-27036: re-enable my_json_writer-t unit test 2021-12-15 21:36:56 +02:00
Sergei Golubchik
1e8bcbd0a0 Revert "MDEV-27036: re-enable my_json_writer-t unit test"
This reverts commit 2d21917e7d.

No explainations, lots of code moved, wrong cmake changes
2021-12-07 09:57:51 +01:00
Sergei Krivonos
2d21917e7d MDEV-27036: re-enable my_json_writer-t unit test 2021-12-04 22:25:46 -05:00
Marko Mäkelä
a8379e53e8 Merge 10.5 into 10.6
The changes to galera.galear_var_replicate_myisam_on
in commit d9b933bec6
are omitted due to conflicts
with commit 27d66d644c.
2021-10-13 13:28:12 +03:00
Marko Mäkelä
99bb3fb656 Merge 10.4 into 10.5 2021-10-13 12:33:56 +03:00
Marko Mäkelä
a736a3174a Merge 10.3 into 10.4 2021-10-13 12:03:32 +03:00
Marko Mäkelä
4a7dfda373 Merge 10.2 into 10.3 2021-10-13 11:38:21 +03:00
Sergei Petrunia
26eb666463 Make Explain_node::children protected 2021-10-08 19:17:06 +03:00
Alexey Botchkov
e9fd327ee3 MDEV-17399 Add support for JSON_TABLE.
The specific table handler for the table functions was introduced,
and used to implement JSON_TABLE.
2021-04-21 10:21:43 +04:00
Sergei Petrunia
2656e87682 Cleanup: fake_select_lex->select_number=FAKE_SELECT_LEX_ID, not [U]INT_MAX
SELECT_LEX objects that are "fake_select_lex" (i.e read UNION output)
used both INT_MAX and UINT_MAX as select_number.
- mysql_explain_union() assigned UINT_MAX
- st_select_lex_unit::add_fake_select_lex assigned INT_MAX

This didn't matter initially (before EXPLAIN FORMAT=JSON), because the
code  had no checks for this value.

EXPLAIN FORMAT=JSON and later other features did introduce checks for
select_number values. The check had to check for two constants and
looked really confusing.

This patch joins the two constants into one - FAKE_SELECT_LEX_ID.
2021-04-17 00:05:29 +03:00
Marko Mäkelä
b30a013142 Merge 10.4 into 10.5 2020-05-13 14:25:06 +03:00
Marko Mäkelä
38f6c47f8a Merge 10.3 into 10.4 2020-05-13 12:52:57 +03:00
Sergei Petrunia
d01d94d77b MDEV-17568: LATERAL DERIVED is not clearly visible in EXPLAIN FORMAT=JSON
Make LATERAL DERIVED tables visible in EXPLAIN FORMAT=JSON output.
2020-05-06 23:44:34 +03:00
Sergei Golubchik
7c58e97bf6 perfschema memory related instrumentation changes 2020-03-10 19:24:22 +01:00
Sergei Petrunia
68ed3a81f2 MDEV-20854: ANALYZE for statements: not clear where the time is spent
Count the "gap" time between table accesses and display it as
r_other_time_ms in the "table" element.

* The advantage of this approach is that it doesn't add any new
  my_timer_cycles() calls.
* The disadvantage is that the definition of what is done during
  "other time" is not that clear: it includes checking the WHERE
  (for this table), constructing index lookup tuple (for the next table)
  writing to GROUP BY temporary table (as we dont account for that time
  separately [yet], etc)
2019-11-12 14:40:00 +03:00
Marko Mäkelä
2fd82471ab Merge 10.3 into 10.4 2019-06-12 08:37:27 +03:00
Marko Mäkelä
b42dbdbccd Merge 10.2 into 10.3 2019-06-11 13:00:18 +03:00
Sergei Petrunia
5d06edfb26 MDEV-19714: JOIN::pseudo_bits_cond is not visible in EXPLAIN FORMAT=JSON
Make it visible
2019-06-08 02:28:29 +03:00
Oleksandr Byelkin
c07325f932 Merge branch '10.3' into 10.4 2019-05-19 20:55:37 +02:00
Marko Mäkelä
be85d3e61b Merge 10.2 into 10.3 2019-05-14 17:18:46 +03:00
Marko Mäkelä
26a14ee130 Merge 10.1 into 10.2 2019-05-13 17:54:04 +03:00
Vicențiu Ciorbaru
cb248f8806 Merge branch '5.5' into 10.1 2019-05-11 22:19:05 +03:00
Igor Babaev
98d55b1366 Merge branch '10.4' into bb-10.4-mdev16188 2019-02-14 22:07:33 -08:00
Igor Babaev
16327fc2e7 MDEV-17096 Pushdown of simple derived tables to storage engines
MDEV-17631 select_handler for a full query pushdown

Interfaces + Proof of Concept for federatedx with test cases.

The interfaces have been developed for integration of ColumnStore engine.
2019-02-06 17:02:44 -08:00
Galina Shalygina
447e0f023f MDEV-18144: ANALYZE for statement support for PK filters
ANALYZE and ANALYZE FORMAT=JSON structures are changed in the way that they
show additional information when rowid filter is used:

- r_selectivity_pct - the observed filter selectivity
- r_buffer_size - the size of the rowid filter container buffer
- r_filling_time_ms - how long it took to fill rowid filter container

New class Rowid_filter_tracker was added. This class is needed to collect data
about how rowid filter is executed.
2019-02-06 23:40:07 +03:00
Igor Babaev
658128af43 MDEV-16188 Use in-memory PK filters built from range index scans
This patch contains a full implementation of the optimization
that allows to use in-memory rowid / primary filters built for range  
conditions over indexes. In many cases usage of such filters reduce  
the number of disk seeks spent for fetching table rows.

In this implementation the choice of what possible filter to be applied  
(if any) is made purely on cost-based considerations.

This implementation re-achitectured the partial implementation of
the feature pushed by Galina Shalygina in the commit
8d5a11122c.

Besides this patch contains a better implementation of the generic  
handler function handler::multi_range_read_info_const() that
takes into account gaps between ranges when calculating the cost of
range index scans. It also contains some corrections of the
implementation of the handler function records_in_range() for MyISAM.

This patch supports the feature for InnoDB and MyISAM.
2019-02-03 14:56:12 -08:00
Galina Shalygina
8d5a11122c MDEV-16188: Use in-memory PK filters built from range index scans
First phase: make optimizer choose to use filter and show it in EXPLAIN.
2018-09-28 23:50:22 +03:00
Marko Mäkelä
b006d2ead4 Merge bb-10.2-ext into 10.3 2018-02-15 10:22:03 +02:00
Alexander Barkov
3cad31f2a7 Merge remote-tracking branch 'origin/10.2' into bb-10.2-ext 2018-02-08 19:06:25 +04:00
Sergei Golubchik
4771ae4b22 Merge branch 'github/10.1' into 10.2 2018-02-06 14:50:50 +01:00
Oleksandr Byelkin
80d3eee072 MDEV-14857: problem with 10.2.11 server crashing when executing stored procedure
Counter for select numbering made stored with the statement (before was global)
So now it does have always accurate value which does not depend on
interruption of statement prepare by errors like lack of table in
a view definition.
2018-02-01 09:51:47 +01:00
Marko Mäkelä
145ae15a33 Merge bb-10.2-ext into 10.3 2018-01-04 09:22:59 +02:00
Monty
fbab79c9b8 Merge remote-tracking branch 'origin/10.2' into bb-10.2-ext
Conflicts:
	cmake/make_dist.cmake.in
	mysql-test/r/func_json.result
	mysql-test/r/ps.result
	mysql-test/t/func_json.test
	mysql-test/t/ps.test
	sql/item_cmpfunc.h
2018-01-01 19:39:59 +02:00
Varun Gupta
a118c20c81 MDEV-10844: EXPLAIN FORMAT=JSON doesn't show order direction for filesort
Currently explain format=json does not show the order direction of fields used during filesort.
This patch would remove this limitation
2017-12-30 10:18:22 +05:30
Michael Widenius
87933d5261 Handle failures from malloc
Most "new" failures fixed in the following files:
- sql_select.cc
- item.cc
- item_func.cc
- opt_subselect.cc

Other things:
- Allocate udf_handler strings in mem_root
  - Required changes in sql_string.h
- Add mem_root as argument to some new [] calls
- Mark udf_handler strings as thread specific
- Removed some comment blocks with code
2017-11-17 07:30:05 +02:00
Alexander Barkov
f00a314f9a Merge remote-tracking branch 'origin/10.2' into bb-10.2-ext 2017-03-31 16:40:29 +04:00
Sergei Golubchik
da4d71d10d Merge branch '10.1' into 10.2 2017-03-30 12:48:42 +02:00
Oleksandr Byelkin
05d3c3d3f7 MDEV-10141: Add support for INTERSECT (and common parts for EXCEPT)
MDEV-10140: Add support for EXCEPT
2017-03-14 11:52:00 +01:00
iangilfillan
f0ec34002a Correct FSF address 2017-03-10 18:21:29 +01:00
Sergei Petrunia
a2f245e49f MDEV-10372: EXPLAIN fixes for recursive CTEs, including FORMAT=JSON
- Tabular EXPLAIN now prints "RECURSIVE UNION".
- There is a basic implementation of EXPLAIN FORMAT=JSON.
- it produces "recursive_union" JSON struct
- No other details or ANALYZE support, yet.
2016-08-08 23:02:52 +03:00
Galina Shalygina
be1d06c8a5 Merge branch '10.2' into 10.2-mdev9864 2016-05-08 23:04:41 +03:00
Sergei Petrunia
ab44e892d8 MDEV-9652: EXPLAIN FORMAT=JSON should show outer_ref_cond
Show outer_ref_condition in EXPLAIN FORMAT=JSON output.
2016-02-28 18:18:29 +03:00
Sergei Petrunia
7016621596 MDEV-8829: Assertion `0' failed in Explain_table_access::tag_to_json
- Add EXPLAIN/ANALYZE FORMAT=JSON handling for a few special cases.
2015-09-24 15:45:54 +03:00