1
0
mirror of https://github.com/MariaDB/server.git synced 2025-08-08 11:22:35 +03:00
Commit Graph

28 Commits

Author SHA1 Message Date
Monty
dfdedd46e4 MDEV-32188 make TIMESTAMP use whole 32-bit unsigned range
This patch extends the timestamp from
2038-01-19 03:14:07.999999 to 2106-02-07 06:28:15.999999
for 64 bit hardware and OS where 'long' is 64 bits.
This is true for 64 bit Linux but not for Windows.

This is done by treating the 32 bit stored int as unsigned instead of
signed.  This is safe as MariaDB has never accepted dates before the epoch
(1970).
The benefit of this approach that for normal timestamp the storage is
compatible with earlier version.

However for tables using system versioning we before stored a
timestamp with the year 2038 as the 'max timestamp', which is used to
detect current values.  This patch stores the new 2106 year max value
as the max timestamp. This means that old tables using system
versioning needs to be updated with mariadb-upgrade when moving them
to 11.4. That will be done in a separate commit.
2024-05-27 12:39:02 +02:00
Alexander Barkov
351a8eecf0 MDEV-32148 Inefficient WHERE timestamp_column=datetime_const_expr
Changing the way how a the following conditions are evaluated:

    WHERE timestamp_column=datetime_const_expr

(for all comparison operators: =, <=>, <, >, <=, >=, <> and for NULLIF)

Before the change it was always performed as DATETIME.
That was not efficient, as involved per-row TIMESTAMP->DATETIME conversion
for timestamp_column. For example, in case of the SYSTEM time zone
it involved a localtime_r() call, which is known to be slow.

After the change it's performed as TIMESTAMP in many cases.
This allows to avoid per-row conversion, as it works the other way around:
datetime_const_expr is converted to TIMESTAMP once before the execution stage.

Note, datetime_const_expr must be inside monotone continuous periods of
the current time zone, i.e. not near these anomalies:
- DST changes (spring forward, fall back)
- leap seconds
2024-01-12 15:24:05 +04:00
Monty
3fa99f0c0e Change cost for REF to take into account cost for 1 extra key read_next
The main difference in code path between EQ_REF and REF is that for
REF we have to do an extra read_next on the index to check that there
is no more matching rows.

Before this patch we added a preference of EQ_REF by ensuring that REF
would always estimate to find at least 2 rows.

This patch adds the cost of the extra key read_next to REF access and
removes the code that limited REF to at least 2 rows. For some queries
this can have a big effect as the total estimated rows will be halved
for each REF table with 1 rows.

multi_range cost calculations are also changed to take into account
the difference between EQ_REF and REF.

The effect of the patch to the test suite:
- About 80 test case changed
- Almost all changes where for EXPLAIN where estimated rows for REF
  where changed from 2 to 1.
- A few test cases using explain extended had a change of 'filtered'.
  This is because of the estimated rows are now closer to the
  calculated selectivity.
- A very few test had a change of table order.
  This is because the change of estimated rows from 2 to 1 or the small
  cost change for REF
  (main.subselect_sj_jcl6, main.group_by, main.dervied_cond_pushdown,
  main.distinct, main.join_nested, main.order_by, main.join_cache)
- No key statistics and the estimated rows are now smaller which cased
  estimated filtering to be lower.
  (main.subselect_sj_mat)
- The number of total rows are halved.
  (main.derived_cond_pushdown)
- Plans with 1 row changed to use RANGE instead of REF.
  (main.group_min_max)
- ALL changed to REF
  (main.key_diff)
- Key changed from ref + index_only to PRIMARY key for InnoDB, as
  OPTIMIZER_ROW_LOOKUP_COST + OPTIMIZER_ROW_NEXT_FIND_COST is smaller than
  OPTIMIZER_KEY_LOOKUP_COST + OPTIMIZER_KEY_NEXT_FIND_COST.
  (main.join_outer_innodb)
- Cost changes printouts
  (main.opt_trace*)
- Result order change
  (innodb_gis.rtree)
2023-02-10 12:58:50 +02:00
Monty
727491b72a Added test cases for preceding test
This includes all test changes from
"Changing all cost calculation to be given in milliseconds"
and forwards.

Some of the things that caused changes in the result files:

- As part of fixing tests, I added 'echo' to some comments to be able to
  easier find out where things where wrong.
- MATERIALIZED has now a higher cost compared to X than before. Because
  of this some MATERIALIZED types have changed to DEPENDEND SUBQUERY.
  - Some test cases that required MATERIALIZED to repeat a bug was
    changed by adding more rows to force MATERIALIZED to happen.
- 'Filtered' in SHOW EXPLAIN has in many case changed from 100.00 to
  something smaller. This is because now filtered also takes into
  account the smallest possible ref access and filters, even if they
  where not used. Another reason for 'Filtered' being smaller is that
  we now also take into account implicit filtering done for subqueries
  using FIRSTMATCH.
  (main.subselect_no_exists_to_in)
  This is caluculated in best_access_path() and stored in records_out.
- Table orders has changed because more accurate costs.
- 'index' and 'ALL' for small tables has changed to use 'range' or
   'ref' because of optimizer_scan_setup_cost.
- index can be changed to 'range' as 'range' optimizer assumes we don't
  have to read the blocks from disk that range optimizer has already read.
  This can be confusing in the case where there is no obvious where clause
  but instead there is a hidden 'key_column > NULL' added by the optimizer.
  (main.subselect_no_exists_to_in)
- Scan on primary clustered key does not report 'Using Index' anymore
  (It's a table scan, not an index scan).
- For derived tables, the number of rows is now 100 instead of 2,
  which can be seen in EXPLAIN.
- More tests have "Using index for group by" as the cost of this
  optimization is now more correct (lower).
- A primary key could be preferred for a normal key, even if it would
  access more rows, as it's faster to do 1 lokoup and 3 'index_next' on a
  clustered primary key than one lookup trough a secondary.
  (main.stat_tables_innodb)

Notes:

- There was a 4.7% more calls to best_extension_by_limited_search() in
  the main.greedy_optimizer test.  However examining the test results
  it looked that the plans where slightly better (eq_ref where more
  chained together) so I assume this is ok.
- I have verified a few test cases where there was notable/unexpected
  changes in the plan and in all cases the new optimizer plans where
  faster.  (main.greedy_optimizer and some others)
2023-02-03 00:00:35 +03:00
Marko Mäkelä
9608773f75 MDEV-4750 follow-up: Reduce disabling innodb_stats_persistent
This essentially reverts commit 4e89ec6692
and only disables InnoDB persistent statistics for tests where it is
desirable. By design, InnoDB persistent statistics will not be updated
except by ANALYZE TABLE or by STATS_AUTO_RECALC.

The internal transactions that update persistent InnoDB statistics
in background tasks (with innodb_stats_auto_recalc=ON) may cause
nondeterministic query plans or interfere with some tests that deal
with other InnoDB internals, such as the purge of transaction history.
2021-08-31 13:55:02 +03:00
Marko Mäkelä
1657b7a583 Merge 10.4 to 10.5 2020-10-22 17:08:49 +03:00
Marko Mäkelä
46957a6a77 Merge 10.3 into 10.4 2020-10-22 13:27:18 +03:00
Aleksey Midenkov
ddea8f6a39 MDEV-23779 Error upon querying the view, that selecting from versioned table with partitions
PARTITION clause in SELECT means query is non-versioned (see
WITH_PARTITION_STORAGE_ENGINE in vers_setup_conds()).

vers_setup_conds() expands such query to SYSTEM_TIME_ALL which is then
added to VIEW specification. When VIEW is queried both clauses
PARTITION and FOR SYSTEM_TIME ALL lead to ER_VERS_QUERY_IN_PARTITION
(same place WITH_PARTITION_STORAGE_ENGINE).

Fix removes FOR SYSTEM_TIME ALL from VIEW by accessing original
SYSTEM_TIME clause: the one specified in parser. As a side-effect
EXPLAIN SELECT displays SYSTEM_TIME specified in SELECT which is
user-friendly.
2020-10-20 10:49:54 +03:00
Monty
eb483c5181 Updated optimizer costs in multi_range_read_info_const() and sql_select.cc
- multi_range_read_info_const now uses the new records_in_range interface
- Added handler::avg_io_cost()
- Don't calculate avg_io_cost() in get_sweep_read_cost if avg_io_cost is
  not 1.0.  In this case we trust the avg_io_cost() from the handler.
- Changed test_quick_select to use TIME_FOR_COMPARE instead of
  TIME_FOR_COMPARE_IDX to align this with the rest of the code.
- Fixed bug when using test_if_cheaper_ordering where we didn't use
  keyread if index was changed
- Fixed a bug where we didn't use index only read when using order-by-index
- Added keyread_time() to HEAP.
  The default keyread_time() was optimized for blocks and not suitable for
  HEAP. The effect was the HEAP prefered table scans over ranges for btree
  indexes.
- Fixed get_sweep_read_cost() for HEAP tables
- Ensure that range and ref have same cost for simple ranges
  Added a small cost (MULTI_RANGE_READ_SETUP_COST) to ranges to ensure
  we favior ref for range for simple queries.
- Fixed that matching_candidates_in_table() uses same number of records
  as the rest of the optimizer
- Added avg_io_cost() to JT_EQ_REF cost. This helps calculate the cost for
  HEAP and temporary tables better. A few tests changed because of this.
- heap::read_time() and heap::keyread_time() adjusted to not add +1.
  This was to ensure that handler::keyread_time() doesn't give
  higher cost for heap tables than for normal tables. One effect of
  this is that heap and derived tables stored in heap will prefer
  key access as this is now regarded as cheap.
- Changed cost for index read in sql_select.cc to match
  multi_range_read_info_const(). All index cost calculation is now
  done trough one function.
- 'ref' will now use quick_cost for keys if it exists. This is done
  so that for '=' ranges, 'ref' is prefered over 'range'.
- scan_time() now takes avg_io_costs() into account
- get_delayed_table_estimates() uses block_size and avg_io_cost()
- Removed default argument to test_if_order_by_key(); simplifies code
2020-03-27 03:58:32 +02:00
Marko Mäkelä
db4a27ab73 Merge 10.3 into 10.4 2019-08-31 06:53:45 +03:00
Sergei Petrunia
ef76f81c98 MDEV-20109: Optimizer ignores distinct key created for materialized...
(Backported to 10.3, addressed review input)

Sj_materialization_picker::check_qep(): fix error in cost/fanout
calculations:
- for each join prefix, add #prefix_rows / TIME_FOR_COMPARE to the cost,
  like best_extension_by_limited_search does
- Remove the fanout produced by the subquery tables.
- Also take into account join condition selectivity

optimize_wo_join_buffering() (used by LooseScan and FirstMatch)
- also add #prefix_rows / TIME_FOR_COMPARE to the cost of each prefix.
- Also take into account join condition selectivity
2019-08-30 12:02:40 +03:00
Varun Gupta
93c360e3a5 MDEV-15253: Default optimizer setting changes for MariaDB 10.4
use_stat_tables= PREFERABLY
optimizer_use_condition_selectivity= 4
2018-12-09 09:22:00 +05:30
Igor Babaev
1b45ede6ab Adjusted test results after mdev-15159. 2018-05-14 19:02:06 -07:00
Sergei Golubchik
531acda484 MDEV-14820 System versioning is applied incorrectly to CTEs
Make sure that SELECT_LEX_UNIT::derived, behaves as documented
(points to the "TABLE_LIST representing this union in the
embedding select"). For recursive CTE this was not necessarily
the case, it could've pointed to the TABLE_LIST inside the CTE,
not in the embedding select.

To fix:
* don't update unit->derived in mysql_derived_prepare(), pass derived
  as an argument to st_select_lex_unit::prepare()
* prefer to set unit->derived in TABLE_LIST::init_derived()
  to the TABLE_LIST in the embedding select, not to the recursive
  reference. Fail if there are many TABLE_LISTs in the embedding
  select with conflicting FOR SYSTEM_TIME clauses.

cleanup:
* remove redundant THD* argument from st_select_lex_unit::prepare()
2018-05-12 10:16:45 +02:00
Aleksey Midenkov
8efca72f4a MDEV-14792 INSERT without column list into table with explicit versioning columns produces bad data 2018-01-01 23:37:02 +03:00
Aleksey Midenkov
b55a149194 Timestamp-based versioning for InnoDB [closes #209]
* Removed integer_fields check
* Reworked Vers_parse_info::check_sys_fields()
* Misc renames
* versioned as vers_sys_type_t

* Removed versioned_by_sql(), versioned_by_engine()

versioned() works as before;
versioned(VERS_TIMESTAMP) is versioned_by_sql();
versioned(VERS_TRX_ID) is versioned_by_engine().

* create_tmp_table() fix
* Foreign constraints for timestamp-based
* Range auto-specifier fix
* SQL: 1-row partition rotation fix [fixes #260]
* Fix 'drop system versioning, algorithm=inplace'
2017-12-18 19:03:51 +03:00
Aleksey Midenkov
4624e565f3 System Versioning 1.0 pre6
Merge remote-tracking branch 'mariadb/bb-10.3-temporal-serg' into trunk
2017-12-15 18:12:18 +03:00
Sergei Golubchik
ca6454bcfe for now, remove FOR SYSTEM_TIME at the end of the query
non-standard, redundant, potentially risky in the future,
hides bugs. See #383, #384, #385

Fixed a parser bug where
SELECT * FROM (t1 join t2) FOR SYSTEM_TIME ...
was not an error.
2017-12-13 21:51:20 +01:00
Aleksey Midenkov
bc4a86699d SQL: recursive CTE inner derived vers_conditions [fix #385] 2017-12-13 15:31:46 +03:00
Aleksey Midenkov
84b718ae70 SQL: derived SYSTEM_TIME clash detection [closes #371] 2017-12-08 16:26:17 +03:00
Sergei Golubchik
ea1ccfa500 SQL: regression fix: make NOW a valid identifier again [#363]
* again, as in 10.2, NOW is a keyword only if followed by parentheses
* use AS OF CURRENT_TIMESTAMP or AS OF NOW()
* AS OF CURRENT_TIMESTAMP and AS OF NOW() mean AS OF NOW(6),
  not AS OF NOW(0), (same behavior as in a DEFAULT clause)
2017-12-08 16:24:56 +03:00
Sergei Golubchik
e60da371d1 fix versioning tests not to fail w/o innodb 2017-12-05 17:46:01 +03:00
Sergei Golubchik
3198bc839d Parser: unreserve keywords
SELECT * FROM t1 FOR SYSTEM_TIME AS OF ...

becomes ambiguous, but it's the same as with

SELECT ... UNION SELECT ... ORDER BY ...
2017-12-05 15:09:09 +03:00
Aleksey Midenkov
11a9d8f7e3 Tests: typo fix in cte.test
Related to c2c8808a16
2017-09-20 13:14:16 +03:00
Aleksey Midenkov
c2c8808a16 SQL: compare TRX_ID fields against timestamps [closes #231] 2017-08-03 16:01:16 +03:00
Aleksey Midenkov
aa292666cc Parser: moved 'for system_time' before alias
Due to standard (see 7.6 <table reference>).
2017-07-23 17:08:00 +03:00
Aleksey Midenkov
91c8b43e77 Parser: syntax for query system_time [closes #230]
Eliminated `QUERY FOR`.
2017-07-12 12:10:13 +03:00
Aleksey Midenkov
60e456df33 SQL: system_time propagation from derived table [fixes #228] 2017-07-12 10:36:52 +03:00