Running an UPDATE statement in PS mode and having positional
parameter(s) bound with an array of actual values (that is
prepared to be run in bulk mode) results in incorrect behaviour
in presence of on update trigger that also executes an UPDATE
statement. The same is true for handling a DELETE statement in
presence of on delete trigger. Typically, the visible effect of
such incorrect behaviour is expressed in a wrong number of
updated/deleted rows of a target table. Additionally, in case UPDATE
statement, a number of modified rows and a state message returned
by a statement contains wrong information about a number of modified rows.
The reason for incorrect number of updated/deleted rows is that
a data structure used for binding positional argument with its
actual values is stored in THD (this is thd->bulk_param) and reused
on processing every INSERT/UPDATE/DELETE statement. It leads to
consuming actual values bound with top-level UPDATE/DELETE statement
by other DML statements used by triggers' body.
To fix the issue, reset the thd->bulk_param temporary to the value
nullptr before invoking triggers and restore its value on finishing
its execution.
The second part of the problem relating with wrong value of affected
rows reported by Connector/C API is caused by the fact that diagnostics
area is reused by an original DML statement and a statement invoked
by a trigger. This fact should be take into account on finalizing a
state of diagnostics area on completion running of a statement.
Important remark: in case the macros DBUG_OFF is on, call of the method
Diagnostics_area::reset_diagnostics_area()
results in reset of the data members
m_affected_rows, m_statement_warn_count.
Values of these data members of the class Diagnostics_area are used on
sending OK and EOF messages. In case DML statement is executed in PS bulk
mode such resetting results in sending wrong result values to a client
for affected rows in case the DML statement fires a triggers. So, reset
these data members only in case the current statement being processed
is not run in bulk mode.
This was caused by wrong allocation of variable on stack.
(Was allocating 4K of data instead of 512 bytes).
No test case as the original MDEV test cases is not usable for mtr.
This patch fixes the issue with passing the DEFAULT or IGNORE values to
positional parameters for some kind of SQL statements to be executed
as prepared statements.
The main idea of the patch is to associate an actual value being passed
by the USING clause with the positional parameter represented by
the Item_param class. Such association must be performed on execution of
UPDATE statement in PS/SP mode. Other corner cases that results in
server crash is on handling CREATE TABLE when positional parameter
placed after the DEFAULT clause or CALL statement and passing either
the value DEFAULT or IGNORE as an actual value for the positional parameter.
This case is fixed by checking whether an error is set in diagnostics
area at the function pack_vcols() on return from the function pack_expression()
Enable unusable key notes for non-equality predicates:
<, <=, =>, >, BETWEEN, IN, LIKE
Note, in some scenarios it displays duplicate notes, e.g.
for queries with ORDER BY:
SELECT * FROM t1
WHERE indexed_string_column >= 10
ORDER BY indexed_string_column
LIMIT 5;
This should be tolarable. Getting rid of the diplicate note
completely would need a much more complex patch, which is
not desiable in 10.6.
Details:
- Changing RANGE_OPT_PARAM::note_unusable_keys from bool
to a new data type Item_func::Bitmap, so the caller can
choose with a better granuality which predicates
should raise unusable key notes inside the range optimizer:
a. all predicates (=, <=>, <, <=, =>, >, BETWEEN, IN, LIKE)
b. all predicates except equality (=, <=>)
c. none of the predicates
"b." is needed because in some scenarios equality predicates (=, <=>)
send unusable key notes at an earlier stage, before the range optimizer,
during update_ref_and_keys(). Calling the range optimizer with
"all predicates" would produce duplicate notes for = and <=> in such cases.
- Fixing get_quick_record_count() to call the range optimizer
with "all predicates except equality" instead of "none of the predicates".
Before this change the range optimizer suppressed all notes for
non-equality predicates: <, <=, =>, >, BETWEEN, IN, LIKE.
This actually fixes the reported problem.
- Fixing JOIN::make_range_rowid_filters() to call the range optimizer
with "all predicates except equality" instead of "all predicates".
Before this change the range optimizer produced duplicate notes
for = and <=> during a rowid_filter optimization.
- Cleanup:
Adding the op_collation argument to Field::raise_note_cannot_use_key_part()
and displaying the operation collation rather than the argument collation
in the unusable key note. This is important for operations with more than
two arguments: BETWEEN and IN, e.g.:
SELECT * FROM t1
WHERE column_utf8mb3_general_ci
BETWEEN 'a' AND 'b' COLLATE utf8mb3_unicode_ci;
SELECT * FROM t1
WHERE column_utf8mb3_general_ci
IN ('a', 'b' COLLATE utf8mb3_unicode_ci);
The note for 'a' now prints utf8mb3_unicode_ci as the collation.
which is the collation of the entire operation:
Cannot use key key1 part[0] for lookup:
"`column_utf8mb3_general_ci`" of collation `utf8mb3_general_ci` >=
"'a'" of collation `utf8mb3_unicode_ci`
Before this change it printed the collation of 'a',
so the note was confusing:
Cannot use key key1 part[0] for lookup:
"`column_utf8mb3_general_ci`" of collation `utf8mb3_general_ci` >=
"'a'" of collation `utf8mb3_general_ci`"
First UPDATE under START TRANSACTION does nothing (nstate= nstate),
but anyway generates history. Since update vector is empty we get into
(!uvect->n_fields) branch which only adds history row, but does not do
update. After that we get current row with wrong (old) row_start value
and because of that second UPDATE tries to insert history row again
because it sees trx->id != row_start which is the guard to avoid
inserting multiple trx_id-based history rows under same transaction
(because we have same trx_id and we get duplicate error and this bug
demostrates that). But this try anyway fails because PK is based on
row_end which is constant under same transaction, so PK didn't change.
The fix moves vers_make_update() to an earlier stage of
calc_row_difference(). Therefore it prepares update vector before
(!uvect->n_fields) check and never gets into that branch, hence no
need to handle versioning inside that condition anymore.
Now trx->id and row_start are equal after first UPDATE and we don't
try to insert second history row.
== Cleanups and improvements ==
ha_innobase::update_row():
vers_set_fields and vers_ins_row are cleaned up into direct condition
check. SQLCOM_ALTER_TABLE check now is not used as this is dead code,
assertion is done instead.
upd_node->is_delete is set in calc_row_difference() just to keep
versioning code as much in one place as possible. vers_make_delete()
is still located in row_update_for_mysql() as this is required for
ha_innodbase::delete_row() as well.
row_ins_duplicate_error_in_clust():
Restrict DB_FOREIGN_DUPLICATE_KEY to the better conditions.
VERSIONED_DELETE is used specifically to help lower stack to
understand what caused current insert. Related to MDEV-29813.
The new statistics is enabled by adding the "engine", "innodb" or "full"
option to --log-slow-verbosity
Example output:
# Pages_accessed: 184 Pages_read: 95 Pages_updated: 0 Old_rows_read: 1
# Pages_read_time: 17.0204 Engine_time: 248.1297
Page_read_time is time doing physical reads inside a storage engine.
(Writes cannot be tracked as these are usually done in the background).
Engine_time is the time spent inside the storage engine for the full
duration of the read/write/update calls. It uses the same code as
'analyze statement' for calculating the time spent.
The engine statistics is done with a generic interface that should be
easy for any engine to use. It can also easily be extended to provide
even more statistics.
Currently only InnoDB has counters for Pages_% and Undo_% status.
Engine_time works for all engines.
Implementation details:
class ha_handler_stats holds all engine stats. This class is included
in handler and THD classes.
While a query is running, all statistics is updated in the handler. In
close_thread_tables() the statistics is added to the THD.
handler::handler_stats is a pointer to where statistics should be
collected. This is set to point to handler::active_handler_stats if
stats are requested. If not, it is set to 0.
handler_stats has also an element, 'active' that is 1 if stats are
requested. This is to allow engines to avoid doing any 'if's while
updating the statistics.
Cloned or partition tables have the pointer set to the base table if
status are requested.
There is a small performance impact when using --log-slow-verbosity=engine:
- All engine calls in 'select' will be timed.
- IO calls for InnoDB reads will be timed.
- Incrementation of counters are done on local variables and accesses
are inline, so these should have very little impact.
- Statistics has to be reset for each statement for the THD and each
used handler. This is only 40 bytes, which should be neglectable.
- For partition tables we have to loop over all partitions to update
the handler_status as part of table_init(). Can be optimized in the
future to only do this is log-slow-verbosity changes. For this to work
we have to update handler_status for all opened partitions and
also for all partitions opened in the future.
Other things:
- Added options 'engine' and 'full' to log-slow-verbosity.
- Some of the new files in the test suite comes from Percona server, which
has similar status information.
- buf_page_optimistic_get(): Do not increment any counter, since we are
only validating a pointer, not performing any buf_pool.page_hash lookup.
- Added THD argument to save_explain_data_intern().
- Switched arguments for save_explain_.*_data() to have
always THD first (generates better code as other functions also have THD
first).
EXPLAIN EXTENDED for an UPDATE/DELETE/INSERT/REPLACE statement did not
produce the warning containing the text representation of the query
obtained after the optimization phase. Such warning was produced for
SELECT statements, but not for DML statements.
The patch fixes this defect of EXPLAIN EXTENDED for DML statements.
Remove table_count from Query_tables_list (not used, moved to MYSQL_LOCK).
Rename table_count from LEX to avoid mixing it with other counters of tables.
Cause: a copy of the joined TABLE_LIST is created during multi_update::prepare
and TABLE::pos_in_table_list of the tables are set to point to the new
TABLE_LIST object. This prevents some optimization steps to perform correctly.
Solution: do not update pos_in_table_list during multi_update::prepare
not every index-using plan sets bits in table->quick_keys.
QUICK_ROR_INTERSECT_SELECT, for example, doesn't.
Use the fact that select->quick is set instead.
Also allow EXPLAIN to work.
- Handle stored function conditions correctly, with the same logic as with UDFs.
- When running queries on Spider SE, by default, we do not push down WHERE conditions containing usage of UDFs/stored functions to remote data nodes, unless the user demands (by setting spider_use_pushdown_udf).
- Disable direct update/delete when a udf condition is skipped.
The columns that are part of DEFAULT expression were not read-marked
in statements like UPDATE...SET b=DEFAULT.
The problem is `F(DEFAULT)` expression depends of the left-hand side of an
assignment. However, setup_fields accepts only right-hand side value.
Neither Item::fix_fields does.
Suchwise, b=DEFAULT(b) works fine, because Item_default_field has
information on what field it is default of:
if (thd->mark_used_columns != MARK_COLUMNS_NONE)
def_field->default_value->expr->update_used_tables();
in Item_default_value::fix_fields().
It is not reasonable to pass a left-hand side to Item:fix_fields, because
the case is rare, so the rewrite
b= F(DEFAULT) -> b= F(DEFAULT(b))
is made instead.
Both UPDATE and multi-UPDATE are affected, however any form of INSERT
is not: it marks all the fields in DEFAULT expressions for read in
TABLE::mark_default_fields_for_write().
Reformulate mark_columns_used_by_index* function family in a more laconic
way:
mark_columns_used_by_index -> mark_index_columns
mark_columns_used_by_index_for_read_no_reset -> mark_index_columns_for_read
mark_columns_used_by_index_no_reset -> mark_index_columns_no_reset
static mark_index_columns -> do_mark_index_columns
Both EXPLAIN and EXPLAIN EXTENDED statements produce different results set
in case it is run in normal way and in PS mode for the statements
UPDATE/DELETE with subquery.
The use case below reproduces the issue:
MariaDB [test]> CREATE TABLE t1 (c1 INT KEY) ENGINE=MyISAM;
Query OK, 0 rows affected (0,128 sec)
MariaDB [test]> CREATE TABLE t2 (c2 INT) ENGINE=MyISAM;
Query OK, 0 rows affected (0,023 sec)
MariaDB [test]> CREATE TABLE t3 (c3 INT) ENGINE=MyISAM;
Query OK, 0 rows affected (0,021 sec)
MariaDB [test]> EXPLAIN EXTENDED UPDATE t3 SET c3 =
-> ( SELECT COUNT(d1.c1) FROM ( SELECT a11.c1 FROM t1 AS a11
-> STRAIGHT_JOIN t2 AS a21 ON a21.c2 = a11.c1 JOIN t1 AS a12
-> ON a12.c1 = a11.c1 ) d1 );
+------+-------------+-------+------+---------------+------+---------+------+------+----------+--------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+------+-------------+-------+------+---------------+------+---------+------+------+----------+--------------------------------+
| 1 | PRIMARY | t3 | ALL | NULL | NULL | NULL | NULL | 0 | 100.00 | |
| 2 | SUBQUERY | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | Impossible WHERE noticed after reading const tables
+------+-------------+-------+------+---------------+------+---------+------+------+----------+--------------------------------+
2 rows in set (0,002 sec)
MariaDB [test]> PREPARE stmt FROM
-> EXPLAIN EXTENDED UPDATE t3 SET c3 =
-> ( SELECT COUNT(d1.c1) FROM ( SELECT a11.c1 FROM t1 AS a11
-> STRAIGHT_JOIN t2 AS a21 ON a21.c2 = a11.c1 JOIN t1 AS a12
-> ON a12.c1 = a11.c1 ) d1 );
Query OK, 0 rows affected (0,000 sec)
Statement prepared
MariaDB [test]> EXECUTE stmt;
+------+-------------+-------+------+---------------+------+---------+------+------+----------+--------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+------+-------------+-------+------+---------------+------+---------+------+------+----------+--------------------------------+
| 1 | PRIMARY | t3 | ALL | NULL | NULL | NULL | NULL | 0 | 100.00 | |
| 2 | SUBQUERY | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | no matching row in const table |
+------+-------------+-------+------+---------------+------+---------+------+------+----------+--------------------------------+
2 rows in set (0,000 sec)
The reason by that different result sets are produced is that on execution
of the statement 'EXECUTE stmt' the flag SELECT_DESCRIBE not set
in the data member SELECT_LEX::options for instances of SELECT_LEX that
correspond to subqueries used in the UPDTAE/DELETE statements.
Initially, these flags were set on parsing the statement
PREPARE stmt FROM "EXPLAIN EXTENDED UPDATE t3 SET ..."
but latter they were reset before starting real execution of
the parsed query during handling the statement 'EXECUTE stmt';
So, to fix the issue the functions mysql_update()/mysql_delete()
have been modified to set the flag SELECT_DESCRIBE forcibly
in the data member SELECT_LEX::options for the primary SELECT_LEX
of the UPDATE/DELETE statement.
The ROWNUM() function is for SELECT mapped to JOIN->accepted_rows, which is
incremented for each accepted rows.
For Filesort, update, insert, delete and load data, we map ROWNUM() to
internal variables incremented when the table is changed.
The connection between the row counter and Item_func_rownum is done
in sql_select.cc::fix_items_after_optimize() and
sql_insert.cc::fix_rownum_pointers()
When ROWNUM() is used anywhere in query, the optimization to ignore ORDER
BY in sub queries are disabled. This was done to get the following common
Oracle query to work:
select * from (select * from t1 order by a desc) as t where rownum() <= 2;
MDEV-3926 "Wrong result with GROUP BY ... WITH ROLLUP" contains a discussion
about this topic.
LIMIT optimization is enabled when in a top level WHERE clause comparing
ROWNUM() with a numerical constant using any of the following expressions:
- ROWNUM() < #
- ROWNUM() <= #
- ROWNUM() = 1
ROWNUM() can be also be the right argument to the comparison function.
LIMIT optimization is done in two cases:
- For the current sub query when the ROWNUM comparison is done on the top
level:
SELECT * from t1 WHERE rownum() <= 2 AND t1.a > 0
- For an inner sub query, when the upper level has only a ROWNUM comparison
in the WHERE clause:
SELECT * from (select * from t1) as t WHERE rownum() <= 2
In Oracle mode, one can also use ROWNUM without parentheses.
Other things:
- Fixed bug where the optimizer tries to optimize away sub queries
with RAND_TABLE_BIT set (non-deterministic queries). Now these
sub queries will not be converted to joins. This bug fix was also
needed to get rownum() working inside subqueries.
- In remove_const() remove setting simple_order to FALSE if ROLLUP is
USED. This code was disable a long time ago because of wrong assignment
in the following code. Instead we set simple_order to false if
RAND_TABLE_BIT was used in the SELECT list. This ensures that
we don't delete ORDER BY if the result set is not deterministic, like
in 'SELECT RAND() AS 'r' FROM t1 ORDER BY r';
- Updated parameters for Sort_param::init_for_filesort() to be able
to provide filesort with information where the number of accepted
rows should be stored
- Reordered fields in class Filesort to optimize storage layout
- Added new error messsage to tell that a function can't be used in HAVING
- Added field 'with_rownum' to THD to mark that ROWNUM() is used in the
query.
Co-author: Oleksandr Byelkin <sanja@mariadb.com>
LIMIT optimization for sub query