This regression was introduced in MDEV-16515.
We would fail to drop a temporary table on client disconnect,
because trx_is_interrupted() would hold. To add insult to
injury, in MariaDB 10.1, InnoDB temporary tables are actually
persistent, so the garbage temporary tables will never be dropped.
row_drop_table_for_mysql(): If several iterations of
buf_LRU_drop_page_hash_for_tablespace() are needed,
do not interrupt dropping a temporary table even after
the transaction was marked as killed.
Server shutdown will still terminate the loop, and also DROP TABLE
of persistent tables will keep checking if the execution was aborted.
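For illustration, a minimal scenario of the affected kind (the table name is hypothetical):

CREATE TEMPORARY TABLE tmp1 (a INT) ENGINE=InnoDB;
-- If the client is killed or disconnects here, the implicit drop of
-- tmp1 could previously be cut short by trx_is_interrupted(),
-- leaving the garbage table behind.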
Field_iterator_table_ref::set_field_iterator
Several functions that processed different prepared statements missed
the DT_INIT flag in the last parameter of their open_normal_and_derived_tables()
calls. This made context analysis of derived tables dependent on the order in
which the derived tables were processed by mysql_handle_derived(). This
order was induced by the order of SELECTs in all_select_list.
In 10.4 the order of SELECTs in all_select_list became different, and the lack
of the DT_INIT flag in some open_normal_and_derived_tables() calls became
critical, as some derived tables were not identified as such.
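A hypothetical statement of the affected kind; the derived table dt must be initialized (DT_INIT) during context analysis of the prepared statement:

CREATE TABLE t1 (a INT);
PREPARE stmt FROM 'SELECT * FROM (SELECT a FROM t1) AS dt';
EXECUTE stmt;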
The problem was that a parallel open of a table overwrote info->state,
which was in use by repair.
Fixed by changing _ma_tmp_disable_logging_for_table() to use
a new state buffer, state.no_logging, to store the temporary state.
Other things:
- Use original number of rows when retrying repair to get rid of a
potential warning "Number of rows changed from X to Y"
- Changed maria_commit() to make it easier to merge with 10.4
- If the table is not locked (as with SHOW commands), use the global
number of rows, as the local number may not be up to date.
(Minor, non-critical fix)
- Added some missing DBUG_RETURN calls
The bug appears because of the Item_func_in::build_clone() method.
The 'array' field of an Item_func_in item that can be pushed into
a materialized view/derived table was built in the wrong way.
It becomes invalid after the pushdown of the condition into the first
SELECT that defines that view/derived table. The server crashes in
the pushdown into the next SELECT while trying to use the already
invalid 'array' field.
To fix this, Item_func_in::build_clone() was changed.
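A sketch of a query shape that could hit this (all names are hypothetical); the IN condition is pushed into each SELECT that defines the view:

CREATE TABLE t1 (a INT);
CREATE TABLE t2 (a INT);
CREATE VIEW v1 AS SELECT a FROM t1 UNION SELECT a FROM t2;
-- The pushdown into the first SELECT left the cloned IN item with a
-- broken 'array'; the pushdown into the second SELECT then crashed.
SELECT * FROM v1 WHERE a IN (1, 2, 3);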
Currently, the RocksDB engine doesn't update AUTO_INCREMENT in UPDATE statements.
For example,
CREATE TABLE t1 (pk INT AUTO_INCREMENT, a INT, PRIMARY KEY(pk)) ENGINE=RocksDB;
INSERT INTO t1 (a) VALUES (1);
UPDATE t1 SET pk = 3; ==> AUTO_INCREMENT should be updated to 4.
Without this fix, it hits the assertion `dd_val >= last_val' in
myrocks::ha_rocksdb::load_auto_incr_value_from_index.
(cherry picked from commit f7154242b8)
An INSERT into a temporary table would fail to set the
index page as modified. If there were no other write operations
(such as UPDATE or DELETE) to the page, and the page was evicted,
we would read back the old contents of the page, causing
corruption or loss of data.
page_cur_insert_rec_write_log(): Call mtr_t::set_modified()
for temporary tables. Normally this is part of the mlog_open()
call, but the mlog_open() call was only present in debug builds.
This regression was caused by
commit 48192f963a
which was preparation for MDEV-11369 and was supposed to affect
debug builds only.
Thanks to Thirunarayanan Balathandayuthapani for debugging.
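For illustration, a minimal trigger of the affected kind (the table name is hypothetical):

CREATE TEMPORARY TABLE tmp1 (a INT) ENGINE=InnoDB;
INSERT INTO tmp1 VALUES (1);
-- If the INSERT was the only change to the index page and the page
-- was evicted without being marked modified, a later read could
-- return the page's old contents.
SELECT * FROM tmp1;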
When a table is renamed to an internal #sql2 or #sql-ib name during
a table-rebuilding DDL operation such as OPTIMIZE TABLE or ALTER TABLE,
and shortly after that a purge operation in an index on virtual columns
is attempted, the operation could fail, and purge would then not
release the table reference.
innodb_acquire_mdl(): Release the reference if the table name is not
valid for acquiring a meta-data lock (MDL).
innodb_find_table_for_vc(): Add a debug assertion if the table name
is not valid. This code path is for DML execution. The table
should have a valid name for executing DML, and furthermore an MDL
will prevent the table from being renamed.
row_vers_build_clust_v_col(): Add a debug assertion that both indexes
must belong to the same table.
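A sketch of the affected scenario (hypothetical table): the table has an index on a virtual column, and purge may run while the table carries an internal #sql name during OPTIMIZE/ALTER:

CREATE TABLE t1 (a INT, b INT AS (a + 1) VIRTUAL, KEY(b)) ENGINE=InnoDB;
DELETE FROM t1;
-- While OPTIMIZE TABLE rebuilds t1 under an internal #sql name, a
-- background purge of the index on b could fail and previously
-- leaked the table reference.
OPTIMIZE TABLE t1;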
After iterating all fields and setting PART_INDIRECT_KEY_FLAG as
necessary, TABLE::mark_columns_used_by_virtual_fields() remembers
in TABLE_SHARE that this operation was done and need not be repeated.
But as the flag is set in TABLE_SHARE, PART_INDIRECT_KEY_FLAG must
be set in TABLE_SHARE::field[], not only in TABLE::field[].
Otherwise all new TABLEs opened from this TABLE_SHARE will
never have it.
My conflict resolution for the script did not work out after all,
and apparently I was testing a wrong version. Revert MDEV-15511
from MariaDB 10.2 for now.
trx_purge_add_update_undo_to_history(): Relax the overly strict assertion
by removing the condition on srv_fast_shutdown (innodb_fast_shutdown).
Rollback is allowed during any form of shutdown.
buf_dump(): Only generate the output when shutdown is in progress.
log_write_up_to(): Only generate the output before actually writing
to the redo log files.
srv_purge_should_exit(): Rate-limit the output, and instead of
displaying the work done, indicate the work that remains to be done
until the completion of the slow shutdown.
The bug appears because of how the pushdown of conditions into the WHERE
clause of a materialized derived table/view works. The
excl_dep_on_grouping_fields() method, which checks whether a condition
can be pushed into the WHERE clause, is missing the case when the
condition is an Item_cond. For Item_cond items this method always returns
a positive result (that the condition can be pushed), so such a condition
is pushed even when it shouldn't be.
To fix this, a new Item_cond::excl_dep_on_grouping_fields() method is added.
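A sketch of the problematic case (hypothetical names): mx is not a grouping field of the derived table, so the OR condition as a whole must not be pushed into its WHERE clause:

CREATE TABLE t1 (a INT, b INT);
SELECT * FROM (SELECT a, MAX(b) AS mx FROM t1 GROUP BY a) AS dt
WHERE dt.a > 1 OR dt.mx < 5;
-- Before the fix, the OR (an Item_cond) always reported that it
-- depends only on grouping fields and was wrongly pushed.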
using INSERT INTO
This patch allows condition pushdown into a materialized derived table / view
when this table is used in INSERT SELECT, multi-table UPDATE and multi-table
DELETE.
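For example (hypothetical tables), the condition dt.a > 10 can now be pushed into the derived table dt even though it is used in an INSERT ... SELECT:

CREATE TABLE t1 (a INT, b INT);
CREATE TABLE t2 (a INT, s INT);
INSERT INTO t2
SELECT dt.a, dt.s
FROM (SELECT a, SUM(b) AS s FROM t1 GROUP BY a) AS dt
WHERE dt.a > 10;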
This patch introduces support for the system variable eq_range_index_dive_limit
that has existed in MySQL since 5.6. The variable sets a limit on
index dives into equality ranges. Index dives are performed by the optimizer
to estimate the number of rows in range scans. Index dives usually provide
a good estimate, but they are pretty expensive. To estimate the number of rows
in equality ranges, statistical data on indexes can be employed instead. This
gives less accurate estimates, but it is cheap. So if the number of equality
dives required by an index scan exceeds the set limit, no dives for equality
ranges are performed by the optimizer for this index.
As the new system variable is introduced in a stable version, its default
value is a special value meaning that there is no limit on the number
of index dives performed by the optimizer.
The patch partially uses the MySQL code for WL 5957
'Statistics-based Range optimization for many ranges'.
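A usage sketch (table and column names are hypothetical); with the limit set below the number of equality ranges, the optimizer uses index statistics instead of dives for this IN list:

SET SESSION eq_range_index_dive_limit = 4;
-- Five equality ranges exceed the limit of 4, so no index dives are
-- performed for them; statistics on the index are used instead.
SELECT * FROM t1 WHERE key_col IN (1, 2, 3, 4, 5);
-- The default value 0 is the special "no limit" setting mentioned above.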
The MySQL 5.7 TRUNCATE TABLE is inherently incompatible
with hot backup, because it creates and deletes a separate
log file and does not write redo log for all changes of the
InnoDB data dictionary tables. Refuse to create a corrupted backup
if the unsafe form of TRUNCATE was executed.
Note: Undo log tablespace truncation cannot be detected easily.
Also it is incompatible with backup, for similar reasons.
xtrabackup_backup_func(): "Subscribe to" the log events before
the first invocation of xtrabackup_copy_logfile().
recv_parse_or_apply_log_rec_body(): If the function pointer
log_truncate is set, invoke it to report MLOG_TRUNCATE.
Remove the logic for skipping the test if a log checkpoint occurred,
and the logic for tolerating failures. Thanks to MDEV-16791,
MLOG_INDEX_LOAD is supposed to always work.
Amend commit b853b4fd88
that was reverted in commit 29150e2391.
recv_parse_log_recs(): Do check for corrupted redo log or file
system before checking for len==0, but only read *ptr if
it is not past the end of the buffer (end_ptr).
recv_parse_log_rec(): Report incorrect redo log type
in a consistent way with recv_parse_or_apply_log_rec_body().
This is a follow-up to commit f30c5af42e.
The Pool poisoning that was introduced in MDEV-15030 introduced
race conditions in AddressSanitizer builds, because concurrent
poisoning and unpoisoning were not prevented by any synchronization
primitive.
Pool::get(): Protect the unpoisoning by m_lock_strategy.
Pool::mem_free(): Protect the poisoning by m_lock_strategy.
Pool::putl(): Renamed from put(), because now the caller is
responsible for invoking m_lock_strategy.
If trx_free() and trx_create_low() were called while a call to
trx_reference() was pending, we could get a reference to a wrong
transaction object.
trx_reference(): Return NULL if the trx->id no longer matches.
lock_trx_release_locks(): Assign trx->id = 0, so that trx_reference()
will not return a reference to this object.
trx_cleanup_at_db_startup(): Assign trx->id = 0.
assert_trx_is_free(): Assert !trx->n_ref. Assert trx->id == 0,
now that it will be cleared as part of a transaction commit.
Allocate trx->lock.rec_pool and trx->lock.table_pool directly from trx_t.
Remove unnecessary use of std::vector.
In order to do this, move some definitions from lock0priv.h to
lock0types.h, so that ib_lock_t will not be an opaque type.
Item_subselect::is_expensive() used to return FALSE (inexpensive) whenever
it saw that one of the SELECTs in the subquery's UNION was degenerate. It
ignored the fact that other parts of the UNION might not be inexpensive,
including the case where other parts of the UNION have no query plan yet.
For a subquery of the form col >= ANY (SELECT 'foo' UNION SELECT 'bar'),
this would cause the query to be considered inexpensive when there was
no query plan for the second part of the UNION, which in turn would
cause the SELECT 'foo' to compute and free itself while still inside
JOIN::optimize for that SELECT (see the MDEV comment for the full
description).
If we have a cluster of two or more nodes replicating from an async master,
the binlog_format is set to STATEMENT, and multi-row inserts are executed
on a table with an auto_increment column such that values are automatically
generated by MySQL, then the server node generates wrong auto_increment
values, which are different from what was generated on the async master.
The causes and fixes:
1. We need to improve the processing of auto-increment value changes
after a change in the cluster size.
2. If wsrep_auto_increment_control is switched on during operation of
the node, then we should immediately update the auto_increment_increment
and auto_increment_offset global variables, without waiting for the next
invocation of the wsrep_view_handler_cb() callback. In the current version
these variables retain their initial values if wsrep_auto_increment_control
is switched on during operation of the node, which leads to inconsistent
results on the different nodes in some scenarios (see the sketch after
this list).
3. If wsrep_auto_increment_control is switched off during operation of the
node, then we must restore the original values of the auto_increment_increment
and auto_increment_offset global variables, as set by the user. To make this
possible, we need to add "shadow copies" of these variables (which store
the latest values set by the user).
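A sketch of items 2 and 3 (the values are illustrative):

SET GLOBAL auto_increment_increment = 5;
-- 5 is the user-set value, kept in the shadow copy.
SET GLOBAL wsrep_auto_increment_control = ON;
-- auto_increment_increment and auto_increment_offset now follow the
-- cluster size immediately, without waiting for the next
-- wsrep_view_handler_cb() invocation.
SET GLOBAL wsrep_auto_increment_control = OFF;
-- The user-set value (5) is restored from the shadow copy.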