Now INSERT, UPDATE, ALTER statements involving incompatible data type pairs, e.g.:
UPDATE t1 SET col_inet6=col_int;
INSERT INTO t1 (col_inet6) SELECT col_int FROM t2;
ALTER TABLE t1 MODIFY col_inet6 INT;
consistently return an error at the statement preparation time:
ERROR HY000: Illegal parameter data types inet6 and int for operation 'SET'
and abort the statement before starting to iterate over rows.
This is the same error that is raised for queries like:
SELECT col_inet6 FROM t1 UNION SELECT col_int FROM t2;
SELECT COALESCE(col_inet6, col_int) FROM t1;
Before this change, the error was caught only at execution time,
when a Field_xxx::store_xxx() method was called for the very first row.
The behavior was not consistent between statements. Depending on the statement, the server could:
- abort the statement
- set a column to the data type default value (e.g. '::' for INET6)
- set a column to NULL
A typical old error was:
ERROR 22007: Incorrect inet6 value: '1' for column `test`.`t1`.`a` at row 1
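The prepare-time check described above can be pictured with a minimal C++ sketch.
It is a simplified model and not the server code: TypeId, Assignment, can_assign()
and check_assignments() are hypothetical names, and the compatibility rule
(identical types, plus a string stored into INET6) is an assumption made only
for illustration.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified set of column data types.
enum class TypeId { Int, Varchar, Inet6 };

inline const char *type_name(TypeId t) {
  switch (t) {
  case TypeId::Int: return "int";
  case TypeId::Varchar: return "varchar";
  case TypeId::Inet6: return "inet6";
  }
  return "unknown";
}

// Assumed rule for this sketch: a value may be stored into a column of the
// same type, and a string may be stored into an INET6 column.
inline bool can_assign(TypeId target, TypeId source) {
  if (target == source) return true;
  return target == TypeId::Inet6 && source == TypeId::Varchar;
}

typedef std::pair<TypeId, TypeId> Assignment;  // (target, source)

// Check every (target, source) pair once, before any row is processed.
// Returns an empty string on success, or an error message for the first
// incompatible pair.
inline std::string check_assignments(const std::vector<Assignment> &pairs) {
  for (const Assignment &p : pairs) {
    if (!can_assign(p.first, p.second))
      return std::string("Illegal parameter data types ") +
             type_name(p.first) + " and " + type_name(p.second) +
             " for operation 'SET'";
  }
  return std::string();
}
```

Because the check looks only at data types and never at row values, the outcome
is the same whether the table holds zero rows or millions.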
EXCEPTION:
Note, there is an exception: a multi-row INSERT..VALUES, e.g.:
INSERT INTO t1 (col_a,col_b) VALUES (a1,b1),(a2,b2);
checks assignment compatibility at the preparation time for the very first row only:
(col_a,col_b) vs (a1,b1)
Other rows are still checked at the execution time and return the old warnings
or errors in case of a failure. This is done because catching all rows at the
preparation time would change behavior significantly. So these rows are still handled
according to the STRICT_XXX_TABLES sql_mode flags and whether the table is transactional.
It is too late to change this behavior in 10.7.
There is no firm decision yet on whether the multi-row INSERT..VALUES
behavior will change in later versions.
This is based on commit 20ae4816bb
with some adjustments for MDEV-12353.
row_ins_sec_index_entry_low(): If a separate mini-transaction is
needed to adjust the minimum bounding rectangle (MBR) in the parent
page, we must disable redo logging if the table is a temporary table.
For temporary tables, no log is supposed to be written, because
the temporary tablespace will be reinitialized on server restart.
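As a rough illustration of the rule above, the following C++ sketch models a
mini-transaction whose redo logging is switched off for temporary tables.
All names here (LogMode, MiniTransaction, Table, update_parent_mbr()) are
hypothetical stand-ins and do not reproduce the InnoDB data structures.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for the logging modes of a mini-transaction.
enum class LogMode { All, NoRedo };

struct MiniTransaction {
  LogMode mode = LogMode::All;
  std::vector<int> redo;  // page numbers for which redo would be written

  void write_page(int page_no) {
    // A change is recorded in the redo log only when logging is enabled.
    if (mode == LogMode::All) redo.push_back(page_no);
  }
};

struct Table {
  bool is_temporary;
};

// Adjust the MBR in a parent page using a separate mini-transaction.
// Returns the number of redo records that the adjustment produced.
inline std::size_t update_parent_mbr(const Table &table, int parent_page_no) {
  MiniTransaction mtr;
  // Temporary tablespaces are reinitialized on restart: never log them.
  if (table.is_temporary) mtr.mode = LogMode::NoRedo;
  mtr.write_page(parent_page_no);
  return mtr.redo.size();
}
```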
rtr_update_mbr_field(), rtr_merge_and_update_mbr(): Changed the return
type to void and removed unreachable code. In older versions, these
used to return a different value for temporary tables.
page_id_t: Add constexpr to most member functions.
mtr_t::log_write(): Catch log writes to invalid tablespaces
so that the test case would crash without the fix to
row_ins_sec_index_entry_low().
rtr_update_mbr_field(): Plug a memory leak.
* preserve DESC index property in the parser
* store it in the frm (only for HA_KEY_ALG_BTREE)
* read it from the frm
* show it in SHOW CREATE
* skip DESC indexes in opt_range.cc and opt_sum.cc
* ORDER BY test
This includes a fix of MDEV-27432.
This essentially reverts commit 4e89ec6692
and only disables InnoDB persistent statistics for tests where it is
desirable. By design, InnoDB persistent statistics will not be updated
except by ANALYZE TABLE or by STATS_AUTO_RECALC.
The internal transactions that update persistent InnoDB statistics
in background tasks (with innodb_stats_auto_recalc=ON) may cause
nondeterministic query plans or interfere with some tests that deal
with other InnoDB internals, such as the purge of transaction history.
An import operation without a .cfg file fails when the indexes in the
metadata table do not match those in the .ibd file. Moreover, MDEV-19022 shows
that InnoDB can end up with an index tree in which a non-leaf page has only
one child page. So it is unsafe to search for the secondary index root page.
This patch does the following when importing a table without a .cfg file:
1) If the metadata contains more than one index, then InnoDB stops
the import operation and tells the user to drop all secondary
indexes before doing the import operation.
2) If the metadata contains only the clustered index, then InnoDB finds the
index id by reading page 0 and page 3 instead of traversing the
whole tablespace.
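The two cases above amount to a small decision function. The C++ sketch below
is only a model of that decision: ImportAction and plan_import() are
hypothetical names that do not exist in InnoDB.

```cpp
#include <cstddef>

// Hypothetical outcome of planning an import.
enum class ImportAction {
  UseCfgFile,               // a .cfg file is present and describes the indexes
  RejectSecondaryIndexes,   // stop and ask the user to drop secondary indexes
  ReadIndexIdFromPages      // clustered index only: read page 0 and page 3
};

// Decide how to proceed, given whether a .cfg file exists and how many
// indexes the metadata (data dictionary) lists for the table.
inline ImportAction plan_import(bool has_cfg_file, std::size_t n_indexes) {
  if (has_cfg_file) return ImportAction::UseCfgFile;
  if (n_indexes > 1) return ImportAction::RejectSecondaryIndexes;
  return ImportAction::ReadIndexIdFromPages;
}
```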
InnoDB should calculate the MBR for the first field of the
spatial index and compare it with the MBR of the clustered
index field. Due to the MDEV-25459 refactoring, InnoDB
calculates the length of the first field and fails with
a "too long column" error.
This feature adds the functionality of ignorability for indexes.
Indexes are not ignored by default.
To control index ignorability explicitly for a new index,
use IGNORE or NOT IGNORE as part of the index definition for
CREATE TABLE, CREATE INDEX, or ALTER TABLE.
Primary keys (explicit or implicit) cannot be made ignorable.
The table INFORMATION_SCHEMA.STATISTICS gets a new column named IGNORED that
shows whether an index is ignored or not.
We implement an idea that was suggested by Michael 'Monty' Widenius
in October 2017: When InnoDB is inserting into an empty table or partition,
we can write a single undo log record TRX_UNDO_EMPTY, which will cause
ROLLBACK to clear the table.
For this to work, the insert into an empty table or partition must be
covered by an exclusive table lock that will be held until the transaction
has been committed or rolled back, or the INSERT operation has been
rolled back (and the table is empty again), in lock_table_x_unlock().
Clustered index records that are covered by the TRX_UNDO_EMPTY record
will carry DB_TRX_ID=0 and DB_ROLL_PTR=1<<55, and thus they cannot
be distinguished from what MDEV-12288 leaves behind after purging the
history of row-logged operations.
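To make the value DB_ROLL_PTR=1<<55 concrete, here is a small C++ sketch. It
assumes the conventional 7-byte roll pointer layout (1 insert flag bit, 7 bits
of rollback segment id, 32 bits of page number, 16 bits of offset); the
constant names are modeled on that layout, and has_no_history() is a
hypothetical helper introduced only for this example.

```cpp
#include <cassert>
#include <cstdint>

typedef std::uint64_t roll_ptr_t;

// Assumed bit positions within the 56-bit roll pointer.
constexpr unsigned ROLL_PTR_INSERT_FLAG_POS = 55;
constexpr unsigned ROLL_PTR_RSEG_ID_POS = 48;
constexpr unsigned ROLL_PTR_PAGE_POS = 16;

// Compose a roll pointer from its four parts.
inline roll_ptr_t build_roll_ptr(bool is_insert, unsigned rseg_id,
                                 std::uint32_t page_no, std::uint16_t offset) {
  assert(rseg_id < 128);  // the rollback segment id occupies 7 bits
  return (roll_ptr_t(is_insert) << ROLL_PTR_INSERT_FLAG_POS) |
         (roll_ptr_t(rseg_id) << ROLL_PTR_RSEG_ID_POS) |
         (roll_ptr_t(page_no) << ROLL_PTR_PAGE_POS) | roll_ptr_t(offset);
}

// True when the system columns carry no reference to undo log history:
// DB_TRX_ID=0 and only the insert flag set in DB_ROLL_PTR.
inline bool has_no_history(std::uint64_t trx_id, roll_ptr_t roll_ptr) {
  return trx_id == 0 &&
         roll_ptr == (roll_ptr_t(1) << ROLL_PTR_INSERT_FLAG_POS);
}
```

With this layout, 1<<55 is simply "insert flag set, everything else zero",
which is why such records look the same as purged row-logged inserts.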
Concurrent non-locking reads must be adjusted: If the read view was
created before the INSERT into an empty table, then we must continue
to imagine that the table is empty, and not try to read any records.
If the read view was created after the INSERT was committed, then
all records must be visible normally. To implement this, we introduce
the field dict_table_t::bulk_trx_id.
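The read rule above can be sketched in a few lines of C++. This is a heavily
simplified model: the ReadView here only compares against a single limit,
whereas a real read view also tracks the set of transactions that were active
when it was created. Table and table_appears_empty() are hypothetical names.

```cpp
#include <cstdint>

typedef std::uint64_t trx_id_t;

// Simplified read view: every transaction with an id below low_limit_id
// is assumed to have committed before the view was created.
struct ReadView {
  trx_id_t low_limit_id;
  bool changes_visible(trx_id_t id) const { return id < low_limit_id; }
};

struct Table {
  trx_id_t bulk_trx_id;  // 0 if no insert into an empty table is recorded
};

// A non-locking read must treat the table as empty when the transaction
// that filled the initially empty table is not visible in the read view.
inline bool table_appears_empty(const ReadView &view, const Table &table) {
  return table.bulk_trx_id != 0 && !view.changes_visible(table.bulk_trx_id);
}
```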
This special handling only applies to the very first INSERT statement
of a transaction for the empty table or partition. If a subsequent
statement in the transaction is modifying the initially empty table again,
we must enable row-level undo logging, so that we will be able to
roll back to the start of the statement in case of an error (such as
duplicate key).
INSERT IGNORE will continue to use row-level logging and locking, because
supporting it with table-level undo logging would require the ability to
roll back only the latest row. Since the undo log that we write only allows
us to roll back the entire statement, we cannot apply this optimization to
INSERT IGNORE. We will introduce a
handler::extra() parameter HA_EXTRA_IGNORE_INSERT to indicate to storage
engines that INSERT IGNORE is being executed.
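The choice between table-level and row-level undo logging described in the
last two paragraphs can be summarized by the following C++ sketch. The
InsertContext fields and choose_undo_logging() are hypothetical and merely
restate the rules from the text; they are not the InnoDB implementation.

```cpp
// Which kind of undo logging an INSERT statement uses.
enum class UndoLogging { TableLevel, RowLevel };

// Hypothetical summary of the facts the decision depends on.
struct InsertContext {
  bool table_is_empty;        // the table or partition has no records
  bool first_insert_in_trx;   // first INSERT statement of the transaction
                              // for this table or partition
  bool insert_ignore;         // the statement is INSERT IGNORE
};

inline UndoLogging choose_undo_logging(const InsertContext &ctx) {
  // INSERT IGNORE may need to roll back only the latest row.
  if (ctx.insert_ignore) return UndoLogging::RowLevel;
  // A single TRX_UNDO_EMPTY record suffices only for the very first
  // INSERT into a table that was empty.
  if (ctx.table_is_empty && ctx.first_insert_in_trx)
    return UndoLogging::TableLevel;
  // Later statements must be able to roll back to the statement start.
  return UndoLogging::RowLevel;
}
```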
In many test cases, we add an extra record to the table, so that during
the 'interesting' part of the test, row-level locking and logging will
be used.
Replicas will continue to use row-level logging and locking until
MDEV-24622 has been addressed. Likewise, this optimization will be
disabled in Galera cluster until MDEV-24623 enables it.
dict_table_t::bulk_trx_id: The latest active or committed transaction
that initiated an insert into an empty table or partition.
Protected by exclusive table lock and a clustered index leaf page latch.
ins_node_t::bulk_insert: Whether bulk insert was initiated.
trx_t::mod_tables: Use C++11 style accessors (emplace instead of insert).
Unlike earlier, this collection will also cover temporary tables.
trx_mod_table_time_t: Add start_bulk_insert(), end_bulk_insert(),
is_bulk_insert(), was_bulk_insert().
trx_undo_report_row_operation(): Before accessing any undo log pages,
invoke trx->mod_tables.emplace() in order to determine whether undo
logging was disabled, or whether this is the first INSERT and we are
supposed to write a TRX_UNDO_EMPTY record.
row_ins_clust_index_entry_low(): If we are inserting into an empty
clustered index leaf page, set the ins_node_t::bulk_insert flag for
the subsequent trx_undo_report_row_operation() call.
lock_rec_insert_check_and_lock(), lock_prdt_insert_check_and_lock():
Remove the redundant parameter 'flags' that can be checked in the caller.
btr_cur_ins_lock_and_undo(): Simplify the logic. Correctly write
DB_TRX_ID,DB_ROLL_PTR after invoking trx_undo_report_row_operation().
trx_mark_sql_stat_end(), ha_innobase::extra(HA_EXTRA_IGNORE_INSERT),
ha_innobase::external_lock(): Invoke trx_t::end_bulk_insert() so that
the next statement will not be covered by table-level undo logging.
ReadView::changes_visible(trx_id_t) const: New accessor for the case
where the trx_id_t is not read from a potentially corrupted index page
but directly from the memory. In this case, we can skip a sanity check.
row_sel(), row_sel_try_search_shortcut(), row_search_mvcc(),
row_sel_try_search_shortcut_for_mysql(),
row_merge_read_clustered_index(): Check dict_table_t::bulk_trx_id.
row_sel_clust_sees(): Replaces lock_clust_rec_cons_read_sees().
lock_sec_rec_cons_read_sees(): Replaced with lower-level code.
btr_root_page_init(): Refactored from btr_create().
dict_index_t::clear(), dict_table_t::clear(): Empty an index or table,
for the ROLLBACK of an INSERT operation.
ROW_T_EMPTY, ROW_OP_EMPTY: Note a concurrent ROLLBACK of an INSERT
into an empty table.
This is joint work with Thirunarayanan Balathandayuthapani,
who created a working prototype.
Thanks to Matthias Leich for extensive testing.
INSERT...SELECT reading from an InnoDB table is slow due to
creating explicit record locks. Use the sequence engine instead.
Also, remove the space before rtr_page_need_second_split
to actually make the debug injection work.
In main.index_merge_myisam we remove the test that was added in
commit a2d24def8c because
it duplicates the test case that was added in
commit 5af12e4635.
- multi_range_read_info_const now uses the new records_in_range interface
- Added handler::avg_io_cost()
- Don't calculate avg_io_cost() in get_sweep_read_cost if avg_io_cost is
not 1.0. In this case we trust the avg_io_cost() from the handler.
- Changed test_quick_select to use TIME_FOR_COMPARE instead of
TIME_FOR_COMPARE_IDX to align this with the rest of the code.
- Fixed bug when using test_if_cheaper_ordering where we didn't use
keyread if index was changed
- Fixed a bug where we didn't use index only read when using order-by-index
- Added keyread_time() to HEAP.
The default keyread_time() was optimized for blocks and not suitable for
HEAP. The effect was that HEAP preferred table scans over ranges for btree
indexes.
- Fixed get_sweep_read_cost() for HEAP tables
- Ensure that range and ref have the same cost for simple ranges
Added a small cost (MULTI_RANGE_READ_SETUP_COST) to ranges to ensure
we favor ref over range for simple queries.
- Fixed matching_candidates_in_table() to use the same number of records
as the rest of the optimizer
- Added avg_io_cost() to JT_EQ_REF cost. This helps calculate the cost for
HEAP and temporary tables better. A few tests changed because of this.
- heap::read_time() and heap::keyread_time() adjusted to not add +1.
This was to ensure that handler::keyread_time() doesn't give
higher cost for heap tables than for normal tables. One effect of
this is that heap and derived tables stored in heap will prefer
key access as this is now regarded as cheap.
- Changed cost for index read in sql_select.cc to match
multi_range_read_info_const(). All index cost calculation is now
done through one function.
- 'ref' will now use quick_cost for keys if it exists. This is done
so that for '=' ranges, 'ref' is preferred over 'range'.
- scan_time() now takes avg_io_cost() into account
- get_delayed_table_estimates() uses block_size and avg_io_cost()
- Removed default argument to test_if_order_by_key(); simplifies code
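
The tie-breaking effect of MULTI_RANGE_READ_SETUP_COST mentioned in the list
above can be illustrated with a short C++ sketch. The numeric value used here
is an assumption (the actual constant is not stated in this message), and
AccessCosts, simple_equality_costs() and prefers_ref() are hypothetical names.

```cpp
// Assumed value, for illustration only.
constexpr double MULTI_RANGE_READ_SETUP_COST = 0.2;

struct AccessCosts {
  double ref;    // cost of 'ref' access
  double range;  // cost of 'range' access
};

// For a simple '=' range, both access methods share the same index read
// cost; only 'range' pays the additional setup cost.
inline AccessCosts simple_equality_costs(double index_read_cost) {
  return {index_read_cost, index_read_cost + MULTI_RANGE_READ_SETUP_COST};
}

// The cheaper access method wins; the setup cost breaks the tie for 'ref'.
inline bool prefers_ref(const AccessCosts &costs) {
  return costs.ref < costs.range;
}
```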