1
0
mirror of https://github.com/MariaDB/server.git synced 2025-09-11 05:52:26 +03:00
Commit Graph

699 Commits

Author SHA1 Message Date
Aleksey Midenkov
0c99e6e9a6 MDEV-22562 Assertion `next_insert_id == 0' upon UPDATE on system-versioned table
Don't update autoinc counter on history row insert. Uniqueness is kept
due to merge with row_end.
2021-03-31 21:25:36 +03:00
Aleksey Midenkov
af52a0e516 MDEV-24690 Dropping primary key column from versioned table always fails with 1072
Exclude system-invisible key-parts from MDEV-11114 (04b288ae)
restriction.
2021-03-31 21:25:33 +03:00
Marko Mäkelä
94b4578704 Merge 10.5 into 10.6 2021-02-17 19:39:05 +02:00
Sergei Golubchik
25d9d2e37f Merge branch 'bb-10.4-release' into bb-10.5-release 2021-02-15 16:43:15 +01:00
Aleksey Midenkov
1398160a71 MDEV-24522 Assertion `inited==NONE' fails upon UPDATE on versioned table with unique blob
Cause: no table->update_handler cloned at the moment of
vers_insert_history_row(). update_handler is needed because there
can't be several inited indexes at once in the same handler. First
index is inited by QUICK_RANGE_SELECT::reset(). Then when history row
is inserted check_duplicate_long_entry_key() is done and it requires
another index.
2021-01-26 14:41:23 +03:00
Marko Mäkelä
3cef4f8f0f MDEV-515 Reduce InnoDB undo logging for insert into empty table
We implement an idea that was suggested by Michael 'Monty' Widenius
in October 2017: When InnoDB is inserting into an empty table or partition,
we can write a single undo log record TRX_UNDO_EMPTY, which will cause
ROLLBACK to clear the table.

For this to work, the insert into an empty table or partition must be
covered by an exclusive table lock that will be held until the transaction
has been committed or rolled back, or the INSERT operation has been
rolled back (and the table is empty again), in lock_table_x_unlock().

Clustered index records that are covered by the TRX_UNDO_EMPTY record
will carry DB_TRX_ID=0 and DB_ROLL_PTR=1<<55, and thus they cannot
be distinguished from what MDEV-12288 leaves behind after purging the
history of row-logged operations.

Concurrent non-locking reads must be adjusted: If the read view was
created before the INSERT into an empty table, then we must continue
to imagine that the table is empty, and not try to read any records.
If the read view was created after the INSERT was committed, then
all records must be visible normally. To implement this, we introduce
the field dict_table_t::bulk_trx_id.

This special handling only applies to the very first INSERT statement
of a transaction for the empty table or partition. If a subsequent
statement in the transaction is modifying the initially empty table again,
we must enable row-level undo logging, so that we will be able to
roll back to the start of the statement in case of an error (such as
duplicate key).

INSERT IGNORE will continue to use row-level logging and locking, because
implementing it would require the ability to roll back the latest row.
Since the undo log that we write only allows us to roll back the entire
statement, we cannot support INSERT IGNORE. We will introduce a
handler::extra() parameter HA_EXTRA_IGNORE_INSERT to indicate to storage
engines that INSERT IGNORE is being executed.

In many test cases, we add an extra record to the table, so that during
the 'interesting' part of the test, row-level locking and logging will
be used.

Replicas will continue to use row-level logging and locking until
MDEV-24622 has been addressed. Likewise, this optimization will be
disabled in Galera cluster until MDEV-24623 enables it.

dict_table_t::bulk_trx_id: The latest active or committed transaction
that initiated an insert into an empty table or partition.
Protected by exclusive table lock and a clustered index leaf page latch.

ins_node_t::bulk_insert: Whether bulk insert was initiated.

trx_t::mod_tables: Use C++11 style accessors (emplace instead of insert).
Unlike earlier, this collection will cover also temporary tables.

trx_mod_table_time_t: Add start_bulk_insert(), end_bulk_insert(),
is_bulk_insert(), was_bulk_insert().

trx_undo_report_row_operation(): Before accessing any undo log pages,
invoke trx->mod_tables.emplace() in order to determine whether undo
logging was disabled, or whether this is the first INSERT and we are
supposed to write a TRX_UNDO_EMPTY record.

row_ins_clust_index_entry_low(): If we are inserting into an empty
clustered index leaf page, set the ins_node_t::bulk_insert flag for
the subsequent trx_undo_report_row_operation() call.

lock_rec_insert_check_and_lock(), lock_prdt_insert_check_and_lock():
Remove the redundant parameter 'flags' that can be checked in the caller.

btr_cur_ins_lock_and_undo(): Simplify the logic. Correctly write
DB_TRX_ID,DB_ROLL_PTR after invoking trx_undo_report_row_operation().

trx_mark_sql_stat_end(), ha_innobase::extra(HA_EXTRA_IGNORE_INSERT),
ha_innobase::external_lock(): Invoke trx_t::end_bulk_insert() so that
the next statement will not be covered by table-level undo logging.

ReadView::changes_visible(trx_id_t) const: New accessor for the case
where the trx_id_t is not read from a potentially corrupted index page
but directly from the memory. In this case, we can skip a sanity check.

row_sel(), row_sel_try_search_shortcut(), row_search_mvcc():
row_sel_try_search_shortcut_for_mysql(),
row_merge_read_clustered_index(): Check dict_table_t::bulk_trx_id.

row_sel_clust_sees(): Replaces lock_clust_rec_cons_read_sees().

lock_sec_rec_cons_read_sees(): Replaced with lower-level code.

btr_root_page_init(): Refactored from btr_create().

dict_index_t::clear(), dict_table_t::clear(): Empty an index or table,
for the ROLLBACK of an INSERT operation.

ROW_T_EMPTY, ROW_OP_EMPTY: Note a concurrent ROLLBACK of an INSERT
into an empty table.

This is joint work with Thirunarayanan Balathandayuthapani,
who created a working prototype.
Thanks to Matthias Leich for extensive testing.
2021-01-25 18:41:27 +02:00
Marko Mäkelä
8de233af81 Merge 10.4 into 10.5 2021-01-11 16:29:51 +02:00
Marko Mäkelä
fd5e103aa4 Merge 10.3 into 10.4 2021-01-11 10:35:06 +02:00
Nikita Malyavin
d846b55d9b MDEV-17891 Assertion failure upon attempt to replace into a full table
Problem: Assertion `transactional_table || !changed ||
thd->transaction.stmt.modified_non_trans_table' failed due REPLACE into a
versioned table.

It is not specific to system versioning/pertitioning/heap, but this
combination makes it much easier to reproduce.

The thing is to make first ha_update_row call succeed to make
info->deleted != 0. And then make REPLACE fail by any reason.

In this scenario we overflow versioned partition, so next ha_update_row
succeeds, but corresponding ha_write_row fails to insert history record.

Fix: modified_non_trans_table is set in one missed place
2021-01-07 14:53:41 +10:00
Oleksandr Byelkin
02e7bff882 Merge commit '10.4' into 10.5 2021-01-06 10:53:00 +01:00
Marko Mäkelä
0aa02567dd Merge 10.3 into 10.4 2020-12-23 14:52:59 +02:00
Aleksey Midenkov
9b38ed4c85 MDEV-23446 UPDATE does not insert history row if the row is not changed
Add history row outside of compare_record() check. For TRX_ID
versioning we have to fail can_compare_record to force InnoDB update
which adds history row; and there in ha_innobase::update_row() is
additional "row changed" check where we force history row anyway.
2020-12-22 08:34:18 +03:00
Aleksey Midenkov
932ec586aa MDEV-23644 Assertion on evaluating foreign referential action for self-reference in system versioned table
First part of the fix (row0mysql.cc) addresses external columns when adding history
row on referential action. The full data must be retrieved before the
row is inserted.

Second part of the fix (the rest) avoids duplicate primary key error between
the history row generated on referential action and the history row
generated by SQL command. Both command and referential action can
happen on same table since foreign key can be self-reference (parent
and child tables are same). Moreover, the self-reference can refer
multiple rows when the key is non-unique. In such case history is
generated by referential action occured on first row but processed all
rows by a matched key. The second round is when the next row is
processed by a command but history already exists. In such case we
check TRX_ID of existing history row and if it is the same we assume
the above situation and skip adding one more history row or failing
the command.
2020-12-22 03:33:53 +03:00
Aleksey Midenkov
7410ff436e MDEV-21138 Assertion col->ord_part' or f.col->ord_part' failed in row_build_index_entry_low
First part (row0mysql.cc) fixes ins_node_set_new_row() usage workflow
as it is designed to operate on empty row (see row_get_prebuilt_insert_row()
for example).

Second part (row0ins.cc) fixes duplicate key error in FTS_DOC_ID_INDEX
since history rows must not generate entries in that index. We detect
FTS_DOC_ID_INDEX by a number of attributes and skip it if the row is
historical.

Misc fixes:

row_build_index_entry_low() does not accept non-NULL tuple
for FTS index (subject assertion fails), assertion (index->type !=
DICT_FTS) adds code understanding.

Now as historical_row is copied in row_update_vers_insert() there is
no need to copy the row twice: ROW_COPY_POINTERS is used to build
historical_row initially.

dbug_print_rec() debug functions.
2020-12-22 03:33:53 +03:00
Aleksey Midenkov
d4258f3a8f MDEV-22178 Assertion `info->alias.str' failed in partition_info::check_partition_info instead of ER_VERS_WRONG_PARTS
Assign create_info->alias for ALTER TABLE since it is NULL and later
accessed for printing error message.
2020-12-22 03:33:53 +03:00
Marko Mäkelä
1657b7a583 Merge 10.4 to 10.5 2020-10-22 17:08:49 +03:00
Marko Mäkelä
46957a6a77 Merge 10.3 into 10.4 2020-10-22 13:27:18 +03:00
Aleksey Midenkov
9b46d8e5c4 MDEV-23968 CREATE TEMPORARY TABLE .. LIKE (system versioned table) returns error if unique index is defined in the table
- Remove row_start/row_end from keys in fix_create_like();

- Disable manual adding of implicit row_start/row_end to indexes on
  CREATE TABLE. INVISIBLE_SYSTEM fields are unoperable by user;

- Fix memory leak on allocation of Key_part_spec.
2020-10-20 10:49:54 +03:00
Aleksey Midenkov
ddea8f6a39 MDEV-23779 Error upon querying the view, that selecting from versioned table with partitions
PARTITION clause in SELECT means query is non-versioned (see
WITH_PARTITION_STORAGE_ENGINE in vers_setup_conds()).

vers_setup_conds() expands such query to SYSTEM_TIME_ALL which is then
added to VIEW specification. When VIEW is queried both clauses
PARTITION and FOR SYSTEM_TIME ALL lead to ER_VERS_QUERY_IN_PARTITION
(same place WITH_PARTITION_STORAGE_ENGINE).

Fix removes FOR SYSTEM_TIME ALL from VIEW by accessing original
SYSTEM_TIME clause: the one specified in parser. As a side-effect
EXPLAIN SELECT displays SYSTEM_TIME specified in SELECT which is
user-friendly.
2020-10-20 10:49:54 +03:00
Aleksey Midenkov
a3c379ea61 MDEV-23799 CREATE .. SELECT wrong result on join versioned table
For join to work correctly versioning condition must be added to table
on_expr. Without that JOIN_CACHE gets expression (1)

  trigcond(xtitle.row_end = TIMESTAMP'2038-01-19 06:14:07.999999') and
  trigcond(xtitle.elementId = x.`id` and xtitle.pkey = 'title')

instead of (2)

  trigcond(xtitle.elementId = x.`id` and xtitle.pkey = 'title')

for join_null_complements(). It is NULL-row of xtitle for
complementing the join and the above comparisons of course FALSE, but
trigcond (Item_func_trig_cond) makes them TRUE via its trig_var
property which is bound to some boolean properties of JOIN_TAB.

Expression (2) evaluated to TRUE because its trig_var is bound to
first_inner_tab->not_null_compl. The expression (1) does not evaluate
correctly because row_end comparison's trig_var is bound to
first_inner->found earlier. As a result JOIN_CACHE::check_match()
skipped the row for join_null_complements().

When we add versioning condition to table's on_expr the optimizer in
make_join_select() distributes conditions differently. tmp_cond
inherits on_expr value and in Good case it is full expression

xgender.elementId = x.`id` and xgender.pkey = 'gender' and
xgender.row_end = TIMESTAMP'2038-01-19 06:14:07.999999'

while in Bad case it is only

xgender.elementId = x.`id` and xgender.pkey = 'gender'.

Later in Good row_end condition is optimized out and we get one
trigcond in form of (2).
2020-10-20 10:49:54 +03:00
Marko Mäkelä
97a4a3872e Merge 10.4 into 10.5 2020-08-26 12:02:07 +03:00
Alexander Barkov
329301d2e6 MDEV-23562 Assertion `time_type == MYSQL_TIMESTAMP_DATETIME' failed upon SELECT from versioned table
The code in vers_select_conds_t::init_from_sysvar() assumed that
the value of the system_versioning_asof is DATETIME.
But it also could be DATE after a query like this:
  SET system_versioning_asof = DATE(NOW());

Fixing Sys_var_vers_asof::update() to convert the value
of the assignment source to DATETIME if it returned DATE.

Now vers_select_conds_t::init_from_sysvar() always gets a DATETIME value.
2020-08-24 22:55:39 +04:00
Marko Mäkelä
50a11f396a Merge 10.4 into 10.5 2020-08-01 14:42:51 +03:00
Marko Mäkelä
9216114ce7 Merge 10.3 into 10.4 2020-07-31 18:09:08 +03:00
Nikita Malyavin
34f2be3b29 MDEV-16023 Unfortunate error message WARN_VERS_PART_FULL (partition <name> is full) when rotation time for the last interval passed
* remove one case of WARN_VERS_PART_FULL
2020-07-29 14:32:07 +10:00
Marko Mäkelä
4ec032b492 Merge 10.4 into 10.5 2020-07-21 17:33:16 +03:00
Marko Mäkelä
b1538f4d60 Merge 10.3 into 10.4 2020-07-21 16:36:47 +03:00
Aleksey Midenkov
af83ed9f0e MDEV-20661 Virtual fields are not recalculated on system fields value assignment
Fix stale virtual field value in 4 cases: when virtual field depends
on row_start/row_end in timestamp/trx_id versioned table. row_start
dep is recalculated in vers_update_fields() (SQL and InnoDB
layer). row_end dep is recalculated on history row insert.
2020-07-20 18:28:08 +03:00
Aleksey Midenkov
af57c65809 MDEV-22061 InnoDB: Assertion of missing row in sec index row_start upon REPLACE on a system-versioned table
make_versioned_helper() appended new update field unconditionally
while it should check if this field already exists in update vector.

Misc renames to conform versioning prefix. vers_update_fields() name
conforms with sql layer TABLE::vers_update_fields().
2020-07-20 18:28:07 +03:00
Marko Mäkelä
1813d92d0c Merge 10.4 into 10.5 2020-07-02 09:41:44 +03:00
Marko Mäkelä
f347b3e0e6 Merge 10.3 into 10.4 2020-07-02 07:39:33 +03:00
Aleksey Midenkov
b633b6a9d8 MDEV-22906 Disallow system_versioning_asof in DML
system_versioning_asof does not influence on multi-delete,
multi-update, insert-select, replace-select.
2020-06-16 10:43:53 +03:00
Marko Mäkelä
6877ef9a7c Merge 10.4 into 10.5 2020-06-05 20:36:43 +03:00
Marko Mäkelä
68d9d512e9 Merge 10.3 into 10.4 2020-06-05 18:05:22 +03:00
Aleksey Midenkov
05693cf214 MDEV-22112 Assertion `tab_part_info->part_type == RANGE_PARTITION || tab_part_info->part_type == LIST_PARTITION' failed in prep_alter_part_table
Incorrect syntax for SYSTEM_TIME partition. work_part_info is detected
as HASH partition. We cannot add partition of different type neither
we cannot reorganize SYSTEM_TIME into/from different type
partitioning.

The sidefix for version until 10.5 corrects the message:
"For LIST partitions each partition must be defined"
2020-06-04 12:12:49 +03:00
Marko Mäkelä
4a0b56f604 Merge 10.4 into 10.5 2020-05-31 10:28:59 +03:00
Marko Mäkelä
6da14d7b4a Merge 10.3 into 10.4 2020-05-30 11:04:27 +03:00
Marko Mäkelä
2e1d10ecac Add end-of-test markers to ease merges 2020-05-30 10:48:27 +03:00
Aleksey Midenkov
4783494a5e MDEV-22283 Server crashes in key_copy or unexpected error 156
(The table already existed in the storage engine)

Wrong algorithm of closing partitions on error doesn't close last
partition.
2020-05-29 16:19:15 +03:00
Aleksey Midenkov
57f7b4866f MDEV-16937 Strict SQL with system versioned tables causes issues (10.4)
Respect system fields in NO_ZERO_DATE mode.

This is the subject for refactoring in MDEV-19597

Conflict resolution from 7d5223310789f967106d86ce193ef31b315ecff0
2020-05-29 11:45:19 +03:00
Aleksey Midenkov
19da9a51ae MDEV-16937 Strict SQL with system versioned tables causes issues
Respect system fields in NO_ZERO_DATE mode.

This is the subject for refactoring in MDEV-19597
2020-05-28 22:22:20 +03:00
Aleksey Midenkov
dd9773b723 MDEV-22413 Server hangs upon UPDATE on a view reading from versioned partitioned table
UPDATE gets access to history records because versioning conditions
are not set for VIEW. This leads to endless loop of inserting history
records when clustered index is rebuilt and ha_rnd_next() returns
newly inserted history record.

Return back original behavior of failing on write-locked table in
historical query.

35b679b9 assumed that SELECT_LEX::lock_type influences anything, but
actually at this point table is already locked. Original bug report
was tempesta-tech/mariadb#102
2020-05-28 22:22:19 +03:00
Aleksey Midenkov
dac1280a65 MDEV-18794 Assertion `!m_innodb' failed in ha_partition::cmp_ref upon SELECT from partitioned table
System versioning assertion fix. Since DROP SYSTEM VERSIONING does not
change list of dropped keys we should handle a special case.

Caused by MDEV-19751. This fix deprecates MDEV-17091.
2020-05-28 20:55:59 +03:00
Aleksey Midenkov
73aa78ea9d MDEV-22155 ALTER add default history partitions name clash on non-default partitions
If any of default names clashes with existing names find next large
enough name gap for the requested number of partitions.
2020-04-27 16:36:03 +03:00
Aleksey Midenkov
e174fa9d79 MDEV-22207 Drop default history partitions renders table inaccessible
This is continuation of MDEV-22153 bug when contiguity of history
partitions is broken. ha_partition::open_read_partitions() can not
handle non-contiguous list of default partitions.

Fix: when default partition is dropped convert list of partitions to
non-default.
2020-04-27 16:36:03 +03:00
Marko Mäkelä
fbe2712705 Merge 10.4 into 10.5
The functional changes of commit 5836191c8f
(MDEV-21168) are omitted due to MDEV-742 having addressed the issue.
2020-04-25 21:57:52 +03:00
Marko Mäkelä
af91266498 Merge 10.3 into 10.4
In main.index_merge_myisam we remove the test that was added in
commit a2d24def8c because
it duplicates the test case that was added in
commit 5af12e4635.
2020-04-16 12:12:26 +03:00
Aleksey Midenkov
139117528a MDEV-22153 ALTER add default history partitions makes table inaccessible
ADD default history partitions generates wrong partition name,
f.ex. p2 instead of p1. Gap in sequence of partition names leads to
ha_partition::open_read_partitions() fail on inexistent name.

Manual fixing such broken table requires:

1. create empty table by any name (t_empty) with correct number
of partitions;
2. stop the server;
3. rename data files (.myd, .myi or .ibd) of broken table to t_empty
fixing the partition sequence (#p2 to #p1, #p3 to #p2);
4. start the server;
5. drop the broken table;
6. rename t_empty to correct table name.
2020-04-06 06:26:46 +03:00
Aleksey Midenkov
105b879d0f MDEV-21941 RENAME doesn't work for system time or period fields
- Ignore system-invisible fields (as well as for setting default value);
- Handle rename of system time and period fields.
2020-04-04 00:53:37 +03:00
Aleksey Midenkov
0932c5804d MDEV-20515 multi-update tries to position updated table by null reference
Cause

Join tmp table inserts null row because of OUTER JOIN, that's
expected. Since `multi_update::prepare2()` converted
`Item_temptable_rowid` into `Item_field` (28dbdf3)
`multi_update::send_data()` accesses join tmp record directly and
treats it as a normal row ignoring null status of ref field. NULL ref
field is then treated as normal in `multi_update::do_updates()` which
tries to position updated table by reference 0.

Note that reference 0 may be valid reference and the first row of
table can be wrongly updated (see multi_update.test).

Fix

Do not add row into multi-update tmp table in case of null ref
field. Join tmp table does not have null_row status at this time (as
well as `STATUS_NULL_ROW`) and cannot be skipped by these properties
(see first comment in multi_update::send_data()). But it has all null
fields (including the ref field).
2020-04-02 20:48:38 +03:00