1
0
mirror of https://github.com/MariaDB/server.git synced 2025-11-27 05:41:41 +03:00
Commit Graph

117 Commits

Author SHA1 Message Date
Marko Mäkelä
984d7100cd Merge 10.4 into 10.5 2019-06-13 18:36:09 +03:00
Marko Mäkelä
2fd82471ab Merge 10.3 into 10.4 2019-06-12 08:37:27 +03:00
Sergei Petrunia
f7579518e2 MDEV-19600: The optimizer should be able to produce rows=1 estimate for unique index with NULLable columns
Modify best_access_path() to produce rows=1 estimate for null-rejecting
lookups on unique NULL keys.
2019-06-05 14:00:45 +03:00
Marko Mäkelä
826f9d4f7e Merge 10.4 into 10.5 2019-05-23 10:32:21 +03:00
Monty
007f68c37f Replace ha_notify_table_changed() with notify_tabledef_changed()
Reason for the change was that ha_notify_table_changed() was done
after table open when .frm had been replaced, which caused failure
in engines that checks on open if .frm matches the engines table
definition.

Other changes:
- Remove not needed open/close call at end of inline alter table.
  Some test that depended on the table beeing in the table cache after
  ALTER TABLE had to be updated.
2019-05-23 01:20:18 +03:00
Oleksandr Byelkin
c07325f932 Merge branch '10.3' into 10.4 2019-05-19 20:55:37 +02:00
Oleksandr Byelkin
c51f85f882 Merge branch '10.2' into 10.3 2019-05-12 17:20:23 +02:00
Marko Mäkelä
8ce702aa90 MDEV-17540 Server crashes in row_purge after TRUNCATE TABLE
row_purge_upd_exist_or_extern_func(): Check for node->vcol_op_failed()
after row_purge_remove_sec_if_poss(), like row_purge_del_mark() did.
This avoids us dereferencing the node->table=NULL pointer.

The test case, submitted by Elena Stepanova, is not deterministic and
does not repeat the bug on 10.2. With the added loop, for me, it reliably
crashes 10.3 without the fix. I was unable to create a deterministic
test case for either 10.2 or 10.3.

Reviewed by Thirunarayanan Balathandayuthapani
2019-05-10 10:50:35 +03:00
Marko Mäkelä
b132b8895e Merge 10.3 into 10.4 2019-05-05 10:23:14 +03:00
Marko Mäkelä
4d59f45260 Merge 10.2 into 10.3 2019-04-27 20:41:31 +03:00
Marko Mäkelä
c795a9f3fe MDEV-12004: Add the Bug#28825718 test case
Adapt the test case from
mysql/mysql-server@2bbbcddd90
(MySQL 5.7.26).
2019-04-26 17:40:20 +03:00
Marko Mäkelä
268d46b87d Merge 10.3 into 10.4 2019-04-03 09:20:44 +03:00
Marko Mäkelä
977d7a7572 Merge 10.2 into 10.3 2019-04-03 08:21:43 +03:00
Marko Mäkelä
e3f44d8d0e MDEV-19085: Remove a bogus debug assertion
MariaDB does support InnoDB tables with no stored columns.
(They are necessarily empty.)
2019-04-02 16:40:27 +03:00
Marko Mäkelä
5c3ff5cb93 Merge 10.3 into 10.4 2019-04-02 11:04:54 +03:00
Michael Widenius
b5615eff0d Write information about restart in .result
Idea comes from MySQL which does something similar
2019-04-01 19:47:24 +03:00
Marko Mäkelä
349560d5d5 Merge 10.2 into 10.3 2019-03-27 13:27:04 +02:00
Marko Mäkelä
1e9c2b2305 Merge 10.1 into 10.2 2019-03-27 12:26:11 +02:00
Marko Mäkelä
8b480df63e Merge 10.3 into 10.4 2019-03-25 17:18:15 +02:00
Marko Mäkelä
c3a6c683e2 Merge 10.2 into 10.3 2019-03-25 11:03:19 +02:00
Chris Calender
2d6e627a9f Fix for MDEV-18276, typo in error message + all other occurrences of refering 2019-03-23 00:10:18 +04:00
Oleksandr Byelkin
93ac7ae70f Merge branch '10.3' into 10.4 2019-02-21 14:40:52 +01:00
Galina Shalygina
7fe1ca7ed6 Merge branch '10.4' into bb-10.4-mdev7486 2019-02-19 11:00:39 +03:00
Sergei Petrunia
3f154943db MDEV-18608: Defaults for 10.4: histogram size should be set
Followup: update test results
2019-02-18 13:37:57 +03:00
Galina Shalygina
7a77b221f1 MDEV-7486: Condition pushdown from HAVING into WHERE
Condition can be pushed from the HAVING clause into the WHERE clause
if it depends only on the fields that are used in the GROUP BY list
or depends on the fields that are equal to grouping fields.
Aggregate functions can't be pushed down.

How the pushdown is performed on the example:

SELECT t1.a,MAX(t1.b)
FROM t1
GROUP BY t1.a
HAVING (t1.a>2) AND (MAX(c)>12);

=>

SELECT t1.a,MAX(t1.b)
FROM t1
WHERE (t1.a>2)
GROUP BY t1.a
HAVING (MAX(c)>12);

The implementation scheme:

1. Extract the most restrictive condition cond from the HAVING clause of
   the select that depends only on the fields that are used in the GROUP BY
   list of the select (directly or indirectly through equalities)
2. Save cond as a condition that can be pushed into the WHERE clause
   of the select
3. Remove cond from the HAVING clause if it is possible

The optimization is implemented in the function
st_select_lex::pushdown_from_having_into_where().

New test file having_cond_pushdown.test is created.
2019-02-17 23:38:44 -08:00
Oleksandr Byelkin
65c5ef9b49 dirty merge 2019-02-07 13:59:31 +01:00
Igor Babaev
33907360f5 MDEV-16188 Post-merge corrections and adjustments 2019-02-04 22:44:33 -08:00
Sergei Golubchik
676f43da3a cleanup: don't ---replace_regex /#sql-.*/#sql-temporary/
no longer needed
2019-02-05 01:34:17 +01:00
Marko Mäkelä
78829a5780 Merge 10.3 into 10.4 2019-01-24 22:42:35 +02:00
Marko Mäkelä
947b6b849d Merge 10.2 into 10.3 2019-01-24 16:14:12 +02:00
Marko Mäkelä
7930ab7e33 Comment out the statement that triggers MDEV-18366 2019-01-24 12:41:07 +02:00
Marko Mäkelä
64678ca506 Bug #22990029: Add a test case 2019-01-23 19:46:35 +02:00
Marko Mäkelä
e32305e505 Add a test for Bug #28470805 DELETE CASCADE CRASHES ... ON RESTART
Thanks to commit 2614a0ab0f
before MDEV-5800, no version of MariaDB is affected by this bug
that was fixed in MySQL 5.7.25.
2019-01-23 19:45:12 +02:00
Marko Mäkelä
b5763ecd01 Merge 10.3 into 10.4 2018-12-18 11:33:53 +02:00
Marko Mäkelä
45531949ae Merge 10.2 into 10.3 2018-12-18 09:15:41 +02:00
Alexey Botchkov
c4ab352b67 MDEV-14576 Include full name of object in message about incorrect value for column.
The error message modified.
Then the TABLE_SHARE::error_table_name() implementation taken from 10.3,
          to be used as a name of the table in this message.
2018-12-16 02:21:41 +04:00
Marko Mäkelä
c64265f3f9 Merge 10.3 into 10.4 2018-12-12 14:09:48 +02:00
Varun Gupta
8aef7f2bb9 MDEV-17778: Alter table leads to a truncation warning with ANALYZE command
Alter statement changed the THD structure by setting the value to FIELD_CHECK_WARN
and then not resetting it back. This led ANALYZE to throw a warning which previously
it didn't.
2018-12-10 18:27:13 +05:30
Varun Gupta
93c360e3a5 MDEV-15253: Default optimizer setting changes for MariaDB 10.4
use_stat_tables= PREFERABLY
optimizer_use_condition_selectivity= 4
2018-12-09 09:22:00 +05:30
Alexander Barkov
07e4853c23 MDEV-17563 Different results using table or view when comparing values of time type
MDEV-17625 Different warnings when comparing a garbage to DATETIME vs TIME

- Splitting processes of data type conversion (to TIME/DATE,DATETIME)
  and warning generation.
  Warning are now only get collected during conversion (in an "int" variable),
  and are pushed in the very end of conversion (not in parallel).
  Warnings generated by the low level routines str_to_xxx() and number_to_xxx()
  can now be changed at the end, when TIME_FUZZY_DATES is applied,
  from "Invalid value" to "Truncated invalid value".

  Now "Illegal value" is issued only when the low level routine returned
  an error and TIME_FUZZY_DATES was not set. Otherwise, if the low level
  routine returned "false" (success), or if NULL was converted to a zero
  datetime by TIME_FUZZY_DATES, then "Truncated illegal value"
  is issued. This gives better warnings.

- Methods Type_handler::Item_get_date() and
  Type_handler::Item_func_hybrid_field_type_get_date() now only
  convert and collect warning information, but do not push warnings.

- Changing the return data type for Type_handler::Item_get_date()
  and Type_handler::Item_func_hybrid_field_type_get_date() from
  "bool" to "void". The conversion result (success vs error) can be
  checked by testing ltime->time_type. MYSQL_TIME_{NONE|ERROR}
  mean mean error, other values mean success.

- Adding new wrapper methods Type_handler::Item_get_date_with_warn() and
  Type_handler::Item_func_hybrid_field_type_get_date_with_warn()
  to do conversion followed by raising warnings, and changing
  the code to call new Type_handler::***_with_warn() methods.

- Adding a helper class Temporal::Status, a wrapper
  for MYSQL_TIME_STATUS with automatic initialization.

- Adding a helper class Temporal::Warn, to collect warnings
  but without actually raising them. Moving a part of ErrConv
  into a separate class ErrBuff, and deriving both Temporal::Warn
  and ErrConv from ErrBuff. The ErrBuff part of Temporal::Warn
  is used to collect textual representation of the input data.

- Adding a helper class Temporal::Warn_push. It's used
  to collect warning information during conversion, and
  automatically pushes warnings to the diagnostics area
  on its destructor time (in case of non-zero warning).

- Moving more code from various functions inside class Temporal.

- Adding more Temporal_hybrid constructors and
  protected Temporal methods make_from_xxx(),
  which convert and only collect warning information, but do not
  actually raise warnings.

- Now the low level functions  str_to_datetime() and str_to_time()
  always set status->warning if the return value is "true" (error).

- Now the low level functions number_to_time() and number_to_datetime()
  set the "*was_cut" argument if the return value is "true" (error).

- Adding a few DBUG_ASSERTs to make sure that str_to_xxx() and
  number_to_xxx() always set warnings on error.

- Adding new warning flags MYSQL_TIME_WARN_EDOM and MYSQL_TIME_WARN_ZERO_DATE
  for the code symmetry. Before this change there was a special
  code path for (rc==true && was_cut==0) which was treated by
  Field_temporal::store_invalid_with_warning as "zero date violation".
  Now was_cut==0 always means that there are no any error/warnings/notes
  to be raised, not matter what rc is.

- Using new Temporal_hybrid constructors in combination with
  Temporal::Warn_push inside str_to_datetime_with_warn(),
  double_to_datetime_with_warn(), int_to_datetime_with_warn(),
  Field::get_date(), Item::get_date_from_string(), and a few other places.

- Removing methods Dec_ptr::to_datetime_with_warn(),
  Year::to_time_with_warn(), my_decimal::to_datetime_with_warn(),
  Dec_ptr::to_datetime_with_warn().
  Fixing Sec6::to_time() and Sec6::to_datetime() to
  convert and only collect warnings, without raising warnings.
  Now warning raising functionality resides in Temporal::Warn_push.

- Adding classes Longlong_hybrid_null and Double_null, to
  return both value and the "IS NULL" flag. Adding methods
  Item::to_double_null(), to_longlong_hybrid_null(),
  Item_func_hybrid_field_type::to_longlong_hybrid_null_op(),
  Item_func_hybrid_field_type::to_double_null_op().
  Removing separate classes VInt and VInt_op, as they
  have been replaced by a single class Longlong_hybrid_null.

- Adding a helper method Temporal::type_name_by_timestamp_type(),
  moving a part of make_truncated_value_warning() into it,
  and reusing in Temporal::Warn::push_conversion_warnings().

- Removing Item::make_zero_date() and
  Item_func_hybrid_field_type::make_zero_mysql_time().
  They provided duplicate functionality.
  Now this code resides in Temporal::make_fuzzy_date().
  The latter is now called for all Item types when data type
  conversion (to DATE/TIME/DATETIME) is involved, including
  Item_field and Item_direct_view_ref.
  This fixes MDEV-17563: Item_direct_view_ref now correctly converts
  NULL to a zero date when TIME_FUZZY_DATES says so.
2018-11-08 09:31:46 +04:00
Marko Mäkelä
074c684099 Merge 10.3 into 10.4 2018-11-06 16:24:16 +02:00
Marko Mäkelä
df563e0c03 Merge 10.2 into 10.3
main.derived_cond_pushdown: Move all 10.3 tests to the end,
trim trailing white space, and add an "End of 10.3 tests" marker.
Add --sorted_result to tests where the ordering is not deterministic.

main.win_percentile: Add --sorted_result to tests where the
ordering is no longer deterministic.
2018-11-06 09:40:39 +02:00
Marko Mäkelä
31366c6c93 MDEV-17548 Incorrect access to off-page column for indexed virtual column
row_build_index_entry_low(): ext does not contain virtual columns.

row_upd_store_v_row(): Copy virtual column values

This is based on the following fix in MySQL 5.7.24:

commit 4ec2158bec73f1582501c4b3e3de250fed9edc9a
Author: Sachin Agarwal <sachin.z.agarwal@oracle.com>
Date:   Fri Aug 24 14:44:13 2018 +0530

    Bug #27968952 INNODB CRASH/CORRUPTION WITH TEXT PREFIX INDEXES

    Problem:
    There are two problems:
      1. If there is one secondary index on extenally
    stored column and another seconday index on virtual column (whose
    base column is not externally stored). then while updating seconday
    index on vitrual column, virtual column data is replaced by
    externally stoared column.
      2. In row update operation, node->row contains
    shallow copy of virtual data fields. While building an update vector
    containing all the fields to be modified, compute virtual column.
    which may causes change in virtual data fields in node->row.

    In both the above cases, while updating seconday index on virtual
    column, couldn't find the row and hit an explicite assert inside
    ROW_NOT_FOUND.

    Fix:
    1. Added check if column is virtual then its ext flag should be ZERO
    and virtual column data will not be replaced by offset column data.
    2. Deep copy of virtual data fields for node->row.

    RB: #20382
    Reviewed by : Jimmy.Yang@oracle.com
2018-10-25 17:08:36 +03:00
Marko Mäkelä
0e5a4ac253 MDEV-15662 Instant DROP COLUMN or changing the order of columns
Allow ADD COLUMN anywhere in a table, not only adding as the
last column.

Allow instant DROP COLUMN and instant changing the order of columns.

The added columns will always be added last in clustered index records.
In new records, instantly dropped columns will be stored as NULL or
empty when possible.

Information about dropped and reordered columns will be written in
a metadata BLOB (mblob), which is stored before the first 'user' field
in the hidden metadata record at the start of the clustered index.
The presence of mblob is indicated by setting the delete-mark flag in
the metadata record.

The metadata BLOB stores the number of clustered index fields,
followed by an array of column information for each field.
For dropped columns, we store the NOT NULL flag, the fixed length,
and for variable-length columns, whether the maximum length exceeded
255 bytes. For non-dropped columns, we store the column position.

Unlike with MDEV-11369, when a table becomes empty, it cannot
be converted back to the canonical format. The reason for this is
that other threads may hold cached objects such as
row_prebuilt_t::ins_node that could refer to dropped or reordered
index fields.

For instant DROP COLUMN and ROW_FORMAT=COMPACT or ROW_FORMAT=DYNAMIC,
we must store the n_core_null_bytes in the root page, so that the
chain of node pointer records can be followed in order to reach the
leftmost leaf page where the metadata record is located.
If the mblob is present, we will zero-initialize the strings
"infimum" and "supremum" in the root page, and use the last byte of
"supremum" for storing the number of null bytes (which are allocated
but useless on node pointer pages). This is necessary for
btr_cur_instant_init_metadata() to be able to navigate to the mblob.

If the PRIMARY KEY contains any variable-length column and some
nullable columns were instantly dropped, the dict_index_t::n_nullable
in the data dictionary could be smaller than it actually is in the
non-leaf pages. Because of this, the non-leaf pages could use more
bytes for the null flags than the data dictionary expects, and we
could be reading the lengths of the variable-length columns from the
wrong offset, and thus reading the child page number from wrong place.
This is the result of two design mistakes that involve unnecessary
storage of data: First, it is nonsense to store any data fields for
the leftmost node pointer records, because the comparisons would be
resolved by the MIN_REC_FLAG alone. Second, there cannot be any null
fields in the clustered index node pointer fields, but we nevertheless
reserve space for all the null flags.

Limitations (future work):

MDEV-17459 Allow instant ALTER TABLE even if FULLTEXT INDEX exists
MDEV-17468 Avoid table rebuild on operations on generated columns
MDEV-17494 Refuse ALGORITHM=INSTANT when the row size is too large

btr_page_reorganize_low(): Preserve any metadata in the root page.
Call lock_move_reorganize_page() only after restoring the "infimum"
and "supremum" records, to avoid a memcmp() assertion failure.

dict_col_t::DROPPED: Magic value for dict_col_t::ind.

dict_col_t::clear_instant(): Renamed from dict_col_t::remove_instant().
Do not assert that the column was instantly added, because we
sometimes call this unconditionally for all columns.
Convert an instantly added column to a "core column". The old name
remove_instant() could be mistaken to refer to "instant DROP COLUMN".

dict_col_t::is_added(): Rename from dict_col_t::is_instant().

dtype_t::metadata_blob_init(): Initialize the mblob data type.

dtuple_t::is_metadata(), dtuple_t::is_alter_metadata(),
upd_t::is_metadata(), upd_t::is_alter_metadata(): Check if info_bits
refer to a metadata record.

dict_table_t::instant: Metadata about dropped or reordered columns.

dict_table_t::prepare_instant(): Prepare
ha_innobase_inplace_ctx::instant_table for instant ALTER TABLE.
innobase_instant_try() will pass this to dict_table_t::instant_column().
On rollback, dict_table_t::rollback_instant() will be called.

dict_table_t::instant_column(): Renamed from instant_add_column().
Add the parameter col_map so that columns can be reordered.
Copy and adjust v_cols[] as well.

dict_table_t::find(): Find an old column based on a new column number.

dict_table_t::serialise_columns(), dict_table_t::deserialise_columns():
Convert the mblob.

dict_index_t::instant_metadata(): Create the metadata record
for instant ALTER TABLE. Invoke dict_table_t::serialise_columns().

dict_index_t::reconstruct_fields(): Invoked by
dict_table_t::deserialise_columns().

dict_index_t::clear_instant_alter(): Move the fields for the
dropped columns to the end, and sort the surviving index fields
in ascending order of column position.

ha_innobase::check_if_supported_inplace_alter(): Do not allow
adding a FTS_DOC_ID column if a hidden FTS_DOC_ID column exists
due to FULLTEXT INDEX. (This always required ALGORITHM=COPY.)

instant_alter_column_possible(): Add a parameter for InnoDB table,
to check for additional conditions, such as the maximum number of
index fields.

ha_innobase_inplace_ctx::first_alter_pos: The first column whose position
is affected by instant ADD, DROP, or changing the order of columns.

innobase_build_col_map(): Skip added virtual columns.

prepare_inplace_add_virtual(): Correctly compute num_to_add_vcol.
Remove some unnecessary code. Note that the call to
innodb_base_col_setup() should be executed later.

commit_try_norebuild(): If ctx->is_instant(), let the virtual
columns be added or dropped by innobase_instant_try().

innobase_instant_try(): Fill in a zero default value for the
hidden column FTS_DOC_ID (to reduce the work needed in MDEV-17459).
If any columns were dropped or reordered (or added not last),
delete any SYS_COLUMNS records for the following columns, and
insert SYS_COLUMNS records for all subsequent stored columns as well
as for all virtual columns. If any virtual column is dropped, rewrite
all virtual column metadata. Use a shortcut only for adding
virtual columns. This is because innobase_drop_virtual_try()
assumes that the dropped virtual columns still exist in ctx->old_table.

innodb_update_cols(): Renamed from innodb_update_n_cols().

innobase_add_one_virtual(), innobase_insert_sys_virtual(): Change
the return type to bool, and invoke my_error() when detecting an error.

innodb_insert_sys_columns(): Insert a record into SYS_COLUMNS.
Refactored from innobase_add_one_virtual() and innobase_instant_add_col().

innobase_instant_add_col(): Replace the parameter dfield with type.

innobase_instant_drop_cols(): Drop matching columns from SYS_COLUMNS
and all columns from SYS_VIRTUAL.

innobase_add_virtual_try(), innobase_drop_virtual_try(): Let
the caller invoke innodb_update_cols().

innobase_rename_column_try(): Skip dropped columns.

commit_cache_norebuild(): Update table->fts->doc_col.

dict_mem_table_col_rename_low(): Skip dropped columns.

trx_undo_rec_get_partial_row(): Skip dropped columns.

trx_undo_update_rec_get_update(): Handle the metadata BLOB correctly.

trx_undo_page_report_modify(): Avoid out-of-bounds access to record fields.
Log metadata records consistently.
Apparently, the first fields of a clustered index may be updated
in an update_undo vector when the index is ID_IND of SYS_FOREIGN,
as part of renaming the table during ALTER TABLE. Normally, updates of
the PRIMARY KEY should be logged as delete-mark and an insert.

row_undo_mod_parse_undo_rec(), row_purge_parse_undo_rec():
Use trx_undo_metadata.

row_undo_mod_clust_low(): On metadata rollback, roll back the root page too.

row_undo_mod_clust(): Relax an assertion. The delete-mark flag was
repurposed for ALTER TABLE metadata records.

row_rec_to_index_entry_impl(): Add the template parameter mblob
and the optional parameter info_bits for specifying the desired new
info bits. For the metadata tuple, allow conversion between the original
format (ADD COLUMN only) and the generic format (with hidden BLOB).
Add the optional parameter "pad" to determine whether the tuple should
be padded to the index fields (on ALTER TABLE it should), or whether
it should remain at its original size (on rollback).

row_build_index_entry_low(): Clean up the code, removing
redundant variables and conditions. For instantly dropped columns,
generate a dummy value that is NULL, the empty string, or a
fixed length of NUL bytes, depending on the type of the dropped column.

row_upd_clust_rec_by_insert_inherit_func(): On the update of PRIMARY KEY
of a record that contained a dropped column whose value was stored
externally, we will be inserting a dummy NULL or empty string value
to the field of the dropped column. The externally stored column would
eventually be dropped when purge removes the delete-marked record for
the old PRIMARY KEY value.

btr_index_rec_validate(): Recognize the metadata record.

btr_discard_only_page_on_level(): Preserve the generic instant
ALTER TABLE metadata.

btr_set_instant(): Replaces page_set_instant(). This sets a clustered
index root page to the appropriate format, or upgrades from
the MDEV-11369 instant ADD COLUMN to generic ALTER TABLE format.

btr_cur_instant_init_low(): Read and validate the metadata BLOB page
before reconstructing the dictionary information based on it.

btr_cur_instant_init_metadata(): Do not read any lengths from the
metadata record header before reading the BLOB. At this point, we
would not actually know how many nullable fields the metadata record
contains.

btr_cur_instant_root_init(): Initialize n_core_null_bytes in one
of two possible ways.

btr_cur_trim(): Handle the mblob record.

row_metadata_to_tuple(): Convert a metadata record to a data tuple,
based on the new info_bits of the metadata record.

btr_cur_pessimistic_update(): Invoke row_metadata_to_tuple() if needed.
Invoke dtuple_convert_big_rec() for metadata records if the record is
too large, or if the mblob is not yet marked as externally stored.

btr_cur_optimistic_delete_func(), btr_cur_pessimistic_delete():
When the last user record is deleted, do not delete the
generic instant ALTER TABLE metadata record. Only delete
MDEV-11369 instant ADD COLUMN metadata records.

btr_cur_optimistic_insert(): Avoid unnecessary computation of rec_size.

btr_pcur_store_position(): Allow a logically empty page to contain
a metadata record for generic ALTER TABLE.

REC_INFO_DEFAULT_ROW_ADD: Renamed from REC_INFO_DEFAULT_ROW.
This is for the old instant ADD COLUMN (MDEV-11369) only.

REC_INFO_DEFAULT_ROW_ALTER: The more generic metadata record,
with additional information for dropped or reordered columns.

rec_info_bits_valid(): Remove. The only case when this would fail
is when the record is the generic ALTER TABLE metadata record.

rec_is_alter_metadata(): Check if a record is the metadata record
for instant ALTER TABLE (other than ADD COLUMN). NOTE: This function
must not be invoked on node pointer records, because the delete-mark
flag in those records may be set (it is garbage), and then a debug
assertion could fail because index->is_instant() does not necessarily
hold.

rec_is_add_metadata(): Check if a record is MDEV-11369 ADD COLUMN metadata
record (not more generic instant ALTER TABLE).

rec_get_converted_size_comp_prefix_low(): Assume that the metadata
field will be stored externally. In dtuple_convert_big_rec() during
the rec_get_converted_size() call, it would not be there yet.

rec_get_converted_size_comp(): Replace status,fields,n_fields with tuple.

rec_init_offsets_comp_ordinary(), rec_get_converted_size_comp_prefix_low(),
rec_convert_dtuple_to_rec_comp(): Add template<bool mblob = false>.
With mblob=true, process a record with a metadata BLOB.

rec_copy_prefix_to_buf(): Assert that no fields beyond the key and
system columns are being copied. Exclude the metadata BLOB field.

rec_convert_dtuple_to_metadata_comp(): Convert an alter metadata tuple
into a record.

row_upd_index_replace_metadata(): Apply an update vector to an
alter_metadata tuple.

row_log_allocate(): Replace dict_index_t::is_instant()
with a more appropriate condition that ignores dict_table_t::instant.
Only a table on which the MDEV-11369 ADD COLUMN was performed
can "lose its instantness" when it becomes empty. After
instant DROP COLUMN or reordering columns, we cannot simply
convert the table to the canonical format, because the data
dictionary cache and all possibly existing references to it
from other client connection threads would have to be adjusted.

row_quiesce_write_index_fields(): Do not crash when the table contains
an instantly dropped column.

Thanks to Thirunarayanan Balathandayuthapani for discussing the design
and implementing an initial prototype of this.
Thanks to Matthias Leich for testing.
2018-10-19 18:57:23 +03:00
Marko Mäkelä
6319c0b541 MDEV-13564: Replace innodb_unsafe_truncate with innodb_safe_truncate
Rename the 10.2-specific configuration option innodb_unsafe_truncate
to innodb_safe_truncate, and invert its value.

The default (for now) is innodb_safe_truncate=OFF, to avoid
disrupting users with an undo and redo log format change within
a Generally Available (GA) release series.
2018-10-11 15:10:13 +03:00
Marko Mäkelä
3448ceb02a MDEV-13564: Implement innodb_unsafe_truncate=ON for compatibility
While MariaDB Server 10.2 is not really guaranteed to be compatible
with Percona XtraBackup 2.4 (for example, the MySQL 5.7 undo log format
change that could be present in XtraBackup, but was reverted from
MariaDB in MDEV-12289), we do not want to disrupt users who have
deployed xtrabackup and MariaDB Server 10.2 in their environments.

With this change, MariaDB 10.2 will continue to use the backup-unsafe
TRUNCATE TABLE code, so that neither the undo log nor the redo log
formats will change in an incompatible way.

Undo tablespace truncation will keep using the redo log only. Recovery
or backup with old code will fail to shrink the undo tablespace files,
but the contents will be recovered just fine.

In the MariaDB Server 10.2 series only, we introduce the configuration
parameter innodb_unsafe_truncate and make it ON by default. To allow
MariaDB Backup (mariabackup) to work properly with TRUNCATE TABLE
operations, use loose_innodb_unsafe_truncate=OFF.

MariaDB Server 10.3.10 and later releases will always use the
backup-safe TRUNCATE TABLE, and this parameter will not be
added there.

recv_recovery_rollback_active(): Skip row_mysql_drop_garbage_tables()
unless innodb_unsafe_truncate=OFF. It is too unsafe to drop orphan
tables if RENAME operations are not transactional within InnoDB.

LOG_HEADER_FORMAT_10_3: Replaces LOG_HEADER_FORMAT_CURRENT.

log_init(), log_group_file_header_flush(),
srv_prepare_to_delete_redo_log_files(),
innobase_start_or_create_for_mysql(): Choose the redo log format
and subformat based on the value of innodb_unsafe_truncate.
2018-10-11 08:17:04 +03:00
Thirunarayanan Balathandayuthapani
e9d9ca8c44 MDEV-16980 Wrongly set tablename len while opening the
table for purge thread

Problem:
=======
	Purge tries to fetch mdl lock for the whole table even though it tries
to open one of the partition. But table name length was wrongly set to indicate
the partition name too.

Solution:
========
- Table name length should identify the table name only not the partition name.
2018-10-08 21:40:18 +05:30
Thirunarayanan Balathandayuthapani
7d4beb7286 MDEV-16980 Wrongly set tablename len while opening the
table for purge thread

Problem:
=======
	Purge tries to fetch mdl lock for the whole table even though it tries
to open one of the partition. But table name length was wrongly set to indicate
the partition name too.

Solution:
========
- Table name length should identify the table name only not the partition name.
2018-10-08 21:06:42 +05:30
Marko Mäkelä
5a1868b58d MDEV-13564 Mariabackup does not work with TRUNCATE
This is a merge from 10.2, but the 10.2 version of this will not
be pushed into 10.2 yet, because the 10.2 version would include
backports of MDEV-14717 and MDEV-14585, which would introduce
a crash recovery regression: Tables could be lost on
table-rebuilding DDL operations, such as ALTER TABLE,
OPTIMIZE TABLE or this new backup-friendly TRUNCATE TABLE.
The test innodb.truncate_crash occasionally loses the table due to
the following bug:

MDEV-17158 log_write_up_to() sometimes fails
2018-09-07 22:15:06 +03:00
Marko Mäkelä
055a3334ad MDEV-13564 Mariabackup does not work with TRUNCATE
Implement undo tablespace truncation via normal redo logging.

Implement TRUNCATE TABLE as a combination of RENAME to #sql-ib name,
CREATE, and DROP.

Note: Orphan #sql-ib*.ibd may be left behind if MariaDB Server 10.2
is killed before the DROP operation is committed. If MariaDB Server 10.2
is killed during TRUNCATE, it is also possible that the old table
was renamed to #sql-ib*.ibd but the data dictionary will refer to the
table using the original name.

In MariaDB Server 10.3, RENAME inside InnoDB is transactional,
and #sql-* tables will be dropped on startup. So, this new TRUNCATE
will be fully crash-safe in 10.3.

ha_mroonga::wrapper_truncate(): Pass table options to the underlying
storage engine, now that ha_innobase::truncate() will need them.

rpl_slave_state::truncate_state_table(): Before truncating
mysql.gtid_slave_pos, evict any cached table handles from
the table definition cache, so that there will be no stale
references to the old table after truncating.

== TRUNCATE TABLE ==

WL#6501 in MySQL 5.7 introduced separate log files for implementing
atomic and crash-safe TRUNCATE TABLE, instead of using the InnoDB
undo and redo log. Some convoluted logic was added to the InnoDB
crash recovery, and some extra synchronization (including a redo log
checkpoint) was introduced to make this work. This synchronization
has caused performance problems and race conditions, and the extra
log files cannot be copied or applied by external backup programs.

In order to support crash-upgrade from MariaDB 10.2, we will keep
the logic for parsing and applying the extra log files, but we will
no longer generate those files in TRUNCATE TABLE.

A prerequisite for crash-safe TRUNCATE is a crash-safe RENAME TABLE
(with full redo and undo logging and proper rollback). This will
be implemented in MDEV-14717.

ha_innobase::truncate(): Invoke RENAME, create(), delete_table().
Because RENAME cannot be fully rolled back before MariaDB 10.3
due to missing undo logging, add some explicit rename-back in
case the operation fails.

ha_innobase::delete(): Introduce a variant that takes sqlcom as
a parameter. In TRUNCATE TABLE, we do not want to touch any
FOREIGN KEY constraints.

ha_innobase::create(): Add the parameters file_per_table, trx.
In TRUNCATE, the new table must be created in the same transaction
that renames the old table.

create_table_info_t::create_table_info_t(): Add the parameters
file_per_table, trx.

row_drop_table_for_mysql(): Replace a bool parameter with sqlcom.

row_drop_table_after_create_fail(): New function, wrapping
row_drop_table_for_mysql().

dict_truncate_index_tree_in_mem(), fil_truncate_tablespace(),
fil_prepare_for_truncate(), fil_reinit_space_header_for_table(),
row_truncate_table_for_mysql(), TruncateLogger,
row_truncate_prepare(), row_truncate_rollback(),
row_truncate_complete(), row_truncate_fts(),
row_truncate_update_system_tables(),
row_truncate_foreign_key_checks(), row_truncate_sanity_checks():
Remove.

row_upd_check_references_constraints(): Remove a check for
TRUNCATE, now that the table is no longer truncated in place.

The new test innodb.truncate_foreign uses DEBUG_SYNC to cover some
race-condition like scenarios. The test innodb-innodb.truncate does
not use any synchronization.

We add a redo log subformat to indicate backup-friendly format.
MariaDB 10.4 will remove support for the old TRUNCATE logging,
so crash-upgrade from old 10.2 or 10.3 to 10.4 will involve
limitations.

== Undo tablespace truncation ==

MySQL 5.7 implements undo tablespace truncation. It is only
possible when innodb_undo_tablespaces is set to at least 2.
The logging is implemented similar to the WL#6501 TRUNCATE,
that is, using separate log files and a redo log checkpoint.

We can simply implement undo tablespace truncation within
a single mini-transaction that reinitializes the undo log
tablespace file. Unfortunately, due to the redo log format
of some operations, currently, the total redo log written by
undo tablespace truncation will be more than the combined size
of the truncated undo tablespace. It should be acceptable
to have a little more than 1 megabyte of log in a single
mini-transaction. This will be fixed in MDEV-17138 in
MariaDB Server 10.4.

recv_sys_t: Add truncated_undo_spaces[] to remember for which undo
tablespaces a MLOG_FILE_CREATE2 record was seen.

namespace undo: Remove some unnecessary declarations.

fil_space_t::is_being_truncated: Document that this flag now
only applies to undo tablespaces. Remove some references.

fil_space_t::is_stopping(): Do not refer to is_being_truncated.
This check is for tablespaces of tables. Potentially used
tablespaces are never truncated any more.

buf_dblwr_process(): Suppress the out-of-bounds warning
for undo tablespaces.

fil_truncate_log(): Write a MLOG_FILE_CREATE2 with a nonzero
page number (new size of the tablespace in pages) to inform
crash recovery that the undo tablespace size has been reduced.

fil_op_write_log(): Relax assertions, so that MLOG_FILE_CREATE2
can be written for undo tablespaces (without .ibd file suffix)
for a nonzero page number.

os_file_truncate(): Add the parameter allow_shrink=false
so that undo tablespaces can actually be shrunk using this function.

fil_name_parse(): For undo tablespace truncation,
buffer MLOG_FILE_CREATE2 in truncated_undo_spaces[].

recv_read_in_area(): Avoid reading pages for which no redo log
records remain buffered, after recv_addr_trim() removed them.

trx_rseg_header_create(): Add a FIXME comment that we could write
much less redo log.

trx_undo_truncate_tablespace(): Reinitialize the undo tablespace
in a single mini-transaction, which will be flushed to the redo log
before the file size is trimmed.

recv_addr_trim(): Discard any redo logs for pages that were
logged after the new end of a file, before the truncation LSN.
If the rec_list becomes empty, reduce n_addrs. After removing
any affected records, actually truncate the file.

recv_apply_hashed_log_recs(): Invoke recv_addr_trim() right before
applying any log records. The undo tablespace files must be open
at this point.

buf_flush_or_remove_pages(), buf_flush_dirty_pages(),
buf_LRU_flush_or_remove_pages(): Add a parameter for specifying
the number of the first page to flush or remove (default 0).

trx_purge_initiate_truncate(): Remove the log checkpoints, the
extra logging, and some unnecessary crash points. Merge the code
from trx_undo_truncate_tablespace(). First, flush all to-be-discarded
pages (beyond the new end of the file), then trim the space->size
to make the page allocation deterministic. At the only remaining
crash injection point, flush the redo log, so that the recovery
can be tested.
2018-09-07 22:10:02 +03:00