1
0
mirror of https://github.com/MariaDB/server.git synced 2025-09-09 18:40:27 +03:00
Commit Graph

2744 Commits

Author SHA1 Message Date
Marko Mäkelä
1bd681c8b3 MDEV-25506 (3 of 3): Do not delete .ibd files before commit
This is a complete rewrite of DROP TABLE, also as part of other DDL,
such as ALTER TABLE, CREATE TABLE...SELECT, TRUNCATE TABLE.

The background DROP TABLE queue hack is removed.
If a transaction needs to drop and create a table by the same name
(like TRUNCATE TABLE does), it must first rename the table to an
internal #sql-ib name. No committed version of the data dictionary
will include any #sql-ib tables, because whenever a transaction
renames a table to a #sql-ib name, it will also drop that table.
Either the rename will be rolled back, or the drop will be committed.

Data files will be unlinked after the transaction has been committed
and a FILE_RENAME record has been durably written. The file will
actually be deleted when the detached file handle returned by
fil_delete_tablespace() will be closed, after the latches have been
released. It is possible that a purge of the delete of the SYS_INDEXES
record for the clustered index will execute fil_delete_tablespace()
concurrently with the DDL transaction. In that case, the thread that
arrives later will wait for the other thread to finish.

HTON_TRUNCATE_REQUIRES_EXCLUSIVE_USE: A new handler flag.
ha_innobase::truncate() now requires that all other references to
the table be released in advance. This was implemented by Monty.

ha_innobase::delete_table(): If CREATE TABLE..SELECT is detected,
we will "hijack" the current transaction, drop the table in
the current transaction and commit the current transaction.
This essentially fixes MDEV-21602. There is a FIXME comment about
making the check less failure-prone.

ha_innobase::truncate(), ha_innobase::delete_table():
Implement a fast path for temporary tables. We will no longer allow
temporary tables to use the adaptive hash index.

dict_table_t::mdl_name: The original table name for the purpose of
acquiring MDL in purge, to prevent a race condition between a
DDL transaction that is dropping a table, and purge processing
undo log records of DML that had executed before the DDL operation.
For #sql-backup- tables during ALTER TABLE...ALGORITHM=COPY, the
dict_table_t::mdl_name will differ from dict_table_t::name.

dict_table_t::parse_name(): Use mdl_name instead of name.

dict_table_rename_in_cache(): Update mdl_name.

For the internal FTS_ tables of FULLTEXT INDEX, purge would
acquire MDL on the FTS_ table name, but not on the main table,
and therefore it would be able to run concurrently with a
DDL transaction that is dropping the table. Previously, the
DROP TABLE queue hack prevented a race between purge and DDL.
For now, we introduce purge_sys.stop_FTS() to prevent purge from
opening any table, while a DDL transaction that may drop FTS_
tables is in progress. The function fts_lock_table(), which will
be invoked before the dictionary is locked, will wait for
purge to release any table handles.

trx_t::drop_table_statistics(): Drop statistics for the table.
This replaces dict_stats_drop_index(). We will drop or rename
persistent statistics atomically as part of DDL transactions.
On lock conflict for dropping statistics, we will fail instantly
with DB_LOCK_WAIT_TIMEOUT, because we will be holding the
exclusive data dictionary latch.

trx_t::commit_cleanup(): Separated from trx_t::commit_in_memory().
Relax an assertion around fts_commit() and allow DB_LOCK_WAIT_TIMEOUT
in addition to DB_DUPLICATE_KEY. The call to fts_commit() is
entirely misplaced here and may obviously break the consistency
of transactions that affect FULLTEXT INDEX. It needs to be fixed
separately.

dict_table_t::n_foreign_key_checks_running: Remove (MDEV-21175).
The counter was a work-around for missing meta-data locking (MDL)
on the SQL layer, and not really needed in MariaDB.

ER_TABLE_IN_FK_CHECK: Replaced with ER_UNUSED_28.

HA_ERR_TABLE_IN_FK_CHECK: Remove.

row_ins_check_foreign_constraints(): Do not acquire
dict_sys.latch either. The SQL-layer MDL will protect us.

This was reviewed by Thirunarayanan Balathandayuthapani
and tested by Matthias Leich.
2021-06-09 17:06:07 +03:00
Marko Mäkelä
a722ee88f3 Merge 10.5 into 10.6 2021-06-01 11:39:38 +03:00
Marko Mäkelä
9c7a456a92 Merge 10.4 into 10.5 2021-06-01 10:38:09 +03:00
sjaakola
e212415690 MDEV-25551 applying crash with tables without PK
The underlying problem with MDEV-25551 turned out to be that
transactions having changes for tables with no primary key,
were not safe to apply in parallel. This is due to excessive locking
in innodb side, and even non related row modifications could end up
in lock conflict during applying.

The fix for MDEV-25551 has disabled parallel applying for tables with no PK.
This fix depends on change for wsrep-lib, where a separate PR allows
application to modify transaction flags in wsrep-lib.

This commit has also separate mtr test for verifying that transactions
modifying a table with no primary key, will not apply in parallel.
This test is a modified version of initial test created by Gabor Orosz,
the reporterr of MDEV-25551.
Another mtr test was added in galera_sr suite, for testing if modifying
tables with no primary key would causes issues for streaming replication
use cases.

Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
2021-05-26 07:41:05 +03:00
Sergei Golubchik
b2340cdd07 fix compilation w/o partitioning 2021-05-19 22:54:14 +02:00
Monty
83e529eced MDEV-18465 Logging of DDL statements during backup
Many of the changes was needed to be able to collect and print engine
name and table version id's in the ddl log.
2021-05-19 22:54:13 +02:00
Monty
7762ee5dbe MDEV-25180 Atomic ALTER TABLE
MDEV-25604 Atomic DDL: Binlog event written upon recovery does not
           have default database

The purpose of this task is to ensure that ALTER TABLE is atomic even if
the MariaDB server would be killed at any point of the alter table.
This means that either the ALTER TABLE succeeds (including that triggers,
the status tables and the binary log are updated) or things should be
reverted to their original state.

If the server crashes before the new version is fully up to date and
commited, it will revert to the original table and remove all
temporary files and tables.
If the new version is commited, crash recovery will use the new version,
and update triggers, the status tables and the binary log.
The one execption is ALTER TABLE .. RENAME .. where no changes are done
to table definition. This one will work as RENAME and roll back unless
the whole statement completed, including updating the binary log (if
enabled).

Other changes:
- Added handlerton->check_version() function to allow the ddl recovery
  code to check, in case of inplace alter table, if the table in the
  storage engine is of the new or old version.
- Added handler->table_version() so that an engine can report the current
  version of the table. This should be changed each time the table
  definition changes.
- Added  ha_signal_ddl_recovery_done() and
  handlerton::signal_ddl_recovery_done() to inform all handlers when
  ddl recovery has been done. (Needed by InnoDB).
- Added handlerton call inplace_alter_table_committed, to signal engine
  that ddl_log has been closed for the alter table query.
- Added new handerton flag
  HTON_REQUIRES_NOTIFY_TABLEDEF_CHANGED_AFTER_COMMIT to signal when we
  should call hton->notify_tabledef_changed() during
  mysql_inplace_alter_table. This was required as MyRocks and InnoDB
  needed the call at different times.
- Added function server_uuid_value() to be able to generate a temporary
  xid when ddl recovery writes the query to the binary log. This is
  needed to be able to handle crashes during ddl log recovery.
- Moved freeing of the frm definition to end of mysql_alter_table() to
  remove duplicate code and have a common exit strategy.

-------
InnoDB part of atomic ALTER TABLE
(Implemented by Marko Mäkelä)
innodb_check_version(): Compare the saved dict_table_t::def_trx_id
to determine whether an ALTER TABLE operation was committed.

We must correctly recover dict_table_t::def_trx_id for this to work.
Before purge removes any trace of DB_TRX_ID from system tables, it
will make an effort to load the user table into the cache, so that
the dict_table_t::def_trx_id can be recovered.

ha_innobase::table_version(): return garbage, or the trx_id that would
be used for committing an ALTER TABLE operation.

In InnoDB, table names starting with #sql-ib will remain special:
they will be dropped on startup. This may be revisited later in
MDEV-18518 when we implement proper undo logging and rollback
for creating or dropping multiple tables in a transaction.

Table names starting with #sql will retain some special meaning:
dict_table_t::parse_name() will not consider such names for
MDL acquisition, and dict_table_rename_in_cache() will treat such
names specially when handling FOREIGN KEY constraints.

Simplify InnoDB DROP INDEX.
Prevent purge wakeup

To ensure that dict_table_t::def_trx_id will be recovered correctly
in case the server is killed before ddl_log_complete(), we will block
the purge of any history in SYS_TABLES, SYS_INDEXES, SYS_COLUMNS
between ha_innobase::commit_inplace_alter_table(commit=true)
(purge_sys.stop_SYS()) and purge_sys.resume_SYS().
The completion callback purge_sys.resume_SYS() must be between
ddl_log_complete() and MDL release.

--------

MyRocks support for atomic ALTER TABLE
(Implemented by Sergui Petrunia)

Implement these SE API functions:
- ha_rocksdb::table_version()
- hton->check_version = rocksdb_check_versionMyRocks data dictionary
  now stores table version for each table.
  (Absence of table version record is interpreted as table_version=0,
  that is, which means no upgrade changes are needed)
- For inplace alter table of a partitioned table, call the underlying
  handlerton when checking if the table is ok. This assumes that the
  partition engine commits all changes at once.
2021-05-19 22:54:13 +02:00
Monty
6aa9a552c2 MDEV-24576 Atomic CREATE TABLE
There are a few different cases to consider

Logging of CREATE TABLE and CREATE TABLE ... LIKE
- If REPLACE is used and there was an existing table, DDL log the drop of
  the table.
- If discovery of table is to be done
    - DDL LOG create table
  else
    - DDL log create table (with engine type)
    - create the table
- If table was created
  - Log entry to binary log with xid
  - Mark DDL log completed

Crash recovery:
- If query was in binary log do nothing and exit
- If discoverted table
   - Delete the .frm file
-else
   - Drop created table and frm file
- If table was dropped, write a DROP TABLE statement in binary log

CREATE TABLE ... SELECT required a little more work as when one is using
statement logging the query is written to the binary log before commit is
done.
This was fixed by adding a DROP TABLE to the binary log during crash
recovery if the ddl log entry was not closed. In this case the binary log
will contain:
CREATE TABLE xxx ... SELECT ....
DROP TABLE xxx;

Other things:
- Added debug_crash_here() functionality to Aria to be able to test
  crash in create table between the creation of the .MAI and the .MAD files.
2021-05-19 22:54:13 +02:00
Monty
7a588c30b1 MDEV-24408 Crash-safe DROP DATABASE
Description of how DROP DATABASE works after this patch

- Collect list of tables
- DDL log tables as they are dropped
- DDL log drop database
- Delete db.opt
- Delete data directory
- Log either DROP TABLE or DROP DATABASE to binary log
- De active ddl log entry

This is in line of how things where before (minus ddl logging) except that
we delete db.opt file last to not loose it if DROP DATABASE fails.

On recovery we have to ensure that all dropped tables are logged in
binary log and that they are properly dropped (as with atomic drop
table).
No new tables be dropped as part of recovery.

Recovery of active drop database ddl log entry:

- If drop database was logged to ddl log but was not found in the binary
  log:
  - drop the db.opt file and database directory.
  - Log DROP DATABASE to binary log
- If drop database was not logged to ddl log
  - Update binary log with DROP TABLE of the dropped tables. If table list
    is longer than max_allowed_packet, then the query will be split into
    multiple DROP TABLE/VIEW queries.

Other things:
- Added DDL_LOG_STATE and 'current database' as arguments to
  mysql_rm_table_no_locks(). This was needed to be able to combine
  ddl logging of DROP DATABASE and DROP TABLE and make the generated
  DROP TABLE statements shorter.
- To make the DROP TABLE statement created by ddl log shorter, I changed
  the binlogged query to use current directory and omit the directory
  part for all tables in the current directory.
- Merged some DROP TABLE and DROP VIEW code in ddl logger.  This was done
  to be able get separate DROP VIEW and DROP TABLE statements in the binary
  log.
- Added a 'recovery_state' variable to remember the state of dropped
  tables and views.
- Moved out code that drops database objects (stored procedures) from
  mysql_rm_db_internal() to drop_database_objects() for better code reuse.
- Made mysql_rm_db_internal() global so that could be used by the ddl
  recovery code.
2021-05-19 22:54:13 +02:00
Monty
e3cfb7c803 MDEV-23844 Atomic DROP TABLE (single table)
Logging logic:
- Log tables, just before they are dropped, to the ddl log
- After the last table for the statement is dropped, log an xid for the
  whole ddl log event

In case of crash:
- Remove first any active DROP TABLE events from the ddl log that matches
  xids found in binary log (this mean the drop was successful and was
  propery logged).
- Loop over all active DROP TABLE events
  - Ensure that the table is completely dropped
- Write a DROP TABLE entry to the binary log with the dropped tables.

Other things:
- Added code to ha_drop_table() to be able to tell the difference if
  a get_new_handler() failed because of out-of-memory or because the
  handler refused/was not able to create a a handler. This was needed
  to get sequences to work as sequences needs a share object to be passed
  to get_new_handler()
- TC_LOG_BINLOG::recover() was changed to always collect Xid's from the
  binary log and always call ddl_log_close_binlogged_events(). This was
  needed to be able to collect DROP TABLE events with embedded Xid's
  (used by ddl log).
- Added a new variable "$grep_script" to binlog filter to be able to find
  only rows that matches a regexp.
- Had to adjust some test that changed because drop statements are a bit
  larger in the binary log than before (as we have to store the xid)

Other things:
- MDEV-25588 Atomic DDL: Binlog query event written upon recovery is corrupt
  fixed (in the original commit).
2021-05-19 22:54:12 +02:00
Monty
47010ccffa MDEV-23842 Atomic RENAME TABLE
- Major rewrite of ddl_log.cc and ddl_log.h
  - ddl_log.cc described in the beginning how the recovery works.
  - ddl_log.log has unique signature and is dynamic. It's easy to
    add more information to the header and other ddl blocks while still
    being able to execute old ddl entries.
  - IO_SIZE for ddl blocks is now dynamic. Can be changed without affecting
    recovery of old logs.
  - Code is more modular and is now usable outside of partition handling.
  - Renamed log file to dll_recovery.log and added option --log-ddl-recovery
    to allow one to specify the path & filename.
- Added ddl_log_entry_phase[], number of phases for each DDL action,
  which allowed me to greatly simply set_global_from_ddl_log_entry()
- Changed how strings are stored in log entries, which allows us to
  store much more information in a log entry.
- ddl log is now always created at start and deleted on normal shutdown.
  This simplices things notable.
- Added probes debug_crash_here() and debug_simulate_error() to simply
  crash testing and allow crash after a given number of times a probe
  is executed. See comments in debug_sync.cc and rename_table.test for
  how this can be used.
- Reverting failed table and view renames is done trough the ddl log.
  This ensures that the ddl log is tested also outside of recovery.
- Added helper function 'handler::needs_lower_case_filenames()'
- Extend binary log with Q_XID events. ddl log handling is using this
  to check if a ddl log entry was logged to the binary log (if yes,
  it will be deleted from the log during ddl_log_close_binlogged_events()
- If a DDL entry fails 3 time, disable it. This is to ensure that if
  we have a crash in ddl recovery code the server will not get stuck
  in a forever crash-restart-crash loop.

mysqltest.cc changes:
- --die will now replace $variables with their values
- $error will contain the error of the last failed statement

storage engine changes:
- maria_rename() was changed to be more robust against crashes during
  rename.
2021-05-19 22:54:12 +02:00
Monty
d754d3d951 Less noise in the error log
- Updated error messages for recovery
- Changed printing of debug sync point information to make it fit 80 char
2021-05-19 22:54:12 +02:00
Monty
249263525f Avoid creating the .frm file twice in some cases
Other things:
- Updated code comments & fixed indentation
- Removed an old QQ (temporary) comment that does not apply anymore
2021-05-19 22:54:12 +02:00
Monty
a206658b98 Change CHARSET_INFO character set and collaction names to LEX_CSTRING
This change removed 68 explict strlen() calls from the code.

The following renames was done to ensure we don't use the old names
when merging code from earlier releases, as using the new variables
for print function could result in crashes:
- charset->csname renamed to charset->cs_name
- charset->name renamed to charset->coll_name

Almost everything where mechanical changes except:
- Changed to use the new Protocol::store(LEX_CSTRING..) when possible
- Changed to use field->store(LEX_CSTRING*, CHARSET_INFO*) when possible
- Changed to use String->append(LEX_CSTRING&) when possible

Other things:
- There where compiler issues with ensuring that all character set names
  points to the same string: gcc doesn't allow one to use integer constants
  when defining global structures (constant char * pointers works fine).
  To get around this, I declared defines for each character set name
  length.
2021-05-19 22:54:07 +02:00
Monty
b6ff139aa3 Reduce usage of strlen()
Changes:
- To detect automatic strlen() I removed the methods in String that
  uses 'const char *' without a length:
  - String::append(const char*)
  - Binary_string(const char *str)
  - String(const char *str, CHARSET_INFO *cs)
  - append_for_single_quote(const char *)
  All usage of append(const char*) is changed to either use
  String::append(char), String::append(const char*, size_t length) or
  String::append(LEX_CSTRING)
- Added STRING_WITH_LEN() around constant string arguments to
  String::append()
- Added overflow argument to escape_string_for_mysql() and
  escape_quotes_for_mysql() instead of returning (size_t) -1 on overflow.
  This was needed as most usage of the above functions never tested the
  result for -1 and would have given wrong results or crashes in case
  of overflows.
- Added Item_func_or_sum::func_name_cstring(), which returns LEX_CSTRING.
  Changed all Item_func::func_name()'s to func_name_cstring()'s.
  The old Item_func_or_sum::func_name() is now an inline function that
  returns func_name_cstring().str.
- Changed Item::mode_name() and Item::func_name_ext() to return
  LEX_CSTRING.
- Changed for some functions the name argument from const char * to
  to const LEX_CSTRING &:
  - Item::Item_func_fix_attributes()
  - Item::check_type_...()
  - Type_std_attributes::agg_item_collations()
  - Type_std_attributes::agg_item_set_converter()
  - Type_std_attributes::agg_arg_charsets...()
  - Type_handler_hybrid_field_type::aggregate_for_result()
  - Type_handler_geometry::check_type_geom_or_binary()
  - Type_handler::Item_func_or_sum_illegal_param()
  - Predicant_to_list_comparator::add_value_skip_null()
  - Predicant_to_list_comparator::add_value()
  - cmp_item_row::prepare_comparators()
  - cmp_item_row::aggregate_row_elements_for_comparison()
  - Cursor_ref::print_func()
- Removes String_space() as it was only used in one cases and that
  could be simplified to not use String_space(), thanks to the fixed
  my_vsnprintf().
- Added some const LEX_CSTRING's for common strings:
  - NULL_clex_str, DATA_clex_str, INDEX_clex_str.
- Changed primary_key_name to a LEX_CSTRING
- Renamed String::set_quick() to String::set_buffer_if_not_allocated() to
  clarify what the function really does.
- Rename of protocol function:
  bool store(const char *from, CHARSET_INFO *cs) to
  bool store_string_or_null(const char *from, CHARSET_INFO *cs).
  This was done to both clarify the difference between this 'store' function
  and also to make it easier to find unoptimal usage of store() calls.
- Added Protocol::store(const LEX_CSTRING*, CHARSET_INFO*)
- Changed some 'const char*' arrays to instead be of type LEX_CSTRING.
- class Item_func_units now used LEX_CSTRING for name.

Other things:
- Fixed a bug in mysql.cc:construct_prompt() where a wrong escape character
  in the prompt would cause some part of the prompt to be duplicated.
- Fixed a lot of instances where the length of the argument to
  append is known or easily obtain but was not used.
- Removed some not needed 'virtual' definition for functions that was
  inherited from the parent. I added override to these.
- Fixed Ordered_key::print() to preallocate needed buffer. Old code could
  case memory overruns.
- Simplified some loops when adding char * to a String with delimiters.
2021-05-19 22:27:48 +02:00
Alexey Botchkov
d2e2219e46 MDEV-25146 JSON_TABLE: Non-descriptive + wrong error messages upon trying to store array or object.
More informative messages added.
Do not issue additional messages if already handled.
2021-04-21 10:21:45 +04:00
Alexey Botchkov
e9fd327ee3 MDEV-17399 Add support for JSON_TABLE.
The specific table handler for the table functions was introduced,
and used to implement JSON_TABLE.
2021-04-21 10:21:43 +04:00
Marko Mäkelä
d2e2d32933 Merge 10.5 into 10.6 2021-04-14 12:32:27 +03:00
Marko Mäkelä
6c3e860cbf Merge 10.4 into 10.5 2021-04-14 11:35:39 +03:00
Marko Mäkelä
5008171b05 Merge 10.3 into 10.4 2021-04-14 10:33:59 +03:00
Aleksey Midenkov
77ffbbca49 MDEV-25172 Wrong error message for ADD COLUMN .. AS ROW START
Handle one more condition in fix_alter_info() for non-versioned table
and produce ER_VERS_NOT_VERSIONED error.
2021-03-31 21:25:41 +03:00
Aleksey Midenkov
0c99e6e9a6 MDEV-22562 Assertion `next_insert_id == 0' upon UPDATE on system-versioned table
Don't update autoinc counter on history row insert. Uniqueness is kept
due to merge with row_end.
2021-03-31 21:25:36 +03:00
Marko Mäkelä
2ad61c6782 Merge 10.5 into 10.6 2021-03-29 16:16:12 +03:00
Marko Mäkelä
e8b7fceb82 MDEV-24302: RESET MASTER hangs
Starting with MariaDB 10.5, roughly after MDEV-23855 was fixed,
we are observing sporadic hangs during the execution of the
RESET MASTER statement. We are hoping to fix the hangs with these
changes, but due to the rather infrequent occurrence of the hangs
and our inability to reliably reproduce the hangs, we cannot be
sure of this.

What we do know is that innodb_force_recovery=2 (or a larger setting)
will prevent srv_master_callback (the former srv_master_thread) from
running. In that mode, periodic log flushes would never occur and
RESET MASTER could hang indefinitely. That is demonstrated by the new
test case that was developed by Andrei Elkin. We fix this case by
implementing a special case for it.

This also includes some code cleanup and renames of misleadingly
named code. The interface has nothing to do with log checkpoints in
the storage engine; it is only about requesting log writes to be
persistent.

handlerton::commit_checkpoint_request,
commit_checkpoint_notify_ha(): Remove the unused parameter hton.

log_requests.start: Replaces pending_checkpoint_list.
log_requests.end: Replaces pending_checkpoint_list_end.
log_requests.mutex: Replaces pending_checkpoint_mutex.

log_flush_notify_and_unlock(), log_flush_notify(): Replaces
innobase_mysql_log_notify().  The new implementation should be
functionally equivalent to the old one.

innodb_log_flush_request(): Replaces innobase_checkpoint_request().
Implement a fast path for common cases, and reduce the mutex hold time.
POSSIBLE FIX OF THE HANG: We will invoke commit_checkpoint_notify_ha()
for the current request if it is already satisfied, as well as invoke
log_flush_notify_and_unlock() for any satisfied requests.

log_write(): Invoke log_flush_notify() when the write is already durable.
This was missing WITH_PMEM when the log is in persistent memory.

Reviewed by: Vladislav Vaintroub
2021-03-29 15:16:23 +03:00
Varun Gupta
f691d9865b MDEV-7317: Make an index ignorable to the optimizer
This feature adds the functionality of ignorability for indexes.
Indexes are not ignored be default.

To control index ignorability explicitly for a new index,
use IGNORE or NOT IGNORE as part of the index definition for
CREATE TABLE, CREATE INDEX, or ALTER TABLE.

Primary keys (explicit or implicit) cannot be made ignorable.

The table INFORMATION_SCHEMA.STATISTICS get a new column named IGNORED that
would store whether an index needs to be ignored or not.
2021-03-04 22:50:00 +05:30
Sergei Golubchik
25d9d2e37f Merge branch 'bb-10.4-release' into bb-10.5-release 2021-02-15 16:43:15 +01:00
Sergei Golubchik
00a313ecf3 Merge branch 'bb-10.3-release' into bb-10.4-release
Note, the fix for "MDEV-23328 Server hang due to Galera lock conflict resolution"
was null-merged. 10.4 version of the fix is coming up separately
2021-02-12 17:44:22 +01:00
Sergei Golubchik
60ea09eae6 Merge branch '10.2' into 10.3 2021-02-01 13:49:33 +01:00
Sergei Golubchik
0d8bd7cc3a MDEV-18428 Memory: If transactional=0 is specified in CREATE TABLE, it is not possible to ALTER TABLE
* be strict in CREATE TABLE, just like in ALTER TABLE, because
  CREATE TABLE, just like ALTER TABLE, can be rolled back for any engine
* but don't auto-convert warnings into errors for engine warnings
  (handler::create) - this matches ALTER TABLE behavior
* and not when creating a default record, these errors are handled
  specially (and replaced with ER_INVALID_DEFAULT)
* always issue a Note when a non-unique key is truncated, because it's
  not a Warning that can be converted to an Error. Before this commit
  it was a Note for blobs and a Warning for all other data types.
2021-01-11 21:54:47 +01:00
Sergei Golubchik
9b750dcbd8 MDEV-23536 Race condition between KILL and transaction commit
Server part:
  kill_handlerton() was accessing thd->ha_data[] for some other thd,
  while it could be concurrently modified by its owner thd.

  protect thd->ha_data[] modifications with a mutex.
  require this mutex when accessing thd->ha_data[] from kill_handlerton.

InnoDB part:
  on close_connection, detach trx from thd before freeing the trx
2021-01-11 21:54:47 +01:00
Oleksandr Byelkin
02e7bff882 Merge commit '10.4' into 10.5 2021-01-06 10:53:00 +01:00
Oleksandr Byelkin
478b83032b Merge branch '10.3' into 10.4 2020-12-25 09:13:28 +01:00
Oleksandr Byelkin
25561435e0 Merge branch '10.2' into 10.3 2020-12-23 19:28:02 +01:00
Rucha Deodhar
74223c33d1 MDEV-23209: Assertion `!is_set() || (m_status == DA_OK_BULK && is_bulk_op())'
failed in Diagnostics_area::set_ok_status on INSERT

Analysis: Error is not returned when strict mode is enabled and value is
truncated because double is outside range.
Fix: Return HA_ERR_AUTOINC_ERANGE if the error was reported when double is
outside range.
2020-12-15 13:00:24 +05:30
Sergei Petrunia
6859e80df7 MDEV-24351: S3, same-backend replication: Dropping a table on master...
..causes error on slave.
Cause: if the master doesn't have the frm file for the table,
DROP TABLE code will call ha_delete_table_force() to drop the table
in all available storage engines.
The issue was that this code path didn't check for
HTON_TABLE_MAY_NOT_EXIST_ON_SLAVE flag for the storage engine,
and so did not add "... IF EXISTS" to the statement that's written
to the binary log.  This can cause error on the slave when it tries to
drop a table that's already gone.
2020-12-08 17:58:22 +03:00
Marko Mäkelä
6a1e655cb0 Merge 10.4 into 10.5 2020-12-02 18:29:49 +02:00
Marko Mäkelä
589cf8dbf3 Merge 10.3 into 10.4 2020-12-01 19:51:14 +02:00
Sergei Golubchik
00f54b56b1 cleanup: RAII helper for changing thd->count_cuted_rows 2020-11-25 22:19:59 +01:00
Nikita Malyavin
f244b499e7 handler: move row change start signal down after the checks 2020-11-02 14:21:08 +10:00
Nikita Malyavin
afca976885 MDEV-22639 Assertion failed in ha_check_overlaps upon multi-table update
After Sergei's cleanup this assertion is not actual anymore -- we can't
predict if the handler was used for lookup, especially in multi-update
scenario.

`position(old_data)` is made earlier in `ha_check_overlaps`, therefore it
is guaranteed that we compare right refs.
2020-11-02 14:11:43 +10:00
Nikita Malyavin
d543363f25 MDEV-22714 Assertion failed upon multi-update on table WITHOUT OVERLAPS
The problem here was that ha_check_overlaps internally uses ha_index_read,
which in case of fail overwrites table->status. Even though the handlers
are different, they share a common table, so the value is anyway spoiled.
This is bad, and table->status is badly designed and overweighted by
functionality, but nothing can be done with it, since the code related to
this logic is ancient and it's impossible to extract it with normal effort.

So let's just save and restore the value in ha_update_row before and after
the checks.

Other operations like INSERT and simple UPDATE are not in risk, since they
don't use this table->status approach.
DELETE does not do any unique checks, so it's also safe.
2020-11-02 14:11:42 +10:00
Marko Mäkelä
1657b7a583 Merge 10.4 to 10.5 2020-10-22 17:08:49 +03:00
Marko Mäkelä
46957a6a77 Merge 10.3 into 10.4 2020-10-22 13:27:18 +03:00
Marko Mäkelä
e3d692aa09 Merge 10.2 into 10.3 2020-10-22 08:26:28 +03:00
Daniele Sciascia
fdf87973cb MDEV-23081 Stray XA transactions at startup, with wsrep_on=OFF
Change xarecover_handlerton so that transaction with WSREP prefixed
xids are rolled back when Galera is disabled.

Reviewd-by: Jan Lindström <jan.lindstrom@mariadb.com>
2020-10-21 16:29:07 +03:00
Marko Mäkelä
620ea816ad Merge 10.1 into 10.2 2020-10-21 14:02:04 +03:00
Monty
71d263a198 MDEV-23691 S3 storage engine: delayed slave can drop the table
This commit fixed the problems with S3 after the "DROP TABLE FORCE" changes.
It also fixes all failing replication S3 tests.

A slave is delayed if it is trying to execute replicated queries on a
table that is already converted to S3 by the master later in the binlog.

Fixes for replication events on S3 tables for delayed slaves:
- INSERT and INSERT ... SELECT and CREATE TABLE are ignored but written
  to the binary log.   UPDATE & DELETE will be fixed in a future commit.

Other things:
- On slaves with --s3-slave-ignore-updates set, allow S3 tables to be
  opened in read-write mode. This was done to be able to
  ignore-but-replicate queries like insert.  Without this change any
  open of an S3 table failed with 'Table is read only' which is too
  early to be able to replicate the original query.
- Errors are now printed if handler::extra() call fails in
  wait_while_tables_are_used().
- Error message for row changes are changed from HA_ERR_WRONG_COMMAND
  to HA_ERR_TABLE_READONLY.
- Disable some maria_extra() calls for S3 tables. This could cause
  S3 tables to fail in some cases.
- Added missing thr_lock_delete() to ma_open() in case of failure.
- Removed from mysql_prepare_insert() the not needed argument 'table'.
2020-10-21 03:09:29 +03:00
Aleksey Midenkov
9b46d8e5c4 MDEV-23968 CREATE TEMPORARY TABLE .. LIKE (system versioned table) returns error if unique index is defined in the table
- Remove row_start/row_end from keys in fix_create_like();

- Disable manual adding of implicit row_start/row_end to indexes on
  CREATE TABLE. INVISIBLE_SYSTEM fields are unoperable by user;

- Fix memory leak on allocation of Key_part_spec.
2020-10-20 10:49:54 +03:00
Sergei Petrunia
3e807d255e MDEV-23938: innodb row_search_idx_cond_check handle ICP_ABORTED_BY_USER
- row_search_mvcc() should return DB_INTERRUPTED when it got killed.
- Add a syncpoint for the ICP check.
- Add test coverage for killed-during-ICP-check scenario

Backport of MDEV-22761 fixes for ICP from 10.4 commits:
* a6f956488c
* c03885cd9c

XtraDB was fixed in deb3b9a174

Reviewer: Daniel Black
2020-10-16 09:44:03 +11:00
Sergei Petrunia
c03885cd9c MDEV-22761: innodb row_search_idx_cond_check handle CHECK_ABORTED_BY_USER
Part #2:
- row_search_mvcc() should return DB_INTERRUPTED when it got
- Move the sync point from innodb internals to
  handler_rowid_filter_check() where other storage engines can use
  it too
- Add a similar syncpoint for the ICP check.
- Add a bigger test and test coverage for Rowid Filter with MyISAM
- Add test coverage for killed-during-ICP-check scenario
2020-10-14 15:14:46 +03:00