1
0
mirror of https://github.com/MariaDB/server.git synced 2025-12-01 17:39:21 +03:00
Commit Graph

2165 Commits

Author SHA1 Message Date
Marko Mäkelä
921c5e9314 Merge bb-10.2-ext into 10.3
MDEV-11415 Remove excessive undo logging during ALTER TABLE…ALGORITHM=COPY

Move a test from innodb.rename_table_debug to innodb.alter_copy.

ha_innobase::extra(HA_EXTRA_BEGIN_ALTER_COPY): Register id-versioned
tables so that mysql.transaction_registry will be updated, even for
empty tables that are subjected to ALTER TABLE…ALGORITHM=COPY.
2018-01-30 21:26:53 +02:00
Marko Mäkelä
33714d2065 Merge bb-10.2-ext into 10.3 2018-01-30 21:04:48 +02:00
Marko Mäkelä
0c1f220611 Merge 10.2 into bb-10.2-ext 2018-01-30 20:47:12 +02:00
Marko Mäkelä
0ba6aaf030 MDEV-11415 Remove excessive undo logging during ALTER TABLE…ALGORITHM=COPY
If a crash occurs during ALTER TABLE…ALGORITHM=COPY, InnoDB would spend
a lot of time rolling back writes to the intermediate copy of the table.
To reduce the amount of busy work done, a work-around was introduced in
commit fd069e2bb3 in MySQL 4.1.8 and 5.0.2,
to commit the transaction after every 10,000 inserted rows.

A proper fix would have been to disable the undo logging altogether and
to simply drop the intermediate copy of the table on subsequent server
startup. This is what happens in MariaDB 10.3 with MDEV-14717,MDEV-14585.
In MariaDB 10.2, the intermediate copy of the table would be left behind
with a name starting with the string #sql.

This is a backport of a bug fix from MySQL 8.0.0 to MariaDB,
contributed by jixianliang <271365745@qq.com>.

Unlike recent MySQL, MariaDB supports ALTER IGNORE. For that operation
InnoDB must for now keep the undo logging enabled, so that the latest
row can be rolled back in case of an error.

In Galera cluster, the LOAD DATA statement will retain the existing
behaviour and commit the transaction after every 10,000 rows if
the parameter wsrep_load_data_splitting=ON is set. The logic to do
so (the wsrep_load_data_split() function and the call
handler::extra(HA_EXTRA_FAKE_START_STMT)) are joint work
by Ji Xianliang and Marko Mäkelä.

The original fix:

Author: Thirunarayanan Balathandayuthapani <thirunarayanan.balathandayuth@oracle.com>
Date:   Wed Dec 2 16:09:15 2015 +0530

Bug#17479594 AVOID INTERMEDIATE COMMIT WHILE DOING ALTER TABLE ALGORITHM=COPY

Problem:

During ALTER TABLE, we commit and restart the transaction for every
10,000 rows, so that the rollback after recovery would not take so long.

Fix:

Suppress the undo logging during copy alter operation. If fts_index is
present then insert directly into fts auxiliary table rather
than doing at commit time.

ha_innobase::num_write_row: Remove the variable.

ha_innobase::write_row(): Remove the hack for committing every 10000 rows.

row_lock_table_for_mysql(): Remove the extra 2 parameters.

lock_get_src_table(), lock_is_table_exclusive(): Remove.

Reviewed-by: Marko Mäkelä <marko.makela@oracle.com>
Reviewed-by: Shaohua Wang <shaohua.wang@oracle.com>
Reviewed-by: Jon Olav Hauglid <jon.hauglid@oracle.com>
2018-01-30 20:24:23 +02:00
Marko Mäkelä
6d390bab4a Merge 10.2 into bb-10.2-ext 2018-01-30 20:18:25 +02:00
Marko Mäkelä
706ed8552d Revert "MDEV-6928: Add trx pointer to struct mtr_t"
This reverts commit 3486135bb5.

The commit comment ended in the words: "This is needed later."
Apparently the "later" never arrived.
2018-01-29 11:05:17 +02:00
Marko Mäkelä
041a32abcd Remove trx_mod_tables_t::vers_by_trx
Only invoke set_versioned() on trx_id versioned tables.

dict_table_t::versioned_by_id(): New accessor, to determine if
a table is system versioned by transaction ID.
2018-01-28 22:21:48 +02:00
Marko Mäkelä
1da063a45b Remove unused metadata for non-existing sync_thread_mutex 2018-01-28 22:17:54 +02:00
Marko Mäkelä
b8c92d752c Fixed compiler warning 2018-01-26 10:50:20 +04:00
Sergey Vojtovich
ce04790065 MDEV-14482 - Cache line contention on ut_rnd_ulint_counter()
InnoDB RNG maintains global state, causing otherwise unnecessary bus
traffic. Even worse this is cross-mutex traffic. That is different
mutexes suffer from contention.

Fixed delay of 4 was verified to give best throughput by OLTP update
index and read-write benchmarks on Intel Broadwell (2/20/40) and
ARM (1/46/46).
2018-01-26 10:25:33 +04:00
Marko Mäkelä
92d233a512 MDEV-15061 TRUNCATE must honor InnoDB table locks
Traditionally, DROP TABLE and TRUNCATE TABLE discarded any locks that
may have been held on the table. This feels like an ACID violation.
Probably most occurrences of it were prevented by meta-data locks (MDL)
which were introduced in MySQL 5.5.

dict_table_t::n_foreign_key_checks_running: Reduce the number of
non-debug checks.

lock_remove_all_on_table(), lock_remove_all_on_table_for_trx(): Remove.

ha_innobase::truncate(): Acquire an exclusive InnoDB table lock
before proceeding. DROP TABLE and DISCARD/IMPORT were already doing
this.

row_truncate_table_for_mysql(): Convert the already started transaction
into a dictionary operation, and do not invoke lock_remove_all_on_table().

row_mysql_table_id_reassign(): Do not call lock_remove_all_on_table().
This function is only used in ALTER TABLE...DISCARD/IMPORT TABLESPACE,
which is already holding an exclusive InnoDB table lock.

TODO: Make n_foreign_key_checks running a debug-only variable.
This would require two fixes:
(1) DROP TABLE: Exclusively lock the table beforehand, to prevent
the possibility of concurrently running foreign key checks (which
would acquire a table IS lock and then record S locks).
(2) RENAME TABLE: Find out if n_foreign_key_checks_running>0 actually
constitutes a potential problem.
2018-01-25 22:43:43 +02:00
Vicențiu Ciorbaru
61e2f43e05 Remove ut_win_init_time from innodb
The patch was brought in from 5.6.39 merge and we don't need it in
MariaDB
2018-01-25 19:50:13 +02:00
Vicențiu Ciorbaru
f775ee6006 Fix innodb compilation failure on Windows
We don't need to print an error when loading kernel32.dll. If we can't
load it we'll automatically crash. Reviewed by wlad@mariadb.com
2018-01-25 11:34:56 +02:00
Vicențiu Ciorbaru
3699a4b5c0 Merge branch 'merge-innodb-5.6' into 10.0 2018-01-24 18:23:25 +02:00
Vicențiu Ciorbaru
b20f821e07 Fix Innodb ASAN error on init
Backport 7c03edf2fe from xtradb to innodb
2018-01-24 15:18:36 +02:00
Marko Mäkelä
9875d5c3e1 Merge bb-10.2-ext into 10.3 2018-01-24 14:00:33 +02:00
Vicențiu Ciorbaru
3dfe148074 5.6.39 2018-01-23 17:43:37 +02:00
Alexander Barkov
ec6b8c546a Merge remote-tracking branch 'origin/10.2' into bb-10.2-ext 2018-01-23 17:43:12 +04:00
Marko Mäkelä
89ae5d7f2f Allocate mutex_monitor, create_tracker statically 2018-01-22 16:30:38 +02:00
Marko Mäkelä
d04e1d4bdc MDEV-15029 XA COMMIT and XA ROLLBACK operate on freed transaction object
innobase_commit_by_xid(), innobase_rollback_by_xid(): Decrement
the reference count before freeing the transaction object to the pool.
Failure to do so might corrupt the transaction bookkeeping
if trx_create_low() returns the same object to another thread
before we are done with it.

trx_sys_close(): Detach the recovered XA PREPARE transactions from
trx_sys->rw_trx_list before freeing them.
2018-01-22 16:25:37 +02:00
Sergey Vojtovich
4dc30f3c17 MDEV-15019 - InnoDB: store ReadView on trx
This will allow us to reduce critical section protected by
trx_sys.mutex:
- no need to maintain global m_free list
- eliminate if (trx->read_view == NULL) condition.

On x86_64 sizeof(Readview) is 144 mostly due to padding, sizeof(trx_t)
with ReadView is 1200.

Also don't close ReadView for read-write transactions, just mark it
closed similarly to read-only.

Clean-up: removed n_prepared_recovered_trx and n_prepared_trx, which
accidentally re-appeared after some rebase.
2018-01-22 16:23:15 +04:00
Marko Mäkelä
c425dcd8f2 Merge 10.2 into bb-10.2-ext 2018-01-22 09:04:32 +02:00
Marko Mäkelä
4f8555f1f6 MDEV-14941 Timeouts on persistent statistics tables caused by MDEV-14511
MDEV-14511 tried to avoid some consistency problems related to InnoDB
persistent statistics. The persistent statistics are being written by
an InnoDB internal SQL interpreter that requires the InnoDB data dictionary
cache to be locked.

Before MDEV-14511, the statistics were written during DDL in separate
transactions, which could unnecessarily reduce performance (each commit
would require a redo log flush) and break atomicity, because the statistics
would be updated separately from the dictionary transaction.

However, because it is unacceptable to hold the InnoDB data dictionary
cache locked while suspending the execution for waiting for a
transactional lock (in the mysql.innodb_index_stats or
mysql.innodb_table_stats tables) to be released, any lock conflict
was immediately be reported as "lock wait timeout".

To fix MDEV-14941, an attempt to reduce these lock conflicts by acquiring
transactional locks on the user tables in both the statistics and DDL
operations was made, but it would still not entirely prevent lock conflicts
on the mysql.innodb_index_stats and mysql.innodb_table_stats tables.

Fixing the remaining problems would require a change that is too intrusive
for a GA release series, such as MariaDB 10.2.

Thefefore, we revert the change MDEV-14511. To silence the
MDEV-13201 assertion, we use the pre-existing flag trx_t::internal.
2018-01-22 08:58:47 +02:00
Monty
27a5d96bcb Merge remote-tracking branch 'origin/10.2' into bb-10.2-ext
Conflicts:
	sql/sp_rcontext.cc
2018-01-21 20:32:48 +02:00
Monty
f67b8273c0 Fixed wrong arguments to printf in InnoDB 2018-01-21 20:22:00 +02:00
Sergey Vojtovich
ec32c05072 Get rid of trx->read_view pointer juggling
trx->read_view|= 1 was done in a silly attempt to fix race condition
where trx->read_view was closed without trx_sys.mutex lock by read-only
trasnactions.

This just made the problem less likely to happen. In fact there was race
condition in const version of trx_get_read_view(): pointer may change to
garbage any moment after MVCC::is_view_active(trx->read_view) check and
before this function returns.

This patch doesn't fix this race condition, but rather makes it's
consequences less destructive.
2018-01-20 16:10:38 +04:00
Sergey Vojtovich
64048bafe0 Removed purge_trx_id_age and purge_view_trx_id_age
These were unused status variables available in debug builds only.
Also removed trx_sys.rw_max_trx_id: not used anymore.
2018-01-20 16:10:37 +04:00
Sergey Vojtovich
db5bb785f9 Allocate trx_sys.mvcc at link time
trx_sys.mvcc was allocated dynamically for no good reason.
2018-01-20 16:10:36 +04:00
Marko Mäkelä
f8882cce93 Replace trx_sys_t* trx_sys with trx_sys_t trx_sys
There is only one transaction system object in InnoDB.
Allocate the storage for it at link time, not at runtime.

lock_rec_fetch_page(): Use the correct fetch mode BUF_GET.
Pages may never be deallocated from a tablespace while
record locks are pointing to them.
2018-01-20 16:10:36 +04:00
Sergey Vojtovich
7078203389 MDEV-14756 - Remove trx_sys_t::rw_trx_list
Use atomic operations when accessing trx_sys_t::max_trx_id. We can't yet
move trx_sys_t::get_new_trx_id() out of mutex because it must be updated
atomically along with trx_sys_t::rw_trx_ids.
2018-01-20 16:10:35 +04:00
Sergei Golubchik
8f102b584d Merge branch 'github/10.3' into bb-10.3-temporal 2018-01-17 00:45:02 +01:00
Marko Mäkelä
6dd302d164 Merge bb-10.2-ext into 10.3 2018-01-11 19:44:41 +02:00
Marko Mäkelä
cca611d1c0 Merge 10.2 into bb-10.2-ext 2018-01-11 18:00:31 +02:00
Sergey Vojtovich
380069c235 MDEV-14638 - Replace trx_sys_t::rw_trx_set with LF_HASH
trx_sys_t::rw_trx_set is implemented as std::set, which does a few quite
expensive operations under trx_sys_t::mutex protection: e.g. malloc/free
when adding/removing elements. Traversing b-tree is not that cheap either.

This has negative scalability impact, which is especially visible when running
oltp_update_index.lua benchmark on a ramdisk.

To reduce trx_sys_t::mutex contention std::set is replaced with LF_HASH. None
of LF_HASH operations require trx_sys_t::mutex (nor any other global mutex)
protection.

Another interesting issue observed with std::set is reproducible ~2% performance
decline after benchmark is ran for ~60 seconds. With LF_HASH results are stable.

All in all this patch optimises away one of three trx_sys->mutex locks per
oltp_update_index.lua query. The other two critical sections became smaller.

Relevant clean-ups:

Replaced rw_trx_set iteration at startup with local set. The latter is needed
because values inserted to rw_trx_list must be ordered by trx->id.

Removed redundant conditions from trx_reference(): it is (and even was) never
called with transactions that have trx->state == TRX_STATE_COMMITTED_IN_MEMORY.
do_ref_count doesn't (and probably even didn't) make any sense: now it is called
only when reference counter increment is actually requested.

Moved condition out of mutex in trx_erase_lists().

trx_rw_is_active(), trx_rw_is_active_low() and trx_get_rw_trx_by_id() were
greatly simplified and replaced by appropriate trx_rw_hash_t methods.

Compared to rw_trx_set, rw_trx_hash holds transactions only in PREPARED or
ACTIVE states. Transactions in COMMITTED state were required to be found
at InnoDB startup only. They are now looked up in the local set.

Removed unused trx_assert_recovered().

Removed unused innobase_get_trx() declaration.

Removed rather semantically incorrect trx_sys_rw_trx_add().

Moved information printout from trx_sys_init_at_db_start() to
trx_lists_init_at_db_start().
2018-01-11 12:30:53 +04:00
Marko Mäkelä
dfde5ae912 MDEV-14130 InnoDB messages should not refer to the MySQL 5.7 manual
Replace most occurrences of the REFMAN macro. For some pages there
is no replacement yet.
2018-01-10 13:53:44 +02:00
Aleksey Midenkov
c59c1a0736 System Versioning 1.0 pre8
Merge branch '10.3' into trunk
2018-01-10 12:36:55 +03:00
Marko Mäkelä
fa7d85bb87 Merge bb-10.2-ext into 10.3 2018-01-05 22:52:06 +02:00
Monty
e9a2082634 Merge remote-tracking branch 'origin/10.2' into bb-10.2-ext
Conflicts:
	mysql-test/r/cte_nonrecursive.result
	mysql-test/suite/galera/r/galera_bf_abort.result
	mysql-test/suite/galera/r/galera_bf_abort_get_lock.result
	mysql-test/suite/galera/r/galera_bf_abort_sleep.result
	mysql-test/suite/galera/r/galera_enum.result
	mysql-test/suite/galera/r/galera_fk_conflict.result
	mysql-test/suite/galera/r/galera_insert_multi.result
	mysql-test/suite/galera/r/galera_many_indexes.result
	mysql-test/suite/galera/r/galera_mdl_race.result
	mysql-test/suite/galera/r/galera_nopk_bit.result
	mysql-test/suite/galera/r/galera_nopk_blob.result
	mysql-test/suite/galera/r/galera_nopk_large_varchar.result
	mysql-test/suite/galera/r/galera_nopk_unicode.result
	mysql-test/suite/galera/r/galera_pk_bigint_signed.result
	mysql-test/suite/galera/r/galera_pk_bigint_unsigned.result
	mysql-test/suite/galera/r/galera_serializable.result
	mysql-test/suite/galera/r/galera_toi_drop_database.result
	mysql-test/suite/galera/r/galera_toi_lock_exclusive.result
	mysql-test/suite/galera/r/galera_toi_truncate.result
	mysql-test/suite/galera/r/galera_unicode_pk.result
	mysql-test/suite/galera/r/galera_var_auto_inc_control_off.result
	mysql-test/suite/galera/r/galera_wsrep_log_conficts.result
	sql/field.cc
	sql/rpl_gtid.cc
	sql/share/errmsg-utf8.txt
	sql/sql_acl.cc
	sql/sql_parse.cc
	sql/sql_partition_admin.cc
	sql/sql_prepare.cc
	sql/sql_repl.cc
	sql/sql_table.cc
	sql/sql_yacc.yy
2018-01-05 16:52:40 +02:00
Monty
5e0b13d173 Fixed wrong arguments to printf and related functions
Other things, mainly to get
create_mysqld_error_find_printf_error tool to work:

- Added protection to not include mysqld_error.h twice
- Include "unireg.h" instead of "mysqld_error.h" in server
- Added protection if ER_XX messages are already defined
- Removed wrong calls to my_error(ER_OUTOFMEMORY) as
  my_malloc() and my_alloc will do this automatically
- Added missing %s to ER_DUP_QUERY_NAME
- Removed old and wrong calls to my_strerror() when using
  MY_ERROR_ON_RENAME (wrong merge)
- Fixed deadlock error message from Galera. Before the extra
  information given to ER_LOCK_DEADLOCK was missing because
  ER_LOCK_DEADLOCK doesn't provide any extra information.

I kept #ifdef mysqld_error_find_printf_error_used in sql_acl.h
to make it easy to do this kind of check again in the future
2018-01-04 16:24:09 +02:00
Aleksey Midenkov
5c0a19c873 System Versioning 1.0 pre7
Merge branch '10.3' into trunk
2017-12-21 11:16:42 +03:00
Marko Mäkelä
2534b5cb99 Merge bb-10.2-ext into 10.3 2017-12-20 22:37:24 +02:00
Marko Mäkelä
0bc36758ba MDEV-14717 RENAME TABLE in InnoDB is not crash-safe
InnoDB in MariaDB 10.2 appears to only write MLOG_FILE_RENAME2
redo log records during table-rebuilding ALGORITHM=INPLACE operations.
We must write the records for any .ibd file renames, so that the
operations are crash-safe.

If InnoDB is killed during a RENAME TABLE operation, it can happen that
the transaction for updating the data dictionary will be rolled back.
But, nothing will roll back the renaming of the .ibd file
(the MLOG_FILE_RENAME2 only guarantees roll-forward), or for that matter,
the renaming of the dict_table_t::name in the dict_sys cache. We introduce
the undo log record TRX_UNDO_RENAME_TABLE to fix this.

fil_space_for_table_exists_in_mem(): Remove the parameters
adjust_space, table_id and some code that was trying to work around
these deficiencies.

fil_name_write_rename(): Write a MLOG_FILE_RENAME2 record.

dict_table_rename_in_cache(): Invoke fil_name_write_rename().

trx_undo_rec_copy(): Set the first 2 bytes to the length of the
copied undo log record.

trx_undo_page_report_rename(), trx_undo_report_rename():
Write a TRX_UNDO_RENAME_TABLE record with the old table name.

row_rename_table_for_mysql(): Invoke trx_undo_report_rename()
before modifying any data dictionary tables.

row_undo_ins_parse_undo_rec(): Roll back TRX_UNDO_RENAME_TABLE
by invoking dict_table_rename_in_cache(), which will take care
of both renaming the table and the file.
2017-12-20 22:21:03 +02:00
Eugene Kosov
acdfacee75 IB: TRT is not updated on ADD SYSTEM VERSIONING [fixes #413] 2017-12-20 22:46:28 +03:00
Marko Mäkelä
0436a0ff3c Merge bb-10.2-ext into 10.3 2017-12-19 17:28:22 +02:00
Marko Mäkelä
028e91f380 Merge 10.2 into bb-10.2-ext 2017-12-19 17:12:14 +02:00
Marko Mäkelä
8d70097c21 Merge 10.1 to 10.2
Follow-up fix to MDEV-14008: Let Field_double::val_uint() silently
return 0 on error
2017-12-19 16:48:28 +02:00
Marko Mäkelä
09c5bbf471 Merge 10.0 into 10.1 2017-12-18 20:05:50 +02:00
Aleksey Midenkov
b55a149194 Timestamp-based versioning for InnoDB [closes #209]
* Removed integer_fields check
* Reworked Vers_parse_info::check_sys_fields()
* Misc renames
* versioned as vers_sys_type_t

* Removed versioned_by_sql(), versioned_by_engine()

versioned() works as before;
versioned(VERS_TIMESTAMP) is versioned_by_sql();
versioned(VERS_TRX_ID) is versioned_by_engine().

* create_tmp_table() fix
* Foreign constraints for timestamp-based
* Range auto-specifier fix
* SQL: 1-row partition rotation fix [fixes #260]
* Fix 'drop system versioning, algorithm=inplace'
2017-12-18 19:03:51 +03:00
Alexander Barkov
c1e5fef05d MDEV-14008 Assertion failing: `!is_set() || (m_status == DA_OK_BULK && is_bulk_op()) 2017-12-18 11:25:38 +04:00
Aleksey Midenkov
73606a3977 System Versioning 1.0 pre5 [closes #407]
Merge branch '10.3' into trunk

Both field_visibility and VERS_HIDDEN_FLAG exist independently.

TODO:
VERS_HIDDEN_FLAG should be replaced with SYSTEM_INVISIBLE (or COMPLETELY_INVISIBLE?).
2017-12-15 15:18:59 +03:00