mariadb

mirror of https://github.com/MariaDB/server.git synced 2025-11-30 05:23:50 +03:00

Author	SHA1	Message	Date
Marko Mäkelä	b42294bc64	MDEV-19514 Defer change buffer merge until pages are requested We will remove the InnoDB background operation of merging buffered changes to secondary index leaf pages. Changes will only be merged as a result of an operation that accesses a secondary index leaf page, such as a SQL statement that performs a lookup via that index, or is modifying the index. Also ROLLBACK and some background operations, such as purging the history of committed transactions, or computing index cardinality statistics, can cause change buffer merge. Encryption key rotation will not perform change buffer merge. The motivation of this change is to simplify the I/O logic and to allow crash recovery to happen in the background (MDEV-14481). We also hope that this will reduce the number of "mystery" crashes due to corrupted data. Because change buffer merge will typically take place as a result of executing SQL statements, there should be a clearer connection between the crash and the SQL statements that were executed when the server crashed. In many cases, a slight performance improvement was observed. This is joint work with Thirunarayanan Balathandayuthapani and was tested by Axel Schwenke and Matthias Leich. The InnoDB monitor counter innodb_ibuf_merge_usec will be removed. On slow shutdown (innodb_fast_shutdown=0), we will continue to merge all buffered changes (and purge all undo log history). Two InnoDB configuration parameters will be changed as follows: innodb_disable_background_merge: Removed. This parameter existed only in debug builds. All change buffer merges will use synchronous reads. innodb_force_recovery will be changed as follows: * innodb_force_recovery=4 will be the same as innodb_force_recovery=3 (the change buffer merge cannot be disabled; it can only happen as a result of an operation that accesses a secondary index leaf page). The option used to be capable of corrupting secondary index leaf pages. Now that capability is removed, and innodb_force_recovery=4 becomes 'safe'. * innodb_force_recovery=5 (which essentially hard-wires SET GLOBAL TRANSACTION ISOLATION LEVEL READ UNCOMMITTED) becomes safe to use. Bogus data can be returned to SQL, but persistent InnoDB data files will not be corrupted further. * innodb_force_recovery=6 (ignore the redo log files) will be the only option that can potentially cause persistent corruption of InnoDB data files. Code changes: buf_page_t::ibuf_exist: New flag, to indicate whether buffered changes exist for a buffer pool page. Pages with pending changes can be returned by buf_page_get_gen(). Previously, the changes were always merged inside buf_page_get_gen() if needed. ibuf_page_exists(const buf_page_t&): Check if a buffered changes exist for an X-latched or read-fixed page. buf_page_get_gen(): Add the parameter allow_ibuf_merge=false. All callers that know that they may be accessing a secondary index leaf page must pass this parameter as allow_ibuf_merge=true, unless it does not matter for that caller whether all buffered changes have been applied. Assert that whenever allow_ibuf_merge holds, the page actually is a leaf page. Attempt change buffer merge only to secondary B-tree index leaf pages. btr_block_get(): Add parameter 'bool merge'. All callers of btr_block_get() should know whether the page could be a secondary index leaf page. If it is not, we should avoid consulting the change buffer bitmap to even consider a merge. This is the main interface to requesting index pages from the buffer pool. ibuf_merge_or_delete_for_page(), recv_recover_page(): Replace buf_page_get_known_nowait() with much simpler logic, because it is now guaranteed that that the block is x-latched or read-fixed. mlog_init_t::mark_ibuf_exist(): Renamed from mlog_init_t::ibuf_merge(). On crash recovery, we will no longer merge any buffered changes for the pages that we read into the buffer pool during the last batch of applying log records. buf_page_get_gen_known_nowait(), BUF_MAKE_YOUNG, BUF_KEEP_OLD: Remove. btr_search_guess_on_hash(): Merge buf_page_get_gen_known_nowait() to its only remaining caller. buf_page_make_young_if_needed(): Define as an inline function. Add the parameter buf_pool. buf_page_peek_if_young(), buf_page_peek_if_too_old(): Add the parameter buf_pool. fil_space_validate_for_mtr_commit(): Remove a bogus comment about background merge of the change buffer. btr_cur_open_at_rnd_pos_func(), btr_cur_search_to_nth_level_func(), btr_cur_open_at_index_side_func(): Use narrower data types and scopes. ibuf_read_merge_pages(): Replaces buf_read_ibuf_merge_pages(). Merge the change buffer by invoking buf_page_get_gen().	2019-10-11 17:28:15 +03:00
Marko Mäkelä	d04f2de80a	Merge 10.4 into 10.5	2019-10-11 08:41:36 +03:00
Marko Mäkelä	b05be3ef8c	Add encryption.innodb-redo-badkey,strict_full_crc32 In commit `0f7732d1d1` we introduced a innodb_checksum_algorithm=full_crc32 combination to a number of encryption tests, and also fixed the code accordingly. The default in MariaDB 10.5 is innodb_checksum_algorithm=full_crc32. In a test merge to 10.5, the test encryption.innodb-redo-badkey failed once due to a message that had been added in that commit. Let us introduce a full_crc32 option to that test. And let us use strict_crc32 and strict_full_crc32 instead of the non-strict variants, for the previously augmented tests, to be in line with the earlier tests encryption.corrupted_during_recovery and encryption.innodb_encrypt_temporary_tables.	2019-10-11 08:24:30 +03:00
Marko Mäkelä	09afd3da1a	Merge 10.3 into 10.4	2019-10-10 21:30:40 +03:00
Marko Mäkelä	7f84e3ad75	Merge 10.2 into 10.3	2019-10-10 20:38:44 +03:00
Marko Mäkelä	0f7732d1d1	MDEV-19335 adjustment for innodb_checksum_algorithm=full_crc32 When MDEV-12026 introduced innodb_checksum_algorithm=full_crc32 in MariaDB 10.4, it accidentally added a dependency on buf_page_t::encrypted. Now that the flag has been removed, we must adjust the page-read routine. buf_page_io_complete(): When the full_crc32 page checksum matches but the tablespace ID in the page does not match after decrypting, we should declare it a decryption failure and suppress the page dump output and any attempts to re-read the page.	2019-10-10 15:24:14 +03:00
Marko Mäkelä	c11e5cdd12	Merge 10.3 into 10.4	2019-10-10 11:19:25 +03:00
Michael Widenius	726b1998fc	Fixed feedback_plugin_load to work with staticly loaded plugin	2019-10-10 10:25:32 +03:00
Aleksey Midenkov	a92f3146d2	MDEV-19406 Assertion on updating view of join with versioned table TABLE::mark_columns_needed_for_update(): use_all_columns() assigns pointer of all_set into read_set and write_set, but this is not good since all_set is changed later by TABLE::mark_columns_used_by_index_no_reset(). Do column_bitmaps_signal() whenever we change read_set/write_set.	2019-10-10 00:20:34 +03:00
Aleksey Midenkov	647a38818a	MDEV-16130 wrong error message adding AS ROW START to versioned table	2019-10-10 00:20:34 +03:00
Aleksey Midenkov	75ba5c815d	MDEV-16210 FK constraints on versioned tables use historical rows, which may cause constraint violation Constraint check is done on secondary index update. F.ex. DELETE does row_upd_sec_index_entry() and checks constraints in row_upd_check_references_constraints(). UPDATE is optimized for the case when order is not changed (node->cmpl_info & UPD_NODE_NO_ORD_CHANGE) and doesn't do row_upd_sec_index_entry(), so it doesn't check constraints. Since for versioned DELETE we do UPDATE actually, but expect behaviour of DELETE in terms of constraints, we should deny this optimization to get constraints checked. Fix wrong referenced table check when versioned DELETE inserts history in parent table. Set check_ref to false in this case. Removed unused dup_chk_only argument for row_ins_sec_index_entry() and added check_ref argument. MDEV-18057 fix was superseded by this fix and reverted. foreign.test: All key_type combinations: pk, unique, sec(ondary).	2019-10-10 00:20:34 +03:00
Aleksey Midenkov	6684989801	versioning test suite fixes Preparation for MDEV-16210: replace.test: key_type combinations: PK and UNIQUE. foreign.test: Preparation for key_type combinations. Other fixes: * Merged versioning.update2 into versioning.update; * Removed test2 database and done individual drop instead.	2019-10-10 00:20:34 +03:00
Marko Mäkelä	892378fb9d	Merge 10.2 into 10.3	2019-10-09 13:25:11 +03:00
Jan Lindström	62dce14d15	MDEV-20782 : Galera test failure on galera_sr.galera_sr_mysqldump_sst Add auto increment offset save and restore as node(s) are restarted.	2019-10-09 11:42:50 +03:00
Jan Lindström	44a11a7c08	MDEV-20780 : Galera test failure on galera_sr.galera_sr_ddl_master Fix wait_condition and use repeatable read with wsrep_sync at the end.	2019-10-09 11:41:14 +03:00
Jan Lindström	57b666b2e5	Fix test case wsrep.mdev_6832 we need to wait until wsrep_ready is ON before test can end.	2019-10-08 15:04:39 +03:00
Sachin Setiya	89bd5623b0	MDEV-20582 Asan failure in table_def::calc_field_size MDEV-20591 Already solves this issue, Just enable the test case commented parts	2019-10-08 17:00:54 +05:30
Sachin Setiya	fc33c3cda5	MDEV-20591 Wrong Number of rows in mysqlbinlog output calc_field_event_length should accurately calculate the size of BLOB type fields, Instead of returning just the bytes taken by length it should return length bytes + actual length.	2019-10-08 16:54:48 +05:30
Sachin Setiya	27664ef29d	MDEV-20574 Position of events reported by mysqlbinlog is wrong with encrypted binlogs, SHOW BINLOG EVENTS reports the correct one. Analysis Mysqlbinlog output for encrypted binary log #Q> insert into tab1 values (3,'row 003') #190912 17:36:35 server id 10221 end_log_pos 980 CRC32 0x53bcb3d3 Table_map: `test`.`tab1` mapped to number 19 # at 940 #190912 17:36:35 server id 10221 end_log_pos 1026 CRC32 0xf2ae5136 Write_rows: table id 19 flags: STMT_END_F Here we can see Table_map_log_event ends at 980 but Next event starts at 940. And the reason for that is we do not send START_ENCRYPTION_EVENT to the slave Solution:- Send Start_encryption_log_event as Ignorable_log_event to slave(mysqlbinlog), So that mysqlbinlog can update its log_pos. Since Slave can request multiple FORMAT_DESCRIPTION_EVENT while master does not have so We only update slave master pos when master actually have the FORMAT_DESCRIPTION_EVENT. Similar logic should be applied for START_ENCRYPTION_EVENT. Also added the test case when new server reads the data from old server which does not send START_ENCRYPTION_EVENT to slave. Master Slave Upgrade Scenario. When Slave is updated first, Slave will have extra logic of handling START_ENCRYPTION_EVENT But master willnot be sending START_ENCRYPTION_EVENT. So there will be no issue. When Master is updated first, It will send START_ENCRYPTION_EVENT to slave , But slave will ignore this event in queue_event.	2019-10-08 14:35:34 +05:30
sachin	1e0f09cacb	MDEV-16239 Many test in rpl suite fails Fix rpl_skip_error test. We cant reset Slave_skipped_errors(even with FLUSH STATUS), So instead of absolute slave_skipped_errors we look for delta of slave_skipped_errors Fix rpl.rpl_binlog_errors and binlog_encryption.rpl_binlog_errors We create the $load_file and $load_file2 but we never remove them. Fix rpl_000011.test Instead of real value use delta value , Since flush status wont flush LONGLONG variable. Fix rpl_row_find_row_debug Instead of searching whole log_error_ file we will use search_pattern_in_file which runs pattern search only on latest test run , instead of full file. Fix rpl_ip_mix rpl_ip_mix2 We should call reset slave all because we also want to reset master_host otherwise show slave status wont be empty and making repeat N a failure. Fix rpl_rotate_logs First we have to remove master.info file (cleanup) and second we have to call reset slave all because if we do not call reset slave all then we wont read master.info file beacuse we already have master config in memory. And this makes start slave to pass , which shoud fail becuase its permision is 000 Fix circular_serverid0 test The reason is that ++dbug_rows_event_count == 2 in queue_event does not take --repeat into account. So I have reseted the dbug_rows_event_count in if body.	2019-10-08 13:34:25 +05:30
Sachin	01bf9f8c3d	MDEV-20591 Wrong Number of rows in mysqlbinlog output calc_field_event_length should accurately calculate the size of BLOB type fields, Instead of returning just the bytes taken by length it should return length bytes + actual length.	2019-10-08 13:34:11 +05:30
Jan Lindström	c339487030	Try to fix galera_parallel_simple test case.	2019-10-07 08:47:42 +03:00
Jan Lindström	fe4f766e81	Add wait_condition to wait that node returns to ready state before accessing it.	2019-10-04 14:45:25 +03:00
Marko Mäkelä	627027a674	Merge 10.4 into 10.5	2019-10-04 10:56:47 +03:00
Alexander Barkov	c2d8db66be	MDEV-20735 Allow non-reserved keywords as user defined type names	2019-10-03 16:03:32 +04:00
Alexander Barkov	d168601e83	MDEV-20734 Allow reserved keywords as user defined type names	2019-10-03 14:02:00 +04:00
Jan Lindström	eb0a10b072	Add missing have_debug to galera.MDEV-20225 test case.	2019-10-03 07:03:15 +03:00
Vlad Lesin	edda2fd149	MDEV-20703: mariabackup creates binlog files in server binlog directory on --prepare --export step When "--export" mariabackup option is used, mariabackup starts the server in bootstrap mode to generate .cfg files for the certain innodb tables. The started instance of the server reads options from the file, pointed out in "--defaults-file" mariabackup option. If the server uses the same config file as mariabackup, and binlog is switched on in that config file, then "mariabackup --prepare --export" will create binary log files in the server's binary log directory, what can cause issues. The fix is to add "--skip-log-bin" in mysld options when the server is started to generate .cfg files.	2019-10-01 13:57:24 +03:00
Jan Lindström	97d82c3429	galera_sp_bf_abort requires debug Galera library.	2019-10-01 12:40:29 +03:00
Alexander Barkov	1ae09ec863	Merge remote-tracking branch 'origin/10.4' into 10.5	2019-10-01 11:44:27 +04:00
seppo	c42c4233cb	MDEV-20225 BF aborting SP execution (#1394 ) * MDEV-20225 BF aborting SP execution When stored procedure execution was chosen as victim for a BF abort, the old implemnetationn called for rollback immediately when execution was inside SP isntruction. Technically this happened in wsrep_after_statement() call, which identified the need for a rollback. The problem was that MariaDB does not accept rollback (nor commit) inside sub statement, there are several asserts about it, checking for THD::in_sub_stmt. This patch contains a fix, which skips calling wsrep_after_statement() for SP execution, which is marked as BF must abort. Instead, we return error code to upper level, where rollback will eventually happen, ouside of SP execution. Also, appending the affected trigger table (dropped or created) in the populated key set for the write set, which prevents parallel applying of other transactions working on the same table. * MDEV-20225 BF aborting SP execution, second patch First PR missed 4 commits, which are now squashed in this patch: - Added galera_sp_bf_abort test. A MTR test case which will reproduce BF-BF conflict if all keys corresponding to affected tables are not assigned for DROP TRIGGER. - Fixed incorrect use of sync pointsin MDEV-20225 - Added condition for SQLCOM_DROP_TRIGGER in wsrep_can_run_in_toi() to make it replicate. * MDEV-20225 BF aborting SP execution, third patch The galera_trigger.test caused a situation, where SP invocation caused a trigger to fire, and the trigger executed as sub statement SP, and was BF aborted by applier. because of wsrep_after_statement() was called for the sub-statement level, it ended up in exeuting rollback and asserted there. Thus fix will catch sub-statement level SP execution, and avoids calling wsrep_after_statement()	2019-10-01 10:41:33 +03:00
Alexander Barkov	dc588e3d3f	Merge remote-tracking branch 'origin/10.3' into 10.4	2019-10-01 10:45:52 +04:00
Alexander Barkov	7e44c455f4	Merge remote-tracking branch 'origin/10.2' into 10.3	2019-10-01 09:37:40 +04:00
Alexander Barkov	f203245e9e	Merge remote-tracking branch 'origin/10.1' into 10.2	2019-10-01 07:11:54 +04:00
Aleksey Midenkov	58fdf5b2fa	MDEV-16144 Default TIMESTAMP clause for SELECT from versioned 1. Removed TIMESTAMP/TRANSACTION unit auto-detection in favor of default TIMESTAMP. Reasons: 1.1. rare practical use and doubtful advantage of such auto-detection; 1.2. it conflicts with MDEV-16226 (TRX_ID-based versioned tables performance improvement). Needless check_unit membership removed. 2. SQL: versioning type handling refactoring Vers_type_handler hierarchy stores versioning properties of type. virtual Type_handler::vers() accesses specialization of Vers_type_handler for specific type. virtual Vers_type_handler::kind() returns versioning kind (timestamp/trx_id). Removed Type_handler::Vers_history_point_check_unit() in favor of Type_handler::vers(). Renames: require_timestamp() -> require_timestamp_error() require_trx_id() -> require_trx_id_error() EDIT by Alexander Barkov (@abarkov): check_sys_fields() moved to Vers_type_handler::check_sys_fields()	2019-09-30 14:05:09 +03:00
Alexey Botchkov	6ac2a35553	MDEV-19628 JSON with starting double quotes key is not valid. First character of the key name is just skipped, so the escapement wasn't handled properly.	2019-09-30 14:43:32 +04:00
Sujatha	9b80f9300d	MDEV-20645: Replication consistency is broken as workers miss the error notification from an earlier failed group. Analysis: ======== In general if there are three groups. 1 - Inserts 32 which fails due to local entry '32' on slave. 2 - Inserts 33 3 - Inserts 34 Each group considers itself as a waiter and it waits for prior group 'waitee'. This is done in 'register_wait_for_prior_event_group_commit'. If there is no other parallel group being scheduled then no waitee will be there. Let us assume 3 groups are being scheduled in parallel. 3-> waits for 2-> waits for->1 '1' upon completion it checks is there any registered subsequent waiter. If so it wakes up the subsequent waiter with its execution status. This execution status is stored in wakeup_error. If '1' failed then it sends corresponding wakeup_error to 2. Then '2' aborts and it propagates error to '3'. So all further commits are aborted. This mechanism works only when all transactions reach a stage where they are waiting for their prior commit to complete. In case of optimistic following scenario occurs. 1,2,3 are scheduled in parallel. 3 - Reaches group_commit_code waits for 2 to complete. 1 - errors out sets stop_on_error_sub_id=1. When a group execution results in error its corresponding sub_id is set to 'stop_on_error_sub_id'. Any new groups queued for execution will check if their sub_id is > stop_on_error_sub_id. If it is true their execution will be skipped as prior group execution failed. 'skip_event_group=1' will be set. Since the execution of SQL thread is about to stop we just skip execution of all the following event groups. We still do all the normal waiting and wakeup processing between the event groups as a simple way to ensure that everything is stopped and cleaned up correctly. Upon error '1' transaction checks for registered waiters. Since no one is there it simply goes away. 2 - Starts the execution. It checks do I have a waitee. Since wait_commit_sub_id == entry->last_committed_sub_id no waitee is set. Secondly: 'entry->stop_on_error_sub_id' is set by '1'st execution. Now 'handle_parallel_thread' code checks if the current group 'sub_id' is greater than the 'sub_id' set within 'stop_on_error_sub_id'. Since the above is true 'skip_event_group=true' is set. Simply call 'wait_for_prior_commit' to wakeup all waiters. Group '2' didn't had any waitee and its execution is skipped. Hence its wakeup_error=0.It sends a positive wakeup signal to '3'. Which commits. This results in a missed transaction. i.e 33 is missed and 34 is committed. Fix: === When a worker learns that an earlier transaction execution has failed, and it should not proceed for further execution, it should mark its own execution status as failed so that it alerts its followers to abort as well.	2019-09-30 13:22:37 +05:30
Sergei Golubchik	cd41ffe1f1	MDEV-19713 Remove big_tables system variable mark big_tables deprecated, the server can put temp tables on disk as needed avoiding "table full" errors. in case someone would really need to force a tmp table to be created on disk from the start and for testing allow tmp_memory_table_size to be set to 0. fix tests to use that instead (and add a test that it actually works). make sure in-memory TREE size limit is never 0 (it's [ab]using tmp_memory_table_size at the moment) remove few sys_vars.*_basic tests	2019-09-28 19:21:14 +02:00
Sergei Golubchik	fab84ec979	removes references to a sysvar that disappeared 6 years ago	2019-09-28 19:21:10 +02:00
Sergei Golubchik	32efbaa19a	MDEV-7481 Replace max_long_data_size functionality with max_allowed_packet	2019-09-28 19:20:35 +02:00
Alexander Barkov	f1dcbc2d9a	MDEV-20639 ASAN SEGV in get_prefix upon modifying base column type with existing indexed virtual column	2019-09-28 12:34:57 +04:00
Marko Mäkelä	12414cd9f2	Merge 10.4 into 10.5	2019-09-27 19:12:07 +03:00
Marko Mäkelä	9b5cdeeb0f	Merge 10.3 into 10.4	2019-09-27 16:26:53 +03:00
Marko Mäkelä	ea2b19dee6	MDEV-20117: Fix another scenario Thanks to Eugene Kosov for noting that the fix is incomplete. It turns out that on instant DROP/reorder column (MDEV-15562), we must always write the metadata record, even though the table was empty. Alternatively, we should guarantee that all undo log records for the table have been purged. (Attempting to do that by updating table_id leads to other problems; see commit 1b31d8852c00b4bab6e6fe179b97db45ccb8d535.) It would be tempting to remove dict_index_t::clear_instant_alter() altogether, but it turns that we need that when the instant ALTER TABLE operation of a first-time DROP COLUMN is being rolled back. innobase_instant_try(): Clarify a comment. Purge never calls dict_index_t::clear_instant_alter(), but it may invoke dict_index_t::clear_instant_add(). On first-time instant DROP/reorder, always write a metadata record, even if the table is empty.	2019-09-27 16:01:55 +03:00
Marko Mäkelä	2911a9a693	Merge 10.2 into 10.3	2019-09-27 15:56:15 +03:00
Thirunarayanan Balathandayuthapani	c76873f23d	MDEV-20688 Recovery crashes after unnecessarily reading a corrupted page The test encryption.innodb-redo-badkey was accidentally disabled until commit `23657a2101` enabled it recently. Once it was enabled, it started failing randomly. recv_recover_corrupt_page(): Do not assume that any redo log exists for the page. A page may be unnecessarily read by read-ahead. When noting the corruption, reset recv_addr->state to RECV_PROCESSED, so that even if the same page is re-read again, we will only decrement recv_sys->n_addrs once.	2019-09-27 17:46:10 +05:30
Marko Mäkelä	4ec0c346b8	Remove a useless large test, and add a debug assertion The test innodb_fts.fulltext_table_evict was only creating 1000 tables with fulltext indexes, only to check that no tables with fulltext indexes are being evicted. The reason why tables containing fulltext indexes cannot be evicted is that fts_optimize_init() invokes dict_table_prevent_eviction().	2019-09-27 14:05:39 +03:00
Marko Mäkelä	72f671ab7b	Merge 10.4 into 10.5	2019-09-27 07:15:07 +03:00
Marko Mäkelä	1f4ee3fa5a	MDEV-20117 Assertion 0 failed in row_sel_get_clust_rec_for_mysql The crash scenario is as follows: (1) A non-empty table exists. (2) MDEV-15562 instant ADD/DROP/reorder has been invoked. (3) Some purgeable undo log exists for the table. (4) The table becomes empty, containing not even any delete-marked records, only containing the hidden metadata record that was added in (2). (5) An instant ADD/DROP/reorder column is executed, and the table is emptied and the (2) metadata removed. (6) Purge processes an undo log record from (3), which will refer to a non-existent clustered index field, because the metadata that was created in (2) was remoeved in (5). We fix this by adjusting step (5) so that we will never remove the MDEV-15562-style metadata record. Removing the MDEV-11369 metadata record (instant ADD COLUMN to the end of the table) is completely fine at any time when the table becomes empty, because dict_index_t::n_fields will remain unchanged. innobase_instant_try(): Never remove the MDEV-15562 metadata record. page_cur_delete_rec(): Do not reset FIL_PAGE_TYPE when the MDEV-15562 metadata record is being removed as part of btr_cur_pessimistic_update() invoked by innobase_instant_try().	2019-09-26 19:45:10 +03:00
Marko Mäkelä	bb5afc7ceb	Merge 10.3 into 10.4	2019-09-26 16:56:02 +03:00

1 2 3 4 5 ...

13521 Commits