mariadb

mirror of https://github.com/MariaDB/server.git synced 2025-09-13 13:47:59 +03:00

Author	SHA1	Message	Date
Yuchen Pei	8cdee25952	MDEV-36132 Substitute vcol expressions with indexed vcol fields in ORDER BY and GROUP BY Also expand vcol field index coverings to include indexes covering all the fields in the expression. The reasoning goes as follows: let f(c1, c2, ..., cn) be a function on applied to columns c1, c2, ..., cn, if f(...) is covered by an index, so should vc whose expression is f(...). For example, if t.vf = t.c1 + t.c2, and t has three indexes (vf), (c1, c2), (c1). Before this change, vf's index covering is a singleton {(vf)}. Let's call that the "conventional" index covering. After this change vf's index covering is now {(vf), (c1, c2)}, since (c1, c2) covers both c1 and c2. Let's call (c1, c2) in this case the "extra" covering. With the coverings updated, when an index in the "extra" covering is chosen for keyread, the vcol also needs to be calculated. In this case we mark vcol in the table read_set, and ensure it is computed. With these changes, we see various improvements, including from using full table scan + filesort to full index scan + filesort when ORDER BY an indexed vcol (here vc = c + 1 is a vcol and both c and vc are indexes): explain select c + 1 from t order by vc; id select_type table type possible_keys key key_len ref rows Extra -1 SIMPLE t ALL NULL NULL NULL NULL 10000 Using filesort +1 SIMPLE t index NULL c 5 NULL 10000 Using index; Using filesort The substitutions are followed updates to all_fields which include a copy of the ORDER BY/GROUP BY item pointers, as well as corresponding updates to ref_pointer_array so that the all_fields and ref_pointer_array remain in sync. Another, related change is the recomputation of table index covering on substitutions. It not only reflects the correct table index covering after the substitutions, but also improve executions where the vcol index can be chosen, such as this example (here vc = c + 1 and vc is the only index in the table), from full table scan + filesort to full index scan: select vc from t order by c + 1; We do it in SELECT as well as in single table DELETE/UPDATE.	2025-07-22 10:44:12 +10:00
Sergei Golubchik	bead24b7f3	mariadb-test: wait on disconnect Remove one of the major sources of race condiitons in mariadb-test. Normally, mariadb_close() sends COM_QUIT to the server and immediately disconnects. In mariadb-test it means the test can switch to another connection and sends queries to the server before the server even started parsing the COM_QUIT packet and these queries can see the connection as fully active, as it didn't reach dispatch_command yet. This is a major source of instability in tests and many - but not all, still less than a half - tests employ workarounds. The correct one is a pair count_sessions.inc/wait_until_count_sessions.inc. Also very popular was wait_until_disconnected.inc, which was completely useless, because it verifies that the connection is closed, and after disconnect it always is, it didn't verify whether the server processed COM_QUIT. Sadly the placebo was as widely used as the real thing. Let's fix this by making mariadb-test `disconnect` command _to wait_ for the server to confirm. This makes almost all workarounds redundant. In some cases count_sessions.inc/wait_until_count_sessions.inc is still needed, though, as only `disconnect` command is changed: * after external tools, like `exec $MYSQL` * after failed `connect` command * replication, after `STOP SLAVE` * Federated/CONNECT/SPIDER/etc after `DROP TABLE` and also in some XA tests, because an XA transaction is dissociated from the THD very late, after the server has closed the client connection. Collateral cleanups: fix comments, remove some redundant statements: * DROP IF EXISTS if nothing is known to exist * DROP table/view before DROP DATABASE * REVOKE privileges before DROP USER etc	2025-07-16 09:14:33 +07:00
Marko Mäkelä	cffbb17480	MDEV-28933: Per-table unique FOREIGN KEY constraint names Before MySQL 4.0.18, user-specified constraint names were ignored. Starting with MySQL 4.0.18, the specified constraint name was prepended with the schema name and '/'. Now we are transforming into a format where the constraint name is prepended with the dict_table_t::name and the impossible UTF-8 sequence 0xff. Generated constraint names will be ASCII decimal numbers. On upgrade, old FOREIGN KEY constraint names will be displayed without any schema name prefix. They will be updated to the new format on DDL operations. dict_foreign_t::sql_id(): Return the SQL constraint name without any schemaname/tablename\377 or schemaname/ prefix. row_rename_table_for_mysql(), dict_table_rename_in_cache(): Simplify the logic: Just rename constraints to the new format. dict_table_get_foreign_id(): Replaces dict_table_get_highest_foreign_id(). innobase_get_foreign_key_info(): Let my_error() refer to erroneous anonymous constraints as "(null)". row_delete_constraint(): Try to drop all 3 constraint name variants. Reviewed by: Thirunarayanan Balathandayuthapani Tested by: Matthias Leich	2025-07-08 12:30:27 +03:00
Oleksandr Byelkin	f1102da37a	Merge branch '11.8' into 12.0	2025-05-22 09:22:55 +02:00
Oleg Smirnov	4bb2669d18	MDEV-33281 Optimizer hints Cleanup: fix formatting, rename objects	2025-05-05 12:02:47 +07:00
Sergei Golubchik	237e24497b	Merge remote-tracking branch 'github/bb-11.4-release' into bb-11.8-serg	2025-04-27 19:40:00 +02:00
Oleksandr Byelkin	a8d4642375	Merge branch '10.11' into 11.4	2025-04-26 10:53:02 +02:00
Oleksandr Byelkin	20b818f45e	Merge branch '10.6' into 10.11	2025-04-21 11:23:11 +02:00
Oleksandr Byelkin	a135551569	Merge branch '10.5' into 10.6	2025-04-21 10:43:17 +02:00
Marko Mäkelä	1c9f64e54f	MDEV-36613 Incorrect undo logging for indexes on virtual columns Starting with mysql/mysql-server@02f8eaa998 and commit `2e814d4702` the index ID of indexes on virtual columns was being encoded insufficiently in InnoDB undo log records. Only the least significant 32 bits were being written. This could lead to some corruption of the affected indexes on ROLLBACK, as well as to missed chances to remove some history from such indexes when purging the history of committed transactions that included DELETE or an UPDATE in the indexes. dict_hdr_create(): In debug instrumented builds, initialize the DICT_HDR_INDEX_ID close to the 32-bit barrier, instead of initializing it to DICT_HDR_FIRST_ID (10). This will allow the changed code to be exercised while running ./mtr --suite=gcol,vcol. trx_undo_log_v_idx(): Encode large index->id in a similar way as mysql/mysql-server@e00328b4d0 but using a different implementation. trx_undo_read_v_idx_low(): Decode large index->id in a similar way as mach_u64_read_much_compressed(). Reviewed by: Debarun Banerjee	2025-04-16 15:55:45 +03:00
Marko Mäkelä	bb9f010432	Merge 11.4 into 11.8	2025-03-05 20:39:47 +02:00
Marko Mäkelä	49a6baec56	Merge 10.11 into 11.4	2025-03-03 11:07:56 +02:00
Marko Mäkelä	6e6a1b316c	MDEV-35000: dict_table_close() breaks STATS_AUTO_RECALC stats_deinit(): Replaces dict_stats_deinit(). Deinitialize the statistics for persistent tables, so that they will be reloaded or recalculated on a subsequent ha_innobase::open(). ha_innobase::rename_table(): Invoke stats_deinit() so that the subsequent ha_innobase::open() will reload the InnoDB persistent statistics. That is, it will remain possible to have the InnoDB persistent statistics reloaded by executing the following: RENAME TABLE t TO tmp, tmp TO t; dict_table_close(table): Replaced with table->release(). There will no longer be any logic that would attempt to ensure that the InnoDB persistent statistics will be reloaded after FLUSH TABLES has been executed. This also fixes the problem that dict_table_t::stat_modified_counter would be frequently reset to 0, whenever ha_innobase::open() is invoked after the table reference count had dropped to 0. dict_table_close(table, thd, mdl): Remove the parameter "dict_locked". Do not try to invalidate the statistics. ha_innobase::statistics_init(): Replaces dict_stats_init(table). Reviewed by: Thirunarayanan Balathandayuthapani	2025-02-28 09:00:16 +02:00
Sergei Golubchik	9ee09a33bb	Merge branch '11.7' into 11.8	2025-02-11 20:29:43 +01:00
Sergei Golubchik	ba01c2aaf0	Merge branch '11.4' into 11.7 * rpl.rpl_system_versioning_partitions updated for MDEV-32188 * innodb.row_size_error_log_warnings_3 changed error for MDEV-33658 (checks are done in a different order)	2025-02-06 16:46:36 +01:00
Sergei Petrunia	1c2a83179d	MDEV-35616: Add basic optimizer support for virtual column (Review input addressed) After this patch, the optimizer can handle virtual column expressions in WHERE/ON clauses. If the table has an indexed virtual column: ALTER TABLE t1 ADD COLUMN vcol INT AS (col1+1), ADD INDEX idx1(vcol); and the query uses the exact virtual column expression: SELECT * FROM t1 WHERE col1+1 <= 100 then the optimizer will be able use index idx1 for it. This is achieved by walking the WHERE/ON clauses and replacing instances of virtual column expression (like "col1+1" above) with virtual column's Item_field (like "vcol"). The latter can be processed by the optimizer. Replacement is considered (and done) only in items that are potentially usable to the range optimizer.	2025-01-25 10:50:52 +02:00
Sergei Golubchik	f1a7693bc0	Merge branch '10.11' into 11.4	2025-01-14 23:45:41 +01:00
Sergei Golubchik	221aa5e08f	Merge branch '10.6' into 10.11	2025-01-10 13:14:42 +01:00
Sergei Golubchik	a0e5dd5433	mysqltest: fix --sorted_results only sort actual results not warnings or metadata also work for vertical results warnings are sorted separately	2025-01-09 10:00:36 +01:00
Sergei Golubchik	062f8eb37d	cleanup: key algorithm vs key flags the information about index algorithm was stored in two places inconsistently split between both. BTREE index could have key->algorithm == HA_KEY_ALG_BTREE, if the user explicitly specified USING BTREE or HA_KEY_ALG_UNDEF, if not. RTREE index had key->algorithm == HA_KEY_ALG_RTREE and always had key->flags & HA_SPATIAL FULLTEXT index had key->algorithm == HA_KEY_ALG_FULLTEXT and always had key->flags & HA_FULLTEXT HASH index had key->algorithm == HA_KEY_ALG_HASH or HA_KEY_ALG_UNDEF long unique index always had key->algorithm == HA_KEY_ALG_LONG_HASH In this commit: All indexes except BTREE and HASH always have key->algorithm set, HA_SPATIAL and HA_FULLTEXT flags are not used anymore (except for storage to keep frms backward compatible). As a side effect ALTER TABLE now detects FULLTEXT index renames correctly	2024-11-05 14:00:47 -08:00
Alexander Barkov	36eba98817	MDEV-19123 Change default charset from latin1 to utf8mb4 Changing the default server character set from latin1 to utf8mb4.	2024-07-11 10:21:07 +04:00
Alexander Barkov	903b5d6a83	MDEV-25829 Change default Unicode collation to uca1400_ai_ci Step#3 The main patch	2024-05-24 15:50:05 +04:00
Oleksandr Byelkin	cd28b2479c	Merge branch '11.1' into 11.2	2024-04-09 12:12:33 +02:00
Marko Mäkelä	fec2fd6add	Merge 10.11 into 11.0	2024-03-28 10:51:36 +02:00
Marko Mäkelä	788953463d	Merge 10.6 into 10.11 Some fixes related to commit `f838b2d799` and Rows_log_event::do_apply_event() and Update_rows_log_event::do_exec_row() for system-versioned tables were provided by Nikita Malyavin. This was required by test versioning.rpl,trx_id,row.	2024-03-28 09:16:57 +02:00
Marko Mäkelä	b8a6719889	MDEV-26642/MDEV-26643/MDEV-32898 Implement innodb_snapshot_isolation https://jepsen.io/analyses/mysql-8.0.34 highlights that the transaction isolation levels in the InnoDB storage engine do not correspond to any widely accepted definitions, such as "Generalized Isolation Level Definitions" https://pmg.csail.mit.edu/papers/icde00.pdf (PL-1 = READ UNCOMMITTED, PL-2 = READ COMMITTED, PL-2.99 = REPEATABLE READ, PL-3 = SERIALIZABLE). Only READ UNCOMMITTED in InnoDB seems to match the above definition. The issue is that InnoDB does not detect write/write conflicts (Section 4.4.3, Definition 6) in the above. It appears that as soon as we implement write/write conflict detection (SET SESSION innodb_snapshot_isolation=ON), the default isolation level (SET TRANSACTION ISOLATION LEVEL REPEATABLE READ) will become Snapshot Isolation (similar to Postgres), as defined in Section 4.2 of "A Critique of ANSI SQL Isolation Levels", MSR-TR-95-51, June 1995 https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/tr-95-51.pdf Locking reads inside InnoDB used to read the latest committed version, ignoring what should actually be visible to the transaction. The added test innodb.lock_isolation illustrates this. The statement UPDATE t SET a=3 WHERE b=2; is executed in a transaction that was started before a read view or a snapshot of the current transaction was created, and committed before the current transaction attempts to execute UPDATE t SET b=3; If SET innodb_snapshot_isolation=ON is in effect when the second transaction was started, the second transaction will be aborted with the error ER_CHECKREAD. By default (innodb_snapshot_isolation=OFF), the second transaction would execute inconsistently, displaying an incorrect SELECT COUNT(*) FROM t in its read view. If innodb_snapshot_isolation=ON, if an attempt to acquire a lock on a record that does not exist in the current read view is made, an error DB_RECORD_CHANGED (HA_ERR_RECORD_CHANGED, ER_CHECKREAD) will be raised. This error will be treated in the same way as a deadlock: the transaction will be rolled back. lock_clust_rec_read_check_and_lock(): If the current transaction has a read view where the record is not visible and innodb_snapshot_isolation=ON, fail before trying to acquire the lock. row_sel_build_committed_vers_for_mysql(): If innodb_snapshot_isolation=ON, disable the "semi-consistent read" logic that had been implemented by myself on the directions of Heikki Tuuri in order to address https://bugs.mysql.com/bug.php?id=3300 that was motivated by a customer wanting UPDATE to skip locked rows that do not match the WHERE condition. It looks like my changes were included in the MySQL 5.1.5 commit ad126d90e019f223470e73e1b2b528f9007c4532; at that time, employees of Innobase Oy (a recent acquisition of Oracle) had lost write access to the repository. The only reason why we set innodb_snapshot_isolation=OFF by default is backward compatibility with applications, such as the one that motivated the implementation of "semi-consistent read" back in 2005. In a later major release, we can default to innodb_snapshot_isolation=ON. Thanks to Peter Alvaro, Kyle Kingsbury and Alexey Gotsman for their work on https://github.com/jepsen-io/ and to Kyle and Alexey for explanations and some testing of this fix. Thanks to Vladislav Lesin for the initial test for MDEV-26643, as well as reviewing these changes.	2024-03-20 09:48:03 +02:00
Sergei Golubchik	fef31a26f3	Merge branch '11.1' into 11.2	2023-12-20 23:43:05 +01:00
Sergei Golubchik	8c8bce05d2	Merge branch '10.11' into 11.0	2023-12-19 15:53:18 +01:00
Sergei Golubchik	fd0b47f9d6	Merge branch '10.6' into 10.11	2023-12-18 11:19:04 +01:00
Sergei Golubchik	e95bba9c58	Merge branch '10.5' into 10.6	2023-12-17 11:20:43 +01:00
Sergei Golubchik	98a39b0c91	Merge branch '10.4' into 10.5	2023-12-02 01:02:50 +01:00
Marko Mäkelä	0d29f3759c	Merge 11.1 into 11.2	2023-11-28 11:19:06 +02:00
Monty	06f7ed4dcd	MDEV-28566 Assertion `!expr->is_fixed()' failed in bool virtual_column_info::fix_session_expr(THD*) The problem was that table->vcol_cleanup_expr() was not called in case of error in open_table().	2023-11-27 19:08:14 +02:00
Marko Mäkelä	5b6134b040	Merge 10.11 into 11.0	2023-11-24 11:20:56 +02:00
Marko Mäkelä	f87c7d1772	Merge 10.6 into 10.11	2023-11-21 12:47:51 +02:00
Marko Mäkelä	4c16ec3e77	MDEV-32050 fixup: Stabilize tests In any test that uses wait_all_purged.inc, ensure that InnoDB tables will be created without persistent statistics. This is a follow-up to commit `cd04673a17` after a similar failure was observed in the innodb_zip.blob test.	2023-11-21 12:42:00 +02:00
Oleksandr Byelkin	0427c4739e	Merge tag '11.1' into 11.2 MariaDB 11.1.3 release	2023-11-14 18:28:37 +01:00
Oleksandr Byelkin	48af85db21	Merge branch '10.11' into 11.0	2023-11-08 17:09:44 +01:00
Oleksandr Byelkin	04d9a46c41	Merge branch '10.6' into 10.10	2023-11-08 16:23:30 +01:00
Marko Mäkelä	228b7e4db5	MDEV-13626 Merge InnoDB test cases from MySQL 5.7 This imports and adapts a number of MySQL 5.7 test cases that are applicable to MariaDB. Some tests for old bug fixes are not that relevant because the code has been refactored since then (especially starting with MariaDB Server 10.6), and the tests would not reproduce the original bug if the fix was reverted. In the test innodb_fts.opt, there are many duplicate MATCH ranks, which would make the results nondeterministic. The test was stabilized by changing some LIMIT clauses or by adding sorted_result in those cases where the purpose of a test was to show that no sorting took place in the server. In the test innodb_fts.phrase, MySQL 5.7 would generate FTS_DOC_ID that are 1 larger than in MariaDB. In innodb_fts.index_table the difference is 2. This is because in MariaDB, fts_get_next_doc_id() post-increments cache->next_doc_id, while MySQL 5.7 pre-increments it. Reviewed by: Thirunarayanan Balathandayuthapani	2023-11-08 12:17:14 +02:00
Yuchen Pei	d0f8dfbcf0	Merge branch '11.1' into 11.2	2023-10-27 18:11:56 +11:00
Marko Mäkelä	14685b10df	MDEV-32050: Deprecate&ignore innodb_purge_rseg_truncate_frequency The motivation of introducing the parameter innodb_purge_rseg_truncate_frequency in mysql/mysql-server@28bbd66ea5 and mysql/mysql-server@8fc2120fed seems to have been to avoid stalls due to freeing undo log pages or truncating undo log tablespaces. In MariaDB Server, innodb_undo_log_truncate=ON should be a much lighter operation than in MySQL, because it will not involve any log checkpoint. Another source of performance stalls should be trx_purge_truncate_rseg_history(), which is shrinking the history list by freeing the undo log pages whose undo records have been purged. To alleviate that, we will introduce a purge_truncation_task that will offload this from the purge_coordinator_task. In that way, the next innodb_purge_batch_size pages may be parsed and purged while the pages from the previous batch are being freed and the history list being shrunk. The processing of innodb_undo_log_truncate=ON will still remain the responsibility of the purge_coordinator_task. purge_coordinator_state::count: Remove. We will ignore innodb_purge_rseg_truncate_frequency, and act as if it had been set to 1 (the maximum shrinking frequency). purge_coordinator_state::do_purge(): Invoke an asynchronous task purge_truncation_callback() to free the undo log pages. purge_sys_t::iterator::free_history(): Free those undo log pages that have been processed. This used to be a part of trx_purge_truncate_history(). purge_sys_t::clone_end_view(): Take a new value of purge_sys.head as a parameter, so that it will be updated while holding exclusive purge_sys.latch. This is needed for race-free access to the field in purge_truncation_callback(). Reviewed by: Vladislav Lesin	2023-10-25 09:11:58 +03:00
Marko Mäkelä	be24e75229	Merge 10.11 into 11.0	2023-10-19 08:12:16 +03:00
Marko Mäkelä	d5e15424d8	Merge 10.6 into 10.10 The MDEV-29693 conflict resolution is from Monty, as well as is a bug fix where ANALYZE TABLE wrongly built histograms for single-column PRIMARY KEY. Also includes a fix for safe_malloc error reporting. Other things: - Copied main.log_slow from 10.4 to avoid mtr issue Disabled test: - spider/bugfix.mdev_27239 because we started to get +Error 1429 Unable to connect to foreign data source: localhost -Error 1158 Got an error reading communication packets - main.delayed - Bug#54332 Deadlock with two connections doing LOCK TABLE+INSERT DELAYED This part is disabled for now as it fails randomly with different warnings/errors (no corruption).	2023-10-14 13:36:11 +03:00
Marko Mäkelä	8096139b3a	Merge 10.5 into 10.6	2023-09-19 10:47:26 +03:00
Marko Mäkelä	6c05edfdcd	Merge 10.4 into 10.5	2023-09-19 10:20:09 +03:00
Marko Mäkelä	76b688f100	MDEV-30024 InnoDB: tried to purge non-delete-marked of a virtual column prefix row_vers_vc_matches_cluster(): Invoke dtype_get_at_most_n_mbchars() to extract the correct number of bytes corresponding to the number of characters in a virtual column prefix index, just like we do in row_sel_sec_rec_is_for_clust_rec(). The test case would occasionally reproduce the failure when this fix is not present.	2023-09-19 09:31:34 +03:00
Sergei Golubchik	18ddde4826	Merge branch '11.1' into 11.2	2023-08-18 00:59:16 +02:00
Sergei Golubchik	a8a22b7af2	support 'alter online table t1 page_checksum=0'	2023-08-15 10:16:11 +02:00
Nikita Malyavin	ab4bfad206	MDEV-16329 [5/5] ALTER ONLINE TABLE * Log rows in online_alter_binlog. * Table online data is replicated within dedicated binlog file * Cached data is written on commit. * Versioning is fully supported. * Works both wit and without binlog enabled. * For now savepoints setup is forbidden while ONLINE ALTER goes on. Extra support is required. We can simply log the SAVEPOINT query events and replicate them together with row events. But it's not implemented for now. * Cache flipping: We want to care for the possible bottleneck in the online alter binlog reading/writing in advance. IO_CACHE does not provide anything better that sequential access, besides, only a single write is mutex-protected, which is not suitable, since we should write a transaction atomically. To solve this, a special layer on top Event_log is implemented. There are two IO_CACHE files underneath: one for reading, and one for writing. Once the read cache is empty, an exclusive lock is acquired (we can wait for a currently active transaction finish writing), and flip() is emitted, i.e. the write cache is reopened for read, and the read cache is emptied, and reopened for writing. This reminds a buffer flip that happens in accelerated graphics (DirectX/OpenGL/etc). Cache_flip_event_log is considered non-blocking for a single reader and a single writer in this sense, with the only lock held by reader during flip. An alternative approach by implementing a fair concurrent circular buffer is described in MDEV-24676. * Cache managers: We have two cache sinks: statement and transactional. It is important that the changes are first cached per-statement and per-transaction. If a statement fails, then only statement data is rolled back. The transaction moves along, however. Turns out, there's no guarantee that TABLE well persist in thd->open_tables to the transaction commit moment. If an error occurs, tables from statement are purged. Therefore, we can't store te caches in TABLE. Ideally, it should be handlerton, but we cut the corner and store it in THD in a list.	2023-08-15 10:16:11 +02:00

1 2 3 4 5 ...

367 Commits