1
0
mirror of https://github.com/MariaDB/server.git synced 2025-08-05 13:16:09 +03:00
Commit Graph

3175 Commits

Author SHA1 Message Date
Oleksandr Byelkin
04d9a46c41 Merge branch '10.6' into 10.10 2023-11-08 16:23:30 +01:00
Oleksandr Byelkin
b83c379420 Merge branch '10.5' into 10.6 2023-11-08 15:57:05 +01:00
Oleksandr Byelkin
6cfd2ba397 Merge branch '10.4' into 10.5 2023-11-08 12:59:00 +01:00
Nikita Malyavin
23f9e34256 MDEV-32444 Data from orphaned XA transaction is lost after online alter
XA support for online alter was totally missing.

Tying on binlog_hton made this hardly visible: simply having binlog_commit
called from xa_commit made an impression that it will automagically work
for online alter, which turns out wrong: all binlog does is writes
"XA END" into trx cache and flushes it to a real binlog.

In comparison, online alter can't do the same, since online replication
happens in a single transaction.

Solution: make a dedicated XA support.
* Extend struct xid_t with a pointer to Online_alter_cache_list
* On prepare: move online alter cache from THD::ha_data to XID passed
* On XA commit/rollback: use the online alter cache stored in this XID.
  This makes us pass xid_cache_element->xid to xa_commit/xa_rollback
  instead of lex->xid
* Use manual memory management for online alter cache list, instead of
  mem_root allocation, since we don't have mem_root connected to the XA
  transaction.
2023-11-04 11:53:28 +04:00
Nikita Malyavin
cb52174693 online alter: extract the source to a separate file
Move all the functions dedicated to online alter to a newly created
online_alter.cc.

With that, make many functions static and simplify the static functions
naming.

Also, rename binlog_log_row_online_alter -> online_alter_log_row.
2023-11-02 22:58:03 +04:00
Marko Mäkelä
7b842f1536 Merge 11.2 into 11.3 2023-10-27 10:48:29 +03:00
Yuchen Pei
d0f8dfbcf0 Merge branch '11.1' into 11.2 2023-10-27 18:11:56 +11:00
Marko Mäkelä
be24e75229 Merge 10.11 into 11.0 2023-10-19 08:12:16 +03:00
Marko Mäkelä
2ecc0443ec Merge 10.10 into 10.11 2023-10-17 16:04:21 +03:00
Marko Mäkelä
d5e15424d8 Merge 10.6 into 10.10
The MDEV-29693 conflict resolution is from Monty, as well as is
a bug fix where ANALYZE TABLE wrongly built histograms for
single-column PRIMARY KEY.
Also includes a fix for safe_malloc error reporting.

Other things:
- Copied main.log_slow from 10.4 to avoid mtr issue

Disabled test:
- spider/bugfix.mdev_27239 because we started to get
  +Error	1429 Unable to connect to foreign data source: localhost
  -Error	1158 Got an error reading communication packets
- main.delayed
  - Bug#54332 Deadlock with two connections doing LOCK TABLE+INSERT DELAYED
    This part is disabled for now as it fails randomly with different
    warnings/errors (no corruption).
2023-10-14 13:36:11 +03:00
Yuchen Pei
17810b7585 Merge branch '10.10' into 10.11 2023-10-13 18:05:20 +11:00
Thirunarayanan Balathandayuthapani
213b916aea MDEV-32291 memory leak in innodb.insert_into_empty test
- This failure happens due to commit bf3b787e02 (MDEV-31835). InnoDB fails to apply buffered insert
operation for transaction_registry during commit operation.
To avoid this, ha_commit_trans() should call extra() with
HA_EXTRA_RESET_STATE to apply bulk buffered insert operation.
2023-10-02 15:30:33 +05:30
Thirunarayanan Balathandayuthapani
952f06aa8b MDEV-29989 binlog_do_db option breaks versioning table
Problem:
=========
During commit, server calls prepare_commit_versioned to
determine the transaction modified system-versioned data.
Due to binlog_do_db option, we disable the binlog for the
statement. But prepare_commit_versioned() is being
called only when binlog is enabled for the statement.

Fix:
===
prepare_commit_versioned() should happen irrespective
of binlog state. So if the server has any read-write operation
then we should call prepare_commit_versioned().
2023-09-26 10:47:59 +05:30
Nikita Malyavin
28b4037242 Merge branch '11.2' into 11.3 2023-09-21 14:15:04 +04:00
Marko Mäkelä
0dd25f28f7 Merge 10.5 into 10.6 2023-09-11 14:46:39 +03:00
Marko Mäkelä
f8f7d9de2c Merge 10.4 into 10.5 2023-09-11 11:29:31 +03:00
Sergei Golubchik
382c543f53 MDEV-32012 hash unique corrupts index on virtual blobs
as always when copying record[0] aside one needs to detach
Field_blob::value's from it, and restore them when record[0]
is restored from a backup.
2023-09-06 22:38:41 +02:00
Alexander Barkov
8ad1e26b1b MDEV-32081 Remove my_casedn_str() from get_canonical_filename()
- Moving get_canonical_filename() from a public function to a method in handler.
- Adding a helper method is_canonical_filename() to handler.
- Adding helper methods left(), substr(), starts_with() to Lex_cstring.
- Adding helper methods is_sane(), buffer_overlaps(),
  max_data_size() to CharBuffer.
- Adding append_casedn() to CharBuffer. It implements the main functionality
  that replaces the being removed my_casedn_str() call.
- Adding a class Table_path_buffer,
  a descendant of CharBuffer with size FN_REFLEN.
- Changing get_canonical_filename() to get a pointer to Table_path_buffer
  instead just a pointer to char.
- Changing the data type of the "path" parameter and the return type of
  get_canonical_filename() from char* to Lex_cstring.
2023-09-04 09:36:44 +04:00
Alexander Barkov
21218d3c9e MDEV-31986 Remove old check_db_name() from make_table_name_list()
- Replacing the old style inplace check_db_name() in make_table_name_list()
  to the new style non-modifying code
- Adding "const" qualifier to the "db" parameter to ha_discover_table_names()
  and its dependency functions.
2023-08-23 08:12:47 +04:00
Sergei Golubchik
18ddde4826 Merge branch '11.1' into 11.2 2023-08-18 00:59:16 +02:00
Nikita Malyavin
70491fb07b MDEV-31677 Assertion failed upon online ALTER with binlog_row_image=NOBLOB
Make binlog_prepare_row_images accept image type as an argument.
2023-08-15 14:00:28 +02:00
Nikita Malyavin
c76072db93 MDEV-31033 ER_KEY_NOT_FOUND upon online COPY ALTER on a partitioned table
The row events were applied "twice": once for the ha_partition, and one
more time for the underlying storage engine.

There's no such problem in binlog/rpl, because ha_partiton::row_logging
is normally set to false.

The fix makes the events replicate only when the handler is a root handler.
We will try to *guess* this by comparing it to table->file. The same
approach is used in the MDEV-21540 fix, 231feabd. The assumption is made,
that the row methods are only called for table->file (and never for a
cloned handler), hence the assertions are added in ha_innobase and
ha_myisam to make sure that this is true at least for those engines

Also closes MDEV-31040, however the test is not included, since we have no
convenient way to construct a deterministic version.
2023-08-15 10:16:13 +02:00
Sergei Golubchik
64b55151f4 separate online_alter_cache_data from binlog_cache_data 2023-08-15 10:16:12 +02:00
Nikita Malyavin
5a867d847c Online alter: savepoints 2023-08-15 10:16:11 +02:00
Sergei Golubchik
332f41aae3 don't copy stmt IO_CACHE to trx IO_CACHE at the stmt end
instead use only one (trx) IO_CACHE and truncate it if the
statement is rolled back.

don't use binlog_cache_mngr to accumulate the data,
use binlog_cache_data instead.

(binlog_cache_data owns one IO_CACHE, binlog_cache_mngr owns
two binlog_cache_data's, trx and stmt).
2023-08-15 10:16:11 +02:00
Sergei Golubchik
32c3d775e9 Online alter: set read_set early, before row reads
also

* don't modify write_set
* backup/restore rpl_write_set
2023-08-15 10:16:11 +02:00
Nikita Malyavin
ab4bfad206 MDEV-16329 [5/5] ALTER ONLINE TABLE
* Log rows in online_alter_binlog.
* Table online data is replicated within dedicated binlog file
* Cached data is written on commit.
* Versioning is fully supported.
* Works both wit and without binlog enabled.

* For now savepoints setup is forbidden while ONLINE ALTER goes on.
  Extra support is required. We can simply log the SAVEPOINT query events
  and replicate them together with row events. But it's not implemented
  for now.

* Cache flipping:

  We want to care for the possible bottleneck in the online alter binlog
  reading/writing in advance.

  IO_CACHE does not provide anything better that sequential access,
  besides, only a single write is mutex-protected, which is not suitable,
  since we should write a transaction atomically.

  To solve this, a special layer on top Event_log is implemented.
  There are two IO_CACHE files underneath: one for reading, and one for
  writing.

  Once the read cache is empty, an exclusive lock is acquired (we can wait
  for a currently active transaction finish writing), and flip() is emitted,
  i.e. the write cache is reopened for read, and the read cache is emptied,
  and reopened for writing.

  This reminds a buffer flip that happens in accelerated graphics
  (DirectX/OpenGL/etc).

  Cache_flip_event_log is considered non-blocking for a single reader and a
  single writer in this sense, with the only lock held by reader during flip.

  An alternative approach by implementing a fair concurrent circular buffer
  is described in MDEV-24676.

* Cache managers:
  We have two cache sinks: statement and transactional.
  It is important that the changes are first cached per-statement and
  per-transaction.
  If a statement fails, then only statement data is rolled back. The
  transaction moves along, however.

  Turns out, there's no guarantee that TABLE well persist in
  thd->open_tables to the transaction commit moment.
  If an error occurs, tables from statement are purged.
  Therefore, we can't store te caches in TABLE. Ideally, it should be
  handlerton, but we cut the corner and store it in THD in a list.
2023-08-15 10:16:11 +02:00
Nikita Malyavin
d2d0995cf2 MDEV-16329 [4/5] Refactor MYSQL_BIN_LOG: extract Event_log ancestor
Event_log is supposed to be a basic logging class that can write events in
a single file.

MYSQL_BIN_LOG in comparison will have:
* rotation support
* index files
* purging
* gtid and transactional information handling.
* is dedicated for a general-purpose binlog
2023-08-15 10:16:11 +02:00
Nikita Malyavin
6427e343cf MDEV-16329 [3/5] use binlog_cache_data directly in most places
* Eliminate most usages of THD::use_trans_table. Only 3 left, and they are
  at quite high levels, and really essential.
* Eliminate is_transactional argument when possible. Lots of places are
  left though, because of some WSREP error handling in
  MYSQL_BIN_LOG::set_write_error.
* Remove junk binlog functions from THD
* binlog_prepare_pending_rows_event is moved to log.cc inside MYSQL_BIN_LOG
  and is not anymore template. Instead it accepls event factory with a type
  code, and a callback to a constructing function in it.
2023-08-15 10:16:11 +02:00
Nikita Malyavin
429f635f30 MDEV-16329 [2/5] refactor binlog and cache_mngr
pump up binlog and cache manager to level of binlog_log_row_internal
2023-08-15 10:16:11 +02:00
Oleksandr Byelkin
51f9d62005 Merge branch '10.11' into 11.0 2023-08-09 07:53:48 +02:00
Oleksandr Byelkin
036df5f970 Merge branch '10.10' into 10.11 2023-08-08 14:57:31 +02:00
Oleksandr Byelkin
34a8e78581 Merge branch '10.6' into 10.9 2023-08-04 08:01:06 +02:00
Oleksandr Byelkin
5ea5291d97 Merge branch '10.5' into 10.6 2023-08-04 07:52:54 +02:00
Sergei Golubchik
ab1191c039 cleanup: key->key_create_info.check_for_duplicate_indexes -> key->old
mark old keys in the ALTER TABLE with the `old` flag, not with
the `key_create_info.check_for_duplicate_indexes`.

This allows to mark old foreign keys too.
2023-08-01 22:43:16 +02:00
Oleksandr Byelkin
6bf8483cac Merge branch '10.5' into 10.6 2023-08-01 15:08:52 +02:00
Oleksandr Byelkin
7564be1352 Merge branch '10.4' into 10.5 2023-07-26 16:02:57 +02:00
Marko Mäkelä
f2b4972bd4 Merge 10.11 into 11.0 2023-07-26 15:13:06 +03:00
Marko Mäkelä
bce3ee704f Merge 10.10 into 10.11 2023-07-26 14:44:43 +03:00
Yuchen Pei
734583b0d7 MDEV-31400 Simple plugin dependency resolution
We introduce simple plugin dependency. A plugin init function may
return HA_ERR_RETRY_INIT. If this happens during server startup when
the server is trying to initialise all plugins, the failed plugins
will be retried, until no more plugins succeed in initialisation or
want to be retried.

This will fix spider init bugs which is caused in part by its
dependency on Aria for initialisation.

The reason we need a new return code, instead of treating every
failure as a request for retry, is that it may be impossible to clean
up after a failed plugin initialisation. Take InnoDB for example, it
has a global variable `buf_page_cleaner_is_active`, which may not
satisfy an assertion during a second initialisation try, probably
because InnoDB does not expect the initialisation to be called
twice.
2023-07-25 18:24:20 +10:00
Oleksandr Byelkin
f52954ef42 Merge commit '10.4' into 10.5 2023-07-20 11:54:52 +02:00
Kristian Nielsen
a8ea6627a4 MDEV-31448: Killing a replica thread awaiting its GCO can hang/crash a parallel replica
The problem was an incorrect unmark_start_commit() in
signal_error_to_sql_driver_thread(). If an event group gets an error, this
unmark could run after the following GCO started, and the subsequent
re-marking could access de-allocated GCO.

The offending unmark_start_commit() looks obviously incorrect, and the fix
is to just remove it. It was introduced in the MDEV-8302 patch, the commit
message of which suggests it was added there solely to satisfy an assertion
in ha_rollback_trans(). So update this assertion instead to not trigger for
event groups that experienced an error (rgi->worker_error). When an error
occurs in an event group, all following event groups are skipped anyway, so
the unmark should never be needed in this case.

Reviewed-by: Andrei Elkin <andrei.elkin@mariadb.com>
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
2023-07-12 09:41:32 +02:00
Brandon Nesterenko
9808ebe195 MDEV-30978: On slave XA COMMIT/XA ROLLBACK fail to return an error in read-only mode
Where a read-only server permits writes through replication, it
should not permit user connections to commit/rollback XA
transactions prepared via replication. The bug reported in
MDEV-30978 shows that this can happen. This is because there is no
read only check in the XA transaction logic, the most relevant one
occurs in ha_commit_trans() for normal statements/transactions.

This patch extends the XA transaction logic to check the read only
status of the server before performing an XA COMMIT or ROLLBACK.

Reviewed By:
Andrei Elkin <andrei.elkin@mariadb.com>
2023-07-11 07:49:44 -06:00
Marko Mäkelä
7cde5c539b Merge 10.6 into 10.9 2023-07-10 11:22:21 +03:00
Monty
99bd226059 MDEV-31558 Add InnoDB engine information to the slow query log
The new statistics is enabled by adding the "engine", "innodb" or "full"
option to --log-slow-verbosity

Example output:

 # Pages_accessed: 184  Pages_read: 95  Pages_updated: 0  Old_rows_read: 1
 # Pages_read_time: 17.0204  Engine_time: 248.1297

Page_read_time is time doing physical reads inside a storage engine.
(Writes cannot be tracked as these are usually done in the background).
Engine_time is the time spent inside the storage engine for the full
duration of the read/write/update calls. It uses the same code as
'analyze statement' for calculating the time spent.

The engine statistics is done with a generic interface that should be
easy for any engine to use. It can also easily be extended to provide
even more statistics.

Currently only InnoDB has counters for Pages_% and Undo_% status.
Engine_time works for all engines.

Implementation details:

class ha_handler_stats holds all engine stats.  This class is included
in handler and THD classes.
While a query is running, all statistics is updated in the handler. In
close_thread_tables() the statistics is added to the THD.

handler::handler_stats is a pointer to where statistics should be
collected. This is set to point to handler::active_handler_stats if
stats are requested. If not, it is set to 0.
handler_stats has also an element, 'active' that is 1 if stats are
requested. This is to allow engines to avoid doing any 'if's while
updating the statistics.

Cloned or partition tables have the pointer set to the base table if
status are requested.

There is a small performance impact when using --log-slow-verbosity=engine:
- All engine calls in 'select' will be timed.
- IO calls for InnoDB reads will be timed.
- Incrementation of counters are done on local variables and accesses
  are inline, so these should have very little impact.
- Statistics has to be reset for each statement for the THD and each
  used handler. This is only 40 bytes, which should be neglectable.
- For partition tables we have to loop over all partitions to update
  the handler_status as part of table_init(). Can be optimized in the
  future to only do this is log-slow-verbosity changes. For this to work
  we have to update handler_status for all opened partitions and
  also for all partitions opened in the future.

Other things:
- Added options 'engine' and 'full' to log-slow-verbosity.
- Some of the new files in the test suite comes from Percona server, which
  has similar status information.
- buf_page_optimistic_get(): Do not increment any counter, since we are
  only validating a pointer, not performing any buf_pool.page_hash lookup.
- Added THD argument to save_explain_data_intern().
- Switched arguments for save_explain_.*_data() to have
  always THD first (generates better code as other functions also have THD
  first).
2023-07-07 12:53:18 +03:00
Marko Mäkelä
1fe4bcbe05 Merge 10.11 into 11.0 2023-06-28 09:19:19 +03:00
Marko Mäkelä
71a1a28a49 Merge 10.10 into 10.11 2023-06-27 17:45:06 +03:00
Marko Mäkelä
eb6b521f1b Merge 10.6 into 10.9 2023-06-27 13:48:46 +03:00
Michael Widenius
3c7fd3c89b MDEV-23106 Unable to recognize/import partitioned tables from physical MySQL databases
MDEV-29253 Detect incompatible MySQL partition scheme and either convert
them or report to user and in error log.

This task is about converting in place MySQL 5.6 and 5.7 partition tables
to MariaDB as part of mariadb-upgrade.

- Update TABLE_SHARE::init_from_binary_frm_image() to be able to read
  MySQL frm files with partitions.
- Create .par file, if it do not exists, on open of partitioned table.

Executing mariadb-upgrade will create all the missing .par files.
The MySQL .frm file will be changed to MariaDB format after next
ALTER TABLE.

Other changes:
- If we are using stored mysql_version to distingush between MySQL and
  MariaDB  .frm file information, do not upgrade mysql_version in the
  .frm file as part of CHECK TABLE .. FOR UPGRADE as this would cause
  problems next time we parse the .frm file.
2023-06-25 16:15:08 +03:00
Marko Mäkelä
5fb2c031f7 Merge 10.11 into 11.0 2023-06-08 13:49:48 +03:00