Under `@@rpl_semi_sync_master_wait_no_slave = 0`,
when `rpl_semi_sync_master_clients` decrements to zero, the primary
reverts to async replication. This code did not check whether Semi-Sync
is still globally enabled, as it didn’t matter before.
However, after MDEV-33551 (#3089) split the transactions’ ACK condition
variables to per-transaction, this function now needs Semi-Sync’s
transaction tracker to unblock these condition variables in batch,
but this tracker is `NULL` when Semi-Sync Primary is disabled.
Co-authored-by: Kristian Nielsen <knielsen@knielsen-hq.org>
check sequence privileges in Item_func_nextval::fix_fields(),
just like column privileges are checked in Item_field::fix_fields()
remove sequence specific hacks that kinda made sequence privilege
checks work, but not in all cases. And they were too lax,
didn't require SELECT privilege for NEXTVAL. Also INSERT privilege looks
wrong here, UPDATE would've been more appropriate, but won't
change that for compatibility reasons.
also fixes
MDEV-36413 User without any privileges to a sequence can read from it and modify it via column default
Problem:
========
- After commit cc8eefb0dc (MDEV-33087),
InnoDB uses the bulk insert operation for ALTER TABLE.. ALGORITHM=COPY
and CREATE TABLE..SELECT as well. InnoDB fails to clear the bulk
buffer when it encounters an error during CREATE..SELECT. The problem
is that during transaction cleanup, InnoDB fails to identify
the bulk insert for the DDL operation.
Fix:
====
- Represent bulk_insert in trx by 2 bits. By doing that, InnoDB
can distinguish between TRX_DML_BULK and TRX_DDL_BULK. During DDL,
set the bulk insert value for the transaction to TRX_DDL_BULK.
- Introduce a parameter HA_EXTRA_ABORT_ALTER_COPY which rolls back
only a TRX_DDL_BULK transaction.
- bulk_insert_apply() for a TRX_DDL_BULK transaction happens
only during the HA_EXTRA_END_ALTER_COPY extra() call.
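As a rough illustration of the two-bit state, here is a minimal Python sketch. Only the names TRX_DML_BULK, TRX_DDL_BULK and HA_EXTRA_ABORT_ALTER_COPY come from this message; the numeric values, TRX_NO_BULK, the flags word and the helper functions are assumptions made for the example and do not mirror the InnoDB source.

```python
from enum import IntEnum


class BulkState(IntEnum):
    # Hypothetical numeric values; only the DML/DDL names are from the commit.
    TRX_NO_BULK = 0
    TRX_DML_BULK = 1
    TRX_DDL_BULK = 2


BULK_MASK = 0b11  # two bits are enough to hold the three states


def set_bulk(flags: int, state: BulkState) -> int:
    """Store the bulk state in the low two bits of a flags word."""
    return (flags & ~BULK_MASK) | int(state)


def get_bulk(flags: int) -> BulkState:
    """Read the bulk state back from the low two bits."""
    return BulkState(flags & BULK_MASK)


def rollback_on_abort_alter_copy(flags: int) -> bool:
    """Model of HA_EXTRA_ABORT_ALTER_COPY: only DDL bulk inserts roll back."""
    return get_bulk(flags) == BulkState.TRX_DDL_BULK
```

With a single bit the cleanup code could only tell "bulk" from "not bulk"; the second bit is what lets it treat the DDL case differently.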
Starting with mysql/mysql-server@02f8eaa998
and commit 2e814d4702 the index ID of
indexes on virtual columns was being encoded insufficiently in
InnoDB undo log records. Only the least significant 32 bits were
being written. This could lead to some corruption of the affected
indexes on ROLLBACK, as well as to missed chances to remove some
history from such indexes when purging the history of committed
transactions that included DELETE or an UPDATE in the indexes.
dict_hdr_create(): In debug instrumented builds, initialize the
DICT_HDR_INDEX_ID close to the 32-bit barrier, instead of initializing
it to DICT_HDR_FIRST_ID (10). This will allow the changed code to
be exercised while running ./mtr --suite=gcol,vcol.
trx_undo_log_v_idx(): Encode large index->id in a similar way as
mysql/mysql-server@e00328b4d0
but using a different implementation.
trx_undo_read_v_idx_low(): Decode large index->id in a similar way
as mach_u64_read_much_compressed().
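To see why writing only the least significant 32 bits is a problem, consider the following Python sketch. The byte layout here (a 4-byte value, or an 0xFF escape byte followed by 8 bytes) is a hypothetical format chosen for the example; it is not the actual undo log encoding, and the function names are made up.

```python
import struct

MARKER = 0xFF  # hypothetical escape byte, not the real undo log format


def encode_truncated(index_id: int) -> bytes:
    """Old behaviour in spirit: keep only the least significant 32 bits."""
    return struct.pack(">I", index_id & 0xFFFFFFFF)


def encode_index_id(index_id: int) -> bytes:
    """Write small IDs in 4 bytes; escape anything that could be ambiguous."""
    if index_id < 0xFF000000:
        return struct.pack(">I", index_id)
    return bytes([MARKER]) + struct.pack(">Q", index_id)


def decode_index_id(buf: bytes):
    """Return (index_id, number of bytes consumed)."""
    if buf[0] == MARKER:
        return struct.unpack(">Q", buf[1:9])[0], 9
    return struct.unpack(">I", buf[:4])[0], 4
```

In this model, an index ID just above the 32-bit barrier is indistinguishable from a small one once truncated, which is the kind of mix-up that can affect ROLLBACK and purge.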
Reviewed by: Debarun Banerjee
Tweak `multi_source.master_info_file` re. MDEV-31857’s new default
`Master_SSL_Verify_Server_Cert=0` was optional before that
secure-by-default change. Now, passwordless connections must disable
SSL certificate verification.
Because the default unnamed connection cannot be deleted by RESET
REPLICA ALL, it must be explicitly left passwordless with
`Master_SSL_Verify_Server_Cert=0`. The named connection is cleaned
up by RESET REPLICA ALL and thus not affected.
Adding support for the "FM" format in function TO_CHAR(date_time, fmt).
"FM" in the format string disables padding of all components following it.
So now TO_CHAR() works as follows:
- By default string format components DAY (weekday name) and
MONTH (month name) are right-padded with spaces to the maximum
possible DAY and MONTH name lengths respectively,
according to the current locale specified in @@lc_time_names.
So for example, with lc_time_names='en_US' all month names are
right-padded with spaces up to 9 characters ('September' is the longest).
SET lc_time_names='en_US';
SELECT TO_CHAR('0001-02-03', 'MONTH'); -> 'February ' (padded to 9 chars)
NEW: When typed after FM, DAY and MONTH names are not right-padded
with trailing spaces any more:
SET lc_time_names='en_US';
SELECT TO_CHAR('0001-02-03', 'FMMONTH'); -> 'February' (not padded)
- By default numeric components YYYY, YYY, YY, Y, DD, H12, H24, MI, SS
are left-padded with leading digits '0' up to the maximum possible
number of digits in the component (e.g. 4 for YYYY):
SELECT TO_CHAR('0001-02-03', 'YYYY'); -> '0001' (padded to 4 chars)
NEW: When typed after FM, these numeric components are not left-padded
with leading zeros any more:
SELECT TO_CHAR('0001-02-03', 'FMYYYY'); -> '1' (not padded)
- If FM is specified multiple times in a format string,
every FM negates the previous padding state:
* an odd FM disables padding
* an even FM enables padding
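The padding rules above can be modelled in a few lines of Python. This is a simplified sketch, not the server implementation: it assumes en_US names, handles only the YYYY, DD, MONTH and DAY components, and the function name to_char() and its tables are invented for the example.

```python
from datetime import date

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]
DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday",
        "Saturday", "Sunday"]  # date.weekday() returns 0 for Monday


def to_char(d: date, fmt: str) -> str:
    """Format d using a small subset of the Oracle-style components."""
    numeric = {"YYYY": (d.year, 4), "DD": (d.day, 2)}
    names = {"MONTH": (MONTHS[d.month - 1], max(map(len, MONTHS))),
             "DAY": (DAYS[d.weekday()], max(map(len, DAYS)))}
    out = []
    pad = True  # padding is on by default
    i = 0
    while i < len(fmt):
        rest = fmt[i:]
        if rest.startswith("FM"):
            pad = not pad  # every FM negates the previous padding state
            i += 2
            continue
        for token in ("YYYY", "MONTH", "DAY", "DD"):
            if rest.startswith(token):
                if token in numeric:
                    value, width = numeric[token]
                    out.append(str(value).zfill(width) if pad else str(value))
                else:
                    name, width = names[token]
                    out.append(name.ljust(width) if pad else name)
                i += len(token)
                break
        else:
            out.append(fmt[i])  # any other character is copied as-is
            i += 1
    return "".join(out)
```

For example, under these assumptions to_char(date(1, 2, 3), 'FMYYYY-FMDD') toggles padding off for the year and back on for the day.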
Implementation details:
- Adding a helper class Date_time_format_oracle.
- Adding a helper method Date_time_format_oracle::append_lex_cstring()
- Moving the function append_val() to Date_time_format_oracle as a method.
- Moving the function make_date_time_oracle() to Date_time_format_oracle
as a method format().
- Adding helper methods month_name() and day_name() in class MY_LOCALE,
to return the corresponding components as LEX_CSTRINGs.
Test changes only. Add wait conditions after INSERT-clauses
to make sure that they are replicated before checking
gtid position or table contents.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
Additional corrections: there is a natural race between closing the connection
to the cluster after an applying error and finishing the IST, and sometimes
IST finishes and tries to send a JOIN message over a closed connection.
This does not affect the correctness of the test or node behavior.
Added error message suppression.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
Delaying scripts on joiner after SST/IST has been made
a common debug feature for all suitable SST/IST methods.
Also some minor fixes have been made for new tests.
After cluster vote to evict a node that failed a transaction,
current master can't commit anymore.
Error voting for joiner in the JOINED state was broken because
the group-wide commit cut (implicit SUCCESS vote) was not taken
into account when processing error vote request from the JOINED
node.
This commit adds 3 MTR tests to verify the fix in the galera
library works as designed.
Requires Galera library commit 91f0090a05e96c3cc353b80d961ede45cefb9279
(galera library version > 26.4.19).
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
- With the help of MDEV-14795, InnoDB implemented a way to shrink
the InnoDB system tablespace after undo tablespaces have been moved
to separate files (MDEV-29986). Until now there was no way to
defragment the pages of InnoDB system tables. Defragmenting them
makes shrinking of the system tablespace more effective. This patch
deals with defragmentation of system tables inside ibdata1.
The following steps are done to defragment the system
tablespace:
1) Make sure that no user tables exist in ibdata1
2) Iterate through all extent descriptor pages in the system tablespace
and note their states.
3) Find free earlier extents to replace the last used
extents in the system tablespace.
4) Iterate through all indexes of the system tablespace and defragment
the tree level by level.
5) Iterate over the level from the left page to the right page and check
whether the page belongs to an extent to be replaced. If it does, then
do step (6), else step (4)
6) Prepare the allocation of the new extent by latching the necessary
pages. If any error happens, then no page has been modified
up to step (5).
7) Allocate the new page from the new extent
8) Prepare the associated pages for the block to be modified
9) Prepare the freeing of the page
10) If any error happens while preparing the associated pages or the
freeing of the page, then restore the page which was modified
during the new page allocation
11) Copy the old page content to the new page
12) Change the associated pages like the left, right and parent page
13) Complete the freeing of the old page
Allocation of a page from the new extent, changing of the related pages,
and freeing of a page are each done in 2 steps. One is prepare, which
latches the pages to be modified and validates them.
The other is complete(), which performs the operation.
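The prepare/complete split can be pictured with the following Python sketch. It is a deliberately simplified model: the Step class, its fields and run_two_phase() are hypothetical names, latching is reduced to collecting page numbers in a set, and the sketch does not cover the case where a modification made during allocation has to be restored.

```python
class Step:
    """One operation split into a validating prepare and a mutating complete."""

    def __init__(self, name, pages, valid=True):
        self.name = name
        self.pages = pages  # pages this step would modify
        self.valid = valid  # whether validation in prepare() succeeds

    def prepare(self, latched):
        # Latch and validate only; nothing is modified here.
        if not self.valid:
            raise ValueError(f"validation failed in {self.name}")
        latched.update(self.pages)

    def complete(self, log):
        # Perform the actual change; assumed not to fail after prepare().
        log.append(self.name)


def run_two_phase(steps, log):
    """Run all prepares first; apply changes only if every prepare passed."""
    latched = set()
    try:
        for step in steps:
            step.prepare(latched)
    except ValueError:
        return False  # no step has modified anything yet
    for step in steps:
        step.complete(log)
    return True
```

The point of the split is that all checks that can fail are front-loaded, so the mutating half runs only once success is assured.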
fseg_validate(): Validate the lists that exist in the inode segment
Defragmentation is enabled only when :autoextend exists in
the innodb_data_file_path variable.
The parameter innodb_log_spin_wait_delay will be deprecated and
ignored, because there is no spin loop anymore.
Thanks to commit 685d958e38
and commit a635c40648
multiple mtr_t::commit() can concurrently copy their slice of
mtr_t::m_log to the shared log_sys.buf. Each writer would allocate
their own log sequence number by invoking log_t::append_prepare()
while holding a shared log_sys.latch. This function was too heavy,
because it would invoke a minimum of 4 atomic read-modify-write
operations as well as system calls in the supposedly fast code path.
It turns out that with a simpler data structure, instead of having
several data fields that needed to be kept consistent with each other,
we only need one Atomic_relaxed<uint64_t> write_lsn_offset, on which
we can operate using fetch_add(), fetch_sub() as well as a single-bit
fetch_or(), which reasonably modern compilers (GCC 7, Clang 15 or later)
can translate into loop-free code on AMD64.
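The core idea, reserving a slice of the log with one atomic addition, can be sketched in Python as follows. This is only a model: Python has no lock-free atomics, so the AtomicU64 class emulates fetch_add() with a mutex, and the Log class with its append_prepare() is a hypothetical simplification that ignores the buffer-full case, fetch_sub() and the flag bit set with fetch_or().

```python
import threading


class AtomicU64:
    """Stand-in for Atomic_relaxed<uint64_t>; a lock emulates the atomic op."""

    def __init__(self, value=0):
        self._value = value
        self._lock = threading.Lock()

    def fetch_add(self, n):
        """Add n and return the value held before the addition."""
        with self._lock:
            old = self._value
            self._value = (old + n) & 0xFFFFFFFFFFFFFFFF
            return old

    def load(self):
        with self._lock:
            return self._value


class Log:
    def __init__(self, base_lsn):
        self.base_lsn = base_lsn  # LSN of the last write, as in the text
        self.write_lsn_offset = AtomicU64()

    def append_prepare(self, size):
        """Reserve size bytes and return the start LSN of the reserved slice."""
        return self.base_lsn + self.write_lsn_offset.fetch_add(size)
```

Because each caller receives the pre-increment value, concurrent writers obtain disjoint, contiguous LSN ranges without coordinating several separate fields.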
Before anything can be written to the log, log_sys.clear_mmap()
must be invoked.
log_t::base_lsn: The LSN of the last write_buf() or persist().
This is a rough approximation of log_sys.lsn, which will be removed.
log_t::write_lsn_offset: An Atomic_relaxed<uint64_t> that buffers
updates of write_to_buf and base_lsn.
log_t::buf_free, log_t::max_buf_free, log_t::lsn. Remove.
Replaced by base_lsn and write_lsn_offset.
log_t::buf_size: Always reflects the usable size in append_prepare().
log_t::lsn_lock: Remove. For the memory-mapped log in resize_write(),
there will be a resize_wrap_mutex.
log_t::get_lsn_approx(): Return a lower bound of get_lsn().
This should be exact unless append_prepare_wait() is pending.
log_get_lsn(): A wrapper for log_sys.get_lsn(), which must be invoked
while holding an exclusive log_sys.latch.
recv_recovery_from_checkpoint_start(): Do not invoke fil_names_clear();
it would seem to be unnecessary.
In many places, references to log_sys.get_lsn() are replaced with
log_sys.get_flushed_lsn(), which remains a simple std::atomic::load().
Reviewed by: Debarun Banerjee
archive.archive-big w1 [ fail ] timeout after 900 seconds
Test ended at 2025-03-19 22:27:30
Test case timeout after 900 seconds
== /build/mysql-test/var/1/log/archive-big.log ==
CREATE TABLE t1(a BLOB) ENGINE=ARCHIVE;
INSERT INTO t1 SELECT * FROM t1;
...
== /build/mysql-test/var/1/tmp/analyze-timeout-mysqld.1.err ==
mysqltest: Could not open connection 'default' after 500 attempts: 2002 Can't connect to local MySQL server through socket '/build/mysql-test/var/tmp/1/mysqld.1.sock' (111)
buf_block_t::initialise(): Remove a redundant call to page.lock.init()
that was already executed in buf_pool_t::create() or
buf_pool_t::resize().
This fixes a regression that was introduced in
commit b6923420f3 (MDEV-29445).
Currently, execution of a one-phase commit proceeds to commit in the
engines even when binlog_commit() does not succeed.
There are two issues with that:
1. the absence of binlog_rollback() or the lower-level
`binlog_cache_data::reset()` in the subsequent execution of the
failing statement will eventually raise an assert on a non-empty binlog
cache; see the MDEV description:
# --error assert(sql/log.cc:1712(binlog_close_connection))
# --disconnect default
2. engines, including ones that are rollback capable, commit in this
particular error situation.
Both effects can be observed with a new mtr test that would fail when run on
a BASE of this commit.
The BASE has to include the MDEV-35207 et al. fixes because the test is written
with CREATE-TABLE-SELECTs.
A new test file verifies the new rollback behaviour, including
cases with a side effect of a modified non-transactional engine, which
expose another issue, MDEV-36027 (TODO: fix).
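The intended control flow can be summarised with a small Python model. All class and function names here (Engine, BinlogCache, commit_one_phase) are hypothetical stand-ins for illustration; the sketch only shows the ordering "check the binlog result, then either commit or roll back the engines" and is not the server code.

```python
class Engine:
    """Toy transactional engine that records what was asked of it."""

    def __init__(self):
        self.committed = False
        self.rolled_back = False

    def commit(self):
        self.committed = True

    def rollback(self):
        self.rolled_back = True


class BinlogCache:
    """Toy binlog cache; commit() fails when fail=True, e.g. cache too small."""

    def __init__(self, fail=False):
        self.fail = fail
        self.pending = ["event"]

    def commit(self):
        if self.fail:
            return False  # events stay in the cache
        self.pending.clear()
        return True

    def reset(self):
        self.pending.clear()


def commit_one_phase(binlog, engines):
    """Commit engines only if the binlog commit succeeded."""
    if not binlog.commit():
        binlog.reset()  # do not leave a non-empty cache behind
        for engine in engines:
            engine.rollback()
        return False
    for engine in engines:
        engine.commit()
    return True
```

In this model a failed binlog commit leaves neither a non-empty cache (issue 1) nor a committed engine (issue 2).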
MDEV-35499 Errored-out CREATE-or-REPLACE-SELECT does not log DROP table into binlog
MDEV-35502 Failed at ROW-format binlogging CREATE-TABLE-SELECT should
not generate Incident event
When a CREATE TABLE .. SELECT errors while inserting data, a user
would expect that all changes are rolled back
and the table would not exist after executing the query.
However, CREATE-TABLE-SELECT can face an error near the end of its execution,
in select_create::send_eof(), and that error was never checked, which
led to various assertion failures inside a binlogging path that should not
have been reached at all.
Specifically, when binlog_commit() of ha_commit_one_phase(), which
CREATE-TABLE-SELECT employs, errored out because of a limited cache size
(binlog_commit may try writing to a transactional cache), the cache
was not flushed to the binlog. The missed error check allowed further
execution down to trans_commit_implicit(), in whose stack
DBUG_ASSERT(!(entry->using_trx_cache && !mngr->trx_cache.empty() &&
mngr->get_binlog_cache_log(TRUE)->error));
fired. In a non-debug build that table remains created/populated
inconsistently with binlog.
The fix adds the missing error check in select_create::send_eof().
That prevents any further execution when ha_commit_one_phase() fails
for any reason (typically due to binlog_commit()).
This commit also covers CREATE-or-REPLACE-SELECT, which additionally had
a specific issue in that DROP TABLE was not logged to the binary log, MDEV-35499.
See the changes in select_create::abort_result_set().
The current commit also corrects an unnecessary Incident event
logging when CREATE-TABLE-SELECT encounters a binlogging issue, MDEV-35502.
The Incident was actually only harmful in this case as the table was
never going to be created, therefore replicated, in such a case.
In "normal" cases when the SELECT phase errors due to binlogging, an
internal incident flag gets reset inside select_create::abort_result_set().
A hunk in select_insert::prepare_eof() addresses a specific kind of
this issue that deals with incorrect computation of the binlog cache type.
Because of that in the OLD version execution was allowed to proceed along
ha_commit_trans()..binlog_commit() while a Pending event was not
flushed to the transactional cache. That might lead to the unnecessary
binlogged Incident despite the select_create::abort_result_set()
measures. However now with the corrected cache type any binlogging error
to flush the Pending event is covered according to the normal case.
non-transaction table, updates to the non-transactional table
NOTE the commit contains a few tests overlapping with the not yet fixed MDEV-36027.
Thanks to Brandon Nesterenko and Kristian Nielsen for thorough review,
and Kristian additionally for ideas to simplify the patch and some
code contribution.
Problem:
=======
- While loading the foreign key constraints for the parent table,
if the child table wasn't open, then InnoDB uses the parent table heap
to store the child table name in the fk_tables list. If a subsequent
foreign key relation for the parent table fails with an error,
InnoDB evicts the parent table from memory. But InnoDB accesses the
evicted table memory again in dict_sys.load_table()
Solution:
========
dict_load_table_one(): In case of error, remove the child table
names that were added during dict_load_foreigns()
Problem:
========
- When InnoDB does consecutive instant ALTER operations and the first instant DDL
fails, it fails to reset the old instant information in the table during
rollback. This leads the subsequent instant ALTER to make wrong
assumptions about the existing instant column information.
Fix:
====
dict_table_t::instant_column(): Duplicate the instant information
field of the table. By doing this, InnoDB alter retains the old
instant information and resets it during the rollback operation
The test in commit 1756b0f37d
is occasionally failing if there are unexpectedly many page cleaner
batches that are updating the log checkpoint by small amounts.
This occurs in particular when running the server under Valgrind.
Let us insert the same number of records with a larger number of
statements in a hope that the test would then be more likely to pass.
It prevents a crash in wsrep_report_error() which happened when appliers would run
with FK and UK checks disabled and erroneously execute plain inserts as bulk inserts.
Moreover, in release builds such a behavior could lead to deadlocks between two applier
threads if a thread waiting for a table-level lock was ordered before the lock holder.
In that case the lock holder would proceed to commit order and wait forever for the
now-blocked other applier thread to commit first.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
The problem was that a transaction was BF-aborted after certification
succeeded; the transaction tried to roll back, and during
rollback the binlog stmt cache containing sequence value reservations
was written into the binlog.
The transaction must replay because certification succeeded, but
the transaction must not be written into the binlog yet; that will
be done during commit after the replay.
The fix is to skip the binlog write if the transaction must replay, and
in replay we need to reset the binlog stmt cache.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
Fix a regression on test galera_sr.GCF-572 introduced by commit
c9a6adba. This commit partially reverted a trivial test fix for
the galera_sr.GCF-572 test (commit 11ef59fb), which was targeted
at 10.6.
This backports the fix from commit 11ef59fb to 10.5, so
that the test stays the same in all versions >= 10.5.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
in row_update_for_mysql
932ec586 (MDEV-23644) in TABLE::delete_row() added ha_delete_row() for
the case of HA_ERR_FOREIGN_DUPLICATE_KEY. The problem is that
ha_update_row(), called beforehand, may change m_last_part, which is
required for ha_delete_row() to delete from the correct partition.
The fix reverts m_last_part in case ha_partition::update_row() fails.
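A toy Python model of the problem and the fix is shown below. Only the name m_last_part is taken from this message; the PartitionedTable class, its modulo partitioning and the fail flag are assumptions made for the example and are far simpler than ha_partition.

```python
class PartitionedTable:
    """Toy table partitioned by key modulo the number of partitions."""

    def __init__(self, nparts):
        self.parts = [dict() for _ in range(nparts)]
        self.m_last_part = 0  # partition of the row handled last

    def _part_of(self, key):
        return key % len(self.parts)

    def insert(self, key, value):
        self.parts[self._part_of(key)][key] = value

    def read(self, key):
        self.m_last_part = self._part_of(key)
        return self.parts[self.m_last_part][key]

    def update_row(self, old_key, new_key, fail=False):
        saved = self.m_last_part
        self.m_last_part = self._part_of(new_key)
        if fail:  # e.g. the engine reports a foreign key error
            self.m_last_part = saved  # the fix: revert on failure
            return False
        value = self.parts[saved].pop(old_key)
        self.parts[self.m_last_part][new_key] = value
        return True

    def delete_row(self, key):
        # Relies on m_last_part pointing at the partition holding the row.
        del self.parts[self.m_last_part][key]
```

Without the revert in the failure branch, delete_row() in this model would look for the row in the partition of the new key and miss it.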
The test case was missing a wait for the SQL thread to complete its work before checking the value of @@GLOBAL.gtid_slave_pos.
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>