ha_innobase::info_low(): Assert that dict_table_t::stat_initialized()
holds only within the critical section. Changes to this field should be
protected by dict_table_t::lock_latch.
Problem:
========
- After commit cc8eefb0dc (MDEV-33087),
InnoDB uses the bulk insert operation for ALTER TABLE..ALGORITHM=COPY
and CREATE TABLE..SELECT as well. InnoDB fails to clear the bulk
buffer when it encounters an error during CREATE..SELECT. The problem
is that during transaction cleanup, InnoDB fails to identify
the bulk insert as belonging to a DDL operation.
Fix:
====
- Represent bulk_insert in trx by 2 bits. By doing that, InnoDB
can distinguish between TRX_DML_BULK and TRX_DDL_BULK. During DDL,
set the transaction's bulk insert value to TRX_DDL_BULK.
- Introduce a parameter HA_EXTRA_ABORT_ALTER_COPY which rolls back
only TRX_DDL_BULK transactions.
- For a TRX_DDL_BULK transaction, bulk_insert_apply() happens
only during the HA_EXTRA_END_ALTER_COPY extra() call.
...when using io_uring on a potentially affected kernel"
Remove version check on the kernel as it now corresponds to
a working RHEL9 kernel and the problem was only there in
pre-release kernels that shouldn't have been used in production.
This reverts commit 1193a793c4.
Remove version check on the kernel as it now corresponds to
a working RHEL9 kernel and the problem was only there in
pre-release kernels that shouldn't have been used in production.
This reverts commit 3dc0d884ec.
Starting with mysql/mysql-server@02f8eaa998
and commit 2e814d4702 the index ID of
indexes on virtual columns was being encoded insufficiently in
InnoDB undo log records. Only the least significant 32 bits were
being written. This could lead to some corruption of the affected
indexes on ROLLBACK, as well as to missed chances to remove some
history from such indexes when purging the history of committed
transactions that included DELETE or an UPDATE in the indexes.
dict_hdr_create(): In debug instrumented builds, initialize the
DICT_HDR_INDEX_ID close to the 32-bit barrier, instead of initializing
it to DICT_HDR_FIRST_ID (10). This will allow the changed code to
be exercised while running ./mtr --suite=gcol,vcol.
trx_undo_log_v_idx(): Encode large index->id in a similar way as
mysql/mysql-server@e00328b4d0
but using a different implementation.
trx_undo_read_v_idx_low(): Decode large index->id in a similar way
as mach_u64_read_much_compressed().
Reviewed by: Debarun Banerjee
Added retry logic to certain file operations during installation as a
workaround for issues caused by buggy antivirus software on Windows.
Retry logic was added for WritePrivateProfileString (mysql_install_db.cc)
and for file renaming in InnoDB.
The problem was that a thread was holding lock_sys.wait_mutex while
streaming replication transaction rollback was handled, and wsrep-lib
then requested the THD::LOCK_thd_kill mutex, causing
wrong mutex usage (thd->reset_globals()).
The fix is to remove streaming replication rollback handling
from Deadlock::report(), i.e. the wsrep_handle_SR_rollback call.
The purpose of Deadlock::report() is to find a cycle in the
waits-for graph if one exists, report it, mark the victim transaction
as a deadlock victim, and release the locks it is waiting for.
Actual streaming replication rollback that can take longer
time can be handled later at trx_t::rollback where
lock_sys.wait_mutex is not held.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
log_t::clear_mmap(): Do not modify buf_size; we may have
file_size==0 here during bootstrap.
log_t::set_recovered(): If we are writing to a memory-mapped log,
update log_sys.buf_size to the record payload area of log_sys.buf.
This fixes up commit acd071f599
(MDEV-21923).
buf_buddy_alloc_from(): Pass the correct argument to
buf_pool.contains_zip(). This fixes a failure of the test
encryption.innochecksum when the code is built with
cmake -DWITH_UBSAN=ON -DCMAKE_BUILD_TYPE=Debug
- With the help of MDEV-14795, InnoDB implemented a way to shrink
the InnoDB system tablespace after undo tablespaces have been moved
to separate files (MDEV-29986). However, there was no way to defragment
the pages of InnoDB system tables; doing so makes shrinking the
system tablespace more effective. This patch implements
defragmentation of the system tables inside ibdata1.
Following steps are done to do the defragmentation of system
tablespace:
1) Make sure that no user tables exist in ibdata1
2) Iterate through all extent descriptor pages in system tablespace
and note their states.
3) Find an earlier free extent to replace the last used
extents in the system tablespace.
4) Iterate through all indexes of system tablespace and defragment
the tree level by level.
5) Iterate the level from the left page to the right page and find
any page that belongs to an extent to be replaced. If one is found,
do step (6); otherwise continue with step (4)
6) Prepare the allocation of the new extent by latching the necessary
pages. If any error happens here, no page has been modified
up to this point.
7) Allocate the new page from the new extent
8) Prepare the associated pages for the block to be modified
9) Prepare the freeing of the old page
10) If any error happens while preparing the associated pages or
freeing the page, restore the page that was modified
during the new page allocation
11) Copy the old page content to new page
12) Update the associated pages, i.e. the left, right and parent pages
13) Complete the freeing of old page
Allocation of a page from the new extent, updating of the related
pages, and freeing of a page are each done in 2 steps: prepare(),
which latches the pages to be modified and validates them, and
complete(), which performs the actual operation.
fseg_validate(): Validate the lists in the inode segment
Defragmentation is enabled only when :autoextend exists in the
innodb_data_file_path variable.
Update cmake_minimum_required to 2.8...3.12 in the root CMakeLists.txt
and in mroonga. This updates the "policy version" to 3.12, which will
not prevent the build with even higher CMake versions. There is also a
reason to stay on a policy version compatible with Windows, so 3.12 is
conservatively chosen.
On the other hand, it will require at least version 2.8.
The parameter innodb_log_spin_wait_delay will be deprecated and
ignored, because there is no spin loop anymore.
Thanks to commit 685d958e38
and commit a635c40648
multiple mtr_t::commit() can concurrently copy their slice of
mtr_t::m_log to the shared log_sys.buf. Each writer would allocate
their own log sequence number by invoking log_t::append_prepare()
while holding a shared log_sys.latch. This function was too heavy,
because it would invoke a minimum of 4 atomic read-modify-write
operations as well as system calls in the supposedly fast code path.
It turns out that with a simpler data structure, instead of having
several data fields that needed to be kept consistent with each other,
we only need one Atomic_relaxed<uint64_t> write_lsn_offset, on which
we can operate using fetch_add(), fetch_sub() as well as a single-bit
fetch_or(), which reasonably modern compilers (GCC 7, Clang 15 or later)
can translate into loop-free code on AMD64.
Before anything can be written to the log, log_sys.clear_mmap()
must be invoked.
log_t::base_lsn: The LSN of the last write_buf() or persist().
This is a rough approximation of log_sys.lsn, which will be removed.
log_t::write_lsn_offset: An Atomic_relaxed<uint64_t> that buffers
updates of write_to_buf and base_lsn.
log_t::buf_free, log_t::max_buf_free, log_t::lsn: Remove;
replaced by base_lsn and write_lsn_offset.
log_t::buf_size: Always reflects the usable size in append_prepare().
log_t::lsn_lock: Remove. For the memory-mapped log in resize_write(),
there will be a resize_wrap_mutex.
log_t::get_lsn_approx(): Return a lower bound of get_lsn().
This should be exact unless append_prepare_wait() is pending.
log_get_lsn(): A wrapper for log_sys.get_lsn(), which must be invoked
while holding an exclusive log_sys.latch.
recv_recovery_from_checkpoint_start(): Do not invoke fil_names_clear();
it would seem to be unnecessary.
In many places, references to log_sys.get_lsn() are replaced with
log_sys.get_flushed_lsn(), which remains a simple std::atomic::load().
Reviewed by: Debarun Banerjee
recv_sys_t::report_progress(): Display the largest currently known LSN.
recv_scan_log(): Display an error with fewer function calls.
Reviewed by: Debarun Banerjee
With view protocol, a SELECT statement is transformed into two
statements:
1. CREATE OR REPLACE VIEW mysqltest_tmp_v AS SELECT ...
2. SELECT * FROM mysqltest_tmp_v
The first statement reconstructs the query, which is executed in the
second statement.
The reconstruction often replaces aliases in ORDER BY by the original
item.
For example, in the test spider/bugfix.mdev_29008 the query
SELECT MIN(t2.a) AS f1, t1.b AS f2 FROM tbl_a AS t1 JOIN tbl_a AS t2 GROUP BY f2 ORDER BY f1, f2;
is transformed to
"select min(`t2`.`a`) AS `f1`,`t1`.`b` AS `f2` from (`auto_test_local`.`tbl_a` `t1` join `auto_test_local`.`tbl_a` `t2`) group by `t1`.`b` order by min(`t2`.`a`),`t1`.`b`"
In such cases, spider constructs different queries to execute at the
data node. So we disable view protocol for such queries.
With view protocol, the GBH is often not created during optimization,
because join->tables_list is the view mysqltest_tmp_v, whose engine is
MEMORY, which does not implement a GBH.
In such cases, if without view protocol the test takes a path that
does create a spider GBH, the resulting queries sent to the data node
often differ.
Therefore we disable view protocol for these statements.
Spider needs to lock the spider table when executing the udf, but the
server layer would have already locked tables in view protocol because
it transforms the query:
select spider_copy_table('t', 0, 1)
to two queries
create or replace view mysqltest_tmp_v as select
spider_copy_table('t', 0, 1);
select * from mysqltest_tmp_v;
So spider justifiably errors out in this case by checking
thd->derived_tables and thd->locks in spider_copy_tables_body()
If one of the selected fields is a MIN or MAX and it has been optimized
into a constant, it is not added to the temp table used by a group by
handler (GBH). The GBH therefore cannot store results to this missing
field.
On the other hand, when SELECTing from a view or a derived table,
TMP_TABLE_ALL_COLUMNS is set. If the query has no group by or order
by, an Item_temptable_field is created for this MIN/MAX field and
added to the JOIN. Since the GBH could not store results to the
corresponding field in the temp table, the value of this
Item_temptable_field remains NULL. And the NULL value is passed to the
record, then the temp row, and finally output as the (wrong) result.
To fix this, we opt not to create a spider GBH when a view or
derived table is involved.
This fixes spider/bugfix.mdev_26345 for --view-protocol
Also fixed a comment:
TABLE_LIST::belong_to_derived is NULL if the table belongs to a
derived table that has non-MERGE type.
Running mtr --view-protocol transforms SELECT statements into a CREATE
OR REPLACE VIEW of the statement, followed by a SELECT from the view.
Thus, when spider tests check the query log for SELECT statements, the
output often differs with --view-protocol compared to without.
Fix by adding disable/enable_view_protocol pairs around these
statements. Most of these statements are already surrounded by existing
disable/enable_ps[2]_protocol pairs.
Acked-by: Yuchen Pei <ycp@mariadb.com>
Connect engine fails to build with libxml2 2.14.0.
Connect engine uses "#ifndef BASE_BUFFER_SIZE" to determine whether
libxml2 is available. If libxml2 was unavailable, it redefined the
xmlElementType enum of libxml/tree.h. The reason for this redefinition
is vague; most probably some of these constants were used when Connect
was compiled with MSXML while libxml2 was disabled.
However, the BASE_BUFFER_SIZE constant was recently removed from
libxml2; as a result, Connect fails to build due to redefinition of the
xmlElementType constants.
Use LIBXML2_SUPPORT instead of BASE_BUFFER_SIZE for libxml2 availability
check.
The value of dv[0].data being null showed up
in the mtr tests:
mroonga/storage.alter_table_fulltext_add_no_primary_key
as:
/source/storage/mroonga/vendor/groonga/lib/ii.c:2052:37: runtime error: applying non-zero offset 28 to null pointer
Correct this by entering the if branch when the pointer is null.
The free is valid, and data of the given size is allocated.
buf_block_t::initialise(): Remove a redundant call to page.lock.init()
that was already executed in buf_pool_t::create() or
buf_pool_t::resize().
This fixes a regression that was introduced in
commit b6923420f3 (MDEV-29445).
Prepare for a more modern CMake version than the current minimum.
- Use CMAKE_MSVC_RUNTIME_LIBRARY instead of the custom MSVC_CRT_TYPE.
- Replace CMAKE_{C,CXX}_FLAGS modifications with
add_compile_definitions/options and add_link_options.
The older method already broke with new pcre2.
- Fix clang-cl compilation and ASAN build.
- Avoid modifying CMAKE_C_STANDARD_LIBRARIES/CMAKE_CXX_STANDARD_LIBRARIES,
as this is discouraged by CMake.
- Reduce system checks.
- Fix several Windows-specific "variable set but not used"
  or "variable unused" warnings.
- Correctly initialize std::atomic_flag (ATOMIC_FLAG_INIT).
- Fix the Ninja build for spider on Windows.
- Adjust the check for sizeof(MYSQL) for Windows compilers.
Problem:
=======
- While loading the foreign key constraints for the parent table,
if the child table wasn't open, InnoDB uses the parent table's heap
to store the child table name in the fk_tables list. If a subsequent
foreign key relation for the parent table fails with an error,
InnoDB evicts the parent table from memory. But InnoDB accesses the
evicted table's memory again in dict_sys.load_table()
Solution:
========
dict_load_table_one(): In case of error, remove the child table
names which were added during dict_load_foreigns()
Problem:
========
- When InnoDB performs consecutive instant alter operations and the
first instant DDL fails, it fails to reset the old instant information
in the table during rollback. This leads the subsequent instant alter
to make wrong assumptions about the existing instant column information.
Fix:
====
dict_table_t::instant_column(): Duplicate the instant information
field of the table. By doing this, the InnoDB alter retains the old
instant information and resets it during the rollback operation.
It prevents a crash in wsrep_report_error() which happened when appliers would run
with FK and UK checks disabled and erroneously execute plain inserts as bulk inserts.
Moreover, in release builds such a behavior could lead to deadlocks between two applier
threads if a thread waiting for a table-level lock was ordered before the lock holder.
In that case the lock holder would proceed to commit order and wait forever for the
now-blocked other applier thread to commit before.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
With --view-protocol, mtr transforms a SELECT query to two queries:
1. CREATE OR REPLACE VIEW mysqltest_tmp_v AS ...
2. SELECT * FROM mysqltest_tmp_v
where ... is the original query. Further, mtr may run the first query
in a separate connection.
On the other hand if the data node is the same as the spider node,
spider_same_server_link is required for connection to the data node.
Therefore, for mtr --view-protocol tests often spider_same_server_link
needs to be set on both session and global levels. In this patch we
add the missing "SET GLOBAL spider_same_server_link=1" queries to
tests that fail with wrong results due to this issue.
It does not fix --view-protocol for all the affected tests, because
there are other issues fixed in subsequent patches.
During regular iteration the page cleaner does flush from flush list
with some flush target and then goes for generating free pages from LRU
tail. When asynchronous flushing is triggered, i.e. when 7/8th of the
LSN margin in the redo log is filled, the flush target for the flush
list is set to innodb_io_capacity_max. If the flush-list pass could
flush everything, the flush bandwidth for the LRU flush is currently
set to zero. If the LRU tail has dirty pages, the page cleaner ends up
freeing no pages in one iteration.
The scenario could repeat across multiple iterations till async flush
target is reached. During this time the DB system is starved of free
pages resulting in apparent stall and in some cases dict_sys latch
fatal error.
Fix: In the page cleaner iteration, before the LRU flush, ensure we
provide a large enough flush limit so that freeing pages is not blocked
by dirty pages in the LRU tail. Log the I/O and flush state if the
doublewrite flush wait is long.
Reviewed by: Marko Mäkelä
innodb_ft_aux_table_validate(): If the table is found in InnoDB
but not valid for the parameter, only invoke dict_sys.unlock() once.
This fixes a regression due to MDEV-36122.
log_checkpoint(): In cmake -DWITH_VALGRIND=ON builds, let us
wait for all outstanding writes to complete, in order to
avoid an unexpectedly large number of innodb_log_writes
in the test innodb.page_cleaner.