insufficient grants
Defer privilege checking until fix_fields. This way ALTER will behave
consistently with CREATE and require the same privileges for a sequence
column (SELECT/INSERT)
MDEV-33813 caused a regression: when a disk got full while writing to
a MyISAM or Aria table, the MariaDB connection would hang until the
query was killed instead of retrying after 60 seconds.
Fixed by changing mysql_cond_wait() to mysql_cond_timedwait()
Author: Thomas Stangner
handler::clone() did not work with read-only tables like S3.
It gave a wrong error message (out of memory instead of a permission
error) and aborted the query.
The issue was that the clone call had a wrong parameter to ha_open().
This is now fixed. I also changed the clone call to provide the correct
error message if things fail.
This patch fixes an 'out of memory' error when using the S3 engine
for queries that could use multiple indexes together to find the matching
rows, like the following:
SELECT * FROM t1 WHERE key1 = 99 OR key2 = 2
This commit fixes a bug where Aria tables are used in a replication
chain (master->slave1->slave2) and a backup is taken on slave2. In this
case it is possible that the replication position in the backup, stored
in mysql.gtid_slave_pos, will be wrong. This will lead to replication
errors if one tries to use the backup as a new slave.
Analysis:
Replicated row events are committed with trans_commit_stmt() and
thd->transaction->all.ha_list != 0.
This means that backup_commit_lock is not taken for Aria tables,
which means the rows are committed and binary logged on the slave
under BLOCK_COMMIT which should not happen.
This issue does not occur on the master, as thd->transaction->all.ha_list
is 0 under AUTO_COMMIT, which sets 'is_real_trans' and 'rw_trans',
which in turn causes backup_commit_lock to be taken.
Fixed by checking in ha_check_and_coalesce_trx_read_only() whether all
handlers support rollback and, if not, waiting for BLOCK_COMMIT also
for a statement commit.
forever, cannot be killed
mysql_rm_table_no_locks() does TDC_RT_REMOVE_ALL, which waits until the
share is closed. The table is normally open only as an OPEN_STUB; this
is what the parser does for CREATE TABLE. But for SELECT the table is
opened not as a stub. If it is the same table name, we anyway have two
TABLE_LIST objects: stub and not stub. So for the "not stub" one
TDC_RT_REMOVE_ALL sees an open count and decides to wait until it is
closed. And it hangs, because that table was opened in the same thread.
The fix disables subqueries in CHECK expressions at the parser
level. Thanks to Sergei Golubchik <serg@mariadb.org> for the patch.
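A hedged sketch of the construct involved (hypothetical table name;
after the fix the parser rejects the subquery outright):

  CREATE OR REPLACE TABLE t1 (
    a INT,
    CHECK (a IN (SELECT a FROM t1))  -- subquery in CHECK: now rejected
  );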
Oracle mode has different set operator precedence and handling (not
according to the standard). In Oracle mode the test case below is
handled as-is, in plain order from left to right. MariaDB's default
mode follows the SQL standard and gives INTERSECT higher precedence, so
UNION takes its input from a derived table which is the INTERSECT
result (here and below the same applies to EXCEPT).
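For example (a minimal sketch with constant selects):

  SELECT 1 UNION SELECT 2 INTERSECT SELECT 2;
  -- Default mode: SELECT 1 UNION (SELECT 2 INTERSECT SELECT 2) -> 1, 2
  -- Oracle mode:  (SELECT 1 UNION SELECT 2) INTERSECT SELECT 2 -> 2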
Non-distinct set operators (UNION ALL/INTERSECT ALL) work via unique
key release, but that can be done only once: we cannot add an index to
a non-empty heap table (see heap_enable_indexes()). So every UNION ALL
before the rightmost UNION DISTINCT works as UNION DISTINCT. That is
common behaviour; MySQL, MSSQL and Oracle work that way.
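For instance (a minimal sketch):

  SELECT 1 UNION ALL SELECT 1 UNION SELECT 2;
  -- The UNION ALL precedes the rightmost UNION DISTINCT, so it is
  -- executed as UNION DISTINCT too; the result is 1, 2 (no duplicate 1).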
There is a union_distinct property which indicates the rightmost
distinct UNION; thanks to it the algorithm works simply: it releases
the unique key after union_distinct in the loop
(st_select_lex_unit::exec()).
The INTERSECT ALL code (implemented by MDEV-18844 in a896beb) does not
know about Oracle mode and treats union_distinct as the last
operation; that is why it releases the unique key on the
union_distinct operation. INTERSECT ALL requires the unique key to
work, so the unique key must not be released before any INTERSECT ALL
(see select_unit_ext::send_data()).
The patch tweaks the INTERSECT ALL code for Oracle mode. In
disable_index_if_needed() it does not allow unique key release before
the last operation, and it allows unfold on the last operation. The
test case with UNION DISTINCT following INTERSECT ALL at least does
not include invalid data, but in fact the whole INTERSECT ALL code
could be refactored for better semantic triggers.
The patch fixes a typo in st_select_lex_unit::prepare() where a local
have_except_all_or_intersect_all masked the eponymous data member,
which wrongly triggered unique key release in
st_select_lex_unit::prepare().
The patch also fixes an unknown error in case ha_disable_indexes()
fails.
Note: optimize_bag_operation() does some operator substitutions, but
it does not run under PS. So if there is a difference in a test with
--ps, that means the non-optimized (have_except_all_or_intersect_all
== true) code path is not good.
Note 2: a VIEW is stored and executed in normal mode (see
Sql_mode_save_for_frm_handling), hence when the SELECT order is
different in Oracle mode (defined by parsed_select_expr_cont()) it
must be covered by --disable_view_protocol.
THD::reset_sub_statement_state and THD::restore_sub_statement_state
swap the auto_inc_intervals_forced (Discrete_intervals_list) of the
THD with a local temporary variable in order to execute other things
before restoring it at the end of Table_triggers_list::process_triggers
under the rpl_master_erroneous_autoinc(true) condition, as exposed by
the rpl.rpl_trigger test.
The uninitialized data isn't used and the only required action is to
copy the data in one direction. As the intent is for the auto_inc_intervals_forced
value to be overwritten or unused, MEM_UNDEFINED is used on it to
ensure the previous state is considered invalid.
The other uses of reset_sub_statement_state in Item_sp::execute_impl
also follow the same pattern of taking a copy to restore within the
same function.
Despite being included in the HAVE_valgrind define, MSAN is best
differentiated from valgrind in the server identifier, as the two
have, for these purposes, distinct and different sets of behaviours.
MSAN has its own set of test inclusions that are different from
valgrind's, and so including "valgrind" in a server string that
gets tested for valgrind will incorrectly exclude some tests
that are suitable for MSAN but not valgrind.
There's a have_sanitizer system variable for exposing
the sanitizer being used, so there's no need for
version verboseness.
Also correct the have_sanitizer system variable description to
mention that MSAN has been possible for a while.
Problem:
=======
- During the inplace algorithm, concurrent DML fails to write
the log operation into the temporary file, and InnoDB fails to
mark the error for the online log.
- ddl_log_write() releases the global ddl lock prematurely, before
releasing the log memory entry
Fix:
===
row_log_online_op(): Mark the error in the online log when
InnoDB runs out of temporary space
fil_space_extend_must_retry(): Mark os_has_said_disk_full
as true if os_file_set_size() fails
btr_cur_pessimistic_update(): Return the error code when
btr_cur_pessimistic_insert() fails
ddl_log_write(): Release the global ddl lock after releasing the
log memory entry when an error was encountered
btr_cur_optimistic_update(): Relax the assertion so that the
blob pointer can be null during rollback, because InnoDB can
run out of space while allocating the external page
row_undo_mod_upd_exist_sec(): Remove the assertion which says
that InnoDB should fail to build the index entry when rolling back
an incomplete transaction after crash recovery. This scenario
can happen when InnoDB runs out of space.
row_upd_changes_ord_field_binary_func(): Relax the assertion so
that an externally stored field can be null when InnoDB runs out
of space.
Issue:
MariaDB acquires additional MDL locks for UPDATE/INSERT/DELETE statements
on tables with foreign keys. For example, if table t1 references t2, an
UPDATE to t1 will MDL-lock t2 in addition to t1.
A replica may deliver an ALTER t1 and an UPDATE t2 concurrently for
applying. The UPDATE may then acquire the MDL lock for t1, followed by a
conflict when the ALTER attempts to MDL-lock t1, causing a BF-BF
conflict.
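A minimal sketch of the scenario (hypothetical tables):

  CREATE TABLE t2 (pk INT PRIMARY KEY);
  CREATE TABLE t1 (pk INT PRIMARY KEY,
                   fk INT, FOREIGN KEY (fk) REFERENCES t2 (pk));
  -- Applier 1: UPDATE t2 ...      (also MDL-locks the child table t1)
  -- Applier 2: ALTER TABLE t1 ... (BF-BF conflict on t1's MDL)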
Solution:
Additional keys for the referenced/foreign table need to be added
to avoid potential MDL conflicts with concurrent updates and DDLs.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
page_is_corrupted(): Do not allocate the buffers from stack,
but from the heap, in xb_fil_cur_open().
row_quiesce_write_cfg(): Issue one type of message when we
fail to create the .cfg file.
update_statistics_for_table(), read_statistics_for_table(),
delete_statistics_for_table(), rename_table_in_stat_tables():
Use a common stack buffer for Index_stat, Column_stat, Table_stat.
ha_connect::FileExists(): Invoke push_warning_printf() so that
we can avoid allocating a buffer for snprintf().
translog_init_with_table(): Do not duplicate TRANSLOG_PAGE_SIZE_BUFF.
Let us also globally enable the GCC 4.4 and clang 3.0 option
-Wframe-larger-than=16384 to reduce the possibility of introducing
such stack overflow in the future. For RocksDB and Mroonga we relax
these limits.
Reviewed by: Vladislav Lesin
The SQL service leaves the affected-rows count uninitialized.
Initialization of the spider plugin, which uses this service, will
fail under MSAN because there isn't an initialized value to return
at the end of the query.
Valgrind is single threaded and only changes threads as part of
system calls or waits.
Some busy loops were identified and fixed where the server assumes
that some other thread will change the state, which will not happen
with valgrind.
Based on a patch by Monty. The original patch introduced VALGRIND_YIELD,
which emits pthread_yield() only in valgrind builds. However, it was
agreed that it is a good idea to emit yield() unconditionally, so
that other affected schedulers (like SCHED_FIFO) benefit from this
change. Also avoid pthread_yield() in favour of the standard
std::this_thread::yield().
Ensure that backup_reset_alter_copy_lock() is called in case of rollback
or error in mysql_inplace_alter_table() or copy_data_between_tables().
Other things:
- Improved the error message from mariabackup when an unexpected DDL
operation is encountered.
- Added an assert if backup_ddl_log() is called in the wrong context.
This is needed to make it easy for users to automatically ignore long
CHAR and VARCHAR columns when using ANALYZE TABLE PERSISTENT.
These fields can cause problems as they will consume
'CHARACTERS * MAX_CHARACTER_LENGTH * 2 * number_of_rows' space on disk
during analyze, which can easily be much bigger than the analyzed table.
This commit adds a new user variable, analyze_max_length, with a default
value of 4G. Any field that is bigger than this in bytes will be ignored
by ANALYZE TABLE PERSISTENT unless it is explicitly listed in FOR COLUMNS().
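A usage sketch (hypothetical table; assumes the variable can be set at
session level):

  SET SESSION analyze_max_length = 1024;
  -- long_text is wider than 1024 bytes, so it is skipped:
  ANALYZE TABLE t1 PERSISTENT FOR ALL;
  -- but naming the column explicitly overrides the limit:
  ANALYZE TABLE t1 PERSISTENT FOR COLUMNS (long_text) INDEXES ();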
While doing this patch, I noticed that we do not skip GEOMETRY columns
in ANALYZE TABLE, like we do with BLOB. This should be fixed when
merging to the 'main' branch. At the same time we should add a
reasonable default value for analyze_max_length, probably 1024, like
we have for max_sort_length.
Get rid of the need for materialization for the usual INSERT (cache
results in Item_cache* if needed)
- subqueries in VALUES do not see new records in the table we are
inserting into
- subqueries in RETURNING are prohibited from using the table we are
inserting into (see the sketch below)
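A sketch of the behaviour described above (hypothetical table t1):

  -- The VALUES subquery reads the target table but does not see
  -- the rows being inserted by this statement:
  INSERT INTO t1 (a) VALUES ((SELECT COUNT(*) FROM t1));
  -- A RETURNING subquery on the target table is rejected:
  INSERT INTO t1 (a) VALUES (1)
    RETURNING (SELECT MAX(a) FROM t1);  -- error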
When allocate_block() has failed, Query_cache::insert_table() misses
initializing node->parent, which is then accessed with outdated data.
There is a check in Query_cache::register_all_tables():
if (block_table->parent)
unlink_table(block_table);
So unlink_table() is avoided when parent is 0.
The problem is with window functions, which require their own sorting,
but that is impossible as the table is updated while the select is
progressing. That is, multi-update queries do not use temporary
tables to store updated rows and substitute them at the end of the
query. Instead, updates are performed directly on the target table,
row by row, during query execution. MariaDB processes updates in a way
that ensures each row is updated only once, even if it matches
multiple conditions in the query. This behaviour avoids redundant
updates and does not require intermediate storage in a temporary table.
The detailed cause of the loop invoked by the window function was
explained by Yuchen Pei in the MDEV-31647 comments.
The fix disables window functions for multi-update.
Note that MySQL throws ER_UPDATE_TABLE_USED in that case, which is the
result of check_unique_table(). But this function is not used for
multi-update in MariaDB, and this check cannot be done because some
level of SELECT expressions is allowed (MDEV-13911).
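A hedged sketch of a now-rejected query (hypothetical tables):

  UPDATE t1, t2
    SET t1.a = ROW_NUMBER() OVER (ORDER BY t2.b)
    WHERE t1.id = t2.id;  -- window function in multi-update: error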
failed in find_field_in_table_ref
The main crash with a segfault in find_field_in_tables() was fixed by
6aa47fae30 (MDEV-35276). This fix is for a debug assertion.
Since Item_default_value is also an Item_field, there is nothing to be
done except adding a DEFAULT_VALUE_ITEM type check.
This is a safety fix to try to address random failures in
parallel_backup_xa_debug reported as:
sync_slave_with_master failed:
'select master_pos_wait('master-bin.000001', 1034, 300, '')' returned -1
One possible reason could be lost signals, which this patch fixes.
Under `@@rpl_semi_sync_master_wait_no_slave = 0`,
when `rpl_semi_sync_master_clients` decrements to zero, the primary
reverts to async replication. This code did not check whether
Semi-Sync is still globally enabled, as it didn't matter before.
However, after MDEV-33551 (#3089) split the transactions' ACK condition
variables to per-transaction, this function now needs Semi-Sync's
transaction tracker to unblock these condition variables in batch,
but this tracker is `NULL` when Semi-Sync Primary is disabled.
Co-authored-by: Kristian Nielsen <knielsen@knielsen-hq.org>
check sequence privileges in Item_func_nextval::fix_fields(),
just like column privileges are checked in Item_field::fix_fields()
remove sequence-specific hacks that kinda made sequence privilege
checks work, but not in all cases. And they were too lax: they
didn't require the SELECT privilege for NEXTVAL. Also the INSERT
privilege looks wrong here, UPDATE would have been more appropriate,
but we won't change that for compatibility reasons.
this also fixes
MDEV-36413 User without any privileges to a sequence can read from it and modify it via column default
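A hedged sketch of the privileges now required (hypothetical names):

  CREATE SEQUENCE s1;
  GRANT SELECT, INSERT ON s1 TO 'u1'@'%';
  -- As u1:
  SELECT NEXTVAL(s1);  -- OK; without both grants this now fails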
Newer gcc reports:
error: 'rfield' may be used uninitialized [-Werror=maybe-uninitialized]
9041 | unwind_stored_field_offsets(fields, rfield);
After investigation, it turned out to be an impossible case:
1. The only way it could be broken is if the
if (!(field= fld->field_for_view_update()))
check succeeded the very first time.
2. Subsequent checks initialize rfield.
fld may return NULL from field_for_view_update() only for views.
3. Before fill_record, UPDATE first calls check_fields, where the
field_for_view_update() result is already checked. INSERT calls
check_view_insertability, which checks that all view fields are
updatable.
It all means that field_for_view_update() cannot be NULL in fill_record,
so the if can be converted to a DBUG_ASSERT.
This essentially shifts the responsibility for the preliminary
field_for_view_update() check to the caller.
In this patch:
1. convert the field_for_view_update() check to a DBUG_ASSERT
2. harden the unwind_stored_field_offsets function so that it can be
used even if field_for_view_update() is NULL
3. as a consequence, `field` is passed instead of `rfield` as a
terminator
4. initialize `field` to NULL to bypass a false-positive warning
The problem is that a copy function was used in the field list but the
copy was never performed in this execution path.
So the copy should be performed before returning the result.
Protection against uninitialized copy usage was added.
7544fd4cae had to make use of a static array to avoid a memory
use-after-free or a leak.
Instead, let us make a function returning String; this is the only way
to automatically manage the memory after the function has returned.
To make it all correct, a move constructor is added. Normally it is
expected that the constructor will be elided upon return of an object
by value, but if something goes differently, or -fno-elide-constructors
is used, we could have a problem. So this move constructor avoids
copy elision-related UB.
dbug_print_row returning char* is still there for convenient use in a
debugger.
The linker is trying to find a copy constructor for
injector::transaction::transaction:
ld.lld: error: undefined symbol: injector::transaction::transaction
(injector::transaction const&)
>>> referenced by rpl_injector.cc:164
>>> rpl_injector.cc.o:(injector::new_trans(THD*))
This constructor is declared, but not implemented. That is OK if copy
elision is enabled, but it causes an error otherwise.
Remove the constructor declaration, so that operator= will actually be
used instead.
If one of the selected fields is a MIN or MAX and it has been optimized
into a constant, it is not added to the temp table used by a group by
handler (GBH). The GBH therefore cannot store results into this missing
field.
On the other hand, when SELECTing from a view or a derived table,
TMP_TABLE_ALL_COLUMNS is set. If the query has no group by or order
by, an Item_temptable_field is created for this MIN/MAX field and
added to the JOIN. Since the GBH could not store results to the
corresponding field in the temp table, the value of this
Item_temptable_field remains NULL. And the NULL value is passed to the
record, then the temp row, and finally output as the (wrong) result.
To fix this, we opt not to create a spider GBH when a view or
derived table is involved.
This fixes spider/bugfix.mdev_26345 for --view-protocol
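A hedged sketch of the affected pattern (hypothetical Spider table t1):

  CREATE VIEW v1 AS SELECT * FROM t1;
  SELECT MIN(a) FROM v1 WHERE b = 1;  -- previously could output NULL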
Also fixed a comment:
TABLE_LIST::belong_to_derived is NULL if the table belongs to a
derived table that has non-MERGE type.
It prevents a crash in wsrep_report_error() which happened when appliers
would run with FK and UK checks disabled and erroneously execute plain
inserts as bulk inserts.
Moreover, in release builds such behaviour could lead to deadlocks
between two applier threads if a thread waiting for a table-level lock
was ordered before the lock holder. In that case the lock holder would
proceed to commit order and wait forever for the now-blocked other
applier thread to commit first.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
The problem was that a transaction was BF-aborted after certification
succeeded; the transaction then tried to roll back, and during
rollback the binlog stmt cache containing sequence value reservations
was written into the binlog.
The transaction must replay because certification succeeded, but it
must not be written into the binlog yet; that will be done during
commit after the replay.
The fix is to skip the binlog write if the transaction must replay,
and in replay we need to reset the binlog stmt cache.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
The test issues a simple INSERT statement while sql_log_bin = 0.
This option disables writes to the binlog. However, since MDEV-7205,
the option does not affect Galera, so changes are still replicated.
So sql_log_bin=off only "partially" disables the binlog, and the INSERT
will involve both the binlog and InnoDB, thus requiring an internal
two-phase commit (2PC). In 2PC the INSERT is first prepared, which
makes it transition to the PREPARED state in InnoDB, and later
committed, which causes the new assertion from MDEV-24035 to fail.
Running the same test with sql_log_bin enabled also results in 2PC,
but the execution has one more step, ordered commit, between prepare
and commit. Ordered commit causes the transaction state to transition
back to TRX_STATE_NOT_STARTED, thus avoiding the assertion.
This patch makes sure that when sql_log_bin=off, the ordered commit
step is not skipped, thus going through the expected state transitions
in the storage engine.
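A hedged sketch of the triggering pattern (hypothetical table t1 on a
Galera node):

  SET SESSION sql_log_bin = 0;
  INSERT INTO t1 VALUES (1);  -- not written to the binlog, but still
                              -- replicated by Galera, so 2PC is needed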
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
in row_update_for_mysql
932ec586 (MDEV-23644) in TABLE::delete_row() added ha_delete_row() for
the case of HA_ERR_FOREIGN_DUPLICATE_KEY. The problem is that
ha_update_row(), called beforehand, may change m_last_part, which is
required for ha_delete_row() to delete from the correct partition.
The fix restores m_last_part in case ha_partition::update_row() fails.
While the ALTER thread tries to notify the SELECT thread about a lock
conflict, it accesses its TABLE object (THD::notify_shared_lock()) and
lock data (mysql_lock_abort_for_thread()). As part of accessing lock
data it calls ha_partition::store_lock(), which iterates over all
partitions and calls their store_lock().
The problem is that SELECT opened 2 read partitions, but
ha_partition::store_lock() tries to access all partitions as indicated
by m_tot_parts, which is 4. So the last 2 partitions, m_file[2] and
m_file[3], are uninitialized and store_lock() accesses uninitialized
data.
The code in ha_partition::store_lock() wrongly uses all partitions
specifically for the case of mysql_lock_abort_for_thread(); this is
accompanied by the comment:
/*
This can be called from get_lock_data() in mysql_lock_abort_for_thread(),
even when thd != table->in_use. In that case don't use partition pruning,
but use all partitions instead to avoid using another threads structures.
*/
if (thd != table->in_use)
{
for (i= 0; i < m_tot_parts; i++)
to= m_file[i]->store_lock(thd, to, lock_type);
}
The explanation "to avoid using another threads structures" does
not really explain why this change was needed.
The change was originally introduced by:
commit 9b7cccaf319
Author: Mattias Jonsson <mattias.jonsson@oracle.com>
Date: Wed May 30 00:14:39 2012 +0200
WL#4443:
final code change for dlenevs review.
- Don't use pruning in lock_count().
- Don't use pruning in store_lock() if not owning thd.
- Renamed is_fields_used_in_trigger to
is_fields_updated_in_trigger() and check if they
may be updated.
- moved out mark_fields_used(TRG_EVENT_UPDATE)
from mark_columns_needed_for_update().
And reverted the changed call order. And call
mark_fields_used(TRG_EVENT_UPDATE) instead.
which also fails to explain the rationale of the change. The original
idea of WL#4443 was to reduce locks, and this change does not happen
to serve that goal.
So reverting this change restores the original behaviour of using only
the partitions marked for use, and fixes the invalid access to
uninitialized data.
In commit bda40ccb85 (MDEV-34803)
there was a spelling mistake that somehow caused the deprecated
parameter innodb_purge_rseg_truncate_frequency to be rejected
at server startup.