Analysis:
There were two problems that needed to be fixed.
1) The crash.
2) After fixing the crash, the result was wrong.
Reason for crash: when we pass the hash to get_intersect_between_arrays(),
we were initially not passing it by reference, so the operations were not
performed on the original hash.
Reason for wrong result: The number of rows it was returning was the same
as in the table, but only the first row had the correct output; the rest of
them were NULL (they should also contain the result of the intersection).
This was because we modified the "items" HASH by deleting the "seen" elements,
so for the next rows the hash no longer had the elements it should have.
Fix:
1) To fix the crash: pass the HASH by reference
2) To fix the incorrect result: Maintain a separate "seen" hash; if an item
is found in the "items" hash, delete it only temporarily and put it in the
"seen" hash. At the end, put the items from "seen" back into "items" and
reset "seen".
63620ca6d8 identified that thr_create_utime
wasn't initialized and therefore couldn't be used as the origin copying
point for the initialization of other variables.
By replacing it with a microtime interval, we missed that the
THD has just been created, and thr_create_utime should be set too.
This hasn't hit any MSAN errors so far, but it's possible to access
thr_create_utime in some codepath, so let's initialize it.
Relaxed the check: only the number of columns and the PK.
Enough to avoid crashes, but doesn't break upgrades and migration
from MySQL as in MDEV-37777.
Added checks everywhere (flush/create/alter/drop server).
Check the mysql.plugin table too.
Revert the fix for MDEV-35622 (957ec8bba6).
The mysql.servers structure differs between versions
of MariaDB and MySQL, so Table_check_intact cannot be used to validate it.
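A simplified sketch of such a relaxed structural check (the types and names
here are hypothetical stand-ins, not the server's Table_check_intact API):

  #include <cstddef>
  #include <string>
  #include <vector>

  struct KeyDef   { bool is_primary; std::size_t key_parts; };
  struct TableDef { std::vector<std::string> columns; std::vector<KeyDef> keys; };

  // Require at least the expected number of columns and a primary key,
  // instead of validating every column definition.  Extra columns coming
  // from a different MariaDB/MySQL version are tolerated.
  static bool relaxed_check(const TableDef &t, std::size_t min_columns)
  {
    if (t.columns.size() < min_columns)
      return false;
    for (const auto &k : t.keys)
      if (k.is_primary)
        return true;
    return false;
  }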
Issue:
The commit 0584846 'Add TL_FIRST_WRITE in SQL layer for determining R/W'
limits INSERT statements on write nodes so that they do not acquire MDL locks
on all of their child tables, and thereby fewer wsrep certification keys are
added; on applier nodes, however, MDL locks are acquired for all child tables.
This can result in an MDL BF-BF conflict on the applier node when transactions
referring to parent and child tables are executed concurrently. For example:
Tables with foreign keys: t1<-t2<-t3<-t4
Conflicting transactions: INSERT t1 and DROP TABLE t4
Wsrep certification keys taken on write node:
- for INSERT t1: t1 and t2
- for DROP TABLE t4: t4
On the applier node an MDL BF-BF conflict happened between the two transactions
because MDL locks on t1, t2, t3 and t4 were taken for INSERT t1, which conflicted
with the MDL lock on t4 taken by DROP TABLE t4.
The wsrep certification keys help in resolving this MDL BF-BF conflict by
prioritizing and scheduling concurrent transactions. But to generate wsrep
certification keys, the server needs to open and take MDL locks on all the
child tables.
The commit 0584846 change limits the MDL locks taken on child tables for
read-only FK checks (INSERT t1). But this doesn't work on applier nodes,
because the Write_rows_log_event logged for INSERT is also used to record
updates (see Write_rows_log_event::get_trg_event_map()), and
therefore MDL locks are taken for all the child tables on the applier node
for both update and insert events.
Solution:
Additional keys for the referenced/foreign table need to be added
to avoid potential MDL conflicts with concurrent updates and DDL.
close_thread_tables() would not flush pending row events to the binlog cache
in certain conditions if LOCK TABLES was active. This could result in the
row events being binlogged without STMT_END_F flag, and eventually leave the
THD in an invalid state that triggered assertions later.
Reviewed-by: Monty <monty@mariadb.org>
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
Make REPAIR of a partition output the same type of message (error vs.
warning) in the error log during replication as it would do when run
directly.
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
When binlog_row_image=MINIMAL, UPDATE_ROWS_EVENT may change columns
that are not in the before image. Such columns had their bit set in
table->write_set, but were missing their bit in table->read_set.
As part of this patch, bitmap_union() is extended to handle bitmaps of
different sizes, similar to bitmap_intersect().
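A standalone sketch of a union over word-based bitmaps of different sizes
(this is only an illustration with an assumed behaviour, OR-ing the overlapping
prefix and leaving the rest untouched, not the server's MY_BITMAP code):

  #include <algorithm>
  #include <cstdint>
  #include <vector>

  using Bitmap = std::vector<uint32_t>;   // stand-in for MY_BITMAP words

  // Union `other` into `map`.  The bitmaps may have different lengths:
  // only the common prefix of words is OR-ed.
  static void bitmap_union_sketch(Bitmap &map, const Bitmap &other)
  {
    const std::size_t common= std::min(map.size(), other.size());
    for (std::size_t i= 0; i < common; i++)
      map[i]|= other[i];
  }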
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
THD class has an instance of Gap_time_tracker_data, which has a
reference to "Gap_time_tracker" using "bill_to" field.
During clean_up_after_query, reset bill_to field to NULL.
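A minimal illustration of the pattern with hypothetical names (not the actual
THD code): the per-query pointer is cleared at the end of the query so it can
never dangle into a tracker that has gone away.

  #include <cstdint>

  struct Gap_time_tracker_sketch { uint64_t total_time= 0; };

  // Hypothetical per-THD data: remembers which tracker the next gap is billed to.
  struct Gap_time_tracker_data_sketch
  {
    Gap_time_tracker_sketch *bill_to= nullptr;
  };

  struct THD_sketch
  {
    Gap_time_tracker_data_sketch gap_tracker_data;

    void cleanup_after_query()
    {
      // Reset the reference so nothing is billed to a tracker
      // from the previous query after it has been destroyed.
      gap_tracker_data.bill_to= nullptr;
    }
  };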
1. The LIMIT ROWS EXAMINED clause allows aborting a query if a certain limit
on the number of rows involved in processing is exceeded. This limitation
should only be enforced during the execution phase and not during optimization.
2. Unions are often executed using so-called "fake_select_lex" which
collects the final result from union parts. During that finalization
LIMIT ROWS EXAMINED must be ignored to avoid producing a potentially
incomplete result. There is a workaround at `st_select_lex_unit::exec()`
which deactivates the limit before fake_select_lex processing, and
re-activates it when the processing is finished. However, this re-activation
does not take into account whether the limit was active before the start
of fake_select_lex processing.
3. `st_select_lex_unit::exec()` can be invoked during the optimization
phase for evaluation of constant conditions with unions. At that time
the limit trigger is not activated. Given that the re-activation
mentioned above does not respect the previous state of the limit trigger,
a premature activation of the limit may happen.
This commit fixes that behavior by storing the state of the trigger
before its deactivation at `st_select_lex_unit::exec()` and re-activating
it only if the limit was active before (see the sketch after this list).
4. This commit also removes the call to `thd->lex->set_rows_examined()`
from `st_select_lex_unit::exec_recursive()` which was probably copied
from `st_select_lex_unit::exec()`. But in the context of
`exec_recursive()` the call does not make sense as there is no
previous deactivation of the limit as at `exec()`.
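A standalone sketch of the save/restore idea with illustrative names (the real
logic lives in st_select_lex_unit::exec()):

  // Hypothetical trigger guarding the LIMIT ROWS EXAMINED counter.
  struct Rows_examined_limit_sketch
  {
    bool active= false;
    void activate()   { active= true; }
    void deactivate() { active= false; }
  };

  // Finalize a UNION result with the limit suspended, then restore the
  // trigger only to the state it had before, instead of unconditionally
  // re-activating it.
  static void run_fake_select_sketch(Rows_examined_limit_sketch &limit)
  {
    const bool was_active= limit.active;   // remember the previous state
    limit.deactivate();                    // must not abort result finalization
    // ... collect the final result from the union parts here ...
    if (was_active)                        // re-activate only if it was active
      limit.activate();
  }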
Reviewed by: Dave Gosselin, Sergei Petrunia
check that privilege tables have a PK
and that it has the correct number of key parts
also fixes:
* MDEV-24206 SIGSEGV in replace_db_table on GRANT
* MDEV-24814 SIGSEGV in replace_table_table on GRANT
* MDEV-27842 SIGSEGV in replace_routine_table on GRANT
* MDEV-28128 SIGSEGV in replace_column_table on GRANT
* MDEV-27893 SIGSEGV in replace_proxies_priv_table on GRANT PROXY
When the query uses several Window Functions:
SELECT
WIN_FUNC1() OVER (ORDER BY 'const', col1),
WIN_FUNC2() OVER (ORDER BY col1 RANGE BETWEEN CURRENT ROW
AND 5 FOLLOWING)
compare_window_funcs_by_window_specs() will try to get the Window Specs to
reuse the ORDER BY lists. If the lists produce the same order (like above),
the Window Spec of WIN_FUNC2 will reuse the ORDER BY list of WIN_FUNC1.
However, WIN_FUNC2 has a RANGE-type window frame. It expects to get an
ORDER BY list with one element, which it will use to compute frame bounds.
Providing it with the ORDER BY list from WIN_FUNC1 ('const', col1) caused an
assertion failure.
The fix is to:
* Use the original ORDER BY list when constructing RANGE-type frames.
* Fix an apparent typo bug in compare_window_funcs_by_window_specs(): the
  assignment
    win_spec1->save_order_list= win_spec2->order_list;
  saved the order list from the wrong spec. Instead, take the one from win_spec1.
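The corrected assignment implied by the description would presumably be:

    win_spec1->save_order_list= win_spec1->order_list;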
called with empty json arrays, UBSAN runtime error: member access within
null pointer of type 'struct String' in
Item_func_json_array_intersect::prepare_json_and_create_hash
Analysis:
Arguments are not initialized.
Fix:
If the arguments are not initialized then val_json() returns NULL, so
if val_json() returns NULL for either of the arguments, return NULL.
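A minimal sketch of that guard, with hypothetical stand-ins for the Item/String
machinery (std::nullopt plays the role of the NULL JSON value):

  #include <optional>
  #include <string>

  // Stand-in for val_json(): no value means the argument is not initialized.
  static std::optional<std::string> val_json_sketch(int arg_index)
  {
    return arg_index == 0 ? std::optional<std::string>("[1,2]") : std::nullopt;
  }

  // Return NULL as soon as either argument has no JSON value, instead of
  // dereferencing a null String and tripping UBSAN.
  static std::optional<std::string> intersect_sketch()
  {
    auto a= val_json_sketch(0);
    if (!a)
      return std::nullopt;
    auto b= val_json_sketch(1);
    if (!b)
      return std::nullopt;
    // ... build the hash from *a and intersect it with *b ...
    return *a;   // placeholder result for the sketch
  }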
The problem was that wsrep was disconnected and new slave
threads tried to connect to the cluster but failed, as
we were in a disconnected state.
Allow changing wsrep_slave_threads only when wsrep is enabled
and we are connected to a cluster. In other cases report an
error and issue a warning.
Tests on clang-20/21 had both of these tests overrunning the
stack. The check_stack_overrun function checked earlier in the
function with a 2*STACK_MIN_SIZE margin. The execution within
the processing is deeper than when check_stack_overrun was
called.
Raising STACK_MIN_SIZE to 44k was sufficient (and 40k wasn't
sufficient). execution_constants was also tested, however
the tests mentioned in the topic are bigger.
Perfschema tests:
* perfschema.statement_program_nesting_event_check
* perfschema.statement_program_nested
* perfschema.max_program_zero
A small increase to the test thread-stack-size on statement_program_lost_inst
allows this test to continue to pass.
The problem was that for partitioned tables the base table storage engine
is DB_TYPE_PARTITION_DB, which naturally differs from DB_TYPE_INNODB,
so the operation was not allowed in Galera.
Fixed by requesting the implementing storage engine for partitioned
tables, i.e. table->file->partition_ht(), or, if that does not exist,
using the base table storage engine. The resulting storage engine
type is then used in the condition deciding whether the operation is allowed
when wsrep_mode=DISALLOW_LOCAL_GTID. Operations on the InnoDB
storage engine, i.e. DB_TYPE_INNODB, should be allowed.
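A simplified sketch of the engine resolution with stand-in types (the real code
goes through handler/handlerton; everything below is illustrative):

  enum legacy_db_type_sketch { DB_TYPE_UNKNOWN, DB_TYPE_INNODB, DB_TYPE_PARTITION_DB };

  struct handlerton_sketch { legacy_db_type_sketch db_type; };

  struct handler_sketch
  {
    handlerton_sketch *base_ht;        // engine as seen by the SQL layer
    handlerton_sketch *underlying_ht;  // engine implementing the partitions, may be null

    // Stand-in for handler::partition_ht(): the implementing engine, if any.
    handlerton_sketch *partition_ht() const { return underlying_ht; }
  };

  // Prefer the implementing engine (e.g. InnoDB behind a partitioned table);
  // fall back to the base table engine when there is none.
  static legacy_db_type_sketch effective_db_type(const handler_sketch &file)
  {
    const handlerton_sketch *ht= file.partition_ht() ? file.partition_ht()
                                                     : file.base_ht;
    return ht->db_type;   // checked against DB_TYPE_INNODB for DISALLOW_LOCAL_GTID
  }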
Regression from MDEV-36765 / 2b24ed87f0.
json_unescape can return a string of length 0 without it being an error.
The regression caused this 0-length empty string to be treated as an
error and resulted in a NULL return value.
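A sketch of the distinction, assuming the convention that a negative length
signals an error while 0 is a valid empty result (names are illustrative):

  #include <optional>
  #include <string>

  // Stand-in for the unescape call: returns the unescaped length,
  // or a negative value on error (assumed convention for this sketch).
  static int json_unescape_sketch(const std::string &in, std::string &out)
  {
    out= in;                             // the real function decodes JSON escapes
    return static_cast<int>(out.size()); // 0 for empty input, which is not an error
  }

  static std::optional<std::string> unescape_or_null(const std::string &in)
  {
    std::string out;
    int len= json_unescape_sketch(in, out);
    if (len < 0)              // only a negative length is an error
      return std::nullopt;    // -> SQL NULL
    return out;               // len == 0 is a legitimate empty string
  }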
Caused by optimization done in 2e2b2a0469.
Cannot use lookup_handler in the default branch of locate_dup_record(), as
the InnoDB update depends on a positioned record and the update is done in
the table's main handler.
The patch reverts some non-pure changes done by 2e2b2a0469 back to the
original logic from 72429cad. There was no long_unique_table condition
to init the search on table->file, so we got into the default branch with a
long unique and the table->file search uninitialized.
ha_rnd_init_with_error() on demand for the HA_DUPLICATE_POS branch was
the original logic as well.
More info: 2e2b2a0469 reverts 5e345281e3, but it seems to be OK as
MDEV-3888 test case passes. mysql-5.6.13 has the original code with
HA_WHOLE_KEY as well.
Let us access some data members of THD directly, instead of invoking
non-inline accessor functions. Note: my_thread_id will be used instead
of the potentially narrower ulong data type.
Also, let us remove some functions from sql_class.cc that were only
being used by InnoDB or RocksDB, for no reason. RocksDB always had
access to the internals of THD.
Reviewed by: Sergei Golubchik
Tested by: Saahil Alam
It appears that some error conditions don't store error information in the
Diagnostics_area. For example, when the table_def::compatible_with() check
fails, the error message is stored in Relay_log_info instead.
This results in optimistically identical votes, and a zero error buffer size
breaks wsrep-lib logic, as it relies on the error buffer size to decide whether
voting took place.
To account for this, first try to obtain error info from the Diagnostics_area,
then fall back to Relay_log_info. If that fails, use some "random" data to
distinguish this condition from success in production.
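A sketch of the fallback chain with illustrative sources (not the actual
wsrep-lib or replication interfaces):

  #include <string>

  // Hypothetical stand-ins for Diagnostics_area and Relay_log_info.
  struct Diag_sketch  { std::string message; };
  struct Relay_sketch { std::string last_error; };

  // Build a non-empty error buffer: primary source first, then the fallback,
  // and finally fixed marker data so an error never looks like success.
  static std::string error_buffer_sketch(const Diag_sketch &da,
                                         const Relay_sketch &rli)
  {
    if (!da.message.empty())
      return da.message;
    if (!rli.last_error.empty())
      return rli.last_error;
    return "unknown error";   // "random" non-empty data distinguishing failure
  }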
numerous bugs in JSON_DETAILED and multibyte charsets:
* String::chop() must be charset-aware and not simply length--
* String::append(char) must be charset-aware and not simply length++
* json_nice() first removes value_len bytes, then a
certain number of characters
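A sketch of a charset-aware chop for UTF-8 (illustrative only; the server's
String/CHARSET_INFO machinery is richer): drop the bytes of the last character
rather than a single byte.

  #include <string>

  // Remove the last *character* of a UTF-8 string, not just the last byte.
  // UTF-8 continuation bytes have the form 10xxxxxx, so skip back over them.
  static void chop_utf8(std::string &s)
  {
    if (s.empty())
      return;
    std::size_t i= s.size() - 1;
    while (i > 0 && (static_cast<unsigned char>(s[i]) & 0xC0) == 0x80)
      i--;                    // step over continuation bytes
    s.resize(i);              // erase the whole final character
  }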
GET_STR_ALLOC options are allocated by my_getopt in init_one_value(),
but maria-backup never calls handle_option() for them at all,
so Sys_var_charptr_base needs protection for partially-initialized
variables.
followup for f33367f2ab
The issue was that unpack_vcol_info_from_frm() wrongly linked the used
sequence tables into tables->internal_tables when more than one sequence
table was used.
Other things:
- Fixed internal_table_exists() to take db into account.
(This makes the code easier to read. As we were comparing
pointers, the old code also worked.)