mirror of https://github.com/MariaDB/server.git synced 2025-11-08 00:28:29 +03:00
Commit Graph

4924 Commits

Author SHA1 Message Date
Aleksey Midenkov
ff33f49d9a Merge 11.4 into 11.8 2025-09-29 18:25:09 +03:00
Nikita Malyavin
d4c4eb7939 MDEV-15990 Refactor write_record and fix idempotent replication
See also MDEV-30046.

Idempotent write_row works the same as REPLACE: if there is a duplicate
record in the table, it is deleted and re-inserted, with the
same update optimization.

The code in Rows_log_event::write_row was basically copy-pasted from
write_record.

What's done:
The REPLACE operation was unified across replication and SQL. It is now
represented as a Write_record class that holds the whole state and allows
re-using some resources between row writes.

Replace, IODKU and single insert implementations are split across different
methods, resulting in much cleaner code.

The entry point is preserved as a single Write_record::write_record() call.
The implementation to call is chosen on the constructor stage.

This allowed several optimizations to be done:
1. The table key list is not iterated for every row. We find last unique key in
the order of checking once and preserve it across the rows. See last_uniq_key().
2. ib_handler::referenced_by_foreign_key acquires a global lock. This call was
done per row as well. Now all the table config that allows optimized replace is
folded into a single boolean field can_optimize. All the fields to check are
even stored in a single register on a 64-bit platform.
3. DUP_REPLACE and DUP_UPDATE cases now have one less level of indirection
4. modified_non_trans_tables is checked and set only when it's really needed.
5. Obsolete bitmap manipulations are removed.

Also:
* Unify replace initialization step across implementations:
  add prepare_for_replace and finalize_replace
* alloca is removed in favor of mem_root allocation. This memory is reused
  across the rows.
* An rpl-related callback is added to the replace branch, meaning that an extra
check is made per row replace even for the common case. It can be avoided with
templates if considered a problem.
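
The dispatch described in this message can be sketched roughly as follows. This is a minimal illustration, not the actual Write_record code: WriteRecordSketch, DupMode, KeyInfo and the integer return codes are hypothetical names invented for the example. It shows only two ideas from the message: the implementation is chosen once in the constructor, and the last unique key is located once and reused for every row.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins; none of these names come from the MariaDB sources.
enum class DupMode { Error, Replace, Update };

struct KeyInfo {
  bool unique;
};

class WriteRecordSketch {
public:
  WriteRecordSketch(DupMode mode, const std::vector<KeyInfo> &keys)
      : last_uniq_key_(find_last_unique(keys)) {
    // The implementation is picked once, at construction time.
    switch (mode) {
    case DupMode::Replace: impl_ = &WriteRecordSketch::replace_row; break;
    case DupMode::Update:  impl_ = &WriteRecordSketch::insert_or_update; break;
    case DupMode::Error:   impl_ = &WriteRecordSketch::single_insert; break;
    }
  }

  // Single entry point, called once per row.
  int write_record() {
    assert(impl_ != nullptr);
    return (this->*impl_)();
  }

  // Index of the last unique key, or -1 if the table has none.
  std::ptrdiff_t last_uniq_key() const { return last_uniq_key_; }

private:
  using Impl = int (WriteRecordSketch::*)();

  // Scanned once per statement instead of once per row.
  static std::ptrdiff_t find_last_unique(const std::vector<KeyInfo> &keys) {
    for (std::size_t i = keys.size(); i-- > 0;)
      if (keys[i].unique) return static_cast<std::ptrdiff_t>(i);
    return -1;
  }

  // Placeholder bodies; the return codes only identify the chosen path.
  int single_insert()    { return 1; }
  int replace_row()      { return 2; }
  int insert_or_update() { return 3; }

  Impl impl_ = nullptr;
  std::ptrdiff_t last_uniq_key_;
};
```

A caller would construct one WriteRecordSketch per statement and then call write_record() for each row, so neither the mode switch nor the key scan is repeated per row.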
2025-09-17 11:38:55 +03:00
Oleksandr Byelkin
15b1426c3a Merge branch '10.11' into bb-11.4-release 2025-09-15 16:17:33 +02:00
Monty
882f6fa3aa Fixed typos
- Removed duplicate words, like "the the" and "to to"
- Removed duplicate lines (one double sort line found in mysql.cc)
- Fixed some typos found while searching for duplicate words.

Command used to find duplicate words:
egrep -rI "\s([a-zA-Z]+)\s+\1\s" | grep -v param

Thanks to Artjoms Rimdjonoks for the command and pointing out the
spelling errors.
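
For readers without egrep at hand, the same pattern can be tried with std::regex, whose default ECMAScript grammar also supports the \1 backreference. This is only a sketch; find_duplicate_words is a hypothetical helper name, and like the egrep pattern it requires whitespace before the first word and after the second, so a duplicate at the very start or end of a line is not reported.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// Returns every word that appears twice in a row, using the same pattern
// as the egrep command: \s([a-zA-Z]+)\s+\1\s
std::vector<std::string> find_duplicate_words(const std::string &line) {
  static const std::regex dup(R"(\s([a-zA-Z]+)\s+\1\s)");
  std::vector<std::string> hits;
  for (std::sregex_iterator it(line.begin(), line.end(), dup), end;
       it != end; ++it) {
    assert(it->size() == 2);  // whole match plus one capture group
    hits.push_back((*it)[1].str());
  }
  return hits;
}
```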
2025-09-04 18:08:39 +03:00
Monty
d2ce0650ad MDEV-37356 Annotate_rows written in a 'random' position
Ensure that Annotate_rows is always written directly after the GTID information,
before any table_map events.

Before this patch, the following problems existed when mixing
transactional and non-transactional tables in the same statement:
- Annotate rows could be written after row events or in the next GTID
  event.
  - See rpl_row_mixing_engines

- Annotate_rows was not always written to the binary log in case of an error
  with a transactional table (rolled back) when a non-transactional
  table was updated.
  - See sp_trans_log, binlog_row_mix_innodb_myisam

Fixed by writing the Annotate_rows event into the non-transactional
cache if any non-transactional tables are used. Otherwise, the
event is written into the transactional cache.
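
The cache selection rule can be sketched as below. This assumes the rule means "any non-transactional table in the statement selects the non-transactional cache"; BinlogCacheSketch, TableUseSketch and annotate_rows_cache are hypothetical names, not the server's API.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical model of the two binlog caches.
enum class BinlogCacheSketch { Transactional, NonTransactional };

struct TableUseSketch {
  bool transactional;
};

// Picks the cache that should receive the Annotate_rows event.
BinlogCacheSketch annotate_rows_cache(const std::vector<TableUseSketch> &tables) {
  assert(!tables.empty());  // assumed precondition: the statement uses a table
  const bool any_non_trans =
      std::any_of(tables.begin(), tables.end(),
                  [](const TableUseSketch &t) { return !t.transactional; });
  return any_non_trans ? BinlogCacheSketch::NonTransactional
                       : BinlogCacheSketch::Transactional;
}
```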
2025-09-04 18:08:39 +03:00
Marko Mäkelä
257f4b30ef Merge 10.11 into 11.4 2025-09-03 10:32:56 +03:00
Nikita Malyavin
0108664a8a Merge branch 10.11 into 11.4
# Conflicts:
#	sql/handler.h
#	sql/log_event.h
#	sql/log_event_server.cc
2025-09-02 15:58:39 +02:00
Monty
da149c7073 Add statistics usable for feedback plugin
The following status variables were added:

Feature_vector_index ; Incremented when reading a vector index from
                       a .frm file.
2025-08-29 16:34:23 +03:00
Marko Mäkelä
b489d5f813 Merge 10.6 into 10.11 2025-08-26 14:24:31 +03:00
Marko Mäkelä
8761047e11 Work around MDEV-37478
THD::get_net_wait_timeout(): Silence GCC -Wconversion by an
explicit conversion from ulong (64 or 32 bits) to uint (32 bits).
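
A minimal sketch of this kind of workaround follows. The function name and the range assertion are assumptions made for illustration; the message only states that an explicit conversion silences the warning.

```cpp
#include <cassert>
#include <limits>

// Hypothetical helper: narrows an unsigned long (32 or 64 bits, depending on
// the platform) to unsigned int with an explicit cast, which is what keeps
// GCC -Wconversion quiet.
unsigned int net_wait_timeout_sketch(unsigned long configured) {
  // Assumption for this sketch: the configured value fits in unsigned int.
  assert(configured <= std::numeric_limits<unsigned int>::max());
  return static_cast<unsigned int>(configured);
}
```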
2025-08-22 11:43:38 +03:00
Nikita Malyavin
2e2b2a0469 MDEV-15990 Refactor write_record and fix idempotent replication
See also MDEV-30046.

Idempotent write_row works the same as REPLACE: if there is a duplicate
record in the table, it is deleted and re-inserted, with the
same update optimization.

The code in Rows_log_event::write_row was basically copy-pasted from
write_record.

What's done:
The REPLACE operation was unified across replication and SQL. It is now
represented as a Write_record class that holds the whole state and allows
re-using some resources between row writes.

Replace, IODKU and single insert implementations are split across different
methods, resulting in much cleaner code.

The entry point is preserved as a single Write_record::write_record() call.
The implementation to call is chosen on the constructor stage.

This allowed several optimizations to be done:
1. The table key list is not iterated for every row. We find last unique key in
the order of checking once and preserve it across the rows. See last_uniq_key().
2. ib_handler::referenced_by_foreign_key acquires a global lock. This call was
done per row as well. Now all the table config that allows optimized replace is
folded into a single boolean field can_optimize. All the fields to check are
even stored in a single register on a 64-bit platform.
3. DUP_REPLACE and DUP_UPDATE cases now have one less level of indirection
4. modified_non_trans_tables is checked and set only when it's really needed.
5. Obsolete bitmap manipulations are removed.

Also:
* Unify replace initialization step across implementations:
  add prepare_for_replace and finalize_replace
* alloca is removed in favor of mem_root allocation. This memory is reused
  across the rows.
* An rpl-related callback is added to the replace branch, meaning that an extra
check is made per row replace even for the common case. It can be avoided with
templates if considered a problem.
2025-08-04 17:44:05 +02:00
Sergei Golubchik
b565b3e7e0 Merge branch '11.4' into 11.8 2025-07-28 21:29:29 +02:00
Sergei Golubchik
c4ed889b74 Merge branch '10.11' into 11.4 2025-07-28 19:40:10 +02:00
ParadoxV5
33e845595d MDEV-36839: Revert MDEV-7409
MDEV-6247 added PROCESSLIST states for when a Replication
SQL thread processes Row events, including a WSRep variant
that dynamically includes the Galera Sequence Number.
MDEV-7409 further expanded on it by adding the table name to the states.

However, PROCESSLIST __cannot__ support generated states.
Because it loads the state texts asynchronously,
only permanently static strings are safe.
Even thread-local memory can become invalid when the thread terminates,
which can happen in the middle of generating a PROCESSLIST.

To prioritize memory safety, this commit reverts both variants to
static strings as the non-WSRep variant was before MDEV-7409.
* __Fully__ revert MDEV-7409 (d9898c9a71)
* Remove the WSRep override from MDEV-6247
  * Remove `THD::wsrep_info` and its compiler
    flag `WSREP_PROC_INFO` as they are now unused

This commit also includes small optimizations
from MDEV-36839’s previous draft, #4133.

Reviewed-by: Brandon Nesterenko <brandon.nesterenko@mariadb.com>
2025-07-22 10:05:24 -06:00
Dmitry Shulga
ef9adb569e MDEV-32694: ASAN errors in Binary_string::alloced_length / reset_stmt_params
Anonymous block is represented internally by the class sp_head,
so every statement inside an anonymous block is a SP instruction.
On the other hand, the anonymous block specified in the FROM clause of
the PREPARE statement is treated as a single statement. As a result,
all parameter markers (represented by the character ?) are parts of
the anonymous block specified in the prepared statement and at the same
time these parameter markers are internally represented by instances of
the class Item_param and distributed among SP instructions representing
SQL statements (every SQL statement is represented by an instance of
the class sp_instr_stmt).

In case table metadata changed on running an anonymous block in prepared
statement mode, only SP instruction's statement is re-parsed. Before
re-parsing a SP's statement, all items are cleaned up including
instances of the class Item_param that represent positional parameters.

Unfortunately, this leads to a dangling pointer in
Prepared_statement::param_array that refers to the deleted
Item_param when reset_stmt_params is invoked, which happens on every execution
of a prepared statement.

To fix the issue, no instances of Item_param are created on re-parsing
a statement for a failed SP instruction; rather, the instances of Item_param
left from the first parsing are re-used. As a consequence, all pointers
to instances of the class Item_param stored in the array
Prepared_statement::param_array and possibly spread along the code base
  (e.g. select_lex->limit_params.select_limit)
still point to valid Items.
2025-07-02 17:50:24 +07:00
Aleksey Midenkov
30185c9c7c MDEV-23207 Assertion `tl->table == __null' failed in THD::open_temporary_table
Assertion fails because table is opened by admin_recreate_table():

    result_code= (thd->open_temporary_tables(table_list) ||
                  mysql_recreate_table(thd, table_list, recreate_info, false));

And that is called because t2 failed with HA_ADMIN_NOT_IMPLEMENTED:

    if (result_code == HA_ADMIN_NOT_IMPLEMENTED && need_repair_or_alter)
    {
      /*
        repair was not implemented and we need to upgrade the table
        to a new version so we recreate the table with ALTER TABLE
      */
      result_code= admin_recreate_table(thd, table, &recreate_info);
    }

Actually 'table' is t2, but open_temporary_tables() opens the whole list,
i.e. t2 and everything that follows it before first_not_own_table().

Therefore t3 is also opened for t2 processing, which is wrong.

The fix opens exactly one specific table for HA_ADMIN_NOT_IMPLEMENTED.
2025-07-08 17:44:11 +03:00
Raghunandan Bhat
2c7cea28da MDEV-31721: Cursor protocol increases the counter of "Empty_queries" for select
Problem:
  Empty queries are incremented if no rows are sent to the client in the
  EXECUTE phase of select query. With cursor protocol, rows are not sent
  during EXECUTE phase; they are sent later in FETCH phase. Hence,
  queries executed with cursor protocol are always falsely treated as
  empty in EXECUTE phase.

Fix:
  For cursor protocol, empty queries are now counted during the FETCH
  phase. This ensures counter correctly reflects whether any rows were
  actually sent to the client.

Tests included in `mysql-test/main/show.test`.
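
The counting rule can be sketched as below. StatusSketch and SelectSketch are hypothetical names, and the exact moment of counting in the FETCH phase (here: when the cursor is exhausted without any row having been sent) is an assumption of this sketch, not something the message specifies.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

struct StatusSketch {
  unsigned long empty_queries = 0;
};

// Hypothetical model of one SELECT, with or without the cursor protocol.
class SelectSketch {
public:
  SelectSketch(std::size_t result_rows, bool cursor_protocol, StatusSketch &status)
      : remaining_(result_rows), cursor_(cursor_protocol), status_(status) {}

  void execute() {
    if (cursor_) {
      cursor_open_ = true;  // rows are deferred to FETCH, so do not count yet
      return;
    }
    sent_ = remaining_;
    remaining_ = 0;
    if (sent_ == 0) ++status_.empty_queries;
  }

  // Sends up to max_rows rows and returns how many were sent.
  std::size_t fetch(std::size_t max_rows) {
    assert(cursor_open_);
    const std::size_t n = std::min(max_rows, remaining_);
    remaining_ -= n;
    sent_ += n;
    if (remaining_ == 0) {
      cursor_open_ = false;
      if (sent_ == 0) ++status_.empty_queries;  // counted in the FETCH phase
    }
    return n;
  }

private:
  std::size_t remaining_;
  std::size_t sent_ = 0;
  bool cursor_;
  bool cursor_open_ = false;
  StatusSketch &status_;
};
```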
2025-06-27 22:04:14 +05:30
Oleksandr Byelkin
a65f7dc71d Merge branch '11.4' into 11.8 2025-06-18 07:43:24 +02:00
Oleksandr Byelkin
89c7e2b9c7 Merge branch '10.11' into 11.4 2025-06-17 09:50:22 +02:00
Marko Mäkelä
1c7209e828 Merge 10.6 into 10.11 2025-05-21 07:36:35 +03:00
Sergey Vojtovich
55ddfe1c95 MDEV-36684 - main.mdl_sync fails under valgrind (test for Bug#42643)
Valgrind is single threaded and only changes threads as part of
system calls or waits.

Some busy loops were identified and fixed where the server assumes
that some other thread will change the state, which will not happen
with valgrind.

Based on patch by Monty. Original patch introduced VALGRIND_YIELD,
which emits pthread_yield() only in valgrind builds. However it was
agreed that it is a good idea to emit yield() unconditionally, such
that other affected schedulers (like SCHED_FIFO) benefit from this
change. Also avoid pthread_yield() in favour of standard
std::this_thread::yield().
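
The pattern amounts to yielding inside a busy loop that waits for another thread to change some state. A minimal sketch, with wait_until_set as a hypothetical helper name:

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Spins until another thread sets the flag. The yield gives the scheduler a
// chance to run that other thread, which matters when only one thread runs at
// a time (valgrind) or under strict-priority schedulers such as SCHED_FIFO.
void wait_until_set(const std::atomic<bool> &flag) {
  while (!flag.load(std::memory_order_acquire))
    std::this_thread::yield();
  assert(flag.load(std::memory_order_acquire));
}
```

In a real program the flag would be set by a second thread with a release store; without the yield, the waiting thread could spin indefinitely on a scheduler that never preempts it.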
2025-04-29 15:05:20 +04:00
Sergei Golubchik
83e0438f62 MDEV-36536 post-review changes
that were apparently partially lost in a rebase
2025-04-29 11:34:35 +02:00
Monty
1b934a387c MDEV-36536 Add option to not collect statistics for long char/varchars
This is needed to make it easy for users to automatically ignore long
char and varchars when using ANALYZE TABLE PERSISTENT.
These fields can cause problems as they will consume
'CHARACTERS * MAX_CHARACTER_LENGTH * 2 * number_of_rows' space on disk
during analyze, which can easily be much bigger than the analyzed table.

This commit adds a new user variable, analyze_max_length, default value 4G.
Any field that is bigger than this in bytes, will be ignored by
ANALYZE TABLE PERSISTENT unless it is specified in FOR COLUMNS().

While doing this patch, I noticed that we do not skip GEOMETRY columns from
ANALYZE TABLE, like we do with BLOB. This should be fixed when merging
to the 'main' branch. At the same time we should add a reasonable default
value for analyze_max_length, probably 1024, like we have for
max_sort_length.
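
To make the disk-space formula and the skip rule concrete, here is a small sketch. ColumnSketch and the function names are hypothetical; only the formula and the "bigger than analyze_max_length in bytes, unless listed in FOR COLUMNS()" rule come from the message.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <string>

// Hypothetical column description: declared length in characters and the
// maximum number of bytes per character of its character set.
struct ColumnSketch {
  std::string name;
  std::uint64_t chars;
  std::uint64_t max_char_len;
};

std::uint64_t byte_length(const ColumnSketch &c) {
  assert(c.max_char_len >= 1);
  return c.chars * c.max_char_len;
}

// CHARACTERS * MAX_CHARACTER_LENGTH * 2 * number_of_rows
std::uint64_t analyze_disk_estimate(const ColumnSketch &c, std::uint64_t rows) {
  return c.chars * c.max_char_len * 2 * rows;
}

// A column is analyzed if it was named explicitly or is not too long.
bool collect_statistics(const ColumnSketch &c, std::uint64_t analyze_max_length,
                        const std::set<std::string> &for_columns) {
  if (for_columns.count(c.name) != 0) return true;
  return byte_length(c) <= analyze_max_length;
}
```

For example, a 1000-character column with up to 4 bytes per character in a table of one million rows gives 1000 * 4 * 2 * 1,000,000 = 8,000,000,000 bytes by the formula above, and with analyze_max_length set to 1024 the column would be skipped unless it is listed explicitly.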
2025-04-28 12:38:01 +03:00
Andrei Elkin
a0b77eb806 MDEV-36685 CREATE-SELECT may lose in binlog side-effects of stored-routine
When the SELECT sub-statement executes a stored function that is defined
to modify a non-transactional table, like

   delimiter |;
   create function f_ia(arg int)
   returns integer
   begin
     insert into ti_pk set a=1;
     insert into ta set a=1;
     insert into ti_pk set a=arg;
     return 1;
   end |
   delimiter ;|

any modified records that the function has succeeded
on must be binlogged as a "side effect" of CREATE-SELECT.

It is expected that a failing CREATE-SELECT like

  --error ER_DUP_ENTRY
  set statement binlog_format = ROW for create table t_y (a int) engine=aria select f_ia(1 /* err in Innodb after Aria stmt is done */) as a;

leaves behind the following state:

  include/show_binlog_events.inc
  Log_name	Pos	Event_type	Server_id	End_log_pos	Info
  master-bin.000001	#	Gtid	#	#	BEGIN GTID #-#-#
  master-bin.000001	#	Table_map	#	#
  table_id: # (test.  ta)
  master-bin.000001	#	Write_rows_v1	#	#
  table_id: # flags:   STMT_END_F
  master-bin.000001	#	Query	#	#	COMMIT
  select * from ta;
  a
  1
  select count(*) = 0 from ti_pk;
  true

However it's not so for the binlog part.
The reason is that prior to the MDEV-34150 fixes, the CREATE-SELECT's
errored phase left the binlog caches intact (the file:pos from 10.11 c06c36218a)

to defer their reset to the rollback phase of the top-level

/* the statement cache gets binlogged */

where the side-effect changes get binlogged.

MDEV-34150 fixes harmed (+#4 line) the statement cache in particular
in the error phase (file:pos are from 395db6f1d5 the current 11.8 )

/* The caches incl the statement cache are gone */
/* 'cos of MDEV-34150 */
+#4  0x00005d75f9b6a92e in THD::binlog_remove_rows_events (this=0x52c000240288) at log.cc:579

Apparently it should not have been there, as proper emptying (either
with reset for the transactional cache or flush and then reset for the
statement cache) is (must be) always done via binlog_rollback of the
top-level statement.
To observe the above requirement the case is fixed with the removal of
thd->binlog_remove_rows_events() and its definition.

Tested with rpl.rpl_create_select_row.

Reviewed-by Brandon Nesterenko.
2025-04-25 21:26:35 +03:00
Marko Mäkelä
bb1d88b6dc Merge 11.4 into 11.8 2025-04-02 14:07:01 +03:00
Marko Mäkelä
f5bd250f5b Merge 10.11 into 11.4 2025-03-28 13:55:21 +02:00
Marko Mäkelä
ab0f2a00b6 Merge 10.6 into 10.11 2025-03-27 08:01:47 +02:00
Kristian Nielsen
2641409731 Fix redundant ER_PRIOR_COMMIT_FAILED in parallel replication
wait_for_prior_commit() can be called multiple times per event group,
only do my_error() the first time the call fails.

Remove redundant set_overwrite_status() calls.

Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
Reviewed-by: Monty <monty@mariadb.org>
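
The "report only on the first failing call" behaviour can be sketched as follows. WaitForCommitSketch and its members are hypothetical; my_error_calls merely counts how often an error would have been raised.

```cpp
#include <cassert>

// Hypothetical model: the wait may be invoked several times per event group.
struct WaitForCommitSketch {
  int prior_error = 0;          // nonzero once the prior commit has failed
  bool error_reported = false;  // remembers that the error was already raised
  int my_error_calls = 0;       // stands in for calls to my_error()

  int wait_for_prior_commit() {
    assert(prior_error >= 0);
    if (prior_error == 0) return 0;
    if (!error_reported) {
      ++my_error_calls;  // raise the error only the first time
      error_reported = true;
    }
    return prior_error;  // every call still returns the failure
  }
};
```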
2025-03-11 12:45:59 +01:00
Marko Mäkelä
bb9f010432 Merge 11.4 into 11.8 2025-03-05 20:39:47 +02:00
Sergei Golubchik
9ee09a33bb Merge branch '11.7' into 11.8 2025-02-11 20:29:43 +01:00
Alexander Barkov
b7d67ceb5f MDEV-36047 Package body variables are not allowed as FETCH targets
It was not possible to use a package body variable as a
fetch target:

CREATE PACKAGE BODY pkg AS
  vc INT := 0;
  FUNCTION f1 RETURN INT AS
    CURSOR cur IS SELECT 1 AS c FROM DUAL;
  BEGIN
    OPEN cur;
    FETCH cur INTO vc; -- this returned "Undeclared variable: vc" error.
    CLOSE cur;
    RETURN vc;
  END;
END;

FETCH assumed that all fetch targets reside in the same sp_rcontext
instance as the cursor. This patch fixes the problem.
Now a cursor and its fetch target can reside in different sp_rcontext
instances.

Details:

- Adding a helper class sp_rcontext_addr
  (a combination of Sp_rcontext_handler pointer and an offset in the rcontext)

- Adding a new class sp_fetch_target deriving from sp_rcontext_addr.
  Fetch targets in "FETCH cur INTO target1, target2 ..." are now collected
  into this structure instead of sp_variable.
  sp_variable cannot be used any more to store fetch targets,
  because it does not have a pointer to Sp_rcontext_handler
  (it only has the current rcontext offset).

- Removing sp_instr_set members m_rcontext_handler and m_offset.
  Deriving sp_instr_set from sp_rcontext_addr instead.

- Renaming sp_instr_cfetch member  "List<sp_variable> m_varlist"
  to "List<sp_fetch_target> m_fetch_target_list".

- Fixing LEX::sp_add_cfetch() to return the pointer to the
  created sp_fetch_target instance (instead of returning bool).
  This helps to make the grammar in sql_yacc.yy simpler.

- Renaming LEX::sp_add_cfetch() to LEX::sp_add_instr_cfetch(),
  as `if(sp_add_cfetch())` changed its meaning to the opposite,
  to avoid automatic wrong merge from earlier versions.

- Changing the "List<sp_variable> *vars" parameter to sp_cursor::fetch
  to have the data type "List<sp_fetch_target> *".

- Changing the data type of "List<sp_variable> &vars" in
  sp_cursor::Select_fetch_into_spvars::send_data_to_variable_list()
  to "List<sp_fetch_target> &".

- Adding THD helper methods get_rcontext() and get_variable().

- Moving the code from sql_yacc.yy into a new LEX method
  LEX::make_fetch_target().

- Simplifying the grammar in sql_yacc.yy using the new LEX method.
  Changing the data type of the bison rule sp_fetch_list from "void"
  to "List<sp_fetch_target> *".
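
The idea of addressing a variable by "context selector plus offset" can be sketched as below. All names are hypothetical simplifications: the real Sp_rcontext_handler is a class, whereas this sketch uses a plain function pointer, and variables are plain ints.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct RContextSketch {
  std::vector<int> vars;
};

// Two runtime contexts: the routine's own and the package body's.
struct ExecStateSketch {
  RContextSketch local;
  RContextSketch package_body;
};

// Plays the role the message gives to the Sp_rcontext_handler pointer.
using ContextSelector = RContextSketch &(*)(ExecStateSketch &);

RContextSketch &select_local(ExecStateSketch &s) { return s.local; }
RContextSketch &select_package_body(ExecStateSketch &s) { return s.package_body; }

// Counterpart of the sp_rcontext_addr idea: which context, and where in it.
struct RContextAddrSketch {
  ContextSelector handler;
  std::size_t offset;

  int &resolve(ExecStateSketch &s) const {
    RContextSketch &ctx = handler(s);
    assert(offset < ctx.vars.size());
    return ctx.vars[offset];
  }
};

// Each fetch target is resolved in its own context, so the cursor and its
// targets no longer have to live in the same one.
void fetch_into(ExecStateSketch &s, const std::vector<RContextAddrSketch> &targets,
                const std::vector<int> &row) {
  assert(targets.size() == row.size());
  for (std::size_t i = 0; i < targets.size(); ++i)
    targets[i].resolve(s) = row[i];
}
```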
2025-02-09 13:56:19 +04:00
Sergei Golubchik
ba01c2aaf0 Merge branch '11.4' into 11.7
* rpl.rpl_system_versioning_partitions updated for MDEV-32188
* innodb.row_size_error_log_warnings_3 changed error for MDEV-33658
  (checks are done in a different order)
2025-02-06 16:46:36 +01:00
Dave Gosselin
5e07d1abd4 MDEV-35848, MDEV-35568 Reintroduce delete_while_scanning for multi_delete
Reintroduces delete_while_scanning optimization for multi_delete.
Reverses some test changes from the initial feature development now
that we delete-on-the-fly once again.
2025-02-05 10:12:30 -05:00
Dave Gosselin
5001300bd4 MDEV-30469 Support ORDER BY and LIMIT for multi-table DELETE, index hints for single-table DELETE
We now allow multitable queries with order by and limit, such as:
  delete t1.*, t2.* from t1, t2 order by t1.id desc limit 3;
To predict what rows will be deleted, run the equivalent select:
  select t1.*, t2.* from t1, t2 order by t1.id desc limit 3;
Additionally, index hints are now supported with single table delete statements:
  delete from t2 use index(xid) order by (id) limit 2;

This approach changes the multi_delete SELECT result interceptor to use a temporary
table to collect row ids pertaining to the rows that will be deleted, rather than
directly deleting rows from the target table(s).  Row ids are collected during
send_data, then read during send_eof to delete target rows.  In the event that the
temporary table created in memory is not big enough for all matching rows, it is
converted to an aria table.

Other changes:
  - Deleting from a sequence now affects zero rows instead of emitting an error

Limitations:
  - The federated connector does not create implicit row ids, so we need to use a key
  when conditionally deleting.  See the change in federated_maybe_16324629.test
2025-02-05 10:12:27 -05:00
Dave Gosselin
02dc8615f2 MDEV-30469 (refactoring) Support ORDER BY and LIMIT for multi-table DELETE...
This patch includes a few changes to make the code easier to maintain:
  - Renamed SQL_I_List::link_in_list to SQL_I_List::insert.  link_in_list was
  ambiguous as it could refer to a link or it could refer to a node
  - Remove field_name local variable in multi_update::initialize_tables because
  it is not used when creating the temporary tables
  - multi_update changes:
    - Move temp table callocs to init, a more natural location for them, and moved
    tables_to_update to const member variable so we don't recompute it.
    - Filter out jtbm tables and tables not in the update map, pushing those that
    will be updated into an update_targets container.  This simplifies checks and
    loops in initialize_tables.
2025-02-05 10:08:58 -05:00
Sergei Golubchik
7d657fda64 Merge branch '10.11' into 11.4 2025-01-30 12:01:11 +01:00
Sergei Golubchik
e69f8cae1a Merge branch '10.6' into 10.11 2025-01-30 11:55:13 +01:00
Sergei Golubchik
066e8d6aea Merge branch '10.5' into 10.6 2025-01-29 11:17:38 +01:00
Dmitry Shulga
4c956fa15b MDEV-34724: Skipping a row operation from a trigger
Implementation of this task adds the ability to raise a signal with
SQLSTATE '02TRG' from a BEFORE INSERT/UPDATE/DELETE trigger and handles
this signal as an indicator meaning 'throw away the current row'
when processing the INSERT/UPDATE/DELETE statement. The signal with
SQLSTATE '02TRG' has special meaning only when it is raised inside
BEFORE triggers; for AFTER triggers this value of SQLSTATE isn't treated
in any special way. In accordance with the SQL standard, the SQLSTATE class '02'
means NO DATA and sql_errno for this class is set to the value
ER_SIGNAL_NOT_FOUND by the current implementation of MariaDB server.
Implementation of this task assigns the value ER_SIGNAL_SKIP_ROW_FROM_TRIGGER
to sql_errno in Diagnostics_area in case the signal is raised from a trigger
and SQLSTATE has the value '02TRG'.

To catch the signal with SQLSTATE '02TRG' and handle it in a special way, the methods
 Table_triggers_list::process_triggers
 select_insert::store_values
 select_create::store_values
 Rows_log_event::process_triggers
and the overloaded function
 fill_record_n_invoke_before_triggers
were extended with an extra out parameter for returning the flag whether
to skip the current values being processed by the INSERT/UPDATE/DELETE
statement. This extra parameter is passed as nullptr in case of an AFTER trigger;
for a BEFORE trigger it points to a variable that stores a marker
whether to skip the current record or store it by calling write_record().
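
The out-parameter convention can be sketched as below. The names, the error-code value and the use of std::function are all hypothetical; only the convention (a non-null pointer for BEFORE triggers, nullptr for AFTER triggers) is taken from the message.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Stands in for ER_SIGNAL_SKIP_ROW_FROM_TRIGGER; the value is arbitrary.
constexpr int kSkipRowSignal = 1;

// A trigger returns 0 on success or an error code.
using TriggerSketch = std::function<int(int row)>;

// Returns true on error. When skip_row is non-null (BEFORE trigger), the
// skip signal is reported through it instead of being treated as an error.
bool process_trigger(const TriggerSketch &trg, int row, bool *skip_row) {
  assert(trg);
  const int err = trg(row);
  if (err == kSkipRowSignal && skip_row != nullptr) {
    *skip_row = true;
    return false;
  }
  return err != 0;
}

// Returns false if a trigger failed; skipped rows are simply not stored.
bool insert_rows(const std::vector<int> &rows, const TriggerSketch &before,
                 std::vector<int> &table) {
  for (int row : rows) {
    bool skip = false;
    if (process_trigger(before, row, &skip)) return false;
    if (skip) continue;    // throw away the current row
    table.push_back(row);  // stands in for write_record()
  }
  return true;
}
```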
2025-01-27 16:30:27 +07:00
Nikita Malyavin
ecaedbe299 MDEV-33658 1/2 Refactoring: extract Key length initialization
mysql_prepare_create_table: Extract a Key initialization part that
relates to length calculation and long unique index designation.

append_system_key_parts call also moves there.

Move this initialization before the duplicate elimination.

Extract WITHOUT OVERLAPS check into a separate function. It had to be moved
earlier in the code to preserve the order of the error checks, as in the tests.
2025-01-26 16:15:46 +01:00
Marko Mäkelä
98dbe3bfaf Merge 10.5 into 10.6 2025-01-20 09:57:37 +02:00
Aleksey Midenkov
0cf2176b79 MDEV-34033 Exchange partition with virtual columns fails
MDEV-28127 did is_equal() which compared vcol expressions
literally. But the other table's vcol expression is not equal because of
the different table name.

We implement another comparison method is_identical() which respects
different table name in vcol comparison. If any field item points to
table_A and compared field item points to table_B, such items are
treated as equal in (table_A, table_B) comparison. This is done by
cloning table_B expression and renaming any table_B entries to table_A
in it.
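
The clone-and-rename comparison can be sketched as follows. Here an expression is flattened to the list of field references it contains, which is a deliberate simplification; FieldRefSketch, ExprSketch and both function bodies are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <vector>

struct FieldRefSketch {
  std::string table;
  std::string column;
};

bool operator==(const FieldRefSketch &a, const FieldRefSketch &b) {
  return a.table == b.table && a.column == b.column;
}

// An expression reduced to the field references it contains.
using ExprSketch = std::vector<FieldRefSketch>;

// Literal comparison: table names must match too.
bool is_equal(const ExprSketch &a, const ExprSketch &b) { return a == b; }

// Comparison that treats table_b in expression b as if it were table_a.
// b is taken by value, so the caller's expression is left untouched (the
// "clone"), and the copy is renamed before the literal comparison.
bool is_identical(const ExprSketch &a, const std::string &table_a,
                  ExprSketch b, const std::string &table_b) {
  assert(!table_a.empty() && !table_b.empty());
  for (FieldRefSketch &f : b)
    if (f.table == table_b) f.table = table_a;
  return a == b;
}
```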
2025-01-14 18:56:13 +03:00
Oleksandr Byelkin
0d35fe6e57 MDEV-35326: Memory Leak in init_io_cache_ext upon SHUTDOWN
The problems were that:
1) resources were freed "asymmetrically": in send_eof on normal execution,
 in the destructor in case of error.
2) the destructor was not called in case of SP for result objects.
(so if the last SP execution ended with an error, resources were not
freed on reinit before execution (cleanup() is called before the next
execution) and the destructor also was not called due to the lack of a
delete call for the object)

Result cleanup() renamed to reset_for_next_ps_execution() to better
reflect its function.

All result methods revised and freeing of resources made "symmetric".

Destructor of result object called for SP.

Added skipped invalidation in case of error in insert.

Removed misleading naming of reset(thd) (could be mixed up with
reset()).
2025-01-13 10:04:27 +01:00
Marko Mäkelä
15700f54c2 Merge 11.4 into 11.7 2025-01-09 09:41:38 +02:00
Marko Mäkelä
17f01186f5 Merge 10.11 into 11.4 2025-01-09 07:58:08 +02:00
Marko Mäkelä
420d9eb27f Merge 10.6 into 10.11 2025-01-08 12:51:26 +02:00
Monty
2085f36c6c Removed not used and not visible send_metdata_skip variable.
Reviewed-by: Sergei Golubchik <serg@mariadb.org>
2025-01-05 16:40:11 +02:00
Monty
e600f9aebb MDEV-35750 Change MEM_ROOT allocation sizes to reduce calls to malloc() and avoid memory fragmentation
This commit updates the default memory allocation sizes used with MEM_ROOT
objects to minimize the number of calls to malloc().

Changes:
- Updated MEM_ROOT block sizes in sql_const.h
- Updated MALLOC_OVERHEAD to also take into account the extra memory
  allocated by my_malloc()
- Updated init_alloc_root() to only take MALLOC_OVERHEAD into account as
  buffer size, not MALLOC_OVERHEAD + sizeof(USED_MEM).
- Reset mem_root->first_block_usage if and only if first block was used.
- Increase MEM_ROOT buffer sizes used by my_load_defaults, plugin_init,
  Create_tmp_table, allocate_table_share, TABLE and TABLE_SHARE.
  This decreases number of malloc calls during queries.
- Use a small buffer for THD->main_mem_root in THD::THD. This avoids
  multiple malloc() calls for new connections.

I tried the above changes on a complex select query with 12 tables.
The following shows the number of extra allocations that were used
to increase the size of the MEM_ROOT buffers.

Original code:
- Connection to MariaDB:   9 allocations
- First query run:       146 allocations
- Second query run:       24 allocations

Max memory allocated for thd when using with heap table:  61,262,408
Max memory allocated for thd when using Aria tmp table:      419,464

After changes:
- Connection to MariaDB:   0 allocations
- First run:              25 allocations
- Second run:              7 allocations

Max memory allocated for thd when using with heap table:  61,347,424
Max memory allocated for thd when using Aria table:          529,168

The new code uses slightly more memory, but avoids memory fragmentation
and is slightly faster thanks to much fewer calls to malloc().

Reviewed-by: Sergei Golubchik <serg@mariadb.org>
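
Why larger blocks mean fewer malloc() calls can be seen with a toy arena allocator. This is not MEM_ROOT: ArenaSketch is a hypothetical, much simplified model that ignores alignment, malloc overhead and block reuse, and counts one underlying allocation per block.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

class ArenaSketch {
public:
  explicit ArenaSketch(std::size_t block_size) : block_size_(block_size) {}

  // Hands out n bytes from the current block, starting a new block only
  // when the current one cannot satisfy the request.
  void *alloc(std::size_t n) {
    assert(n > 0);
    if (n > free_) {
      const std::size_t size = std::max(n, block_size_);
      blocks_.push_back(std::make_unique<char[]>(size));
      cursor_ = blocks_.back().get();
      free_ = size;
    }
    void *p = cursor_;
    cursor_ += n;
    free_ -= n;
    return p;
  }

  // Number of underlying allocations made so far.
  std::size_t malloc_calls() const { return blocks_.size(); }

private:
  std::size_t block_size_;
  std::size_t free_ = 0;
  char *cursor_ = nullptr;
  std::vector<std::unique_ptr<char[]>> blocks_;
};
```

With 100 requests of 64 bytes each, a 1024-byte block size needs 7 blocks in this model (16 requests per block), while an 8192-byte block size needs only 1, at the cost of more memory reserved up front.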
2025-01-05 16:40:11 +02:00
Monty
95975b921e MDEV-35720 Add query_time to statistics
Added Query_time (total time spent running queries) to status_variables.

Other things:
- Added SHOW_MICROSECOND_STATUS type that shows a ulonglong variable
  in microseconds converted to a double (in seconds).
- Changed Busy_time and Cpu_time to use SHOW_MICROSECOND_STATUS, which
  simplified the code and avoids some double divisions for each query.
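
The saving comes from accumulating integer microseconds per query and converting to seconds only when the value is displayed. A minimal sketch, with QueryTimeSketch as a hypothetical name:

```cpp
#include <cassert>

struct QueryTimeSketch {
  unsigned long long query_time_us = 0;  // accumulated in microseconds

  // Per query: a single integer addition, no floating-point division.
  void add_query(unsigned long long elapsed_us) {
    assert(query_time_us + elapsed_us >= query_time_us);  // no wrap-around
    query_time_us += elapsed_us;
  }

  // Only when the status variable is shown: convert to seconds.
  double show_seconds() const {
    return static_cast<double>(query_time_us) / 1e6;
  }
};
```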

Reviewed-by: Sergei Golubchik <serg@mariadb.org>
2024-12-30 16:13:20 +02:00
Marko Mäkelä
33907f9ec6 Merge 11.4 into 11.7 2024-12-02 17:51:17 +02:00