1
0
mirror of https://github.com/MariaDB/server.git synced 2025-11-09 11:41:36 +03:00
Commit Graph

351 Commits

Author SHA1 Message Date
Oleksandr Byelkin
ba00960fda Merge branch 'bb-11.8-release' into bb-12.1-release 2025-11-05 08:58:12 +01:00
Sergei Golubchik
c21d462e6a Merge branch 'bb-11.4-serg' into bb-11.8-serg 2025-11-04 18:27:52 +01:00
Oleg Smirnov
aa70eeac2c MDEV-36761: Put NULL-aware cardinality estimation under new_mode flag
The NULL-aware index statistics fix is now controlled by the
FIX_INDEX_STATS_FOR_ALL_NULLS flag and disabled by default
for preserving execution plan stability in stable versions.

To enable:
  SET @@new_mode = 'FIX_INDEX_STATS_FOR_ALL_NULLS';

Or via command line:
  --new-mode=FIX_INDEX_STATS_FOR_ALL_NULLS

Or in configuration file:
  [mysqld]
  new_mode=FIX_INDEX_STATS_FOR_ALL_NULLS

`all_nulls_key_parts` bitmap is now calculated at set_statistics_for_table()
2025-10-31 19:47:30 +07:00
Oleg Smirnov
d0ac583df8 MDEV-36761 Implement NULL-aware cardinality estimation for indexed columns
(Preparation for the main patch)

set_statistics_for_table() incorrectly treated indexes with all NULL
values the same as indexes with no statistics, because avg_frequency
is 0 in both cases. This caused the optimizer to ignore valid EITS
data and fall back to engine statistics.

Additionally, KEY::actual_rec_per_key() would fall back to engine
statistics even when EITS was available, and used incorrect pointer
comparison (rec_per_key == 0 instead of nullptr).

Fix by adding Index_statistics::stats_were_read flag to track per-index
whether statistics were actually read from persistent tables, and
restructuring actual_rec_per_key() to prioritize EITS when available.
2025-10-25 01:01:32 +07:00
Oleksandr Byelkin
c976b527db Merge branch '11.8' into bb-12.1-release 2025-10-08 09:05:38 +02:00
Aleksey Midenkov
ff33f49d9a Merge 11.4 into 11.8 2025-09-29 18:25:09 +03:00
Oleksandr Byelkin
15b1426c3a Merge branch '10.11' into bb-11.4-release 2025-09-15 16:17:33 +02:00
Monty
882f6fa3aa Fixed typos
- Removed duplicate words, like "the the" and "to to"
- Removed duplicate lines (one double sort line found in mysql.cc)
- Fixed some typos found while searching for duplicate words.

Command used to find duplicate words:
egrep -rI "\s([a-zA-Z]+)\s+\1\s" | grep -v param

Thanks to Artjoms Rimdjonoks for the command and pointing out the
spelling errors.
2025-09-04 18:08:39 +03:00
Oleksandr Byelkin
dfcb5c91e0 Merge branch '11.8' into 12.0 2025-06-18 07:50:39 +02:00
Oleksandr Byelkin
a65f7dc71d Merge branch '11.4' into 11.8 2025-06-18 07:43:24 +02:00
Oleksandr Byelkin
89c7e2b9c7 Merge branch '10.11' into 11.4 2025-06-17 09:50:22 +02:00
Marko Mäkelä
96b7671b05 MDEV-34388 fixup: Stack usage in rename_table_in_stat_tables()
rename_table_in_stat_tables(): Allocate TABLE_LIST[STATISTICS_TABLES]
from the heap instead of the stack, to pass -Wframe-larger-than=16384
in an optimized CMAKE_BUILD_TYPE=RelWithDebInfo build.
2025-05-23 14:51:38 +03:00
Oleksandr Byelkin
f1102da37a Merge branch '11.8' into 12.0 2025-05-22 09:22:55 +02:00
Marko Mäkelä
1c7209e828 Merge 10.6 into 10.11 2025-05-21 07:36:35 +03:00
Marko Mäkelä
82d7419e06 MDEV-34388: Stack overflow on Alpine Linux
page_is_corrupted(): Do not allocate the buffers from stack,
but from the heap, in xb_fil_cur_open().

row_quiesce_write_cfg(): Issue one type of message when we
fail to create the .cfg file.

update_statistics_for_table(), read_statistics_for_table(),
delete_statistics_for_table(), rename_table_in_stat_tables():
Use a common stack buffer for Index_stat, Column_stat, Table_stat.

ha_connect::FileExists(): Invoke push_warning_printf() so that
we can avoid allocating a buffer for snprintf().

translog_init_with_table(): Do not duplicate TRANSLOG_PAGE_SIZE_BUFF.

Let us also globally enable the GCC 4.4 and clang 3.0 option
-Wframe-larger-than=16384 to reduce the possibility of introducing
such stack overflow in the future.  For RocksDB and Mroonga we relax
these limits.

Reviewed by: Vladislav Lesin
2025-05-20 17:27:05 +03:00
Sergei Golubchik
237e24497b Merge remote-tracking branch 'github/bb-11.4-release' into bb-11.8-serg 2025-04-27 19:40:00 +02:00
Oleksandr Byelkin
a8d4642375 Merge branch '10.11' into 11.4 2025-04-26 10:53:02 +02:00
Dave Gosselin
00fa5b8676 MDEV-35721 UBSAN: runtime error: -nan outside range
Prevent division by zero when reading table statistics
2025-04-09 09:42:15 +10:00
Vasilii Lakhin
717c12de0e Fix typos in C comments inside sql/ 2025-03-14 12:08:56 +04:00
Marko Mäkelä
bb9f010432 Merge 11.4 into 11.8 2025-03-05 20:39:47 +02:00
Marko Mäkelä
49a6baec56 Merge 10.11 into 11.4 2025-03-03 11:07:56 +02:00
ParadoxV5
d5ba6f71b9 Tag push_warning_printf with ATTRIBUTE_FORMAT
* Let GCC `-Wformat` check formats sent to these `my_vsnprintf_ex` users
* Migrate them from the old extension specifiers
  to the new `-Wformat`-compatible suffixes
2025-02-12 10:17:44 +01:00
Sergei Petrunia
91de54e098 Remove redundant if-statement in Index_prefix_calc::get_avg_frequency
the code went like this:

   for ( ... ; i < prefixes ; ...)
   {
      if (i < prefixes)
      {
        ...
2025-02-04 21:47:43 +02:00
Marko Mäkelä
33907f9ec6 Merge 11.4 into 11.7 2024-12-02 17:51:17 +02:00
Marko Mäkelä
2719cc4925 Merge 10.11 into 11.4 2024-12-02 11:35:34 +02:00
Marko Mäkelä
3d23adb766 Merge 10.6 into 10.11 2024-11-29 13:43:17 +02:00
Marko Mäkelä
7d4077cc11 Merge 10.5 into 10.6 2024-11-29 12:37:46 +02:00
Brandon Nesterenko
dbfee9fc2b MDEV-34348: Consolidate cmp function declarations
Partial commit of the greater MDEV-34348 scope.
MDEV-34348: MariaDB is violating clang-16 -Wcast-function-type-strict

The functions queue_compare, qsort2_cmp, and qsort_cmp2
all had similar interfaces, and were used interchangable
and unsafely cast to one another.

This patch consolidates the functions all into the
qsort_cmp2 interface.

Reviewed By:
============
Marko Mäkelä <marko.makela@mariadb.com>
2024-11-23 08:14:22 -07:00
Sergei Golubchik
8ac3f0b1d4 MDEV-35038 Server crash in Index_statistics::get_avg_frequency upon EITS collection for vector index
don't collect eits for vector indexes
2024-11-05 14:00:50 -08:00
Sergei Golubchik
44c6328cbb cleanup: thd->alloc<>() and thd->calloc<>()
create templates

  thd->alloc<X>(n) to use instead of (X*)thd->alloc(sizeof(X)*n)

and the same for thd->calloc(). By the default the type is char,
so old usage of thd->alloc(size) works too.
2024-11-05 14:00:48 -08:00
Sergei Golubchik
062f8eb37d cleanup: key algorithm vs key flags
the information about index algorithm was stored in two
places inconsistently split between both.

BTREE index could have key->algorithm == HA_KEY_ALG_BTREE, if the user
explicitly specified USING BTREE or HA_KEY_ALG_UNDEF, if not.

RTREE index had key->algorithm == HA_KEY_ALG_RTREE
and always had key->flags & HA_SPATIAL

FULLTEXT index had  key->algorithm == HA_KEY_ALG_FULLTEXT
and always had key->flags & HA_FULLTEXT

HASH index had key->algorithm == HA_KEY_ALG_HASH or HA_KEY_ALG_UNDEF

long unique index always had key->algorithm == HA_KEY_ALG_LONG_HASH

In this commit:

All indexes except BTREE and HASH always have key->algorithm
set, HA_SPATIAL and HA_FULLTEXT flags are not used anymore (except
for storage to keep frms backward compatible).

As a side effect ALTER TABLE now detects FULLTEXT index renames correctly
2024-11-05 14:00:47 -08:00
Alexander Barkov
4e805aed85 Merge remote-tracking branch 'origin/11.4' into 11.5 2024-07-10 12:17:09 +04:00
Oleksandr Byelkin
2447dda2c0 Merge branch '10.11' into 11.1 2024-07-08 22:40:16 +02:00
Marko Mäkelä
27a3366663 Merge 10.6 into 10.11 2024-06-27 10:26:09 +03:00
Marko Mäkelä
0076eb3d4e Merge 10.5 into 10.6 2024-06-24 13:09:47 +03:00
Dave Gosselin
db0c28eff8 MDEV-33746 Supply missing override markings
Find and fix missing virtual override markings.  Updates cmake
maintainer flags to include -Wsuggest-override and
-Winconsistent-missing-override.
2024-06-20 11:32:13 -04:00
Monty
b9f5793176 MDEV-9101 Limit size of created disk temporary files and tables
Two new variables added:
- max_tmp_space_usage : Limits the the temporary space allowance per user
- max_total_tmp_space_usage: Limits the temporary space allowance for
  all users.

New status variables: tmp_space_used & max_tmp_space_used
New field in information_schema.process_list: TMP_SPACE_USED

The temporary space is counted for:
- All SQL level temporary files. This includes files for filesort,
  transaction temporary space, analyze, binlog_stmt_cache etc.
  It does not include engine internal temporary files used for repair,
  alter table, index pre sorting etc.
- All internal on disk temporary tables created as part of resolving a
  SELECT, multi-source update etc.

Special cases:
- When doing a commit, the last flush of the binlog_stmt_cache
  will not cause an error even if the temporary space limit is exceeded.
  This is to avoid giving errors on commit. This means that a user
  can temporary go over the limit with up to binlog_stmt_cache_size.

Noteworthy issue:
- One has to be careful when using small values for max_tmp_space_limit
  together with binary logging and with non transactional tables.
  If a the binary log entry for the query is bigger than
  binlog_stmt_cache_size and one hits the limit of max_tmp_space_limit
  when flushing the entry to disk, the query will abort and the
  binary log will not contain the last changes to the table.
  This will also stop the slave!
  This is also true for all Aria tables as Aria cannot do rollback
  (except in case of crashes)!
  One way to avoid it is to use @@binlog_format=statement for
  queries that updates a lot of rows.

Implementation:
- All writes to temporary files or internal temporary tables, that
  increases the file size, are routed through temp_file_size_cb_func()
  which updates and checks the temp space usage.
- Most of the temporary file monitoring is done inside IO_CACHE.
  Temporary file monitoring is done inside the Aria engine.
- MY_TRACK and MY_TRACK_WITH_LIMIT are new flags for ini_io_cache().
  MY_TRACK means that we track the file usage. TRACK_WITH_LIMIT means
  that we track the file usage and we give an error if the limit is
  breached. This is used to not give an error on commit when
  binlog_stmp_cache is flushed.
- global_tmp_space_used contains the total tmp space used so far.
  This is needed quickly check against max_total_tmp_space_usage.
- Temporary space errors are using EE_LOCAL_TMP_SPACE_FULL and
  handler errors are using HA_ERR_LOCAL_TMP_SPACE_FULL.
  This is needed until we move general errors to it's own error space
  so that they cannot conflict with system error numbers.
- Return value of my_chsize() and mysql_file_chsize() has changed
  so that -1 is returned in the case my_chsize() could not decrease
  the file size (very unlikely and will not happen on modern systems).
  All calls to _chsize() are updated to check for > 0 as the error
  condition.
- At the destruction of THD we check that THD::tmp_file_space == 0
- At server end we check that global_tmp_space_used == 0
- As a precaution against errors in the tmp_space_used code, one can set
  max_tmp_space_usage and max_total_tmp_space_usage to 0 to disable
  the tmp space quota errors.
- truncate_io_cache() function added.
- Aria tables using static or dynamic row length are registered in 8K
  increments to avoid some calls to update_tmp_file_size().

Other things:
- Ensure that all handler errors are registered.  Before, some engine
  errors could be printed as "Unknown error".
- Fixed bug in filesort() that causes a assert if there was an error
  when writing to the temporay file.
- Fixed that compute_window_func() now takes into account write errors.
- In case of parallel replication, rpl_group_info::cleanup_context()
  could call trans_rollback() with thd->error set, which would cause
  an assert. Fixed by resetting the error before calling trans_rollback().
- Fixed bug in subselect3.inc which caused following test to use
  heap tables with low value for max_heap_table_size
- Fixed bug in sql_expression_cache where it did not overflow
  heap table to Aria table.
- Added Max_tmp_disk_space_used to slow query log.
- Fixed some bugs in log_slow_innodb.test
2024-05-27 12:39:04 +02:00
Monty
ae9a4799d7 MDEV-33938 Analyze table on sequences should be prohibited 2024-05-27 12:39:03 +02:00
Monty
ab513b007b Optimize checking if a table is a statistics table 2024-05-27 12:39:02 +02:00
Oleksandr Byelkin
dd7d9d7fb1 Merge branch '11.4' into 11.5 2024-05-23 17:01:43 +02:00
Sergei Golubchik
f9807aadef Merge branch '10.11' into 11.0 2024-05-12 12:18:28 +02:00
Oleksandr Byelkin
c9b1ebee2f Merge branch '10.6' into 10.11 2024-04-26 08:02:49 +02:00
Monty
0ccdf54b64 Check and remove high stack usage
I checked all stack overflow potential problems found with
gcc -Wstack-usage=16384
and
clang -Wframe-larger-than=16384 -no-inline

Fixes:
Added '#pragma clang diagnostic ignored "-Wframe-larger-than="'
  to a lot of function to where stack usage large but resonable.
- Added stack check warnings to BUILD scrips when using clang and debug.

Function changed to use malloc instead allocating things on stack:
- read_bootstrap_query() now allocates line_buffer (20000 bytes) with
  malloc() instead of using stack. This has a small performance impact
  but this is not releant for bootstrap.
- mroonga grn_select() used 65856 bytes on stack. Changed it to use
  malloc().
- Wsrep_schema::replay_transaction() and
  Wsrep_schema::recover_sr_transactions().
- Connect zipOpen3()

Not fixed:
- mroonga/vendor/groonga/lib/expr.c grn_proc_call() uses
  43712 byte on stack.  However this is not easy to fix as the stack
  used is caused by a lot of code generated by defines.
- Most changes in mroonga/groonga where only adding of pragmas to disable
  stack warnings.
- rocksdb/options/options_helper.cc uses 20288 of stack space.
  (no reason to fix except to get rid of the compiler warning)
- Causes using alloca() where the allocation size is resonable.
- An issue in libmariadb (reported to connectors).
2024-04-23 14:12:31 +03:00
Alexander Barkov
fd247cc21f MDEV-31340 Remove MY_COLLATION_HANDLER::strcasecmp()
This patch also fixes:
  MDEV-33050 Build-in schemas like oracle_schema are accent insensitive
  MDEV-33084 LASTVAL(t1) and LASTVAL(T1) do not work well with lower-case-table-names=0
  MDEV-33085 Tables T1 and t1 do not work well with ENGINE=CSV and lower-case-table-names=0
  MDEV-33086 SHOW OPEN TABLES IN DB1 -- is case insensitive with lower-case-table-names=0
  MDEV-33088 Cannot create triggers in the database `MYSQL`
  MDEV-33103 LOCK TABLE t1 AS t2 -- alias is not case sensitive with lower-case-table-names=0
  MDEV-33109 DROP DATABASE MYSQL -- does not drop SP with lower-case-table-names=0
  MDEV-33110 HANDLER commands are case insensitive with lower-case-table-names=0
  MDEV-33119 User is case insensitive in INFORMATION_SCHEMA.VIEWS
  MDEV-33120 System log table names are case insensitive with lower-cast-table-names=0

- Removing the virtual function strnncoll() from MY_COLLATION_HANDLER

- Adding a wrapper function CHARSET_INFO::streq(), to compare
  two strings for equality. For now it calls strnncoll() internally.
  In the future it will turn into a virtual function.

- Adding new accent sensitive case insensitive collations:
    - utf8mb4_general1400_as_ci
    - utf8mb3_general1400_as_ci
  They implement accent sensitive case insensitive comparison.
  The weight of a character is equal to the code point of its
  upper case variant. These collations use Unicode-14.0.0 casefolding data.

  The result of
     my_charset_utf8mb3_general1400_as_ci.strcoll()
  is very close to the former
     my_charset_utf8mb3_general_ci.strcasecmp()

  There is only a difference in a couple dozen rare characters, because:
    - the switch from "tolower" to "toupper" comparison, to make
      utf8mb3_general1400_as_ci closer to utf8mb3_general_ci
    - the switch from Unicode-3.0.0 to Unicode-14.0.0
  This difference should be tolarable. See the list of affected
  characters in the MDEV description.

  Note, utf8mb4_general1400_as_ci correctly handles non-BMP characters!
  Unlike utf8mb4_general_ci, it does not treat all BMP characters
  as equal.

- Adding classes representing names of the file based database objects:

    Lex_ident_db
    Lex_ident_table
    Lex_ident_trigger

  Their comparison collation depends on the underlying
  file system case sensitivity and on --lower-case-table-names
  and can be either my_charset_bin or my_charset_utf8mb3_general1400_as_ci.

- Adding classes representing names of other database objects,
  whose names have case insensitive comparison style,
  using my_charset_utf8mb3_general1400_as_ci:

  Lex_ident_column
  Lex_ident_sys_var
  Lex_ident_user_var
  Lex_ident_sp_var
  Lex_ident_ps
  Lex_ident_i_s_table
  Lex_ident_window
  Lex_ident_func
  Lex_ident_partition
  Lex_ident_with_element
  Lex_ident_rpl_filter
  Lex_ident_master_info
  Lex_ident_host
  Lex_ident_locale
  Lex_ident_plugin
  Lex_ident_engine
  Lex_ident_server
  Lex_ident_savepoint
  Lex_ident_charset
  engine_option_value::Name

- All the mentioned Lex_ident_xxx classes implement a method streq():

  if (ident1.streq(ident2))
     do_equal();

  This method works as a wrapper for CHARSET_INFO::streq().

- Changing a lot of "LEX_CSTRING name" to "Lex_ident_xxx name"
  in class members and in function/method parameters.

- Replacing all calls like
    system_charset_info->coll->strcasecmp(ident1, ident2)
  to
    ident1.streq(ident2)

- Taking advantage of the c++11 user defined literal operator
  for LEX_CSTRING (see m_strings.h) and Lex_ident_xxx (see lex_ident.h)
  data types. Use example:

  const Lex_ident_column primary_key_name= "PRIMARY"_Lex_ident_column;

  is now a shorter version of:

  const Lex_ident_column primary_key_name=
    Lex_ident_column({STRING_WITH_LEN("PRIMARY")});
2024-04-18 15:22:10 +04:00
Oleksandr Byelkin
48af85db21 Merge branch '10.11' into 11.0 2023-11-08 17:09:44 +01:00
Marko Mäkelä
5a8fca5a4f Merge 10.6 into 10.10 2023-10-23 18:43:36 +03:00
Monty
a1b6befc78 Fixed crash in is_stat_table() when using hash joins.
Other usage if persistent statistics is checking 'stats_is_read' in
caller, which is why this was not noticed earlier.

Other things:
- Simplified no_stat_values_provided
2023-10-19 16:17:01 +03:00
Marko Mäkelä
be24e75229 Merge 10.11 into 11.0 2023-10-19 08:12:16 +03:00
Monty
a49ebf71af Fixed memory leak when using histograms
This was introduced in last merge with 10.6
The reason is that 10.6 does not need anything special to free histograms
as everything is allocated on a memroot.
In 10.10 histograms is using the vector class, which has some problems:
- No automatic free
- No memory usage accounting
(we should at some point remove vector usage because of the above problem)

Fixed by expliciting freeing histograms when freeing TABLE_STATISTICS
objects.
2023-10-17 15:12:49 +03:00
Monty
ec277a70e8 Do not create histograms for single column unique key
The intentention was always to not create histograms for single value
unique keys (as histograms is not useful in this case), but because of
a bug in the code this was still done.

The changes in the test cases was mainly because hist_size is now NULL
for these kind of columns.
2023-10-14 13:43:26 +03:00