1
0
mirror of https://github.com/MariaDB/server.git synced 2025-10-25 18:38:00 +03:00
Commit Graph

149 Commits

Author SHA1 Message Date
Sergei Golubchik
c4f0133444 cleanup: stat tables
don't allocate Column_statistics_collected objects that won't
be used.

minor style fixes (StringBuffer<>, etc)
2021-02-22 19:27:12 +01:00
Sergei Golubchik
06a791aa12 MDEV-23753: SIGSEGV in Column_stat::store_stat_fields
only collect persistent stats for columns explicitly listed
by the user in the  ANALYZE TABLE PERSISTENT FOR COLUMNS (...)
clause. The engine can extend table->read_set as much as
it wants, it should not affect the collected statistics.

Test case from the 3b94309a6c applies - it used to crash,
because ha_partition extended table->read_set after the loop that
initialized some objects based on bits in the read_set but before the
loop that used these objects based on bits in the read_set.
2021-02-22 19:27:12 +01:00
Sergei Golubchik
caad32ca92 Revert "MDEV-23753: SIGSEGV in Column_stat::store_stat_fields"
This reverts the commit 3b94309a6c but keeps the test

Because the fix is a hack that isn't supposed to do anything,
and relies on a side-effect of rnd_init inside ha_partition.

A different fix is coming up.
2021-02-22 19:27:12 +01:00
Varun Gupta
a461e4d306 MDEV-19474: Histogram statistics are used even with optimizer_use_condition_selectivity=3
The issue here was histogram statistics were being used even when
the level of optimizer_use_condition_selectivity doesn't allow
usage of statistics from histogram.

The histogram statistics are read for a table only when
optimizer_use_condition_selectivity > 3. But the TABLE structure can be
stored in the internal table cache and be reused for the next query.
So in this case the histogram statistics will be available for the next query.

The fix would be to make sure to use the histogram statistics only when
optimizer_use_condition_selectivity > 3.
2021-02-16 11:53:13 +05:30
Varun Gupta
072b39da66 MDEV-22583: Selectivity for BIT columns in filtered column for EXPLAIN is incorrect
For BIT columns when EITS is collected, we store the integral value in
text representation in the min and max fields of the statistical table
When this value is retrieved from the statistical table to original table
field then we try to store the text representation in the original field
which is INCORRECT.

The value that is retrieved should be converted to integral type and that
value should be stored back in the original field. This would get us the
correct estimate for selectivity of the predicate.
2021-01-30 22:36:51 +05:30
Varun Gupta
3b94309a6c MDEV-23753: SIGSEGV in Column_stat::store_stat_fields
For EITS collection min and max fields are allocated for each column
that is set in the read_set bitmap of a table. This allocation of min and max
fields happens inside alloc_statistics_for_table.

For a partitioned table ha_rnd_init is called inside the function
collect_statistics_for_table which sets the read_set bitmap for the columns
inside the partition expression. This happens only when there is a write lock
on the partitioned table.
But the allocation happens before this, so min and max fields are not allocated
for the columns involved in the partition expression.
This resulted in a crash, as the EITS statistics were collected but there was
no min and max field to store the value to.

The fix would be to call ha_rnd_init inside the function alloc_statistics_for_table
that would make sure that min and max fields are allocated for the columns
involved in the partition expression.
2021-01-12 18:47:35 +05:30
Nikita Malyavin
e25623e78a MDEV-17556 Assertion `bitmap_is_set_all(&table->s->all_set)' failed
The assertion failed in handler::ha_reset upon SELECT under
READ UNCOMMITTED from table with index on virtual column.

This was the debug-only failure, though the problem is mush wider:
* MY_BITMAP is a structure containing my_bitmap_map, the latter is a raw
 bitmap.
* read_set, write_set and vcol_set of TABLE are the pointers to MY_BITMAP
* The rest of MY_BITMAPs are stored in TABLE and TABLE_SHARE
* The pointers to the stored MY_BITMAPs, like orig_read_set etc, and
 sometimes all_set and tmp_set, are assigned to the pointers.
* Sometimes tmp_use_all_columns is used to substitute the raw bitmap
 directly with all_set.bitmap
* Sometimes even bitmaps are directly modified, like in
TABLE::update_virtual_field(): bitmap_clear_all(&tmp_set) is called.

The last three bullets in the list, when used together (which is mostly
always) make the program flow cumbersome and impossible to follow,
notwithstanding the errors they cause, like this MDEV-17556, where tmp_set
pointer was assigned to read_set, write_set and vcol_set, then its bitmap
was substituted with all_set.bitmap by dbug_tmp_use_all_columns() call,
and then bitmap_clear_all(&tmp_set) was applied to all this.

To untangle this knot, the rule should be applied:
* Never substitute bitmaps! This patch is about this.
 orig_*, all_set bitmaps are never substituted already.

This patch changes the following function prototypes:
* tmp_use_all_columns, dbug_tmp_use_all_columns
 to accept MY_BITMAP** and to return MY_BITMAP * instead of my_bitmap_map*
* tmp_restore_column_map, dbug_tmp_restore_column_maps to accept
 MY_BITMAP* instead of my_bitmap_map*

These functions now will substitute read_set/write_set/vcol_set directly,
and won't touch underlying bitmaps.
2021-01-08 16:04:29 +10:00
Marko Mäkelä
ca9276e37e Merge 10.1 into 10.2 2020-07-20 14:53:24 +03:00
Varun Gupta
dfdfeecb03 MDEV-22851: Engine independent index statistics are incorrect for large tables on Windows
An oveflow was happening on windows because on Windows sizeof(ulong) is 4 bytes
while it is 8 bytes on Linux.
Switched avg_frequency and avg length for column statistics to ulonglong.
Switched avg_frequency for index statistics to ulonglong.
2020-07-15 11:27:32 +05:30
Marko Mäkelä
d72eebaa3d Merge 10.1 into 10.2 2020-06-01 09:33:03 +03:00
Sergey Vojtovich
c279878493 Thread safe histograms loading
Previously multiple threads were allowed to load histograms concurrently.
There were no known problems caused by this. But given amount of data
races in this code, it'd happen sooner or later.

To avoid scalability bottleneck, histograms loading is protected by
per-TABLE_SHARE atomic variable.

Whenever histograms were loaded by preceding statement (hot-path), a
scalable load-acquire check is performed.

Whenever histograms have to be loaded anew, mutual exclusion for loaders
is established by atomic variable. If histograms are being loaded
concurrently, statement waits until load is completed.

- Table_statistics::total_hist_size moved to TABLE_STATISTICS_CB: only
  meaningful within TABLE_SHARE (not used for collected stats).
- TABLE_STATISTICS_CB::histograms_can_be_read and
  TABLE_STATISTICS_CB::histograms_are_read are replaced with a tri state
  atomic variable.
- Simplified away alloc_histograms_for_table_share().

Note: there's still likely a data race if a thread attempts accessing
histograms data after it failed to load it (because of concurrent load).
It was there previously and goes out of the scope of this effort. One way
of fixing it could be reviving TABLE::histograms_are_read and adding
appropriate checks whenever it is needed.

Part of MDEV-19061 - table_share used for reading statistical tables is
                     not protected
2020-05-29 21:53:54 +04:00
Sergey Vojtovich
609a0d3db3 Thread safe statistics loading
Previously multiple threads were allowed to load statistics concurrently.
There were no known problems caused by this. But given amount of data
races in this code, it'd happen sooner or later.

To avoid scalability bottleneck, statistics loading is protected by
per-TABLE_SHARE atomic variable.

Whenever statistics were loaded by preceding statement (hot-path), a
scalable load-acquire check is performed.

Whenever statistics have to be loaded anew, mutual exclusion for loaders
is established by atomic variable. If statistics are being loaded
concurrently, statement waits until load is completed.

TABLE_STATISTICS_CB::stats_can_be_read and
TABLE_STATISTICS_CB::stats_is_read are replaced with a tri state atomic
variable.

Part of MDEV-19061 - table_share used for reading statistical tables is
                     not protected
2020-05-29 21:53:54 +04:00
Sergey Vojtovich
1055a7f4fc Simplified away statistics_for_tables_is_needed()
Removed redundant loops, integrated logics into the caller instead.
Unified condition in read_statistics_for_tables(), less
"table_share != NULL" checks, no more potential "table_share == NULL"
dereferencing.

Part of MDEV-19061 - table_share used for reading statistical tables is
                     not protected
2020-05-29 21:53:54 +04:00
Monty
af784385b4 Fix for using uninitialized memory
MDEV-22073 MSAN use-of-uninitialized-value in
collect_statistics_for_table()

Other things:
innodb.analyze_table was changed to mainly test statistic
collection. Was discussed with Marko.
2020-05-15 15:10:58 +03:00
Sergei Golubchik
fd6dfb3b54 Merge branch 'github/10.1' into 10.2 2019-10-30 23:38:05 +01:00
Sergei Golubchik
313855766f cleanup
* use compile_time_assert instead of DBUG_ASSERT

* don't use thd->clear_error(), because
  * the error was already consumed by the error handler, so there is
     nothing to clear
  * it's dangerous to clear errors indiscriminately, if the error came
    from outside of read_statistics_for_tables() it must not be cleared
2019-10-30 23:14:44 +01:00
Sergei Golubchik
5392b4a32c MDEV-20354 All but last insert ignored in InnoDB tables when table locked
mysql_insert() first opens all affected tables (which implicitly
starts a transaction in InnoDB), then stat tables.
A failure to open a stat table caused open_tables() to abort
the current stmt transaction (trans_rollback_stmt()). So, from the
server point of view the following ha_write_row()-s happened outside
of a transactions, and the server didn't bother to commit them.

The server has a mechanism to prevent a transaction being
unexpectedly committed or rolled back in the middle of a statement -
if an operation takes place _in a sub-statement_ it cannot change
the transaction state. Operations on stat tables are exactly that -
they are not allowed to change a transaction state. Put them in
a sub-statement to make sure they don't.
2019-10-30 23:14:44 +01:00
Marko Mäkelä
24232ec12c Merge 10.1 into 10.2 2019-10-09 08:30:23 +03:00
Sergey Vojtovich
adefaeffcc MDEV-19536 - Server crash or ASAN heap-use-after-free in is_temporary_table /
read_statistics_for_tables_if_needed

Regression after 279a907, read_statistics_for_tables_if_needed() was
called after open_normal_and_derived_tables() failure.

Fixed by moving read_statistics_for_tables() call to a branch of
get_schema_stat_record() where result of open_normal_and_derived_tables()
is checked.

Removed THD::force_read_stats, added read_statistics_for_tables() instead.
Simplified away statistics_for_command_is_needed().
2019-10-07 13:30:22 +04:00
Sergey Vojtovich
e43791d4dc Cleanup EITS
Moved EITS allocation inside read_statistics_for_tables_if_needed().
Removed redundant is_safe argument.
2019-10-02 15:23:59 +04:00
Marko Mäkelä
ca9e0089d5 MDEV-19740: Fix GCC 9.2.1 -Wmaybe-uninitialized on AMD64
For CMAKE_BUILD_TYPE=Debug, the default MYSQL_MAINTAINER_MODE=AUTO
implies -Werror along with other flags in cmake/maintainer.cmake,
which would break the debug builds when CMAKE_CXX_FLAGS include -O2.

This fix includes a backport of 6dd3f24090
from MariaDB 10.3.
2019-09-27 10:43:23 +03:00
Varun Gupta
ab9f378b0b Backporting 273d8eb12c Proper fix for disabling warnings in read_statistics_for_table() 2019-09-23 14:11:48 +05:30
Varun Gupta
273d8eb12c MDEV-20589: Server still crashes in Field::set_warning_truncated_wrong_value
The flag is_stat_field is not set for the min_value and max_value of field items
inside table share. This is a must requirement as we don't want to throw
warnings of truncation when we read values from the statistics table to the column
statistics of table share fields.
2019-09-18 15:06:02 +05:30
Marko Mäkelä
48c67038b9 Merge 10.1 into 10.2
For MDEV-15955, the fix in create_tmp_field_from_item() would cause a
compilation error. After a discussion with Alexander Barkov, the fix
was omitted and only the test case was kept.

In 10.3 and later, MDEV-15955 is fixed properly by overriding
create_tmp_field() in Item_func_user_var.
2019-08-20 09:15:28 +03:00
Vicențiu Ciorbaru
588e67956a Make sure histograms do not write uninitialized bytes to record
A histogram size that is odd in size with DOUBLE precision will leave the last
byte unwritten. When collecting histograms, this causes the last byte to
be uninitialized in the record. memset the buffer to 0 first to make
sure this does not happen.
2019-08-13 20:45:51 +03:00
Marko Mäkelä
c41407210c Merge 10.1 into 10.2 2019-05-16 11:55:18 +03:00
Varun Gupta
6ab9d1627a MDEV-19407: Assertion `field->table->stats_is_read' failed in is_eits_usable
Statistics were not read for a table when we had a CREATE TABLE query.
Enforce reading statistics for commands CREATE TABLE, SET and DO.
2019-05-16 08:33:06 +05:30
Marko Mäkelä
26a14ee130 Merge 10.1 into 10.2 2019-05-13 17:54:04 +03:00
Vicențiu Ciorbaru
cb248f8806 Merge branch '5.5' into 10.1 2019-05-11 22:19:05 +03:00
Oleksandr Byelkin
8cbb14ef5d Merge branch '10.1' into 10.2 2019-05-04 17:04:55 +02:00
Varun Gupta
ca94ce2a58 MDEV-19352: Server crash in alloc_histograms_for_table_share upon query from information schema
To read histograms for a table, we should check if the allocation of statistics was done or not,
if not done we should not try to read histograms for such a table.
2019-05-02 01:02:56 +05:30
Varun Gupta
0d5aabd632 MDEV-19334: bool is_eits_usable(Field*): Assertion `field->table->stats_is_read' failed.
Fixed the assert by making sure that not to use EITS if the column statistics was not allocated.
2019-04-27 14:27:02 +05:30
Marko Mäkelä
bc145193c1 Merge 10.1 into 10.2 2019-04-25 09:04:09 +03:00
Igor Babaev
279a907fd0 MDEV-17605 Statistics for InnoDB table is wrong if persistent statistics is used
The command SHOW INDEXES ignored setting of the system variable
use_stat_tables to the value of 'preferably' and and showed statistical
data received from the engine. Similarly queries over the table
STATISTICS from INFORMATION_SCHEMA ignored this setting. It happened
because the function fill_schema_table_by_open() did not read any data
from statistical tables.
2019-04-22 17:11:07 -07:00
Varun Gupta
409dddf695 MDEV-18300: ASAN error in Field_blob::get_key_image upon UPDATE with subquery
For single table updates and multi-table updates , engine independent statistics were not being
read even if the statistics were collected.
Fixed it, so when the optimizer_use_condition_selectivity > 2 then we would read the available
statistics for update queries.
2019-04-11 13:05:01 +05:30
Varun Gupta
38cad6875f MDEV-18899: Server crashes in Field::set_warning_truncated_wrong_value
To fix the crash there we need to make sure that the
server while storing the statistical values in statistical tables should do it
in a multi-byte safe way.
Also there is no need to throw warnings if there is truncation while storing
values from statistical fields.
2019-03-28 13:16:28 +05:30
Marko Mäkelä
2e5aea4bab Merge 10.1 into 10.2 2018-12-13 15:47:38 +02:00
Marko Mäkelä
621041b676 Merge 10.0 into 10.1
Also, apply the MDEV-17957 changes to encrypted page checksums,
and remove error message output from the checksum function,
because these messages would be useless noise when mariabackup
is retrying reads of corrupted-looking pages, and not that
useful during normal server operation either.

The error messages in fil_space_verify_crypt_checksum()
should be refactored separately.
2018-12-13 13:37:21 +02:00
Marko Mäkelä
5ab91f5914 Remove space before #ifdef 2018-12-13 12:15:27 +02:00
Varun Gupta
ce1669af12 Fix compile error when building without the partition engine 2018-12-13 10:03:09 +05:30
Marko Mäkelä
db1210f939 Merge 10.1 into 10.2 2018-12-12 12:13:43 +02:00
Marko Mäkelä
f77f8f6d1a Merge 10.0 into 10.1 2018-12-12 10:48:53 +02:00
Varun Gupta
4886d14827 MDEV-17032: Estimates are higher for partitions of a table with @@use_stat_tables= PREFERABLY
The problem here is EITS statistics does not calculate statistics for the partitions of the table.
So a temporary solution would be to not read EITS statistics for partitioned tables.

Also disabling reading of EITS for columns that participate in the partition list of a table.
2018-12-07 19:59:45 +05:30
Oleksandr Byelkin
28f08d3753 Merge branch '10.1' into 10.2 2018-09-14 08:47:22 +02:00
Oleksandr Byelkin
31081593aa Merge branch '11.0' into 10.1 2018-09-06 22:45:19 +02:00
Varun Gupta
69d7bfd970 MDEV-17023: Crash during read_histogram_for_table with optimizer_use_condition_selectivity set to 4
No need to read statistics for tables that are not USER tables.
We allocate memory for structures to collect statistics only for USER TABLES.
2018-08-27 12:21:26 +03:00
Marko Mäkelä
ef3070e997 Merge 10.1 into 10.2 2018-08-02 08:19:57 +03:00
Igor Babaev
095dc81158 MDEV-16757 Memory leak after adding manually min/max statistical data
for blob column

ANALYZE TABLE <table> does not collect statistical data on min/max values
for BLOB columns of <table>. However these values can be added into
mysql.column_stats manually by executing proper statements.
Unfortunately this led to a memory leak because the memory allocated
for these values was never freed.
This patch provides the server with a function to free memory allocated
for min/max statistical values of BLOB types.

Temporarily changed the test case until MDEV-16711 is fixed as without
this fix the test case for MDEV-16757 did not fail only for 10.0.
2018-07-15 16:24:24 -07:00
Igor Babaev
c89bb15c31 MDEV-16757 Memory leak after adding manually min/max statistical data
for blob column

ANALYZE TABLE <table> does not collect statistical data on min/max values
for BLOB columns of <table>. However these values can be added into
mysql.column_stats manually by executing proper statements.
Unfortunately this led to a memory leak because the memory allocated
for these values was never freed.
This patch provides the server with a function to free memory allocated
for min/max statistical values of BLOB types.
2018-07-13 17:48:45 -07:00
Varun Gupta
ad9d1e8c3f MDEV-16552: [10.0] ASAN global-buffer-overflow in is_stat_table / statistics_for_tables_is_needed
Backport the fix f214d36512 to 10.0

   Author: Sergei Golubchik <serg@mariadb.org>
   Date:   Tue Apr 17 00:44:34 2018 +0200

    ASAN error in is_stat_table()

    don't memcmp beyond the first argument's end

    Also: use my_strcasecmp(table_alias_charset), like elsewhere, not memcmp
2018-07-11 15:22:04 +05:30