1
0
mirror of https://github.com/MariaDB/server.git synced 2025-11-09 11:41:36 +03:00
Commit Graph

206 Commits

Author SHA1 Message Date
Julius Goryavsky
e3d7d5ca26 Merge branch '10.5' into '10.6' 2025-02-27 04:02:33 +01:00
Alexander Barkov
583b39811c MDEV-35620 UBSAN: runtime error: applying zero offset to null pointer
in _ma_unique_hash, skip_trailing_space, my_hash_sort_mb_nopad_bin and my_strnncollsp_utf8mb4_bin

UBSAN detected the nullptr-with-offset in a few places
when handling empty blobs.

Fix:
- Adding DBUG_ASSERT(source_string) into all hash_sort() implementations
  to catch this problem in non-UBSAN debug builds.
- Fixing mi_unique_hash(), mi_unique_comp(),
  _ma_unique_hash(), _ma_unique_comp() to replace NULL pointer to
  an empty string ponter..

Note, we should also add DBUG_ASSERT(source_string != NULL) into
all implementations of strnncoll*(). But I'm afraid the patch
is going to be too long and too dangerous for 10.5.
2025-02-03 16:45:02 +04:00
Marko Mäkelä
7e0afb1c73 Merge 10.5 into 10.6 2024-10-03 09:31:39 +03:00
Alexander Barkov
841dc07ee1 MDEV-28386 UBSAN: runtime error: negation of -X cannot be represented in type 'long long int'; cast to an unsigned type to negate this value to itself in my_strntoull_8bit on SELECT ... OCT
The code in my_strntoull_8bit() and my_strntoull_mb2_or_mb4()
could hit undefinite behavior by negating of LONGLONG_MIN.
Fixing the code to avoid this.
2024-09-20 11:01:31 +04:00
Marko Mäkelä
5bada1246d Merge 10.5 into 10.6 2023-04-11 16:15:19 +03:00
Alexander Barkov
62e137d4d7 Merge remote-tracking branch 'origin/10.4' into 10.5 2023-04-05 16:16:19 +04:00
Alexander Barkov
8020b1bd73 MDEV-30034 UNIQUE USING HASH accepts duplicate entries for tricky collations
- Adding a new argument "flag" to MY_COLLATION_HANDLER::strnncollsp_nchars()
  and a flag MY_STRNNCOLLSP_NCHARS_EMULATE_TRIMMED_TRAILING_SPACES.
  The flag defines if strnncollsp_nchars() should emulate trailing spaces
  which were possibly trimmed earlier (e.g. in InnoDB CHAR compression).
  This is important for NOPAD collations.

  For example, with this input:
   - str1= 'a '    (Latin letter a followed by one space)
   - str2= 'a  '   (Latin letter a followed by two spaces)
   - nchars= 3
  if the flag is given, strnncollsp_nchars() will virtually restore
  one trailing space to str1 up to nchars (3) characters and compare two
  strings as equal:
  - str1= 'a  '  (one extra trailing space emulated)
  - str2= 'a  '  (as is)

  If the flag is not given, strnncollsp_nchars() does not add trailing
  virtual spaces, so in case of a NOPAD collation, str1 will be compared
  as less than str2 because it is shorter.

- Field_string::cmp_prefix() now passes the new flag.
  Field_varstring::cmp_prefix() and Field_blob::cmp_prefix() do
  not pass the new flag.

- The branch in cmp_whole_field() in storage/innobase/rem/rem0cmp.cc
  (which handles the CHAR data type) now also passed the new flag.

- Fixing UCA collations to respect the new flag.
  Other collations are possibly also affected, however
  I had no success in making an SQL script demonstrating the problem.
  Other collations will be extended to respect this flags in a separate
  patch later.

- Changing the meaning of the last parameter of Field::cmp_prefix()
  from "number of bytes" (internal length)
  to "number of characters" (user visible length).

  The code calling cmp_prefix() from handler.cc was wrong.
  After this change, the call in handler.cc became correct.

  The code calling cmp_prefix() from key_rec_cmp() in key.cc
  was adjusted according to this change.

- Old strnncollsp_nchar() related tests in unittest/strings/strings-t.c
  now pass the new flag.
  A few new tests also were added, without the flag.
2023-04-04 12:30:50 +04:00
Oleksandr Byelkin
f5c5f8e41e Merge branch '10.5' into 10.6 2022-02-03 17:01:31 +01:00
Oleksandr Byelkin
cf63eecef4 Merge branch '10.4' into 10.5 2022-02-01 20:33:04 +01:00
Alexander Barkov
b915f79e4e MDEV-25904 New collation functions to compare InnoDB style trimmed NO PAD strings 2022-01-21 12:16:07 +04:00
Alexander Barkov
0d68b0a2d6 MDEV-26669 Add MY_COLLATION_HANDLER functions min_str() and max_str() 2021-09-27 17:10:22 +04:00
Marko Mäkelä
80ed136e6d Merge 10.4 into 10.5 2021-04-21 09:01:01 +03:00
Monty
031f11717d Fix all warnings given by UBSAN
The easiest way to compile and test the server with UBSAN is to run:
./BUILD/compile-pentium64-ubsan
and then run mysql-test-run.
After this commit, one should be able to run this without any UBSAN
warnings. There is still a few compiler warnings that should be fixed
at some point, but these do not expose any real bugs.

The 'special' cases where we disable, suppress or circumvent UBSAN are:
- ref10 source (as here we intentionally do some shifts that UBSAN
  complains about.
- x86 version of optimized int#korr() methods. UBSAN do not like unaligned
  memory access of integers.  Fixed by using byte_order_generic.h when
  compiling with UBSAN
- We use smaller thread stack with ASAN and UBSAN, which forced me to
  disable a few tests that prints the thread stack size.
- Verifying class types does not work for shared libraries. I added
  suppression in mysql-test-run.pl for this case.
- Added '#ifdef WITH_UBSAN' when using integer arithmetic where it is
  safe to have overflows (two cases, in item_func.cc).

Things fixed:
- Don't left shift signed values
  (byte_order_generic.h, mysqltest.c, item_sum.cc and many more)
- Don't assign not non existing values to enum variables.
- Ensure that bool and enum values are properly initialized in
  constructors.  This was needed as UBSAN checks that these types has
  correct values when one copies an object.
  (gcalc_tools.h, ha_partition.cc, item_sum.cc, partition_element.h ...)
- Ensure we do not called handler functions on unallocated objects or
  deleted objects.
  (events.cc, sql_acl.cc).
- Fixed bugs in Item_sp::Item_sp() where we did not call constructor
  on Query_arena object.
- Fixed several cast of objects to an incompatible class!
  (Item.cc, Item_buff.cc, item_timefunc.cc, opt_subselect.cc, sql_acl.cc,
   sql_select.cc ...)
- Ensure we do not do integer arithmetic that causes over or underflows.
  This includes also ++ and -- of integers.
  (Item_func.cc, Item_strfunc.cc, item_timefunc.cc, sql_base.cc ...)
- Added JSON_VALUE_UNITIALIZED to json_value_types and ensure that
  value_type is initialized to this instead of to -1, which is not a valid
  enum value for json_value_types.
- Ensure we do not call memcpy() when second argument could be null.
- Fixed that Item_func_str::make_empty_result() creates an empty string
  instead of a null string (safer as it ensures we do not do arithmetic
  on null strings).

Other things:

- Changed struct st_position to an OBJECT and added an initialization
  function to it to ensure that we do not copy or use uninitialized
  members. The change to a class was also motived that we used "struct
  st_position" and POSITION randomly trough the code which was
  confusing.
- Notably big rewrite in sql_acl.cc to avoid using deleted objects.
- Changed in sql_partition to use '^' instead of '-'. This is safe as
  the operator is either 0 or 0x8000000000000000ULL.
- Added check for select_nr < INT_MAX in JOIN::build_explain() to
  avoid bug when get_select() could return NULL.
- Reordered elements in POSITION for better alignment.
- Changed sql_test.cc::print_plan() to use pointers instead of objects.
- Fixed bug in find_set() where could could execute '1 << -1'.
- Added variable have_sanitizer, used by mtr.  (This variable was before
  only in 10.5 and up).  It can now have one of two values:
  ASAN or UBSAN.
- Moved ~Archive_share() from ha_archive.cc to ha_archive.h and marked
  it virtual. This was an effort to get UBSAN to work with loaded storage
  engines. I kept the change as the new place is better.
- Added in CONNECT engine COLBLK::SetName(), to get around a wrong cast
  in tabutil.cpp.
- Added HAVE_REPLICATION around usage of rgi_slave, to get embedded
  server to compile with UBSAN. (Patch from Marko).
- Added #ifdef for powerpc64 to avoid a bug in old gcc versions related
  to integer arithmetic.

Changes that should not be needed but had to be done to suppress warnings
from UBSAN:

- Added static_cast<<uint16_t>> around shift to get rid of a LOT of
  compiler warnings when using UBSAN.
- Had to change some '/' of 2 base integers to shift to get rid of
  some compile time warnings.

Reviewed by:
- Json changes: Alexey Botchkov
- Charset changes in ctype-uca.c: Alexander Barkov
- InnoDB changes & Embedded server: Marko Mäkelä
- sql_acl.cc changes: Vicențiu Ciorbaru
- build_explain() changes: Sergey Petrunia
2021-04-20 12:30:09 +03:00
Alexander Barkov
cfe5ee90c8 MDEV-22043 Special character leads to assertion in my_wc_to_printable_generic on 10.5.2 (debug)
The code did not take into account that:
- U+005C (backslash) can occupy more than mbminlen characters (e.g. in sjis)
- Some character sets do not have a code for U+005C (e.g. swe7)

Adding a new function my_wc_to_printable into MY_CHARSET_HANDLER to
cover all special cases easier.
2020-05-09 16:01:30 +04:00
Marko Mäkelä
8b6cfda631 Merge 10.4 into 10.5 2020-02-07 08:51:20 +02:00
Monty
4d61f1247a Fixed compiler warnings from gcc 7.4.1
- Fixed possible error in rocksdb/rdb_datadic.cc
2020-01-29 23:23:55 +02:00
Alexander Barkov
f1e13fdc8d MDEV-21581 Helper functions and methods for CHARSET_INFO 2020-01-28 12:29:23 +04:00
Marko Mäkelä
5ab70e7f68 Merge 10.2 into 10.3 2019-12-27 15:14:48 +02:00
Marko Mäkelä
73985d8301 Merge 10.1 into 10.2 2019-12-23 07:14:51 +02:00
Alexander Barkov
3d98892232 Merge remote-tracking branch 'origin/5.5' into 10.1 2019-12-16 13:08:17 +04:00
Alexander Barkov
fc860d3fa3 MDEV-21065 UNIQUE constraint causes a query with string comparison to omit a row in the result set 2019-12-16 12:57:08 +04:00
Oleksandr Byelkin
55b2281a5d Merge branch '10.2' into 10.3 2019-10-31 10:58:06 +01:00
Marko Mäkelä
19ceaf2928 Merge 10.1 into 10.2 2019-10-25 12:57:36 +03:00
Sergei Golubchik
790a74d22b Merge branch 'github/5.5' into 10.1 2019-10-23 15:55:23 +02:00
Sergei Golubchik
719ac0ad4a crash in string-to-int conversion
using a specially crafted strings one could overflow `shift`
variable and cause a crash by dereferencing d10[-2147483648]
(on a sufficiently old gcc).

This is a correct fix and a test case for

Bug #29723340: MYSQL SERVER CRASH AFTER SQL QUERY WITH DATA ?AST
2019-10-19 11:48:38 +02:00
Marko Mäkelä
be85d3e61b Merge 10.2 into 10.3 2019-05-14 17:18:46 +03:00
Marko Mäkelä
26a14ee130 Merge 10.1 into 10.2 2019-05-13 17:54:04 +03:00
Vicențiu Ciorbaru
cb248f8806 Merge branch '5.5' into 10.1 2019-05-11 22:19:05 +03:00
Vicențiu Ciorbaru
5543b75550 Update FSF Address
* Update wrong zip-code
2019-05-11 21:29:06 +03:00
Marko Mäkelä
df563e0c03 Merge 10.2 into 10.3
main.derived_cond_pushdown: Move all 10.3 tests to the end,
trim trailing white space, and add an "End of 10.3 tests" marker.
Add --sorted_result to tests where the ordering is not deterministic.

main.win_percentile: Add --sorted_result to tests where the
ordering is no longer deterministic.
2018-11-06 09:40:39 +02:00
Marko Mäkelä
32062cc61c Merge 10.1 into 10.2 2018-11-06 08:41:48 +02:00
Marko Mäkelä
d63e198061 Merge 10.0 into 10.1 2018-11-05 12:15:17 +02:00
Alexander Barkov
75ceb6ff13 MDEV-17298 ASAN unknown-crash / READ of size 1 in my_strntoul_8bit upon INSERT .. SELECT 2018-10-31 14:25:26 +04:00
Marko Mäkelä
05459706f2 Merge 10.2 into 10.3 2018-08-03 15:57:23 +03:00
Marko Mäkelä
ef3070e997 Merge 10.1 into 10.2 2018-08-02 08:19:57 +03:00
Oleksandr Byelkin
cb5952b506 Merge branch '10.0' into bb-10.1-merge-sanja 2018-07-25 22:24:40 +02:00
Alexander Barkov
e2ac4098ed Simplify caseup() and casedn() in charsets
After the MDEV-13118 fix there's no code in the server that
wants caseup/casedn to change the argument in place for simple
charsets.  Let's remove this logic and always return the result in a
new string for all charsets, both simple and complex.

1. Removing the optimization that *some* character sets used in casedn()
  and caseup(), which allowed (and required) to change the case in-place,
  overwriting the string passed as the "src" argument.
  Now all CHARSET_INFO's work in the same way:
  non of them change the source string in-place, all of them now convert
  case from the source string to the destination string, leaving
  the source string untouched.

2. Adding "const" qualifier to the "char *src" parameter
   to caseup() and casedn().

3. Removing duplicate implementations in ctype-mb.c.
  Now both caseup() and casedn() implementations for all CJK character sets
  use internally the same function my_casefold_mb()
  (the former my_casefold_mb_varlen()).

4. Removing the "unused" attribute from parameters of some my_case{up|dn}_xxx()
   implementations, as the affected parameters are now *used* in the code.
   Previously these parameters were used only in DBUG_ASSERT().
2018-07-19 13:02:14 +04:00
luz.paz
3dd01669b4 Misc. typos
Found via `codespell -i 3 -w --skip="./debian/po" -I ../mariadb-server-word-whitelist.txt  ./cmake/ ./debian/ ./Docs/ ./include/ ./man/ ./plugin/ ./strings/`
2018-04-05 15:26:57 +04:00
Vladislav Vaintroub
9891ee5a2a Fix and reenable Windows compiler warning C4800 (size_t conversion). 2018-01-26 10:37:46 +00:00
Alexander Barkov
0e5eef886a MDEV-14350 Index use with collation utf8mb4_unicode_nopad_ci on LIKE pattern with wrong results 2017-12-08 13:19:19 +04:00
Vladislav Vaintroub
7354dc6773 MDEV-13384 - misc Windows warnings fixed 2017-09-28 17:20:46 +00:00
Alexander Barkov
5058ced5df MDEV-7769 MY_CHARSET_INFO refactoring# On branch 10.2
Part 3 (final): removing MY_CHARSET_HANDLER::well_formed_len().
2016-10-10 14:36:09 +04:00
Alexander Barkov
ee19806b8e MDEV-9711 NO PAD collations
Based on the patch from Daniil Medvedev (a Google Summer of Code task)
2016-09-06 12:50:02 +04:00
Alexander Barkov
e4f6fd5e12 MDEV-10743 LDML: a new syntax to reuse sort order from another 8bit simple collation 2016-09-06 12:37:11 +04:00
Alexander Barkov
1ca595fbf7 LDML refactoring for "MDEV-9711 NO PAD collations"
- Moving detection of the MY_CS_CSSORT, MY_CS_PUREASCII, MY_CS_NONASCII
  flags of loadable collations from add_collation() in mysys.c
  to my_cset_init_8bit() and my_coll_init_simple() in ctype-simple.c.

- Adding tests that these flags are set properly for loadable collations

- Moving LDML test related *.xml files from mysql-test/std_data/
  to mysql-test/std_data/ldml/, as there will be more *.xml test files
2016-09-03 09:05:56 +04:00
Alexander Barkov
e7ff281d2e MDEV-6353 my_ismbchar() and my_mbcharlen() refactoring 2016-05-17 15:27:10 +04:00
Alexander Barkov
1d73005bf3 MDEV-8360 Clean-up CHARSET_INFO: strnncollsp: diff_if_only_endspace_difference
- Removing the "diff_if_only_endspace_difference" argument from
  MY_COLLATION_HANDLER::strnncollsp(), my_strnncollsp_simple(),
  as well as in the function template MY_FUNCTION_NAME(strnncollsp)
  in strcoll.ic

- Removing the "diff_if_only_space_different" from ha_compare_text(),
  hp_rec_key_cmp().

- Adding a new function my_strnncollsp_padspace_bin() and reusing
  it instead of duplicate code pieces in my_strnncollsp_8bit_bin(),
  my_strnncollsp_latin1_de(), my_strnncollsp_tis620(),
  my_strnncollsp_utf8_cs().

- Adding more tests for better coverage of the trailing space handling.

- Removing the unused definition of HA_END_SPACE_ARE_EQUAL
2016-03-31 11:04:48 +04:00
Alexander Barkov
e09299511e MDEV-9665 Remove cs->cset->ismbchar()
Using a more powerfull cs->cset->charlen() instead.
2016-03-16 10:55:12 +04:00
Alexander Barkov
d9b25ae3db MDEV-8466 CAST works differently for DECIMAL/INT vs DOUBLE for empty strings
MDEV-8468 CAST and INSERT work differently for DECIMAL/INT vs DOUBLE for a string with trailing spaces
2015-09-17 11:05:07 +04:00
Alexander Barkov
78b80cb6ba Adding MY_CHARSET_HANDLER::native_to_mb().
This is a pre-requisite patch for:
- MDEV-8433 Make field<'broken-string' use indexes
- MDEV-8625 Bad result set with ignorable characters when using a prefix key
- MDEV-8626 Bad result set with contractions when using a prefix key
2015-08-14 18:34:41 +04:00