1
0
mirror of https://github.com/MariaDB/server.git synced 2025-08-01 03:47:19 +03:00

Bug#57737 Character sets: search fails with like, contraction, index

Problem: LIKE over an indexed column optimized away good results,
because my_like_range_utf32/utf16 returned wrong ranges for contractions.
Contraction related code was missing in my_like_range_utf32/utf16,
but did exist in my_like_range_ucs2/utf8.
It was forgotten in utf32/utf16 versions (during mysql-6.0 push/revert mess).

Fix:
The patch removes individual functions my_like_range_ucs2,
my_like_range_utf16, my_like_range_utf32 and introduces a single function
my_like_range_generic() instead. The new function handles contractions
correctly. It can handle any character set with cs->min_sort_char and
cs->max_sort_char represented in Unicode code points.

added:
  @ mysql-test/include/ctype_czech.inc
  @ mysql-test/include/ctype_like_ignorable.inc
  @ mysql-test/r/ctype_like_range.result
  @ mysql-test/t/ctype_like_range.test
  Adding tests


modified:

  @ include/m_ctype.h
  - Adding helper functions for contractions.
  - Prototypes: removing ucs2,utf16,utf32 functions, adding generic function.
  @ mysql-test/r/ctype_uca.result
  @ mysql-test/r/ctype_utf16_uca.result
  @ mysql-test/r/ctype_utf32_uca.result
  @ mysql-test/t/ctype_uca.test
  @ mysql-test/t/ctype_utf16_uca.test
  @ mysql-test/t/ctype_utf32_uca.test
  - Adding tests.

  @ strings/ctype-mb.c
  - Pad function did not put the last character.
  - Implementing my_like_range_generic() - an universal replacement
    for three separate functions
    my_like_range_ucs2(), my_like_range_utf16() and my_like_range_utf32(),
    with correct contraction handling.

  @ strings/ctype-ucs2.c
  - my_fill_mb2 did not put the high byte, as previously
    it was used to put only characters in ASCII range.
    Now it puts high byte as well
    (needed to pupulate cs->max_sort_char correctly).
  - Adding DBUG_ASSERT()
  - Removing character set specific functions:
    my_like_range_ucs2(), my_like_range_utf16() and my_like_range_utf32().
  - Using my_like_range_generic() instead of the old functions.

  @ strings/ctype-uca.c
  - Using generic function instead of the old character set specific ones.

  @ sql/item_create.cc
  @ sql/item_strfunc.cc
  @ sql/item_strfunc.h
  - Adding SQL functions LIKE_RANGE_MIN and LIKE_RANGE_MAX,
    available only in debug build to make sure like_range()
    works correctly for all character sets and collations.
This commit is contained in:
Alexander Barkov
2010-11-26 13:44:39 +03:00
parent 7d27f9c0ca
commit 93386d7387
17 changed files with 3001 additions and 344 deletions

View File

@ -0,0 +1,12 @@
SELECT @@collation_connection;
--echo #
--echo # Bug#57737 Character sets: search fails with like, contraction, index
--echo #
CREATE TABLE t1 AS SELECT REPEAT(' ', 10) AS s1 LIMIT 0;
INSERT INTO t1 VALUES ('c'),('ce'),('cé'),('ch');
SELECT * FROM t1 WHERE s1 LIKE 'c%';
ALTER TABLE t1 ADD KEY s1 (s1);
SELECT * FROM t1 WHERE s1 LIKE 'c%';
ALTER TABLE t1 DROP KEY s1, ADD KEY(s1(1));
SELECT * FROM t1 WHERE s1 LIKE 'ch';
DROP TABLE t1;

View File

@ -0,0 +1,11 @@
SELECT @@collation_connection;
--echo #
--echo # Bug#57737 Character sets: search fails with like, contraction, index
--echo # Part#2 - ignorable characters
--echo #
CREATE TABLE t1 AS SELECT REPEAT(' ', 10) AS s1 LIMIT 0;
INSERT INTO t1 VALUES ('a\0\0\0\0\0\t'),('a'),('b'),('c'),('d'),('e');
SELECT HEX(s1) FROM t1 WHERE s1 LIKE 'a%';
ALTER TABLE t1 ADD KEY s1 (s1);
SELECT HEX(s1) FROM t1 WHERE s1 LIKE 'a%';
DROP TABLE t1;