mirror of https://github.com/MariaDB/server.git (synced 2025-07-30 16:24:05 +03:00)
BUG#19580 - FULLTEXT search produces wrong results on UTF-8 columns
The problem was that MySQL had no true ctype implementation. As a result, many multibyte punctuation and whitespace characters were treated as word characters.

This fix uses the recently added CTYPE table for Unicode character sets (WL1386) to detect Unicode punctuation and whitespace characters correctly.

Note: this is an incompatible change, since it changes parser behavior. Existing fulltext indexes must be rebuilt with the REPAIR TABLE statement.

mysql-test/r/fulltext2.result:
  Testcase for BUG#19580.
mysql-test/t/fulltext2.test:
  Testcase for BUG#19580.
storage/myisam/ft_parser.c:
  Use WL1386 "CTYPE table for unicode character sets" functionality.
storage/myisam/ft_update.c:
  Use WL1386 "CTYPE table for unicode character sets" functionality.
  Revert the fix for BUG#16489 "utf8 + fulltext leads to corrupt index file.".
  It is no longer needed, since we now have a true ctype implementation.
storage/myisam/ftdefs.h:
  Use WL1386 "CTYPE table for unicode character sets" functionality.
  Rework the true_word_char macro so that it accepts a ctype instead of a
  charset as its first parameter. It no longer uses my_isalnum, but checks
  the ctype directly.
  Remove the obsolete word_char macro.
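The idea behind the fix can be illustrated with a small standalone sketch. It is not the code from ft_parser.c or ftdefs.h: every name below (the SK_* flags, sk_ctype, sk_is_word_char, sk_decode_utf8, sk_first_word) is hypothetical, the classifier is a toy that knows only ASCII plus a small range of Unicode punctuation, and the UTF-8 decoder does no validation. It only shows how deciding "word character or not" from a per-code-point character class, rather than from individual bytes, lets a parser skip multibyte quotes such as U+201E and U+201C.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical class flags; stand-ins for the server's real ctype bits. */
enum {
  SK_UPPER = 1 << 0,
  SK_LOWER = 1 << 1,
  SK_DIGIT = 1 << 2,
  SK_SPACE = 1 << 3,
  SK_PUNCT = 1 << 4
};

/* Toy classifier (assumption): ASCII plus U+2010..U+2027 only.
   Any other code point is left unclassified (0). */
static int sk_ctype(uint32_t cp)
{
  if (cp >= 'A' && cp <= 'Z') return SK_UPPER;
  if (cp >= 'a' && cp <= 'z') return SK_LOWER;
  if (cp >= '0' && cp <= '9') return SK_DIGIT;
  if (cp == ' ' || cp == '\t' || cp == '\n') return SK_SPACE;
  if (cp < 0x80) return SK_PUNCT;  /* simplification: all remaining ASCII */
  if (cp >= 0x2010 && cp <= 0x2027) return SK_PUNCT;  /* dashes and quotes */
  return 0;
}

/* A word character is decided from the class alone, not from raw bytes. */
static int sk_is_word_char(int ctype)
{
  return (ctype & (SK_UPPER | SK_LOWER | SK_DIGIT)) != 0;
}

/* Minimal UTF-8 decoder: returns the number of bytes consumed (at least 1
   when len > 0). Continuation bytes are not validated in this sketch. */
static size_t sk_decode_utf8(const unsigned char *s, size_t len, uint32_t *cp)
{
  if (len == 0) return 0;
  if (s[0] < 0x80) { *cp = s[0]; return 1; }
  if ((s[0] & 0xE0) == 0xC0 && len >= 2) {
    *cp = ((uint32_t)(s[0] & 0x1F) << 6) | (uint32_t)(s[1] & 0x3F);
    return 2;
  }
  if ((s[0] & 0xF0) == 0xE0 && len >= 3) {
    *cp = ((uint32_t)(s[0] & 0x0F) << 12) | ((uint32_t)(s[1] & 0x3F) << 6) |
          (uint32_t)(s[2] & 0x3F);
    return 3;
  }
  if ((s[0] & 0xF8) == 0xF0 && len >= 4) {
    *cp = ((uint32_t)(s[0] & 0x07) << 18) | ((uint32_t)(s[1] & 0x3F) << 12) |
          ((uint32_t)(s[2] & 0x3F) << 6) | (uint32_t)(s[3] & 0x3F);
    return 4;
  }
  *cp = 0xFFFD;  /* malformed or truncated input: consume one byte */
  return 1;
}

/* Finds the first run of word characters in a UTF-8 buffer.
   Returns its length in bytes and stores its byte offset in *start;
   returns 0 (with *start == len) if the buffer contains no word. */
static size_t sk_first_word(const char *text, size_t len, size_t *start)
{
  const unsigned char *s = (const unsigned char *)text;
  size_t pos = 0, begin = 0;
  int in_word = 0;

  while (pos < len) {
    uint32_t cp;
    size_t n = sk_decode_utf8(s + pos, len - pos, &cp);
    int is_word = sk_is_word_char(sk_ctype(cp));

    if (is_word && !in_word) {
      begin = pos;
      in_word = 1;
    } else if (!is_word && in_word) {
      *start = begin;
      return pos - begin;
    }
    pos += n;
  }
  if (in_word) { *start = begin; return len - begin; }
  *start = len;
  return 0;
}
```

With this sketch, both the stored value and the search string from the test case for this bug yield the same word, MySQL, because the surrounding quotes are classified as punctuation. A byte-oriented check that treats every byte above 0x7F as part of a word would instead glue the quote bytes onto the word, so the two strings would produce different tokens and fail to match.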
@@ -241,3 +241,11 @@ select * from t1 where match a against('ab c' in boolean mode);
 a
 drop table t1;
 set names latin1;
+SET NAMES utf8;
+CREATE TABLE t1(a VARCHAR(255), FULLTEXT(a)) ENGINE=MyISAM DEFAULT CHARSET=utf8;
+INSERT INTO t1 VALUES('„MySQL“');
+SELECT a FROM t1 WHERE MATCH a AGAINST('“MySQL„' IN BOOLEAN MODE);
+a
+„MySQL“
+DROP TABLE t1;
+SET NAMES latin1;