mirror of
https://github.com/MariaDB/server.git
synced 2025-08-01 03:47:19 +03:00
WL#1386 - CTYPE table for unicode character sets
A prerequisite for several fulltext and XML bugs.
MY_CHARSET_HANDLER now has a new function "ctype" to detect the type of the next character in a string (i.e. digit, letter, space, punctuation, control, etc.), which now works correctly for both 8bit and multibyte charsets. Previously only 8bit charsets worked correctly, while any multibyte character was treated as a letter in multibyte charsets.

Many files:
  Adding new function
Makefile.am:
  Adding build rules for uctypedump, a dump tool to create my_uctype.h using Unicode Character Database file.
m_ctype.h:
  Adding declaration of my_uni_ctype, ctype data for Unicode.
  Adding new member into MY_CHARSET_HANDLER
Makefile.am:
  Adding my_uctype.h into noinst_HEADERS
my_uctype.h, uctypedump.c:
  new files: ctype data for unicode, and the tool to generate it from a Unicode Character Database file.
include/Makefile.am:
  Adding my_uctype.h
include/m_ctype.h:
  Adding declaration of my_uni_ctype, ctype data for Unicode.
strings/Makefile.am:
  Adding build rules for uctypedump, a dump tool to create my_uctype.h using Unicode Character Database file.
strings/ctype-big5.c:
  Adding new function
strings/ctype-bin.c:
  Adding new function
strings/ctype-cp932.c:
  Adding new function
strings/ctype-euc_kr.c:
  Adding new function
strings/ctype-eucjpms.c:
  Adding new function
strings/ctype-gb2312.c:
  Adding new function
strings/ctype-gbk.c:
  Adding new function
strings/ctype-latin1.c:
  Adding new function
strings/ctype-mb.c:
  Adding new function
strings/ctype-simple.c:
  Adding new function
strings/ctype-sjis.c:
  Adding new function
strings/ctype-tis620.c:
  Adding new function
strings/ctype-ucs2.c:
  Adding new function
strings/ctype-ujis.c:
  Adding new function
strings/ctype-utf8.c:
  Adding new function
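To illustrate the contract described in the message, here is a minimal sketch of how a caller might walk a string with a "ctype"-style function: each call reports the class of the next character and how many bytes it occupied. All names here (`SKETCH_CHARSET`, `sketch_ctype_8bit`, the `SK_*` flags) are hypothetical stand-ins, not the real MY_CHARSET_HANDLER API. The table is assumed to be indexed directly by byte value, and the sketch returns 0 at end of input where the commit's code returns `MY_CS_TOOFEW(0)`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bits for this sketch; the real ones live in m_ctype.h. */
enum { SK_UPPER = 1, SK_LOWER = 2, SK_DIGIT = 4, SK_SPACE = 8 };

/* Hypothetical stand-in for CHARSET_INFO: one flag byte per byte value. */
typedef struct sketch_charset
{
  unsigned char ctype[256];
} SKETCH_CHARSET;

/* Fill the table for plain ASCII; every other byte stays unclassified (0). */
static void sketch_init_ascii(SKETCH_CHARSET *cs)
{
  int c;
  for (c = 0; c < 256; c++) cs->ctype[c] = 0;
  for (c = 'A'; c <= 'Z'; c++) cs->ctype[c] = SK_UPPER;
  for (c = 'a'; c <= 'z'; c++) cs->ctype[c] = SK_LOWER;
  for (c = '0'; c <= '9'; c++) cs->ctype[c] = SK_DIGIT;
  cs->ctype[' '] = cs->ctype['\t'] = cs->ctype['\n'] = SK_SPACE;
}

/*
  Store the class of the next character in *ctype and return the number
  of bytes consumed. At end of input, store 0 and return 0 (assumption:
  the commit's code returns MY_CS_TOOFEW(0) here instead).
*/
static int sketch_ctype_8bit(const SKETCH_CHARSET *cs, int *ctype,
                             const unsigned char *s, const unsigned char *e)
{
  assert(cs != NULL && ctype != NULL);
  if (s >= e)
  {
    *ctype = 0;
    return 0;
  }
  *ctype = cs->ctype[*s];
  return 1;
}

/* Count letters in [s, e), advancing by whatever length each call reports. */
static size_t sketch_count_letters(const SKETCH_CHARSET *cs,
                                   const unsigned char *s,
                                   const unsigned char *e)
{
  size_t n = 0;
  int ctype, len;
  while ((len = sketch_ctype_8bit(cs, &ctype, s, e)) > 0)
  {
    if (ctype & (SK_UPPER | SK_LOWER))
      n++;
    s += len;
  }
  return n;
}
```

Because the caller advances by the returned length rather than by a fixed one byte, the same loop shape would carry over to a multibyte implementation that returns 2 or more.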
@@ -1354,6 +1354,19 @@ longlong my_strtoll10_8bit(CHARSET_INFO *cs __attribute__((unused)),
 }
 
 
+int my_mb_ctype_8bit(CHARSET_INFO *cs, int *ctype,
+                   const unsigned char *s, const unsigned char *e)
+{
+  if (s >= e)
+  {
+    *ctype= 0;
+    return MY_CS_TOOFEW(0);
+  }
+  *ctype= cs->ctype[*s];
+  return 1;
+}
+
+
 /*
   Check if a constant can be propagated
 
@@ -1420,6 +1433,7 @@ MY_CHARSET_HANDLER my_charset_8bit_handler=
     my_numcells_8bit,
     my_mb_wc_8bit,
     my_wc_mb_8bit,
+    my_mb_ctype_8bit,
     my_caseup_str_8bit,
     my_casedn_str_8bit,
     my_caseup_8bit,
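The second hunk registers the new function in a handler table, so callers reach it through the charset rather than by name. The sketch below shows that dispatch pattern in isolation. It is an illustration under stated assumptions: `SK_HANDLER`, `SK_CHARSET`, the `cset` member and both classifier functions are hypothetical, and the two-byte handler is a toy fixed-width big-endian encoding, not one of the charsets touched by this commit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical character classes for this sketch. */
enum { SK_LETTER = 1, SK_DIGIT = 2, SK_OTHER = 4 };

struct sk_charset;

/* Signature shared by every ctype implementation in the sketch. */
typedef int (*sk_ctype_fn)(const struct sk_charset *cs, int *ctype,
                           const unsigned char *s, const unsigned char *e);

/* Hypothetical handler table with a single slot. */
typedef struct sk_handler { sk_ctype_fn ctype; } SK_HANDLER;
typedef struct sk_charset { const SK_HANDLER *cset; } SK_CHARSET;

/* Classify a code point using ASCII ranges only. */
static int sk_classify_ascii(unsigned c)
{
  if ((c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')) return SK_LETTER;
  if (c >= '0' && c <= '9') return SK_DIGIT;
  return SK_OTHER;
}

/* One byte per character. */
static int sk_ctype_1byte(const struct sk_charset *cs, int *ctype,
                          const unsigned char *s, const unsigned char *e)
{
  (void) cs;
  if (s >= e) { *ctype = 0; return 0; }
  *ctype = sk_classify_ascii(*s);
  return 1;
}

/* Two bytes per character, big-endian (toy encoding for the sketch). */
static int sk_ctype_2byte(const struct sk_charset *cs, int *ctype,
                          const unsigned char *s, const unsigned char *e)
{
  (void) cs;
  if (e - s < 2) { *ctype = 0; return 0; }
  *ctype = sk_classify_ascii(((unsigned) s[0] << 8) | s[1]);
  return 2;
}

static const SK_HANDLER sk_handler_1byte = { sk_ctype_1byte };
static const SK_HANDLER sk_handler_2byte = { sk_ctype_2byte };

/* Charset-agnostic caller: dispatches through the handler slot. */
static size_t sk_count_digits(const SK_CHARSET *cs,
                              const unsigned char *s, const unsigned char *e)
{
  size_t n = 0;
  int ctype, len;
  assert(cs != NULL && cs->cset != NULL);
  while ((len = cs->cset->ctype(cs, &ctype, s, e)) > 0)
  {
    if (ctype == SK_DIGIT)
      n++;
    s += len;
  }
  return n;
}
```

The counting loop never inspects the encoding itself; swapping the handler pointer is enough to make it work on a different character width.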