1
0
mirror of https://github.com/MariaDB/server.git synced 2025-11-27 05:41:41 +03:00
Commit Graph

270 Commits

Author SHA1 Message Date
Alexander Barkov
3e7e87ddcc MDEV-19897 Rename source code variable names from utf8 to utf8mb3 2019-06-28 12:37:04 +04:00
Oleksandr Byelkin
c07325f932 Merge branch '10.3' into 10.4 2019-05-19 20:55:37 +02:00
Marko Mäkelä
be85d3e61b Merge 10.2 into 10.3 2019-05-14 17:18:46 +03:00
Marko Mäkelä
26a14ee130 Merge 10.1 into 10.2 2019-05-13 17:54:04 +03:00
Vicențiu Ciorbaru
cb248f8806 Merge branch '5.5' into 10.1 2019-05-11 22:19:05 +03:00
Vicențiu Ciorbaru
5543b75550 Update FSF Address
* Update wrong zip-code
2019-05-11 21:29:06 +03:00
Alexander Barkov
a8efe7ab1f MDEV-17502 MDEV-17474 Change Unicode xxx_general_ci and xxx_bin collation implementation to "inline" style 2018-10-19 14:35:01 +04:00
Alexander Barkov
6eae037c4c MDEV-17474 Change Unicode collation implementation from "handler" to "inline" style 2018-10-17 06:44:40 +04:00
Marko Mäkelä
05459706f2 Merge 10.2 into 10.3 2018-08-03 15:57:23 +03:00
Marko Mäkelä
ef3070e997 Merge 10.1 into 10.2 2018-08-02 08:19:57 +03:00
Oleksandr Byelkin
cb5952b506 Merge branch '10.0' into bb-10.1-merge-sanja 2018-07-25 22:24:40 +02:00
Alexander Barkov
e2ac4098ed Simplify caseup() and casedn() in charsets
After the MDEV-13118 fix there's no code in the server that
wants caseup/casedn to change the argument in place for simple
charsets.  Let's remove this logic and always return the result in a
new string for all charsets, both simple and complex.

1. Removing the optimization that *some* character sets used in casedn()
  and caseup(), which allowed (and required) to change the case in-place,
  overwriting the string passed as the "src" argument.
  Now all CHARSET_INFO's work in the same way:
  non of them change the source string in-place, all of them now convert
  case from the source string to the destination string, leaving
  the source string untouched.

2. Adding "const" qualifier to the "char *src" parameter
   to caseup() and casedn().

3. Removing duplicate implementations in ctype-mb.c.
  Now both caseup() and casedn() implementations for all CJK character sets
  use internally the same function my_casefold_mb()
  (the former my_casefold_mb_varlen()).

4. Removing the "unused" attribute from parameters of some my_case{up|dn}_xxx()
   implementations, as the affected parameters are now *used* in the code.
   Previously these parameters were used only in DBUG_ASSERT().
2018-07-19 13:02:14 +04:00
Marko Mäkelä
b2c4740034 Fix some -Wsign-conversion
InnoDB was using int64_t instead of ha_rows (unsigned 64-bit).
2018-04-29 17:53:21 +03:00
luz.paz
3dd01669b4 Misc. typos
Found via `codespell -i 3 -w --skip="./debian/po" -I ../mariadb-server-word-whitelist.txt  ./cmake/ ./debian/ ./Docs/ ./include/ ./man/ ./plugin/ ./strings/`
2018-04-05 15:26:57 +04:00
Vladislav Vaintroub
6c279ad6a7 MDEV-15091 : Windows, 64bit: reenable and fix warning C4267 (conversion from 'size_t' to 'type', possible loss of data)
Handle string length as size_t, consistently (almost always:))
Change function prototypes to accept size_t, where in the past
ulong or uint were used. change local/member variables to size_t
when appropriate.

This fix excludes rocksdb, spider,spider, sphinx and connect for now.
2018-02-06 12:55:58 +00:00
Otto Kekäläinen
c9c28bef3c Minor spelling fixes in code comments, docs and output
This commit does not touch any variable names or any other actual code,
and thus should not in any way affect how the code works.
2018-01-12 16:49:02 +02:00
Michael Widenius
4aaa38d26e Enusure that my_global.h is included first
- Added sql/mariadb.h file that should be included first by files in sql
  directory, if sql_plugin.h is not used (sql_plugin.h adds SHOW variables
  that must be done before my_global.h is included)
- Removed a lot of include my_global.h from include files
- Removed include's of some files that my_global.h automatically includes
- Removed duplicated include's of my_sys.h
- Replaced include my_config.h with my_global.h
2017-08-24 01:05:44 +02:00
Sergei Golubchik
9ce639af52 don't export all charsets to plugins
don't use internal server collation symbol names, use collation
properties and collation IDs, they are much more stable.
2017-03-31 19:22:20 +02:00
Vladislav Vaintroub
5875633c2a MDEV-11901 : MariaRocks on Windows
fixed compilation, disabled unix-only tests (the ones that use bash
etc).

Changed plugin library name to ha_rocksdb.dll/so
2017-02-01 21:27:13 +00:00
Sergei Golubchik
4a5d25c338 Merge branch '10.1' into 10.2 2016-12-29 13:23:18 +01:00
Sergei Golubchik
2f20d297f8 Merge branch '10.0' into 10.1 2016-12-11 09:53:42 +01:00
Alexander Barkov
dd0ff30278 MDEV-11343 LOAD DATA INFILE fails to load data with an escape character followed by a multi-byte character
Partially backporting MDEV-9874 from 10.2 to 10.0

READ_INFO::read_field() raised the ER_INVALID_CHARACTER_STRING error
when reading an escape character followed by a multi-byte character.

Raising wellformedness errors in READ_INFO::read_field() was wrong,
because the main goal of READ_INFO::read_field() is to *unescape* the
data which was presumably escaped using mysql_real_escape_string(),
using the same character set with the one specified in
"LOAD DATA INFILE ... CHARACTER SET ..." (or assumed by default).

During LOAD DATA, multi-byte characters are not always scanned as a single
entity! In case of escaped data, parts of a multi-byte character can be
scanned on different loop iterations. So the old code erroneously tested
welformedness in the middle of a multi-byte character.

Moreover, the data after unescaping can go into a BLOB field, not a text field.
Wellformedness tests are meaningless in this case.

Ater this patch, wellformedness is only checked later, during
Field::store(str,length,cs) time. The loop that scans bytes only
makes sure to revert the changes made by mysql_real_escape_string().

Note, in some cases users can supply data which did not really go through
mysql_real_escape_string() and was escaped by some other means,
or was not escaped at all. The file reported in this MDEV contains
the string "\ä", which is an example of such improperly escaped data, as
- either there should be two backslashes:   "\\ä"
- or there should be no backslashes at all: "ä"
mysql_real_escape_string() could not generate "\ä".
2016-11-29 06:51:12 +04:00
Alexander Barkov
5058ced5df MDEV-7769 MY_CHARSET_INFO refactoring# On branch 10.2
Part 3 (final): removing MY_CHARSET_HANDLER::well_formed_len().
2016-10-10 14:36:09 +04:00
Alexander Barkov
0f8a1a314d MDEV-10877 xxx_unicode_nopad_ci collations 2016-09-23 14:19:07 +04:00
Alexander Barkov
ee19806b8e MDEV-9711 NO PAD collations
Based on the patch from Daniil Medvedev (a Google Summer of Code task)
2016-09-06 12:50:02 +04:00
Alexander Barkov
1ca595fbf7 LDML refactoring for "MDEV-9711 NO PAD collations"
- Moving detection of the MY_CS_CSSORT, MY_CS_PUREASCII, MY_CS_NONASCII
  flags of loadable collations from add_collation() in mysys.c
  to my_cset_init_8bit() and my_coll_init_simple() in ctype-simple.c.

- Adding tests that these flags are set properly for loadable collations

- Moving LDML test related *.xml files from mysql-test/std_data/
  to mysql-test/std_data/ldml/, as there will be more *.xml test files
2016-09-03 09:05:56 +04:00
Sergei Golubchik
932646b1ff Merge branch '10.1' into 10.2 2016-06-30 16:38:05 +02:00
Sergei Golubchik
25f1a7ae69 revert part of 69f1a32
in particular, revert changes to the spider (avoid diverging from
  the upstream if possible)
2016-06-22 10:23:11 +02:00
Alexander Barkov
63120090f9 MDEV-10262 ucs2_thai_520_w2: wrong implicit weights on the secondary level 2016-06-21 21:36:23 +04:00
Vladislav Vaintroub
69f1a3215e Replace dynamic loading of mysqld.exe data for plugins, replace with MYSQL_PLUGIN_IMPORT 2016-06-21 19:20:11 +02:00
pruet
fb35b9ad07 Multi-level collation in UCA, Thai sorting with contraction for UTF8. 2016-05-26 16:45:50 +07:00
Alexander Barkov
e7ff281d2e MDEV-6353 my_ismbchar() and my_mbcharlen() refactoring 2016-05-17 15:27:10 +04:00
Alexander Barkov
3fc6a8b832 MDEV-9811 LOAD DATA INFILE does not work well with gbk in some cases
MDEV-9824 LOAD DATA does not work with multi-byte strings in LINES TERMINATED BY when IGNORE is specified
2016-03-31 14:22:25 +04:00
Alexander Barkov
1d73005bf3 MDEV-8360 Clean-up CHARSET_INFO: strnncollsp: diff_if_only_endspace_difference
- Removing the "diff_if_only_endspace_difference" argument from
  MY_COLLATION_HANDLER::strnncollsp(), my_strnncollsp_simple(),
  as well as in the function template MY_FUNCTION_NAME(strnncollsp)
  in strcoll.ic

- Removing the "diff_if_only_space_different" from ha_compare_text(),
  hp_rec_key_cmp().

- Adding a new function my_strnncollsp_padspace_bin() and reusing
  it instead of duplicate code pieces in my_strnncollsp_8bit_bin(),
  my_strnncollsp_latin1_de(), my_strnncollsp_tis620(),
  my_strnncollsp_utf8_cs().

- Adding more tests for better coverage of the trailing space handling.

- Removing the unused definition of HA_END_SPACE_ARE_EQUAL
2016-03-31 11:04:48 +04:00
Alexander Barkov
69b5c4a422 Fixing the return data type of my_charlen() from "uint" to "int",
as it can return negative values.
The typo was introduced in the patch for MDEV-9665 in 10.2.0.
2016-03-25 15:41:10 +04:00
Alexander Barkov
3be95ee061 Removing my_strnncoll_mb_bin() and my_strnncollsp_mb_bin(),
as they are not used any more.
We now use function templates from strcoll.ic instead.
2016-03-25 06:42:44 +04:00
Alexander Barkov
e09299511e MDEV-9665 Remove cs->cset->ismbchar()
Using a more powerfull cs->cset->charlen() instead.
2016-03-16 10:55:12 +04:00
Alexander Barkov
ff2d92b17d MDEV-7231 Field ROUTINE_DEFINITION in INFORMATION_SCHEMA.ROUTINES
contains broken procedure body when used shielding quotes inside.
2016-02-24 13:12:03 +04:00
Alexander Barkov
3ba2a958be MDEV-8694 Wrong result for SELECT..WHERE a NOT LIKE 'a ' AND a='a'
Note, the patch for MDEV-8661 unintentionally fixed MDEV-8694 as well,
as a side effect. Adding a real clear fix: implementing
Item_func_like::propagate_equal_fields() with comments.
2015-08-28 17:03:09 +04:00
Alexander Barkov
78b80cb6ba Adding MY_CHARSET_HANDLER::native_to_mb().
This is a pre-requisite patch for:
- MDEV-8433 Make field<'broken-string' use indexes
- MDEV-8625 Bad result set with ignorable characters when using a prefix key
- MDEV-8626 Bad result set with contractions when using a prefix key
2015-08-14 18:34:41 +04:00
Alexander Barkov
95d07ee408 MDEV-8215 Asian MB3 charsets: compare broken bytes as "greater than any non-broken character" 2015-07-03 10:33:17 +04:00
Alexander Barkov
e6f67c64cd MDEV-6572 "USE dbname" with a bad sequence erroneously connects to a wrong database 2015-03-16 21:55:10 +04:00
Alexander Barkov
f48dc5ccc7 Moving the conversion code from String::well_formed_copy()
to my_convert_fix() - a new function in /strings.
2015-03-16 12:14:31 +04:00
Alexander Barkov
197afb413f MDEV-6566 Different INSERT behaviour on bad bytes with and without character set conversion 2015-03-13 16:51:36 +04:00
Alexander Barkov
b1b6101af2 A preparatory patch for MDEV-6566.
Adding a new virtual function MY_CHARSET_HANDLER::copy_abort().
Moving character set specific code into the correspoding implementations
(for simple, multi-byte and mbmaxlen>1 character sets).
2015-03-02 18:24:22 +04:00
Alexander Barkov
9392d0e280 - MDEV-6695 Bad column name for UCS2 string literals
The Item_string constructors called set_name() on the source string,
  which was wrong because in case of UCS2/UTF16/UTF32 the source value
  might be a not well formed string (e.g. have incomplete leftmost character).
  Now set_name() is called on str_value after its copied 
  (with optionally left zero padding) from the source string.
- MDEV-6694 Illegal mix of collation with a PS parameter
  Item_param::convert_str_value() did not set repertoire.
  Introducing a new structure MY_STRING_METADATA to collect
  character length and repertoire of a string in a single loop,
  to avoid two separate loops. Adding a new class Item_basic_value::Metadata
  as a convenience wrapper around MY_STRING_METADATA, to reuse the
  code between Item_string and Item_param.
2014-09-04 21:58:48 +04:00
Alexander Barkov
0afd292073 Fixing an MSVC warning about double "const" data type qualifier
in the code merged from MySQL-5.6:

  const CHARSET_INFO* -> CHARSET_INFO*

(CHARSET_INFO already has the "const" qualifier in MariaDB, inlike in MySQL)
2013-12-05 16:54:50 +04:00
Michael Widenius
192678e7bf MDEV-5241: Collation incompatibilities with MySQL-5.6
- Character set code & tests from Alexander Barkov
- Integration with ALTER TABLE, REPAIR and open_table from Monty

The problem was that MySQL 5.6 added some croatian and vitanamese character set collations that are incompatible with MariaDB.

The fix is to move the MariaDB conflicting collation numbers out of the region that MySQL is likely to use.
mysql_upgrade, REPAIR TABLE or ALTER TABLE will fix the collations.
If one tries to access and old incompatible table, one will get the error "Table upgrade required...."
After this patch, MariaDB supports all the MySQL character set collations and the old MariaDB croatian collations, which are closer to the latest standard than the MySQL versions.

New character sets:
ucs2_croatian_mysql561_uca_ci
utf8_croatian_mysql561_uca_ci
utf16_croatian_mysql561_uca_ci
utf32_croatian_mysql561_uca_ci
utf8mb4_croatian_mysql561_uca_ci

Other things:
- Fixed some compiler warnings
- mysql_upgrade prints information about repaired tables.
- Increased version number

VERSION:
  Increased VERSION number
client/mysqlcheck.c:
  Print repaired table name when using --verbose
include/m_ctype.h:
  Add new MariaDB collation regions that are not likely to conflict with MySQL
include/my_base.h:
  Added flag to detect if table was opened for ALTER TABLE
mysql-test/r/ctype_ldml.result:
  Updated result
mysql-test/r/ctype_uca.result:
  Updated result
mysql-test/r/ctype_upgrade.result:
  Updated result
mysql-test/r/ctype_utf16_uca.result:
  Updated result
mysql-test/r/ctype_utf32_uca.result:
  Updated result
mysql-test/r/ctype_utf8mb4_uca.result:
  Updated result
mysql-test/std_data/ctype_upgrade:
  Test files for testing upgrading of conflicting collations
mysql-test/suite/engines/funcs/r/db_alter_collate_ascii.result:
  New collations added
mysql-test/suite/engines/funcs/r/db_alter_collate_utf8.result:
  New collations added
mysql-test/suite/innodb/r/innodb_ctype_ldml.result:
  Updated test result
mysql-test/suite/innodb/t/innodb_ctype_ldml.test:
  Updated test result
mysql-test/suite/plugins/r/show_all_plugins.result:
  Updated version number
mysql-test/suite/roles/create_and_drop_role_invalid_user_table.result:
  Updated version number
mysql-test/t/ctype_ldml.test:
  Updated test
mysql-test/t/ctype_uca.test:
  Testing of new collations
mysql-test/t/ctype_upgrade.test:
  Testing of upgrading tables with old collations
  The test ensures that:
  - We will get an error if we try to open a table with old collations.
  - CHECK TABLE will detect that the table needs to be upgraded.
  - ALTER TABLE and REPAIR will fix the table.
  - mysql_upgrade works as expected
mysql-test/t/ctype_utf16_uca.test:
  Testing of new collations
mysql-test/t/ctype_utf32_uca.test:
  Testing of new collations
mysql-test/t/ctype_utf8mb4_uca.test:
  Testing of new collations
mysys/charset-def.c:
  Added new character sets
mysys/charset.c:
  Always give an error, if requested, if a character set didn't exist
sql/handler.cc:
  - Added upgrade_collation() to check if collation is compatible with old version
  - check_collation_compatibility() checks if we are using an old collation from MariaDB 5.5 or MySQL 5.6
  - ha_check_for_upgrade() returns HA_ADMIN_NEEDS_ALTER if we have an incompatible collation
sql/handler.h:
  Added new prototypes
sql/sql_table.cc:
  - Mark that tables are opened for ALTER TABLE
  - If table needs to be upgraded, ensure we are not using online alter table.
sql/table.cc:
  - If we are using an old incompatible collation, change to use the new one and mark table as incompatible.
  - Give an error if we try to open an incompatible table.
sql/table.h:
  Added error that table needs to be rebuild
storage/connect/ha_connect.cc:
  Fixed compiler warning
strings/ctype-uca.c:
  New character sets
2013-11-09 00:20:07 +02:00
Alexander Barkov
bd3dc54261 A few minor Unicode collation customization improvements were made,
which makes it possible to add more world language collations
with very complex collation rules (e.g. Myanmar):
- Weight string for a single character in a user defined collation
  was erroneously limited to 7 weights (instead of 8 weights).
  Added an extra element in the user-defined weight arrays,
  to fit 8 non-zero weights.
- Weight string limit for contractions was made two times longer (16 weights),
  which allows longer contractions without affecting the performance
  of filesort.
- A user-defined collation now refuses to initialize and reports an error
  in case if a weight string gets longer than 8 weights for a single character,
  or longer than 16 weights for a contraction. Previously weight strings
  for such characters (and contractions) were cut, so a collation
  could silently start with wrong rules.
- Fixed a bug in handling rules like "&a << b" in combination with
  shift-after-method="expand". The primary weight for "b" was not
  correctly calculated, which erroneously made "b" primary greater than "a"
  instead of primary equal to "a".
2013-10-31 14:24:24 +04:00
Alexander Barkov
71f8ca654e MDEV-5180 Data type for WEIGHT_STRING is too short in some cases
(a bug in upstream)
2013-10-25 15:01:03 +04:00