mirror of https://github.com/MariaDB/server.git synced 2025-07-30 16:24:05 +03:00

A few minor Unicode collation customization improvements were made
that make it possible to add more world language collations
with very complex collation rules (e.g. Myanmar):
- Weight string for a single character in a user-defined collation
  was erroneously limited to 7 weights (instead of 8 weights).
  Added an extra element in the user-defined weight arrays,
  to fit 8 non-zero weights.
- Weight string limit for contractions was doubled (to 16 weights),
  which allows longer contractions without affecting the performance
  of filesort.
- A user-defined collation now refuses to initialize and reports an error
  if a weight string gets longer than 8 weights for a single character,
  or longer than 16 weights for a contraction. Previously weight strings
  for such characters (and contractions) were truncated, so a collation
  could silently start with wrong rules.
- Fixed a bug in handling rules like "&a << b" in combination with
  shift-after-method="expand". The primary weight for "b" was not
  correctly calculated, which erroneously made "b" primary greater than "a"
  instead of primary equal to "a".
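
The length check described in the list above can be sketched in C as follows.
This is only an illustration under stated assumptions: the macro names, the
function name check_weight_count() and the error text are hypothetical and are
not taken from the server source. Only the limits (8 weights for a single
character, 16 for a contraction) come from this commit message.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical names; the limits themselves come from the commit message. */
#define MAX_WEIGHTS_CHAR         8   /* single character */
#define MAX_WEIGHTS_CONTRACTION 16   /* contraction      */

/*
  Return 0 if a weight string of nweights elements fits the limit,
  or -1 and a message in err if it does not. The caller is expected
  to refuse to initialize the collation on -1 instead of truncating.
*/
static int check_weight_count(size_t nweights, int is_contraction,
                              char *err, size_t errlen)
{
  size_t limit= is_contraction ? MAX_WEIGHTS_CONTRACTION : MAX_WEIGHTS_CHAR;
  if (nweights > limit)
  {
    snprintf(err, errlen, "%s weight string is too long: %zu > %zu",
             is_contraction ? "contraction" : "character",
             nweights, limit);
    return -1;
  }
  if (errlen)
    err[0]= '\0';                    /* no error */
  return 0;
}
```

With these assumed limits, 8 weights for a character and 16 for a contraction
are accepted, while 9 and 17 are rejected with an error rather than being cut.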
Alexander Barkov
2013-10-31 14:24:24 +04:00
parent eea91f633f
commit bd3dc54261
5 changed files with 136 additions and 48 deletions

@@ -114,13 +114,25 @@
weight space between 0 and 1 in DUCET.
Also, to test it works with contractions, put some after 'z'.
-->
<reset>0</reset>
<reset>0</reset><s>001</s><s>002</s>
<pc>abcdefghijklmnopqrstuvwxyz</pc><p>aa</p><p>aaa</p>
<reset before="primary">1</reset>
<pc>ABCDEFGHIJKLMNOPQRSTUVWXYZ</pc><p>AA</p><p>AAA</p>
</rules>
</collation>
<collation name="utf8_5624_5_bad" id="369" shift-after-method="expand">
<rules>
<reset>a-a4</reset><p>xxx04</p>
<reset>a-aa5</reset><p>xxx05</p>
<reset>a-aaa6</reset><p>xxx06</p>
<reset>a-aaaa7</reset><p>xxx07</p>
<reset>a-aaaaa8</reset><p>xxx08</p>
<reset>a-aaaaaa9</reset><p>xxx09</p>
<reset>a-aaaaaa10</reset><p>xxx10</p>
</rules>
</collation>
<collation name="utf8_hugeid_ci" id="2047000000">
<rules>
<reset>a</reset>