mirror of
https://github.com/MariaDB/server.git
synced 2025-07-30 16:24:05 +03:00
MDEV-6255 DUPLICATE KEY Errors on SELECT .. GROUP BY that uses temporary and filesort.
The problem was that my_hash_sort did not remove all end-space characters, so strings that should compare as identical were seen as different strings. (Space was handled correctly, but not NBSP.) This caused duplicate key errors when a heap table was converted to Aria as part of an overflow in GROUP BY.

Fixed by removing all characters that compare as end space when creating a hash.

Other things:
- Fixed that --sorted_result also works for errors in mysqltest.
- Speed up hash by not comparing strings that have different hashes.
- Speed up many my_hash_sort functions by using registers to calculate the hash instead of pointers. This was previously done for some functions, but not for all.
- Made a macro of the hash function, to simplify code and to be able to experiment with new hash functions.

client/mysqltest.cc:
  Fixed that --sorted_result also works for error messages.
mysql-test/r/ctype_partitions.result:
  New test to ensure that partitions on hash work
mysql-test/suite/multi_source/gtid.result:
  Updated result
mysql-test/suite/multi_source/gtid.test:
  Test that --sorted_result works for error messages
mysql-test/suite/multi_source/gtid_ignore_duplicates.result:
  Updated result
mysql-test/suite/multi_source/gtid_ignore_duplicates.test:
  Updated result
mysql-test/suite/multi_source/load_data.result:
  Updated result
mysql-test/suite/multi_source/load_data.test:
  Updated result
mysql-test/t/ctype_partitions.test:
  New test to ensure that partitions on hash work
storage/heap/hp_write.c:
  Speed up hash by not comparing strings that have different hashes.
storage/maria/ma_check.c:
  Extra debug
strings/ctype-bin.c:
  Use macro for hash function
strings/ctype-latin1.c:
  Use macro for hash function
  Use registers to calculate hash (speedup)
strings/ctype-mb.c:
  Use macro for hash function
  Use registers to calculate hash (speedup)
strings/ctype-simple.c:
  Use macro for hash function
  Use same variable names as in other my_hash_sort functions.
  Update my_hash_sort_simple() to properly remove end space (patch by Bar)
strings/ctype-uca.c:
  Ignore duplicated space inside strings and end space in my_hash_sort_uca(). This fixed MDEV-6255
  Use macro for hash function
  Use registers to calculate hash (speedup)
strings/ctype-ucs2.c:
  Use macro for hash function
  Use registers to calculate hash (speedup)
strings/ctype-utf8.c:
  Use macro for hash function
  Use registers to calculate hash (speedup)
strings/strings_def.h:
  Made a macro of the hash function, to simplify code and to be able to experiment with new hash functions.
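
The note for storage/heap/hp_write.c (skip the string comparison when the hashes differ) can be illustrated with a small sketch. This is not the code from hp_write.c: the `SKETCH_ENTRY` structure, the `sketch_find()` function and the `compares` counter are hypothetical names introduced here, and `memcmp()` stands in for the collation-aware key comparison a real storage engine would use.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical chain node; the real heap engine uses its own structures. */
typedef struct sketch_entry
{
  const struct sketch_entry *next;
  unsigned long hash;            /* hash stored when the key was inserted */
  const unsigned char *key;
  size_t len;
} SKETCH_ENTRY;

/*
  Walk a hash chain looking for 'key'. The expensive comparison runs only
  for entries whose stored hash matches; 'compares' counts how often.
*/
static const SKETCH_ENTRY *sketch_find(const SKETCH_ENTRY *chain,
                                       unsigned long hash,
                                       const unsigned char *key, size_t len,
                                       unsigned *compares)
{
  assert(compares != NULL);
  for ( ; chain; chain= chain->next)
  {
    if (chain->hash != hash)
      continue;                  /* different hash: skip the comparison */
    (*compares)++;
    if (chain->len == len && memcmp(chain->key, key, len) == 0)
      return chain;
  }
  return NULL;
}
```

The shortcut is only safe if keys that compare as equal always produce the same hash, which is the property the end-space fix in this commit restores.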
@@ -306,24 +306,48 @@ void my_hash_sort_simple(CHARSET_INFO *cs,
 {
   register const uchar *sort_order=cs->sort_order;
   const uchar *end;
-  ulong n1, n2;
+  register ulong m1= *nr1, m2= *nr2;
+  uint16 space_weight= sort_order[' '];
 
   /*
-    Remove end space. We have to do this to be able to compare
-    'A ' and 'A' as identical
-  */
-  end= skip_trailing_space(key, len);
+    Remove all trailing characters that are equal to space.
+    We have to do this to be able to compare 'A ' and 'A' as identical.
+
+    If the key is long enough, cut the trailing spaces (0x20) using an
+    optimized function implemented in skip_trailing_spaces().
+
+    "len > 16" is just some heuristic here.
+    Calling skip_triling_space() for short values is not desirable,
+    because its initialization block may be more expensive than the
+    performance gained.
+  */
+
+  end= len > 16 ? skip_trailing_space(key, len) : key + len;
+
+  /*
+    We removed all trailing characters that are binary equal to space 0x20.
+    Now remove all trailing characters that have weights equal to space.
+    Some 8bit simple collations may have such characters:
+    - cp1250_general_ci 0xA0 NO-BREAK SPACE == 0x20 SPACE
+    - cp1251_ukrainian_ci 0x60 GRAVE ACCENT == 0x20 SPACE
+    - koi8u_general_ci 0x60 GRAVE ACCENT == 0x20 SPACE
+  */
+
+  for ( ; key < end ; )
+  {
+    if (sort_order[*--end] != space_weight)
+    {
+      end++;
+      break;
+    }
+  }
 
-  n1= *nr1;
-  n2= *nr2;
   for (; key < (uchar*) end ; key++)
   {
-    n1^=(ulong) ((((uint) n1 & 63)+n2) *
-             ((uint) sort_order[(uint) *key])) + (n1 << 8);
-    n2+=3;
+    MY_HASH_ADD(m1, m2, (uint) sort_order[(uint) *key]);
   }
-  *nr1= n1;
-  *nr2= n2;
+  *nr1= m1;
+  *nr2= m2;
 }
 
 
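
The effect of the change to my_hash_sort_simple() can be tried outside the server with a minimal, self-contained sketch. Everything named `sketch_*` or `SKETCH_*` is hypothetical. The macro body is inferred from the loop body this diff removes, since the real MY_HASH_ADD definition lives in strings/strings_def.h and is not shown here. The weight table is a toy collation, not real cp1250 data, and the `len > 16` fast path through skip_trailing_space() is left out.

```c
#include <assert.h>
#include <stddef.h>

/*
  Inferred from the lines removed by the diff; an assumption, not a copy
  of the macro in strings/strings_def.h.
*/
#define SKETCH_HASH_ADD(A, B, value) \
  do { (A)^= ((((A) & 63) + (B)) * (value)) + ((A) << 8); (B)+= 3; } while (0)

/*
  Hypothetical helper: hash key[0..len) under a 256-entry weight table,
  ignoring every trailing byte whose weight equals the weight of 0x20.
*/
static void sketch_hash_sort(const unsigned char *sort_order,
                             const unsigned char *key, size_t len,
                             unsigned long *nr1, unsigned long *nr2)
{
  const unsigned char *end;
  unsigned long m1= *nr1, m2= *nr2;
  unsigned char space_weight= sort_order[' '];

  assert(key != NULL);
  end= key + len;

  /* Trim by weight, so 0x20 and anything that sorts like it is dropped */
  while (key < end && sort_order[end[-1]] == space_weight)
    end--;

  for ( ; key < end; key++)
    SKETCH_HASH_ADD(m1, m2, (unsigned long) sort_order[*key]);

  *nr1= m1;
  *nr2= m2;
}

/*
  Hypothetical toy collation: identity weights, except that 0xA0 (NBSP)
  gets the same weight as 0x20, as the diff comment describes for
  cp1250_general_ci.
*/
static void sketch_init_weights(unsigned char *sort_order)
{
  int i;
  for (i= 0; i < 256; i++)
    sort_order[i]= (unsigned char) i;
  sort_order[0xA0]= sort_order[' '];
}
```

With this table, hashing "A" and "A" followed by a space and an NBSP from the same starting values of nr1 and nr2 gives identical results, while trimming only 0x20 would stop at the NBSP and give two different hashes for keys that compare as equal.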