1
0
mirror of https://github.com/MariaDB/server.git synced 2025-08-01 03:47:19 +03:00

Backport of:

revno: 2876.47.174
revision-id: jorgen.loland@oracle.com-20110519120355-qn7eprkad9jqwu5j
parent: mayank.prasad@oracle.com-20110518143645-bdxv4udzrmqsjmhq
committer: Jorgen Loland <jorgen.loland@oracle.com>
branch nick: mysql-trunk-11765831
timestamp: Thu 2011-05-19 14:03:55 +0200
message:
  BUG#11765831: 'RANGE ACCESS' MAY INCORRECTLY FILTER 
                        AWAY QUALIFYING ROWS
        
  The problem was that the ranges created when OR'ing two 
  conditions could be incorrect. Without the bugfix, 
  "I <> 6 OR (I <> 8 AND J = 5)" would create these ranges:
  
  "NULL < I < 6",
  "6 <= I <= 6 AND 5 <= J <= 5",
  "6 < I < 8",
  "8 <= I <= 8 AND 5 <= J <= 5",
  "8 < I"
  
  While the correct ranges is
  "NULL < I < 6",
  "6 <= I <= 6 AND 5 <= J <= 5",
  "6 < I"
  
  The problem occurs when key_or() ORs
  (1) "NULL < I < 6, 6 <= I <= 6 AND 5 <= J <= 5, 6 < I" with 
  (2) "8 < I AND 5 <= J <= 5"
  
  The reason for the bug is that in key_or(), SEL_ARG *tmp is 
  used to point to the range in (1) above that is merged with 
  (2) while key1 points to the root of the red-black tree of 
  (1). When merging (1) and (2), tmp refers to the "6 < I" 
  part whereas the root is the "6 <= ... AND 5 <= J <= 5" part. 
  
  key_or() decides that the tmp range needs to be split into
  "6 < I < 8, 8 <= I <= 8, 8 < I", in which next_key_part of the 
  second range should be that of tmp. However, next_key_part is
  set to key1->next_key_part ("5 <= J <= 5") instead of 
  tmp->next_key_part (empty). Fixing this gives the correct but
  not optimal ranges:
  "NULL < I < 6",
  "6 <= I <= 6 AND 5 <= J <= 5",
  "6 < I < 8",
  "8 <= I <= 8",
  "8 < I"
  
  A second problem can be seen above: key_or() may create 
  adjacent ranges that could be replaced with a single range. 
  Fixes for this is also included in the patch so that the range
  above becomes correct AND optimal:
  "NULL < I < 6",
  "6 <= I <= 6 AND 5 <= J <= 5",
  "6 < I"
  
  Merging adjacent ranges like this gives a slightly lower cost 
  estimate for the range access.
This commit is contained in:
Sergey Petrunya
2011-08-05 22:01:49 +04:00
parent 332b47d718
commit 0e19f3e36f
6 changed files with 200 additions and 11 deletions

View File

@ -9044,11 +9044,53 @@ key_or(RANGE_OPT_PARAM *param, SEL_ARG *key1,SEL_ARG *key2)
This is the case ("cmp>=0" means that tmp.max >= key2.min):
key2: [----]
tmp: [------------*****]
*/
if (!tmp->next_key_part)
{
/*
tmp->next_key_part is empty: cut the range that is covered
by tmp from key2.
Reason: (key2->next_key_part OR tmp->next_key_part) will be
empty and therefore equal to tmp->next_key_part. Thus, this
part of the key2 range is completely covered by tmp.
*/
if (tmp->cmp_max_to_max(key2) >= 0)
{
/*
tmp covers the entire range in key2.
key2: [----]
tmp: [-----------------]
Move on to next range in key2
*/
key2->increment_use_count(-1); // Free not used tree
key2=key2->next;
continue;
}
else
{
/*
This is the case:
key2: [-------]
tmp: [---------]
Result:
key2: [---]
tmp: [---------]
*/
key2->copy_max_to_min(tmp);
continue;
}
}
/*
The ranges are overlapping but have not been merged because
next_key_part of tmp and key2 are different
next_key_part of tmp and key2 differ.
key2: [----]
tmp: [------------*****]
Result:
Split tmp in two where key2 starts:
key2: [----]
key1: [--------][--*****]
^ ^
@ -9057,7 +9099,7 @@ key_or(RANGE_OPT_PARAM *param, SEL_ARG *key1,SEL_ARG *key2)
SEL_ARG *new_arg=tmp->clone_first(key2);
if (!new_arg)
return 0; // OOM
if ((new_arg->next_key_part= key1->next_key_part))
if ((new_arg->next_key_part= tmp->next_key_part))
new_arg->increment_use_count(key1->use_count+1);
tmp->copy_min_to_min(key2);
key1=key1->insert(new_arg);
@ -9166,12 +9208,21 @@ key_or(RANGE_OPT_PARAM *param, SEL_ARG *key1,SEL_ARG *key2)
^ ^
new_arg tmp
Steps:
0) If tmp->next_key_part is empty: do nothing. Reason:
(key2_cpy->next_key_part OR tmp->next_key_part) will be
empty and therefore equal to tmp->next_key_part. Thus,
the range in key2_cpy is completely covered by tmp
1) Make new_arg with range [tmp.min, key2_cpy.max].
new_arg->next_key_part is OR between next_key_part
of tmp and key2_cpy
2) Make tmp the range [key2.max, tmp.max]
3) Insert new_arg into key1
*/
if (!tmp->next_key_part) // Step 0
{
key2_cpy.increment_use_count(-1); // Free not used tree
break;
}
SEL_ARG *new_arg=tmp->clone_last(&key2_cpy);
if (!new_arg)
return 0; // OOM