
Adjust costs for doing index scan in cost_group_min_max()

The idea is that when doing a tree dive (once per group), we only need to
compare key values, which is fast.  For each new group, however, we have to
check the full WHERE clause for the row.
Compared to the original code, the cost of group_min_max() has increased
slightly, which affects some tests with only a few rows.
main.group_min_max and main.distinct have been modified to show the
effect of the change.

The patch also adjusts the number of groups in the case of quick selects:
- For simple WHERE clauses, ensure that we have at least as many groups
  as we have conditions on the used group-by key parts.
  The assumption is that each condition will create at least one group.
- Ensure that there are no more groups than rows found by the quick select.

Test changes:
- For some small tables the plan has changed:
  Using index for group-by -> Using index for group-by (scanning)
  Range -> Index, and Using index for group-by -> Using index
This commit is contained in:
Monty
2021-10-08 02:36:58 +03:00
committed by Sergei Petrunia
parent bc9805e954
commit 6fa7451759
18 changed files with 223 additions and 104 deletions

@@ -13728,6 +13728,12 @@ cost_group_min_max(TABLE* table, KEY *index_info, uint used_key_parts,
The group-by list is a permutation of the select attributes, according
to their order in the index.
EXAMPLES of handled queries
select max(keypart2) from t1 group by keypart1
select max(keypart2) from t1 where keypart2 <= const group by keypart1
select distinct keypart1 from table;
select count(distinct keypart1) from table;
TODO
- What happens if the query groups by the MIN/MAX field, and there is no
other field as in: "select MY_MIN(a) from t1 group by a" ?
@@ -14820,7 +14826,22 @@ get_field_keypart(KEY *index, Field *field)
This method computes the access cost of a TRP_GROUP_MIN_MAX instance and
the number of rows returned.
The algorithm used for finding the rows is:
For each range (if no ranges, the range is the whole table)
Do an index search for the start of the range
As long as the found key is within the range
If the found row matches the where clause, use the row otherwise skip it
Scan index for next group, jumping over all identical keys, done in
QUICK_GROUP_MIN_MAX_SELECT::next_prefix
If the engine does not support a native next_prefix, we will
either scan the index for the next value or do a new index dive
with 'find next bigger key'.
NOTES
See get_best_group_min_max() for which kind of queries this function
will be called.
The cost computation distinguishes several cases:
1) No equality predicates over non-group attributes (thus no key_infix).
If groups are bigger than blocks on the average, then we assume that it
@@ -14887,17 +14908,18 @@ void cost_group_min_max(TABLE* table, KEY *index_info, uint used_key_parts,
if (!group_key_parts)
{
/* Summary over the whole table */
-keys_per_group= table_records;
+keys_per_group= MY_MAX(table_records,1);
}
else
{
keys_per_group= (ha_rows) index_info->actual_rec_per_key(group_key_parts -
1);
+if (keys_per_group == 0) /* If there is no statistics try to guess */
+{
+/* each group contains 10% of all records */
+keys_per_group= (table_records / 10) + 1;
+}
 }
-if (keys_per_group == 0) /* If there is no statistics try to guess */
-/* each group contains 10% of all records */
-keys_per_group= (table_records / 10) + 1;
num_groups= (table_records / keys_per_group) + 1;
/* Apply the selectivity of the quick select for group prefixes. */
@@ -14906,7 +14928,21 @@ void cost_group_min_max(TABLE* table, KEY *index_info, uint used_key_parts,
quick_prefix_selectivity= (double) quick_prefix_records /
(double) table_records;
num_groups= (ha_rows) rint(num_groups * quick_prefix_selectivity);
set_if_bigger(num_groups, 1);
/*
Expect at least as many groups as there are ranges in the index
WHERE a in (1,2,3) GROUP BY a
This is mostly relevant for queries with few records, which is
something we have a lot of in our test suites.
In theory it is possible to scan the index_tree and, for cases
where all ranges are eq ranges, calculate the exact number
of groups. This is probably overkill, so for now we estimate
the lower bound on the number of groups by the number of range
elements in the tree.
*/
set_if_bigger(num_groups, MY_MAX(index_tree->elements, 1));
/* There cannot be more groups than matched records */
set_if_smaller(num_groups, quick_prefix_records);
}
/* Ensure we don't have more groups than rows in table */
set_if_smaller(num_groups, table_records);
@@ -14943,21 +14979,22 @@ void cost_group_min_max(TABLE* table, KEY *index_info, uint used_key_parts,
group. To make the CPU cost reflect this, we estimate the CPU cost
as the sum of:
1. Cost for evaluating the condition for each num_group
-(1/TIME_FOR_COMPARE) (similarly as for index scan).
+(1/TIME_FOR_COMPARE_IDX) (similarly as for index scan).
2. Cost for navigating the index structure (assuming a b-tree).
Note: We only add the cost for one index comparison per block. For a
b-tree the number of comparisons will be larger. However the cost
is low as all of the upper level b-tree blocks should be in
memory.
TODO: This cost should be provided by the storage engine.
3. Cost for comparing the row with the where clause
*/
const double tree_traversal_cost=
ceil(log(static_cast<double>(table_records))/
log(static_cast<double>(keys_per_block))) *
-1/(2*TIME_FOR_COMPARE);
+1/(TIME_FOR_COMPARE_IDX);
const double cpu_cost= (num_groups *
-(tree_traversal_cost + 1/TIME_FOR_COMPARE_IDX));
+(tree_traversal_cost + 1/TIME_FOR_COMPARE));
*read_cost= io_cost + cpu_cost;
*records= num_groups;