methods in Item_bool_func descendants, which gives some advantages:
- Removing the "bool inv" parameter, as its now available through "this"
for Item_func_between and Item_func_in, and is not needed for the other
Item_func_xxx.
- Removing casts
- Making a step to data types plugings
- Changed ER(ER_...) to ER_THD(thd, ER_...) when thd was known or if there was many calls to current_thd in the same function.
- Changed ER(ER_..) to ER_THD_OR_DEFAULT(current_thd, ER...) in some places where current_thd is not necessary defined.
- Removing calls to current_thd when we have access to thd
Part of this is optimization (not calling current_thd when not needed),
but part is bug fixing for error condition when current_thd is not defined
(For example on startup and end of mysqld)
Notable renames done as otherwise a lot of functions would have to be changed:
- In JOIN structure renamed:
examined_rows -> join_examined_rows
record_count -> join_record_count
- In Field, renamed new_field() to make_new_field()
Other things:
- Added DBUG_ASSERT(thd == tmp_thd) in Item_singlerow_subselect() just to be safe.
- Removed old 'tab' prefix in JOIN_TAB::save_explain_data() and use members directly
- Added 'thd' as argument to a few functions to avoid calling current_thd.
Fixed several optimizer issues relatied to GROUP BY:
a) Refering to a SELECT column in HAVING sometimes calculated it twice, which caused problems with non determinstic functions
b) Removing duplicate fields and constants from GROUP BY was done too late for "using index for group by" optimization to work
c) EXPLAIN SELECT ... GROUP BY did wrongly show 'Using filesort' in some cases involving "Using index for group-by"
a) was fixed by:
- Changed last argument to Item::split_sum_func2() from bool to int to allow more flags
- Added flag argument to Item::split_sum_func() to allow on to specify if the item was in the SELECT part
- Mark all split_sum_func() calls from SELECT with SPLIT_SUM_SELECT
- Changed split_sum_func2() to do nothing if called with an argument that is not a sum function and doesn't include sum functions, if we are not an argument to SELECT.
This ensures that in a case like
select a*sum(b) as f1 from t1 where a=1 group by c having f1 <= 10;
That 'a' in the SELECT part is stored as a reference in the temporary table togeher with sum(b) while the 'a' in having isn't (not needed as 'a' is already a reference to a column in the result)
b) was fixed by:
- Added an extra remove_const() pass for GROUP BY arguments before make_join_statistics() in case of one table SELECT.
This allowes get_best_group_min_max() to optimize things better.
c) was fixed by:
- Added test for group by optimization in JOIN::exec_inner for
select->quick->get_type() == QUICK_SELECT_I::QS_TYPE_GROUP_MIN_MAX
item.cc:
- Simplifed Item::split_sum_func2()
- Split test to make them faster and easier to read
- Changed last argument to Item::split_sum_func2() from bool to int to allow more flags
- Added flag argument to Item::split_sum_func() to allow on to specify if the item was in the SELECT part
- Changed split_sum_func2() to do nothing if called with an argument that is not a sum function and doesn't include sum functions, if we are not an argument to SELECT.
opt_range.cc:
- Simplified get_best_group_min_max() by calcuating first how many group_by elements.
- Use join->group instead of join->group_list to test if group by, as join->group_list may be NULL if everything was optimized away.
sql_select.cc:
- Added an extra remove_const() pass for GROUP BY arguments before make_join_statistics() in case of one table SELECT.
- Use group instead of group_list to test if group by, as group_list may be NULL if everything was optimized away.
- Moved printing of "Error in remove_const" to remove_const() instead of having it in caller.
- Simplified some if tests by re-ordering code.
- update_depend_map_for_order() and remove_const() fixed to handle the case where make_join_statistics() has not yet been called (join->join_tab is 0 in this case)
Moving Item_func_spatial_rel from Item_bool_func to Item_bool_func2.
to make OP(const,field) use indexes.
- MBR functions supported OP(const,field) optimization in 10.0,
but were inintentionally broken in an earlier 10.1 change that introduced
a common parent for Item_func_spatial_mbr_rel and Item_func_spatial_precise_rel.
- Precise functions never supported optimization for OP(const,field).
Now both MBR and precise functions support OP(const,field) optimization.
count_sargable_conds() instead for Item_func_in, Item_func_null_predicate,
Item_bool_func2. There other Item_int_func descendants that used to set
"sargable" to true (Item_func_between, Item_equal) already have their
own implementation of count_sargable_conds(). There is no sense to
have two parallel coding models for the same thing.
Item_func_eq's created during conversion of a ROW equality to a conjunction
of scalar equalities did not set cmp_context for its arguments properly,
so some of these created Item_func_eq could be later erroneously eliminated.
Pass THD to find_all_keys() and Item_equal::Item_equal().
In MRR use table->in_use instead of current_thd.
This reduces number of pthread_getspecific() calls from 354 to 320.
Step #8: Adding get_mm_tree() in Item_func, Item_func_between,
Item_func_in, Item_equal. This removes one virtual call item->type()
in queries like:
SELECT * FROM t1 WHERE c BETWEEN const1 AND const2;
SELECT * FROM t1 WHERE c>const;
SELECT * FROM t1 WHERE c IN (const_list);
Step #7 (mostly preparatory for the next step #8):
Splitting the function get_mm_parts() into a virtual method in Item.
This changes a virtual call for item->type() into a virtual call for item->get_mm_tree(),
but also *removes* one virtual call Item_cond::functype(), which used to distinguish
between COND_AND_FUNC vs COND_OR_FUNC.
- Changing Comp_creator::create() and create_swap() to return
Item_bool_rowready_func2 instead of Item_bool_func2, as they
can never return neither Item_func_like nor Item_func_xor
- Changing the first argument of Comp_create::create() and create_swap()
from THD to MEM_ROOT, so the method implementations can now reside in
item_cmpfunc.h instead of item_cmpfunc.cc and thus make the code slightly
easier to read.
Step#5: changing the function remove_eq_conds() into a virtual method in Item.
It removes 6 virtual calls for Item_func::type(), and adds only 2
virtual calls for Item***::remove_eq_conds().
sql_alloc() has additional costs compared to direct mem_root allocation:
- function call: it is defined in a separate translation unit and can't be
inlined
- it needs to call pthread_getspecific() to get THD::mem_root
It is called dozens of times implicitely at least by:
- List<>::push_back()
- List<>::push_front()
- new (for Sql_alloc derived classes)
- sql_memdup()
Replaced lots of implicit sql_alloc() calls with direct mem_root allocation,
passing through THD pointer whenever it is needed.
Number of sql_alloc() calls reduced 345 -> 41 per OLTP RO transaction.
pthread_getspecific() overhead dropped 0.76 -> 0.59
sql_alloc() overhed dropped 0.25 -> 0.06
Step #3: Splitting the function check_equality() into a method in Item.
Implementing Item::check_equality() and Item_func_eq::check_equality().
Implement Item_func_eq::build_equal_items() in addition to
Item_func::build_equal_items() and moving the call for check_equality()
from Item_func::build_equal_items() to Item_func_eq::build_equal_items().
Step#2:
1. Removes the function build_equal_items_for_cond() and
introduces a new method Item::build_equal_items() instead,
with specific implementations in the following Items:
Item (the default implementation)
Item_ident_or_func_or_sum
Item_cond
Item_cond_and
2. Adds a new abstract class Item_ident_or_func_or_sum,
a common parent for Item_ident and Item_func_or_sum,
as they have exactly the same build_equal_items().
3. Renames Item_cond_and::cond_equal to Item_cond_and::m_cond_equal,
to avoid confusion between the member and local variables named
"cond_equal".
- Adding a new class Item_args, represending regular function or
aggregate function arguments array.
- Adding a new class Item_func_or_sum,
a parent class for Item_func and Item_sum
- Moving Item_result_field::name() to Item_func_or_sum(),
as name() is not needed on Item_result_field level.
- Renaming Item::is_bool_func() to is_bool_type(), to avoid assumption
that the item is an Item_func derivant.
- Deriving Item_func_spatial_rel from Item_bool_func rather than Item_int_func
Item_func::print() prints itself as name + "(" + arguments + ")".
Normally that works, but Item_func_interval internally implements its
arguments as one single Item_row. Item_row prints itself as
"(" + values + ")". As a result, "INTERVAL(1,2)" was being printed
as "INTERVAL((1,2))". Fixed with a custom Item_func_interval::print().
Fixing a wrong assymetric code in Arg_comparator::set_cmp_func().
It existed for a long time, but showed up in 10.0.14 after the fix
for "MDEV-6666 Malformed result for CONCAT(utf8_column, binary_string)".
Simply disallowing equality propagation into LIKE.
A more delicate fix is be possible, but it would need too many changes,
which is not desirable in 10.0 at this point.
Moving Item_bool_func2 and Item_func_opt_neg from Item_int_func to
Item_bool_func. Now all functions that return is_bool_func()=true
have a common root class Item_bool_func.
This change is needed to fix MDEV-7149 properly.
The bug is not very important per se, but it was helpful to move
Item_func_strcmp out of Item_bool_func2 (to Item_int_func),
for the purposes of "MDEV-4912 Add a plugin to field types (column types)".