- When the optimizer chose LooseScan, make_join_readinfo() should
use the index that was chosen for LooseScan, and should not try
to find a better (shortest) index.
"HAVING SUM(DISTINCT)": WRONG RESULTS.
ISSUE:
------
If a query uses loose index scan and has both AGG(DISTINCT)
and MIN()/MAX() functions, the result values of MIN()/MAX()
are set improperly.
When a query has AGG(DISTINCT), end_select is set to
end_send_group. "end_send_group" keeps aggregating until it
sees a record from the next group, and only then sends out
the result row of the completed group.
Since the query also has MIN()/MAX() and loose index scan
is used, the MIN()/MAX() values are set as part of the
loose index scan itself. Setting them there overwrites the
values computed in end_send_group, which produced invalid
results.
For such queries to work, loose index scan should stop
performing MIN()/MAX() aggregation and let end_send_group
do it. But under the current design loose index scan can
produce only one row per group key. If we have both MIN()
and MAX(), it would have to hand out two records, which is
not possible because the interface has to use the common
buffer record[0] for both records at a time.
SOLUTIONS:
----------
For such queries to work we would need a new interface for
loose index scan. Hence, do not choose loose index scan for
such cases. A new rule, SA7, is introduced to take care of
this:
SA7: "If Q has both AGG_FUNC(DISTINCT ...) and
MIN/MAX() functions then loose index scan access
method is not used."
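As an illustration, a query of the following shape
(hypothetical table and column names) now falls under SA7
and will not use loose index scan:
  SELECT f1, SUM(DISTINCT f2), MIN(f3), MAX(f3)
  FROM t1
  GROUP BY f1
  HAVING SUM(DISTINCT f2) > 10;
With another access method, end_send_group performs all of
the aggregation, so the MIN()/MAX() values are no longer
overwritten.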
mysql-test/r/group_min_max.result:
Expected result.
mysql-test/t/group_min_max.test:
1. Test with various combination of AGG(DISTINCT) and
MIN(), MAX() functions.
2. Corrected the plan for old queries.
sql/opt_range.cc:
A new rule SA7 is introduced.
SHOW PROCESSLIST, SHOW BINLOGS
Fixing a post-push test failure (MTR does not like being
given 127.0.0.1 for localhost in case of an --embedded run;
it thinks it is an external IP address).
SHOW PROCESSLIST, SHOW BINLOGS
Problem: A deadlock was occurring when 4 threads were
involved in acquiring locks in the following way:
Thread 1: Dump thread (the slave is reconnecting, so on
          the master a new dump thread is trying to kill
          zombie dump threads). It acquired the thread's
          LOCK_thd_data and is about to acquire
          mysys_var->current_mutex (which is LOCK_log).
Thread 2: Application thread executing SHOW BINLOGS; it
          acquired LOCK_log and is about to acquire
          LOCK_index.
Thread 3: Application thread executing PURGE BINARY LOGS;
          it acquired LOCK_index and is about to acquire
          LOCK_thread_count.
Thread 4: Application thread executing SHOW PROCESSLIST;
          it acquired LOCK_thread_count and is about to
          acquire the zombie dump thread's LOCK_thd_data.
Deadlock Cycle:
Thread 1 -> Thread 2 -> Thread 3 -> Thread 4 -> Thread 1
The same deadlock was also observed when Thread 4 was
executing 'SELECT * FROM information_schema.processlist'
and had acquired LOCK_thread_count, about to acquire the
zombie dump thread's LOCK_thd_data.
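A hypothetical reproduction sketch (statements only;
Thread 1 additionally requires a slave reconnect so that
the new dump thread on the master tries to kill the zombie
dump thread):
  -- Session 2 (Thread 2):
  SHOW BINARY LOGS;   -- holds LOCK_log, wants LOCK_index
  -- Session 3 (Thread 3):
  PURGE BINARY LOGS TO 'mysql-bin.000010';
            -- holds LOCK_index, wants LOCK_thread_count
  -- Session 4 (Thread 4):
  SHOW PROCESSLIST;
            -- holds LOCK_thread_count, wants LOCK_thd_data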
Analysis:
There are four locks involved in the deadlock. LOCK_log,
LOCK_thread_count, LOCK_index and LOCK_thd_data.
LOCK_log, LOCK_thread_count and LOCK_index are global
mutexes, whereas LOCK_thd_data is local to a thread.
We can divide these four locks into two groups.
Group 1 consists of LOCK_log and LOCK_index and the order
should be LOCK_log followed by LOCK_index.
Group 2 consists of the other two mutexes,
LOCK_thread_count and LOCK_thd_data, and the order should
be LOCK_thread_count followed by LOCK_thd_data.
Unfortunately, there is no predefined lock order in the
MySQL system for locks across these two groups. In the
problematic example above there is no problem in the way
each thread acquires its locks if you look at each thread
individually; but if you combine all 4 threads, they end
up in a deadlock.
Fix:
Since each thread takes its locks in a locally consistent
way, this patch changes the duration of the locks held by
Thread 4 to break the deadlock. That is, before the patch
the mysqld_list_processes() function behind the
'show processlist' command acquired LOCK_thread_count for
the complete duration of the function, and it also
acquired/released each thread's LOCK_thd_data.
LOCK_thread_count is used to protect addition and deletion
of threads in the global threads list. While show
processlist is looping through all the existing threads, it
is a problem if a thread exits, but not if a new thread is
added to the system. Hence a new mutex, "LOCK_thd_remove",
is introduced to protect deletion of a thread from the
global threads list. All threads that are exiting should
acquire LOCK_thd_remove followed by LOCK_thread_count.
(They must still take LOCK_thread_count because other parts
of the code still assume that thread exit is protected by
LOCK_thread_count; this fix changes only the
'show processlist' query logic.)
(E.g. the unlink_thd logic will be protected with
LOCK_thd_remove.)
The logic of mysqld_list_processes (or
fill_schema_processlist) will now be protected with
'LOCK_thd_remove' instead of 'LOCK_thread_count'.
Now the new locking order after this patch is:
LOCK_thd_remove -> LOCK_thd_data -> LOCK_log ->
LOCK_index -> LOCK_thread_count
ISSUE:
------
For a UNION of selects, the rows examined by the query is
the sum of the rows examined by the individual select
operations plus the rows examined for the union operation.
The session-level counter used to count the rows examined
by a select statement should be accumulated and then reset
before it is used for the next select statement. We missed
resetting it, so the examined row count of a select query
was accounted more than once.
SOLUTION:
---------
In union, reset the session-level counter used to
accumulate the count of examined rows after its value has
been saved.
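For example (a hypothetical query, with t1 holding 10
rows):
  SELECT a FROM t1 UNION SELECT b FROM t1;
Before the fix, the 10 rows examined by the first select
were carried over into the count for the second select, so
the reported examined row count exceeded the expected sum
of the two selects plus the rows examined for the union
operation.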
mysql-test/r/union.result:
Expected output of testcase added.
mysql-test/t/union.test:
Test to verify examined row count of Union operations.
sql/sql_union.cc:
Reset the value of thd->examined_row_count after
accumulating the value.
Problem:
If there is a predicate on a column referenced by MIN/MAX and
that predicate is not present in all the disjunctions on
keyparts earlier in the compound index, Loose Index Scan
will not return a correct result.
Analysis:
When loose index scan is chosen, the range optimizer
currently groups the predicates that contain group keyparts
separately from those that contain min/max keyparts. It
therefore applies all the conditions on the group keyparts
first to the fetched row; then, in the call to next_max, it
processes the conditions that have the min/max keypart.
For example, in the following query:
Select f1, max(f2) from t1 where (f1 = 10 and f2 = 13) or
(f1 = 3) group by f1;
Condition (f2 = 13) would be applied even for rows that
satisfy (f1 = 3) thereby giving wrong results.
Solution:
Do not choose loose index scan for such cases. A new rule,
WA2, is introduced to take care of this.
WA2: "If there are predicates on C, these predicates must
be in conjunction to all predicates on all earlier keyparts
in I."
To do this, the fix reuses the function
get_constant_key_infix(). Since this function would fail
for all multi-range conditions, it is rewritten so that it
now succeeds when the sub-conditions are equivalent across
the disjuncts. To achieve this, a new helper function
called all_same() is introduced.
The fix also moves the test of NGA3 up to its former only
caller, get_constant_key_infix().
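For illustration, the query above is rejected for loose
index scan because (f2 = 13) is missing from the (f1 = 3)
disjunct, whereas a variant like the following
(hypothetical) still qualifies under WA2 since the
predicate on the min/max column is the same in every
disjunct, which is what all_same() checks:
  SELECT f1, MAX(f2) FROM t1
  WHERE (f1 = 10 AND f2 = 13) OR (f1 = 3 AND f2 = 13)
  GROUP BY f1;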
mysql-test/r/group_min_max_innodb.result:
Added test result change for Bug#17909656
mysql-test/t/group_min_max_innodb.test:
Added test cases for Bug#17909656
sql/opt_range.cc:
Introduced Rule WA2 because of Bug#17909656
A typo led to the last list value (partition) not being
included. Also improved pruning to skip the last partition
if it is not used.
rb#4762 approved by Aditya and Marko.
- Fixed a bug where we were using the wrong checksum
algorithm when using VARCHAR with fixed-length rows (see
the repro sketch below).
- Ensure in myisampack that HA_OPTION_NULL_FIELDS is set for tables with null fields.
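A hypothetical repro sketch of the checksum symptom (the
exact DDL for a fixed-row-format table with VARCHAR fields
may differ; shown only to illustrate how the mismatch
surfaces):
  CREATE TABLE t1 (a VARCHAR(10) NOT NULL, b INT NOT NULL)
    ENGINE=MyISAM CHECKSUM=1;
  INSERT INTO t1 VALUES ('x', 1);
  CHECKSUM TABLE t1;           -- live (maintained) checksum
  CHECKSUM TABLE t1 EXTENDED;  -- recomputed row by row; both
                               -- values should agree once the
                               -- correct algorithm is picked
                               -- in mi_open.c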
mysql-test/r/myisampack.result:
Updated results
mysql-test/t/myisampack.test:
Added more tests
storage/myisam/mi_open.c:
Use correct checksum algorithm when we have VARCHAR fields with fixed length records
storage/myisam/myisampack.c:
Ensure HA_OPTION_NULL_FIELDS is set for tables with null fields.
(This was not set by default for uncompressed tables
without checksums, to keep MyISAM tables compatible with
MySQL.)
This is a triple bug with one test suite:
1. Incorrect outer table detection.
2. Incorrect leaf table processing for multi-update (it
should be full, like for usual updates and inserts).
3. ON condition fix_fields() should be called for all
tables of the query.
ARE PERMANENTLY SKIPPED IN 5.5/5.6).
The problem was that some result files were not updated,
so the tests were skipped.
The fix is to record updated result files.
Bug#18396916 MAIN.OUTFILE_LOADDATA TEST FAILS ON ARM, AARCH64, PPC/PPC64
The recorded results for the failing tests were wrong.
They were introduced by the patch for
Bug#30946 mysqldump silently ignores --default-character-set when used with --tab
Correct results were returned for platforms where 'char' is implemented as unsigned.
This was reported as
Bug#46895 Test "outfile_loaddata" fails (reproducible)
Bug#11755168 46895: TEST "OUTFILE_LOADDATA" FAILS (REPRODUCIBLE)
The patch for that bug fixed only parts of the problem,
leaving the incorrect results in the .result file.
Solution: use 'uchar' for field_terminator and line_terminator on all platforms.
Also: remove some unnecessary casts, leaving the ones we
actually need.
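A hypothetical illustration of why the signedness matters:
a terminator byte with the high bit set (0xAA below)
compares as negative when 'char' is signed, which broke
terminator matching on such platforms:
  SELECT 'a', 'b' INTO OUTFILE '/tmp/bug.txt'
    FIELDS TERMINATED BY X'AA';
  -- reading it back must recognize the same terminator byte
  LOAD DATA INFILE '/tmp/bug.txt' INTO TABLE t1
    FIELDS TERMINATED BY X'AA';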
MDEV-5980: EITS: if condition is used for REF access, its selectivity is still in filtered%
MDEV-5985: EITS: selectivity estimates look illogical for join and non-key equalities
MDEV-6003: EITS: ref access, keypart2=const vs keypart2=expr - inconsistent filtered% value
- Made a number of fixes in table_cond_selectivity() so that it returns
correct selectivity estimates.
- Added comments in related code.
Better comments
Back-ported from the mysql 5.6 code line the patch with
the following comment:
Fix for Bug#11757108 CHANGE IN EXECUTION PLAN FOR COUNT_DISTINCT_GROUP_ON_KEY
CAUSES PERFORMANCE REGRESSION
The cause of the performance regression is that the access
strategy for the GROUP BY query changed from using "index
scan" in mysql-5.1 to using "loose index scan" in mysql-5.5.
The index used for group by is unique and thus each
"loose scan" group will only contain one record. Since loose scan needs to
re-position on each "loose scan" group this query will do a re-position for
each index entry. Compared to just reading the next index entry as a normal
index scan does, the use of loose scan for this query becomes more expensive.
The cause for selecting to use loose scan for this query is that in the current
code when the size of the "loose scan" group is one, the formula for
calculating the cost estimates becomes almost identical to the cost of using
normal index scan. Differences in use of integer versus floating point arithmetic
can cause one or the other access strategy to be selected.
The main issue with the formula for estimating the cost of using loose scan is
that it does not take into account that it is more costly to do a re-position
for each "loose scan" group compared to just reading the next index entry.
Both index scan and loose scan estimate the cpu cost as:
"number of entries needed to read/scan" * ROW_EVALUATE_COST
The results from testing with the query in this bug indicate
that the real cost of doing a re-position is four to eight
times higher than just reading the next index entry. Thus,
the cpu cost estimate for loose scan should be increased.
To account for the extra work to re-position in the index we increase the
cost for loose index scan to include the cost of navigating the index.
This is modelled as a function of the height of the b-tree:
navigation cost= ceil(log(records in table)/log(indexes per block))
* ROWID_COMPARE_COST;
This will avoid loose index scan being used for indexes where the "loose scan"
group contains very few index entries.
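A worked example of the new term (numbers made up): for an
index with 1,000,000 entries and about 100 entries per
block,
  navigation cost = ceil(log(1000000)/log(100)) * ROWID_COMPARE_COST
                  = ceil(3) * ROWID_COMPARE_COST
                  = 3 * ROWID_COMPARE_COST
i.e. each re-position is charged for descending a b-tree of
height 3 instead of being treated like a single index-entry
read.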
Back-ported the patch for bug #13256831 from the mysql-5.6 code line.
Here's the comment this patch was provided with:
Fixed bug#13256831 - ERROR 1032 (HY000): CAN'T FIND RECORD.
This bug only occurs if a user tries to update a base table using
an updatable view and this view was created as a join for which
the clause 'WITH CHECK OPTION' was specified.
The reason for the bug was that when such an update was
executed, positioning to the row being updated was not
performed for tables that were specified in the view
definition and were not themselves updated, but had
constraints that had to be checked due to the
'WITH CHECK OPTION' clause.
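A hypothetical sketch of the affected setup (names made
up):
  CREATE TABLE t1 (id INT PRIMARY KEY, b INT);
  CREATE TABLE t2 (id INT PRIMARY KEY, c INT);
  CREATE VIEW v1 AS
    SELECT t1.id, t1.b, t2.c FROM t1 JOIN t2 ON t1.id = t2.id
    WHERE t2.c < 100 WITH CHECK OPTION;
  -- Updates only t1, but the current t2 row must still be
  -- positioned so that the t2.c < 100 check can be evaluated:
  UPDATE v1 SET b = b + 1;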
Both bugs are caused by the same problem: the function optimize_cond() should
update the value of *cond_equal rather than the value of join->cond_equal,
because it is called not only for the WHERE condition, but for the HAVING
condition as well.
Add a testcase and backport this fix:
Bug#14338686: MYSQL IS GENERATING DIFFERENT AND SLOWER
(IN NEWER VERSIONS) EXECUTION PLAN
PROBLEM:
While checking for an index to sort for the order by clause
in this query
"SELECT datestamp FROM contractStatusHistory WHERE
contract_id = contracts.id ORDER BY datestamp asc limit 1;"
we do not calculate the number of rows to be examined correctly.
As a result we choose index 'idx_contractStatusHistory_datestamp'
defined on the 'datestamp' field, rather than choosing index
'contract_id'. And hence the lower performance.
ANALYSIS:
While checking whether an index is present to give the
records in sorted order (datestamp), we consider the
selectivity of the 'ref_key' (contract_id here) using
'table->quick_condition_rows'.
'ref_key' here can be an index from 'REF_ACCESS' or from 'RANGE'.
As this is a 'REF_ACCESS', 'table->quick_condition_rows'
is not set to the actual value, which is 2. Instead it is
set to the number of tuples present in the table,
indicating that every selected row would satisfy the
condition present in the query. Hence the selectivity
becomes 1 even when we choose the index on the order by
column instead of the join condition.
But in reality, as only 2 rows satisfy the condition, we
would need to examine half of the entire data set to get
one tuple when choosing the index on the order by column.
Had we chosen the 'REF_ACCESS' we would have examined only
2 tuples.
Hence the delay in executing the query specified.
FIX:
While calculating the selectivity of the ref_key:
For REF_ACCESS consider quick_rows[ref_key] if range
optimizer has an estimate for this key. Else consider
'rec_per_key' statistic.
For RANGE ACCESS consider 'table->quick_condition_rows'.
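Sketched against the query from the bug (simplified to a
constant; in the original the value comes from the outer
reference contracts.id):
  EXPLAIN SELECT datestamp FROM contractStatusHistory
  WHERE contract_id = 100 ORDER BY datestamp ASC LIMIT 1;
  -- expected after the fix: ref access on the 'contract_id'
  -- index rather than an index scan on
  -- 'idx_contractStatusHistory_datestamp'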
- Make JOIN::const_key_parts include keyparts for which
the WHERE clause has an equality of the form
"t.key_part=reference_outside_this_select"
- This allows us to avoid filesort in some cases (and also
to avoid a difficult choice between using filesort and
using an index); see the sketch below.
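For illustration (hypothetical schema), with an index on
t2(kp1, kp2) the equality on kp1 refers outside the
subquery, so kp1 is effectively constant there and ordering
by kp2 alone can use the index without filesort:
  CREATE TABLE t2 (kp1 INT, kp2 INT, KEY k(kp1, kp2));
  SELECT t1.a,
         (SELECT t2.kp2 FROM t2
           WHERE t2.kp1 = t1.a      -- equality to outer reference
           ORDER BY t2.kp2 LIMIT 1) -- index k provides this order
  FROM t1;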
The problem was that the view substitutes its fields on
prepare and reverts the change after execution. During
optimization after prepare, the exists2in conversion
substituted an argument of '=' with the constant '1', but
then one of the arguments of '=' was reverted back to the
view field reference. This led to an incorrect WHERE
condition on the second execution.
To fix the problem, we now replace the whole '=' with '1'
permanently.
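A hypothetical sketch of the failing pattern (names made
up), where the wrong result appeared on re-execution of a
prepared statement once the exists2in rewrite had fired:
  CREATE VIEW v1 AS SELECT a AS va FROM t1;
  PREPARE s FROM
    'SELECT * FROM t2
      WHERE EXISTS (SELECT 1 FROM v1 WHERE va = t2.b)';
  EXECUTE s;  -- correct result
  EXECUTE s;  -- before the fix: incorrect WHERE after the view
              -- field was restored over the substituted constant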
MDEV-5984: EITS: Incorrect filtered% value for single-table select with range access
- Fix calculate_cond_selectivity_for_table() to work correctly with range accesses
over multi-component keys:
= First, take the selectivity of all possible range scans
into account. Remember which fields were used by the range
scans.
= Then, calculate selectivity produced by sargable predicates on fields. If a
field was used in a possible range access, assume its selectivity is already
taken into account.
- Fix table_cond_selectivity(): when a quick select is
used, the selectivity of COND(table) is already taken into
account in matching_candidates_in_table(), so
table_cond_selectivity() should not apply it a second time.
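For illustration (hypothetical schema), with a
two-component key the selectivity of each column must be
counted exactly once:
  CREATE TABLE t1 (a INT, b INT, c INT, KEY k(a, b));
  SELECT * FROM t1 WHERE a = 10 AND b < 5;
  -- The range scan over k(a, b) already accounts for both
  -- columns, so the per-column selectivity of b must not be
  -- multiplied into filtered% a second time.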