is reached
Problem was bad error handling, leaving some new temporary
partitions locked and initialized and some not yet initialized
and locked, leading to a crash when trying to unlock the not
yet initialized and locked partitions
Solution was to unlock the already locked partitions, and not
include any of the new temporary partitions in later unlocks
"insert into.. select * from"
When inserting into a partitioned table using 'insert into
<target> select * from <src>', read_buffer_size bytes of memory
are allocated for each partition in the target table.
This resulted in large memory consumption when the number of
partitions are high.
This patch introduces a new method which tries to estimate the
buffer size required for each partition and limits the maximum
buffer size used to maximum of 10 * read_buffer_size,
11 * read_buffer_size in case of monotonic partition functions.
(Backport)
Problem is that when insert (ha_start_bulk_insert) in i partitioned table,
it will call ha_start_bulk_insert for every partition, used or not.
Solution is to delay the call to the partitions ha_start_bulk_insert until
the first row is to be inserted into that partition
Inserting a negative value in the autoincrement column of a
partitioned innodb table was causing the value of the auto
increment counter to wrap around into a very large positive
value. The consequences are the same as if a very large positive
value was inserted into a column, e.g. reduced autoincrement
range, failure to read autoincrement counter.
The current patch ensures that before calculating the next
auto increment value, the current value is within the positive
maximum allowed limit.
INSERT ... SELECT ...
Problem was that when bulk insert is used on an empty
table/partition, it disables the indexes for better
performance, but in this specific case it also tries
to read from that partition using an index, which is
not possible since it has been disabled.
Solution was to allow index reads on disabled indexes
if there are no records.
Also reverted the patch for bug#38005, since that was a workaround
in the partitioning engine instead of a fix in myisam.
column on partitioned table
An assertion 'ASSERT_COULUMN_MARKED_FOR_READ' is failed if the query
is executed with index containing double column on partitioned table.
The problem is that assertion expects all the fields which are read,
to be in the read_set.
In this query only the field 'a' is in the readset as the tables in
the query are joined by the field 'a' and so the assertion fails
expecting other field 'b'.
Since the function cmp() is just comparison of two parameters passed,
the assertion is not required.
Fixed by removing the assertion in the double fields comparision
function and also fixed the index initialization to do ordered
index scan with RW LOCK which ensures all the fields from a key are in
the read_set.
Note: this bug is not reproducible with other datatypes because the
assertion doesn't exist in comparision function for other
datatypes.
We disallow the partitioning of a log table. You could however
partition a table first, and then point logging to it. This is
not only against the docs, it also crashes the server.
We catch this case now.
Problem was that a failing rename just left the partitions at the state
it was at the failure.
Solution was to try to revert the started rename if a failure occured.
NOTE: This undoes changes by BUG#42829 in sql_class.cc:binlog_query().
I will revert the change in a post-push fix (the binlog filter should
be checked in sql_base.cc:decide_logging_format()).
General overview:
The logic for switching to row format when binlog_format=MIXED had
numerous flaws. The underlying problem was the lack of a consistent
architecture.
General purpose of this changeset:
This changeset introduces an architecture for switching to row format
when binlog_format=MIXED. It enforces the architecture where it has
to. It leaves some bugs to be fixed later. It adds extensive tests to
verify that unsafe statements work as expected and that appropriate
errors are produced by problems with the selection of binlog format.
It was not practical to split this into smaller pieces of work.
Problem 1:
To determine the logging mode, the code has to take several parameters
into account (namely: (1) the value of binlog_format; (2) the
capabilities of the engines; (3) the type of the current statement:
normal, unsafe, or row injection). These parameters may conflict in
several ways, namely:
- binlog_format=STATEMENT for a row injection
- binlog_format=STATEMENT for an unsafe statement
- binlog_format=STATEMENT for an engine only supporting row logging
- binlog_format=ROW for an engine only supporting statement logging
- statement is unsafe and engine does not support row logging
- row injection in a table that does not support statement logging
- statement modifies one table that does not support row logging and
one that does not support statement logging
Several of these conflicts were not detected, or were detected with
an inappropriate error message. The problem of BUG#39934 was that no
appropriate error message was written for the case when an engine
only supporting row logging executed a row injection with
binlog_format=ROW. However, all above cases must be handled.
Fix 1:
Introduce new error codes (sql/share/errmsg.txt). Ensure that all
conditions are detected and handled in decide_logging_format()
Problem 2:
The binlog format shall be determined once per statement, in
decide_logging_format(). It shall not be changed before or after that.
Before decide_logging_format() is called, all information necessary to
determine the logging format must be available. This principle ensures
that all unsafe statements are handled in a consistent way.
However, this principle is not followed:
thd->set_current_stmt_binlog_row_based_if_mixed() is called in several
places, including from code executing UPDATE..LIMIT,
INSERT..SELECT..LIMIT, DELETE..LIMIT, INSERT DELAYED, and
SET @@binlog_format. After Problem 1 was fixed, that caused
inconsistencies where these unsafe statements would not print the
appropriate warnings or errors for some of the conflicts.
Fix 2:
Remove calls to THD::set_current_stmt_binlog_row_based_if_mixed() from
code executed after decide_logging_format(). Compensate by calling the
set_current_stmt_unsafe() at parse time. This way, all unsafe statements
are detected by decide_logging_format().
Problem 3:
INSERT DELAYED is not unsafe: it is logged in statement format even if
binlog_format=MIXED, and no warning is printed even if
binlog_format=STATEMENT. This is BUG#45825.
Fix 3:
Made INSERT DELAYED set itself to unsafe at parse time. This allows
decide_logging_format() to detect that a warning should be printed or
the binlog_format changed.
Problem 4:
LIMIT clause were not marked as unsafe when executed inside stored
functions/triggers/views/prepared statements. This is
BUG#45785.
Fix 4:
Make statements containing the LIMIT clause marked as unsafe at
parse time, instead of at execution time. This allows propagating
unsafe-ness to the view.
the auto_increment value
This is an alternative patch that instead of allowing RECREATE TABLE
on TRUNCATE TABLE it implements reset_auto_increment that is called
after delete_all_rows.
Note: this bug was fixed by Mattias Jonsson:
Pusing this patch: http://lists.mysql.com/commits/70370
Problem is that DATA_FREE in SHOW TABLE STATUS
is not correct when not using innodb_file_per_table.
The solution is to use I_S.PARTITIONS instead.
This is only a small fix for correcting mean record length and
always return 0 if the table is empty.
'lock wait timeout exceeded'
Problem was a bug in the implementation of scan in partitioning
which masked the error code from the partition's handler.
Fixed by returning the value from the underlying handler.
mysql-test/t/partition.test
sql/ha_partition.cc
Bug#40954: Crash in MyISAM index code with concurrency test using partitioned tables
Problem was usage of read_range_first with an empty key.
Solution was to not to give a key if it was empty. (real author Mattias Jonsson)
storage/archive/archive_reader.c
client/mysqlslap.c
Aligned the copyright texts output from "--version" of tools, to
let internal tools be able to change them if needed.
storage/ndb/test/tools/connect.cpp
storage/ndb/test/run-test/atrt.hpp
Corrected a few GPL headers not restricted to GPL version 2
Makefile.am
Added missing --report-features to the 'test-bt-fast' target
support-files/mysql.spec.sh
Reversed the removal of the "%define license GPL" in as internal
tools depended on it
on tables with partitions
Problem was that the handler function try_semi_consistent_read
was not propagated to the innodb handler.
Solution was to implement that function in the partitioning
handler.
order by
Problem was that the first index read was unordered,
and the next was ordered, resulting in use of
uninitialized data.
Solution was to use the correct variable to see if
the 'next' call should be ordered or not.
Problem was that partitioning cached the table flags.
These flags could change due to TRANSACTION LEVEL changes.
Solution was to remove the cache and always return the table flags
from the first partition (if the handler was initialized).
breaks auto increment
The auto_increment value was not initialized if
the first statement after opening a table was
an 'UPDATE'.
solution was to check initialize if it was not,
before trying to increase it in update.
on non-partitioned table
Problem was that partitioning specific commands was accepted
for non partitioned tables and treated like
ANALYZE/CHECK/OPTIMIZE/REPAIR TABLE, after bug-20129 was fixed,
which changed the code path from mysql_alter_table to
mysql_admin_table.
Solution was to check if the table was partitioned before
trying to execute the admin command
index column
There was actually two problems
1) when clustered pk, order by non pk index should also
compare with pk as last resort to differ keys from each
other
2) bug in the index search handling in ha_partition (was
found when extending the test case
Solution to 1 was to include the pk in key compare if
clustered pk and search on other index.
Solution for 2 was to remove the optimization from
ordered scan to unordered scan if clustered pk.
MyISAM blocks index usage for bulk insert into zero-records tables.
See ha_myisam::start_bulk_insert() lines from
...
if (file->state->records == 0 ...
...
That causes problems for partition engine when some partitions have records some not
as the engine uses same access method for all partitions.
Now partition engine doesn't call index_first/index_last
for empty tables.
per-file comments:
mysql-test/r/partition.result
Bug#38005 Partitions: error with insert select.
test result
mysql-test/t/partition.test
Bug#38005 Partitions: error with insert select.
test case
sql/ha_partition.cc
Bug#38005 Partitions: error with insert select.
ha_engine::index_first and
ha_engine::index_last not called for empty tables.
InnoDB Plugin locks table
The fast/on-line add/drop index handler calls was not implemented
whithin the partitioning.
This implements it in the partitioning handler.
Since this is only used by the not included InnoDB plugin, there
is no test case. (Have tested it manually with the plugin, and
it does not allow unique indexes not including partitioning
function, or removal of pk, which in innodb generates a new pk,
which is not in the partitioning function.)
NOTE: This introduces a new handler method, and because of that
changes the storage engine api. (One cannot use a handlerton to
see the capabilities of a table's handler if it is partitioned.
So I added a wrapper function in the handler that defaults to
the handlerton function, which the partitioning handler overrides.