ht->start_consistent_snapshot() is also not an option, because some engines
(e.g. RocksDB) only support it for read-only transactions.
Instead, downgrade the lock after reading the first row
(which implicitly opens a read view).
* Log rows in online_alter_binlog.
* Table online data is replicated within a dedicated binlog file.
* Cached data is written on commit.
* Versioning is fully supported.
* Works both with and without binlog enabled.
* For now, setting savepoints is forbidden while an ONLINE ALTER is in progress.
Extra support would be required: we could simply log the SAVEPOINT query events
and replicate them together with the row events, but this is not implemented
for now.
* Cache flipping:
We want to address the possible bottleneck in the online alter binlog
reading/writing in advance.
IO_CACHE does not provide anything better than sequential access;
besides, only a single write is mutex-protected, which is not suitable,
since we need to write a transaction atomically.
To solve this, a special layer on top of Event_log is implemented.
There are two IO_CACHE files underneath: one for reading, and one for
writing.
Once the read cache is empty, an exclusive lock is acquired (we may have to
wait for a currently active transaction to finish writing), and a flip() is
performed: the write cache is reopened for reading, and the read cache is
emptied and reopened for writing.
This resembles the buffer flip that happens in accelerated graphics
(DirectX/OpenGL/etc).
In this sense, Cache_flip_event_log is considered non-blocking for a single
reader and a single writer, with the only lock being the one held by the
reader during the flip.
An alternative approach, implementing a fair concurrent circular buffer,
is described in MDEV-24676.
* Cache managers:
We have two cache sinks: statement and transactional.
It is important that the changes are first cached per-statement and
per-transaction.
If a statement fails, then only the statement data is rolled back; the
transaction moves along, however.
It turns out there is no guarantee that a TABLE will persist in
thd->open_tables until the transaction commit.
If an error occurs, the statement's tables are purged.
Therefore, we can't store the caches in TABLE. Ideally, they should live in
the handlerton, but we cut a corner and store them in a list in THD.
Event_log is supposed to be a basic logging class that can write events to
a single file.
MYSQL_BIN_LOG, in comparison, will have:
* rotation support
* index files
* purging
* gtid and transactional information handling.
* a dedicated role as the general-purpose binlog
It was redundant, duplicating vcol_type == VCOL_GENERATED_STORED.
Note that VCOL_DEFAULT is not "stored"; "stored vcol" means that after
rnd_next or index_read/etc. the field value is already in record[0]
and does not need to be calculated separately.
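As a rough SQL-level illustration of the distinction (my own example, not
taken from the patch):

  CREATE TABLE t1 (
    a INT,
    s INT AS (a + 1) STORED,     -- "stored vcol": materialized in the row and
                                 -- read back directly after rnd_next/index_read
    v INT AS (a + 1) VIRTUAL,    -- computed on demand, not kept in record[0]
    d DATE DEFAULT (CURDATE())   -- expression default (VCOL_DEFAULT), not a
                                 -- "stored vcol" in the sense above
  );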
Make the TRANSACTIONAL table option behave similarly to other engine-defined
table options. If the engine doesn't support it:
* if specified explicitly in CREATE or ALTER - it's ER_UNKNOWN_OPTION
* that is an error or a warning, depending on sql_mode IGNORE_BAD_TABLE_OPTIONS
* in ALTER TABLE from an engine that supports it to an engine that
doesn't - silently preserved (no warning)
* it is commented out in SHOW CREATE unless IGNORE_BAD_TABLE_OPTIONS
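A rough SQL sketch of the intended behavior (the engine choices below are my
assumptions for illustration; Aria is an engine that does use TRANSACTIONAL):

  -- assuming MyISAM does not support the TRANSACTIONAL option
  CREATE TABLE t1 (a INT) ENGINE=MyISAM TRANSACTIONAL=1;
  -- expected: ER_UNKNOWN_OPTION, or only a warning when
  -- @@sql_mode contains IGNORE_BAD_TABLE_OPTIONS

  -- t_aria is assumed to be an existing Aria table with TRANSACTIONAL=1
  ALTER TABLE t_aria ENGINE=MyISAM;   -- option silently preserved, no warning
  SHOW CREATE TABLE t_aria;           -- option shown commented out unless
                                      -- IGNORE_BAD_TABLE_OPTIONS is set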
* invoke check_expression() for all vcol_info's in
mysql_prepare_create_table() to check for FK CASCADE
* also check for SET NULL and SET DEFAULT (see the illustration after this list)
* to check against existing FKs when a vcol is added in ALTER TABLE,
old FKs must be added to the new_key_list just like other indexes are
* check columns recursively: if vcol1 references vcol2,
the flags of vcol2 must be taken into account
* remove check_table_name_processor(), put that logic under
check_vcol_func_processor() to avoid walking the tree twice
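A hedged illustration of the kind of definition these checks are meant to
reject (table names and the exact error are my assumptions, not taken from
the patch):

  CREATE TABLE parent (id INT PRIMARY KEY) ENGINE=InnoDB;
  CREATE TABLE child (
    p_id INT,
    v INT AS (p_id) STORED,
    FOREIGN KEY (p_id) REFERENCES parent (id) ON DELETE SET NULL
  ) ENGINE=InnoDB;
  -- expected to be rejected: the vcol depends on p_id, whose value a
  -- CASCADE/SET NULL/SET DEFAULT action can change behind the server's back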
Mark old keys in ALTER TABLE with the `old` flag, not with
`key_create_info.check_for_duplicate_indexes`.
This makes it possible to mark old foreign keys too.
Reacting differently to SQL_MODE leads to unusable SHOW CREATE output.
Use abort_on_warning dependent on strict mode when creating the new table,
like it is done for copying data and for inplace ALTER.
The DBUG_ASSERT in HA_CREATE_INFO::resolve_to_charset_collation_context()
didn't take into account that a second execution is possible not only
during a prepared EXECUTE, but also during a CALL.
This patch adds a way to override default collations
(or "character set collations") for desired character sets.
The SQL standard says:
> Each collation known in an SQL-environment is applicable to one
> or more character sets, and for each character set, one or more
> collations are applicable to it, one of which is associated with
> it as its character set collation.
In MariaDB, character set collations have been hard-coded so far,
e.g. utf8mb4_general_ci has been the hard-coded character set collation
for utf8mb4.
This patch makes it possible to override (globally per server, or per session)
character set collations, so that, for example, uca1400_ai_ci can be set as the
character set collation for Unicode character sets
(instead of the compiled-in xxx_general_ci).
The array of overridden character set collations is stored in a new
(session and global) system variable @@character_set_collations and
can be set as a comma separated list of charset=collation pairs, e.g.:
SET @@character_set_collations='utf8mb3=uca1400_ai_ci,utf8mb4=uca1400_ai_ci';
The variable is empty by default, which means using the hard-coded
character set collations (e.g. utf8mb4_general_ci for utf8mb4).
The variable can also be set globally by passing it on the server startup
command line, and/or in my.cnf.
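A short usage sketch (the resulting full collation name reflects my
expectation of how the override resolves, not output quoted from the patch):

  SET @@character_set_collations='utf8mb4=uca1400_ai_ci';
  CREATE TABLE t1 (a CHAR(10) CHARACTER SET utf8mb4);
  -- the column is expected to get utf8mb4_uca1400_ai_ci instead of
  -- the hard-coded default utf8mb4_general_ci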
- When foreign_key_checks is disabled, allowing modification of a
column which is part of a foreign key constraint can lead to a later
refusal of TRUNCATE TABLE or OPTIMIZE TABLE. So it makes sense to
block the column modify operation when a foreign key is involved,
irrespective of the foreign_key_checks variable.
The correct way to modify the charset of a column when an FK is involved:
SET foreign_key_checks=OFF;
ALTER TABLE child DROP FOREIGN KEY fk, MODIFY m VARCHAR(200) CHARSET utf8mb4;
ALTER TABLE parent MODIFY m VARCHAR(200) CHARSET utf8mb4;
ALTER TABLE child ADD CONSTRAINT FOREIGN KEY (m) REFERENCES parent(m);
SET foreign_key_checks=ON;
fk_check_column_changes(): Stop taking FOREIGN_KEY_CHECKS into account while
checking a column change against foreign key constraints. This
is a partial revert of commit 5f1f2fc0e443f098af24d21f7d1ec1a8166a4030
and it changes the behaviour of the copy ALTER algorithm.
ha_innobase::prepare_inplace_alter_table(): Find the modified
column and check whether it is part of an existing or newly
added foreign key constraint.
The problem for Galera is the fact that sequences are not really
transactional. A sequence operation is committed immediately
in sql_sequence.cc, and later Galera could find out that
we have changes but the actual statement is not there anymore.
Therefore, we must place some restrictions on what kinds
of sequences Galera can support.
(1) Galera cluster supports only sequences implemented
by the InnoDB storage engine. This is because Galera replication
currently supports only InnoDB.
(2) We do not allow LOCK TABLE on a sequence object, and
we do not allow sequence creation under LOCK TABLE; instead,
the lock is released and we issue a warning.
(3) We allow sequences with a NOCACHE definition or with an
INCREMENT BY 0 CACHE=n definition. This makes sure that
sequence values are unique across the Galera cluster.
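For illustration, definitions satisfying these rules could look like this
(my own examples, not taken from the patch; the comment on INCREMENT BY 0
reflects my understanding of its documented behaviour):

  CREATE SEQUENCE s1 NOCACHE ENGINE=InnoDB;
  CREATE SEQUENCE s2 INCREMENT BY 0 CACHE 10 ENGINE=InnoDB;
  -- INCREMENT BY 0 makes the sequence follow auto_increment_increment and
  -- auto_increment_offset, so each node generates non-overlapping values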
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
Type_handler::partition_field_append_value() erroneously
passed the address of my_collation_contextually_typed_binary
to conversion functions copy_and_convert() and my_convert().
This happened because generate_partition_syntax_for_frm()
was called from mysql_create_frm_image() at a stage when
the fields in List<Create_field> could still contain unresolved
contextual collations, like "binary" in the reported crash scenario:
ALTER TABLE t CHANGE COLUMN a a CHAR BINARY;
Fix:
1. Splitting mysql_prepare_create_table() into two parts:
- mysql_prepare_create_table_stage1() iterates through
List<Create_field> and calls Create_field::prepare_stage1(),
which performs basic attribute initialization, including
contextual collation resolution.
- mysql_prepare_create_table_finalize() - the rest of the
old mysql_prepare_create_table() code.
2. Changing mysql_create_frm_image():
It now calls:
- mysql_prepare_create_table_stage1() at the very
beginning, before the partition-related code.
- mysql_prepare_create_table_finalize() at the end,
instead of the old mysql_prepare_create_table() call.
3. Adding mysql_prepare_create_table() as a wrapper
for two calls:
mysql_prepare_create_table_stage1() ||
mysql_prepare_create_table_finalize()
so the code stays unchanged in the other places
where mysql_prepare_create_table() was used.
4. Changing prototype for Type_handler::Column_definition_prepare_stage1()
Removing arguments:
- handler *file
- ulonglong table_flags
Adding a new argument instead:
- column_definition_type_t type
This makes it possible to call Column_definition_prepare_stage1(), and
therefore mysql_prepare_create_table_stage1(),
before a handler is instantiated.
This simplifies the code, because in the case of a partitioned table,
mysql_create_frm_image() creates a handler of the underlying
partition first, then frees it and creates a ha_partition
instance instead.
Before the fix, mysql_prepare_create_table() was called with the final
(ha_partition) handler.
5. Moving parts of Column_definition_prepare_stage1() which
need a pointer to handler and table_flags to
Column_definition_prepare_stage2().
Extended keys work by first checking if the engine supports extended
keys.
If yes, we extend the secondary keys with the primary key components and
mark the secondary keys as HA_EXT_NOSAME (unique).
If we later notice that there was no primary key, the extended key
information for secondary keys in share->key_info is reset. However, the
HA_EXT_NOSAME flag in key_info->flags was not reset!
This causes some strange things to happen:
- Tables that have no primary key, or a secondary index that contained the
primary key, would be wrongly optimized, as the secondary key could be
thought to be unique when it was not, and not unique when it was.
- The problem was not shown in EXPLAIN because of a bug in
create_ref_for_key() that caused EQ_REF to be displayed by EXPLAIN as REF
when extended keys were used and the secondary key contained the primary
key.
This is fixed with:
- Removed a wrong test in make_join_select() which did not detect that a key
was unique when a secondary key contained the primary key.
- Moved initialization of extended keys from create_key_infos() to
init_from_binary_frm_image(), after we know if there is a usable primary
key or not. One disadvantage with this approach is that
key_info->key_parts may have unused slots (for keys we thought could
be extended but could not). Fixed by adding a check for unused key_parts
to copy_keys_from_share().
Other things:
- Simplified copying of first key part in create_key_infos().
- Added a lot of code comments in code that I had to check as part of
finding the issue.
- Fixed some indentation.
- Replaced a couple of uses of references to pointers in C
context where the reference does not give any benefit.
- Updated Aria and Maria to not assume that all key_info->rec_per_key
arrays are in one memory block (this could happen when using derived
tables with many keys).
- Fixed a bug where key_info->rec_per_key was not allocated.
- Optimized TABLE::add_tmp_key() to only call alloc() once.
(No logic changes)
Test case changes:
- innodb_mysql.test: changed an index, as an index the optimizer thought
was unique was not (the table had no primary key).
TODO:
- Move the code that checks for partial or too long keys to the earlier
primary loop that initially decides if we should add extended key fields.
This is needed to ensure that HA_EXT_NOSAME is not set for partial or
too long keys. It will also shorten the current code notably.
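A minimal SQL sketch of the affected case (my own example; the real test
change is in innodb_mysql.test):

  CREATE TABLE t1 (a INT, b INT, KEY k1 (a)) ENGINE=InnoDB;  -- no primary key
  -- before the fix, k1 could keep HA_EXT_NOSAME after the extended key
  -- information was reset, so the optimizer could treat lookups on t1.a
  -- as if at most one row matched per key value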