When executing a query like "select id, 0 as const, val from ...", there are 3 columns(items) in Query->select at handlerton->create_group_by(). After that, MariaDB makes a temporary table with 2 columns. The skipped items are const item, so fixing Spider to skip const items for items at Query->select.
Prototype change:
- virtual ha_rows records_in_range(uint inx, key_range *min_key,
- key_range *max_key)
+ virtual ha_rows records_in_range(uint inx, const key_range *min_key,
+ const key_range *max_key,
+ page_range *res)
The handler can ignore the page_range parameter. In the case the handler
updates the parameter, the optimizer can deduce the following:
- If previous range's last key is on the same block as next range's first
key
- If the current key range is in one block
- We can also assume that the first and last block read are cached!
This can be used for a better calculation of IO seeks when we
estimate the cost of a range index scan.
The parameter is fully implemented for MyISAM, Aria and InnoDB.
A separate patch will update handler::multi_range_read_info_const() to
take the benefits of this change and also remove the double
records_in_range() calls that are not anymore needed.
This was to remove a performance regression between 10.3 and 10.4
In 10.5 we will have a better implementation of records_in_range
that will enable us to get more statistics.
This change was not done in 10.4 because the 10.5 will be part of
a larger change that is not suitable for the GA 10.4 version
Other things:
- Changed default handler block_size to 8192 to fix things statistics
for engines that doesn't set the block size.
- Fixed a bug in spider when using multiple part const ranges
(Patch from Kentoku)
The MDEV-20265 commit e746f451d5
introduces DBUG_ASSERT(right_op == r_tbl) in
st_select_lex::add_cross_joined_table(), and that assertion would
fail in several tests that exercise joins. That commit was skipped
in this merge, and a separate fix of MDEV-20265 will be necessary in 10.4.
Add the following parameter.
- spider_sync_sql_mode
Local sql_mode synchronous existence to remote server.
0 : It doesn't synchronize.
1 : It synchronizes.
The default value is 1
followup for be5c432a42
ha_partition::calculate_checksum() has to invoke calculate_checksum()
for partitions unconditionally, not under (HA_HAS_OLD_CHECKSUM | HA_HAS_NEW_CHECKSUM).
Because the server uses ::info() to ask for a live checksum, while
calculate_checksum() must, precisely, calculate it the slow way,
also for tables that don't have the live checksum at all.
Also, fix the compilation on Windows (ha_checksum/ulonglong type mix).
test case:
t1 is a spider engine table;
CREATE TABLE `t1` (
`id` int(11) NOT NULL DEFAULT '0',
`name` char(64) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=SPIDER
query: "select * from t1 where name not like 'x%' " would dispatch "select xxx name name like 'x%' " to remote mysqld, is wrong
Add the following parameters.
- spider_remote_wait_timeout
Set remote wait_timeout at connecting for improvement performance of
connection if you know.
-1,0 : No set.
1 or more : Seconds.
The default value is -1
- spider_wait_timeout
The wait time to remote servers.
-1,0 : No set.
1 or more : Seconds.
The default value is 604800
Fix the following valgrind error.
==94390== Thread 29:
==94390== Invalid read of size 8
==94390== at 0x78389D: thd_increment_bytes_sent (sql_class.cc:4265)
==94390== by 0xC8EC46: net_real_write (net_serv.cc:730)
==94390== by 0xC8E0C8: net_flush (net_serv.cc:383)
==94390== by 0xC8E4D0: net_write_command (net_serv.cc:521)
==94390== by 0xADCE61: cli_advanced_command (client.c:468)
==94390== by 0xAE3CAF: mysql_close_slow_part (client.c:3671)
==94390== by 0xAE3D28: mysql_close (client.c:3683)
==94390== by 0x149E69A8: spider_db_mbase::disconnect() (spd_db_mysql.cc:2217)
==94390== by 0x1491EA26: spider_db_disconnect(st_spider_conn*) (spd_db_conn.cc:297)
==94390== by 0x14948EBE: spider_free_conn_alloc(st_spider_conn*) (spd_conn.cc:196)
==94390== by 0x1494B26A: spider_free_conn(st_spider_conn*) (spd_conn.cc:1251)
==94390== by 0x1494941F: spider_free_conn_from_trx(st_spider_transaction*, st_spider_conn*, bool, bool, int*) (spd_conn.cc:315)
==94390== Address 0x1f0e0990 is 4,832 bytes inside a block of size 25,728 free'd
==94390== at 0x4C2ACBD: free (vg_replace_malloc.c:530)
==94390== by 0x13F5545: my_free (my_malloc.c:222)
==94390== by 0x6C75B7: ilink::operator delete(void*, unsigned long) (sql_list.h:618)
==94390== by 0x77B9F6: THD::~THD() (sql_class.cc:1724)
==94390== by 0x1494FCE0: spider_bg_conn_action(void*) (spd_conn.cc:2580)
==94390== by 0x4E3DDD4: start_thread (in /usr/lib64/libpthread-2.17.so)
==94390== by 0x5FBFEAC: clone (in /usr/lib64/libc-2.17.so)
==94390== Block was alloc'd at
==94390== at 0x4C29BC3: malloc (vg_replace_malloc.c:299)
==94390== by 0x13F4DFA: my_malloc (my_malloc.c:101)
==94390== by 0x1491CF06: ilink::operator new(unsigned long) (sql_list.h:614)
==94390== by 0x1494F7FD: spider_bg_conn_action(void*) (spd_conn.cc:2501)
==94390== by 0x4E3DDD4: start_thread (in /usr/lib64/libpthread-2.17.so)
==94390== by 0x5FBFEAC: clone (in /usr/lib64/libc-2.17.so)
==94390== Invalid write of size 8
==94390== at 0x7838AF: thd_increment_bytes_sent (sql_class.cc:4265)
==94390== by 0xC8EC46: net_real_write (net_serv.cc:730)
==94390== by 0xC8E0C8: net_flush (net_serv.cc:383)
==94390== by 0xC8E4D0: net_write_command (net_serv.cc:521)
==94390== by 0xADCE61: cli_advanced_command (client.c:468)
==94390== by 0xAE3CAF: mysql_close_slow_part (client.c:3671)
==94390== by 0xAE3D28: mysql_close (client.c:3683)
==94390== by 0x149E69A8: spider_db_mbase::disconnect() (spd_db_mysql.cc:2217)
==94390== by 0x1491EA26: spider_db_disconnect(st_spider_conn*) (spd_db_conn.cc:297)
==94390== by 0x14948EBE: spider_free_conn_alloc(st_spider_conn*) (spd_conn.cc:196)
==94390== by 0x1494B26A: spider_free_conn(st_spider_conn*) (spd_conn.cc:1251)
==94390== by 0x1494941F: spider_free_conn_from_trx(st_spider_transaction*, st_spider_conn*, bool, bool, int*) (spd_conn.cc:315)
==94390== Address 0x1f0e0990 is 4,832 bytes inside a block of size 25,728 free'd
==94390== at 0x4C2ACBD: free (vg_replace_malloc.c:530)
==94390== by 0x13F5545: my_free (my_malloc.c:222)
==94390== by 0x6C75B7: ilink::operator delete(void*, unsigned long) (sql_list.h:618)
==94390== by 0x77B9F6: THD::~THD() (sql_class.cc:1724)
==94390== by 0x1494FCE0: spider_bg_conn_action(void*) (spd_conn.cc:2580)
==94390== by 0x4E3DDD4: start_thread (in /usr/lib64/libpthread-2.17.so)
==94390== by 0x5FBFEAC: clone (in /usr/lib64/libc-2.17.so)
==94390== Block was alloc'd at
==94390== at 0x4C29BC3: malloc (vg_replace_malloc.c:299)
==94390== by 0x13F4DFA: my_malloc (my_malloc.c:101)
==94390== by 0x1491CF06: ilink::operator new(unsigned long) (sql_list.h:614)
==94390== by 0x1494F7FD: spider_bg_conn_action(void*) (spd_conn.cc:2501)
==94390== by 0x4E3DDD4: start_thread (in /usr/lib64/libpthread-2.17.so)
==94390== by 0x5FBFEAC: clone (in /usr/lib64/libc-2.17.so)
This patch implements engine independent unique hash index.
Usage:- Unique HASH index can be created automatically for blob/varchar/test column whose key
length > handler->max_key_length()
or it can be explicitly specified.
Automatic Creation:-
Create TABLE t1 (a blob unique);
Explicit Creation:-
Create TABLE t1 (a int , unique(a) using HASH);
Internal KEY_PART Representations:-
Long unique key_info will have 2 representations.
(lets understand this with an example create table t1(a blob, b blob , unique(a, b)); )
1. User Given Representation:- key_info->key_part array will be similar to what user has defined.
So in case of example it will have 2 key_parts (a, b)
2. Storage Engine Representation:- In this case there will be only one key_part and it will point to
HASH_FIELD. This key_part will be always after user defined key_parts.
So:- User Given Representation [a] [b] [hash_key_part]
key_info->key_part ----^
Storage Engine Representation [a] [b] [hash_key_part]
key_info->key_part ------------^
Table->s->key_info will have User Given Representation, While table->key_info will have Storage Engine
Representation.Representation can be changed into each other by calling re/setup_keyinfo_hash function.
Working:-
1. So when user specifies HASH_INDEX or key_length is > handler->max_key_length(), In mysql_prepare_create_table
One extra vfield is added (for each long unique key). And key_info->algorithm is set to HA_KEY_ALG_LONG_HASH.
2. In init_from_binary_frm_image values for hash_keypart is set (like fieldnr , field and flags)
3. In parse_vcol_defs, HASH_FIELD->vcol_info is created. Item_func_hash is used with list of Item_fields,
When Explicit length is given by user then Item_left is used to concatenate Item_field values.
4. In ha_write_row/ha_update_row check_duplicate_long_entry_key is called which will create the hash key from
table->record[0] and then call ha_index_read_map , if we found duplicated hash , we will compare the result
field by field.
Add a system variable spider_slave_trx_isolation.
- spider_slave_trx_isolation
The transaction isolation level when Spider table is used by slave SQL thread.
-1 : OFF
0 : READ UNCOMMITTED
1 : READ COMMITTED
2 : REPEATABLE READ
3 : SERIALIZABLE
The default value is -1
Miscellaneous Spider typos
Change default value of the followings
quick_mode 0 -> 3
quick_page_size 100 -> 1024
Add the following parameter for limiting result page size by byte
- quick_page_byte(qpb)
Number of bytes in a page when acquisition one by one.
When quick_mode is 1 or 2, Spider stores at least 1 record even if
quick_page_byte is smaller than 1 record. When quick_mode is 3,
quick_page_byte is used for judging using temporary table.
That is given to priority when server parameter spider_quick_page_byte
is set.
The default value is 10485760
Fix "out of sync" issue at using quick_mode = 1 or 2
The problem occurs because the statement generated by Spider used an
internal function name, ADD_TIME.
This problem has been corrected by the fix for bug MDEV-16878 within the
server, which enables Spider to generate the statement using the actual
SQL function name. I have made some additional changes within Spider to fix
related problems that I observed while testing.
Author:
Jacob Mathew.
First Reviewer:
Alexander Barkov.
Second Reviewer:
Kentoku Shiba.
The problem occurred because the Spider node was incorrectly handling
timestamp values sent to and received from the data nodes.
The problem has been corrected as follows:
- Added logic to set and maintain the UTC time zone on the data nodes.
To prevent timestamp ambiguity, it is necessary for the data nodes to use
a time zone such as UTC which does not have daylight savings time.
- Removed the spider_sync_time_zone configuration variable, which did not
solve the problem and which interfered with the solution.
- Added logic to convert to the UTC time zone all timestamp values sent to
and received from the data nodes. This is done for both unique and
non-unique timestamp columns. It is done for WHERE clauses, applying to
SELECT, UPDATE and DELETE statements, and for UPDATE columns.
- Disabled Spider's use of direct update when any of the columns to update is
a timestamp column. This is necessary to prevent false duplicate key value
errors.
- Added a new test spider.timestamp to thoroughly test Spider's handling of
timestamp values.
Author:
Jacob Mathew.
Reviewer:
Kentoku Shiba.
Cherry-Picked:
Commit 97cc9d3 on branch bb-10.3-MDEV-16246
When an attempt to connect to the remote server fails, Spider retries to
connect to the remote server 1000 times or until the connection attempt
succeeds. This is perceived as a hang if the remote server remains
unavailable.
I have introduced changes in Spider's table status handler to fix this problem.
Author:
Jacob Mathew.
Reviewer:
Kentoku Shiba.
Cherry-Picked:
Commit 6ee6933 on branch bb-10.3-MDEV-15712
When an attempt to connect to the remote server fails, Spider retries to
connect to the remote server 1000 times or until the connection attempt
succeeds. This is perceived as a hang if the remote server remains
unavailable.
I have introduced changes in Spider's table status handler to fix this problem.
Author:
Jacob Mathew.
Reviewer:
Kentoku Shiba.
- Max_index_length is supported by MyISAM and Aria tables.
- Temporary is a placeholder to signal that a table is a
temporary table. For the moment this is always "N", except
"Y" for generated information_schema tables and NULL for
views. Full temporary table support will be done in another task.
(No reason to have to update a lot of result files twice in a row)
Spider patches 026 (MDEV-7723), 031 (MDEV-7727) and 058 (MDEV-12532)
This allows the storage engine to internally compute sum and count
operations.
- Enhance sum items to be able to store the sum value directly.
- return_record_by_parent() is enabled in spider as
HANDLER_HAS_DIRECT_AGGREGATE is defined
- Added spd_environ.h to spider. This is loaded first to ensure that all
MariaDB specific defines that are used by include files are properly
defined.
- This code is tested by the existing spider tests direct_aggregate.test
and direct_aggregate_part.test and also partition.test