mirror of https://github.com/MariaDB/server.git synced 2025-11-27 05:41:41 +03:00
Commit Graph

195 Commits

Author SHA1 Message Date
Yuchen Pei
d0fcac4450 MDEV-35422 Fix spider group by handler trying to use fake group by fields
This is a fixup of MDEV-26345 commit
77ed235d50.

In MDEV-26345 the spider group by handler was updated so that it uses
the item_ptr fields of Query::group_by and Query::order_by, instead of
item. This is because the call to
join->set_items_ref_array(join->items1) during the execution stage,
just before the execution, replaces the order-by / group-by item
arrays with Item_temptable_field.

Spider traverses the item tree during the group by handler (gbh)
creation at the end of the optimization stage, and decides a gbh could
handle the execution of the query. Basically spider gbh can handle the
execution if it can construct a well-formed query, execute it on the
data node, and store the results in the correct places. If so, it will
create one, otherwise it will return NULL and the execution will use
the usual handler (ha_spider instead of spider_group_by_handler). To
that end, the general principle is the items checked for creation
should be the same items later used for query construction. Since in
MDEV-26345 we changed to use the item_ptr field instead of item field
of order-by and group-by in query construction, in this patch we do
the same for the gbh creation.

The item_ptr field could be the uninitialised NULL value during the
gbh creation. This is because the optimizer may replace a DISTINCT
with a GROUP BY, which only happens if the original GROUP BY is empty.
It creates the artificial GROUP BY by calling create_distinct_group(),
which creates the corresponding ORDER object with item field aligning
with somewhere in ref_pointer_array, but leaving item_ptr to be NULL.
When spider finds out that item_ptr is NULL, it knows there's some
optimizer skullduggery and it is passed a query different from the
original. Without a clear contract between the server layer and the
gbh, it is better to be safe than sorry and not create the gbh in this
case.
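
The NULL check described above can be sketched roughly as follows.
This is a minimal illustration, not the code from the patch: Item and
Order are simplified stand-ins for the server types, and
all_have_item_ptr() is a hypothetical helper name.

```cpp
// Simplified stand-ins for the server's Item and ORDER types; only the
// two members discussed above are modelled (an assumption of the sketch).
struct Item {};

struct Order
{
  Item **item;      // points somewhere into ref_pointer_array
  Item *item_ptr;   // original item; NULL for an artificial GROUP BY
  Order *next;
};

// Hypothetical check: returns false as soon as one element of the
// list has no item_ptr, in which case the gbh would not be created.
bool all_have_item_ptr(const Order *list)
{
  for (const Order *ord = list; ord; ord = ord->next)
    if (!ord->item_ptr)
      return false;
  return true;
}
```

A caller would return NULL instead of a handler whenever this returns
false for either the GROUP BY or the ORDER BY list.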

Also add a check and error reporting for the unlikely case of item_ptr
changing from non-NULL at gbh construction to NULL at execution to
prevent server crash.

Also, we remove a check added in MDEV-29480 of order by items being
aggregate functions. That check was added with the premise that spider
was including auxiliary SELECT items which are referenced by ORDER BY
items. This premise has not been true since MDEV-26345, and the check
caused problems such as MDEV-29546, which was fixed by MDEV-26345.
2024-12-03 10:32:42 +11:00
Yuchen Pei
a8cc40d9a4 MDEV-35064 Reduce the default spider connect retry counts to 2
The existing default value 1000 is too big and could result in
"hanging" when failing to connect to a remote server. Three tries in
total is a more sensible default.
2024-11-27 10:25:14 +11:00
Brandon Nesterenko
840fe316d4 MDEV-34348: my_hash_get_key fixes
Partial commit of the greater MDEV-34348 scope.
MDEV-34348: MariaDB is violating clang-16 -Wcast-function-type-strict

Change the type of my_hash_get_key to:
 1) Return const
 2) Change the context parameter to be const void*

Also fix casting in hash adjacent areas.

Reviewed By:
============
Marko Mäkelä <marko.makela@mariadb.com>
2024-11-23 08:14:22 -07:00
ParadoxV5
cf2d49ddcf Extract some of #3360 fixes to 10.5.x
That PR uncovered countless issues in `my_snprintf` uses.
This commit backports a squashed subset of their fixes.
2024-11-21 22:43:56 +11:00
Yuchen Pei
77ed235d50 MDEV-26345 Spider GBH should execute original queries on the data node
Stop skipping const items when selecting but skip them when storing
their results to spider row to avoid storing in mismatching temporary
table fields.

Skip auxiliary fields in SELECTing, and do not store
the (non-existing) results to the corresponding temporary table
accordingly.

When there are BOTH auxiliary fields AND const items in the auxiliary
field items, do not use the spider GBH. This is a rare occasion if it
happens at all and not worth the added complexity to cover it.

Use the original item (item_ptr) in constructing GROUP BY and ORDER
BY, which also means using item->name instead of field->field_name as
aliases in constructing SELECT items. This fixes spurious regressions
caused by the above changes in some tests using ORDER BY, such as
mdev_24517.test. As a by-product, this also fixes MDEV-29546.
Therefore we update mdev_29008.test to include the MDEV-29546 case.
2024-10-15 15:36:12 +11:00
Yuchen Pei
03a5c683f9 MDEV-27650 Spider: remove #ifdef SPIDER_HAS_GROUP_BY_HANDLER 2024-10-15 14:30:39 +11:00
Yuchen Pei
0a59aafc5f MDEV-34659 Bound check in spider cast function query construction
During spider query construction of certain cast functions, spider
locates the last occurrence of a keyword in the output of the
Item::print() function and appends from there to the query
constructed so far. For example, consider the following query

SELECT * FROM t2 ORDER BY CAST(c AS INET6);

It constructs the following query and executes it at the data
node (assuming the data node table is called t0).

select cast(t0.`c` as inet6) ``,t0.`c` `c` from `test`.`t1` t0 order by ``

When the construction has completed the initial part

select cast(t0.`c`

It then attempts to construct the " as inet6" part. To that end, it
calls print() on the Item_typecast_fbt corresponding to the cast item,
and obtains

cast(`test`.`t2`.`c` as inet6)

It then looks for " as ", and places the cursor there for appending:

cast(`test`.`t2`.`c` as inet6)
                    ^

In this patch, if the search fails, i.e. there's no " as ...", we
make sure that the cursor is not placed before the beginning of the
string (out of bound).

We also relax the search from " as char" to " as " in the case of
CHAR_TYPECAST_FUNC, since there is more than one Item type with this
func type. For example, "AS INET6" is an Item_typecast_fbt which has
this func type.
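
The bound check can be sketched roughly as follows. This is an
illustration using std::string rather than spider's own string class,
and suffix_from_last() is a hypothetical helper name, not a function
from the patch.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: returns the part of `printed` that starts at
// the last occurrence of `keyword`, or an empty string when the
// keyword does not occur at all.
std::string suffix_from_last(const std::string &printed,
                             const std::string &keyword)
{
  assert(!keyword.empty());
  const std::string::size_type pos = printed.rfind(keyword);
  if (pos == std::string::npos)
    return std::string();   // search failed: nothing to append
  return printed.substr(pos);
}
```

With the printed item from the example above and the keyword " as ",
the helper yields " as inet6)"; for an input without " as " it yields
an empty string instead of reading before the start of the buffer.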
2024-10-15 14:30:30 +11:00
Yuchen Pei
282b92f0a2 MDEV-34589 Do not execute before queries in spider_db_mbase::rollback()
Rollback is not supposed to fail. This prevents false failures in
spider rollback.
2024-09-30 16:16:27 +10:00
Yuchen Pei
9e1579788f MDEV-31788 Factor spider locking and unlocking code around sending queries 2024-09-10 11:52:22 +10:00
Yuchen Pei
84067291b4 MDEV-28360 Spider: remove #ifdef SPIDER_use_LEX_CSTRING_for_KEY_Field_name 2024-09-10 11:19:19 +10:00
Yuchen Pei
f5b7c25e1e MDEV-27643 Spider: remove #ifdef HA_CAN_BULK_ACCESS 2024-09-10 11:19:19 +10:00
Yuchen Pei
e7570c7759 MDEV-31788 Remove spider_file_pos
They are for unnecessary debugging purposes only.
2024-09-10 11:19:18 +10:00
Yuchen Pei
a81f419b06 MDEV-27648 remove #define HASH_UPDATE_WITH_HASH_VALUE
The functions called in blocks protected by this macro remain
undefined as of 11.5 c96b23f994
2024-09-10 11:19:14 +10:00
Yuchen Pei
5d54e86c22 MDEV-26178 spider: delete spd_environ.h
It's virtually empty now
2024-09-10 11:15:18 +10:00
Yuchen Pei
869c501ac3 MDEV-27644 Spider: remove HANDLER_HAS_DIRECT_AGGREGATE 2024-09-10 11:15:18 +10:00
Yuchen Pei
6287fb6e17 MDEV-27652 remove #ifdef HA_HAS_CHECKSUM_EXTENDED
handler::pre_calculate_checksum was added in MDEV-16249
be5c432a42
2024-09-10 11:15:17 +10:00
Yuchen Pei
ab49b46d01 MDEV-27664 remove SPIDER_SQL_CACHE_IS_IN_LEX
sql_cache was moved to lex in MDEV-11953 in
de745ecf29
2024-09-10 11:15:16 +10:00
Yuchen Pei
0650c87d9b MDEV-27647 Spider: remove HANDLER_HAS_DIRECT_UPDATE_ROWS 2024-09-10 11:15:13 +10:00
Yuchen Pei
64581c83e8 MDEV-28893 Spider: remove #ifdef SPIDER_NET_HAS_THD
net has thd since 2015 in 56aa19989f for MDEV-6152
2024-09-10 11:15:12 +10:00
Yuchen Pei
05fafaf82d MDEV-27646 remove SPIDER_HAS_HASH_VALUE_TYPE
unifdef -DSPIDER_HAS_HASH_VALUE_TYPE -m storage/spider/spd_* storage/spider/ha_spider.* storage/spider/hs_client/*
2024-09-10 11:15:12 +10:00
Yuchen Pei
58bc83e1a7 [fixup] Spider: Restored lines accidentally deleted in MDEV-32157
Also restored a change that resulted in off-by-one, as well as
appending the correctly indexed key_hint.
2024-08-27 15:36:39 +10:00
Yuchen Pei
132270d3de MDEV-34541 Clean up spider self reference check
SPIDER_CONN::loop_check_meraged_first is useless, because all
SPIDER_CONN_LOOP_CHECKs are in SPIDER_CONN::loop_check_queue, which in
spider_db_conn::fin_loop_check() is iterated over.

This fixes the use-after-free issue when there are three spider tables
sharing the same remote, and their corresponding
SPIDER_CONN_LOOP_CHECKs get merged in
spider_conn_queue_and_merge_loop_check().

This also fixes MDEV-34555
2024-07-16 16:33:05 +08:00
Yuchen Pei
581712b989 MDEV-33490 MENT-1504 Fix some english strings in spider. 2024-06-04 12:25:08 +10:00
Nayuta Yanagisawa
6d0c9872d9 MDEV-28522 Delete constant SPIDER_SQL_TYPE_*_HS
The HandlerSocket support of Spider has been deleted by MDEV-26858.
Thus, the constants, SPIDER_SQL_TYPE_*_HS, are no longer necessary.
2024-05-31 09:06:55 +10:00
Yuchen Pei
6c30220780 MDEV-26858 Spider: Remove dead code related to HandlerSocket
Remove the dead code in Spider that is related to Spider's
HandlerSocket support. The code has been disabled for a long time
and it is unlikely that the code will be enabled.

- rm all files under storage/spider/hs_client/ except hs_compat.h
- rm storage/spider/spd_db_handlersocket.*
- unifdef -UHS_HAS_SQLCOM -UHAVE_HANDLERSOCKET \
  -m storage/spider/spd_* storage/spider/ha_spider.* storage/spider/hs_client/*
- remove relevant files from storage/spider/CMakeLists.txt
2024-05-31 09:06:55 +10:00
Oleksandr Byelkin
9b18275623 Merge branch '10.4' into 10.5 2024-04-16 11:04:14 +02:00
Kristian Nielsen
16aa4b5f59 Merge from 10.4 to 10.5
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
2024-04-15 17:46:49 +02:00
Yuchen Pei
051a1fa0e9 MDEV-33777 Spider: Correct checks for show index column numbers
It was updated for 10.6+ in MDEV-7317. Because a lower version spider
node may connect to a higher version data node, we need to change this
for 10.4 and 10.5 as well.
2024-04-15 09:59:24 +10:00
Yuchen Pei
18b93d6eb0 MDEV-28993 Spider: Push down CASE statement 2024-04-15 09:56:24 +10:00
Yuchen Pei
99dc0f030f MDEV-28993 spider: revert removal of ITEM_FUNC_CASE_PARAMS_ARE_PUBLIC
It was done in MDEV-29447.
2024-04-15 09:56:23 +10:00
Yuchen Pei
860c1ca9ad MDEV-33679 Spider group by handler: skip on multiple equalities
The spider group by handler is created in
JOIN::make_aggr_tables_info(), by which time calls to
substitute_for_best_equal_field() should have already removed all the
multiple equalities (i.e. Item_equal, with MULT_EQUAL_FUNC func_type).
Therefore, if there are still such items, this is deemed an optimizer
bug and the group by handler is skipped.
2024-04-08 14:35:35 +10:00
Yuchen Pei
9c93d41ad7 MDEV-33728 spider: remove use of MYSQL_VERSION_ID and MARIADB_BASE_VERSION
change created by:

unifdef -DMYSQL_VERSION_ID=100400 -DMARIADB_BASE_VERSION -m storage/spider/spd_* storage/spider/ha_spider.* storage/spider/hs_client/*

basically MDEV-27637, MDEV-27641, MDEV-27655
2024-04-08 14:35:35 +10:00
Yuchen Pei
44c88faeca MDEV-28992 Spider group by handler: Push down TIMESTAMPDIFF function
Also removed ITEM_FUNC_TIMESTAMPDIFF_ARE_PUBLIC.

Similar to pr#2225, with the testcase adapted from that patch:

--8<---------------cut here---------------start------------->8---
From 884f7c6df1 Mon Sep 17 00:00:00 2001
From: "Norio Akagi (norakagi)" <norakagi@amazon.com>
Date: Wed, 3 Aug 2022 23:30:34 -0700
Subject: [PATCH] [MDEV-28992] Push down TIMESTAMP_DIFF in spider

This changes so that TIMESTAMP_DIFF function in a query is pushed down and works natively in Spider.
Instead of directly accessing item's member, now we can rely on a public accessor method to make it work.
Unit tests are added under spider.pushdown_timestamp_diff.

All new code of the whole pull request, including one or several files
that are either new files or modified ones, are contributed under the
BSD-new license. I am contributing on behalf of my employer
Amazon Web Services, Inc.
--8<---------------cut here---------------end--------------->8---
2024-04-08 14:35:35 +10:00
Yuchen Pei
ef9cdacf51 MDEV-33220 Fix -Wmaybe-uninitialized warnings for g++-13 2024-03-25 12:56:00 +11:00
Yuchen Pei
1b568fb917 MDEV-33539 spider: remove some unused code in self reference checks 2024-03-04 11:52:13 +11:00
Yuchen Pei
c9902a20b3 Merge branch '10.4' into 10.5 2024-01-10 18:01:46 +11:00
Yuchen Pei
bc3d416a17 MDEV-29718 Fix spider detection of same data node server
When the host is not specified, it defaults to localhost.
2024-01-10 16:37:36 +11:00
Yuchen Pei
eabc74aaef MDEV-33008 Fix spider table discovery
A new column was introduced to the show index output in 10.6 in
f691d9865b

Thus we update the check of the number of columns to be at least 13,
rather than exactly 13.

Also backport an err number and format from 10.5 for better error
messages when the column number is wrong.
2024-01-10 16:36:39 +11:00
Marko Mäkelä
a3dd7ea09f Merge 10.4 into 10.5 2023-12-21 11:30:32 +02:00
Yuchen Pei
c73417c68e MDEV-32986 Make regexp operator work in spider group by handler
In spider_db_mbase_util::print_item_func(), if the sql item_func has
an UNKNOWN_FUNC type, by default the spider group by handler (gbh)
transforms infix to prefix. But regexp should remain infix, so we add
an if condition to account for this.
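
The special case can be sketched roughly as follows. This is a
simplified illustration on plain strings; print_unknown_func() is a
hypothetical name and the exact output format of spider is an
assumption here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical printer: functions are emitted in prefix form,
// name(arg1,arg2,...), except regexp, which stays infix.
std::string print_unknown_func(const std::string &name,
                               const std::vector<std::string> &args)
{
  if (name == "regexp")
  {
    assert(args.size() == 2);   // regexp is a binary operator
    return "(" + args[0] + " regexp " + args[1] + ")";
  }
  std::string out = name + "(";
  for (std::size_t i = 0; i < args.size(); ++i)
  {
    if (i)
      out += ",";
    out += args[i];
  }
  return out + ")";
}
```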
2023-12-21 10:31:12 +11:00
Yuchen Pei
ddd5449c57 [fixup] post-merge spider fixup
MDEV-32524: a couple missed magic numbers
MDEV-26247: a couple missed goto statements that could lead to memory leak
2023-12-05 14:33:16 +11:00
Sergei Golubchik
98a39b0c91 Merge branch '10.4' into 10.5 2023-12-02 01:02:50 +01:00
Yuchen Pei
0b36694ff8 MDEV-32524 Use enums for ids passed to spider mem alloc functions
This will avoid issues like MDEV-32486

IDs used in
- spider_alloc_calc_mem_init()
- spider_string::init_calc_mem()
- spider_malloc()
- spider_bulk_alloc_mem()
- spider_bulk_malloc()
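
A rough sketch of the idea, under assumptions: the commit does not
spell out here what the ids index, so the sketch assumes they select
per-call-site accounting slots. The enumerator names and the
record_alloc() / allocated() helpers are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical ids; with an enum each id is declared once, so two call
// sites cannot end up sharing a hand-written magic number by accident.
enum alloc_id
{
  ALLOC_ID_CONN,
  ALLOC_ID_SHARE,
  ALLOC_ID_ROW,
  ALLOC_ID_COUNT    // number of ids, used to size the table
};

static std::size_t alloc_bytes[ALLOC_ID_COUNT] = {};

// Adds `size` bytes to the accounting slot of `id`.
void record_alloc(alloc_id id, std::size_t size)
{
  assert(id < ALLOC_ID_COUNT);
  alloc_bytes[id] += size;
}

// Returns the bytes recorded so far for `id`.
std::size_t allocated(alloc_id id)
{
  assert(id < ALLOC_ID_COUNT);
  return alloc_bytes[id];
}
```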
2023-11-20 09:25:43 +11:00
Yuchen Pei
178396573a MDEV-26247 Re-implement spider gbh query rewrite of tables
Spider GBH's query rewrite of table joins is overly complex and
error-prone. We replace it with something closer to what
dbug_print() (more specifically, print_join()) does, but catered to
spider.

More specifically, we replace the body of
spider_db_mbase_util::append_from_and_tables() with a call to
spider_db_mbase_util::append_join(), and remove downstream append_X
functions.

We make it handle const tables by rewriting them as (select 1). This
fixes the main issue in MDEV-26247.

We also ban semijoin from spider gbh, which fixes MDEV-31645 and
MDEV-30392, as semi-join is an "internal" join, and "semi join" does
not parse, and it is different from "join" in that it deduplicates the
right hand side.

Not all queries passed to a group by handler are valid (MDEV-32273),
for example, a join on expr may refer to outer fields not in the
current context. We detect this during the handler creation when
walking the join. See also gbh_outer_fields_in_join.test.

It also skips eliminated tables, which fixes MDEV-26193.
2023-11-17 11:07:50 +11:00
Yuchen Pei
2d1e09a77f MDEV-26247 Clean up spider_fields
Spider gbh query rewrite should get table for fields in a simple way.
Add a method spider_fields::find_table that searches its table holders
to find table for a given field. This way we will be able to get rid
of the first pass during the gbh creation where field_chains and
field_holders are created.

We also check that the field belongs to a spider table while walking
through the query, so we could remove
all_query_fields_are_query_table_members(). However, this requires an
earlier creation of the table_holder so that tables are added before
checking. We do that, and in doing so, also decouple table_holder and
spider_fields.

Remove unused methods and fields. Add comments.
2023-11-17 10:04:12 +11:00
Yuchen Pei
8c1dcb2579 MDEV-26247 Remove some unused spider methods
Two methods from spider_fields. There are probably more of these
conn_holder related methods that can be removed.

reappend_tables_part()
reappend_tables()
2023-11-17 10:04:12 +11:00
Yuchen Pei
68a002071b MDEV-29502 Fix some issues with spider direct aggregate
The direct aggregate mechanism seems to be only intended to work when
otherwise a full table scan query will be executed from the spider
node and the aggregation done at the spider node too. Typically this
happens in sub_select(). In the test spider.direct_aggregate_part
direct aggregate allows to send COUNT statements directly to the data
nodes and adds up the results at the spider node, instead of iterating
over the rows one by one at the spider node.

By contrast, the group by handler (GBH) typically sends aggregated
queries directly to data nodes, in which case DA does not improve the
situation here.

That is why we should fix it by disabling DA when GBH is used.

There are other reasons supporting this change. First, the creation of
GBH results in a call to change_to_use_tmp_fields() (as opposed to
setup_copy_fields()) which causes the spider DA function
spider_db_fetch_for_item_sum_funcs() to work on wrong items. Second,
the spider DA function only calls direct_add() on the items, and the
follow-up add() needs to be called by the sql layer code. In
do_select(), after executing the query with the GBH, it seems that the
required add() would not necessarily be called.

Disabling DA when GBH is used does fix the bug. There are a few
other things included in this commit to improve the situation with
spider DA:

1. Add a session variable that allows the user to disable DA
completely. This will help as a temporary measure if/when further bugs
with DA emerge.

2. Move the increment of direct_aggregate_count to the spider DA
function. Currently this is done in rather bizarre and random
locations.

3. Fix the spider_db_mbase_row creation so that the last of its row
fields (the sentinel) is NULL. The code is already doing a null check,
but somehow the sentinel field is on an invalid address, causing the
segfaults. With a correct implementation of the row creation, we can
avoid such segfaults.
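
The sentinel idea in item 3 can be sketched roughly as follows. This
is not the spider_db_mbase_row code: make_row() and count_fields() are
hypothetical helpers, and the sketch assumes for simplicity that real
fields are never null pointers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical row builder: copies the field pointers and appends a
// NULL sentinel as the last element.
std::vector<const char *> make_row(const std::vector<const char *> &fields)
{
  std::vector<const char *> row(fields);
  row.push_back(nullptr);
  return row;
}

// Walks the row until the sentinel. Without a valid sentinel the loop
// would read past the end of the row.
std::size_t count_fields(const char *const *row)
{
  assert(row != nullptr);
  std::size_t n = 0;
  while (row[n] != nullptr)
    ++n;
  return n;
}
```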
2023-09-15 12:08:25 +10:00
Yuchen Pei
e95e9a221f Merge branch '10.4' into 10.5 2023-09-15 12:04:44 +10:00
Yuchen Pei
d8e9f3d981 MDEV-31673 MDEV-29502 Remove spider_db_handler::need_lock_before_set_sql_for_exec
This function trivially returns false
2023-09-14 16:37:34 +10:00
Oleksandr Byelkin
f52954ef42 Merge commit '10.4' into 10.5 2023-07-20 11:54:52 +02:00