mirror of https://github.com/postgres/postgres.git synced 2025-07-07 00:36:50 +03:00

Fix partitionwise join with partially-redundant join clauses

To determine if the two relations being joined can use partitionwise
join, we need to verify the existence of equi-join conditions
involving pairs of matching partition keys for all partition keys.
Currently we do that by looking through the join's restriction
clauses.  However, this approach turns out to be insufficient: two
partition keys can be known equal through an EquivalenceClass (EC)
without forming a join clause, because the join clause generated from
that EC happens to constrain other members of the EC instead.

To address this issue, in addition to examining the join's restriction
clauses, we also check whether any partition keys are known equal by
ECs, using exprs_known_equal().  To make that possible, we enhance
exprs_known_equal() to check equality per the semantics of a given
opfamily, if one is provided.
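
As a rough illustration of the opfamily-aware lookup described above,
here is a minimal, self-contained sketch.  It is not PostgreSQL code:
ToyEC, ToyOid, the TOY_* constants and every toy_* function are
hypothetical stand-ins, and expressions are modelled as plain integer
ids.  The example EC corresponds to "t1.a = t2.a AND t1.a = t2.b",
where t1.a and t2.b are the partition keys.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for illustration only; not PostgreSQL's real types. */
typedef unsigned int ToyOid;
#define TOY_INVALID_OID 0u
#define TOY_INT_OPS     100u    /* hypothetical btree opfamily id */
#define TOY_OTHER_OPS   200u    /* hypothetical unrelated opfamily id */
#define TOY_MAX         4

/* Hypothetical expression ids for the columns in the example query. */
enum { TOY_T1_A = 1, TOY_T2_A = 2, TOY_T2_B = 3, TOY_T1_C = 4 };

typedef struct ToyEC
{
    int     members[TOY_MAX];       /* expressions known equal */
    int     nmembers;
    ToyOid  opfamilies[TOY_MAX];    /* opfamilies whose equality applies */
    int     nopfamilies;
} ToyEC;

bool
toy_ec_has_member(const ToyEC *ec, int expr)
{
    for (int i = 0; i < ec->nmembers; i++)
        if (ec->members[i] == expr)
            return true;
    return false;
}

bool
toy_ec_uses_opfamily(const ToyEC *ec, ToyOid opfamily)
{
    for (int i = 0; i < ec->nopfamilies; i++)
        if (ec->opfamilies[i] == opfamily)
            return true;
    return false;
}

/*
 * Simplified analogue of the enhanced check: with a valid opfamily, only
 * ECs using that opfamily count; with TOY_INVALID_OID, any EC will do.
 */
bool
toy_exprs_known_equal(const ToyEC *ecs, int necs,
                      int item1, int item2, ToyOid opfamily)
{
    assert(ecs != NULL && necs >= 0);
    for (int i = 0; i < necs; i++)
    {
        const ToyEC *ec = &ecs[i];

        /* Ignore if this EC doesn't use the specified opfamily */
        if (opfamily != TOY_INVALID_OID &&
            !toy_ec_uses_opfamily(ec, opfamily))
            continue;
        if (toy_ec_has_member(ec, item1) && toy_ec_has_member(ec, item2))
            return true;
    }
    return false;
}

/* Runs the check against the single EC {t1.a, t2.a, t2.b}. */
bool
toy_example_known_equal(int item1, int item2, ToyOid opfamily)
{
    ToyEC   ec = {
        .members = {TOY_T1_A, TOY_T2_A, TOY_T2_B},
        .nmembers = 3,
        .opfamilies = {TOY_INT_OPS},
        .nopfamilies = 1
    };

    return toy_exprs_known_equal(&ec, 1, item1, item2, opfamily);
}
```

In this sketch, t1.a and t2.b are reported equal under TOY_INT_OPS even
though no clause "t1.a = t2.b" needs to exist, which is the situation
the restriction-clause scan alone would miss.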

It could be argued that exprs_known_equal() could be called O(N^2)
times, where N is the number of partition key expressions, resulting
in noticeable performance costs if there are a lot of partition key
expressions.  But I think this is not a problem.  The number of a
joinrel's partition key expressions would only be equal to the join
degree, since each base relation within the join contributes only one
partition key expression.  That is to say, it does not scale with the
number of partitions.  A benchmark with a query involving 5-way joins
of partitioned tables, each with 3 partition keys and 1000 partitions,
shows that the planning time is not significantly affected by this
patch (within the margin of error), particularly when compared to the
impact caused by partitionwise join.
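
The shape of that argument can be sketched as follows.  This is only a
toy model, not the planner code: ToyJoinRel, ToyResult, toy_equal_fn
and the toy_* functions are hypothetical, and the equality oracle
stands in for exprs_known_equal().  The point is that the number of
oracle calls depends only on the lengths of the per-key expression
lists; nothing in the loops refers to a partition count.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_KEYS  3
#define TOY_MAX_EXPRS 8

/* Hypothetical model of a rel's partition key expressions, per key. */
typedef struct ToyJoinRel
{
    int     exprs[TOY_MAX_KEYS][TOY_MAX_EXPRS];
    int     nexprs[TOY_MAX_KEYS];
} ToyJoinRel;

typedef struct ToyResult
{
    bool    all_equal;      /* every partition key proven equal? */
    int     ncalls;         /* how many times the oracle was consulted */
} ToyResult;

/* Stand-in for the equality oracle (exprs_known_equal in the patch). */
typedef bool (*toy_equal_fn) (int expr1, int expr2);

bool
toy_never_equal(int expr1, int expr2)
{
    (void) expr1;
    (void) expr2;
    return false;
}

bool
toy_same_id(int expr1, int expr2)
{
    return expr1 == expr2;
}

/*
 * For each key, try pairs of expressions until one pair is known equal.
 * Give up at the first key that cannot be proven equal.
 */
ToyResult
toy_check_partkeys(const ToyJoinRel *rel1, const ToyJoinRel *rel2,
                   int nkeys, toy_equal_fn known_equal)
{
    ToyResult   res = {false, 0};
    int         num_equal = 0;

    assert(nkeys > 0 && nkeys <= TOY_MAX_KEYS);
    for (int k = 0; k < nkeys; k++)
    {
        bool    found = false;

        for (int i = 0; i < rel1->nexprs[k] && !found; i++)
            for (int j = 0; j < rel2->nexprs[k] && !found; j++)
            {
                res.ncalls++;
                found = known_equal(rel1->exprs[k][i], rel2->exprs[k][j]);
            }
        if (!found)
            return res;         /* no chance to succeed, give up */
        if (++num_equal == nkeys)
            res.all_equal = true;
    }
    return res;
}

/* Fixed two-key example: 2 expressions per key on one side, 3 on the other. */
ToyResult
toy_example(toy_equal_fn known_equal)
{
    static const ToyJoinRel rel1 = {
        .exprs = {{1, 2}, {3, 4}},
        .nexprs = {2, 2}
    };
    static const ToyJoinRel rel2 = {
        .exprs = {{5, 2, 7}, {8, 9, 3}},
        .nexprs = {3, 3}
    };

    return toy_check_partkeys(&rel1, &rel2, 2, known_equal);
}
```

With the never-equal oracle the example stops after the first key,
having made 2 * 3 = 6 calls; with the same-id oracle it proves both
keys equal after 8 calls.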

Thanks to Tom Lane for the idea of leveraging exprs_known_equal() to
check if partition keys are known equal by ECs.

Author: Richard Guo, Tom Lane
Reviewed-by: Tom Lane, Ashutosh Bapat, Robert Haas
Discussion: https://postgr.es/m/CAN_9JTzo_2F5dKLqXVtDX5V6dwqB0Xk+ihstpKEt3a1LT6X78A@mail.gmail.com
Richard Guo
2024-07-30 15:51:54 +09:00
parent 2309eff62b
commit 9b282a9359
6 changed files with 214 additions and 21 deletions

View File

@@ -2443,15 +2443,17 @@ find_join_domain(PlannerInfo *root, Relids relids)
  * Detect whether two expressions are known equal due to equivalence
  * relationships.
  *
- * Actually, this only shows that the expressions are equal according
- * to some opfamily's notion of equality --- but we only use it for
- * selectivity estimation, so a fuzzy idea of equality is OK.
+ * If opfamily is given, the expressions must be known equal per the semantics
+ * of that opfamily (note it has to be a btree opfamily, since those are the
+ * only opfamilies equivclass.c deals with). If opfamily is InvalidOid, we'll
+ * return true if they're equal according to any opfamily, which is fuzzy but
+ * OK for estimation purposes.
  *
  * Note: does not bother to check for "equal(item1, item2)"; caller must
  * check that case if it's possible to pass identical items.
  */
 bool
-exprs_known_equal(PlannerInfo *root, Node *item1, Node *item2)
+exprs_known_equal(PlannerInfo *root, Node *item1, Node *item2, Oid opfamily)
 {
     ListCell *lc1;
 
@@ -2466,6 +2468,17 @@ exprs_known_equal(PlannerInfo *root, Node *item1, Node *item2)
         if (ec->ec_has_volatile)
             continue;
 
+        /*
+         * It's okay to consider ec_broken ECs here. Brokenness just means we
+         * couldn't derive all the implied clauses we'd have liked to; it does
+         * not invalidate our knowledge that the members are equal.
+         */
+
+        /* Ignore if this EC doesn't use specified opfamily */
+        if (OidIsValid(opfamily) &&
+            !list_member_oid(ec->ec_opfamilies, opfamily))
+            continue;
+
         foreach(lc2, ec->ec_members)
         {
             EquivalenceMember *em = (EquivalenceMember *) lfirst(lc2);
@@ -2494,8 +2507,7 @@ exprs_known_equal(PlannerInfo *root, Node *item1, Node *item2)
  * (In principle there might be more than one matching eclass if multiple
  * collations are involved, but since collation doesn't matter for equality,
  * we ignore that fine point here.) This is much like exprs_known_equal,
- * except that we insist on the comparison operator matching the eclass, so
- * that the result is definite not approximate.
+ * except for the format of the input.
  *
  * On success, we also set fkinfo->eclass[colno] to the matching eclass,
  * and set fkinfo->fk_eclass_member[colno] to the eclass member for the
@@ -2536,7 +2548,7 @@ match_eclasses_to_foreign_key_col(PlannerInfo *root,
         /* Never match to a volatile EC */
         if (ec->ec_has_volatile)
             continue;
-        /* Note: it seems okay to match to "broken" eclasses here */
+        /* It's okay to consider "broken" ECs here, see exprs_known_equal */
 
         foreach(lc2, ec->ec_members)
         {

View File

@@ -2080,10 +2080,9 @@ have_partkey_equi_join(PlannerInfo *root, RelOptInfo *joinrel,
                        JoinType jointype, List *restrictlist)
 {
     PartitionScheme part_scheme = rel1->part_scheme;
+    bool pk_known_equal[PARTITION_MAX_KEYS];
+    int num_equal_pks;
     ListCell *lc;
-    int cnt_pks;
-    bool pk_has_clause[PARTITION_MAX_KEYS];
-    bool strict_op;
 
     /*
      * This function must only be called when the joined relations have same
@@ -2092,13 +2091,19 @@ have_partkey_equi_join(PlannerInfo *root, RelOptInfo *joinrel,
     Assert(rel1->part_scheme == rel2->part_scheme);
     Assert(part_scheme);
 
-    memset(pk_has_clause, 0, sizeof(pk_has_clause));
+    /* We use a bool array to track which partkey columns are known equal */
+    memset(pk_known_equal, 0, sizeof(pk_known_equal));
+    /* ... as well as a count of how many are known equal */
+    num_equal_pks = 0;
+
+    /* First, look through the join's restriction clauses */
     foreach(lc, restrictlist)
     {
         RestrictInfo *rinfo = lfirst_node(RestrictInfo, lc);
         OpExpr *opexpr;
         Expr *expr1;
         Expr *expr2;
+        bool strict_op;
         int ipk1;
         int ipk2;
 
@@ -2176,11 +2181,15 @@ have_partkey_equi_join(PlannerInfo *root, RelOptInfo *joinrel,
         if (ipk1 != ipk2)
             continue;
 
+        /* Ignore clause if we already proved these keys equal. */
+        if (pk_known_equal[ipk1])
+            continue;
+
         /*
          * The clause allows partitionwise join only if it uses the same
          * operator family as that specified by the partition key.
          */
-        if (rel1->part_scheme->strategy == PARTITION_STRATEGY_HASH)
+        if (part_scheme->strategy == PARTITION_STRATEGY_HASH)
         {
             if (!OidIsValid(rinfo->hashjoinoperator) ||
                 !op_in_opfamily(rinfo->hashjoinoperator,
@@ -2192,17 +2201,89 @@ have_partkey_equi_join(PlannerInfo *root, RelOptInfo *joinrel,
             continue;
 
         /* Mark the partition key as having an equi-join clause. */
-        pk_has_clause[ipk1] = true;
-    }
-
-    /* Check whether every partition key has an equi-join condition. */
-    for (cnt_pks = 0; cnt_pks < part_scheme->partnatts; cnt_pks++)
-    {
-        if (!pk_has_clause[cnt_pks])
-            return false;
-    }
-
-    return true;
+        pk_known_equal[ipk1] = true;
+
+        /* We can stop examining clauses once we prove all keys equal. */
+        if (++num_equal_pks == part_scheme->partnatts)
+            return true;
+    }
+
+    /*
+     * Also check to see if any keys are known equal by equivclass.c. In most
+     * cases there would have been a join restriction clause generated from
+     * any EC that had such knowledge, but there might be no such clause, or
+     * it might happen to constrain other members of the ECs than the ones we
+     * are looking for.
+     */
+    for (int ipk = 0; ipk < part_scheme->partnatts; ipk++)
+    {
+        Oid btree_opfamily;
+
+        /* Ignore if we already proved these keys equal. */
+        if (pk_known_equal[ipk])
+            continue;
+
+        /*
+         * We need a btree opfamily to ask equivclass.c about. If the
+         * partopfamily is a hash opfamily, look up its equality operator, and
+         * select some btree opfamily that that operator is part of. (Any
+         * such opfamily should be good enough, since equivclass.c will track
+         * multiple opfamilies as appropriate.)
+         */
+        if (part_scheme->strategy == PARTITION_STRATEGY_HASH)
+        {
+            Oid eq_op;
+            List *eq_opfamilies;
+
+            eq_op = get_opfamily_member(part_scheme->partopfamily[ipk],
+                                        part_scheme->partopcintype[ipk],
+                                        part_scheme->partopcintype[ipk],
+                                        HTEqualStrategyNumber);
+            if (!OidIsValid(eq_op))
+                break; /* we're not going to succeed */
+            eq_opfamilies = get_mergejoin_opfamilies(eq_op);
+            if (eq_opfamilies == NIL)
+                break; /* we're not going to succeed */
+            btree_opfamily = linitial_oid(eq_opfamilies);
+        }
+        else
+            btree_opfamily = part_scheme->partopfamily[ipk];
+
+        /*
+         * We consider only non-nullable partition keys here; nullable ones
+         * would not be treated as part of the same equivalence classes as
+         * non-nullable ones.
+         */
+        foreach(lc, rel1->partexprs[ipk])
+        {
+            Node *expr1 = (Node *) lfirst(lc);
+            ListCell *lc2;
+
+            foreach(lc2, rel2->partexprs[ipk])
+            {
+                Node *expr2 = (Node *) lfirst(lc2);
+
+                if (exprs_known_equal(root, expr1, expr2, btree_opfamily))
+                {
+                    pk_known_equal[ipk] = true;
+                    break;
+                }
+            }
+            if (pk_known_equal[ipk])
+                break;
+        }
+
+        if (pk_known_equal[ipk])
+        {
+            /* We can stop examining keys once we prove all keys equal. */
+            if (++num_equal_pks == part_scheme->partnatts)
+                return true;
+        }
+        else
+            break; /* no chance to succeed, give up */
+    }
+
+    return false;
 }
 
 /*

View File

@@ -3313,10 +3313,11 @@ add_unique_group_var(PlannerInfo *root, List *varinfos,
 
         /*
          * Drop known-equal vars, but only if they belong to different
-         * relations (see comments for estimate_num_groups)
+         * relations (see comments for estimate_num_groups). We aren't too
+         * fussy about the semantics of "equal" here.
          */
         if (vardata->rel != varinfo->rel &&
-            exprs_known_equal(root, var, varinfo->var))
+            exprs_known_equal(root, var, varinfo->var, InvalidOid))
         {
             if (varinfo->ndistinct <= ndistinct)
             {

View File

@@ -158,7 +158,8 @@ extern List *generate_join_implied_equalities_for_ecs(PlannerInfo *root,
                                                       Relids join_relids,
                                                       Relids outer_relids,
                                                       RelOptInfo *inner_rel);
-extern bool exprs_known_equal(PlannerInfo *root, Node *item1, Node *item2);
+extern bool exprs_known_equal(PlannerInfo *root, Node *item1, Node *item2,
+                              Oid opfamily);
 extern EquivalenceClass *match_eclasses_to_foreign_key_col(PlannerInfo *root,
                                                            ForeignKeyOptInfo *fkinfo,
                                                            int colno);

View File

@@ -62,6 +62,45 @@ SELECT t1.a, t1.c, t2.b, t2.c FROM prt1 t1, prt2 t2 WHERE t1.a = t2.b AND t1.b =
  450 | 0450 | 450 | 0450
 (4 rows)
 
+-- inner join with partially-redundant join clauses
+EXPLAIN (COSTS OFF)
+SELECT t1.a, t1.c, t2.b, t2.c FROM prt1 t1, prt2 t2 WHERE t1.a = t2.a AND t1.a = t2.b ORDER BY t1.a, t2.b;
+                          QUERY PLAN
+---------------------------------------------------------------
+ Sort
+   Sort Key: t1.a
+   ->  Append
+         ->  Merge Join
+               Merge Cond: (t1_1.a = t2_1.a)
+               ->  Index Scan using iprt1_p1_a on prt1_p1 t1_1
+               ->  Sort
+                     Sort Key: t2_1.b
+                     ->  Seq Scan on prt2_p1 t2_1
+                           Filter: (a = b)
+         ->  Hash Join
+               Hash Cond: (t1_2.a = t2_2.a)
+               ->  Seq Scan on prt1_p2 t1_2
+               ->  Hash
+                     ->  Seq Scan on prt2_p2 t2_2
+                           Filter: (a = b)
+         ->  Hash Join
+               Hash Cond: (t1_3.a = t2_3.a)
+               ->  Seq Scan on prt1_p3 t1_3
+               ->  Hash
+                     ->  Seq Scan on prt2_p3 t2_3
+                           Filter: (a = b)
+(22 rows)
+
+SELECT t1.a, t1.c, t2.b, t2.c FROM prt1 t1, prt2 t2 WHERE t1.a = t2.a AND t1.a = t2.b ORDER BY t1.a, t2.b;
+ a  |  c   | b  |  c
+----+------+----+------
+  0 | 0000 |  0 | 0000
+  6 | 0006 |  6 | 0006
+ 12 | 0012 | 12 | 0012
+ 18 | 0018 | 18 | 0018
+ 24 | 0024 | 24 | 0024
+(5 rows)
+
 -- left outer join, 3-way
 EXPLAIN (COSTS OFF)
 SELECT COUNT(*) FROM prt1 t1
@@ -1803,6 +1842,55 @@ SELECT t1.a, t1.c, t2.b, t2.c FROM prt1_l t1, prt2_l t2 WHERE t1.a = t2.b AND t1
  450 | 0002 | 450 | 0002
 (4 rows)
 
+-- inner join with partially-redundant join clauses
+EXPLAIN (COSTS OFF)
+SELECT t1.a, t1.c, t2.b, t2.c FROM prt1_l t1, prt2_l t2 WHERE t1.a = t2.a AND t1.a = t2.b AND t1.c = t2.c ORDER BY t1.a, t2.b;
+                                     QUERY PLAN
+------------------------------------------------------------------------------------
+ Sort
+   Sort Key: t1.a
+   ->  Append
+         ->  Hash Join
+               Hash Cond: ((t1_1.a = t2_1.a) AND ((t1_1.c)::text = (t2_1.c)::text))
+               ->  Seq Scan on prt1_l_p1 t1_1
+               ->  Hash
+                     ->  Seq Scan on prt2_l_p1 t2_1
+                           Filter: (a = b)
+         ->  Hash Join
+               Hash Cond: ((t1_2.a = t2_2.a) AND ((t1_2.c)::text = (t2_2.c)::text))
+               ->  Seq Scan on prt1_l_p2_p1 t1_2
+               ->  Hash
+                     ->  Seq Scan on prt2_l_p2_p1 t2_2
+                           Filter: (a = b)
+         ->  Hash Join
+               Hash Cond: ((t1_3.a = t2_3.a) AND ((t1_3.c)::text = (t2_3.c)::text))
+               ->  Seq Scan on prt1_l_p2_p2 t1_3
+               ->  Hash
+                     ->  Seq Scan on prt2_l_p2_p2 t2_3
+                           Filter: (a = b)
+         ->  Hash Join
+               Hash Cond: ((t1_5.a = t2_5.a) AND ((t1_5.c)::text = (t2_5.c)::text))
+               ->  Append
+                     ->  Seq Scan on prt1_l_p3_p1 t1_5
+                     ->  Seq Scan on prt1_l_p3_p2 t1_6
+               ->  Hash
+                     ->  Append
+                           ->  Seq Scan on prt2_l_p3_p1 t2_5
+                                 Filter: (a = b)
+                           ->  Seq Scan on prt2_l_p3_p2 t2_6
+                                 Filter: (a = b)
+(32 rows)
+
+SELECT t1.a, t1.c, t2.b, t2.c FROM prt1_l t1, prt2_l t2 WHERE t1.a = t2.a AND t1.a = t2.b AND t1.c = t2.c ORDER BY t1.a, t2.b;
+ a  |  c   | b  |  c
+----+------+----+------
+  0 | 0000 |  0 | 0000
+  6 | 0002 |  6 | 0002
+ 12 | 0000 | 12 | 0000
+ 18 | 0002 | 18 | 0002
+ 24 | 0000 | 24 | 0000
+(5 rows)
+
 -- left join
 EXPLAIN (COSTS OFF)
 SELECT t1.a, t1.c, t2.b, t2.c FROM prt1_l t1 LEFT JOIN prt2_l t2 ON t1.a = t2.b AND t1.c = t2.c WHERE t1.b = 0 ORDER BY t1.a, t2.b;

View File

@@ -34,6 +34,11 @@ EXPLAIN (COSTS OFF)
 SELECT t1.a, t1.c, t2.b, t2.c FROM prt1 t1, prt2 t2 WHERE t1.a = t2.b AND t1.b = 0 ORDER BY t1.a, t2.b;
 SELECT t1.a, t1.c, t2.b, t2.c FROM prt1 t1, prt2 t2 WHERE t1.a = t2.b AND t1.b = 0 ORDER BY t1.a, t2.b;
 
+-- inner join with partially-redundant join clauses
+EXPLAIN (COSTS OFF)
+SELECT t1.a, t1.c, t2.b, t2.c FROM prt1 t1, prt2 t2 WHERE t1.a = t2.a AND t1.a = t2.b ORDER BY t1.a, t2.b;
+SELECT t1.a, t1.c, t2.b, t2.c FROM prt1 t1, prt2 t2 WHERE t1.a = t2.a AND t1.a = t2.b ORDER BY t1.a, t2.b;
+
 -- left outer join, 3-way
 EXPLAIN (COSTS OFF)
 SELECT COUNT(*) FROM prt1 t1
@@ -386,6 +391,11 @@ EXPLAIN (COSTS OFF)
 SELECT t1.a, t1.c, t2.b, t2.c FROM prt1_l t1, prt2_l t2 WHERE t1.a = t2.b AND t1.b = 0 ORDER BY t1.a, t2.b;
 SELECT t1.a, t1.c, t2.b, t2.c FROM prt1_l t1, prt2_l t2 WHERE t1.a = t2.b AND t1.b = 0 ORDER BY t1.a, t2.b;
 
+-- inner join with partially-redundant join clauses
+EXPLAIN (COSTS OFF)
+SELECT t1.a, t1.c, t2.b, t2.c FROM prt1_l t1, prt2_l t2 WHERE t1.a = t2.a AND t1.a = t2.b AND t1.c = t2.c ORDER BY t1.a, t2.b;
+SELECT t1.a, t1.c, t2.b, t2.c FROM prt1_l t1, prt2_l t2 WHERE t1.a = t2.a AND t1.a = t2.b AND t1.c = t2.c ORDER BY t1.a, t2.b;
+
 -- left join
 EXPLAIN (COSTS OFF)
 SELECT t1.a, t1.c, t2.b, t2.c FROM prt1_l t1 LEFT JOIN prt2_l t2 ON t1.a = t2.b AND t1.c = t2.c WHERE t1.b = 0 ORDER BY t1.a, t2.b;