Mirror of https://github.com/postgres/postgres.git (synced 2025-11-19 13:42:17 +03:00)
Fix partitionwise join with partially-redundant join clauses

To determine if the two relations being joined can use partitionwise join, we need to verify the existence of equi-join conditions involving pairs of matching partition keys for all partition keys. Currently we do that by looking through the join's restriction clauses. However, it has been discovered that this approach is insufficient, because there might be partition keys known equal by a specific EC, but they do not form a join clause because it happens that other members of the EC than the partition keys are constrained to become a join clause.

To address this issue, in addition to examining the join's restriction clauses, we also check if any partition keys are known equal by ECs, by leveraging function exprs_known_equal(). To accomplish this, we enhance exprs_known_equal() to check equality per the semantics of the opfamily, if provided.

It could be argued that exprs_known_equal() could be called O(N^2) times, where N is the number of partition key expressions, resulting in noticeable performance costs if there are a lot of partition key expressions. But I think this is not a problem. The number of a joinrel's partition key expressions would only be equal to the join degree, since each base relation within the join contributes only one partition key expression. That is to say, it does not scale with the number of partitions. A benchmark with a query involving 5-way joins of partitioned tables, each with 3 partition keys and 1000 partitions, shows that the planning time is not significantly affected by this patch (within the margin of error), particularly when compared to the impact caused by partitionwise join.

Thanks to Tom Lane for the idea of leveraging exprs_known_equal() to check if partition keys are known equal by ECs.

Author: Richard Guo, Tom Lane
Reviewed-by: Tom Lane, Ashutosh Bapat, Robert Haas
Discussion: https://postgr.es/m/CAN_9JTzo_2F5dKLqXVtDX5V6dwqB0Xk+ihstpKEt3a1LT6X78A@mail.gmail.com
@@ -2443,15 +2443,17 @@ find_join_domain(PlannerInfo *root, Relids relids)
  * Detect whether two expressions are known equal due to equivalence
  * relationships.
  *
- * Actually, this only shows that the expressions are equal according
- * to some opfamily's notion of equality --- but we only use it for
- * selectivity estimation, so a fuzzy idea of equality is OK.
+ * If opfamily is given, the expressions must be known equal per the semantics
+ * of that opfamily (note it has to be a btree opfamily, since those are the
+ * only opfamilies equivclass.c deals with). If opfamily is InvalidOid, we'll
+ * return true if they're equal according to any opfamily, which is fuzzy but
+ * OK for estimation purposes.
  *
  * Note: does not bother to check for "equal(item1, item2)"; caller must
  * check that case if it's possible to pass identical items.
  */
 bool
-exprs_known_equal(PlannerInfo *root, Node *item1, Node *item2)
+exprs_known_equal(PlannerInfo *root, Node *item1, Node *item2, Oid opfamily)
 {
 	ListCell *lc1;
 
@@ -2466,6 +2468,17 @@ exprs_known_equal(PlannerInfo *root, Node *item1, Node *item2)
 		if (ec->ec_has_volatile)
 			continue;
 
+		/*
+		 * It's okay to consider ec_broken ECs here. Brokenness just means we
+		 * couldn't derive all the implied clauses we'd have liked to; it does
+		 * not invalidate our knowledge that the members are equal.
+		 */
+
+		/* Ignore if this EC doesn't use specified opfamily */
+		if (OidIsValid(opfamily) &&
+			!list_member_oid(ec->ec_opfamilies, opfamily))
+			continue;
+
 		foreach(lc2, ec->ec_members)
 		{
 			EquivalenceMember *em = (EquivalenceMember *) lfirst(lc2);
@@ -2494,8 +2507,7 @@ exprs_known_equal(PlannerInfo *root, Node *item1, Node *item2)
  * (In principle there might be more than one matching eclass if multiple
  * collations are involved, but since collation doesn't matter for equality,
  * we ignore that fine point here.) This is much like exprs_known_equal,
- * except that we insist on the comparison operator matching the eclass, so
- * that the result is definite not approximate.
+ * except for the format of the input.
  *
  * On success, we also set fkinfo->eclass[colno] to the matching eclass,
  * and set fkinfo->fk_eclass_member[colno] to the eclass member for the
@@ -2536,7 +2548,7 @@ match_eclasses_to_foreign_key_col(PlannerInfo *root,
 		/* Never match to a volatile EC */
 		if (ec->ec_has_volatile)
 			continue;
-		/* Note: it seems okay to match to "broken" eclasses here */
+		/* It's okay to consider "broken" ECs here, see exprs_known_equal */
 
 		foreach(lc2, ec->ec_members)
 		{
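
The check described in the commit message (scan the join clauses first, then fall back to an equivalence-class lookup restricted to one opfamily) can be illustrated with a small standalone sketch. This is not PostgreSQL code: the Toy* types, the toy_* functions, and the use of plain strings for expressions are all hypothetical simplifications introduced here, and the real exprs_known_equal() works on planner Node trees and handles details this sketch ignores.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-ins for illustration only; these are NOT PostgreSQL's structures. */
typedef unsigned int ToyOid;
#define TOY_INVALID_OID ((ToyOid) 0)
#define TOY_MAX 4

typedef struct ToyEC
{
	bool		has_volatile;			/* plays the role of ec_has_volatile */
	ToyOid		opfamilies[TOY_MAX];	/* plays the role of ec_opfamilies */
	int			nopfamilies;
	const char *members[TOY_MAX];		/* member expressions as plain strings */
	int			nmembers;
} ToyEC;

typedef struct ToyClause
{
	const char *left;					/* toy join clause "left = right" */
	const char *right;
} ToyClause;

static bool
toy_ec_uses_opfamily(const ToyEC *ec, ToyOid opfamily)
{
	for (int i = 0; i < ec->nopfamilies; i++)
		if (ec->opfamilies[i] == opfamily)
			return true;
	return false;
}

static bool
toy_ec_has_member(const ToyEC *ec, const char *expr)
{
	for (int i = 0; i < ec->nmembers; i++)
		if (strcmp(ec->members[i], expr) == 0)
			return true;
	return false;
}

/* Simplified analogue of exprs_known_equal() with an opfamily filter. */
bool
toy_exprs_known_equal(const ToyEC *ecs, int necs,
					  const char *item1, const char *item2, ToyOid opfamily)
{
	assert(item1 != NULL && item2 != NULL);

	for (int i = 0; i < necs; i++)
	{
		const ToyEC *ec = &ecs[i];

		if (ec->has_volatile)
			continue;			/* never trust a volatile EC */
		if (opfamily != TOY_INVALID_OID && !toy_ec_uses_opfamily(ec, opfamily))
			continue;			/* EC does not use the requested opfamily */
		if (toy_ec_has_member(ec, item1) && toy_ec_has_member(ec, item2))
			return true;
	}
	return false;
}

/* Does any join clause directly equate the two partition keys? */
bool
toy_clauses_equate(const ToyClause *clauses, int nclauses,
				   const char *key1, const char *key2)
{
	for (int i = 0; i < nclauses; i++)
	{
		const char *l = clauses[i].left;
		const char *r = clauses[i].right;

		if ((strcmp(l, key1) == 0 && strcmp(r, key2) == 0) ||
			(strcmp(l, key2) == 0 && strcmp(r, key1) == 0))
			return true;
	}
	return false;
}

/* Clause scan first; EC lookup under the given opfamily as the fallback. */
bool
toy_keys_matched(const ToyClause *clauses, int nclauses,
				 const ToyEC *ecs, int necs,
				 const char *key1, const char *key2, ToyOid opfamily)
{
	return toy_clauses_equate(clauses, nclauses, key1, key2) ||
		toy_exprs_known_equal(ecs, necs, key1, key2, opfamily);
}
```

As a usage example, take one EC with members "t1.a", "t2.a" and "t2.b" and a single join clause t1.a = t2.b, with "t1.a" and "t2.a" as the partition keys. The clause scan alone finds no match for the key pair, while the EC lookup succeeds as long as the EC lists the requested opfamily (or the caller passes the invalid OID to accept any opfamily).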