mirror of https://github.com/postgres/postgres.git synced 2025-07-21 16:02:15 +03:00

Avoid making commutatively-duplicate clauses in EquivalenceClasses.

When we decide we need to make a derived clause equating a.x and
b.y, we will already re-use a previously-made clause "a.x = b.y".
But we might instead have "b.y = a.x", which is perfectly usable
because equivclass.c has never promised anything about the
operand order in clauses it builds.  Saving construction of a
new RestrictInfo doesn't matter all that much in itself --- but
because we cache selectivity estimates and so on per-RestrictInfo,
there's a possibility of saving a fair amount of duplicative
effort downstream.

Hence, check for commutative matches as well as direct ones when
seeing if we have a pre-existing clause.  This changes the visible
clause order in several regression test cases, but they're all
clearly-insignificant changes.
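
The lookup described above can be sketched in isolation.  This is only a
minimal illustration, not the equivclass.c code: Member, Clause,
find_existing_clause and check_lookup are hypothetical, simplified stand-ins
for EquivalenceMember, RestrictInfo and the search loops in
create_join_clause, and a plain array takes the place of the ec_sources and
ec_derives lists.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an EquivalenceMember. */
typedef struct Member
{
    const char *name;
} Member;

/* Hypothetical stand-in for a RestrictInfo built from two members. */
typedef struct Clause
{
    const Member *left_em;
    const Member *right_em;
    const void   *parent_ec;
} Clause;

/*
 * Return an existing clause equating leftem and rightem in either operand
 * order, or NULL if there is none.  Members are compared by pointer
 * identity, and no operator comparison is made.
 */
const Clause *
find_existing_clause(const Clause *clauses, size_t nclauses,
                     const Member *leftem, const Member *rightem,
                     const void *parent_ec)
{
    for (size_t i = 0; i < nclauses; i++)
    {
        const Clause *c = &clauses[i];

        if (c->parent_ec != parent_ec)
            continue;
        /* direct match, e.g. "a.x = b.y" */
        if (c->left_em == leftem && c->right_em == rightem)
            return c;
        /* commutative match, e.g. "b.y = a.x" */
        if (c->left_em == rightem && c->right_em == leftem)
            return c;
    }
    return NULL;
}

/* Small self-check: a stored "b.y = a.x" satisfies a request for "a.x = b.y". */
void
check_lookup(void)
{
    Member ax = {"a.x"};
    Member by = {"b.y"};
    Member cz = {"c.z"};
    Clause derived[] = {{&by, &ax, NULL}};

    assert(find_existing_clause(derived, 1, &ax, &by, NULL) == &derived[0]);
    assert(find_existing_clause(derived, 1, &by, &ax, NULL) == &derived[0]);
    assert(find_existing_clause(derived, 1, &ax, &cz, NULL) == NULL);
}
```

In this sketch a caller would build a new clause only when the function
returns NULL, so anything cached on the existing clause gets shared by both
operand orders.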

Checking for the reverse operand order is simple enough, but
if we wanted to check for operator OID match we'd need to call
get_commutator here, which is not so cheap.  I concluded that
we don't really need the operator check anyway, so I just
removed it.  It's unlikely that an opfamily contains more than
one applicable operator for a given pair of operand datatypes;
and if it does they had better give the same answers, so there
seems little need to insist that we use exactly the one
select_equality_operator chose.

Using the current core regression suite as a test case, I see
this change reducing the number of new join clauses built by
create_join_clause from 9673 to 5142 (out of 26652 calls).
So not quite 50% savings, but pretty close to it.
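
For reference, the percentages behind those counts can be reproduced with a
few lines of C.  The helper names here (savings_fraction, print_savings) are
made up for the illustration; only the three counts come from the
measurement above.

```c
#include <stdio.h>

/* Fraction of clause constructions avoided, given before/after counts. */
double
savings_fraction(long built_before, long built_after)
{
    return (double) (built_before - built_after) / (double) built_before;
}

/* Print the savings and the share of calls that still build a new clause. */
void
print_savings(void)
{
    const long calls = 26652;
    const long before = 9673;
    const long after = 5142;

    printf("clauses avoided: %ld (%.1f%%)\n",
           before - after, 100.0 * savings_fraction(before, after));
    printf("calls building a new clause: %.1f%% before, %.1f%% after\n",
           100.0 * (double) before / (double) calls,
           100.0 * (double) after / (double) calls);
}
```

That works out to 4531 fewer clauses, or about 46.8% of the clauses
previously built, with the share of calls that construct a new clause
dropping from roughly 36.3% to 19.3%.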

Discussion: https://postgr.es/m/78062.1666735746@sss.pgh.pa.us
Tom Lane
2022-10-27 14:42:18 -04:00
parent 4ab8c81bd9
commit a5fc46414d
5 changed files with 59 additions and 47 deletions


@@ -1382,7 +1382,9 @@ generate_base_implied_equalities_broken(PlannerInfo *root,
  * whenever we select a particular pair of EquivalenceMembers to join,
  * we check to see if the pair matches any original clause (in ec_sources)
  * or previously-built clause (in ec_derives). This saves memory and allows
- * re-use of information cached in RestrictInfos.
+ * re-use of information cached in RestrictInfos. We also avoid generating
+ * commutative duplicates, i.e. if the algorithm selects "a.x = b.y" but
+ * we already have "b.y = a.x", we return the existing clause.
  *
  * join_relids should always equal bms_union(outer_relids, inner_rel->relids).
  * We could simplify this function's API by computing it internally, but in
@@ -1790,7 +1792,8 @@ select_equality_operator(EquivalenceClass *ec, Oid lefttype, Oid righttype)
 /*
  * create_join_clause
  *    Find or make a RestrictInfo comparing the two given EC members
- *    with the given operator.
+ *    with the given operator (or, possibly, its commutator, because
+ *    the ordering of the operands in the result is not guaranteed).
  *
  * parent_ec is either equal to ec (if the clause is a potentially-redundant
  * join clause) or NULL (if not). We have to treat this as part of the
@@ -1811,16 +1814,22 @@ create_join_clause(PlannerInfo *root,
     /*
      * Search to see if we already built a RestrictInfo for this pair of
      * EquivalenceMembers. We can use either original source clauses or
-     * previously-derived clauses. The check on opno is probably redundant,
-     * but be safe ...
+     * previously-derived clauses, and a commutator clause is acceptable.
+     *
+     * We used to verify that opno matches, but that seems redundant: even if
+     * it's not identical, it'd better have the same effects, or the operator
+     * families we're using are broken.
      */
     foreach(lc, ec->ec_sources)
     {
         rinfo = (RestrictInfo *) lfirst(lc);
         if (rinfo->left_em == leftem &&
             rinfo->right_em == rightem &&
-            rinfo->parent_ec == parent_ec &&
-            opno == ((OpExpr *) rinfo->clause)->opno)
+            rinfo->parent_ec == parent_ec)
+            return rinfo;
+        if (rinfo->left_em == rightem &&
+            rinfo->right_em == leftem &&
+            rinfo->parent_ec == parent_ec)
             return rinfo;
     }
@@ -1829,8 +1838,11 @@ create_join_clause(PlannerInfo *root,
         rinfo = (RestrictInfo *) lfirst(lc);
         if (rinfo->left_em == leftem &&
             rinfo->right_em == rightem &&
-            rinfo->parent_ec == parent_ec &&
-            opno == ((OpExpr *) rinfo->clause)->opno)
+            rinfo->parent_ec == parent_ec)
+            return rinfo;
+        if (rinfo->left_em == rightem &&
+            rinfo->right_em == leftem &&
+            rinfo->parent_ec == parent_ec)
             return rinfo;
     }