mirror of
https://github.com/postgres/postgres.git
synced 2025-10-27 00:12:01 +03:00
Fix foreign-key selectivity estimation in the presence of constants.
get_foreign_key_join_selectivity() looks for join clauses that equate the two sides of the FK constraint. However, if we have a query like "WHERE fktab.a = pktab.a and fktab.a = 1", it won't find any such join clause, because equivclass.c replaces the given clauses with "fktab.a = 1 and pktab.a = 1", which can be enforced at the scan level, leaving nothing to be done for column "a" at the join level.

We can fix that expectation without much trouble, but then a new problem arises: applying the foreign-key-based selectivity rule produces a rowcount underestimate, because we're effectively double-counting the selectivity of the "fktab.a = 1" clause. So we have to cancel that selectivity out of the estimate.

To fix, refactor process_implied_equality() so that it can pass back the new RestrictInfo to its callers in equivclass.c, allowing the generated "fktab.a = 1" clause to be saved in the EquivalenceClass's ec_derives list. Then it's not much trouble to dig out the relevant RestrictInfo when we need to adjust an FK selectivity estimate. (While at it, we can also remove the expensive use of initialize_mergeclause_eclasses() to set up the new RestrictInfo's left_ec and right_ec pointers. The equivclass.c code can set those basically for free.)

This seems like clearly a bug fix, but I'm hesitant to back-patch it, first because there's some API/ABI risk for extensions and second because we're usually loath to destabilize plan choices in stable branches.

Per report from Sigrid Ehrenreich.

Discussion: https://postgr.es/m/1019549.1603770457@sss.pgh.pa.us
Discussion: https://postgr.es/m/AM6PR02MB5287A0ADD936C1FA80973E72AB190@AM6PR02MB5287.eurprd02.prod.outlook.com
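The double-counting described in the commit message can be illustrated with a little arithmetic. The following is only a sketch under assumed numbers: the JoinStats struct, both function names, and the table sizes are hypothetical and are not part of the PostgreSQL source. It assumes pktab.a is a unique key, so the scan-level clause "pktab.a = 1" leaves one row.

```c
#include <assert.h>

/* Hypothetical statistics, chosen only for illustration. */
typedef struct JoinStats
{
	double		fk_tuples;	/* raw row count of the referencing table */
	double		pk_tuples;	/* raw row count of the referenced table */
	double		const_sel;	/* selectivity of "fktab.a = 1" on fktab */
} JoinStats;

/*
 * Estimate that applies the FK rule (1 / referenced-table-size) on top of
 * inputs already reduced by the scan-level "var = const" clauses.
 */
static double
naive_join_rows(const JoinStats *s)
{
	assert(s->pk_tuples > 0);

	double		fk_rows = s->fk_tuples * s->const_sel;
	double		pk_rows = 1.0;	/* assumption: pktab.a is unique */
	double		fkselec = 1.0 / s->pk_tuples;

	return fk_rows * pk_rows * fkselec;
}

/*
 * Same estimate, but with the constant clause's selectivity divided back
 * out of the FK selectivity, then clamped to the range [0, 1].
 */
static double
corrected_join_rows(const JoinStats *s)
{
	assert(s->pk_tuples > 0);

	double		fk_rows = s->fk_tuples * s->const_sel;
	double		pk_rows = 1.0;	/* assumption: pktab.a is unique */
	double		fkselec = 1.0 / s->pk_tuples;

	if (s->const_sel > 0)
		fkselec /= s->const_sel;
	if (fkselec > 1.0)
		fkselec = 1.0;

	return fk_rows * pk_rows * fkselec;
}
```

With 1,000,000 rows in fktab, 1,000 rows in pktab, and a constant-clause selectivity of 0.001, the naive estimate comes out near 1 row, while the corrected one comes out near 1,000 rows, which is what a uniform distribution of fktab.a would suggest.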
@@ -5066,9 +5066,16 @@ get_foreign_key_join_selectivity(PlannerInfo *root,
 		 * remove back into the worklist.
 		 *
 		 * Since the matching clauses are known not outerjoin-delayed, they
-		 * should certainly have appeared in the initial joinclause list. If
-		 * we didn't find them, they must have been matched to, and removed
-		 * by, some other FK in a previous iteration of this loop. (A likely
+		 * would normally have appeared in the initial joinclause list. If we
+		 * didn't find them, there are two possibilities:
+		 *
+		 * 1. If the FK match is based on an EC that is ec_has_const, it won't
+		 * have generated any join clauses at all. We discount such ECs while
+		 * checking to see if we have "all" the clauses. (Below, we'll adjust
+		 * the selectivity estimate for this case.)
+		 *
+		 * 2. The clauses were matched to some other FK in a previous
+		 * iteration of this loop, and thus removed from worklist. (A likely
 		 * case is that two FKs are matched to the same EC; there will be only
 		 * one EC-derived clause in the initial list, so the first FK will
 		 * consume it.) Applying both FKs' selectivity independently risks
@@ -5078,8 +5085,9 @@ get_foreign_key_join_selectivity(PlannerInfo *root,
 		 * Later we might think of a reasonable way to combine the estimates,
 		 * but for now, just punt, since this is a fairly uncommon situation.
 		 */
-		if (list_length(removedlist) !=
-			(fkinfo->nmatched_ec + fkinfo->nmatched_ri))
+		if (removedlist == NIL ||
+			list_length(removedlist) !=
+			(fkinfo->nmatched_ec - fkinfo->nconst_ec + fkinfo->nmatched_ri))
 		{
 			worklist = list_concat(worklist, removedlist);
 			continue;
@@ -5138,9 +5146,48 @@ get_foreign_key_join_selectivity(PlannerInfo *root,
 
 			fkselec *= 1.0 / ref_tuples;
 		}
+
+		/*
+		 * If any of the FK columns participated in ec_has_const ECs, then
+		 * equivclass.c will have generated "var = const" restrictions for
+		 * each side of the join, thus reducing the sizes of both input
+		 * relations. Taking the fkselec at face value would amount to
+		 * double-counting the selectivity of the constant restriction for the
+		 * referencing Var. Hence, look for the restriction clause(s) that
+		 * were applied to the referencing Var(s), and divide out their
+		 * selectivity to correct for this.
+		 */
+		if (fkinfo->nconst_ec > 0)
+		{
+			for (int i = 0; i < fkinfo->nkeys; i++)
+			{
+				EquivalenceClass *ec = fkinfo->eclass[i];
+
+				if (ec && ec->ec_has_const)
+				{
+					EquivalenceMember *em = fkinfo->fk_eclass_member[i];
+					RestrictInfo *rinfo = find_derived_clause_for_ec_member(ec,
+																			em);
+
+					if (rinfo)
+					{
+						Selectivity s0;
+
+						s0 = clause_selectivity(root,
+												(Node *) rinfo,
+												0,
+												jointype,
+												sjinfo);
+						if (s0 > 0)
+							fkselec /= s0;
+					}
+				}
+			}
+		}
 	}
 
 	*restrictlist = worklist;
+	CLAMP_PROBABILITY(fkselec);
 	return fkselec;
 }
 
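The two changed pieces of logic in the diff above, the "do we have all the clauses" test and the division by the constant clause's selectivity, can be tried out in isolation. What follows is a minimal standalone sketch, not the planner code: FkMatchSketch, fk_fully_matched(), adjust_fk_selectivity(), and MAX_KEYS are hypothetical names, and the per-column selectivities are passed in directly rather than looked up through find_derived_clause_for_ec_member() and clause_selectivity().

```c
#include <stdbool.h>

#define MAX_KEYS 4				/* hypothetical limit for this sketch */

/* Hypothetical, simplified stand-in for the per-FK match bookkeeping. */
typedef struct FkMatchSketch
{
	int			nkeys;			/* number of FK columns */
	int			nmatched_ec;	/* columns matched via an equivalence class */
	int			nconst_ec;		/* of those, classes that contain a constant */
	int			nmatched_ri;	/* clauses matched directly, not via an EC */
	bool		ec_has_const[MAX_KEYS];
	double		const_clause_sel[MAX_KEYS]; /* 0 means no derived clause found */
} FkMatchSketch;

/*
 * Mirrors the revised test: constant-containing ECs produce no join clause,
 * so they are subtracted from the expected count. An empty removed list
 * never counts as a full match.
 */
static bool
fk_fully_matched(const FkMatchSketch *fk, int nremoved)
{
	if (nremoved == 0)
		return false;
	return nremoved == fk->nmatched_ec - fk->nconst_ec + fk->nmatched_ri;
}

/*
 * Divide out the selectivity of each "var = const" clause on a referencing
 * column, then clamp the result to [0, 1].
 */
static double
adjust_fk_selectivity(const FkMatchSketch *fk, double fkselec)
{
	for (int i = 0; i < fk->nkeys; i++)
	{
		if (fk->ec_has_const[i] && fk->const_clause_sel[i] > 0)
			fkselec /= fk->const_clause_sel[i];
	}

	if (fkselec < 0.0)
		fkselec = 0.0;
	else if (fkselec > 1.0)
		fkselec = 1.0;
	return fkselec;
}
```

For a two-column FK where one column's EC contains a constant, exactly one removed join clause counts as a full match. The final clamp matters because the division can push the running product above 1.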