mirror of
https://github.com/postgres/postgres.git
synced 2025-06-25 01:02:05 +03:00
Improve planner's handling of set-returning functions in grouping columns.
Improve query_is_distinct_for() to accept SRFs in the targetlist when we can prove distinctness from a DISTINCT clause. In that case the de-duplication will surely happen after SRF expansion, so the proof still works. Continue to punt in the case where we'd try to prove distinctness from GROUP BY (or, in the future, source relations). To do that, we'd have to determine whether the SRFs were in the grouping columns or elsewhere in the tlist, and it still doesn't seem worth the trouble. But this trivial change allows us to recognize that "SELECT DISTINCT unnest(foo) FROM ..." produces unique-ified output, which seems worth having.

Also, fix estimate_num_groups() to consider the possibility of SRFs in the grouping columns. Its failure to do so was masked before v10 because grouping_planner() scaled up plan rowcount estimates by the estimated SRF multiplier after performing grouping. That doesn't happen anymore, which is more correct, but it means we need an adjustment in the estimate for the number of groups. Failure to do this leads to an underestimate for the number of output rows of subqueries like "SELECT DISTINCT unnest(foo)" compared to what 9.6 and earlier estimated, thus breaking plan choices in some cases.

Per report from Dmitry Shalashov. Back-patch to v10 to avoid degraded plan choices compared to previous releases.

Discussion: https://postgr.es/m/CAKPeCUGAeHgoh5O=SvcQxREVkoX7UdeJUMj1F5=aBNvoTa+O8w@mail.gmail.com
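
The distinctness rule described in the first paragraph of the message can be modeled roughly as follows. This is only a toy sketch: the ToyQuery struct and toy_can_attempt_distinct_proof() are hypothetical names, not PostgreSQL's Query struct or query_is_distinct_for(), and the sketch leaves out everything else the real function checks (which columns are requested, operator compatibility, aggregates, set operations).

```c
#include <stdbool.h>

/* Hypothetical, heavily simplified view of a subquery (not PostgreSQL's Query). */
typedef struct ToyQuery
{
	bool		has_distinct_clause;	/* query has a DISTINCT clause */
	bool		has_group_clause;		/* query has a GROUP BY clause */
	bool		has_target_srfs;		/* set-returning functions in the tlist */
} ToyQuery;

/*
 * Returns true if a distinctness proof may be attempted at all.
 * Whether the proof then succeeds is a separate question not modeled here.
 */
static bool
toy_can_attempt_distinct_proof(const ToyQuery *q)
{
	/* DISTINCT de-duplicates after SRF expansion, so SRFs do no harm. */
	if (q->has_distinct_clause)
		return true;

	/* Proving distinctness from GROUP BY with SRFs in the tlist: punt. */
	if (q->has_target_srfs)
		return false;

	return q->has_group_clause;
}
```

Under this model, a query shaped like "SELECT DISTINCT unnest(foo) FROM ..." is accepted, while a GROUP BY query with an SRF in its targetlist is still rejected.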
@@ -3361,6 +3361,7 @@ estimate_num_groups(PlannerInfo *root, List *groupExprs, double input_rows,
					List **pgset)
{
	List *varinfos = NIL;
	double srf_multiplier = 1.0;
	double numdistinct;
	ListCell *l;
	int i;
@@ -3394,6 +3395,7 @@ estimate_num_groups(PlannerInfo *root, List *groupExprs, double input_rows,
	foreach(l, groupExprs)
	{
		Node *groupexpr = (Node *) lfirst(l);
		double this_srf_multiplier;
		VariableStatData vardata;
		List *varshere;
		ListCell *l2;
@@ -3402,6 +3404,21 @@ estimate_num_groups(PlannerInfo *root, List *groupExprs, double input_rows,
		if (pgset && !list_member_int(*pgset, i++))
			continue;

		/*
		 * Set-returning functions in grouping columns are a bit problematic.
		 * The code below will effectively ignore their SRF nature and come up
		 * with a numdistinct estimate as though they were scalar functions.
		 * We compensate by scaling up the end result by the largest SRF
		 * rowcount estimate. (This will be an overestimate if the SRF
		 * produces multiple copies of any output value, but it seems best to
		 * assume the SRF's outputs are distinct. In any case, it's probably
		 * pointless to worry too much about this without much better
		 * estimates for SRF output rowcounts than we have today.)
		 */
		this_srf_multiplier = expression_returns_set_rows(groupexpr);
		if (srf_multiplier < this_srf_multiplier)
			srf_multiplier = this_srf_multiplier;

		/* Short-circuit for expressions returning boolean */
		if (exprType(groupexpr) == BOOLOID)
		{
@@ -3467,9 +3484,15 @@ estimate_num_groups(PlannerInfo *root, List *groupExprs, double input_rows,
	 */
	if (varinfos == NIL)
	{
		/* Apply SRF multiplier as we would do in the long path */
		numdistinct *= srf_multiplier;
		/* Round off */
		numdistinct = ceil(numdistinct);
		/* Guard against out-of-range answers */
		if (numdistinct > input_rows)
			numdistinct = input_rows;
		if (numdistinct < 1.0)
			numdistinct = 1.0;
		return numdistinct;
	}

@@ -3638,6 +3661,10 @@ estimate_num_groups(PlannerInfo *root, List *groupExprs, double input_rows,
		varinfos = newvarinfos;
	} while (varinfos != NIL);

	/* Now we can account for the effects of any SRFs */
	numdistinct *= srf_multiplier;

	/* Round off */
	numdistinct = ceil(numdistinct);

	/* Guard against out-of-range answers */
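
Taken together, the hunks above track the largest SRF rowcount estimate across the grouping expressions, scale the scalar-style distinct estimate by it, round up, and clamp the result. The following standalone sketch illustrates that arithmetic only. It is not PostgreSQL code: largest_srf_multiplier(), adjust_group_estimate(), and round_up() are hypothetical helpers, the srf_rows array stands in for what expression_returns_set_rows() would report per expression (1.0 for a non-SRF), and round_up() replaces ceil() purely so the sketch has no libm dependency.

```c
#include <stddef.h>

/* Stand-in for ceil(); assumes x fits in a long long (sketch only). */
static double
round_up(double x)
{
	double		truncated = (double) (long long) x;

	return (truncated < x) ? truncated + 1.0 : truncated;
}

/* Largest per-expression SRF rowcount estimate, never below 1.0. */
static double
largest_srf_multiplier(const double *srf_rows, size_t n)
{
	double		srf_multiplier = 1.0;

	for (size_t i = 0; i < n; i++)
	{
		if (srf_multiplier < srf_rows[i])
			srf_multiplier = srf_rows[i];
	}
	return srf_multiplier;
}

/*
 * Scale an estimate computed as if all grouping expressions were scalar,
 * then round up and clamp to the range [1, input_rows].
 */
static double
adjust_group_estimate(double scalar_numdistinct, double srf_multiplier,
					  double input_rows)
{
	double		numdistinct = scalar_numdistinct * srf_multiplier;

	numdistinct = round_up(numdistinct);
	if (numdistinct > input_rows)
		numdistinct = input_rows;
	if (numdistinct < 1.0)
		numdistinct = 1.0;
	return numdistinct;
}
```

For instance, a scalar-style estimate of 50 groups with an assumed SRF rowcount of 10 becomes 500 groups, unless the input row count is smaller, in which case the input row count wins.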