mirror of https://github.com/postgres/postgres.git synced 2025-11-09 06:21:09 +03:00

Remove redundant grouping and DISTINCT columns.

Avoid explicitly grouping by columns that we know are redundant
for sorting; for example, we need to group by only one of x and y in
	SELECT ... WHERE x = y GROUP BY x, y
This comes up more often than you might think, as shown by the
changes in the regression tests.  It's nearly free to detect too,
since we are just piggybacking on the existing logic that detects
redundant pathkeys.  (In some of the existing plans that change,
it's visible that a sort step preceding the grouping step already
didn't bother to sort by the redundant column, making the old plan
a bit silly-looking.)
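
As a rough illustration of the idea (a toy model, not the planner's code),
the sketch below assumes each column has already been assigned an
equivalence-class number from the WHERE-clause equalities, so that x and y
share a class when the query says x = y. The function name and the array
representation are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Toy model, not PostgreSQL code: eclass_of[c] is the (assumed) equivalence
 * class of column c.  Compacts cols[] in place, keeping only the first
 * column seen from each class, and returns the new length.
 */
size_t
toy_remove_redundant_cols(int *cols, size_t ncols, const int *eclass_of)
{
	size_t		nkept = 0;

	assert(ncols == 0 || (cols != NULL && eclass_of != NULL));

	for (size_t i = 0; i < ncols; i++)
	{
		int			redundant = 0;

		/* A column is redundant if an already-kept column shares its class */
		for (size_t j = 0; j < nkept; j++)
		{
			if (eclass_of[cols[j]] == eclass_of[cols[i]])
			{
				redundant = 1;
				break;
			}
		}
		if (!redundant)
			cols[nkept++] = cols[i];
	}
	return nkept;
}
```

With columns 0 and 1 standing in for x and y in the same class, grouping by
{x, y, z} shrinks to {x, z}.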

To do this, build processed_groupClause and processed_distinctClause
lists that omit any provably-redundant sort items, and consult those
rather than the originals where relevant.  This means that within the
planner, one should usually consult root->processed_groupClause or
root->processed_distinctClause if one wants to know which columns
are to be grouped on; but to check whether grouping or distinct-ing
is happening at all, check non-NIL-ness of parse->groupClause or
parse->distinctClause.  This is comparable to longstanding rules
about handling the HAVING clause, so I don't think it'll be a huge
maintenance problem.
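
The rule above can be sketched with stand-in types. ToyQuery and
ToyPlannerInfo are hypothetical and model only list lengths; they are not
the real Query and PlannerInfo structs. The point is that the processed
list may end up shorter than the original, possibly empty, while grouping
is still happening.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for Query: only the list length is modeled */
typedef struct ToyQuery
{
	size_t		n_group_clause; /* length of parse->groupClause */
} ToyQuery;

/* Hypothetical stand-in for PlannerInfo */
typedef struct ToyPlannerInfo
{
	const ToyQuery *parse;
	size_t		n_processed_group_clause;	/* length of processed_groupClause */
} ToyPlannerInfo;

/* "Is grouping happening at all?"  Ask the original clause list. */
bool
toy_is_grouping(const ToyPlannerInfo *root)
{
	return root->parse->n_group_clause > 0;
}

/* "Which columns are grouped on?"  Ask the processed list. */
size_t
toy_num_group_cols(const ToyPlannerInfo *root)
{
	/* Removing redundant items can only shrink the list */
	assert(root->n_processed_group_clause <= root->parse->n_group_clause);
	return root->n_processed_group_clause;
}
```

Testing the processed list for emptiness would misreport a query whose
grouping columns were all proven redundant as not grouping at all.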

nodeAgg.c also needs minor mods, because it's now possible to generate
AGG_PLAIN and AGG_SORTED Agg nodes with zero grouping columns.

Patch by me; thanks to Richard Guo and David Rowley for review.

Discussion: https://postgr.es/m/185315.1672179489@sss.pgh.pa.us
Tom Lane
2023-01-18 12:37:57 -05:00
parent d540a02a72
commit 8d83a5d0a2
16 changed files with 303 additions and 171 deletions


@@ -1152,18 +1152,62 @@ List *
 make_pathkeys_for_sortclauses(PlannerInfo *root,
 							  List *sortclauses,
 							  List *tlist)
+{
+	List	   *result;
+	bool		sortable;
+	result = make_pathkeys_for_sortclauses_extended(root,
+													&sortclauses,
+													tlist,
+													false,
+													&sortable);
+	/* It's caller error if not all clauses were sortable */
+	Assert(sortable);
+	return result;
+}
+/*
+ * make_pathkeys_for_sortclauses_extended
+ *		Generate a pathkeys list that represents the sort order specified
+ *		by a list of SortGroupClauses
+ *
+ * The comments for make_pathkeys_for_sortclauses apply here too. In addition:
+ *
+ * If remove_redundant is true, then any sort clauses that are found to
+ * give rise to redundant pathkeys are removed from the sortclauses list
+ * (which therefore must be pass-by-reference in this version).
+ *
+ * *sortable is set to true if all the sort clauses are in fact sortable.
+ * If any are not, they are ignored except for setting *sortable false.
+ * (In that case, the output pathkey list isn't really useful. However,
+ * we process the whole sortclauses list anyway, because it's still valid
+ * to remove any clauses that can be proven redundant via the eclass logic.
+ * Even though we'll have to hash in that case, we might as well not hash
+ * redundant columns.)
+ */
+List *
+make_pathkeys_for_sortclauses_extended(PlannerInfo *root,
+									   List **sortclauses,
+									   List *tlist,
+									   bool remove_redundant,
+									   bool *sortable)
 {
 	List	   *pathkeys = NIL;
 	ListCell   *l;
-	foreach(l, sortclauses)
+	*sortable = true;
+	foreach(l, *sortclauses)
 	{
 		SortGroupClause *sortcl = (SortGroupClause *) lfirst(l);
 		Expr	   *sortkey;
 		PathKey    *pathkey;
 		sortkey = (Expr *) get_sortgroupclause_expr(sortcl, tlist);
-		Assert(OidIsValid(sortcl->sortop));
+		if (!OidIsValid(sortcl->sortop))
+		{
+			*sortable = false;
+			continue;
+		}
 		pathkey = make_pathkey_from_sortop(root,
 										   sortkey,
 										   root->nullable_baserels,
@@ -1175,6 +1219,8 @@ make_pathkeys_for_sortclauses(PlannerInfo *root,
 		/* Canonical form eliminates redundant ordering keys */
 		if (!pathkey_is_redundant(pathkey, pathkeys))
 			pathkeys = lappend(pathkeys, pathkey);
+		else if (remove_redundant)
+			*sortclauses = foreach_delete_current(*sortclauses, l);
 	}
 	return pathkeys;
 }
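
To make the control flow of the extended function easier to follow, here
is a self-contained toy version that uses plain arrays in place of List and
an integer class number in place of a PathKey. ToyClause, toy_make_pathkeys
and the convention that sortop 0 plays the role of InvalidOid are all
assumptions made for illustration; only the branching mirrors the diff
above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for SortGroupClause */
typedef struct ToyClause
{
	int			sortop;			/* 0 plays the role of InvalidOid */
	int			eclass;			/* assumed equivalence-class number */
} ToyClause;

/*
 * Builds pathkeys[] (one class number per non-redundant sortable clause)
 * and returns its length.  pathkeys[] must have room for *nclauses entries.
 * If remove_redundant is true, redundant clauses are deleted from clauses[]
 * in place and *nclauses is updated.
 */
size_t
toy_make_pathkeys(ToyClause *clauses, size_t *nclauses,
				  bool remove_redundant, bool *sortable,
				  int *pathkeys)
{
	size_t		npathkeys = 0;
	size_t		nkept = 0;

	assert(nclauses != NULL && sortable != NULL);

	*sortable = true;
	for (size_t i = 0; i < *nclauses; i++)
	{
		bool		redundant = false;

		if (clauses[i].sortop == 0)
		{
			/* Unsortable: no pathkey, but the clause stays in the list */
			*sortable = false;
			clauses[nkept++] = clauses[i];
			continue;
		}

		/* Redundant if an earlier pathkey already covers this class */
		for (size_t j = 0; j < npathkeys; j++)
		{
			if (pathkeys[j] == clauses[i].eclass)
			{
				redundant = true;
				break;
			}
		}

		if (!redundant)
			pathkeys[npathkeys++] = clauses[i].eclass;
		if (!redundant || !remove_redundant)
			clauses[nkept++] = clauses[i];
	}
	*nclauses = nkept;
	return npathkeys;
}
```

As in the real function, an unsortable clause only clears *sortable; the
loop keeps going so that redundant sortable clauses can still be dropped.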