Account for SRFs in targetlists in planner rowcount estimates.

We made use of the ROWS estimate for set-returning functions used in FROM, but not for those used in SELECT targetlists, which is a bit of an oversight considering there are common usages that require the latter approach. Improve that. (I had initially thought it might be worth folding this into cost_qual_eval, but after investigation concluded that doing so wouldn't be very helpful, so just do it separately.) Per complaint from David Johnston.

Back-patch to 9.2, but not further, for fear of destabilizing plan choices in existing releases.
2025-11-06 07:49:08 +03:00 · 2012-07-21 17:45:07 -04:00
parent ed0af33247
commit 31c7c642b6
7 changed files with 120 additions and 48 deletions
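
The comment on tlist_returns_set_rows in the clauses.c hunk weighs three candidate estimates for a targetlist containing several SRFs: the product of the per-expression rowcounts, their least common multiple, and their maximum. The following standalone sketch is not part of the patch; it only illustrates how the three relate. The function names (gcd_ul, rows_product, rows_max, rows_lcm) are hypothetical, and it assumes exact positive integer rowcounts, whereas the planner works with approximate double estimates.

```c
#include <assert.h>
#include <stddef.h>

/* Greatest common divisor by Euclid's algorithm (illustrative helper). */
unsigned long
gcd_ul(unsigned long a, unsigned long b)
{
	while (b != 0)
	{
		unsigned long t = a % b;

		a = b;
		b = t;
	}
	return a;
}

/* Product of all rowcounts: the worst-case (largest) estimate. */
unsigned long
rows_product(const unsigned long *rows, size_t n)
{
	unsigned long result = 1;

	for (size_t i = 0; i < n; i++)
	{
		assert(rows[i] > 0);
		result *= rows[i];
	}
	return result;
}

/* Largest single rowcount: the best-case (smallest) estimate. */
unsigned long
rows_max(const unsigned long *rows, size_t n)
{
	unsigned long result = 1;

	for (size_t i = 0; i < n; i++)
	{
		assert(rows[i] > 0);
		if (result < rows[i])
			result = rows[i];
	}
	return result;
}

/* Least common multiple: what the comment attributes to ExecTargetList. */
unsigned long
rows_lcm(const unsigned long *rows, size_t n)
{
	unsigned long result = 1;

	for (size_t i = 0; i < n; i++)
	{
		assert(rows[i] > 0);
		/* Divide before multiplying to limit overflow. */
		result = result / gcd_ul(result, rows[i]) * rows[i];
	}
	return result;
}
```

For rowcounts {2, 3} the product and the LCM are both 6 while the max is 3; for {2, 4} the product is 8 while the LCM and the max are both 4. With positive integer inputs the max never exceeds the LCM and the LCM never exceeds the product, which is consistent with the comment calling the max a best-case and the product a worst-case estimate. All three return 1 for an empty list, mirroring the "result is 1 if there are no set-returning functions" convention.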
--- a/src/backend/optimizer/util/clauses.c
+++ b/src/backend/optimizer/util/clauses.c
@@ -661,10 +661,12 @@ find_window_functions_walker(Node *node, WindowFuncLists *lists)

 /*
 * expression_returns_set_rows
- *	  Estimate the number of rows in a set result.
+ *	  Estimate the number of rows returned by a set-returning expression.
+ *	  The result is 1 if there are no set-returning functions.
 *
 * We use the product of the rowcount estimates of all the functions in
- * the given tree.	The result is 1 if there are no set-returning functions.
+ * the given tree (this corresponds to the behavior of ExecMakeFunctionResult
+ * for nested set-returning functions).
 *
 * Note: keep this in sync with expression_returns_set() in nodes/nodeFuncs.c.
 */
@@ -674,7 +676,7 @@ expression_returns_set_rows(Node *clause)
 	double		result = 1;

 	(void) expression_returns_set_rows_walker(clause, &result);
-	return result;
+	return clamp_row_est(result);
 }

 static bool
@@ -736,6 +738,40 @@ expression_returns_set_rows_walker(Node *node, double *count)
 								  (void *) count);
 }

+/*
+ * tlist_returns_set_rows
+ *	  Estimate the number of rows returned by a set-returning targetlist.
+ *	  The result is 1 if there are no set-returning functions.
+ *
+ * Here, the result is the largest rowcount estimate of any of the tlist's
+ * expressions, not the product as you would get from naively applying
+ * expression_returns_set_rows() to the whole tlist.  The behavior actually
+ * implemented by ExecTargetList produces a number of rows equal to the least
+ * common multiple of the expression rowcounts, so that the product would be
+ * a worst-case estimate that is typically not realistic.  Taking the max as
+ * we do here is a best-case estimate that might not be realistic either,
+ * but it's probably closer for typical usages.  We don't try to compute the
+ * actual LCM because we're working with very approximate estimates, so their
+ * LCM would be unduly noisy.
+ */
+double
+tlist_returns_set_rows(List *tlist)
+{
+	double		result = 1;
+	ListCell   *lc;
+
+	foreach(lc, tlist)
+	{
+		TargetEntry *tle = (TargetEntry *) lfirst(lc);
+		double		colresult;
+
+		colresult = expression_returns_set_rows((Node *) tle->expr);
+		if (result < colresult)
+			result = colresult;
+	}
+	return result;
+}
+

 /*****************************************************************************
 *		Subplan clause manipulation