Support multi-argument UNNEST(), and TABLE() syntax for multiple functions.

This patch adds the ability to write TABLE( function1(), function2(), ...) as a single FROM-clause entry. The result is the concatenation of the first row from each function, followed by the second row from each function, etc; with NULLs inserted if any function produces fewer rows than others. This is believed to be a much more useful behavior than what Postgres currently does with multiple SRFs in a SELECT list.

This syntax also provides a reasonable way to combine use of column definition lists with WITH ORDINALITY: put the column definition list inside TABLE(), where it's clear that it doesn't control the ordinality column as well.

Also implement SQL-compliant multiple-argument UNNEST(), by turning UNNEST(a,b,c) into TABLE(unnest(a), unnest(b), unnest(c)).

The SQL standard specifies TABLE() with only a single function, not multiple functions, and it seems to require an implicit UNNEST() which is not what this patch does. There may be something wrong with that reading of the spec, though, because if it's right then the spec's TABLE() is just a pointless alternative spelling of UNNEST(). After further review of that, we might choose to adopt a different syntax for what this patch does, but in any case this functionality seems clearly worthwhile.

Andrew Gierth, reviewed by Zoltán Böszörményi and Heikki Linnakangas, and significantly revised by me
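The row-combining rule described above (row N of the output is row N of every function side by side, with NULLs once a function runs out of rows) can be illustrated with a small standalone sketch. This is not the executor code from this patch: the SrfResult and Cell types and both function names are hypothetical, and each function is simplified to a single integer column whose rows are already materialized.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one function's fully materialized result. */
typedef struct SrfResult
{
	const int  *values;			/* one int column per function (simplified) */
	size_t		nrows;
} SrfResult;

/* One output cell: either a value or SQL NULL. */
typedef struct Cell
{
	bool		isnull;
	int			value;
} Cell;

/* The combined result has as many rows as the longest function result. */
size_t
combined_row_count(const SrfResult *funcs, size_t nfuncs)
{
	size_t		maxrows = 0;

	assert(funcs != NULL || nfuncs == 0);
	for (size_t i = 0; i < nfuncs; i++)
	{
		if (funcs[i].nrows > maxrows)
			maxrows = funcs[i].nrows;
	}
	return maxrows;
}

/* Fill out[0 .. nfuncs-1] with output row number "row" (zero-based). */
void
combined_row(const SrfResult *funcs, size_t nfuncs, size_t row, Cell *out)
{
	assert(out != NULL || nfuncs == 0);
	for (size_t i = 0; i < nfuncs; i++)
	{
		if (row < funcs[i].nrows)
		{
			out[i].isnull = false;
			out[i].value = funcs[i].values[row];
		}
		else
		{
			/* this function is exhausted: pad its column with NULL */
			out[i].isnull = true;
			out[i].value = 0;
		}
	}
}
```

With results {1, 2, 3} and {10}, the sketch yields three rows: (1, 10), (2, NULL), (3, NULL). Taking the maximum length as the row count mirrors the planner estimate in set_function_size_estimates() in the diff below.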
2025-11-21 00:42:43 +03:00 · 2013-11-21 19:37:02 -05:00
parent 38f4328981
commit 784e762e88
48 changed files with 2643 additions and 1207 deletions
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -1076,9 +1076,9 @@ cost_functionscan(Path *path, PlannerInfo *root,
 		path->rows = baserel->rows;

 	/*
-	 * Estimate costs of executing the function expression.
+	 * Estimate costs of executing the function expression(s).
 	 *
-	 * Currently, nodeFunctionscan.c always executes the function to
+	 * Currently, nodeFunctionscan.c always executes the functions to
 	 * completion before returning any rows, and caches the results in a
 	 * tuplestore.	So the function eval cost is all startup cost, and per-row
 	 * costs are minimal.
@@ -1088,7 +1088,7 @@ cost_functionscan(Path *path, PlannerInfo *root,
 	 * estimates for functions tend to be, there's not a lot of point in that
 	 * refinement right now.
 	 */
-	cost_qual_eval_node(&exprcost, rte->funcexpr, root);
+	cost_qual_eval_node(&exprcost, (Node *) rte->functions, root);

 	startup_cost += exprcost.startup + exprcost.per_tuple;

@@ -3845,14 +3845,26 @@ void
 set_function_size_estimates(PlannerInfo *root, RelOptInfo *rel)
 {
 	RangeTblEntry *rte;
+	ListCell   *lc;

 	/* Should only be applied to base relations that are functions */
 	Assert(rel->relid > 0);
 	rte = planner_rt_fetch(rel->relid, root);
 	Assert(rte->rtekind == RTE_FUNCTION);

-	/* Estimate number of rows the function itself will return */
-	rel->tuples = expression_returns_set_rows(rte->funcexpr);
+	/*
+	 * Estimate number of rows the functions will return. The rowcount of the
+	 * node is that of the largest function result.
+	 */
+	rel->tuples = 0;
+	foreach(lc, rte->functions)
+	{
+		RangeTblFunction *rtfunc = (RangeTblFunction *) lfirst(lc);
+		double		ntup = expression_returns_set_rows(rtfunc->funcexpr);
+
+		if (ntup > rel->tuples)
+			rel->tuples = ntup;
+	}

 	/* Now estimate number of output rows, etc */
 	set_baserel_size_estimates(root, rel);