Drop no-op CoerceToDomain nodes from expressions at planning time.

If a domain has no constraints, then CoerceToDomain doesn't really do anything and can be simplified to a RelabelType. This not only eliminates cycles at execution, but allows the planner to optimize better (for instance, match the coerced expression to an index on the underlying column). However, we do have to support invalidating the plan later if a constraint gets added to the domain. That's comparable to the case of a change to a SQL function that had been inlined into a plan, so all the necessary logic already exists for plans depending on functions. We need only duplicate or share that logic for domains. ALTER DOMAIN ADD/DROP CONSTRAINT need to be taught to send out sinval messages for the domain's pg_type entry, since those operations don't update that row. (ALTER DOMAIN SET/DROP NOT NULL do update that row, so no code change is needed for them.) Testing this revealed what's really a pre-existing bug in plpgsql: it caches the SQL-expression-tree expansion of type coercions and had no provision for invalidating entries in that cache. Up to now that was only a problem if such an expression had inlined a SQL function that got changed, which is unlikely though not impossible. But failing to track changes of domain constraints breaks an existing regression test case and would likely cause practical problems too. We could fix that locally in plpgsql, but what seems like a better idea is to build some generic infrastructure in plancache.c to store standalone expressions and track invalidation events for them. (It's tempting to wonder whether plpgsql's "simple expression" stuff could use this code with lower overhead than its current use of the heavyweight plancache APIs. But I've left that idea for later.) Other stuff fixed in passing: * Allow estimate_expression_value() to drop CoerceToDomain unconditionally, effectively assuming that the coercion will succeed. This will improve planner selectivity estimates for cases involving estimatable expressions that are coerced to domains. We could have done this independently of everything else here, but there wasn't previously any need for eval_const_expressions_mutator to know about CoerceToDomain at all. * Use a dlist for plancache.c's list of cached plans, rather than a manually threaded singly-linked list. That eliminates a potential performance problem in DropCachedPlan. * Fix a couple of inconsistencies in typecmds.c about whether operations on domains drop RowExclusiveLock on pg_type. Our common practice is that DDL operations do drop catalog locks, so standardize on that choice. Discussion: https://postgr.es/m/19958.1544122124@sss.pgh.pa.us
2025-07-07 00:36:50 +03:00 · 2018-12-13 13:24:43 -05:00
parent 52ac6cd2d0
commit 04fe805a17
12 changed files with 510 additions and 92 deletions
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@ -5923,10 +5923,16 @@ adjust_paths_for_srfs(PlannerInfo *root, RelOptInfo *rel,
 * side-effect that is useful when the expression will get evaluated more than
 * once.  Also, we must fix operator function IDs.
 *
+ * This does not return any information about dependencies of the expression.
+ * Hence callers should use the results only for the duration of the current
+ * query.  Callers that would like to cache the results for longer should use
+ * expression_planner_with_deps, probably via the plancache.
+ *
 * Note: this must not make any damaging changes to the passed-in expression
 * tree.  (It would actually be okay to apply fix_opfuncids to it, but since
 * we first do an expression_tree_mutator-based walk, what is returned will
- * be a new node tree.)
+ * be a new node tree.)  The result is constructed in the current memory
+ * context; beware that this can leak a lot of additional stuff there, too.
 */
 Expr *
 expression_planner(Expr *expr)
@ -5945,6 +5951,57 @@ expression_planner(Expr *expr)
 	return (Expr *) result;
 }

+/*
+ * expression_planner_with_deps
+ *		Perform planner's transformations on a standalone expression,
+ *		returning expression dependency information along with the result.
+ *
+ * This is identical to expression_planner() except that it also returns
+ * information about possible dependencies of the expression, ie identities of
+ * objects whose definitions affect the result.  As in a PlannedStmt, these
+ * are expressed as a list of relation Oids and a list of PlanInvalItems.
+ */
+Expr *
+expression_planner_with_deps(Expr *expr,
+							 List **relationOids,
+							 List **invalItems)
+{
+	Node	   *result;
+	PlannerGlobal glob;
+	PlannerInfo root;
+
+	/* Make up dummy planner state so we can use setrefs machinery */
+	MemSet(&glob, 0, sizeof(glob));
+	glob.type = T_PlannerGlobal;
+	glob.relationOids = NIL;
+	glob.invalItems = NIL;
+
+	MemSet(&root, 0, sizeof(root));
+	root.type = T_PlannerInfo;
+	root.glob = &glob;
+
+	/*
+	 * Convert named-argument function calls, insert default arguments and
+	 * simplify constant subexprs.  Collect identities of inlined functions
+	 * and elided domains, too.
+	 */
+	result = eval_const_expressions(&root, (Node *) expr);
+
+	/* Fill in opfuncid values if missing */
+	fix_opfuncids(result);
+
+	/*
+	 * Now walk the finished expression to find anything else we ought to
+	 * record as an expression dependency.
+	 */
+	(void) extract_query_dependencies_walker(result, &root);
+
+	*relationOids = glob.relationOids;
+	*invalItems = glob.invalItems;
+
+	return (Expr *) result;
+}
+

 /*
 * plan_cluster_use_sort
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@ -138,8 +138,7 @@ static List *set_returning_clause_references(PlannerInfo *root,
 								Plan *topplan,
 								Index resultRelation,
 								int rtoffset);
-static bool extract_query_dependencies_walker(Node *node,
-								  PlannerInfo *context);
+

 /*****************************************************************************
 *
@ -175,8 +174,8 @@ static bool extract_query_dependencies_walker(Node *node,
 * This will be used by plancache.c to drive invalidation of cached plans.
 * Relation dependencies are represented by OIDs, and everything else by
 * PlanInvalItems (this distinction is motivated by the shared-inval APIs).
- * Currently, relations and user-defined functions are the only types of
- * objects that are explicitly tracked this way.
+ * Currently, relations, user-defined functions, and domains are the only
+ * types of objects that are explicitly tracked this way.
 *
 * 8. We assign every plan node in the tree a unique ID.
 *
@ -2577,6 +2576,42 @@ record_plan_function_dependency(PlannerInfo *root, Oid funcid)
 	}
 }

+/*
+ * record_plan_type_dependency
+ *		Mark the current plan as depending on a particular type.
+ *
+ * This is exported so that eval_const_expressions can record a
+ * dependency on a domain that it's removed a CoerceToDomain node for.
+ *
+ * We don't currently need to record dependencies on domains that the
+ * plan contains CoerceToDomain nodes for, though that might change in
+ * future.  Hence, this isn't actually called in this module, though
+ * someday fix_expr_common might call it.
+ */
+void
+record_plan_type_dependency(PlannerInfo *root, Oid typeid)
+{
+	/*
+	 * As in record_plan_function_dependency, ignore the possibility that
+	 * someone would change a built-in domain.
+	 */
+	if (typeid >= (Oid) FirstBootstrapObjectId)
+	{
+		PlanInvalItem *inval_item = makeNode(PlanInvalItem);
+
+		/*
+		 * It would work to use any syscache on pg_type, but the easiest is
+		 * TYPEOID since we already have the type's OID at hand.  Note that
+		 * plancache.c knows we use TYPEOID.
+		 */
+		inval_item->cacheId = TYPEOID;
+		inval_item->hashValue = GetSysCacheHashValue1(TYPEOID,
+													  ObjectIdGetDatum(typeid));
+
+		root->glob->invalItems = lappend(root->glob->invalItems, inval_item);
+	}
+}
+
 /*
 * extract_query_dependencies
 *		Given a rewritten, but not yet planned, query or queries
@ -2586,6 +2621,13 @@ record_plan_function_dependency(PlannerInfo *root, Oid funcid)
 *
 * This is needed by plancache.c to handle invalidation of cached unplanned
 * queries.
+ *
+ * Note: this does not go through eval_const_expressions, and hence doesn't
+ * reflect its additions of inlined functions and elided CoerceToDomain nodes
+ * to the invalItems list.  This is obviously OK for functions, since we'll
+ * see them in the original query tree anyway.  For domains, it's OK because
+ * we don't care about domains unless they get elided.  That is, a plan might
+ * have domain dependencies that the query tree doesn't.
 */
 void
 extract_query_dependencies(Node *query,
@ -2615,14 +2657,20 @@ extract_query_dependencies(Node *query,
 	*hasRowSecurity = glob.dependsOnRole;
 }

-static bool
+/*
+ * Tree walker for extract_query_dependencies.
+ *
+ * This is exported so that expression_planner_with_deps can call it on
+ * simple expressions (post-planning, not before planning, in that case).
+ * In that usage, glob.dependsOnRole isn't meaningful, but the relationOids
+ * and invalItems lists are added to as needed.
+ */
+bool
 extract_query_dependencies_walker(Node *node, PlannerInfo *context)
 {
 	if (node == NULL)
 		return false;
 	Assert(!IsA(node, PlaceHolderVar));
-	/* Extract function dependencies and check for regclass Consts */
-	fix_expr_common(context, node);
 	if (IsA(node, Query))
 	{
 		Query	   *query = (Query *) node;
@ -2662,6 +2710,8 @@ extract_query_dependencies_walker(Node *node, PlannerInfo *context)
 		return query_tree_walker(query, extract_query_dependencies_walker,
 								 (void *) context, 0);
 	}
+	/* Extract function dependencies and check for regclass Consts */
+	fix_expr_common(context, node);
 	return expression_tree_walker(node, extract_query_dependencies_walker,
 								  (void *) context);
 }
--- a/src/backend/optimizer/util/clauses.c
+++ b/src/backend/optimizer/util/clauses.c
@ -3699,6 +3699,70 @@ eval_const_expressions_mutator(Node *node,
 				newbtest->location = btest->location;
 				return (Node *) newbtest;
 			}
+		case T_CoerceToDomain:
+			{
+				/*
+				 * If the domain currently has no constraints, we replace the
+				 * CoerceToDomain node with a simple RelabelType, which is
+				 * both far faster to execute and more amenable to later
+				 * optimization.  We must then mark the plan as needing to be
+				 * rebuilt if the domain's constraints change.
+				 *
+				 * Also, in estimation mode, always replace CoerceToDomain
+				 * nodes, effectively assuming that the coercion will succeed.
+				 */
+				CoerceToDomain *cdomain = (CoerceToDomain *) node;
+				CoerceToDomain *newcdomain;
+				Node	   *arg;
+
+				arg = eval_const_expressions_mutator((Node *) cdomain->arg,
+													 context);
+				if (context->estimate ||
+					!DomainHasConstraints(cdomain->resulttype))
+				{
+					/* Record dependency, if this isn't estimation mode */
+					if (context->root && !context->estimate)
+						record_plan_type_dependency(context->root,
+													cdomain->resulttype);
+
+					/* Generate RelabelType to substitute for CoerceToDomain */
+					/* This should match the RelabelType logic above */
+
+					while (arg && IsA(arg, RelabelType))
+						arg = (Node *) ((RelabelType *) arg)->arg;
+
+					if (arg && IsA(arg, Const))
+					{
+						Const	   *con = (Const *) arg;
+
+						con->consttype = cdomain->resulttype;
+						con->consttypmod = cdomain->resulttypmod;
+						con->constcollid = cdomain->resultcollid;
+						return (Node *) con;
+					}
+					else
+					{
+						RelabelType *newrelabel = makeNode(RelabelType);
+
+						newrelabel->arg = (Expr *) arg;
+						newrelabel->resulttype = cdomain->resulttype;
+						newrelabel->resulttypmod = cdomain->resulttypmod;
+						newrelabel->resultcollid = cdomain->resultcollid;
+						newrelabel->relabelformat = cdomain->coercionformat;
+						newrelabel->location = cdomain->location;
+						return (Node *) newrelabel;
+					}
+				}
+
+				newcdomain = makeNode(CoerceToDomain);
+				newcdomain->arg = (Expr *) arg;
+				newcdomain->resulttype = cdomain->resulttype;
+				newcdomain->resulttypmod = cdomain->resulttypmod;
+				newcdomain->resultcollid = cdomain->resultcollid;
+				newcdomain->coercionformat = cdomain->coercionformat;
+				newcdomain->location = cdomain->location;
+				return (Node *) newcdomain;
+			}
 		case T_PlaceHolderVar:

 			/*
@ -3770,7 +3834,7 @@ eval_const_expressions_mutator(Node *node,
 	 * For any node type not handled above, copy the node unchanged but
 	 * const-simplify its subexpressions.  This is the correct thing for node
 	 * types whose behavior might change between planning and execution, such
-	 * as CoerceToDomain.  It's also a safe default for new node types not
+	 * as CurrentOfExpr.  It's also a safe default for new node types not
 	 * known to this routine.
 	 */
 	return ece_generic_processing(node);