1
0
mirror of https://github.com/postgres/postgres.git synced 2025-06-16 06:01:02 +03:00

Reconsider the representation of join alias Vars.

The core idea of this patch is to make the parser generate join alias
Vars (that is, ones with varno pointing to a JOIN RTE) only when the
alias Var is actually different from any raw join input, that is a type
coercion and/or COALESCE is necessary to generate the join output value.
Otherwise just generate varno/varattno pointing to the relevant join
input column.

In effect, this means that the planner's flatten_join_alias_vars()
transformation is already done in the parser, for all cases except
(a) columns that are merged by JOIN USING and are transformed in the
process, and (b) whole-row join Vars.  In principle that would allow
us to skip doing flatten_join_alias_vars() in many more queries than
we do now, but we don't have quite enough infrastructure to know that
we can do so --- in particular there's no cheap way to know whether
there are any whole-row join Vars.  I'm not sure if it's worth the
trouble to add a Query-level flag for that, and in any case it seems
like fit material for a separate patch.  But even without skipping the
work entirely, this should make flatten_join_alias_vars() faster,
particularly where there are nested joins that it previously had to
flatten recursively.

An essential part of this change is to replace Var nodes'
varnoold/varoattno fields with varnosyn/varattnosyn, which have
considerably more tightly-defined meanings than the old fields: when
they differ from varno/varattno, they identify the Var's position in
an aliased JOIN RTE, and the join alias is what ruleutils.c should
print for the Var.  This is necessary because the varno change
destroyed ruleutils.c's ability to find the JOIN RTE from the Var's
varno.

Another way in which this change broke ruleutils.c is that it's no
longer feasible to determine, from a JOIN RTE's joinaliasvars list,
which join columns correspond to which columns of the join's immediate
input relations.  (If those are sub-joins, the joinaliasvars entries
may point to columns of their base relations, not the sub-joins.)
But that was a horrid mess requiring a lot of fragile assumptions
already, so let's just bite the bullet and add some more JOIN RTE
fields to make it more straightforward to figure that out.  I added
two integer-List fields containing the relevant column numbers from
the left and right input rels, plus a count of how many merged columns
there are.

This patch depends on the ParseNamespaceColumn infrastructure that
I added in commit 5815696bc.  The biggest bit of code change is
restructuring transformFromClauseItem's handling of JOINs so that
the ParseNamespaceColumn data is propagated upward correctly.

Other than that and the ruleutils fixes, everything pretty much
just works, though some processing is now inessential.  I grabbed
two pieces of low-hanging fruit in that line:

1. In find_expr_references, we don't need to recurse into join alias
Vars anymore.  There aren't any except for references to merged USING
columns, which are more properly handled when we scan the join's RTE.
This change actually fixes an edge-case issue: we will now record a
dependency on any type-coercion function present in a USING column's
joinaliasvar, even if that join column has no references in the query
text.  The odds of the missing dependency causing a problem seem quite
small: you'd have to posit somebody dropping an implicit cast between
two data types, without removing the types themselves, and then having
a stored rule containing a whole-row Var for a join whose USING merge
depends on that cast.  So I don't feel a great need to change this in
the back branches.  But in theory this way is more correct.

2. markRTEForSelectPriv and markTargetListOrigin don't need to recurse
into join alias Vars either, because the cases they care about don't
apply to alias Vars for USING columns that are semantically distinct
from the underlying columns.  This removes the only case in which
markVarForSelectPriv could be called with NULL for the RTE, so adjust
the comments to describe that hack as being strictly internal to
markRTEForSelectPriv.

catversion bump required due to changes in stored rules.

Discussion: https://postgr.es/m/7115.1577986646@sss.pgh.pa.us
This commit is contained in:
Tom Lane
2020-01-09 11:56:59 -05:00
parent ed10f32e37
commit 9ce77d75c5
19 changed files with 424 additions and 393 deletions

View File

@ -714,12 +714,15 @@ scanNSItemForColumn(ParseState *pstate, ParseNamespaceItem *nsitem,
colname,
rte->eref->aliasname)));
var = makeVar(nsitem->p_rtindex,
attnum,
var = makeVar(nscol->p_varno,
nscol->p_varattno,
nscol->p_vartype,
nscol->p_vartypmod,
nscol->p_varcollid,
sublevels_up);
/* makeVar doesn't offer parameters for these, so set them by hand: */
var->varnosyn = nscol->p_varnosyn;
var->varattnosyn = nscol->p_varattnosyn;
}
else
{
@ -991,9 +994,10 @@ searchRangeTableForCol(ParseState *pstate, const char *alias, const char *colnam
*
* col == InvalidAttrNumber means a "whole row" reference
*
* The caller should pass the actual RTE if it has it handy; otherwise pass
* NULL, and we'll look it up here. (This uglification of the API is
* worthwhile because nearly all external callers have the RTE at hand.)
* External callers should always pass the Var's RTE. Internally, we
* allow NULL to be passed for the RTE and then look it up if needed;
* this takes less code than requiring each internal recursion site
* to perform a lookup.
*/
static void
markRTEForSelectPriv(ParseState *pstate, RangeTblEntry *rte,
@ -1062,21 +1066,11 @@ markRTEForSelectPriv(ParseState *pstate, RangeTblEntry *rte,
else
{
/*
* Regular join attribute, look at the alias-variable list.
*
* The aliasvar could be either a Var or a COALESCE expression,
* but in the latter case we should already have marked the two
* referent variables as being selected, due to their use in the
* JOIN clause. So we need only be concerned with the Var case.
* But we do need to drill down through implicit coercions.
* Join alias Vars for ordinary columns must refer to merged JOIN
* USING columns. We don't need to do anything here, because the
* join input columns will also be referenced in the join's qual
* clause, and will get marked for select privilege there.
*/
Var *aliasvar;
Assert(col > 0 && col <= list_length(rte->joinaliasvars));
aliasvar = (Var *) list_nth(rte->joinaliasvars, col - 1);
aliasvar = (Var *) strip_implicit_coercions((Node *) aliasvar);
if (aliasvar && IsA(aliasvar, Var))
markVarForSelectPriv(pstate, aliasvar, NULL);
}
}
/* other RTE types don't require privilege marking */
@ -1085,9 +1079,6 @@ markRTEForSelectPriv(ParseState *pstate, RangeTblEntry *rte,
/*
* markVarForSelectPriv
* Mark the RTE referenced by a Var as requiring SELECT privilege
*
* The caller should pass the Var's referenced RTE if it has it handy
* (nearly all do); otherwise pass NULL.
*/
void
markVarForSelectPriv(ParseState *pstate, Var *var, RangeTblEntry *rte)
@ -2110,7 +2101,10 @@ addRangeTableEntryForJoin(ParseState *pstate,
List *colnames,
ParseNamespaceColumn *nscolumns,
JoinType jointype,
int nummergedcols,
List *aliasvars,
List *leftcols,
List *rightcols,
Alias *alias,
bool inFromCl)
{
@ -2135,7 +2129,10 @@ addRangeTableEntryForJoin(ParseState *pstate,
rte->relid = InvalidOid;
rte->subquery = NULL;
rte->jointype = jointype;
rte->joinmergedcols = nummergedcols;
rte->joinaliasvars = aliasvars;
rte->joinleftcols = leftcols;
rte->joinrightcols = rightcols;
rte->alias = alias;
eref = alias ? copyObject(alias) : makeAlias("unnamed_join", NIL);
@ -2713,11 +2710,11 @@ expandRTE(RangeTblEntry *rte, int rtindex, int sublevels_up,
/*
* During ordinary parsing, there will never be any
* deleted columns in the join; but we have to check since
* this routine is also used by the rewriter, and joins
* found in stored rules might have join columns for
* since-deleted columns. This will be signaled by a null
* pointer in the alias-vars list.
* deleted columns in the join. While this function is
* also used by the rewriter and planner, they do not
* currently call it on any JOIN RTEs. Therefore, this
* next bit is dead code, but it seems prudent to handle
* the case correctly anyway.
*/
if (avar == NULL)
{
@ -2753,11 +2750,26 @@ expandRTE(RangeTblEntry *rte, int rtindex, int sublevels_up,
{
Var *varnode;
varnode = makeVar(rtindex, varattno,
exprType(avar),
exprTypmod(avar),
exprCollation(avar),
sublevels_up);
/*
* If the joinaliasvars entry is a simple Var, just
* copy it (with adjustment of varlevelsup and
* location); otherwise it is a JOIN USING column and
* we must generate a join alias Var. This matches
* the results that expansion of "join.*" by
* expandNSItemVars would have produced, if we had
* access to the ParseNamespaceItem for the join.
*/
if (IsA(avar, Var))
{
varnode = copyObject((Var *) avar);
varnode->varlevelsup = sublevels_up;
}
else
varnode = makeVar(rtindex, varattno,
exprType(avar),
exprTypmod(avar),
exprCollation(avar),
sublevels_up);
varnode->location = location;
*colvars = lappend(*colvars, varnode);
@ -2971,12 +2983,15 @@ expandNSItemVars(ParseNamespaceItem *nsitem,
Var *var;
Assert(nscol->p_varno > 0);
var = makeVar(nsitem->p_rtindex,
colindex + 1,
var = makeVar(nscol->p_varno,
nscol->p_varattno,
nscol->p_vartype,
nscol->p_vartypmod,
nscol->p_varcollid,
sublevels_up);
/* makeVar doesn't offer parameters for these, so set by hand: */
var->varnosyn = nscol->p_varnosyn;
var->varattnosyn = nscol->p_varattnosyn;
var->location = location;
result = lappend(result, var);
if (colnames)