Putting a reference to an expanded-format value into a Const node would be
a bad idea for a couple of reasons. It'd be possible for the supposedly
immutable Const to change value, if something modified the referenced
variable ... in fact, if the Const's reference were R/W, any function that
has the Const as argument might itself change it at runtime. Also, because
datumIsEqual() is pretty simplistic, the Const might fail to compare equal
to other Consts that it should compare equal to, notably including copies
of itself. This could lead to unexpected planner behavior, such as "could
not find pathkey item to sort" errors or inferior plans.
I have not been able to find any way to get an expanded value into a Const
within the existing core code; but Paul Ramsey was able to trigger the
problem by writing a datatype input function that returns an expanded
value.
The best fix seems to be to establish a rule that varlena values being
placed into Const nodes should be passed through pg_detoast_datum().
That will do nothing (and cost little) in normal cases, but it will flatten
expanded values and thereby avoid the above problems. Also, it will
convert short-header or compressed values into canonical format, which will
avoid possible unexpected lack-of-equality issues for those cases too.
And it provides a last-ditch defense against putting a toasted value into
a Const, which we already knew was dangerous, cf commit 2b0c86b66563cf2f.
(In the light of this discussion, I'm no longer sure that that commit
provided 100% protection against such cases, but this fix should do it.)
The test added in commit 65c3d05e18e7c530 to catch datatype input functions
with unstable results would fail for functions that returned expanded
values; but it seems a bit uncharitable to deem a result unstable just
because it's expressed in expanded form, so revise the coding so that we
check for bitwise equality only after applying pg_detoast_datum(). That's
a sufficient condition anyway given the new rule about detoasting when
forming a Const.
Back-patch to 9.5 where the expanded-object facility was added. It's
possible that this should go back further; but in the absence of clear
evidence that there's any live bug in older branches, I'll refrain for now.
coerce_type() has local variables named targetTypeId, baseTypeId, and
targetType. targetType has been the Type structure for baseTypeId, so
rename it to baseType.
This patch introduces generic support for ordered-set and hypothetical-set
aggregate functions, as well as implementations of the instances defined in
SQL:2008 (percentile_cont(), percentile_disc(), rank(), dense_rank(),
percent_rank(), cume_dist()). We also added mode() though it is not in the
spec, as well as versions of percentile_cont() and percentile_disc() that
can compute multiple percentile values in one pass over the data.
Unlike the original submission, this patch puts full control of the sorting
process in the hands of the aggregate's support functions. To allow the
support functions to find out how they're supposed to sort, a new API
function AggGetAggref() is added to nodeAgg.c. This allows retrieval of
the aggregate call's Aggref node, which may have other uses beyond the
immediate need. There is also support for ordered-set aggregates to
install cleanup callback functions, so that they can be sure that
infrastructure such as tuplesort objects gets cleaned up.
In passing, make some fixes in the recently-added support for variadic
aggregates, and make some editorial adjustments in the recent FILTER
additions for aggregates. Also, simplify use of IsBinaryCoercible() by
allowing it to succeed whenever the target type is ANY or ANYELEMENT.
It was inconsistent that it dealt with other polymorphic target types
but not these.
Atri Sharma and Andrew Gierth; reviewed by Pavel Stehule and Vik Fearing,
and rather heavily editorialized upon by Tom Lane
This reduces unnecessary exposure of other headers through htup.h, which
is very widely included by many files.
I have chosen to move the function prototypes to the new file as well,
because that means htup.h no longer needs to include tupdesc.h. In
itself this doesn't have much effect in indirect inclusion of tupdesc.h
throughout the tree, because it's also required by execnodes.h; but it's
something to explore in the future, and it seemed best to do the htup.h
change now while I'm busy with it.
Generally, the parse location assigned to a multiple-token construct is
the location of its leftmost token. This commit breaks that rule for
the syntaxes TYPENAME 'LITERAL' and CAST(CONSTANT AS TYPENAME) --- the
resulting Const will have the location of the literal string, not the
typename or CAST keyword. The cases where this matters are pretty thin on
the ground (no error messages in the regression tests change, for example),
and it's unlikely that any user would be confused anyway by an error cursor
pointing at the literal. But still it's less than consistent. The reason
for changing it is that contrib/pg_stat_statements wants to know the parse
location of the original literal, and it was agreed that this is the least
unpleasant way to preserve that information through parse analysis.
Peter Geoghegan
Because coerce_type recurses into the argument of a CollateExpr,
coerce_to_target_type's longstanding code for detecting whether coerce_type
had actually done anything (to wit, returned a different node than it
passed in) was broken in 9.1. This resulted in unexpected failures in
hide_coercion_node; which was not the latter's fault, since it's critical
that we never call it on anything that wasn't inserted by coerce_type.
(Else we might decide to "hide" a user-written function call.)
Fix by removing and replacing the CollateExpr in coerce_to_target_type
itself. This is all pretty ugly but I don't immediately see a way to make
it nicer.
Per report from Jean-Yves F. Barbier.
This use-case was broken in commit 529cb267a6843a6a8190c86b75d091771d99d6a9
of 2010-10-21, in which I commented "For the moment, we just forbid such
matching. We might later wish to insert an automatic downcast to the
underlying array type, but such a change should also change matching of
domains to ANYELEMENT for consistency". We still lack consensus about what
to do with ANYELEMENT; but not matching ANYARRAY is a clear loss of
functionality compared to prior releases, so let's go ahead and make that
happen. Per complaint from Regina Obe and extensive subsequent discussion.
All the other fields of the constant are being extracted from the syscache
entry we already have, so handle collation similarly. (There don't seem
to be any other uses for the new function at the moment.)
I'm not sure these have any non-cosmetic implications, but I'm not sure
they don't, either. In particular, ensure the CaseTestExpr generated
by transformAssignmentIndirection to represent the base target column
carries the correct collation, because parse_collate.c won't fix that.
Tweak lsyscache.c API so that we can get the appropriate collation
without an extra syscache lookup.
In nearly all cases, the caller already knows the correct collation, and
in a number of places, the value the caller has handy is more correct than
the default for the type would be. (In particular, this patch makes it
significantly less likely that eval_const_expressions will result in
changing the exposed collation of an expression.) So an internal lookup
is both expensive and wrong.
All expression nodes now have an explicit output-collation field, unless
they are known to only return a noncollatable data type (such as boolean
or record). Also, nodes that can invoke collation-aware functions store
a separate field that is the collation value to pass to the function.
This avoids confusion that arises when a function has collatable inputs
and noncollatable output type, or vice versa.
Also, replace the parser's on-the-fly collation assignment method with
a post-pass over the completed expression tree. This allows us to use
a more complex (and hopefully more nearly spec-compliant) assignment
rule without paying for it in extra storage in every expression node.
Fix assorted bugs in the planner's handling of collations by making
collation one of the defining properties of an EquivalenceClass and
by converting CollateExprs into discardable RelabelType nodes during
expression preprocessing.
CollateClause is now used only in raw grammar output, and CollateExpr after
parse analysis. This is for clarity and to avoid carrying collation names
in post-analysis parse trees: that's both wasteful and possibly misleading,
since the collation's name could be changed while the parsetree still
exists.
Also, clean up assorted infelicities and omissions in processing of the
node type.
This adds collation support for columns and domains, a COLLATE clause
to override it per expression, and B-tree index support.
Peter Eisentraut
reviewed by Pavel Stehule, Itagaki Takahiro, Robert Haas, Noah Misch
This patch eliminates various bizarre behaviors caused by sloppy thinking
about the difference between a domain type and its underlying array type.
In particular, the operation of updating one element of such an array
has to be considered as yielding a value of the underlying array type,
*not* a value of the domain, because there's no assurance that the
domain's CHECK constraints are still satisfied. If we're intending to
store the result back into a domain column, we have to re-cast to the
domain type so that constraints are re-checked.
For similar reasons, such a domain can't be blindly matched to an ANYARRAY
polymorphic parameter, because the polymorphic function is likely to apply
array-ish operations that could invalidate the domain constraints. For the
moment, we just forbid such matching. We might later wish to insert an
automatic downcast to the underlying array type, but such a change should
also change matching of domains to ANYELEMENT for consistency.
To ensure that all such logic is rechecked, this patch removes the original
hack of setting a domain's pg_type.typelem field to match its base type;
the typelem will always be zero instead. In those places where it's really
okay to look through the domain type with no other logic changes, use the
newly added get_base_element_type function in place of get_element_type.
catversion bumped due to change in pg_type contents.
Per bug #5717 from Richard Huxton and subsequent discussion.
The purpose of this change is to eliminate the need for every caller
of SearchSysCache, SearchSysCacheCopy, SearchSysCacheExists,
GetSysCacheOid, and SearchSysCacheList to know the maximum number
of allowable keys for a syscache entry (currently 4). This will
make it far easier to increase the maximum number of keys in a
future release should we choose to do so, and it makes the code
shorter, too.
Design and review by Tom Lane.
recent proposal. As proof of concept, remove knowledge of Params from the
core parser, arranging for them to be handled entirely by parser hook
functions. It turns out we need an additional hook for that --- I had
forgotten about the code that handles inferring a parameter's type from
context.
This is a preliminary step towards letting plpgsql handle its variables
through parser hooks. Additional work remains to be done to expose the
facility through SPI, but I think this is all the changes needed in the core
parser.
ability to lock relations as they scan pg_inherits, and to ignore any
relations that have disappeared by the time we get lock on them. This
makes uses of these functions safe against concurrent DROP operations
on child tables: we will effectively ignore any just-dropped child,
rather than possibly throwing an error as in recent bug report from
Thomas Johansson (and similar past complaints). The behavior should
not change otherwise, since the code was acquiring those same locks
anyway, just a little bit later.
An exception is LockTableCommand(), which is still behaving unsafely;
but that seems to require some more discussion before we change it.
find_inheritance_children() and find_all_inheritors(). I got annoyed that
these are buried inside the planner but mostly used elsewhere. So, create
a new file catalog/pg_inherits.c and put them there, along with a couple
of other functions that search pg_inherits.
The code that modifies pg_inherits is (still) in tablecmds.c --- it's
kind of entangled with unrelated code that modifies pg_depend and other
stuff, so pulling it out seemed like a bigger change than I wanted to make
right now. But this file provides a natural home for it if anyone ever
gets around to that.
This commit just moves code around; it doesn't change anything, except
I succumbed to the temptation to make a couple of trivial optimizations
in typeInheritsFrom().
actual argument type of ANYARRAY to match an argument declared ANYARRAY,
so long as ANYELEMENT etc aren't used. I had overlooked the fact that this
is a possible case while fixing bug #3852; but it is possible because
pg_statistic contains columns declared ANYARRAY. Per gripe from Corey Horton.
into an OR of equality comparisons, rather than x = ANY(ARRAY[...]), when there
are Vars in the right-hand side. This avoids a performance regression compared
to pre-8.2 releases, in cases where the OR form can be optimized into scans
of multiple indexes. Limit the possible downside by preferring this form only
when the list isn't very long (I set the cutoff at 32 elements, which is a
bit arbitrary but in the right ballpark). Per discussion with Jim Nasby.
In passing, also make it try the OR form if it cannot select a common type
for the array elements; we've seen a complaint or two about how the OR form
worked for such cases and ARRAY doesn't.
pseudo-type record[] to represent arrays of possibly-anonymous composite
types. Since composite datums carry their own type identification, no
extra knowledge is needed at the array level.
The main reason for doing this right now is that it is necessary to support
the general case of detection of cycles in recursive queries: if you need to
compare more than one column to detect a cycle, you need to compare a ROW()
to an array built from ROW()s, at least if you want to do it as the spec
suggests. Add some documentation and regression tests concerning the cycle
detection issue.
the column alias names of the RTE referenced by the Var to the RowExpr.
This is needed to allow ruleutils.c to correctly deparse FieldSelect nodes
referencing such a construct. Per my recent bug report.
Adding a field to RowExpr forces initdb (because of stored rules changes)
so this solution is not back-patchable; which is unfortunate because 8.2
and 8.3 have this issue. But it only affects EXPLAIN for some pretty odd
corner cases, so we can probably live without a solution for the back
branches.
a lot closer than it was before). To do this, tweak coerce_type() to pass
through the typmod information when invoking interval_in() on an UNKNOWN
constant; then fix DecodeInterval to pay attention to the typmod when deciding
how to interpret a units-less integer value. I changed one or two other
details as well. I believe the code now reacts as expected by spec for all
the literal syntaxes that are specifically enumerated in the spec. There
are corner cases involving strings that don't exactly match the set of fields
called out by the typmod, for which we might want to tweak the behavior some
more; but I think this is an area of user friendliness rather than spec
compliance. There remain some non-compliant details about the SQL syntax
(as opposed to what's inside the literal string); but at least we'll throw
error rather than silently doing the wrong thing in those cases.
most node types used in expression trees (both before and after parse
analysis). This allows us to place an error cursor in many situations
where we formerly could not, because the information wasn't available
beyond the very first level of parse analysis. There's a fair amount
of work still to be done to persuade individual ereport() calls to actually
include an error location, but this gets the initdb-forcing part of the
work out of the way; and the situation is already markedly better than
before for complaints about unimplementable implicit casts, such as
CASE and UNION constructs with incompatible alternative data types.
Per my proposal of a few days ago.
into nodes/nodeFuncs, so as to reduce wanton cross-subsystem #includes inside
the backend. There's probably more that should be done along this line,
but this is a start anyway.
of the STRING type category, thereby opening up the mechanism for user-defined
types. This is mainly for the benefit of citext, though; there aren't likely
to be a lot of types that are all general-purpose character strings.
Per discussion with David Wheeler.