This was not changed in HEAD, but will be done later as part of a
pgindent run. Future pgindent runs will also do this.
Report by Tom Lane
Backpatch through all supported branches, but not HEAD
If we have an array of records stored on disk, the individual record fields
cannot contain out-of-line TOAST pointers: the tuptoaster.c mechanisms are
only prepared to deal with TOAST pointers appearing in top-level fields of
a stored row. The same applies for ranges over composite types, nested
composites, etc. However, the existing code only took care of expanding
sub-field TOAST pointers for the case of nested composites, not for other
structured types containing composites. For example, given a command such
as
UPDATE tab SET arraycol = ARRAY[(ROW(x,42)::mycompositetype] ...
where x is a direct reference to a field of an on-disk tuple, if that field
is long enough to be toasted out-of-line then the TOAST pointer would be
inserted as-is into the array column. If the source record for x is later
deleted, the array field value would become a dangling pointer, leading
to errors along the line of "missing chunk number 0 for toast value ..."
when the value is referenced. A reproducible test case for this was
provided by Jan Pecek, but it seems likely that some of the "missing chunk
number" reports we've heard in the past were caused by similar issues.
Code-wise, the problem is that PG_DETOAST_DATUM() is not adequate to
produce a self-contained Datum value if the Datum is of composite type.
Seen in this light, the problem is not just confined to arrays and ranges,
but could also affect some other places where detoasting is done in that
way, for example form_index_tuple().
I tried teaching the array code to apply toast_flatten_tuple_attribute()
along with PG_DETOAST_DATUM() when the array element type is composite,
but this was messy and imposed extra cache lookup costs whether or not any
TOAST pointers were present, indeed sometimes when the array element type
isn't even composite (since sometimes it takes a typcache lookup to find
that out). The idea of extending that approach to all the places that
currently use PG_DETOAST_DATUM() wasn't attractive at all.
This patch instead solves the problem by decreeing that composite Datum
values must not contain any out-of-line TOAST pointers in the first place;
that is, we expand out-of-line fields at the point of constructing a
composite Datum, not at the point where we're about to insert it into a
larger tuple. This rule is applied only to true composite Datums, not
to tuples that are being passed around the system as tuples, so it's not
as invasive as it might sound at first. With this approach, the amount
of code that has to be touched for a full solution is greatly reduced,
and added cache lookup costs are avoided except when there actually is
a TOAST pointer that needs to be inlined.
The main drawback of this approach is that we might sometimes dereference
a TOAST pointer that will never actually be used by the query, imposing a
rather large cost that wasn't there before. On the other side of the coin,
if the field value is used multiple times then we'll come out ahead by
avoiding repeat detoastings. Experimentation suggests that common SQL
coding patterns are unaffected either way, though. Applications that are
very negatively affected could be advised to modify their code to not fetch
columns they won't be using.
In future, we might consider reverting this solution in favor of detoasting
only at the point where data is about to be stored to disk, using some
method that can drill down into multiple levels of nested structured types.
That will require defining new APIs for structured types, though, so it
doesn't seem feasible as a back-patchable fix.
Note that this patch changes HeapTupleGetDatum() from a macro to a function
call; this means that any third-party code using that macro will not get
protection against creating TOAST-pointer-containing Datums until it's
recompiled. The same applies to any uses of PG_RETURN_HEAPTUPLEHEADER().
It seems likely that this is not a big problem in practice: most of the
tuple-returning functions in core and contrib produce outputs that could
not possibly be toasted anyway, and the same probably holds for third-party
extensions.
This bug has existed since TOAST was invented, so back-patch to all
supported branches.
Given a composite-type parameter named x, "$1.*" worked fine, but "x.*"
not so much. This has been broken since named parameter references were
added in commit 9bff0780cf, so patch back
to 9.2. Per bug #9085 from Hardy Falk.
fmgr_sql had been designed on the assumption that the FmgrInfo it's called
with has only query lifespan. This is demonstrably unsafe in connection
with range types, as shown in bug #7881 from Andrew Gierth. Fix things
so that we re-generate the function's cache data if the (sub)transaction
it was made in is no longer active.
Back-patch to 9.2. This might be needed further back, but it's not clear
whether the case can realistically arise without range types, so for now
I'll desist from back-patching further.
Making this operation look like a utility statement seems generally a good
idea, and particularly so in light of the desire to provide command
triggers for utility statements. The original choice of representing it as
SELECT with an IntoClause appendage had metastasized into rather a lot of
places, unfortunately, so that this patch is a great deal more complicated
than one might at first expect.
In particular, keeping EXPLAIN working for SELECT INTO and CREATE TABLE AS
subcommands required restructuring some EXPLAIN-related APIs. Add-on code
that calls ExplainOnePlan or ExplainOneUtility, or uses
ExplainOneQuery_hook, will need adjustment.
Also, the cases PREPARE ... SELECT INTO and CREATE RULE ... SELECT INTO,
which formerly were accepted though undocumented, are no longer accepted.
The PREPARE case can be replaced with use of CREATE TABLE AS EXECUTE.
The CREATE RULE case doesn't seem to have much real-world use (since the
rule would work only once before failing with "table already exists"),
so we'll not bother with that one.
Both SELECT INTO and CREATE TABLE AS still return a command tag of
"SELECT nnnn". There was some discussion of returning "CREATE TABLE nnnn",
but for the moment backwards compatibility wins the day.
Andres Freund and Tom Lane
Since collation is effectively an argument, not a property of the function,
FmgrInfo is really the wrong place for it; and this becomes critical in
cases where a cached FmgrInfo is used for varying purposes that might need
different collation settings. Fix by passing it in FunctionCallInfoData
instead. In particular this allows a clean fix for bug #5970 (record_cmp
not working). This requires touching a bit more code than the original
method, but nobody ever thought that collations would not be an invasive
patch...
In nearly all cases, the caller already knows the correct collation, and
in a number of places, the value the caller has handy is more correct than
the default for the type would be. (In particular, this patch makes it
significantly less likely that eval_const_expressions will result in
changing the exposed collation of an expression.) So an internal lookup
is both expensive and wrong.
Ensure that parameter symbols receive collation from the function's
resolved input collation, and fix inlining to behave properly.
BTW, this commit lays about 90% of the infrastructure needed to support
use of argument names in SQL functions. Parsing of parameters is now
done via the parser-hook infrastructure ... we'd just need to supply
a column-ref hook ...
All expression nodes now have an explicit output-collation field, unless
they are known to only return a noncollatable data type (such as boolean
or record). Also, nodes that can invoke collation-aware functions store
a separate field that is the collation value to pass to the function.
This avoids confusion that arises when a function has collatable inputs
and noncollatable output type, or vice versa.
Also, replace the parser's on-the-fly collation assignment method with
a post-pass over the completed expression tree. This allows us to use
a more complex (and hopefully more nearly spec-compliant) assignment
rule without paying for it in extra storage in every expression node.
Fix assorted bugs in the planner's handling of collations by making
collation one of the defining properties of an EquivalenceClass and
by converting CollateExprs into discardable RelabelType nodes during
expression preprocessing.
With this patch, portals, SQL functions, and SPI all agree that there
should be only a CommandCounterIncrement between the queries that are
generated from a single SQL command by rule expansion. Fetching a whole
new snapshot now happens only between original queries. This is equivalent
to the existing behavior of EXPLAIN ANALYZE, and it was judged to be the
best choice since it eliminates one source of concurrency hazards for
rules. The patch should also make things marginally faster by reducing the
number of snapshot push/pop operations.
The patch removes pg_parse_and_rewrite(), which is no longer used anywhere.
There was considerable discussion about more aggressive refactoring of the
query-processing functions exported by postgres.c, but for the moment
nothing more has been done there.
I also took the opportunity to refactor snapmgr.c's API slightly: the
former PushUpdatedSnapshot() has been split into two functions.
Marko Tiikkaja, reviewed by Steve Singer and Tom Lane
The originally committed patch for modifying CTEs didn't interact well
with EXPLAIN, as noted by myself, and also had corner-case problems with
triggers, as noted by Dean Rasheed. Those problems show it is really not
practical for ExecutorEnd to call any user-defined code; so split the
cleanup duties out into a new function ExecutorFinish, which must be called
between the last ExecutorRun call and ExecutorEnd. Some Asserts have been
added to these functions to help verify correct usage.
It is no longer necessary for callers of the executor to call
AfterTriggerBeginQuery/AfterTriggerEndQuery for themselves, as this is now
done by ExecutorStart/ExecutorFinish respectively. If you really need to
suppress that and do it for yourself, pass EXEC_FLAG_SKIP_TRIGGERS to
ExecutorStart.
Also, refactor portal commit processing to allow for the possibility that
PortalDrop will invoke user-defined code. I think this is not actually
necessary just yet, since the portal-execution-strategy logic forces any
non-pure-SELECT query to be run to completion before we will consider
committing. But it seems like good future-proofing.
There were corner cases in which the planner would attempt to inline such
a function, which would result in a failure at runtime due to loss of
information about exactly what the result record type is. Fix by disabling
inlining when the function's recorded result type is RECORD. There might
be some sub-cases where inlining could still be allowed, but this is a
simple and backpatchable fix, so leave refinements for another day.
Per bug #5777 from Nate Carson.
Back-patch to all supported branches. 8.1 happens to avoid a core-dump
here, but it still does the wrong thing.
catalog entries via SearchSysCache and related operations. Although, at the
time that these callbacks are called by elog.c, we have not officially aborted
the current transaction, it still seems rather risky to initiate any new
catalog fetches. In all these cases the needed information is readily
available in the caller and so it's just a matter of a bit of extra notation
to pass it to the callback.
Per crash report from Dennis Koegel. I've concluded that the real fix for
his problem is to clear the error context stack at entry to proc_exit, but
it still seems like a good idea to make the callbacks a bit less fragile
for other cases.
Backpatch to 8.4. We could go further back, but the patch doesn't apply
cleanly. In the absence of proof that this fixes something and isn't just
paranoia, I'm not going to expend the effort.
The purpose of this change is to eliminate the need for every caller
of SearchSysCache, SearchSysCacheCopy, SearchSysCacheExists,
GetSysCacheOid, and SearchSysCacheList to know the maximum number
of allowable keys for a syscache entry (currently 4). This will
make it far easier to increase the maximum number of keys in a
future release should we choose to do so, and it makes the code
shorter, too.
Design and review by Tom Lane.
PL/pgSQL function within an exception handler. Make sure we use the right
resource owner when we create the tuplestore to hold returned tuples.
Simplify tuplestore API so that the caller doesn't need to be in the right
memory context when calling tuplestore_put* functions. tuplestore.c
automatically switches to the memory context used when the tuplestore was
created. Tuplesort was already modified like this earlier. This patch also
removes the now useless MemoryContextSwitch calls from callers.
Report by Aleksei on pgsql-bugs on Dec 22 2009. Backpatch to 8.1, like
the previous patch that broke this.
This patch also removes buffer-usage statistics from the track_counts
output, since this (or the global server statistics) is deemed to be a better
interface to this information.
Itagaki Takahiro, reviewed by Euler Taveira de Oliveira.
we have to cope with the possibility that the declared result rowtype contains
dropped columns. This fails in 8.4, as per bug #5240.
While at it, be more paranoid about inserting binary coercions when inlining.
The pre-8.4 code did not really need to worry about that because it could not
inline at all in any case where an added coercion could change the behavior
of the function's statement. However, when inlining a SRF we allow sorting,
grouping, and set-ops such as UNION. In these cases, modifying one of the
targetlist entries that the sort/group/setop depends on could conceivably
change the behavior of the function's statement --- so don't inline when
such a case applies.
As proof of concept, modify plpgsql to use the hooks. plpgsql is still
inserting $n symbols textually, but the "back end" of the parsing process now
goes through the ParamRef hook instead of using a fixed parameter-type array,
and then execution only fetches actually-referenced parameters, using a hook
added to ParamListInfo.
Although there's a lot left to be done in plpgsql, this already cures the
"if (TG_OP = 'INSERT' and NEW.foo ...)" problem, as illustrated by the
changed regression test.
function returning setof record. This used to work, more or less
accidentally, but I had broken it while extending the code to allow
materialize-mode functions to be called in select lists. Add a regression
test case so it doesn't get broken again. Per gripe from Greg Davidson.
mode while callers hold pointers to in-memory tuples. I reported this for
the case of nodeWindowAgg's primary scan tuple, but inspection of the code
shows that all of the calls in nodeWindowAgg and nodeCtescan are at risk.
For the moment, fix it with a rather brute-force approach of copying
whenever one of the at-risk callers requests a tuple. Later we might
think of some sort of reference-count approach to reduce tuple copying.
practically free given prior 8.4 changes in plancache and portal management,
and it makes it a lot easier for ExecutorStart/Run/End hooks to get at the
query text. Extracted from Itagaki Takahiro's pg_stat_statements patch,
with minor editorialization.
that a Portal is a useful and sufficient additional argument for
CreateDestReceiver --- it just isn't, in most cases. Instead formalize
the approach of passing any needed parameters to the receiver separately.
One unexpected benefit of this change is that we can declare typedef Portal
in a less surprising location.
This patch is just code rearrangement and doesn't change any functionality.
I'll tackle the HOLD-cursor-vs-toast problem in a follow-on patch.
DestReceiver created during postquel_start needs to be destroyed during
postquel_end. In a moment of brain fade I had assumed this would be taken
care of by FreeQueryDesc, but it's not (and shouldn't be).
it just return void instead of sometimes returning a TupleTableSlot. SQL
functions don't need that anymore, and noplace else does either. Eliminating
the return value also means one less hassle for the ExecutorRun hook functions
that will be supported beginning in 8.4.
RETURNING clause, not just a SELECT as formerly.
A side effect of this patch is that when a set-returning SQL function is used
in a FROM clause, performance is improved because the output is collected into
a tuplestore within the function, rather than using the less efficient
value-per-call mechanism.
into nodes/nodeFuncs, so as to reduce wanton cross-subsystem #includes inside
the backend. There's probably more that should be done along this line,
but this is a start anyway.
There are two ways to track a snapshot: there's the "registered" list, which
is used for arbitrary long-lived snapshots; and there's the "active stack",
which is used for the snapshot that is considered "active" at any time.
This also allows users of snapshots to stop worrying about snapshot memory
allocation and freeing, and about using PG_TRY blocks around ActiveSnapshot
assignment. This is all done automatically now.
As a consequence, this allows us to reset MyProc->xmin when there are no
more snapshots registered in the current backend, reducing the impact that
long-running transactions have on VACUUM.
snapmgmt.c file for the former. The header files have also been reorganized
in three parts: the most basic snapshot definitions are now in a new file
snapshot.h, and the also new snapmgmt.h keeps the definitions for snapmgmt.c.
tqual.h has been reduced to the bare minimum.
This patch is just a first step towards managing live snapshots within a
transaction; there is no functionality change.
Per my proposal to pgsql-patches on 20080318191940.GB27458@alvh.no-ip.org and
subsequent discussion.
strings. This patch introduces four support functions cstring_to_text,
cstring_to_text_with_len, text_to_cstring, and text_to_cstring_buffer, and
two macros CStringGetTextDatum and TextDatumGetCString. A number of
existing macros that provided variants on these themes were removed.
Most of the places that need to make such conversions now require just one
function or macro call, in place of the multiple notational layers that used
to be needed. There are no longer any direct calls of textout or textin,
and we got most of the places that were using handmade conversions via
memcpy (there may be a few still lurking, though).
This commit doesn't make any serious effort to eliminate transient memory
leaks caused by detoasting toasted text objects before they reach
text_to_cstring. We changed PG_GETARG_TEXT_P to PG_GETARG_TEXT_PP in a few
places where it was easy, but much more could be done.
Brendan Jurd and Tom Lane
are declared to return set, and consist of just a single SELECT. We
can replace the FROM-item with a sub-SELECT and then optimize much as
if we were dealing with a view. Patch from Richard Rowell, cleaned up
by me.
few lines in sql_exec_error_callback() by using the function source string
field that the patch added to SQL function cache entries. This doesn't work
because the fn_extra field isn't filled in yet during init_sql_fcache().
Probably it could be made to work, but it doesn't seem appropriate to contort
the main code paths to make an error-reporting path a tad faster. Per report
from Pavel Stehule.
were accepted by prior Postgres releases. This takes care of the loose end
left by the preceding patch to downgrade implicit casts-to-text. To avoid
breaking desirable behavior for array concatenation, introduce a new
polymorphic pseudo-type "anynonarray" --- the added concatenation operators
are actually text || anynonarray and anynonarray || text.
types of unspecified parameters when submitted via extended query protocol.
This worked in 8.2 but I had broken it during plancache changes. DECLARE
CURSOR is now treated almost exactly like a plain SELECT through parse
analysis, rewrite, and planning; only just before sending to the executor
do we divert it away to ProcessUtility. This requires a special-case check
in a number of places, but practically all of them were already special-casing
SELECT INTO, so it's not too ugly. (Maybe it would be a good idea to merge
the two by treating IntoClause as a form of utility statement? Not going to
worry about that now, though.) That approach doesn't work for EXPLAIN,
however, so for that I punted and used a klugy solution of running parse
analysis an extra time if under extended query protocol.
access to the planner's cursor-related planning options, and provide new
FETCH/MOVE routines that allow access to the full power of those commands.
Small refactoring of planner(), pg_plan_query(), and pg_plan_queries()
APIs to make it convenient to pass the planning options down from SPI.
This is the core-code portion of Pavel Stehule's patch for scrollable
cursor support in plpgsql; I'll review and apply the plpgsql changes
separately.