If postmaster changed postmaster.pid while pg_ctl was reading it, pg_ctl
could overrun the buffer it allocated for the file. Fix by reading the
whole file to memory with one read() call.
initdb contains an identical copy of the readfile() function, but the files
that initdb reads are static, not modified concurrently. Nevertheless, add
a simple bounds-check there, if only to silence static analysis tools.
Per report from Dave Vitek. Backpatch to all supported branches.
In the previous coding, new backend processes would attempt to create their
self-pipe during the OwnLatch call in InitProcess. However, pipe creation
could fail if the kernel is short of resources; and the system does not
recover gracefully from a FATAL error right there, since we have armed the
dead-man switch for this process and not yet set up the on_shmem_exit
callback that would disarm it. The postmaster then forces an unnecessary
database-wide crash and restart, as reported by Sean Chittenden.
There are various ways we could rearrange the code to fix this, but the
simplest and sanest seems to be to split out creation of the self-pipe into
a new function InitializeLatchSupport, which must be called from a place
where failure is allowed. For most processes that gets called in
InitProcess or InitAuxiliaryProcess, but processes that don't call either
but still use latches need their own calls.
Back-patch to 9.1, which has only a part of the latch logic that 9.2 and
HEAD have, but nonetheless includes this bug.
This change ensures that the planner will see implicit and explicit casts
as equivalent for all purposes, except in the minority of cases where
there's actually a semantic difference (as reflected by having a 3-argument
cast function). In particular, this fixes cases where the EquivalenceClass
machinery failed to consider two references to a varchar column as
equivalent if one was implicitly cast to text but the other was explicitly
cast to text, as seen in bug #7598 from Vaclav Juza. We have had similar
bugs before in other parts of the planner, so I think it's time to fix this
problem at the core instead of continuing to band-aid around it.
Remove set_coercionform_dontcare(), which represents the band-aid
previously in use for allowing matching of index and constraint expressions
with inconsistent cast labeling. (We can probably get rid of
COERCE_DONTCARE altogether, but I don't think removing that enum value in
back branches would be wise; it's possible there's third party code
referring to it.)
Back-patch to 9.2. We could go back further, and might want to once this
has been tested more; but for the moment I won't risk destabilizing plan
choices in long-since-stable branches.
When hashing a subplan like "WHERE (a, b) NOT IN (SELECT x, y FROM ...)",
findPartialMatch() attempted to match rows using the hashtable's internal
equality operators, which of course are for x and y's datatypes. What we
need to use are the potentially cross-type operators for a=x, b=y, etc.
Failure to do that leads to wrong answers or even crashes. The scope for
problems is limited to cases where we have different types with compatible
hash functions (else we'd not be using a hashed subplan), but for example
int4 vs int8 can cause the problem.
Per bug #7597 from Bo Jensen. This has been wrong since the hashed-subplan
code was written, so patch all the way back.
Building a shlib on AIX requires use of the mkldexport.sh script, but we
failed to install that, preventing its use from non-source-tree contexts.
Also, Makefile.aix had the wrong idea about where to find the installed
copy of the postgres.imp symbol file used by AIX.
Per report from John Pierce. Patch all the way back, since this has been
broken since the beginning of PGXS.
I found that these functions tend to return -1 while leaving an empty error
message string in the PGconn, if they suffer some kind of I/O error on the
file. The reason is that lo_close, which thinks it's executed a perfectly
fine SQL command, clears the errorMessage. The minimum-change workaround
is to reorder operations here so that we don't fill the errorMessage until
after lo_close.
Instead of continuing if the next character is not an array boundary get_data()
used to continue only on finding a boundary so it was not able to read any
element after the first.
These reference pages still claimed that you have to be superuser to create
a database or schema owned by a different role. That was true before 8.1,
but it was changed in commits aa1110624c08298393dfce996f7b21809d98d3fd and
f91370cd2faf1fd35a1ac74d84652a85ed841919 to allow assignment of ownership
to any role you are a member of. However, at the time we were thinking of
that primarily as a change to the ALTER OWNER rules, so the need to touch
these two CREATE ref pages got missed.
examine_simple_variable supposed that any RTE_SUBQUERY rel it gets pointed
at must have been planned already. However, this isn't a safe assumption
because we must do selectivity estimation while generating indexscan paths,
and that code might look at join clauses involving a rel that the loop in
set_base_rel_sizes() hasn't reached yet. The simplest fix is to play dumb
in such a situation, that is give up trying to extract any stats for the
Var. This could possibly be improved by making a separate pass over the
RTE list to plan each unflattened subquery before we start the main
planning work --- but that would be pretty invasive and it doesn't seem
worth it, for now at least. (We couldn't just break set_base_rel_sizes()
into two loops: the prescan would need to handle all subquery rels in the
query, not only those in the current join subproblem.)
This bug was introduced in commit 1cb108efb0e60d87e4adec38e7636b6e8efbeb57,
although I think that subsequent changes may have exposed it more than it
was originally. Per bug #7580 from Maxim Boguk.
Apparently this was considered in the original code (see commit
cec3b0a9) but I failed to notice that such entries would always be
skipped by the database check at the start of the loop.
Per bugs #7578 by Nikolay, #6116 by tushar.qa@gmail.com.
On some platforms these functions return NULL, rather than the more common
practice of returning a pointer to a zero-sized block of memory. Hack our
various wrapper functions to hide the difference by substituting a size
request of 1. This is probably not so important for the callers, who
should never touch the block anyway if they asked for size 0 --- but it's
important for the wrapper functions themselves, which mistakenly treated
the NULL result as an out-of-memory failure. This broke at least pg_dump
for the case of no user-defined aggregates, as per report from
Matthew Carrington.
Back-patch to 9.2 to fix the pg_dump issue. Given the lack of previous
complaints, it seems likely that there is no live bug in previous releases,
even though some of these functions were in place before that.
entries are not dumped. This fixes an error caused by
droping/recreating the information_schema, but other failures were also
possible.
Backpatch to 9.2.
timeval.t_sec is of type time_t, which is not always compatible with long.
I'm not sure if this was just harmless warning or a real bug, but this
fixes it, anyway.
The tar output module did some very ugly and ultimately incorrect hacking
on COPY commands to try to get them to work in the context of restoring a
deconstructed tar archive. In particular, it would fail altogether for
table names containing any upper-case characters, since it smashed the
command string to lower-case before modifying it (and, just to add insult
to injury, did that in a way that would fail in multibyte encodings).
I don't see any particular value in being flexible about the case of the
command keywords, since the string will just have been created by
dumpTableData, so let's get rid of the whole case-folding thing.
Also, it doesn't seem to meet the POLA for the script to restore data only
in COPY mode, so add \i commands to make it have comparable behavior in
--inserts mode.
Noted while looking at the tar-output code in connection with Brian
Weaver's patch.
Back-patch portions of commit 05b555d12bc2ad0d581f48a12b45174db41dc10d.
There doesn't seem to be any reason not to fix pg_basebackup fully, but
we can't change pg_dump's "magic" string without breaking older versions
of pg_restore. Instead, just patch pg_restore to accept either version
of the magic string, in hopes of avoiding compatibility problems when
9.3 comes out. I also fixed pg_dump to write the correct 2-block EOF
marker, since that won't create a compatibility problem with pg_restore
and it could help with some versions of tar.
Brian Weaver and Tom Lane
This fixes another error in commit 9e8da0f75731aaa7605cf4656c21ea09e84d2eb1.
I neglected to make the mark/restore functionality save and restore the
current set of array key values, which led to strange behavior if an
IndexScan with ScalarArrayOpExpr quals was used as the inner side of a
mergejoin. Per bug #7570 from Melese Tesfaye.
This worked fine for superusers, but not for ordinary users trying to
cancel their own processes. Tweak the order the checks are done in so
that we correctly return SIGNAL_BACKEND_ERROR (which current callers
know to ignore without erroring out) so that an ordinary user can loop
through a resultset without fearing that a process might exit in the
middle of said looping -- causing the remaining processes to go
unsignalled.
Incidentally, the last in-core caller of IsBackendPid() is now gone.
However, the function is exported and must remain in place, because
there are plenty of callers in external modules.
Author: Josh Kupershmidt
Reviewed by Noah Misch
The syntax "su -c 'command' username" is not accepted by all versions of
su, for example not OpenBSD's. More portable is "su username -c
'command'". So change runtime.sgml to recommend that syntax. Also,
add a -D switch to the OpenBSD example script, for consistency with other
examples. Per Denis Lapshin and Gábor Hidvégi.
These calls were removed in commit 4240e429d0c2d889d0cda23c618f94e12c13ade7
as part of a general refactoring and improvement of DDL locking. However,
there's a problem not solved by the rewrite, which is that GRANT/REVOKE
update pg_class.relacl without taking any particular lock on the target
table as such. If another backend fails to do AcceptInvalidationMessages,
it won't notice a recently-committed change in ACLs. Bug #7557 from Piotr
Czachur demonstrates that there's at least one code path in 9.2.0 in which
a command fails to do any AcceptInvalidationMessages calls at all, if the
current transaction already holds all the locks it will need.
Since we're hard up against the release deadline for 9.2.1, fix this by
putting back the AcceptInvalidationMessages calls in heap_openrv and
heap_openrv_extended, thereby restoring the historical behavior in this
area. We ought to look for a more elegant and perhaps more bulletproof
solution, but there's no time for that right now.
In commit 9e8da0f75731aaa7605cf4656c21ea09e84d2eb1, I improved btree
to handle ScalarArrayOpExpr quals natively, so that constructs like
"indexedcol IN (list)" could be supported by index-only scans. Using
such a qual results in multiple scans of the index, under-the-hood.
I went to some lengths to ensure that this still produces rows in index
order ... but I failed to recognize that if a higher-order index column
is lacking an equality constraint, rescans can produce out-of-order
data from that column. Tweak the planner to not expect sorted output
in that case. Per trouble report from Robert McGehee.
Somewhere along the line, somebody decided to remove all trace of this
notation from the documentation text. It was still in the command syntax
synopses, or at least some of them, but with no indication what it meant.
This will not do, as evidenced by the confusion apparent in bug #7543;
even if the notation is now unnecessary, people will find it in legacy
SQL code and need to know what it does.
Some experimentation with examples similar to bug #7539 has convinced me
that indxpath.c's original implementation of parameterized-path generation
was several bricks shy of a load. In general, if we are relying on a
particular outer rel or set of outer rels for a parameterized path, the
path should use every indexable join clause that's available from that rel
or rels. Any join clauses that get left out of the indexqual will end up
getting applied as plain filter quals (qpquals), and that's generally a
significant loser compared to having the index AM enforce them. (This is
particularly true with btree, which can skip the index scan entirely if
it can see that the given indexquals are mutually contradictory.) The
original heuristics failed to ensure this, though, and were overly
complicated anyway. Rewrite to make the code explicitly identify each
useful set of outer rels and then select all applicable join clauses for
each one. The one plan that changes in the regression tests is in fact
for the better according to the planner's cost estimates.
(Note: this is not a correctness issue but just a matter of plan quality.
I don't yet know what is going on in bug #7539, but I don't expect this
change to fix that.)
Recovery code documents clearly that a shutdown checkpoint is executed at
end of recovery - a shutdown checkpoint WAL record is written but the buffer
manager had been altered to treat end of recovery as a normal checkpoint.
This bug exacerbates the bufmgr relpersistence bug.
Bug spotted by Andres Freund, patch by me.
The documentation mentioned setting autovacuum_freeze_max_age to
"its maximum allowed value of a little less than two billion".
This led to a post asking about the exact maximum allowed value,
which is precisely two billion, not "a little less".
Based on question by Radovan Jablonovsky. Backpatch to 8.3.
Back-patch commits 9afc6481117d2dd936e752da0424a2b6b05f6459 and
b8fbbcf37f22c5e8361da939ad0fc4be18a34ca9. The first of these is really
a minor code cleanup to save a few cycles, but it turns out to provide
a workaround for the misoptimization problem described in bug #7516.
The second commit adds a regression test case.
Back-patch the fix to all active branches. The test case only works
as far back as 9.0, because it relies on plpgsql which isn't installed
by default before that. (I didn't have success modifying it into an
all-plperl form that still provoked a crash, though this may just reflect
my lack of Perl-fu.)
In commit 1bc16a946008a7cbb33a9a06a7c6765a807d7f59 I added a minor
optimization to drop the component variables of a GROUP BY expression from
the target list computed at the aggregation level of a query, if those Vars
weren't referenced elsewhere in the tlist. However, I overlooked that the
window-function planning code would deconstruct such expressions and thus
need to have access to their component variables. Fix it to not do that.
While at it, I removed the distinction between volatile and nonvolatile
window partition/order expressions: the code now computes all of them
at the aggregation level. This saves a relatively expensive check for
volatility, and it's unclear that the resulting plan isn't better anyway.
Per bug #7535 from Louis-David Mitterrand. Back-patch to 9.2.