Per reports from Andres Freund and Luke Campbell, a server failure during
set_pglocale_pgservice results in a segfault rather than a useful error
message, because the infrastructure needed to use ereport hasn't been
initialized; specifically, MemoryContextInit hasn't been called.
One known cause of this is starting the server in a directory it
doesn't have permission to read.
We could try to prevent set_pglocale_pgservice from using anything that
depends on palloc or elog, but that would be messy, and the odds of future
breakage seem high. Moreover there are other things being called in main.c
that look likely to use palloc or elog too --- perhaps those things
shouldn't be there, but they are there today. The best solution seems to
be to move the call of MemoryContextInit to very early in the backend's
real main() function. I've verified that an elog or ereport occurring
immediately after that is now capable of sending something useful to
stderr.
I also added code to elog.c to print something intelligible rather than
just crashing if MemoryContextInit hasn't created the ErrorContext.
This could happen if MemoryContextInit itself fails (due to malloc
failure), and provides some future-proofing against someone trying to
sneak in new code even earlier in server startup.
Back-patch to all supported branches. Since we've only heard reports of
this type of failure recently, it may be that some recent change has made
it more likely to see a crash of this kind; but it sure looks like it's
broken all the way back.
Back-patch commits 8e68816cc2 and
8dace66e07 into the stable branches.
Buildfarm testing revealed no great portability surprises, and it
seems useful to have this robustness improvement in all branches.
On a platform that isn't supplying __FILE__, previous coding would either
crash or give a stale result for the filename string. Not sure how likely
that is, but the original code catered for it, so let's keep doing so.
In vpath builds, the __FILE__ macro that is used in verbose error
reports contains the full absolute file name, which makes the error
messages excessively verbose. So keep only the base name, thus
matching the behavior of non-vpath builds.
Per gripe from Fujii Masao, though this is not exactly his proposed patch.
Categorize as DEVELOPER_OPTIONS and set context PGC_SIGHUP, as per Fujii,
but set the default to LOG because higher values aren't really sensible
(see the code for trace_recovery()). Fix the documentation to agree with
the code and to try to explain what the variable actually does. Get rid
of no-op calls trace_recovery(LOG), which accomplish nothing except to
demonstrate that this option confuses even its author.
path when CSV logging is configured but not yet operational. It's sufficient
to send the message to stderr, as we were already doing, and the "Not safe"
gripe has already confused at least two core members ...
Backpatch to 9.0, but not further --- doesn't seem appropriate to change
this behavior in stable branches.
Depending on which spec you read, field widths and precisions in %s may be
counted either in bytes or characters. Our code was assuming bytes, which
is wrong at least for glibc's implementation, and in any case libc might
have a different idea of the prevailing encoding than we do. Hence, for
portable results we must avoid using anything more complex than just "%s"
unless the string to be printed is known to be all-ASCII.
This patch fixes the cases I could find, including the psql formatting
failure reported by Hernan Gonzalez. In HEAD only, I also added comments
to some places where it appears safe to continue using "%.*s".
Enabled by recovery_connections = on (default) and forcing archive recovery using a recovery.conf. Recovery processing now emulates the original transactions as they are replayed, providing full locking and MVCC behaviour for read only queries. Recovery must enter consistent state before connections are allowed, so there is a delay, typically short, before connections succeed. Replay of recovering transactions can conflict and in some cases deadlock with queries during recovery; these result in query cancellation after max_standby_delay seconds have expired. Infrastructure changes have minor effects on normal running, though introduce four new types of WAL record.
New test mode "make standbycheck" allows regression tests of static command behaviour on a standby server while in recovery. Typical and extreme dynamic behaviours have been checked via code inspection and manual testing. Few port specific behaviours have been utilised, though primary testing has been on Linux only so far.
This commit is the basic patch. Additional changes will follow in this release to enhance some aspects of behaviour, notably improved handling of conflicts, deadlock detection and query cancellation. Changes to VACUUM FULL are also required.
Simon Riggs, with significant and lengthy review by Heikki Linnakangas, including streamlined redesign of snapshot creation and two-phase commit.
Important contributions from Florian Pflug, Mark Kirkwood, Merlin Moncure, Greg Stark, Gianni Ciolli, Gabriele Bartolini, Hannu Krosing, Robert Haas, Tatsuo Ishii, Hiroyuki Yamada plus support and feedback from many other community members.
by extending the ereport() API to cater for pluralization directly. This
is better than the original method of calling ngettext outside the elog.c
code because (1) it avoids double translation, which wastes cycles and in
the worst case could give a wrong result; and (2) it avoids having to use
a different coding method in PL code than in the core backend. The
client-side uses of ngettext are not touched since neither of these concerns
is very pressing in the client environment. Per my proposal of yesterday.
encoding conversion of any elog/ereport message being sent to the frontend.
This generalizes a patch that I put in last October, which suppressed
translation of only specific messages known to be associated with recursive
can't-translate-the-message behavior. As shown in bug #4680, we need a more
general answer in order to have some hope of coping with broken encoding
conversion setups. This approach seems a good deal less klugy anyway.
Patch in all supported branches.
consistent. Currently, in csvlog, vxid of an auxiliary process isn't
displayed. On the other hand, in stderr/syslog, invalid vxid (-1/0) of
that is displayed.
Fujii Masao
recursion when we are unable to convert a localized error message to the
client's encoding. We've been over this ground before, but as reported by
Ibrar Ahmed, it still didn't work in the case of conversion failures for
the conversion-failure message itself :-(. Fix by installing a "circuit
breaker" that disables attempts to localize this message once we get into
recursion trouble.
Patch all supported branches, because it is in fact broken in all of them;
though I had to add some missing translations to the older branches in
order to expose the failure in the particular test case I was using.
1024 to improve performance when sending large elog messages. Also add a
comment about why we use that number.
Since this represents an externally visible behavior change, and might
possibly result in portability issues, it seems best not to back-patch it.
log message at newlines cost O(N^2) for very long messages with few or no
newlines. For messages in the megabyte range this became the dominant cost.
Per gripe from Achilleas Mantzios.
Patch all the way back, since this is a safe change with no portability
risks. I am also thinking of increasing PG_SYSLOG_LIMIT, but that should
be done separately.
errdetail except the string goes only to the server log, replacing the normal
errdetail there. This provides a reasonably clean way of dealing with error
details that are too security-sensitive or too bulky to send to the client.
This commit just adds the infrastructure --- actual uses to follow.
variables to it. More need to be converted, but I wanted to get this in
before it conflicts with too much...
Other than just centralising the text-to-int conversion for parameters,
this allows the pg_settings view to contain a list of available options
and allows an error hint to show what values are allowed.
with the logged event. CSV logs are now a first-class citizen along plain
text logs in that they carry much of the same information.
Per complaint from depesz on bug #3799.
the same transaction can be identified even when no regular XID was assigned.
This seems essential after addition of the lazy-XID patch. Also some
minor code cleanup in write_csvlog().
rows will normally never obtain an XID at all. We already did things this way
for subtransactions, but this patch extends the concept to top-level
transactions. In applications where there are lots of short read-only
transactions, this should improve performance noticeably; not so much from
removal of the actual XID-assignments, as from reduction of overhead that's
driven by the rate of XID consumption. We add a concept of a "virtual
transaction ID" so that active transactions can be uniquely identified even
if they don't have a regular XID. This is a much lighter-weight concept:
uniqueness of VXIDs is only guaranteed over the short term, and no on-disk
record is made about them.
Florian Pflug, with some editorialization by Tom.
between the setting of log_line_prefix and the setting of log_timezone. We
can't realistically set log_timezone any earlier than we do now, so the best
behavior seems to be to use GMT zone if any timestamps are to be logged during
early startup. Create a dummy zone variable with a minimal definition of GMT
(in particular it will never know about leap seconds), so that we can set it
up without reference to any external files.
displayed in the postmaster log. This avoids Windows-specific problems with
localized time zone names that are in the wrong encoding, and generally seems
like a good idea to forestall other potential platform-dependent issues.
To preserve the existing behavior that all backends will log in the same time
zone, create a new GUC variable log_timezone that can only be changed on a
system-wide basis, and reference log-related calculations to that zone instead
of the TimeZone variable.
This fixes the issue reported by Hiroshi Saito that timestamps printed by
xlog.c startup could be improperly localized on Windows. We still need a
simpler patch for that problem in the back branches, however.
so that we will be able to create a cookie for all processes for CSVlogs.
It is set wherever MyProcPid is set. Take the opportunity to remove the now
unnecessary session-only restriction on the %s and %c escapes in log_line_prefix.
log_min_error_statement is active and there is some problem in logging the
current query string; for example, that it's too long to include in the log
message without running out of memory. This problem has existed since the
log_min_error_statement feature was introduced. No doubt the reason it
wasn't detected long ago is that 8.2 is the first release that defaults
log_min_error_statement to less than PANIC level.
Per report from Bill Moran.
reassembled in the syslogger before writing to the log file. This prevents
partial messages from being written, which mucks up log rotation, and
messages from different backends being interleaved, which causes garbled
logs. Backport as far as 8.0, where the syslogger was introduced.
Tom Lane and Andrew Dunstan
which is the only state in which it's safe to initiate database queries.
It turns out that all but two of the callers thought that's what it meant;
and the other two were using it as a proxy for "will GetTopTransactionId()
return a nonzero XID"? Since it was in fact an unreliable guide to that,
make those two just invoke GetTopTransactionId() always, then deal with a
zero result if they get one.
and for other compilers, insert a dummy exit() call so that they understand
PG_RE_THROW() doesn't return. Insert fflush(stderr) in ExceptionalCondition,
per recent buildfarm evidence that that might not happen automatically on some
platforms. And const-ify ExceptionalCondition's declaration while at it.
isn't any place to throw the error to. If so, we should treat the error
as FATAL, just as we would have if it'd been thrown outside the PG_TRY
block to begin with.
Although this is clearly a *potential* source of bugs, it is not clear
at the moment whether it is an *actual* source of bugs; there may not
presently be any PG_TRY blocks in code that can be reached with no outer
longjmp catcher. So for the moment I'm going to be conservative and not
back-patch this. The change breaks ABI for users of PG_RE_THROW and hence
might create compatibility problems for loadable modules, so we should not
put it into released branches without proof that it's needed.
log_min_messages does; and arrange to suppress the duplicative output
that would otherwise result from log_statement and log_duration messages.
Bruce Momjian and Tom Lane.