1
0
mirror of https://github.com/postgres/postgres.git synced 2025-05-17 06:41:24 +03:00

35730 Commits

Author SHA1 Message Date
Tom Lane
a11577f47e Fix pg_restore's processing of old-style BLOB COMMENTS data.
Prior to 9.0, pg_dump handled comments on large objects by dumping a bunch
of COMMENT commands into a single BLOB COMMENTS archive object.  With
sufficiently many such comments, some of the commands would likely get
split across bufferloads when restoring, causing failures in
direct-to-database restores (though no problem would be evident in text
output).  This is the same type of issue we have with table data dumped as
INSERT commands, and it can be fixed in the same way, by using a mini SQL
lexer to figure out where the command boundaries are.  Fortunately, the
COMMENT commands are no more complex to lex than INSERTs, so we can just
re-use the existing lexer for INSERTs.

Per bug #10611 from Jacek Zalewski.  Back-patch to all active branches.
2014-06-12 20:14:39 -04:00
Tom Lane
71ffbe5e60 Remove inadvertent copyright violation in largeobject regression test.
Robert Frost is no longer with us, but his copyrights still are, so
let's stop using "Stopping by Woods on a Snowy Evening" as test data
before somebody decides to sue us.  Wordsworth is more safely dead.
2014-06-12 16:51:08 -04:00
Tom Lane
87db9534a1 Fix ancient encoding error in hungarian.stop.
When we grabbed this file off the Snowball project's website, we mistakenly
supposed that it was in LATIN1 encoding, but evidently it was actually in
LATIN2.  This resulted in ő (o-double-acute, U+0151, which is code 0xF5 in
LATIN2) being misconverted into õ (o-tilde, U+00F5), as complained of in
bug #10589 from Zoltán Sörös.  We'd have messed up u-double-acute too,
but there aren't any of those in the file.  Other characters used in the
file have the same codes in LATIN1 and LATIN2, which no doubt helped hide
the problem for so long.

The error is not only ours: the Snowball project also was confused about
which encoding is required for Hungarian.  But dealing with that will
require source-code changes that I'm not at all sure we'll wish to
back-patch.  Fixing the stopword file seems reasonably safe to back-patch
however.
2014-06-10 22:48:39 -04:00
Tom Lane
04b598b230 Forward-port regression test for bug #10587 into 9.3 and HEAD.
Although this bug is already fixed in post-9.2 branches, the case
triggering it is quite different from what was under consideration
at the time.  It seems worth memorializing this example in HEAD
just to make sure it doesn't get broken again in future.

Extracted from commit 187ae17300776f48b2bd9d0737923b1bf70f606e.
2014-06-09 21:37:20 -04:00
Tom Lane
717c116f1d Fix infinite loop when splitting inner tuples in SPGiST text indexes.
Previously, the code used a node label of zero both for strings that
contain no bytes beyond the inner tuple's prefix, and for cases where an
"allTheSame" inner tuple has to be split to allow a string with a different
next byte to be inserted into it.  Failing to distinguish these cases meant
that if a string ending with the current prefix needed to be inserted into
an allTheSame tuple, we got into an infinite loop, because after splitting
the tuple we'd descend into the child allTheSame tuple and then find we
need to split again.

To fix, instead use -1 and -2 as the node labels for these two cases.
This requires widening the node label type from "char" to int2, but
fortunately SPGiST stores all pass-by-value node label types in their
Datum representation, which means that this change is transparently upward
compatible so far as the on-disk representation goes.  We continue to
recognize zero as a dummy node label for reading purposes, but will not
attempt to push new index entries down into such a label, so that the loop
won't occur even when dealing with an existing index.

Per report from Teodor Sigaev.  Back-patch to 9.2 where the faulty
code was introduced.
2014-06-09 16:31:16 -04:00
Alvaro Herrera
167a2535f8 Wrap multixact/members correctly during extension, take 2
In a50d97625497b7 I already changed this, but got it wrong for the case
where the number of members is larger than the number of entries that
fit in the last page of the last segment.

As reported by Serge Negodyuck in a followup to bug #8673.
2014-06-09 15:17:23 -04:00
Fujii Masao
e2f02ed64e Fix breakages of hot standby regression test.
This commit changes HS regression test so that it uses
REPEATABLE READ transaction instead of SERIALIZABLE one
because SERIALIZABLE transaction isolation level is not
available in HS. Also this commit fixes VACUUM/ANALYZE
label mixup.

This was fixed in HEAD (commit 2985e16), but it should
have been back-patched to 9.1 which had introduced SSI
and forbidden SERIALIZABLE transaction in HS.

Amit Langote
2014-06-06 18:46:32 +09:00
Tom Lane
780852ef00 Add defenses against running with a wrong selection of LOBLKSIZE.
It's critical that the backend's idea of LOBLKSIZE match the way data has
actually been divided up in pg_largeobject.  While we don't provide any
direct way to adjust that value, doing so is a one-line source code change
and various people have expressed interest recently in changing it.  So,
just as with TOAST_MAX_CHUNK_SIZE, it seems prudent to record the value in
pg_control and cross-check that the backend's compiled-in setting matches
the on-disk data.

Also tweak the code in inv_api.c so that fetches from pg_largeobject
explicitly verify that the length of the data field is not more than
LOBLKSIZE.  Formerly we just had Asserts() for that, which is no protection
at all in production builds.  In some of the call sites an overlength data
value would translate directly to a security-relevant stack clobber, so it
seems worth one extra runtime comparison to be sure.

In the back branches, we can't change the contents of pg_control; but we
can still make the extra checks in inv_api.c, which will offer some amount
of protection against running with the wrong value of LOBLKSIZE.
2014-06-05 11:31:09 -04:00
Andres Freund
edde59db13 Fix longstanding bug in HeapTupleSatisfiesVacuum().
HeapTupleSatisfiesVacuum() didn't properly discern between
DELETE_IN_PROGRESS and INSERT_IN_PROGRESS for rows that have been
inserted in the current transaction and deleted in a aborted
subtransaction of the current backend. At the very least that caused
problems for CLUSTER and CREATE INDEX in transactions that had
aborting subtransactions producing rows, leading to warnings like:
WARNING:  concurrent delete in progress within table "..."
possibly in an endless, uninterruptible, loop.

Instead of treating *InProgress xmins the same as *IsCurrent ones,
treat them as being distinct like the other visibility routines. As
implemented this separatation can cause a behaviour change for rows
that have been inserted and deleted in another, still running,
transaction. HTSV will now return INSERT_IN_PROGRESS instead of
DELETE_IN_PROGRESS for those. That's both, more in line with the other
visibility routines and arguably more correct. The latter because a
INSERT_IN_PROGRESS will make callers look at/wait for xmin, instead of
xmax.
The only current caller where that's possibly worse than the old
behaviour is heap_prune_chain() which now won't mark the page as
prunable if a row has concurrently been inserted and deleted. That's
harmless enough.

As a cautionary measure also insert a interrupt check before the gotos
in IndexBuildHeapScan() that lead to the uninterruptible loop. There
are other possible causes, like a row that several sessions try to
update and all fail, for repeated loops and the cost of doing so in
the retry case is low.

As this bug goes back all the way to the introduction of
subtransactions in 573a71a5da backpatch to all supported releases.

Reported-By: Sandro Santilli
2014-06-04 23:26:08 +02:00
Fujii Masao
8dc90b9c4c Add description of pg_stat directory into doc.
Back-patch to 9.3 where pg_stat directory was introduced.
2014-06-05 01:46:31 +09:00
Tom Lane
9dee1e4b38 Make plpython_unicode regression test work in more database encodings.
This test previously used a data value containing U+0080, and would
therefore fail if the database encoding didn't have an equivalent to
that; which only about half of our supported server encodings do.
We could fall back to using some plain-ASCII character, but that seems
like it's losing most of the point of the test.  Instead switch to using
U+00A0 (no-break space), which translates into all our supported encodings
except the four in the EUC_xx family.

Per buildfarm testing.  Back-patch to 9.1, which is as far back as this
test is expected to succeed everywhere.  (9.0 has the test, but without
back-patching some 9.1 code changes we could not expect to get consistent
results across platforms anyway.)
2014-06-03 12:02:13 -04:00
Andres Freund
ee6c5f10d5 Set the process latch when processing recovery conflict interrupts.
Because RecoveryConflictInterrupt() didn't set the process latch
anything using the latter to wait for events didn't get notified about
recovery conflicts. Most latch users are never the target of recovery
conflicts, which explains the lack of reports about this until
now.
Since 9.3 two possible affected users exist though: The sql callable
pg_sleep() now uses latches to wait and background workers are
expected to use latches in their main loop. Both would currently wait
until the end of WaitLatch's timeout.

Fix by adding a SetLatch() to RecoveryConflictInterrupt(). It'd also
be possible to fix the issue by having each latch user set
set_latch_on_sigusr1. That seems failure prone and though, as most of
these callsites won't often receive recovery conflicts and thus will
likely only be tested against normal query cancels et al. It'd also be
unnecessarily verbose.

Backpatch to 9.1 where latches were introduced. Arguably 9.3 would be
sufficient, because that's where pg_sleep() was converted to waiting
on the latch and background workers got introduced; but there could be
user level code making use of the latch pre 9.3.
2014-06-03 14:13:31 +02:00
Tom Lane
c102c32e9a PL/Python: Adjust the regression tests for Python 3.4
Back-patch commit d0765d50f429472d00554701ac6531c84d324811 into 9.3
and 9.2, which is as far back as we previously bothered to adjust the
regression tests for Python 3.3.  Per gripe from Honza Horak.
2014-06-01 15:03:15 -04:00
Tom Lane
4f5f4da79e On OS X, link libpython normally, ignoring the "framework" framework.
As of Xcode 5.0, Apple isn't including the Python framework as part of the
SDK-level files, which means that linking to it might fail depending on
whether Xcode thinks you've selected a specific SDK version.  According to
their Tech Note 2328, they've basically deprecated the framework method of
linking to libpython and are telling people to link to the shared library
normally.  (I'm pretty sure this is in direct contradiction to the advice
they were giving a few years ago, but whatever.)  Testing says that this
approach works fine at least as far back as OS X 10.4.11, so let's just
rip out the framework special case entirely.  We do still need a special
case to decide that OS X provides a shared library at all, unfortunately
(I wonder why the distutils check doesn't work ...).  But this is still
less of a special case than before, so it's fine.

Back-patch to all supported branches, since we'll doubtless be hearing
about this more as more people update to recent Xcode.
2014-05-30 18:19:14 -04:00
Heikki Linnakangas
6f11869d13 Fix typos in MSVC solution file.
Michael Paquier
2014-05-30 10:27:40 +03:00
Tom Lane
961dd203a2 When using the OSSP UUID library, cache its uuid_t state object.
The original coding in contrib/uuid-ossp created and destroyed a uuid_t
object (or, in some cases, even two of them) each time it was called.
This is not the intended usage: you're supposed to keep the uuid_t object
around so that the library can cache its state across uses.  (Other UUID
libraries seem to keep equivalent state behind-the-scenes in static
variables, but OSSP chose differently.)  Aside from being quite inefficient,
creating a new uuid_t loses knowledge of the previously generated UUID,
which in theory could result in duplicate V1-style UUIDs being created
on sufficiently fast machines.

On at least some platforms, creating a new uuid_t also draws some entropy
from /dev/urandom, leaving less for the rest of the system.  This seems
sufficiently unpleasant to justify back-patching this change.
2014-05-29 13:51:05 -04:00
Tom Lane
2677bace52 Revert "Fix bogus %name-prefix option syntax in all our Bison files."
This reverts commit ece7aa8b0f57d92577055a88579555df895eb929.

It turns out that the %name-prefix syntax without "=" does not work
at all in pre-2.4 Bison.  We are not prepared to make such a large
jump in minimum required Bison version just to suppress a warning
message in a version hardly any developers are using yet.
When 3.0 gets more popular, we'll figure out a way to deal with this.
In the meantime, BISONFLAGS=-Wno-deprecated is recommendable for
anyone using 3.0 who doesn't want to see the warning.
2014-05-28 19:28:37 -04:00
Tom Lane
ece7aa8b0f Fix bogus %name-prefix option syntax in all our Bison files.
%name-prefix doesn't use an "=" sign according to the Bison docs, but it
silently accepted one anyway, until Bison 3.0.  This was originally a
typo of mine in commit 012abebab1bc72043f3f670bf32e91ae4ee04bd2, and we
seem to have slavishly copied the error into all the other grammar files.

Per report from Vik Fearing; analysis by Peter Eisentraut.

Back-patch to all active branches, since somebody might try to build
a back branch with up-to-date tools.
2014-05-28 15:41:55 -04:00
Magnus Hagander
433cacfc0f Ensure cleanup in case of early errors in streaming base backups
Move the code that sends the initial status information as well as the
calculation of paths inside the ENSURE_ERROR_CLEANUP block. If this code
failed, we would "leak" a counter of number of concurrent backups, thereby
making the system always believe it was in backup mode. This could happen
if the sending failed (which it probably never did given that the small
amount of data to send would never cause a flush). It is very low risk, but
all operations after do_pg_start_backup should be protected.
2014-05-28 13:00:09 +02:00
Tom Lane
b8cf89c041 Avoid unportable usage of sscanf(UINT64_FORMAT).
On Mingw, it seems that scanf() doesn't necessarily accept the same format
codes that printf() does, and in particular it may fail to recognize %llu
even though printf() does.  Since configure only probes printf() behavior
while setting up the INT64_FORMAT macros, this means it's unsafe to use
those macros with scanf().  We had only one instance of such a coding
pattern, in contrib/pg_stat_statements, so change that code to avoid
the problem.

Per buildfarm warnings.  Back-patch to 9.0 where the troublesome code
was introduced.

Michael Paquier
2014-05-26 22:23:33 -04:00
Tom Lane
0266a9c781 Prevent auto_explain from changing the output of a user's EXPLAIN.
Commit af7914c6627bcf0b0ca614e9ce95d3f8056602bf, which introduced the
EXPLAIN (TIMING) option, for some reason coded explain.c to look at
planstate->instrument->need_timer rather than es->timing to decide
whether to print timing info.  However, the former flag might get set
as a result of contrib/auto_explain wanting timing information.  We
certainly don't want activation of auto_explain to change user-visible
statement behavior, so fix that.

Also fix an independent bug introduced in the same patch: in the code
path for a never-executed node with a machine-friendly output format,
if timing was selected, it would fail to print the Actual Rows and Actual
Loops items.

Per bug #10404 from Tomonari Katsumata.  Back-patch to 9.2 where the
faulty code was introduced.
2014-05-20 12:20:52 -04:00
Heikki Linnakangas
e0c04b7c38 Use 0-based numbering in comments about backup blocks.
The macros and functions that work with backup blocks in the redo function
use 0-based numbering, so let's use that consistently in the function that
generates the records too. Makes it so much easier to compare the
generation and replay functions.

Backpatch to 9.0, where we switched from 1-based to 0-based numbering.
2014-05-19 13:28:26 +03:00
Tom Lane
777d07d7a3 Fix non-C89-compatible coding in pgbench.
C89 says that compound initializers may only contain constant expressions;
a restriction violated by commit 89d00cbe.  While we've had no actual field
complaints about this, C89 is still the project standard, and it's not
saving all that much code to break compatibility here.  So let's adhere to
the old restriction.

In passing, replace a bunch of hardwired constants "256" with
sizeof(target-variable), just because the latter is more readable and
less breakable.  And const-ify where possible.

Back-patch to 9.3 where the nonportable code was added.

Andres Freund and Tom Lane
2014-05-19 00:06:28 -04:00
Heikki Linnakangas
d6a9767404 Initialize tsId and dbId fields in WAL record of COMMIT PREPARED.
Commit dd428c79 added dbId and tsId to the xl_xact_commit struct but missed
that prepared transaction commits reuse that struct. Fix that.

Because those fields were left unitialized, replaying a commit prepared WAL
record in a hot standby node would fail to remove the relcache init file.
That can lead to "could not open file" errors on the standby. Relcache init
file only needs to be removed when a system table/index is rewritten in the
transaction using two phase commit, so that should be rare in practice. In
HEAD, the incorrect dbId/tsId values are also used for filtering in logical
replication code, causing the transaction to always be filtered out.

Analysis and fix by Andres Freund. Backpatch to 9.0 where hot standby was
introduced.
2014-05-16 10:05:50 +03:00
Tom Lane
8960b2db51 Fix unportable setvbuf() usage in initdb.
In yesterday's commit 2dc4f011fd61501cce507be78c39a2677690d44b, I tried
to force buffering of stdout/stderr in initdb to be what it is by
default when the program is run interactively on Unix (since that's how
most manual testing is done).  This tripped over the fact that Windows
doesn't support _IOLBF mode.  We dealt with that a long time ago in
syslogger.c by falling back to unbuffered mode on Windows.  Export that
solution in port.h and use it in initdb.

Back-patch to 8.4, like the previous commit.
2014-05-15 15:57:57 -04:00
Heikki Linnakangas
19d7e8f07a Handle duplicate XIDs in txid_snapshot.
The proc array can contain duplicate XIDs, when a transaction is just being
prepared for two-phase commit. To cope, remove any duplicates in
txid_current_snapshot(). Also ignore duplicates in the input functions, so
that if e.g. you have an old pg_dump file that already contains duplicates,
it will be accepted.

Report and fix by Jan Wieck. Backpatch to all supported versions.
2014-05-15 18:30:53 +03:00
Heikki Linnakangas
48fc8962f9 Fix race condition in preparing a transaction for two-phase commit.
To lock a prepared transaction's shared memory entry, we used to mark it
with the XID of the backend. When the XID was no longer active according
to the proc array, the entry was implicitly considered as not locked
anymore. However, when preparing a transaction, the backend's proc array
entry was cleared before transfering the locks (and some other state) to
the prepared transaction's dummy PGPROC entry, so there was a window where
another backend could finish the transaction before it was in fact fully
prepared.

To fix, rewrite the locking mechanism of global transaction entries. Instead
of an XID, just have simple locked-or-not flag in each entry (we store the
locking backend's backend id rather than a simple boolean, but that's just
for debugging purposes). The backend is responsible for explicitly unlocking
the entry, and to make sure that that happens, install a callback to unlock
it on abort or process exit.

Backpatch to all supported versions.
2014-05-15 16:58:02 +03:00
Tom Lane
4511e6c831 In initdb, ensure stdout/stderr buffering behavior is what we expect.
Since this program may print to either stdout or stderr, the relative
ordering of its messages depends on the buffering behavior of those files.
Force stdout to be line-buffered and stderr to be unbuffered, ensuring
that the behavior will match standard Unix interactive behavior, even
when stdout and stderr are rerouted to a file.

Per complaint from Tomas Vondra.  The particular case he pointed out is
new in HEAD, but issues of the same sort could arise in any branch with
other error messages, so back-patch to all branches.

I'm unsure whether we might not want to do this in other client programs
as well.  For the moment, just fix initdb.
2014-05-14 21:13:56 -04:00
Tom Lane
e7a9020939 Code review for recent changes in relcache.c.
rd_replidindex should be managed the same as rd_oidindex, and rd_keyattr
and rd_idattr should be managed like rd_indexattr.  Omissions in this area
meant that the bitmapsets computed for rd_keyattr and rd_idattr would be
leaked during any relcache flush, resulting in a slow but permanent leak in
CacheMemoryContext.  There was also a tiny probability of relcache entry
corruption if we ran out of memory at just the wrong point in
RelationGetIndexAttrBitmap.  Otherwise, the fields were not zeroed where
expected, which would not bother the code any AFAICS but could greatly
confuse anyone examining the relcache entry while debugging.

Also, create an API function RelationGetReplicaIndex rather than letting
non-relcache code be intimate with the mechanisms underlying caching of
that value (we won't even mention the memory leak there).

Also, fix a relcache flush hazard identified by Andres Freund:
RelationGetIndexAttrBitmap must not assume that rd_replidindex stays valid
across index_open.

The aspects of this involving rd_keyattr date back to 9.3, so back-patch
those changes.
2014-05-14 14:55:50 -04:00
Heikki Linnakangas
d5b912c905 Initialize padding bytes in btree_gist varbit support.
The code expands a varbit gist leaf key to a node key by copying the bit
data twice in a varlen datum, as both the lower and upper key. The lower key
was expanded to INTALIGN size, but the padding bytes were not initialized.
That's a problem because when the lower/upper keys are compared, the padding
bytes are used compared too, when the values are otherwise equal. That could
lead to incorrect query results.

REINDEX is advised for any btree_gist indexes on bit or bit varying data
type, to fix any garbage padding bytes on disk.

Per Valgrind, reported by Andres Freund. Backpatch to all supported
versions.
2014-05-13 15:27:14 +03:00
Tom Lane
b65a258a51 Ignore config.pl and buildenv.pl in src/tools/msvc.
config.pl and buildenv.pl can be used to customize build settings when
using MSVC.  They should never get committed into the common source tree.

Back-patch to 9.0; it looks like the rules were different in 8.4.

Michael Paquier
2014-05-12 14:24:28 -04:00
Heikki Linnakangas
0ad78306be Free PQresult on error in pg_receivexlog.
The leak is fairly small and rare, but a leak nevertheless.

Per Coverity report. Backpatch to 9.2, where pg_receivexlog was added.
pg_basebackup shares the code, but it always exits on error, so there is
no real leak.
2014-05-12 10:59:08 +03:00
Tom Lane
c6d048c978 Accept tcl 8.6 in configure's probe for tclsh.
Usually the search would find plain "tclsh" without any trouble,
but some installations might only have the version-numbered flavor
of that program.

No compatibility problems have been reported with 8.6, so we might
as well back-patch this to all active branches.

Christoph Berg
2014-05-10 10:48:04 -04:00
Tom Lane
13c6799958 Get rid of bogus dependency on typcategory in to_json() and friends.
These functions were relying on typcategory to identify arrays and
composites, which is not reliable and not the normal way to do it.
Using typcategory to identify boolean, numeric types, and json itself is
also pretty questionable, though the code in those cases didn't seem to be
at risk of anything worse than wrong output.  Instead, use the standard
lsyscache functions to identify arrays and composites, and rely on a direct
check of the type OID for the other cases.

In HEAD, also be sure to look through domains so that a domain is treated
the same as its base type for conversions to JSON.  However, this is a
small behavioral change; given the lack of field complaints, we won't
back-patch it.

In passing, refactor so that there's only one copy of the code that decides
which conversion strategy to apply, not multiple copies that could (and
have) gotten out of sync.
2014-05-09 12:55:03 -04:00
Tom Lane
5c6d3e405f Document permissions needed for pg_database_size and pg_tablespace_size.
Back in 8.3, we installed permissions checks in these functions (see
commits 8bc225e7990a and cc26599b7206).  But we forgot to document that
anywhere in the user-facing docs; it did get mentioned in the 8.3 release
notes, but nobody's looking at that any more.  Per gripe from Suya Huang.
2014-05-08 21:45:14 -04:00
Noah Misch
40d6f6a30f Un-break ecpg test suite under --disable-integer-datetimes.
Commit 4318daecc959886d001a6e79c6ea853e8b1dfb4b broke it.  The change in
sub-second precision at extreme dates is normal.  The inconsistent
truncation vs. rounding is essentially a bug, albeit a longstanding one.
Back-patch to 8.4, like the causative commit.
2014-05-08 19:29:30 -04:00
Heikki Linnakangas
34572920c9 Protect against torn pages when deleting GIN list pages.
To-be-deleted list pages contain no useful information, as they are being
deleted, but we must still protect the writes from being torn by a crash
after a partial write. To do that, re-initialize the pages on WAL replay.

Jeff Janes caught this with a test program to test partial writes.
Backpatch to all supported versions.
2014-05-08 14:43:04 +03:00
Heikki Linnakangas
cd7df1060b Include files copied from libpqport in .gitignore
Michael Paquier
2014-05-08 11:02:22 +03:00
Tom Lane
b4f9c93ce0 Avoid buffer bloat in libpq when server is consistently faster than client.
If the server sends a long stream of data, and the server + network are
consistently fast enough to force the recv() loop in pqReadData() to
iterate until libpq's input buffer is full, then upon processing the last
incomplete message in each bufferload we'd usually double the buffer size,
due to supposing that we didn't have enough room in the buffer to finish
collecting that message.  After filling the newly-enlarged buffer, the
cycle repeats, eventually resulting in an out-of-memory situation (which
would be reported misleadingly as "lost synchronization with server").
Of course, we should not enlarge the buffer unless we still need room
after discarding already-processed messages.

This bug dates back quite a long time: pqParseInput3 has had the behavior
since perhaps 2003, getCopyDataMessage at least since commit 70066eb1a1ad
in 2008.  Probably the reason it's not been isolated before is that in
common environments the recv() loop would always be faster than the server
(if on the same machine) or faster than the network (if not); or at least
it wouldn't be slower consistently enough to let the buffer ramp up to a
problematic size.  The reported cases involve Windows, which perhaps has
different timing behavior than other platforms.

Per bug #7914 from Shin-ichi Morita, though this is different from his
proposed solution.  Back-patch to all supported branches.
2014-05-07 21:38:38 -04:00
Tom Lane
fc58c39d46 Fix failure to set ActiveSnapshot while rewinding a cursor.
ActiveSnapshot needs to be set when we call ExecutorRewind because some
plan node types may execute user-defined functions during their ReScan
calls (nodeLimit.c does so, at least).  The wisdom of that is somewhat
debatable, perhaps, but for now the simplest fix is to make sure the
required context is valid.  Failure to do this typically led to a
null-pointer-dereference core dump, though it's possible that in more
complex cases a function could be executed with the wrong snapshot
leading to very subtle misbehavior.

Per report from Leif Jensen.  It's been broken for a long time, so
back-patch to all active branches.
2014-05-07 14:25:13 -04:00
Jeff Davis
b671061e14 Fix interval test, which was broken for floating-point timestamps.
Commit 4318daecc959886d001a6e79c6ea853e8b1dfb4b introduced a test that
couldn't be made consistent between integer and floating-point
timestamps.

It was designed to test the longest possible interval output length,
so removing four zeros from the number of hours, as this patch does,
is not ideal. But the test still has some utility for its original
purpose, and there aren't a lot of other good options.

Noah Misch suggested a different approach where we test that the
output either matches what we expect from integer timestamps or what
we expect from floating-point timestamps. That seemed to obscure an
otherwise simple test, however.

Reviewed by Tom Lane and Noah Misch.
2014-05-06 21:30:15 -07:00
Bruce Momjian
04e15c69d2 Remove tabs after spaces in C comments
This was not changed in HEAD, but will be done later as part of a
pgindent run.  Future pgindent runs will also do this.

Report by Tom Lane

Backpatch through all supported branches, but not HEAD
2014-05-06 11:26:28 -04:00
Simon Riggs
41fdcf71d2 Correct comment in Hot Standby nbtree handling
Logic is correct, matching handling of LP_DEAD elsewhere.
2014-05-06 14:45:05 +01:00
Heikki Linnakangas
a7a3e71c85 Fix use of free in walsender error handling after a sysid mismatch.
Found via valgrind. The bug exists since the introduction of the walsender,
so backpatch to 9.0.

Andres Freund
2014-05-06 15:16:51 +03:00
Michael Meskes
b4eeb9d58e Fix handling of array of char pointers in ecpglib.
When array of char * was used as target for a FETCH statement returning more
than one row, it tried to store all the result in the first element. Instead it
should dump array of char pointers with right offset, use the address instead
of the value of the C variable while reading the array and treat such variable
as char **, instead of char * for pointer arithmetic.

Patch by Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>
2014-05-06 13:04:30 +02:00
Tom Lane
4f4ef042fd Fix possible cache invalidation failure in ReceiveSharedInvalidMessages.
Commit fad153ec45299bd4d4f29dec8d9e04e2f1c08148 modified sinval.c to reduce
the number of calls into sinvaladt.c (which require taking a shared lock)
by keeping a local buffer of collected-but-not-yet-processed messages.
However, if processing of the last message in a batch resulted in a
recursive call to ReceiveSharedInvalidMessages, we could overwrite that
message with a new one while the outer invalidation function was still
working on it.  This would be likely to lead to invalidation of the wrong
cache entry, allowing subsequent processing to use stale cache data.
The fix is just to make a local copy of each message while we're processing
it.

Spotted by Andres Freund.  Back-patch to 8.4 where the bug was introduced.
2014-05-05 14:43:42 -04:00
Tom Lane
693a82dddf Fix "quiet inline" configure test for newer clang compilers.
This test used to just define an unused static inline function and check
whether that causes a warning.  But newer clang versions warn about
unused static inline functions when defined inside a .c file, but not
when defined in an included header, which is the case we care about.
Change the test to cope.

Andres Freund
2014-05-02 15:30:29 -04:00
Tom Lane
e31193d495 Fix yet another corner case in dumping rules/views with USING clauses.
ruleutils.c tries to cope with additions/deletions/renamings of columns in
tables referenced by views, by means of adding machine-generated aliases to
the printed form of a view when needed to preserve the original semantics.
A recent blog post by Marko Tiikkaja pointed out a case I'd missed though:
if one input of a join with USING is itself a join, there is nothing to
stop the user from adding a column of the same name as the USING column to
whichever side of the sub-join didn't provide the USING column.  And then
there'll be an error when the view is re-parsed, since now the sub-join
exposes two columns matching the USING specification.  We were catching a
lot of related cases, but not this one, so add some logic to cope with it.

Back-patch to 9.3, which is the first release that makes any serious
attempt to cope with such cases (cf commit 2ffa740be and follow-ons).
2014-05-01 20:22:39 -04:00
Tom Lane
b72e90bc31 Fix failure to detoast fields in composite elements of structured types.
If we have an array of records stored on disk, the individual record fields
cannot contain out-of-line TOAST pointers: the tuptoaster.c mechanisms are
only prepared to deal with TOAST pointers appearing in top-level fields of
a stored row.  The same applies for ranges over composite types, nested
composites, etc.  However, the existing code only took care of expanding
sub-field TOAST pointers for the case of nested composites, not for other
structured types containing composites.  For example, given a command such
as

UPDATE tab SET arraycol = ARRAY[(ROW(x,42)::mycompositetype] ...

where x is a direct reference to a field of an on-disk tuple, if that field
is long enough to be toasted out-of-line then the TOAST pointer would be
inserted as-is into the array column.  If the source record for x is later
deleted, the array field value would become a dangling pointer, leading
to errors along the line of "missing chunk number 0 for toast value ..."
when the value is referenced.  A reproducible test case for this was
provided by Jan Pecek, but it seems likely that some of the "missing chunk
number" reports we've heard in the past were caused by similar issues.

Code-wise, the problem is that PG_DETOAST_DATUM() is not adequate to
produce a self-contained Datum value if the Datum is of composite type.
Seen in this light, the problem is not just confined to arrays and ranges,
but could also affect some other places where detoasting is done in that
way, for example form_index_tuple().

I tried teaching the array code to apply toast_flatten_tuple_attribute()
along with PG_DETOAST_DATUM() when the array element type is composite,
but this was messy and imposed extra cache lookup costs whether or not any
TOAST pointers were present, indeed sometimes when the array element type
isn't even composite (since sometimes it takes a typcache lookup to find
that out).  The idea of extending that approach to all the places that
currently use PG_DETOAST_DATUM() wasn't attractive at all.

This patch instead solves the problem by decreeing that composite Datum
values must not contain any out-of-line TOAST pointers in the first place;
that is, we expand out-of-line fields at the point of constructing a
composite Datum, not at the point where we're about to insert it into a
larger tuple.  This rule is applied only to true composite Datums, not
to tuples that are being passed around the system as tuples, so it's not
as invasive as it might sound at first.  With this approach, the amount
of code that has to be touched for a full solution is greatly reduced,
and added cache lookup costs are avoided except when there actually is
a TOAST pointer that needs to be inlined.

The main drawback of this approach is that we might sometimes dereference
a TOAST pointer that will never actually be used by the query, imposing a
rather large cost that wasn't there before.  On the other side of the coin,
if the field value is used multiple times then we'll come out ahead by
avoiding repeat detoastings.  Experimentation suggests that common SQL
coding patterns are unaffected either way, though.  Applications that are
very negatively affected could be advised to modify their code to not fetch
columns they won't be using.

In future, we might consider reverting this solution in favor of detoasting
only at the point where data is about to be stored to disk, using some
method that can drill down into multiple levels of nested structured types.
That will require defining new APIs for structured types, though, so it
doesn't seem feasible as a back-patchable fix.

Note that this patch changes HeapTupleGetDatum() from a macro to a function
call; this means that any third-party code using that macro will not get
protection against creating TOAST-pointer-containing Datums until it's
recompiled.  The same applies to any uses of PG_RETURN_HEAPTUPLEHEADER().
It seems likely that this is not a big problem in practice: most of the
tuple-returning functions in core and contrib produce outputs that could
not possibly be toasted anyway, and the same probably holds for third-party
extensions.

This bug has existed since TOAST was invented, so back-patch to all
supported branches.
2014-05-01 15:19:10 -04:00
Tom Lane
0140bee43d Check for interrupts and stack overflow during rule/view dumps.
Since ruleutils.c recurses, it could be driven to stack overflow by
deeply nested constructs.  Very large queries might also take long
enough to deparse that a check for interrupts seems like a good idea.
Stick appropriate tests into a couple of key places.

Noted by Greg Stark.  Back-patch to all supported branches.
2014-04-30 13:46:16 -04:00