An ancient logic error in cfindloop() could cause the regex engine to fail
to find matches that begin later than the start of the string. This
function is only used when the regex pattern contains a back reference,
and so far as we can tell the error is only reachable if the pattern is
non-greedy (i.e. its first quantifier uses the ? modifier). Furthermore,
the actual match must begin after some potential match that satisfies the
DFA but then fails the back-reference's match test.
Reported and fixed by Jeevan Chalke, with cosmetic adjustments by me.
In pg_basebackup.c:reached_end_position(), we're reading from an
internal pipe with our own background process but we're possibly
reading more bytes than will actually fit into our buffer due to
an off-by-one error. As we're reading from an internal pipe
there's no real risk here, but it's good form to not depend on
such convenient arrangements.
Bug spotted by the Coverity scanner.
Back-patch to 9.2 where this showed up.
In tuplesort.c:inittapes(), we calculate tapeSpace by first figuring
out how many 'tapes' we can use (maxTapes) and then multiplying the
result by the tape buffer overhead for each. Unfortunately, when
we are on a system with an 8-byte long, we allow work_mem to be
larger than 2GB and that allows maxTapes to be large enough that the
32bit arithmetic can overflow when multiplied against the buffer
overhead.
When this overflow happens, we end up adding the overflow to the
amount of space available, causing the amount of memory allocated to
be larger than work_mem.
Note that to reach this point, you have to set work mem to at least
24GB and be sorting a set which is at least that size. Given that a
user who can set work_mem to 24GB could also set it even higher, if
they were looking to run the system out of memory, this isn't
considered a security issue.
This overflow risk was found by the Coverity scanner.
Back-patch to all supported branches, as this issue has existed
since before 8.4.
The code in set_append_rel_pathlist() for building parameterized paths
for append relations (inheritance and UNION ALL combinations) supposed
that the cheapest regular path for a child relation would still be cheapest
when reparameterized. Which might not be the case, particularly if the
added join conditions are expensive to compute, as in a recent example from
Jeff Janes. Fix it to compare child path costs *after* reparameterizing.
We can short-circuit that if the cheapest pre-existing path is already
parameterized correctly, which seems likely to be true often enough to be
worth checking for.
Back-patch to 9.2 where parameterized paths were introduced.
With -Wtype-limits, gcc correctly points out that size_t can never be < 0.
Backpatch to 9.3 and 9.2. It's been like this forever, but in <= 9.1 you got
a lot other warnings with -Wtype-limits anyway (at least with my version of
gcc).
Andres Freund
When there's a comment on an index that was created with UNIQUE or PRIMARY
KEY constraint syntax, we need to label the comment as depending on the
constraint not the index, since only the constraint object actually appears
in the dump. This incorrect dependency can lead to parallel pg_restore
trying to restore the comment before the index has been created, per bug
#8257 from Lloyd Albin.
This patch fixes pg_dump to produce the right dependency in dumps made
in the future. Usually we also try to hack pg_restore to work around
bogus dependencies, so that existing (wrong) dumps can still be restored in
parallel mode; but that doesn't seem practical here since there's no easy
way to relate the constraint dump entry to the comment after the fact.
Andres Freund
On Unix-ish platforms, EWOULDBLOCK may be the same as EAGAIN, which is
*not* a success return, at least not on Linux. We need to treat it as a
failure to avoid giving a misleading error message. Per the Single Unix
Spec, only EINPROGRESS and EINTR returns indicate that the connection
attempt is in progress.
On Windows, on the other hand, EWOULDBLOCK (WSAEWOULDBLOCK) is the expected
case. We must accept EINPROGRESS as well because Cygwin will return that,
and it doesn't seem worth distinguishing Cygwin from native Windows here.
It's not very clear whether EINTR can occur on Windows, but let's leave
that part of the logic alone in the absence of concrete trouble reports.
Also, remove the test for errno == 0, effectively reverting commit
da9501bddb42222dc33c031b1db6ce2133bcee7b, which AFAICS was just a thinko;
or at best it might have been a workaround for a platform-specific bug,
which we can hope is gone now thirteen years later. In any case, since
libpq makes no effort to reset errno to zero before calling connect(),
it seems unlikely that that test has ever reliably done anything useful.
Andres Freund and Tom Lane
In binary upgrade mode, we need to recreate and then drop dropped
columns so that all the columns get the right attribute number. This is
true for foreign tables as well as for native tables. For foreign
tables we have been getting the first part right but not the second,
leading to bogus columns in the upgraded database. Fix this all the way
back to 9.1, where foreign tables were introduced.
In replication, when we shutdown the master, walsender tries to send
all the outstanding WAL records to the standby, and then to exit. This
basically means that all the WAL records are fully synced between
two servers after the clean shutdown of the master. So, after
promoting the standby to new master, we can restart the stopped
master as new standby without the need for a fresh backup from
new master.
But there was one problem so far: though walsender tries to send all
the outstanding WAL records, it doesn't wait for them to be replicated
to the standby. Then, before receiving all the WAL records,
walreceiver can detect the closure of connection and exit. We cannot
guarantee that there is no missing WAL in the standby after clean
shutdown of the master. In this case, backup from new master is
required when restarting the stopped master as new standby.
This patch fixes this problem. It just changes walsender so that it
waits for all the outstanding WAL records to be replicated to the
standby before closing the replication connection.
Per discussion, this is a fix that needs to get backpatched rather than
new feature. So, back-patch to 9.1 where enough infrastructure for
this exists.
Patch by me, reviewed by Andres Freund.
In some cases with higher numbers of subtransactions
it was possible for us to incorrectly initialize
subtrans leading to complaints of missing pages.
Bug report by Sergey Konoplev
Analysis and fix by Andres Freund
In Danish collations, there are letter combinations which sort
higher than 'Z'. A test for values > 'WA' was picking up rows
where the value started with 'AA', causing the test to fail.
Backpatch to 9.2, where the failing test was added.
Per report from Svenne Krap and analysis by Jeff Janes
SP-GiST's original scheme for avoiding deadlocks during concurrent index
insertions doesn't work, as per report from Hailong Li, and there isn't any
evident way to make it work completely. We could possibly lock individual
inner tuples instead of their whole pages, but preliminary experimentation
suggests that the performance penalty would be huge. Instead, if we fail
to get a buffer lock while descending the tree, just restart the tree
descent altogether. We keep the old tuple positioning rules, though, in
hopes of reducing the number of cases where this can happen.
Teodor Sigaev, somewhat edited by Tom Lane
In most scenarios a portal without a ResourceOwner is dead and not subject
to any further execution, but a portal for a cursor WITH HOLD remains in
existence with no ResourceOwner after the creating transaction is over.
In this situation, if we attempt to "execute" the portal directly to fetch
data from it, we were setting CurrentResourceOwner to NULL, leading to a
segfault if the datatype output code did anything that required a resource
owner (such as trying to fetch system catalog entries that weren't already
cached). The case appears to be impossible to provoke with stock libpq,
but psqlODBC at least is able to cause it when working with held cursors.
Simplest fix is to just skip the assignment to CurrentResourceOwner, so
that any resources used by the data output operations will be managed by
the transaction-level resource owner instead. For consistency I changed
all the places that install a portal's resowner as current, even though
some of them are probably not reachable with a held cursor's portal.
Per report from Joshua Berry (with thanks to Hiroshi Inoue for developing
a self-contained test case). Back-patch to all supported versions.
We need to increment the refcount on the composite type's cached tuple
descriptor while we do lookups of its column types. Otherwise a cache
flush could occur and release the tuple descriptor before we're done with
it. This fails reliably with -DCLOBBER_CACHE_ALWAYS, but the odds of a
failure in a production build seem rather low (since the pfree'd descriptor
typically wouldn't get scribbled on immediately). That may explain the
lack of any previous reports. Buildfarm issue noted by Christian Ullrich.
Back-patch to 9.1 where the bogus code was added.
getSchemaData() must identify extension member objects and mark them
as not to be dumped. This must happen after reading all objects that can be
direct members of extensions, but before we begin to process table subsidiary
objects. Both rules and event triggers were wrong in this regard.
Backport rules portion of patch to 9.1 -- event triggers do not exist prior to 9.3.
Suggested fix by Tom Lane, initial complaint and patch by me.
When the existing code here was written, it made sense to special-case
RowExprs because that was the only way that we could handle row comparisons
at all. Now that we have record_eq() and arrays of composites, the generic
logic for "scalar" types will in fact work on RowExprs too, so there's no
reason to throw error for combinations of RowExprs and other ways of
forming composite values, nor to ignore the possibility of using a
ScalarArrayOpExpr. But keep using the old logic when comparing two
RowExprs, for consistency with the main transformAExprOp() logic. (This
allows some cases with not-quite-identical rowtypes to succeed, so we might
get push-back if we removed it.) Per bug #8198 from Rafal Rzepecki.
Back-patch to all supported branches, since this works fine as far back as
8.4.
Rafal Rzepecki and Tom Lane
Per discussion, this restriction isn't needed for any real security reason,
and it seems to confuse people more often than it helps them. It could
also result in some database states being unrestorable. So just drop it.
Back-patch to 9.0, where ALTER DEFAULT PRIVILEGES was introduced.
AllocateFile(), AllocateDir(), and some sister routines share a small array
for remembering requests, so that the files can be closed on transaction
failure. Previously that array had a fixed size, MAX_ALLOCATED_DESCS (32).
While historically that had seemed sufficient, Steve Toutant pointed out
that this meant you couldn't scan more than 32 file_fdw foreign tables in
one query, because file_fdw depends on the COPY code which uses
AllocateFile(). There are probably other cases, or will be in the future,
where this nonconfigurable limit impedes users.
We can't completely remove any such limit, at least not without a lot of
work, since each such request requires a kernel file descriptor and most
platforms limit the number we can have. (In principle we could
"virtualize" these descriptors, as fd.c already does for the main VFD pool,
but not without an additional layer of overhead and a lot of notational
impact on the calling code.) But we can at least let the array size be
configurable. Hence, change the code to allow up to max_safe_fds/2
allocated file requests. On modern platforms this should allow several
hundred concurrent file_fdw scans, or more if one increases the value of
max_files_per_process. To go much further than that, we'd need to do some
more work on the data structure, since the current code for closing
requests has potentially O(N^2) runtime; but it should still be all right
for request counts in this range.
Back-patch to 9.1 where contrib/file_fdw was introduced.
Long-standing code has called tolower() on identifier character bytes
with the high bit set. This is clearly an error and produces junk output
when the encoding is multi-byte. This patch therefore restricts this
activity to cases where there is a character with the high bit set AND
the encoding is single-byte.
There have been numerous gripes about this, most recently from Martin
Schäfer.
Backpatch to all live releases.
The planner is aware that it mustn't push down upper-level quals into
subqueries if the quals reference subquery output columns that contain
set-returning functions or volatile functions, or are non-DISTINCT outputs
of a DISTINCT ON subquery. However, it missed making this check when
there were one or more levels of UNION or INTERSECT above the dangerous
expression. This could lead to "set-valued function called in context that
cannot accept a set" errors, as seen in bug #8213 from Eric Soroos, or to
silently wrong answers in the other cases.
To fix, refactor the checks so that we make the column-is-unsafe checks
during subquery_is_pushdown_safe(), which already has to recursively
inspect all arms of a set-operation tree. This makes
qual_is_pushdown_safe() considerably simpler, at the cost that we will
spend some cycles checking output columns that possibly aren't referenced
in any upper qual. But the cases where this code gets executed at all
are already nontrivial queries, so it's unlikely anybody will notice any
slowdown of planning.
This has been broken since commit 05f916e6add9726bf4ee046e4060c1b03c9961f2,
which makes the bug over ten years old. A bit surprising nobody noticed it
before now.
In commit 2c92edad48796119c83d7dbe6c33425d1924626d, I broke "EXPLAIN
(ANALYZE)" syntax, because I mistakenly thought that ANALYZE/ANALYSE were
only partially reserved and thus would be included in NonReservedWord;
but actually they're fully reserved so they still need to be called out
here.
A nicer solution would be to demote these words to type_func_name_keyword
status (they can't be less than that because of "VACUUM [ANALYZE] ColId").
While that works fine so far as the core grammar is concerned, it breaks
ECPG's grammar for reasons I don't have time to isolate at the moment.
So do this for the time being.
Per report from Kevin Grittner. Back-patch to 9.0, like the previous
commit.
The new message (and SQLSTATE) matches the corresponding error cases in
namespace.c.
This was thought to be a "can't happen" case when extension.c was written,
so we didn't think hard about how to report it. But it definitely can
happen in 9.2 and later, since we no longer require search_path to contain
any valid schema names. It's probably also possible in 9.1 if search_path
came from a noninteractive source. So, back-patch to all releases
containing this code.
Per report from Sean Chittenden, though this isn't exactly his patch.
Use the same gcc atomic functions as we do on newer ARM chips.
(Basically this is a copy and paste of the __arm__ code block,
but omitting the SWPB option since that definitely won't work.)
Back-patch to 9.2. The patch would work further back, but we'd also
need to update config.guess/config.sub in older branches to make them
build out-of-the-box, and there hasn't been demand for it.
Mark Salter
The array allocated by GetRunningTransactionLocks() needs to be pfree'd
when we're done with it. Otherwise we leak some memory during each
checkpoint, if wal_level = hot_standby. This manifests as memory bloat
in the checkpointer process, or in bgwriter in versions before we made
the checkpointer separate.
Reported and fixed by Naoya Anzai. Back-patch to 9.0 where the issue
was introduced.
In passing, improve comments for GetRunningTransactionLocks(), and add
an Assert that we didn't overrun the palloc'd array.
"eval q{foo}" used to complain that the error was on line 2 of the eval'd
string, because eval internally tacked on "\n;" so that the end of the
erroneous command was indeed on line 2. But as of Perl 5.18 it more
sanely says that the error is on line 1. To avoid Perl-version-dependent
regression test results, use "eval q{foo;}" instead in the two places
where this matters. Per buildfarm.
Since people might try to use newer Perl versions with older PG releases,
back-patch as far as 9.0 where these test cases were added.
This change makes type_func_name_keywords less reserved than they were
before, by allowing them for role names, language names, EXPLAIN and COPY
options, and SET values for GUCs; which are all places where few if any
actual keywords could appear instead, so no new ambiguities are introduced.
The main driver for this change is to allow "COPY ... (FORMAT BINARY)"
to work without quoting the word "binary". That is an inconsistency that
has been complained of repeatedly over the years (at least by Pavel Golub,
Kurt Lidl, and Simon Riggs); but we hadn't thought of any non-ugly solution
until now.
Back-patch to 9.0 where the COPY (FORMAT BINARY) syntax was introduced.
When COPY uses the multi-insert method to insert a batch of tuples into the
heap at a time, incorrect line number was printed if something went wrong in
inserting the index tuples (primary key failure, for exampl), or processing
after row triggers.
Fixes bug #8173 reported by Lloyd Albin. Backpatch to 9.2, where the multi-
insert code was added.
PathNameOpenFile failed to ensure that the correct value of errno was
returned to its caller after a failure (because it incorrectly supposed
that free() can never change errno). In some cases this would result
in a user-visible failure because an expected ENOENT errno was replaced
with something else. Bogus EINVAL failures have been observed on OS X,
for example.
There were also a couple of places that could mangle an important value
of errno if FDDEBUG was defined. While the usefulness of that debug
support is highly debatable, we might as well make it safe to use,
so add errno save/restore logic to the DO_DB macro.
Per bug #8167 from Nelson Minar, diagnosed by RhodiumToad.
Back-patch to all supported branches.
If OID wraparound should occur while in standalone mode (unlikely but
possible), we want to advance the counter to FirstNormalObjectId not
FirstBootstrapObjectId. Otherwise, user objects might be created with OIDs
in the system-reserved range. That isn't immediately harmful but it poses
a risk of conflicts during future pg_upgrade operations.
Noted by Andres Freund. Back-patch to all supported branches, since all of
them are supported sources for pg_upgrade operations.
This case doesn't normally happen, because the planner usually clamps
all row estimates to at least one row; but I found that it can arise
when dealing with relations excluded by constraints. Without a defense,
estimate_num_groups() can return zero, which leads to divisions by zero
inside the planner as well as assertion failures in the executor.
An alternative fix would be to change set_dummy_rel_pathlist() to make
the size estimate for a dummy relation 1 row instead of 0, but that seemed
pretty ugly; and probably someday we'll want to drop the convention that
the minimum rowcount estimate is 1 row.
Back-patch to 8.4, as the problem can be demonstrated that far back.
Commit d22a09dc70f9830fa78c1cd1a3a453e4e473d354 introduced official support
for GiST consistentFns that want to cache data using the FmgrInfo fn_extra
pointer: the idea was to preserve the cached values across gistrescan(),
whereas formerly they'd been leaked. However, there was an oversight in
that, namely that multiple scan keys might reference the same column's
consistentFn; the code would result in propagating the same cache value
into multiple scan keys, resulting in crashes or wrong answers. Use a
separate array instead to ensure that each scan key keeps its own state.
Per bug #8143 from Joel Roller. Back-patch to 9.2 where the bug was
introduced.
This reverts commit 15b0421002624919c62ae3c6574af2a8452bf6c4. Per
complaint from Robert Haas, that patch caused crashes on appendrel
cases, and it didn't fix all forms of the problem anyway. Consensus
is we'll leave this problem alone in the back branches, at least
for now.
WAL segment means a 16 MB physical WAL file; this comment meant a logical
4 GB log file.
Amit Langote. Apply to backbranches only, as the comment is gone in master.
A view defined as "select <something> where false" had the curious property
that the system wouldn't check whether users had the privileges necessary
to select from it. More generally, permissions checks could be skipped
for tables referenced in sub-selects or views that were proven empty by
constraint exclusion (although some quick testing suggests this seldom
happens in cases of practical interest). This happened because the planner
failed to include rangetable entries for such tables in the finished plan.
This was noticed in connection with erroneous handling of materialized
views, but actually the issue is quite unrelated to matviews. Therefore,
revert commit 200ba1667b3a8d7a9d559d2f05f83d209c9d8267 in favor of a more
direct test for the real problem.
Back-patch to 9.2 where the bug was introduced (by commit
7741dd6590073719688891898e85f0cb73453159).
This is a follow-up to the earlier fix, which changed the recycling logic
to recycle WAL segments under the current recovery target timeline. That
turns out to be a bad idea, because installing a recycled segment with
a TLI higher than what we're recovering at the moment means that the recovery
logic will find the recycled WAL segment and try to replay it. It will fail,
but but the mere presence of such a WAL segment will mask any other, real,
file with the same log/seg, but smaller TLI.
Per report from Mitsumasa Kondo. Apply to 9.1 and 9.2, like the previous
fix. Master was already doing this differently; this patch makes 9.1 and
9.2 to do the same thing as master.
This patch gets rid of the concept of, and infrastructure for,
non-canonical PathKeys; we now only ever create canonical pathkey lists.
The need for non-canonical pathkeys came from the desire to have
grouping_planner initialize query_pathkeys and related pathkey lists before
calling query_planner. However, since query_planner didn't actually *do*
anything with those lists before they'd been made canonical, we can get rid
of the whole mess by just not creating the lists at all until the point
where we formerly canonicalized them.
There are several ways in which we could implement that without making
query_planner itself deal with grouping/sorting features (which are
supposed to be the province of grouping_planner). I chose to add a
callback function to query_planner's API; other alternatives would have
required adding more fields to PlannerInfo, which while not bad in itself
would create an ABI break for planner-related plugins in the 9.2 release
series. This still breaks ABI for anything that calls query_planner
directly, but it seems somewhat unlikely that there are any such plugins.
I had originally conceived of this change as merely a step on the way to
fixing bug #8049 from Teun Hoogendoorn; but it turns out that this fixes
that bug all by itself, as per the added regression test. The reason is
that now get_eclass_for_sort_expr is adding the ORDER BY expression at the
end of EquivalenceClass creation not the start, and so anything that is in
a multi-member EquivalenceClass has already been created with correct
em_nullable_relids. I am suspicious that there are related scenarios in
which we still need to teach get_eclass_for_sort_expr to compute correct
nullable_relids, but am not eager to risk destabilizing either 9.2 or 9.3
to fix bugs that are only hypothetical. So for the moment, do this and
stop here.
Back-patch to 9.2 but not to earlier branches, since they don't exhibit
this bug for lack of join-clause-movement logic that depends on
em_nullable_relids being correct. (We might have to revisit that choice
if any related bugs turn up.) In 9.2, don't change the signature of
make_pathkeys_for_sortclauses nor remove canonicalize_pathkeys, so as
not to risk more plugin breakage than we have to.
Patch b19e4250b45e91c9cbdd18d35ea6391ab5961c8d attempted to
preserve existing behavior regarding statistics generation in the
case that a truncation attempt was canceled due to lock conflicts.
It failed to do this accurately in two regards: (1) autovacuum had
previously generated statistics if the truncate attempt failed to
initially get the lock rather than having started the attempt, and
(2) the VACUUM ANALYZE command had always generated statistics.
Both of these changes were unintended, and are reverted by this
patch. On review, there seems to be consensus that the previous
failure to generate statistics when the truncate was terminated
was more an unfortunate consequence of how that effort was
previously terminated than a feature we want to keep; so this
patch generates statistics even when an autovacuum truncation
attempt terminates early. Another unintended change which is kept
on the basis that it is an improvement is that when a VACUUM
command is truncating, it will the new heuristic for avoiding
blocking other processes, rather than keeping an
AccessExclusiveLock on the table for however long the truncation
takes.
Per multiple reports, with some renaming per patch by Jeff Janes.
Backpatch to 9.0, where problem was created.
There was a high probability of two or more concurrent C.I.C. commands
deadlocking just before completion, because each would wait for the others
to release their reference snapshots. Fix by releasing the snapshot
before waiting for other snapshots to go away.
Per report from Paul Hinze. Back-patch to all active branches.
When creating or manipulating a cached plan for a transaction control
command (particularly ROLLBACK), we must not perform any catalog accesses,
since we might be in an aborted transaction. However, plancache.c busily
saved or examined the search_path for every cached plan. If we were
unlucky enough to do this at a moment where the path's expansion into
schema OIDs wasn't already cached, we'd do some catalog accesses; and with
some more bad luck such as an ill-timed signal arrival, that could lead to
crashes or Assert failures, as exhibited in bug #8095 from Nachiket Vaidya.
Fortunately, there's no real need to consider the search path for such
commands, so we can just skip the relevant steps when the subject statement
is a TransactionStmt. This is somewhat related to bug #5269, though the
failure happens during initial cached-plan creation rather than
revalidation.
This bug has been there since the plan cache was invented, so back-patch
to all supported branches.