We were considering the INCLUDE columns as part of the key, allowing
unicity-violating rows to be inserted in different partitions.
Concurrent development conflict in eb7ed3f30634 and 8224de4f42cc.
Reported-by: Justin Pryzby
Discussion: https://postgr.es/m/20190109065109.GA4285@telsasoft.com
This especially helps braces that surround code blocks. Back-patch to
v11, where commit 56fb890ace8ac0ca955ae0803c580c2074f876f6 first
appeared; before that, settings were even more distant from perltidy.
Reviewed by Andrew Dunstan.
Discussion: https://postgr.es/m/20190103055355.GB267595@gust.leadboat.com
Some makefiles were trying to do this:
temp-install: EXTRA_INSTALL=contrib/test_decoding
but that no longer works as of commit aa019da52: the macro is now
consulted by the checkprep target, one level down, and apparently
gmake doesn't propagate such macro settings recursively.
The problem is masked since 42e61c774 because pgxs.mk also sets up
EXTRA_INSTALL, and correctly applies it to the checkprep target.
Unfortunately I'd not risked back-patching that to before v11.
Since aa019da52 was pushed back to v10, it broke test_decoding
there (the only module for which this actually makes a difference
at present).
Hence, back-patch 42e61c774 to v10. Also, remove some demonstrably
useless settings of EXTRA_INSTALL in v10 and v11 (they'd already
been cleaned up in HEAD).
Per buildfarm.
Discussion: https://postgr.es/m/CAEepm=1pEJdwv6DSGmOfpX0EaX7L7sT28c1nXpqvQvmLfEWb1g@mail.gmail.com
Up to now, createplan.c attempted to share PARAM_EXEC slots for
NestLoopParams across different plan levels, if the same underlying Var
was being fed down to different righthand-side subplan trees by different
NestLoops. This was, I think, more of an artifact of using subselect.c's
PlannerParamItem infrastructure than an explicit design goal, but anyway
that was the end result.
This works well enough as long as the plan tree is executing synchronously,
but the feature whereby Gather can execute the parallelized subplan locally
breaks it. An upper NestLoop node might execute for a row retrieved from
a parallel worker, and assign a value for a PARAM_EXEC slot from that row,
while the leader's copy of the parallelized subplan is suspended with a
different active value of the row the Var comes from. When control
eventually returns to the leader's subplan, it gets the wrong answers if
the same PARAM_EXEC slot is being used within the subplan, as reported
in bug #15577 from Bartosz Polnik.
This is pretty reminiscent of the problem fixed in commit 46c508fbc, and
the proper fix seems to be the same: don't try to share PARAM_EXEC slots
across different levels of controlling NestLoop nodes.
This requires decoupling NestLoopParam handling from PlannerParamItem
handling, although the logic remains somewhat similar. To avoid bizarre
division of labor between subselect.c and createplan.c, I decided to move
all the param-slot-assignment logic for both cases out of those files
and put it into a new file paramassign.c. Hopefully it's a bit better
documented now, too.
A regression test case for this might be nice, but we don't know a
test case that triggers the problem with a suitably small amount
of data.
Back-patch to 9.6 where we added Gather nodes. It's conceivable that
related problems exist in older branches; but without some evidence
for that, I'll leave the older branches alone.
Discussion: https://postgr.es/m/15577-ca61ab18904af852@postgresql.org
Since approximately PostgreSQL 10, it is no longer required that
environment variables at installation time such as PERL, PYTHON, TCLSH
be "full path names", so change that phrasing in the installation
instructions. (The exact time of change appears to differ for PERL
and the others, but it works consistently in PostgreSQL 10.)
Also while we're here document the defaults for PERL and PYTHON, but
since the search list for TCLSH is so long, let's leave that out so we
don't need to maintain a copy of that list in the installation
instructions.
This was an oversight in commit 16828d5c. If the table is going to be
rewritten, we simply clear all the missing values from all the table's
attributes, since there will no longer be any rows with the attributes
missing. Otherwise, we repackage the missing value in an array
constructed with the new type specifications.
Backpatch to release 11.
This fixes bug #15446, reported by Dmitry Molotkov
Reviewed by Dean Rasheed
For a long time, plpgsql has allowed trigger functions to parse
references to OLD and NEW even if the current trigger event type didn't
assign a value to one or the other variable; but actually executing such
a reference would fail. The v11 changes to use "expanded records" for
DTYPE_REC variables changed the behavior so that the unassigned variable
now reads as a null composite value. While this behavioral change was
more or less unintentional, it seems that leaving it like this is better
than adding code and complexity to be bug-compatible with the old way.
The change doesn't break any code that worked before, and it eliminates
a gotcha that often required extra code to work around.
Hence, update the docs to say that these variables are "null" not
"unassigned" when not relevant to the event type. And add a regression
test covering the behavior, so that we'll notice if we ever break it
again.
Per report from Kristjan Tammekivi.
Discussion: https://postgr.es/m/CAABK7uL-uC9ZxKBXzo_68pKt7cECfNRv+c35CXZpjq6jCAzYYA@mail.gmail.com
runtime.sgml said that you couldn't change SysV IPC parameters on OpenBSD
except by rebuilding the kernel. That's definitely wrong in OpenBSD 6.x,
and excavation in their man pages says it changed in OpenBSD 3.3.
Update NetBSD and OpenBSD sections to recommend adjustment of the SEMMNI
and SEMMNS settings, which are painfully small by default on those
platforms. (The discussion thread contemplated recommending that
people select POSIX semaphores instead, but the performance consequences
of that aren't really clear, so I'll refrain.)
Remove pointless discussion of SEMMNU and SEMMAP from the FreeBSD
section. Minor other wordsmithing.
Discussion: https://postgr.es/m/27582.1546928073@sss.pgh.pa.us
This patch changes the rule for whether or not a tuple seen by ANALYZE
should be included in its sample.
When we last touched this logic, in commit 51e1445f1, we weren't
thinking very hard about tuples being UPDATEd by a long-running
concurrent transaction. In such a case, we might see the pre-image as
either LIVE or DELETE_IN_PROGRESS depending on timing; and we might see
the post-image not at all, or as INSERT_IN_PROGRESS. Since the existing
code will not sample either DELETE_IN_PROGRESS or INSERT_IN_PROGRESS
tuples, this leads to concurrently-updated rows being omitted from the
sample entirely. That's not very helpful, and it's especially the wrong
thing if the concurrent transaction ends up rolling back.
The right thing seems to be to sample DELETE_IN_PROGRESS rows just as if
they were live. This makes the "sample it" and "count it" decisions the
same, which seems good for consistency. It's clearly the right thing
if the concurrent transaction ends up rolling back; in effect, we are
sampling as though IN_PROGRESS transactions haven't happened yet.
Also, this combination of choices ensures maximum robustness against
the different combinations of whether and in which state we might see the
pre- and post-images of an update.
It's slightly annoying that we end up recording immediately-out-of-date
stats in the case where the transaction does commit, but on the other
hand the stats are fine for columns that didn't change in the update.
And the alternative of sampling INSERT_IN_PROGRESS rows instead seems
like a bad idea, because then the sampling would be inconsistent with
the way rows are counted for the stats report.
Per report from Mark Chambers; thanks to Jeff Janes for diagnosing
what was happening. Back-patch to all supported versions.
Discussion: https://postgr.es/m/CAFh58O_Myr6G3tcH3gcGrF-=OExB08PJdWZcSBcEcovaiPsrHA@mail.gmail.com
Debian testing and newer now require that RSA and DHE keys are at
least 2048 bit long and no longer allow SHA-1 for signatures in
certificates. This is currently causing the ssl tests to fail there
because the test certificates and keys have been created in violation
of those conditions.
Update the parameters to create the test files and create a new set of
test files.
Author: Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>
Reported-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://www.postgresql.org/message-id/flat/20180917131340.GE31460%40paquier.xyz
MinMaxExpr invokes the btree comparison function for its input datatype,
so it's only leakproof if that function is. Many such functions are
indeed leakproof, but others are not, and we should not just assume that
they are. Hence, adjust contain_leaked_vars to verify the leakproofness
of the referenced function explicitly.
I didn't add a regression test because it would need to depend on
some particular comparison function being leaky, and that's a moving
target, per discussion.
This has been wrong all along, so back-patch to supported branches.
Discussion: https://postgr.es/m/31042.1546194242@sss.pgh.pa.us
fe0a0b5, which has added a stronger random source in Postgres, has
introduced a thinko when creating a padding message which gets encrypted
for Elgamal. The padding message cannot have zeros, which are replaced
by random bytes. However if pg_strong_random() failed, the message
would finish by being considered in correct shape for encryption with
zeros.
Author: Tom Lane
Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/20186.1546188423@sss.pgh.pa.us
Backpatch-through: 10
Detect it the way pg_ctl's wait_for_postmaster() does. When pg_regress
spawned a postmaster that failed startup, we were detecting that only
with "pg_regress: postmaster did not respond within 60 seconds".
Back-patch to 9.4 (all supported versions).
Reviewed by Tom Lane.
Discussion: https://postgr.es/m/20181231172922.GA199150@gust.leadboat.com
POSIX specifies that jrand48() returns a signed 32-bit value (in the
range [-2^31, 2^31)), but our code was returning an unsigned 32-bit
value (in the range [0, 2^32)). This doesn't actually matter to any
existing call site, because they all cast the "long" result to int32
or uint32; but it will doubtless bite somebody in the future.
To fix, cast the arithmetic result to int32 explicitly before the
compiler widens it to long (if widening is needed).
While at it, upgrade this file's far-short-of-project-style comments.
Had there been some peer pressure to document pg_jrand48() properly,
maybe this thinko wouldn't have gotten committed to begin with.
Backpatch to v10 where pg_jrand48() was added, just in case somebody
back-patches a fix that uses it and depends on the standard behavior.
Discussion: https://postgr.es/m/17235.1545951602@sss.pgh.pa.us
Isolation test suite of GIN predicate locking was criticized for being too slow,
especially under Valgrind. This commit is intended to accelerate it. Tests are
simplified in the following ways.
1) Amount of data is reduced. We're now close to the minimal amount of data,
which produces at least one posting tree and at least two pages of entry
tree.
2) Three isolation tests are merged into one.
3) Only one tuple is queried from posting tree. So, locking of index is the
same, but tuple locks are not propagated to relation lock. Also, it is
faster.
4) Test cases itself are simplified. Now each test case run just one INSERT
and one SELECT involving GIN, which either conflict or not.
Discussion: https://postgr.es/m/20181204000740.ok2q53nvkftwu43a%40alap3.anarazel.de
Reported-by: Andres Freund
Tested-by: Andrew Dunstan
Author: Alexander Korotkov
Backpatch-through: 11
According to README we acquire predicate locks on entry tree leafs and posting
tree roots. However, when ginFindLeafPage() is going to lock leaf in exclusive
mode, then it checks root for conflicts regardless whether it's a entry or
posting tree. Assuming that we never place predicate lock on entry tree root
(excluding corner case when root is leaf), this check is redundant. This
commit removes this check. Now, root conflict checking is controlled by
separate argument of ginFindLeafPage().
Discussion: https://postgr.es/m/CAPpHfdv7rrDyy%3DMgsaK-L9kk0AH7az0B-mdC3w3p0FSb9uoyEg%40mail.gmail.com
Author: Alexander Korotkov
Backpatch-through: 11
Inheritance trees can include temporary tables if the parent is
permanent, which makes possible the presence of multiple temporary
children from different sessions. Trying to issue a TRUNCATE on the
parent in this scenario causes a failure, so similarly to any other
queries just ignore such cases, which makes TRUNCATE work
transparently.
This makes truncation behave similarly to any other DML query working on
the parent table with queries which need to be issues on children. A
set of isolation tests is added to cover basic cases.
Reported-by: Zhou Digoal
Author: Amit Langote, Michael Paquier
Discussion: https://postgr.es/m/15565-ce67a48d0244436a@postgresql.org
Backpatch-through: 9.4
I made a frontend fprintf() format use %m, forgetting that that's only
safe in HEAD not the back branches; prior to 96bf88d52 and d6c55de1f,
it would work on glibc platforms but not elsewhere. Revert to using
%s ... strerror(errno) as the code did before.
We could have left HEAD as-is, but for code consistency across branches,
I chose to apply this patch there too.
Per Coverity and a few buildfarm members.
At the end of recovery for the post-promotion process, a new history
file is created followed by the last partial segment of the previous
timeline. Based on the timing, the archiver would first try to archive
the last partial segment and then the history file. This can delay the
detection of a new timeline taken, particularly depending on the time it
takes to transfer the last partial segment as it delays the moment the
history file of the new timeline gets archived. This can cause promoted
standbys to use the same timeline as one already taken depending on the
circumstances if multiple instances look at archives at the same
location.
This commit changes the order of archiving so as history files are
archived in priority over other file types, which reduces the likelihood
of the same timeline being taken (still not reducing the window to
zero), and it makes the archiver behave more consistently with the
startup process doing its post-promotion business.
Author: David Steele
Reviewed-by: Michael Paquier, Kyotaro Horiguchi
Discussion: https://postgr.es/m/929068cf-69e1-bba2-9dc0-e05986aed471@pgmasters.net
Backpatch-through: 9.5
COPY can skip writing WAL when loading data on a table which has been
created in the same transaction as the one loading the data, however
this cannot work on views or foreign table as this would result in
trying to flush relation files which do not exist. So disable the
optimization so as commands are able to work the same way with any
configuration of wal_level.
Tests are added to cover the different cases, which need to have
wal_level set to minimal to allow the problem to show up, and that is
not the default configuration.
Reported-by: Luis M. Carril, Etsuro Fujita
Author: Amit Langote, Michael Paquier
Reviewed-by: Etsuro Fujita
Discussion: https://postgr.es/m/15552-c64aa14c5c22f63c@postgresql.org
Backpatch-through: 10, where support for COPY on views has been added,
while v11 has added support for COPY on foreign tables.
013ebc0a7b implements so-called GiST microvacuum. That is gistgettuple() marks
index tuples as dead when kill_prior_tuple is set. Later, when new tuple
insertion claims page space, those dead index tuples are physically deleted
from page. When this deletion is replayed on standby, it might conflict with
read-only queries. But 013ebc0a7b doesn't handle this. That may lead to
disappearance of some tuples from read-only snapshots on standby.
This commit implements resolving of conflicts between replay of GiST microvacuum
and standby queries. On the master we implement new WAL record type
XLOG_GIST_DELETE, which comprises necessary information. On stable releases
we've to be tricky to keep WAL compatibility. Information required for conflict
processing is just appended to data of XLOG_GIST_PAGE_UPDATE record. So,
PostgreSQL version, which doesn't know about conflict processing, will just
ignore that.
Reported-by: Andres Freund
Diagnosed-by: Andres Freund
Discussion: https://postgr.es/m/20181212224524.scafnlyjindmrbe6%40alap3.anarazel.de
Author: Alexander Korotkov
Backpatch-through: 9.6
For probably bogus reasons, we acquire only AccessShareLock on the
partition when we try to detach it from its parent partitioned table.
This can cause ugly things to happen if another transaction is doing
any sort of DDL to the partition concurrently.
Upgrade that lock to ShareUpdateExclusiveLock, which per discussion
seems to be the minimum needed.
Reported by Robert Haas.
Discussion: https://postgr.es/m/CA+TgmoYruJQ+2qnFLtF1xQtr71pdwgfxy3Ziy-TxV28M6pEmyA@mail.gmail.com
"$user" in a search_path string is replaced by CURRENT_USER not
SESSION_USER. (It actually was SESSION_USER in the initial implementation,
but we changed it shortly later, and evidently forgot to fix the docs to
match.)
Noted by antonov@stdpr.ru
Discussion: https://postgr.es/m/159151fb45d490c8d31ea9707e9ba99d@stdpr.ru
When a partition is detached from its parent, we acquire locks on all
attached indexes to also detach them ... but we release those locks
immediately. This is a violation of the policy of keeping locks on user
objects to the end of the transaction. Bug introduced in 8b08f7d4820f.
It's unclear that there are any ill effects possible, but it's clearly
wrong nonetheless. It's likely that bad behavior *is* possible, but
mostly because the relation that the index is for is only locked with
AccessShareLock, which is an older bug that shall be fixed separately.
While touching that line of code, close the index opened with
index_open() using index_close() instead of relation_close().
No difference in practice, but let's be consistent.
Unearthed by Robert Haas.
Discussion: https://postgr.es/m/CA+TgmoYruJQ+2qnFLtF1xQtr71pdwgfxy3Ziy-TxV28M6pEmyA@mail.gmail.com
The flag for IF NOT EXISTS was only being passed down in the normal
recursing case. It's been this way since originally added in 9.6 in
commit 2cd40adb85 so backpatch back to 9.6.
Commit 40dae7ec537, which made the handling of interrupted nbtree page
splits more robust, removed an nbtree-specific end-of-recovery cleanup
step. This meant that it was no longer possible to complete an
interrupted page split during recovery. However, a reference to
recovery as a reason for using a NULL stack while inserting into a
parent page was missed. Remove the reference.
Remove a similar obsolete reference to recovery that was introduced much
more recently, as part of the btree fastpath optimization enhancement
that made it into Postgres 11 (commit 2b272734, and follow-up commits).
Backpatch: 11-, where the fastpath optimization was introduced.
"rescanratio" was computed as 1 + rescanned-tuples / total-inner-tuples,
which is sensible if it's to be multiplied by total-inner-tuples or a cost
value corresponding to scanning all the inner tuples. But in reality it
was (mostly) multiplied by inner_rows or a related cost, numbers that take
into account the possibility of stopping short of scanning the whole inner
relation thanks to a limited key range in the outer relation. This'd
still make sense if we could expect that stopping short would result in a
proportional decrease in the number of tuples that have to be rescanned.
It does not, however. The argument that establishes the validity of our
estimate for that number is independent of whether we scan all of the inner
relation or stop short, and experimentation also shows that stopping short
doesn't reduce the number of rescanned tuples. So the correct calculation
is 1 + rescanned-tuples / inner_rows, and we should be sure to multiply
that by inner_rows or a corresponding cost value.
Most of the time this doesn't make much difference, but if we have
both a high rescan rate (due to lots of duplicate values) and an outer
key range much smaller than the inner key range, then the error can
be significant, leading to a large underestimate of the cost associated
with rescanning.
Per report from Vijaykumar Jain. This thinko appears to go all the way
back to the introduction of the rescan estimation logic in commit
70fba7043, so back-patch to all supported branches.
Discussion: https://postgr.es/m/CAE7uO5hMb_TZYJcZmLAgO6iD68AkEK6qCe7i=vZUkCpoKns+EQ@mail.gmail.com
The new grammar pattern of ALTER INDEX SET STATISTICS able to use column
numbers on top of the existing column names introduced by commit 5b6d13e
forgot to add support for the feature in pg_dump, so defining statistics
on index columns was missing from the dumps, potentially causing silent
planning problems with a subsequent restore.
pg_dump ought to not use column names in what it generates as these are
automatically generated by the server and could conflict with real
relation attributes with matching patterns. "expr" and "exprN", N
incremented automatically after the creation of the first one, are used
as default attribute names for index expressions, and that could easily
match what is defined in other relations, causing the dumps to fail if
some of those attributes are renamed at some point. So to avoid any
problems, the new grammar with column numbers gets used.
Reported-by: Ronan Dunklau
Author: Michael Paquier
Reviewed-by: Tom Lane, Adrien Nayrat, Amul Sul
Discussion: https://postgr.es/m/CAARsnT3UQ4V=yDNW468w8RqHfYiY9mpn2r_c5UkBJ97NAApUEw@mail.gmail.com
Backpatch-through: 11, where the new syntax has been introduced.
This is an oversight from recent commit b13fd344. While on it, tweak
the previous test with a better name for the renamed primary key.
Detected by buildfarm member prion which forces relation cache release
with -DRELCACHE_FORCE_RELEASE. Back-patch down to 9.4 as the previous
commit.
When a constraint gets renamed, it may have associated with it a target
relation (for example domain constraints don't have one). Not
invalidating the target relation cache when issuing the renaming can
result in issues with subsequent commands that refer to the old
constraint name using the relation cache, causing various failures. One
pattern spotted was using CREATE TABLE LIKE after a constraint
renaming.
Reported-by: Stuart <sfbarbee@gmail.com>
Author: Amit Langote
Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/2047094.V130LYfLq4@station53.ousa.org
reap_child() basically ignored the possibility of either an error in
waitpid() itself or a child process failure on signal. We don't really
need to do more than report and crash hard, but proceeding as though
nothing is wrong is definitely Not Acceptable. The error report for
nonzero child exit status was pretty off-point, as well.
Noted while fooling around with child-process failure detection
logic elsewhere. It's been like this a long time, so back-patch to
all supported branches.
Commit ffa4cbd62 added logic to detect SIGPIPE failure of a COPY child
process, but it only worked correctly if the SIGPIPE occurred in the
immediate child process. Depending on the shell in use and the
complexity of the shell command string, we might instead get back
an exit code of 128 + SIGPIPE, representing a shell error exit
reporting SIGPIPE in the child process.
We could just hack up ClosePipeToProgram() to add the extra case,
but it seems like this is a fairly general issue deserving a more
general and better-documented solution. I chose to add a couple
of functions in src/common/wait_error.c, which is a natural place
to know about wait-result encodings, that will test for either a
specific child-process signal type or any child-process signal failure.
Then, adjust other places that were doing ad-hoc tests of this type
to use the common functions.
In RestoreArchivedFile, this fixes a race condition affecting whether
the process will report an error or just silently proc_exit(1): before,
that depended on whether the intermediate shell got SIGTERM'd itself
or reported a child process failing on SIGTERM.
Like the previous patch, back-patch to v10; we could go further
but there seems no real need to.
Per report from Erik Rijkers.
Discussion: https://postgr.es/m/f3683f87ab1701bea5d86a7742b22432@xs4all.nl
The idea here is to not call recordDependencyOn for the default collation,
since we know that's pinned. But what the code actually did was to record
the partition key's dependency on the opclass twice, instead.
Evidently introduced by sloppy coding in commit 2186b608b. Back-patch
to v10 where that came in.