A collection of typos I happened to spot while reading code, as well as
grepping for common mistakes.
Backpatch to all supported versions, as applicable, to avoid conflicts
when backporting other commits in the future.
In final_cost_hashjoin(), commit 9c7f5229a allowed inner_unique cases
to follow a code path previously used only for SEMI/ANTI joins; but it
neglected to fix an if-test within that path that assumed SEMI and ANTI
were the only possible cases. This resulted in a wrong value for
hashjointuples, and an ensuing bad cost estimate, for inner_unique normal
joins. Fortunately, for inner_unique normal joins we can assume the number
of joined tuples is the same as for a SEMI join; so there's no need for
more code, we just have to invert the test to check for ANTI not SEMI.
It turns out that in two contrib tests in which commit 9c7f5229a
changed the plan expected for a query, the change was actually wrong
and induced by this estimation error, not by any real improvement.
Hence this patch also reverts those changes.
Per report from RK Korlapati. Backpatch to v10 where the error was
introduced.
David Rowley
Discussion: https://postgr.es/m/CA+SNy03bhq0fodsfOkeWDCreNjJVjsdHwUsb7AG=jpe0PtZc_g@mail.gmail.com
lca_inner() wasn't prepared for the possibility of getting no inputs.
Fix that, and make some cosmetic improvements to the code while at it.
Also, I thought the documentation of this function as returning the
"longest common prefix" of the paths was entirely misleading; it really
returns a path one shorter than the longest common prefix, for the typical
definition of "prefix". Don't use that term in the docs, and adjust the
examples to clarify what really happens.
This has been broken since its beginning, so back-patch to all supported
branches.
Per report from Hailong Li. Thanks to Pierre Ducroquet for diagnosing
and for the initial patch, though I whacked it around some and added
test cases.
Discussion: https://postgr.es/m/5b0d8e4f-f2a3-1305-d612-e00e35a7be66@qunar.com
Back-patch commit dddfc4cb2, which broke LDFLAGS and related Makefile
variables into two parts, one for within-build-tree library references and
one for external libraries, to ensure that the order of -L flags has all
of the former before all of the latter. This turns out to fix a problem
recently noted on buildfarm member peripatus, that we attempted to
incorporate code from libpgport.a into a shared library. That will fail on
platforms that are sticky about putting non-PIC code into shared libraries.
(It's quite surprising we hadn't seen such failures before, since the code
in question has been like that for a long time.)
I think that peripatus' problem could have been fixed with just a subset
of this patch; but since the previous issue of accidentally linking to the
wrong copy of a Postgres shlib seems likely to bite people in the field,
let's just back-patch the whole change. Now that commit dddfc4cb2 has
survived some beta testing, I'm less afraid to back-patch it than I was
at the time.
This also fixes undesired inclusion of "-DFRONTEND" in pg_config's CPPFLAGS
output (in 9.6 and up) and undesired inclusion of "-L../../src/common" in
its LDFLAGS output (in all supported branches).
Back-patch to v10 and older branches; this is already in v11.
Discussion: https://postgr.es/m/20180704234304.bq2dxispefl65odz@ler-imac.local
Change a whole-database VACUUM into doing just pg_attribute, which is
the portion that verifies what we want it to do. The original
formulation wastes a lot of CPU time, which leads the test to fail when
runtime exceeds isolationtester timeout when it's super-slow, such as
under CLOBBER_CACHE_ALWAYS. Per buildfarm member friarbird.
It turns out that the previous shape of the test doesn't always detect
the condition it is supposed to detect (on unpatched reorderbuffer
code): the reason is that there is a good chance of encountering a
xl_running_xacts record (logged every 15 seconds) before the checkpoint
-- and because we advance the xmin when we receive that WAL record, and
we *don't* advance the xmin twice consecutively without receiving a
client message in between, that means the xmin is not advanced enough
for the tuple to be pruned from pg_attribute by VACUUM. So the test
would spuriously pass.
The reason this test deficiency wasn't detected earlier is that HOT
pruning removes the tuple anyway, even if vacuum leaves it in place, so
the test correctly fails (detecting the coding mistake), but for the
wrong reason.
To fix this mess, run the s0_get_changes step twice before vacuum
instead of once: this seems to cause the xmin to be advanced reliably,
wreaking havoc with more certainty.
Author: Arseny Sher
Discussion: https://postgr.es/m/87h8lkuxoa.fsf@ars-thinkpad
Two closely related bugs are fixed. First, xmin of logical slots was
advanced too early. During xl_running_xacts processing, xmin of the
slot was set to the oldest running xid in the record, but that's wrong:
actually, snapshots which will be used for not-yet-replayed transactions
might consider older txns as running too, so we need to keep xmin back
for them. The problem wasn't noticed earlier because DDL which allows
to delete tuple (set xmax) while some another not-yet-committed
transaction looks at it is pretty rare, if not unique: e.g. all forms of
ALTER TABLE which change schema acquire ACCESS EXCLUSIVE lock
conflicting with any inserts. The included test case (test_decoding's
oldest_xmin) uses ALTER of a composite type, which doesn't have such
interlocking.
To deal with this, we must be able to quickly retrieve oldest xmin
(oldest running xid among all assigned snapshots) from ReorderBuffer. To
fix, add another list of ReorderBufferTXNs to the reorderbuffer, where
transactions are sorted by base-snapshot-LSN. This is slightly
different from the existing (sorted by first-LSN) list, because a
transaction can have an earlier LSN but a later Xmin, if its first
record does not obtain an xmin (eg. xl_xact_assignment). Note this new
list doesn't fully replace the existing txn list: we still need that one
to prevent WAL recycling.
The second issue concerns SnapBuilder snapshots and subtransactions.
SnapBuildDistributeNewCatalogSnapshot never assigned a snapshot to a
transaction that is known to be a subtxn, which is good in the common
case that the top-level transaction already has one (no point in doing
so), but a bug otherwise. To fix, arrange to transfer the snapshot from
the subtxn to its top-level txn as soon as the kinship gets known.
test_decoding's snapshot_transfer verifies this.
Also, fix a minor memory leak: refcount of toplevel's old base snapshot
was not decremented when the snapshot is transferred from child.
Liberally sprinkle code comments, and rewrite a few existing ones. This
part is my (Álvaro's) contribution to this commit, as I had to write all
those comments in order to understand the existing code and Arseny's
patch.
Reported-by: Arseny Sher <a.sher@postgrespro.ru>
Diagnosed-by: Arseny Sher <a.sher@postgrespro.ru>
Co-authored-by: Arseny Sher <a.sher@postgrespro.ru>
Co-authored-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Antonin Houska <ah@cybertec.at>
Discussion: https://postgr.es/m/87lgdyz1wj.fsf@ars-thinkpad
Bring this transform function into sync with the policy established
by commit 3a382983d.
Also, fix it to make sure that what it drills down to is indeed a
hash, and not some other kind of Perl SV. Previously, the test
cases added here provoked crashes.
Because of the crash hazard, back-patch to 9.5 where this module
was introduced.
Discussion: https://postgr.es/m/28336.1528393969@sss.pgh.pa.us
In 9.6, we moved a number of functions over to using the GRANT system to
control access instead of having hard-coded superuser checks.
As it turns out, adminpack was creating another function in the catalog
for one of those backend functions where the superuser check was
removed, specifically pg_rotate_logfile(), but it didn't get the memo
about having to REVOKE EXECUTE on the alternative-name function
(pg_logfile_rotate()), meaning that in any installations with adminpack
on 9.6 and higher, any user is able to run the pg_logfile_rotate()
function, which then calls pg_rotate_logfile() and rotates the logfile.
Fix by adding a new version of adminpack (1.1) which handles the REVOKE.
As this function should have only been available to the superuser, this
is a security issue, albeit a minor one.
Security: CVE-2018-1115
On further reflection, commit e5d83995e didn't go far enough: pretty much
everywhere in the planner that examines a clause's is_pushed_down flag
ought to be changed to use the more complicated behavior where we also
check the clause's required_relids. Otherwise we could make incorrect
decisions about whether, say, a clause is safe to use as a hash clause.
Some (many?) of these places are safe as-is, either because they are
never reached while considering a parameterized path, or because there
are additional checks that would reject a pushed-down clause anyway.
However, it seems smarter to just code them all the same way rather
than rely on easily-broken reasoning of that sort.
In support of that, invent a new macro RINFO_IS_PUSHED_DOWN that should
be used in place of direct tests on the is_pushed_down flag.
Like the previous patch, back-patch to all supported branches.
Discussion: https://postgr.es/m/f8128b11-c5bf-3539-48cd-234178b2314d@proxel.se
Spelling access(2)'s second argument as "2" is just horrid.
POSIX makes no promises as to the numeric values of W_OK and related
macros. Even if it accidentally works as intended on every supported
platform, it's still unreadable and inconsistent with adjacent code.
In passing, don't spell "NULL" as "0" either. Yes, that's legal C;
no, it's not project style.
Back-patch, just in case the unportability is real and not theoretical.
(Most likely, even if a platform had different bit assignments for
access()'s modes, there'd not be an observable behavior difference
here; but I'm being paranoid today.)
Due to sloppy division of changes between f50c80dbb (which was not
back-patched) and 563a053bd, this piece of code was wrongly backpatched to
REL_10_STABLE and REL9_6_STABLE. This code never causes real error because
its condition is never satisfied, but it's a dead code, which needs to be
removed.
Alexander Korotkov per gripe from Tom Lane
Count the number of tuples in the index honestly, instead of assuming
that it's the same as the number of tuples in the heap. (It might be
different if the index is partial.)
Fix counting of tuples in current index page, too. This error would
have led to failing to write out the final page of the index if it
contained exactly one tuple, so that the last tuple of the relation
would not get indexed.
Back-patch to 9.6 where contrib/bloom was added.
Tomas Vondra and Tom Lane
Discussion: https://postgr.es/m/3b3d8eac-c709-0d25-088e-b98339a1b28a@2ndquadrant.com
If convert_to_scalar is passed a pair of datatypes it can't cope with,
its former behavior was just to elog(ERROR). While this is OK so far as
the core code is concerned, there's extension code that would like to use
scalarltsel/scalargtsel/etc as selectivity estimators for operators that
work on non-core datatypes, and this behavior is a show-stopper for that
use-case. If we simply allow convert_to_scalar to return FALSE instead of
outright failing, then the main logic of scalarltsel/scalargtsel will work
fine for any operator that behaves like a scalar inequality comparison.
The lack of conversion capability will mean that we can't estimate to
better than histogram-bin-width precision, since the code will effectively
assume that the comparison constant falls at the middle of its bin. But
that's still a lot better than nothing. (Someday we should provide a way
for extension code to supply a custom version of convert_to_scalar, but
today is not that day.)
While poking at this issue, we noted that the existing code for handling
type bytea in convert_to_scalar is several bricks shy of a load.
It assumes without checking that if the comparison value is type bytea,
the bounds values are too; in the worst case this could lead to a crash.
It also fails to detoast the input values, so that the comparison result is
complete garbage if any input is toasted out-of-line, compressed, or even
just short-header. I'm not sure how often such cases actually occur ---
the bounds values, at least, are probably safe since they are elements of
an array and hence can't be toasted. But that doesn't make this code OK.
Back-patch to all supported branches, partly because author requested that,
but mostly because of the bytea bugs. The change in API for the exposed
routine convert_network_to_scalar() is theoretically a back-patch hazard,
but it seems pretty unlikely that any third-party code is calling that
function directly.
Tomas Vondra, with some adjustments by me
Discussion: https://postgr.es/m/b68441b6-d18f-13ab-b43b-9a72188a4e02@2ndquadrant.com
Since 9.5, it's possible that some but not all columns of an index
support returning the indexed value for index-only scans. If the
same indexed column appears in index columns that behave both ways,
check_index_only() supposed that it'd be OK to do an index-only scan
testing that column; but that fails if we have to recheck the indexed
condition on one of the columns that doesn't support this.
In principle we could make this work by remapping the recheck expressions
to pull the value from a column that does support returning the indexed
value. But such cases are so weird and rare that, at least for now,
it doesn't seem worth the trouble. Instead, just teach check_index_only
that a value is returnable only if all the index columns containing it
are returnable, rather than any of them.
Per report from David Pereiro Lagares. Back-patch to 9.5 where the
possibility of this situation appeared.
Kyotaro Horiguchi
Discussion: https://postgr.es/m/1516210494.1798.16.camel@nlpgo.com
Solaris 11.4 has built-in functions named b64_encode and b64_decode.
Rename ours to something else to avoid the conflict (fortunately,
ours are static so the impact is limited).
One could wish for less duplication of code in this area, but that
would be a larger patch and not very suitable for back-patching.
Since this is a portability fix, we want to put it into all supported
branches.
Report and initial patch by Rainer Orth, reviewed and adjusted a bit
by Michael Paquier
Discussion: https://postgr.es/m/ydd372wk28h.fsf@CeBiTec.Uni-Bielefeld.DE
This makes the client programs behave as documented regardless of the
connect-time search_path and regardless of user-created objects. Today,
a malicious user with CREATE permission on a search_path schema can take
control of certain of these clients' queries and invoke arbitrary SQL
functions under the client identity, often a superuser. This is
exploitable in the default configuration, where all users have CREATE
privilege on schema "public".
This changes behavior of user-defined code stored in the database, like
pg_index.indexprs and pg_extension_config_dump(). If they reach code
bearing unqualified names, "does not exist" or "no schema has been
selected to create in" errors might appear. Users may fix such errors
by schema-qualifying affected names. After upgrading, consider watching
server logs for these errors.
The --table arguments of src/bin/scripts clients have been lax; for
example, "vacuumdb -Zt pg_am\;CHECKPOINT" performed a checkpoint. That
now fails, but for now, "vacuumdb -Zt 'pg_am(amname);CHECKPOINT'" still
performs a checkpoint.
Back-patch to 9.3 (all supported versions).
Reviewed by Tom Lane, though this fix strategy was not his first choice.
Reported by Arseniy Sharoglazov.
Security: CVE-2018-1058
The previous limit of INT_MAX / 1000 seems to have been cargo-culted in
from somewhere else. Or possibly the value was converted to microseconds
at some point; but in all supported releases, it's just compared to other
values, so there's no need for the restriction. This change raises the
effective limit from ~35 minutes to ~24 days, which conceivably is useful
to somebody, and anyway it's more consistent with the range of the core
log_min_duration_statement GUC.
Per complaint from Kevin Bloch. Back-patch to all supported releases.
Discussion: https://postgr.es/m/8ea82d7e-cb78-8e05-0629-73aa14d2a0ca@codingthat.com
Commit 4bbf6edfbd5d03743ff82dda2f00c738fb3208f5 added a test case,
but it turns out that the test case doesn't reliably test for the
bug, and in the context of the regression test suite did not because
ANALYZE had not been run.
Report and patch by Etsuro Fujita. I added a comment along lines
previously suggested by Tom Lane.
Discussion: http://postgr.es/m/5A6195D8.8060206@lab.ntt.co.jp
When pushing down a join to a foreign server, postgres_fdw constructs
an alternative plan to be used for any EvalPlanQual rechecks that
prove to be necessary. This plan is stored as the outer subplan of
the Foreign Scan implementing the pushed-down join. Previously, this
alternative plan could have a different nominal sort ordering than its
parent, which seemed OK since there will only be one tuple per base
table anyway in the case of an EvalPlanQual recheck. Actually,
though, it caused a problem if that path was used as a building block
for the EvalPlanQual recheck plan of a higher-level foreign join,
because we could end up with a merge join one of whose inputs was not
labelled with the correct sort order. Repair by injecting an extra
Sort node into the EvalPlanQual recheck plan whenever it would
otherwise fail to be sorted at least as well as its parent Foreign
Scan.
Report by Jeff Janes. Patch by me, reviewed by Tom Lane, who also
provided the test case and comment text.
Discussion: http://postgr.es/m/CAMkU=1y2G8VOVBHv3iXU2TMAj7-RyBFFW1uhkr5sm9LQ2=X35g@mail.gmail.com
Commit 7012b132d, which added the ability to push down aggregates and
grouping to the remote server, wasn't careful to ensure that the remote
server would have the same idea we do about which columns are the grouping
columns, in cases where there are textually identical GROUP BY expressions.
Such cases typically led to "targetlist item has multiple sortgroupref
labels" errors.
To fix this reliably, switch over to using "GROUP BY column-number" syntax
rather than "GROUP BY expression" in transmitted queries, and adjust
foreign_grouping_ok() to be more careful about duplicating the sortgroupref
labeling of the local pathtarget.
Per bug #14890 from Sean Johnston. Back-patch to v10 where the buggy code
was introduced.
Jeevan Chalke, reviewed by Ashutosh Bapat
Discussion: https://postgr.es/m/20171107134948.1508.94783@wrigleys.postgresql.org
~> (cube, int) operator was especially designed for knn-gist search.
However, it appears that knn-gist search can't work correctly with current
behavior of this operator when dataset contains cubes of variable
dimensionality. In this case, the same value of second operator argument
can point to different dimension depending on dimensionality of particular cube.
Such behavior is incompatible with gist indexing of cubes, and knn-gist doesn't
work correctly for it.
This patch changes behavior of ~> (cube, int) operator by introducing dimension
numbering where value of second argument unambiguously identifies number of
dimension. With new behavior, this operator can be correctly supported by
knn-gist. Relevant changes to cube operator class are also included.
Backpatch to v9.6 where operator was introduced.
Since behavior of ~> (cube, int) operator is changed, depending entities
must be refreshed after upgrade. Such as, expression indexes using this
operator must be reindexed, materialized views must be rebuilt, stored
procedures and client code must be revised to correctly use new behavior.
That should be mentioned in release notes.
Noticed by: Tomas Vondra
Author: Alexander Korotkov
Reviewed by: Tomas Vondra, Andrey Borodin
Discussion: https://www.postgresql.org/message-id/flat/a9657f6a-b497-36ff-e56-482a2c7e3292@2ndquadrant.com
Instead of using our standard macro for this calculation, this code
did it itself ... and got it wrong, leading to incorrect display of
the null bitmap in some cases. Noted and fixed by Maksim Milyutin.
In passing, remove a uselessly duplicative error check.
Errors were introduced in commit d6061f83a; back-patch to 9.6
where that came in.
Maksim Milyutin, reviewed by Andrey Borodin
Discussion: https://postgr.es/m/ec295792-a69f-350f-6287-25a20e8f31d5@gmail.com
rewriteTargetListUD's processing is dependent on the relkind of the query's
target table. That was fine at the time it was made to act that way, even
for queries on inheritance trees, because all tables in an inheritance tree
would necessarily be plain tables. However, the 9.5 feature addition
allowing some members of an inheritance tree to be foreign tables broke the
assumption that rewriteTargetListUD's output tlist could be applied to all
child tables with nothing more than column-number mapping. This led to
visible failures if foreign child tables had row-level triggers, and would
also break in cases where child tables belonged to FDWs that used methods
other than CTID for row identification.
To fix, delay running rewriteTargetListUD until after the planner has
expanded inheritance, so that it is applied separately to the (already
mapped) tlist for each child table. We can conveniently call it from
preprocess_targetlist. Refactor associated code slightly to avoid the
need to heap_open the target relation multiple times during
preprocess_targetlist. (The APIs remain a bit ugly, particularly around
the point of which steps scribble on parse->targetList and which don't.
But avoiding such scribbling would require a change in FDW callback APIs,
which is more pain than it's worth.)
Also fix ExecModifyTable to ensure that "tupleid" is reset to NULL when
we transition from rows providing a CTID to rows that don't. (That's
really an independent bug, but it manifests in much the same cases.)
Add a regression test checking one manifestation of this problem, which
was that row-level triggers on a foreign child table did not work right.
Back-patch to 9.5 where the problem was introduced.
Etsuro Fujita, reviewed by Ildus Kurbangaliev and Ashutosh Bapat
Discussion: https://postgr.es/m/20170514150525.0346ba72@postgrespro.ru
hstoreUniquePairs() often called memcpy with equal source and destination
pointers. Although this is almost surely harmless in practice, it's
undefined according to the letter of the C standard. Some versions of
valgrind will complain about it, and some versions of libc as well
(cf. commit ad520ec4a). Tweak the code to avoid doing that.
Noted by Tomas Vondra. Back-patch to all supported versions because
of the hazard of libc assertions.
Discussion: https://postgr.es/m/bf84d940-90d4-de91-19dd-612e011007f4@fuzzy.cz
The scripts in contrib/start-scripts/osx don't work at all on macOS
10.10 (Yosemite) or later, because they depend on SystemStarter which
Apple deprecated long ago and removed in 10.10. Add a new subdirectory
contrib/start-scripts/macos with scripts that use the newer launchd
infrastructure.
Since this problem is independent of which Postgres version you're using,
back-patch to all supported branches.
Discussion: https://postgr.es/m/31338.1510763554@sss.pgh.pa.us
Make bloom WAL test compare psql output text, not just result codes;
this was evidently the intent all along, but it was mis-coded.
In passing, make sure we will notice any failure in setup steps.
Alexander Korotkov, reviewed by Michael Paquier and Masahiko Sawada
Discussion: https://postgr.es/m/CAPpHfdtohPdQ9rc5mdWjxq+3VsBNw534KV_5O65dTQrSdVJNgw@mail.gmail.com
By default, $PGUSER has permission to unlink $PGLOG. If $PGUSER
replaces $PGLOG with a symbolic link, the server will corrupt the
link-targeted file by appending log messages. Since these scripts open
$PGLOG as root, the attack works regardless of target file ownership.
"make install" does not install these scripts anywhere. Users having
manually installed them in the past should repeat that process to
acquire this fix. Most script users have $PGLOG writable to root only,
located in $PGDATA. Just before updating one of these scripts, such
users should rename $PGLOG to $PGLOG.old. The script will then recreate
$PGLOG with proper ownership.
Reviewed by Peter Eisentraut. Reported by Antoine Scemama.
Security: CVE-2017-12172
The logical decoding functions do BeginInternalSubTransaction and
RollbackAndReleaseCurrentSubTransaction to clean up after themselves.
It turns out that AtEOSubXact_SPI has an unrecognized assumption that
we always need to cancel the active SPI operation in the SPI context
that surrounds the subtransaction (if there is one). That's true
when the RollbackAndReleaseCurrentSubTransaction call is coming from
the SPI-using function itself, but not when it's happening inside
some unrelated function invoked by a SPI query. In practice the
affected callers are the various PLs.
To fix, record the current subtransaction ID when we begin a SPI
operation, and clean up only if that ID is the subtransaction being
canceled.
Also, remove AtEOSubXact_SPI's assertion that it must have cleaned
up the surrounding SPI context's active tuptable. That's proven
wrong by the same test case.
Also clarify (or, if you prefer, reinterpret) the calling conventions
for _SPI_begin_call and _SPI_end_call. The memory context cleanup
in the latter means that these have always had the flavor of a matched
resource-management pair, but they weren't documented that way before.
Per report from Ben Chobot.
Back-patch to 9.4 where logical decoding came in. In principle,
the SPI changes should go all the way back, since the problem dates
back to commit 7ec1c5a86. But given the lack of field complaints
it seems few people are using internal subtransactions in this way.
So I don't feel a need to take any risks in 9.2/9.3.
Discussion: https://postgr.es/m/73FBA179-C68C-4540-9473-71E865408B15@silentmedia.com
This doesn't allow routing tuple to the foreign partitions themselves,
but it permits tuples to be routed to regular partitions despite the
presence of foreign partitions in the same inheritance hierarchy.
Etsuro Fujita, reviewed by Amit Langote and by me.
Discussion: http://postgr.es/m/bc3db4c1-1693-3b8a-559f-33ad2b50b7ad@lab.ntt.co.jp
Per a report from AP, it's not that hard to exhaust the supply of
bitmap pages if you create a table with a hash index and then insert a
few billion rows - and then you start getting errors when you try to
insert additional rows. In the particular case reported by AP,
there's another fix that we can make to improve recycling of overflow
pages, which is another way to avoid the error, but there may be other
cases where this problem happens and that fix won't help. So let's
buy ourselves as much headroom as we can without rearchitecting
anything.
The comments claim that the old limit was 64GB, but it was really
only 32GB, because we didn't use all the bits in the page for bitmap
bits - only the largest power of 2 that could fit after deducting
space for the page header and so forth. Thus, we have 4kB per page
for bitmap bits, not 8kB. The new limit is thus actually 8 times the
old *real* limit but only 4 times the old *purported* limit.
Since this breaks on-disk compatibility, bump HASH_VERSION. We've
already done this earlier in this release cycle, so this doesn't cause
any incremental inconvenience for people using pg_upgrade from
releases prior to v10. However, users who use pg_upgrade to reach
10beta3 or later from 10beta2 or earlier will need to REINDEX any hash
indexes again.
Amit Kapila and Robert Haas
Discussion: http://postgr.es/m/20170704105728.mwb72jebfmok2nm2@zip.com.au
The Perl documentation is very clear that stuff calling libperl should
be built with the compiler switches shown by Perl's $Config{ccflags}.
We'd been ignoring that up to now, and mostly getting away with it,
but recent Perl versions contain ABI compatibility cross-checks that
fail on some builds because of this omission. In particular the
sizeof(PerlInterpreter) can come out different due to some fields being
added or removed; which means we have a live ABI hazard that we'd better
fix rather than continuing to sweep it under the rug.
However, it still seems like a bad idea to just absorb $Config{ccflags}
verbatim. In some environments Perl was built with a different compiler
that doesn't even use the same switch syntax. -D switch syntax is pretty
universal though, and absorbing Perl's -D switches really ought to be
enough to fix the problem.
Furthermore, Perl likes to inject stuff like -D_LARGEFILE_SOURCE and
-D_FILE_OFFSET_BITS=64 into $Config{ccflags}, which affect libc ABIs on
platforms where they're relevant. Adopting those seems dangerous too.
It's unclear whether a build wherein Perl and Postgres have different ideas
of sizeof(off_t) etc would work, or whether anyone would care about making
it work. But it's dead certain that having different stdio ABIs in
core Postgres and PL/Perl will not work; we've seen that movie before.
Therefore, let's also ignore -D switches for symbols beginning with
underscore. The symbols that we actually need to import should be the ones
mentioned in perl.h's PL_bincompat_options stanza, and none of those start
with underscore, so this seems likely to work. (If it turns out not to
work everywhere, we could consider intersecting the symbols mentioned in
PL_bincompat_options with the -D switches. But that will be much more
complicated, so let's try this way first.)
This will need to be back-patched, but first let's see what the
buildfarm makes of it.
Ashutosh Sharma, some adjustments by me
Discussion: https://postgr.es/m/CANFyU97OVQ3+Mzfmt3MhuUm5NwPU=-FtbNH5Eb7nZL9ua8=rcA@mail.gmail.com
In Perl builds that define PERL_IMPLICIT_SYS, XSUB.h defines macros
that replace a whole lot of basic libc functions with Perl functions.
We can't tolerate that in plperl.c; it breaks at least PG_TRY and
probably other stuff. The core idea of this patch is to include XSUB.h
only in the .xs files where it's really needed, and to move any code
broken by PERL_IMPLICIT_SYS out of the .xs files and into plperl.c.
The reason this hasn't been a problem before is that our build techniques
did not result in PERL_IMPLICIT_SYS appearing as a #define in PL/Perl,
even on some platforms where Perl thinks it is defined. That's about to
change in order to fix a nasty portability issue, so we need this work to
make the code safe for that.
Rather unaccountably, the Perl people chose XSUB.h as the place to provide
the versions of the aTHX/aTHX_ macros that are needed by code that's not
explicitly aware of the MULTIPLICITY API conventions. Hence, just removing
XSUB.h from plperl.c fails miserably. But we can work around that by
defining PERL_NO_GET_CONTEXT (which would make the relevant stanza of
XSUB.h a no-op anyway). As explained in perlguts.pod, that means we need
to add a "dTHX" macro call in every C function that calls a Perl API
function. In most of them we just add this at the top; but since the
macro fetches the current Perl interpreter pointer, more care is needed
in functions that switch the active interpreter. Lack of the macro is
easily recognized since it results in bleats about "my_perl" not being
defined.
(A nice side benefit of this is that it significantly reduces the number
of fetches of the current interpreter pointer. On my machine, plperl.so
gets more than 10% smaller, and there's probably some performance win too.
We could reduce the number of fetches still more by decorating the code
with pTHX_/aTHX_ macros to pass the interpreter pointer around, as
explained by perlguts.pod; but that's a task for another day.)
Formatting note: pgindent seems happy to treat "dTHX;" as a declaration
so long as it's the first thing after the left brace, as we'd already
observed with respect to the similar macro "dSP;". If you try to put
it later in a set of declarations, pgindent puts ugly extra space
around it.
Having removed XSUB.h from plperl.c, we need only move the support
functions for spi_return_next and util_elog (both of which use PG_TRY)
out of the .xs files and into plperl.c. This seems sufficient to
avoid the known problems caused by PERL_IMPLICIT_SYS, although we
could move more code if additional issues emerge.
This will need to be back-patched, but first let's see what the
buildfarm makes of it.
Patch by me, with some help from Ashutosh Sharma
Discussion: https://postgr.es/m/CANFyU97OVQ3+Mzfmt3MhuUm5NwPU=-FtbNH5Eb7nZL9ua8=rcA@mail.gmail.com
If the user modifies a view that has CHECK OPTIONs and this gets
translated into a modification to an underlying relation which happens
to be a foreign table, the check options should be enforced. In the
normal code path, that was happening properly, but it was not working
properly for "direct" modification because the whole operation gets
pushed to the remote side in that case and we never have an option to
enforce the constraint against individual tuples. Fix by disabling
direct modification when there is a need to enforce CHECK OPTIONs.
Etsuro Fujita, reviewed by Kyotaro Horiguchi and by me.
Discussion: http://postgr.es/m/f8a48f54-6f02-9c8a-5250-9791603171ee@lab.ntt.co.jp
The new test cases added in commit 8bf58c0d9 turn out to have output
that can vary depending on the lc_messages setting prevailing on the
test server. Hide the remote end's error messages to ensure stable
output. This isn't a terribly desirable solution; we'd rather know
that the connection failed for the expected reason and not some other
one. But there seems little choice for the moment.
Per buildfarm.
Discussion: https://postgr.es/m/18419.1500658570@sss.pgh.pa.us
Previously, postgres_fdw would keep on using an existing connection even
if the user did ALTER SERVER or ALTER USER MAPPING commands that should
affect connection parameters. Teach it to watch for catcache invals
on these catalogs and re-establish connections when the relevant catalog
entries change. Per bug #14738 from Michal Lis.
In passing, clean up some rather crufty decisions in commit ae9bfc5d6
about where fields of ConnCacheEntry should be reset. We now reset
all the fields whenever we open a new connection.
Kyotaro Horiguchi, reviewed by Ashutosh Bapat and myself.
Back-patch to 9.3 where postgres_fdw appeared.
Discussion: https://postgr.es/m/20170710113917.7727.10247@wrigleys.postgresql.org
Add missing infrastructure for this node type, notably in ruleutils.c where
its lack could demonstrably cause EXPLAIN to fail. Add outfuncs/readfuncs
support. (outfuncs support is useful today for debugging purposes. The
readfuncs support may never be needed, since at present it would only
matter for parallel query and NextValueExpr should never appear in a
parallelizable query; but it seems like a bad idea to have a primnode type
that isn't fully supported here.) Teach planner infrastructure that
NextValueExpr is a volatile, parallel-unsafe, non-leaky expression node
with cost cpu_operator_cost. Given its limited scope of usage, there
*might* be no live bug today from the lack of that knowledge, but it's
certainly going to bite us on the rear someday. Teach pg_stat_statements
about the new node type, too.
While at it, also teach cost_qual_eval() that MinMaxExpr, SQLValueFunction,
XmlExpr, and CoerceToDomain should be charged as cpu_operator_cost.
Failing to do this for SQLValueFunction was an oversight in my commit
0bb51aa96. The others are longer-standing oversights, but no time like the
present to fix them. (In principle, CoerceToDomain could have cost much
higher than this, but it doesn't presently seem worth trying to examine the
domain's constraints here.)
Modify execExprInterp.c to execute NextValueExpr as an out-of-line
function; it seems quite unlikely to me that it's worth insisting that
it be inlined in all expression eval methods. Besides, providing the
out-of-line function doesn't stop anyone from inlining if they want to.
Adjust some places where NextValueExpr support had been inserted with the
aid of a dartboard rather than keeping it in the same order as elsewhere.
Discussion: https://postgr.es/m/23862.1499981661@sss.pgh.pa.us
Previously, gen_random_uuid() would fall back to a weak random number
generator, unlike gen_random_bytes() which would just fail. And this was
not made very clear in the docs. For consistency, also make
gen_random_uuid() fail outright, if compiled with --disable-strong-random.
Re-word the error message you get with --disable-strong-random. It is also
used by pgp functions that require random salts, and now also
gen_random_uuid().
Reported by Radek Slupik.
Discussion: https://www.postgresql.org/message-id/20170101232054.10135.50528@wrigleys.postgresql.org
Don't move parenthesized lines to the left, even if that means they
flow past the right margin.
By default, BSD indent lines up statement continuation lines that are
within parentheses so that they start just to the right of the preceding
left parenthesis. However, traditionally, if that resulted in the
continuation line extending to the right of the desired right margin,
then indent would push it left just far enough to not overrun the margin,
if it could do so without making the continuation line start to the left of
the current statement indent. That makes for a weird mix of indentations
unless one has been completely rigid about never violating the 80-column
limit.
This behavior has been pretty universally panned by Postgres developers.
Hence, disable it with indent's new -lpl switch, so that parenthesized
lines are always lined up with the preceding left paren.
This patch is much less interesting than the first round of indent
changes, but also bulkier, so I thought it best to separate the effects.
Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org
Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us