This was broken recently by a929e17e5. I'd failed to remember that
parallel tests should have their EXPLAIN output run through the
explain_parallel_append function so that the output is stable when
parallel workers fail to start.
fairywren was first to notice.
Reported-by: Michael Paquier
Discussion: https://postgr.es/m/20201102062951.GB15770@paquier.xyz
The logical replication protocol uses a single-byte character to
identify a message type. The code used string literals for these. Use an
enum instead so that:
1. All the string literals used can be found at a single place. This
makes it easy to add more types without the risk of conflicts.
2. It's easy to locate the code handling a given message type.
3. When used with switch statements, it is easy to identify the missing
cases using -Wswitch.
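As an illustrative sketch of the pattern (the enum member and handler
names here are assumptions, not necessarily the committed ones):

    typedef enum LogicalRepMsgType
    {
        LOGICAL_REP_MSG_BEGIN = 'B',
        LOGICAL_REP_MSG_COMMIT = 'C',
        LOGICAL_REP_MSG_INSERT = 'I',
        LOGICAL_REP_MSG_UPDATE = 'U',
        LOGICAL_REP_MSG_DELETE = 'D'
    } LogicalRepMsgType;

    switch ((LogicalRepMsgType) firstchar)
    {
        case LOGICAL_REP_MSG_BEGIN:
            apply_handle_begin(s);
            break;
        /* ... one case per message type; with -Wswitch the compiler
         * warns about any enum value left unhandled ... */
    }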
Author: Ashutosh Bapat
Reviewed-by: Kyotaro Horiguchi, Andres Freund, Peter Smith and Amit Kapila
Discussion: https://postgr.es/m/CAExHW5uPzQ7L0oAd_ENyvaiYMOPgkrAoJpE+ZY5-obdcVT6NPg@mail.gmail.com
Previously we only tagged on the required information to allow the
executor to perform run-time partition pruning for Append/MergeAppend
nodes belonging to base relations. It was thought that nested
Append/MergeAppend nodes were just about always pulled up into the
top-level Append/MergeAppend, and that building the run-time pruning info
for any sub-Append/MergeAppend nodes was a waste of time. However, that
was likely badly thought through.
Some examples of cases where we're unable to pull up nested
Append/MergeAppends are: 1) a Parallel Append with a mix of parallel and
non-parallel subpaths; 2) an ordered Append scan where an unordered
sub-partition requires a nested MergeAppend path to ensure its tuples are
fed into the top-level Append in the correct order.
Unfortunately, it was not quite as simple as removing the lines in
createplan.c which were purposefully not building the run-time pruning
info for anything but RELOPT_BASEREL relations. The code in
add_paths_to_append_rel() was far too sloppy about which partitioned_rels
it included for the Append/MergeAppend paths. The original code there
would always assume accumulate_append_subpath() would pull each sub-Append
and sub-MergeAppend path into the top-level path. While it does not
appear that there were any actual bugs caused by having the additional
partitioned table RT indexes recorded, it did mean that later in
planning, when building the run-time pruning info, we wasted effort
building PartitionedRelPruneInfos for partitioned tables for which we had
no subpaths for the executor to run-time prune.
Here we tighten that up so that partitioned_rels only ever contains the RT
index for partitioned tables which actually have subpaths in the given
Append/MergeAppend. We can now Assert that every PartitionedRelPruneInfo
has a non-empty present_parts. That should allow us to catch any weird
corner cases that have been missed.
In passing, it seems there is no longer a good reason to make the
AppendPath's and MergeAppendPath's partitioned_rels fields a List of
IntLists. We can simply use a List of Relids instead. This is more
compact in memory and faster to add new members to. We still know which
is the root-level partition, as these always have a lower relid than
their children. Previously this field was used for more things, but
run-time partition pruning is now its only user, and it has no need for a
List of IntLists.
Here we also get rid of the RelOptInfo partitioned_child_rels field. This
is what was previously used to (sometimes incorrectly) set the
Append/MergeAppend path's partitioned_rels field. That was the only usage
of that field, so we can happily just remove it.
I also couldn't resist changing some nearby code to make use of the newly
added for_each_from macro so we can skip the first element in the list
without checking if the current item was the first one on each
iteration.
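For example, the general shape of such a change (an illustrative
fragment, not a specific call site):

    ListCell   *lc;
    bool        first = true;

    /* Before: skip the first element with a flag tested every time. */
    foreach(lc, subpaths)
    {
        if (first)
        {
            first = false;
            continue;
        }
        /* ... process non-first element ... */
    }

    /* After: begin iterating directly at the second element (index 1). */
    for_each_from(lc, subpaths, 1)
    {
        /* ... process non-first element ... */
    }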
A bug report from Andreas Kretschmer prompted all this work; however,
after some consideration, I'm not personally classing this as a bug fix.
So no backpatch. In Andreas' test case, it just wasn't that clear that
there was a nested Append since the top-level Append just had a single
sub-path which was pulled up a level, per 8edd0e794.
Author: David Rowley
Reviewed-by: Amit Langote
Discussion: https://postgr.es/m/flat/CAApHDvqSchs%2BubdybcfFaSPB%2B%2BEA7kqMaoqajtP0GtZvzOOR3g%40mail.gmail.com
pglz_maximum_compressed_size() potentially underestimated the amount
of compressed data required to produce N bytes of decompressed data;
this is a fault in commit 11a078cf8.
Separately from that, pglz_decompress() failed to protect itself
against corrupt compressed data, particularly off == 0 in a match
tag. Commit c60e520f6 turned such a situation into an infinite loop,
where before it'd just have resulted in garbage output.
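As a simplified illustration (not the actual pglz code) of the kind of
guard needed when decoding a match tag:

    /*
     * A match tag says: copy "len" bytes starting "off" bytes back in
     * the already-produced output.  A zero offset can never occur in
     * valid data, so reject it rather than acting on it.
     */
    if (off == 0)
        return -1;          /* corrupt compressed data */

    while (len-- > 0)
    {
        *dp = dp[-off];     /* off >= 1 guarantees we read behind dp */
        dp++;
    }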
The combination of these two bugs seems like it may explain bug #16694
from Tom Vijlbrief, though it's impossible to be quite sure without
direct inspection of the failing session. (One needs to assume that
the pglz_maximum_compressed_size() bug caused us to fail to fetch the
second byte of a match tag, and what happened to be there instead was
a zero. The reported infinite loop is hard to explain without off == 0,
though.)
Aside from fixing the bugs, rewrite associated comments for more
clarity.
Back-patch to v13 where both these commits landed.
Discussion: https://postgr.es/m/16694-f107871e499ec114@postgresql.org
Although error results received from the backend should always have
a SQLSTATE field, ones generated by libpq won't, making this code
vulnerable to a crash after, say, untimely loss of connection.
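The defensive pattern is roughly this (an illustrative fragment, not
the exact patched code):

    const char *sqlstate = PQresultErrorField(res, PG_DIAG_SQLSTATE);

    /*
     * Errors generated by libpq itself (e.g. after a lost connection)
     * carry no SQLSTATE, so sqlstate may be NULL; check before use.
     */
    if (sqlstate != NULL && strcmp(sqlstate, "40001") == 0)
        /* ... treat as a serialization failure ... */ ;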
Noted by Coverity.
Oversight in commit 403a3d91c. Back-patch to 9.5, as that was.
Statistics associated with an index got lost after running REINDEX
CONCURRENTLY, while the non-concurrent case preserves them correctly.
The concurrent and non-concurrent operations need to be consistent for
the end-user, and missing statistics would force one to wait for a new
ANALYZE to happen, which could take some time depending on the activity
of the existing autovacuum workers. This issue is fixed by copying any
existing entries in pg_statistic associated with the old index to the
new one. Note that this copy is already done for the index's data in
the stats collector.
Reported-by: Fabrízio de Royes Mello
Author: Michael Paquier, Fabrízio de Royes Mello
Reviewed-by: Justin Pryzby
Discussion: https://postgr.es/m/CAFcNs+qpFPmiHd1oTXvcPdvAHicJDA9qBUSujgAhUMJyUMb+SA@mail.gmail.com
Backpatch-through: 12
Certain background workers initiate parallel queries while
debug_query_string==NULL, at which point they attempted strlen(NULL) and
crashed with SIGSEGV. Older debug_query_string observers allow NULL, so do
likewise in these newer ones. Back-patch to v11, where commit
7de4a1bcc5 introduced the first of these.
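The shape of the fix, as an illustrative fragment of the query-string
serialization done when launching parallel workers:

    const char *query_string = debug_query_string;
    Size        query_len;

    /* debug_query_string may legitimately be NULL here (e.g. in a
     * background worker), so don't call strlen() unconditionally. */
    query_len = query_string ? strlen(query_string) + 1 : 0;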
Discussion: https://postgr.es/m/20201014022636.GA1962668@rfd.leadboat.com
Because of this, if you tried to create an operator family with the new
sortsupport function, you got an error:
ERROR: support function number 11 is invalid for access method gist
We missed this in commit 16fa9b2b30 that added the sortsupport function,
because it only added sortsupport to a built-in operator family.
Author: Andrey Borodin
Discussion: https://www.postgresql.org/message-id/3520A18A-5C38-4697-A2E3-F3BDE3496CD5%40yandex-team.ru
The timetz test cases I added in commit a9632830b were unintentionally
sensitive to whether or not DST is active in the PST8PDT time zone.
Thus, they'll start failing this coming weekend, as reported by
Bernhard M. Wiedemann in bug #16689. Fortunately, DST-awareness is
not significant to the purpose of these test cases, so we can just
force them all to PDT (DST hours) to preserve stability of the
results.
Back-patch to v10, as the prior patch was.
Discussion: https://postgr.es/m/16689-57701daa23b377bf@postgresql.org
On the same reasoning as in commit 36b931214, forbid using custom
oid_symbol macros in pg_type as well as pg_proc, so that we always
rely on the predictable macro names generated by genbki.pl.
We do continue to grant grandfather status to the names CASHOID and
LSNOID, although those are now considered deprecated aliases for the
preferred names MONEYOID and PG_LSNOID. This is because there's
likely to be client-side code using the old names, and this bout of
neatnik-ism doesn't quite seem worth breaking client code.
There might be a case for grandfathering EVTTRIGGEROID, too, since
externally-maintained PLs may reference that symbol. But renaming
such references to EVENT_TRIGGEROID doesn't seem like a particularly
heavy lift --- we make far more significant backend API changes in
every major release. For now I didn't add that, but we could
reconsider if there's pushback.
The other names changed here seem pretty unlikely to have any outside
uses. Again, we could add alias macros if there are complaints, but
for now I didn't.
As before, no need for a catversion bump.
John Naylor
Discussion: https://postgr.es/m/CAFBsxsHpCbjfoddNGpnnnY5pHwckWfiYkMYSF74PmP1su0+ZOw@mail.gmail.com
When ComputeXidHorizons() was called before MyDatabaseId is set,
e.g. because a dead row in a shared relation is encountered during
InitPostgres(), the horizon for normal tables was computed too
aggressively, ignoring all backends connected to a database.
During subsequent pruning in a data table the too aggressive horizon
could end up still being used, possibly leading to still needed tuples
being removed. Not good.
This is a bug in dc7420c2c9, which the test added in 94bc27b576 made
visible, if run with force_parallel_mode set to regress. In that case
the bug is reliably triggered, because "pruning_query" is run in a
parallel worker and the start of that parallel worker is likely to
encounter a dead row in pg_database.
The fix is trivial: Compute a more pessimistic data table horizon if
MyDatabaseId is not yet known.
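In sketch form (simplified; treat the field names as illustrative):

    /*
     * If MyDatabaseId is not yet known, we cannot tell which backends
     * are connected to our database, so fall back to the pessimistic
     * horizon that considers all backends, as used for shared rels.
     */
    if (!OidIsValid(MyDatabaseId))
        h->data_oldest_nonremovable = h->shared_oldest_nonremovable;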
Author: Andres Freund
Discussion: https://postgr.es/m/20201029040030.p4osrmaywhqaesd4@alap3.anarazel.de
This adds the statistics about transactions streamed to the decoding
output plugin from ReorderBuffer. Users can query the
pg_stat_replication_slots view to check these stats and call
pg_stat_reset_replication_slot to reset the stats of a particular slot.
Users can pass NULL in pg_stat_reset_replication_slot to reset stats of
all the slots.
Commit 9868167500 added the basic infrastructure to capture the stats
of slots, and this commit extends the statistics collector to track
additional information about slots.
Bump the catversion as we have added new columns in the catalog entry.
Author: Ajin Cherian and Amit Kapila
Reviewed-by: Sawada Masahiko and Dilip Kumar
Discussion: https://postgr.es/m/CAA4eK1+chpEomLzgSoky-D31qev19AmECNiEAietPQUGEFhtVA@mail.gmail.com
This fixes a bug in the edge case where, for a temp table, heap_page_prune()
can end up with a different horizon than heap_vacuum_rel(), which can trigger
errors like "ERROR: cannot freeze committed xmax ...".
The bug was introduced due to interaction of a7212be8b9 "Set cutoff xmin more
aggressively when vacuuming a temporary table." with dc7420c2c9 "snapshot
scalability: Don't compute global horizons while building snapshots.".
The problem is caused by lazy_scan_heap() assuming that the only reason its
HeapTupleSatisfiesVacuum() call would return HEAPTUPLE_DEAD is if the tuple is
a HOT tuple, or if the tuple's inserting transaction has aborted since the
heap_page_prune() call. But after a7212be8b9 that was also possible in other
cases for temp tables, because heap_page_prune() uses a different visibility
test after dc7420c2c9.
The fix is fairly simple: Move the special case logic for temp tables from
vacuum_set_xid_limits() to the infrastructure introduced in dc7420c2c9. That
ensures that the horizon used for pruning is at least as aggressive as the one
used by lazy_scan_heap(). The concrete horizon used for temp tables is
slightly different than the logic in dc7420c2c9, but should always be as
aggressive as before (see comments).
A significant benefit of centralizing the logic in procarray.c is that the
more aggressive horizon for temp tables now applies not just to VACUUM but
also to e.g. HOT pruning and the nbtree killtuples logic.
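A rough sketch of the idea (simplified; the real logic lives in the
horizon machinery added by dc7420c2c9, and the exact expression differs):

    /*
     * A temp table's contents are visible only to the owning backend,
     * so its removal horizon can be derived from this backend's own
     * state alone, rather than the database-wide horizon.
     */
    if (RELATION_IS_LOCAL(rel))
        horizon = TransactionIdIsValid(MyProc->xmin)
            ? MyProc->xmin
            : ReadNextTransactionId();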
Because isTopLevel is not needed by vacuum_set_xid_limits() anymore, I
undid the related changes from a7212be8b9.
This commit also adds an isolation test ensuring that the more aggressive
vacuuming and pruning of temp tables keeps working.
Debugged-By: Amit Kapila <amit.kapila16@gmail.com>
Debugged-By: Tom Lane <tgl@sss.pgh.pa.us>
Debugged-By: Ashutosh Sharma <ashu.coek88@gmail.com>
Author: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20201014203103.72oke6hqywcyhx7s@alap3.anarazel.de
Discussion: https://postgr.es/m/20201015083735.derdzysdtqdvxshp@alap3.anarazel.de
In almost all other places, we use plain "r" or "w" mode in popen()
calls (the exceptions being for COPY data). This one had been
overlooked (possibly because it's buried in a ".l" flex file?) and
was using PG_BINARY_R.
Kensuke Okamura complained in bug #16688 that we fail to strip \r
when stripping the trailing newline from a backtick result string.
That's true enough, but we'd also fail to convert embedded \r\n
cleanly, which also seems undesirable. Fixing the popen() mode
seems like the best way to deal with this.
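The change itself is one line in nature; illustratively:

    /* Was: fp = popen(cmd, PG_BINARY_R).  Text mode lets the C library
     * translate \r\n to \n on Windows, fixing both the trailing newline
     * and any embedded ones. */
    fp = popen(cmd, "r");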
It's been like this for a long time, so back-patch to all supported
branches.
Discussion: https://postgr.es/m/16688-c649c7b69cd7e6f8@postgresql.org
It's unsafe to do this at parse time because addition of generated
columns to a table would not invalidate stored rules containing
UPDATEs on the table ... but there might now be dependent generated
columns that were not there when the rule was made. This also fixes
an oversight that rewriteTargetView failed to update extraUpdatedCols
when transforming an UPDATE on an updatable view. (Since the new
calculation is downstream of that, rewriteTargetView doesn't actually
need to do anything; but before, there was a demonstrable bug there.)
In v13 and HEAD, this leads to easily-visible bugs because (since
commit c6679e4fc) we won't recalculate generated columns that aren't
listed in extraUpdatedCols. In v12 this bitmap is mostly just used
for trigger-firing decisions, so you'd only notice a problem if a
trigger cared whether a generated column had been updated.
I'd complained about this back in May, but then forgot about it
until bug #16671 from Michael Paul Killian revived the issue.
Back-patch to v12 where this field was introduced. If existing
stored rules contain any extraUpdatedCols values, they'll be
ignored because the rewriter will overwrite them, so the bug will
be fixed even for existing rules. (But note that if someone were
to update to 13.1 or 12.5, store some rules with UPDATEs on tables
having generated columns, and then downgrade to a prior minor version,
they might observe issues similar to what this patch fixes. That
seems unlikely enough to not be worth going to a lot of effort to fix.)
Discussion: https://postgr.es/m/10206.1588964727@sss.pgh.pa.us
Discussion: https://postgr.es/m/16671-2fa55851859fb166@postgresql.org
We have a perfectly good convention for OID macros for built-in functions
already, so making custom symbols is just introducing unnecessary
deviation from the convention. Remove the one case that had snuck in,
and add an error check in genbki.pl to discourage future instances.
Although this touches pg_proc.dat, there's no need for a catversion
bump since the actual catalog data isn't changed.
John Naylor
Discussion: https://postgr.es/m/CAFBsxsHpCbjfoddNGpnnnY5pHwckWfiYkMYSF74PmP1su0+ZOw@mail.gmail.com
get_foreign_key_join_selectivity() looks for join clauses that equate
the two sides of the FK constraint. However, if we have a query like
"WHERE fktab.a = pktab.a and fktab.a = 1", it won't find any such join
clause, because equivclass.c replaces the given clauses with "fktab.a
= 1 and pktab.a = 1", which can be enforced at the scan level, leaving
nothing to be done for column "a" at the join level.
We can fix that expectation without much trouble, but then a new problem
arises: applying the foreign-key-based selectivity rule produces a
rowcount underestimate, because we're effectively double-counting the
selectivity of the "fktab.a = 1" clause. So we have to cancel that
selectivity out of the estimate.
To fix, refactor process_implied_equality() so that it can pass back the
new RestrictInfo to its callers in equivclass.c, allowing the generated
"fktab.a = 1" clause to be saved in the EquivalenceClass's ec_derives
list. Then it's not much trouble to dig out the relevant RestrictInfo
when we need to adjust an FK selectivity estimate. (While at it, we
can also remove the expensive use of initialize_mergeclause_eclasses()
to set up the new RestrictInfo's left_ec and right_ec pointers.
The equivclass.c code can set those basically for free.)
This seems clearly to be a bug fix, but I'm hesitant to back-patch it,
first because there's some API/ABI risk for extensions and second because
we're usually loath to destabilize plan choices in stable branches.
Per report from Sigrid Ehrenreich.
Discussion: https://postgr.es/m/1019549.1603770457@sss.pgh.pa.us
Discussion: https://postgr.es/m/AM6PR02MB5287A0ADD936C1FA80973E72AB190@AM6PR02MB5287.eurprd02.prod.outlook.com
This makes use of CheckBuffer() introduced in c780a7a, adding a SQL
wrapper able to do checks for all the pages of a relation. By default,
all the fork types of a relation are checked, and it is possible to
check only a given relation fork. Note that if the relation given as
input has no physical storage or is temporary, then no errors are
generated, allowing full-database checks when coupled with a simple scan
of pg_class for example. This is not limited to clusters with data
checksums enabled, as clusters without data checksums can still apply
checks on pages using the page headers or for the case of a page full of
zeros.
This function returns a set of tuples consisting of:
- The physical file where a broken page has been detected (without the
segment number, as that can be AM-dependent; for heap it can be guessed
from the block number). A relative path from PGDATA is used.
- The block number of the broken page.
By default, only superusers have access to this function, but
execution rights can be granted to other users.
The feature introduced here is still minimal, and more improvements
could be done, like:
- Addition of a start and end block number to run checks on a range
of blocks, which would apply only if one fork type is checked.
- Addition of some progress reporting.
- Throttling, with configuration parameters in function input or
potentially some cost-based GUCs.
Regression tests are added for positive cases in the main regression
test suite, and TAP tests are added for cases involving the emulation of
page corruptions.
Bump catalog version.
Author: Julien Rouhaud, Michael Paquier
Reviewed-by: Masahiko Sawada, Justin Pryzby
Discussion: https://postgr.es/m/CAOBaU_aVvMjQn=ge5qPiJOPMmOj5=ii3st5Q0Y+WuLML5sR17w@mail.gmail.com
CheckBuffer() is designed to be a concurrent-safe function able to run
sanity checks on a relation page without loading it into the shared
buffers. The operation is done using a lock on the partition involved
in the shared buffer mapping hashtable and an I/O lock for the buffer
itself, preventing the risk of false positives due to any concurrent
activity.
The primary use of this function is the detection of on-disk corruptions
for relation pages. If a page is found in shared buffers, the on-disk
page is checked if not dirty (a follow-up checkpoint would flush a valid
version of the page if dirty anyway), as it could be possible that a
page was present for a long time in shared buffers with its on-disk
version corrupted. Such a scenario could lead to a corrupted cluster if
a host is powered off unexpectedly, for example. If the page is not found in shared
buffers, its on-disk state is checked. PageIsVerifiedExtended() is used
to apply the same sanity checks as when a page gets loaded into shared
buffers.
This function will be used by an upcoming patch able to check the state
of on-disk relation pages using a SQL function.
Author: Julien Rouhaud, Michael Paquier
Reviewed-by: Masahiko Sawada
Discussion: https://postgr.es/m/CAOBaU_aVvMjQn=ge5qPiJOPMmOj5=ii3st5Q0Y+WuLML5sR17w@mail.gmail.com
Now that LOCK TABLE can take any relation type, acquire lock on all
relations that are to be dumped. This prevents schema changes or
deadlock errors that could cause a dump to fail after expending much
effort. The server is tested for this capability, and the feature is
disabled if it's missing, so that a patched pg_dump doesn't fail when
connecting to an unpatched server.
Backpatch to 9.5.
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reported-by: Wells Oliver <wells.oliver@gmail.com>
Discussion: https://postgr.es/m/20201021200659.GA32358@alvherre.pgsql
This accompanies select_common_type() and select_common_collation().
Typmods were previously combined using hand-coded logic in several
places. The logic in select_common_typmod() isn't very exciting, but
it makes the code more compact and readable in a few locations, and in
the future we can perhaps do more complicated things if desired.
As a small enhancement, the type unification of the direct and
aggregate arguments of hypothetical-set aggregates now unifies the
typmod as well using this new function, instead of just dropping it.
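The core logic is indeed unexciting; in sketch form (a hypothetical
simplified body, not the committed function):

    /* Combine the typmods of expressions already coerced to a common
     * type: if they all agree, keep that typmod; otherwise the result's
     * typmod is unknown (-1). */
    static int32
    select_common_typmod_sketch(List *exprs)
    {
        int32       typmod = -1;
        bool        first = true;
        ListCell   *lc;

        foreach(lc, exprs)
        {
            int32   t = exprTypmod((Node *) lfirst(lc));

            if (first)
            {
                typmod = t;
                first = false;
            }
            else if (t != typmod)
                return -1;
        }
        return typmod;
    }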
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Discussion: https://www.postgresql.org/message-id/flat/97df3af9-8b5e-fb7f-a029-3eb7e80d7af9@2ndquadrant.com
The restriction that only tables and views can be locked by LOCK TABLE
is quite arbitrary, since the underlying mechanism can lock any relation
type. Drop the restriction so that programs such as pg_dump can lock
all relations they're interested in, preventing schema changes that
could cause a dump to fail after expending much effort.
Backpatch to 9.5.
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reported-by: Wells Oliver <wells.oliver@gmail.com>
Discussion: https://postgr.es/m/20201021200659.GA32358@alvherre.pgsql
Previously, ExecInitModifyTable relied on ExecInitJunkFilter,
and thence ExecCleanTypeFromTL, to build the target descriptor from
the query tlist. While we just checked (in ExecCheckPlanOutput)
that the tlist produces compatible output, this is not a great
substitute for the relation's actual tuple descriptor that's
available from the relcache. For one thing, dropped columns will
not be correctly marked attisdropped; it's a bit surprising that
we've gotten away with that this long. But the real reason for
being concerned with this is that using the table's descriptor means
that the slot will have correct attrmissing data, allowing us to
revert the klugy fix of commit ba9f18abd. (This commit undoes
that one's changes in trigger.c, but keeps the new test case.)
Thus we can solve the bogus-trigger-tuple problem with fewer cycles
rather than more.
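The crux of the change, in illustrative form:

    /* Before: descriptor derived from the plan's targetlist. */
    tupDesc = ExecCleanTypeFromTL(subplan->targetlist);

    /* After: the relation's authoritative descriptor from the relcache,
     * which carries correct attisdropped and missing-attribute data. */
    tupDesc = RelationGetDescr(resultRelInfo->ri_RelationDesc);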
No back-patch, since this doesn't fix any additional bug, and it
seems somewhat more likely to have unforeseen side effects than
ba9f18abd's narrow fix.
Discussion: https://postgr.es/m/16644-5da7ef98a7ac4545@postgresql.org
This is useful for checks of relation pages without having to load the
pages into the shared buffers, and two cases can make use of that: page
verification in base backups and the online, lock-safe flavor.
Compatibility is kept with past versions using a macro that calls the
new extended routine with the set of options compatible with the
original version.
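The compatibility macro looks roughly like this (a sketch, assuming
option flags along these lines):

    /* Keep the original API as a wrapper enabling all the behavior the
     * old function had (log a warning, update checksum-failure stats). */
    #define PageIsVerified(page, blkno) \
        PageIsVerifiedExtended((page), (blkno), \
                               PIV_LOG_WARNING | PIV_REPORT_STAT)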
Extracted from a larger patch by the same author.
Author: Anastasia Lubennikova
Reviewed-by: Michael Paquier, Julien Rouhaud
Discussion: https://postgr.es/m/608f3476-0598-2514-2c03-e05c7d2b0cbd@postgrespro.ru
If the old row has any "missing" attributes that are supposed to
be retrieved from an associated tuple descriptor, the wrong things
happened because the trigger result is shoved directly into an
executor slot that lacks the missing-attribute data. Notably,
CHECK-constraint verification would incorrectly see those columns
as NULL, and so would RETURNING-list evaluation.
Band-aid around this by forcibly expanding the tuple before passing
it to the trigger function. (IMO it was a fundamental misdesign to
put the missing-attribute data into tuple constraints, which so
much of the system considers to be optional. But we're probably
stuck with that now, and will have to continue to apply band-aids
as we find other places with similar issues.)
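The band-aid amounts to something like this (simplified):

    /* If the tuple has fewer attributes than the descriptor, some are
     * "missing" and must be materialized from the descriptor before the
     * trigger (and everything downstream) sees the tuple. */
    if (HeapTupleHeaderGetNatts(tuple->t_data) < tupdesc->natts)
        tuple = heap_expand_tuple(tuple, tupdesc);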
Back-patch to v12. v11 would also have the issue, except that
commit 920311ab1 already applied a similar band-aid. That forced
expansion in more cases than seem really necessary, though, so
this isn't a directly equivalent fix.
Amit Langote, with some cosmetic changes by me
Discussion: https://postgr.es/m/16644-5da7ef98a7ac4545@postgresql.org
We must not set the "done" flag until after we've executed the
initialization function. Otherwise, other threads can fall through
the initial unlocked test before initialization is really complete.
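In outline, the correct ordering looks like this (an illustrative
reduction of the pattern, not ecpglib's exact code; do_initialization()
is a hypothetical stand-in):

    #include <pthread.h>
    #include <stdbool.h>

    static pthread_mutex_t init_mutex = PTHREAD_MUTEX_INITIALIZER;
    static volatile bool done = false;

    static void
    init_once(void)
    {
        if (!done)              /* unlocked fast-path test */
        {
            pthread_mutex_lock(&init_mutex);
            if (!done)
            {
                do_initialization();    /* hypothetical init function */
                done = true;            /* only AFTER init completes */
            }
            pthread_mutex_unlock(&init_mutex);
        }
    }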
This has been seen to cause rare failures of ecpg's thread/descriptor
test, and it could presumably cause other sorts of misbehavior in
threaded ECPG-using applications, since ecpglib relies on
pthread_once() in several places.
Diagnosis and patch by me, based on investigation by Alexander Lakhin.
Back-patch to all supported branches (the bug dates to 2007).
Discussion: https://postgr.es/m/16685-d6cd241872c101d3@postgresql.org
genhtml has been generating the following warning with this new code:
WARNING: function data mismatch at /path/src/common/unicode_norm.c:102
HTML coverage reports care about the uniqueness of functions defined in
source files, ignoring any assumptions around CFLAGS. 783f0cc
introduced a duplicated definition of get_code_entry(), leading to a
warning and potentially some incorrect data generated in the reports.
This refactors the code so that it has only one function
definition, fixing the warning.
Oversight in 783f0cc.
Reported-by: Tom Lane
Author: Michael Paquier
Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/207789.1603469272@sss.pgh.pa.us
Instead of immediately PQfinish'ing a dead connection, save it aside
so that we can still extract its parameters for \connect attempts.
(This works because PQconninfo doesn't care whether the PGconn is in
CONNECTION_BAD state.) This allows developers to reconnect with
just \c after a database crash and restart.
It's tempting to use the same approach instead of closing the old
connection after a failed non-interactive \connect command. However,
that would not be very safe: consider a script containing
\c db1 user1 live_server
\c db2 user2 dead_server
\c db3
The script would be expecting to connect to db3 at dead_server, but
if we re-use parameters from the first connection then it might
successfully connect to db3 at live_server. This'd defeat the goal
of not letting a script accidentally execute commands against the
wrong database.
Discussion: https://postgr.es/m/38464.1603394584@sss.pgh.pa.us
The ExplainCloseGroup arguments for incremental sort usage data
didn't match the corresponding ExplainOpenGroup. This only matters
for XML-format output, which is probably why we'd not noticed.
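The invariant is that the open and close calls must receive identical
arguments; schematically (labels here are illustrative):

    ExplainOpenGroup("Sort Groups", "Full-sort Groups", true, es);
    /* ... ExplainProperty* calls for the group's contents ... */
    ExplainCloseGroup("Sort Groups", "Full-sort Groups", true, es);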
Daniel Gustafsson, per bug #16683 from Frits Jalvingh
Discussion: https://postgr.es/m/16683-8005033324ad34e9@postgresql.org
This replaces the existing binary search with two perfect hash functions
for the composition and the decomposition in the backend code, at the
cost of slightly larger binaries there (35kB in libpgcommon_srv.a). Per
the measurements done, this improves the speed of the recomposition and
decomposition by up to 30~40 times for the NFC and NFKC conversions,
while all other operations get at least 40% faster. This is not as
"good" as what libicu has, but it closes the gap a lot as per the
feedback from Daniel Verite.
The decomposition table remains the same, getting used for the binary
search in the frontend code, where we care more about the size of
libraries like libpq than about performance, as this gets involved only
in code paths related to SCRAM authentication. In consequence, note that
the perfect hash function for the recomposition needs to use a new
inverse lookup array back to the existing decomposition table.
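Schematically, a perfect-hash lookup replaces the binary search with
something like this (illustrative shape, not the generated code):

    /* The generated hash maps each known code point to a unique slot;
     * since arbitrary input may hash anywhere, the slot's key must
     * still be verified before trusting the entry. */
    int     h = decomp_hash_func(code);     /* hypothetical generated fn */

    if (h >= 0 && decomp_table[h].codepoint == code)
        return &decomp_table[h];
    return NULL;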
The size of all frontend deliverables remains unchanged, even with
--enable-debug, including libpq.
Author: John Naylor
Reviewed-by: Michael Paquier, Tom Lane
Discussion: https://postgr.es/m/CAFBsxsHUuMFCt6-pU+oG-F1==CmEp8wR+O+bRouXWu6i8kXuqA@mail.gmail.com
There's no functional change at all here, but I'm curious to see
whether this change successfully shuts up Coverity's warning about
a useless strcmp(), which appeared with the previous update.
Discussion: http://mm.icann.org/pipermail/tz/2020-October/029370.html
ECPG's PREPARE ... FROM and EXECUTE IMMEDIATE can optionally take
the target query as a simple literal, rather than the more usual
string-variable reference. This was previously documented as
being a C string literal, but that's a lie in one critical respect:
you can't write a data double quote as \" in such literals. That's
because the lexer is in SQL mode at this point, so it'll parse
double-quoted strings as SQL identifiers, within which backslash
is not special, so \" ends the literal.
I looked into making this work as documented, but getting the lexer
to switch behaviors at just the right point is somewhere between
very difficult and impossible. It's not really worth the trouble,
because these cases are next to useless: if you have a fixed SQL
statement to execute or prepare, you might as well write it as
a direct EXEC SQL, saving the messiness of converting it into
a string literal and gaining the opportunity for compile-time
SQL syntax checking.
Instead, let's just document (and test) the workaround of writing
a double quote as an octal escape (\042) in such cases.
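An illustrative .pgc fragment of the workaround (statement and names
are hypothetical):

    /* \042 survives the SQL-mode lexing and yields a double quote,
     * so the backend sees: SELECT "Id" FROM tbl */
    EXEC SQL PREPARE st FROM "SELECT \042Id\042 FROM tbl";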
There's no code behavioral change here, so in principle this could
be back-patched, but it's such a niche case I doubt it's worth
the trouble.
Per report from 1250kv.
Discussion: https://postgr.es/m/673825.1603223178@sss.pgh.pa.us
If you write the literal 'abc''def' in an EXEC SQL command, that will
come out the other end as 'abc'def', triggering a syntax error in the
backend. Likewise, "abc""def" is reduced to "abc"def" which is wrong
syntax for a quoted identifier.
The cause is that the lexer thinks it should emit just one quote
mark, whereas what it really should do is keep the string as-is.
Add some docs and test cases, too.
Although this seems clearly a bug, I fear users wouldn't appreciate
changing it in minor releases. Some may well be working around it
by applying an extra doubling of affected quotes, as for example
sql/dyntest.pgc has been doing.
Per investigation of a report from 1250kv, although this isn't
exactly what he/she was on about.
Discussion: https://postgr.es/m/673825.1603223178@sss.pgh.pa.us
The check for whether to complain about not having an old connection
to get parameters from was seriously out of date: it had not been
rethought when we invented connstrings, nor when we invented the
-reuse-previous option. Replace it with a check that throws an
error if reuse-previous is active and we lack an old connection to
reuse. While that doesn't move the goalposts very far in terms of
easing reconnection after a server crash, at least it's consistent.
If the user specifies a connstring plus additional parameters
(which is invalid per the documentation), the extra parameters were
silently ignored. That seems like it could be really confusing,
so let's throw a syntax error instead.
Teach the connstring code path to re-use the old connection's password
in the same cases as the old-style-syntax code path would, ie if we
are reusing parameters and the values of username, host/hostaddr, and
port are not being changed. Document this behavior, too, since it was
unmentioned before. Also simplify the implementation a bit, giving
rise to two new and useful properties: if there's a "password=xxx" in
the connstring, we'll use it rather than ignore it, and by default (i.e.,
except with --no-password) we will prompt for a password if the
re-used password or connstring password doesn't work. The previous
code just failed if the re-used password didn't work.
Given the paucity of field complaints about these issues, I don't
think that they rise to the level of back-patchable bug fixes,
and in any case they might represent undesirable behavior changes
in minor releases. So no back-patch.
Discussion: https://postgr.es/m/235210.1603321144@sss.pgh.pa.us
There are a handful of places where we called list_delete_ptr() to remove
some element from a List. In many of these places we know, or with very
little additional effort can know, the index of the ListCell that we need
to remove.
Here we change all of those places to instead use one of:
list_delete_nth_cell(), foreach_delete_current() or list_delete_last().
Each of these saves having to iterate over the list to search for the
element to remove by its pointer value.
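For example, the general shape of such a change (illustrative):

    ListCell   *lc;

    /* Before: a second list traversal hidden inside list_delete_ptr(). */
    list = list_delete_ptr(list, datum);

    /* After: when already standing on the cell, delete it directly. */
    foreach(lc, list)
    {
        if (lfirst(lc) == datum)
        {
            list = foreach_delete_current(list, lc);
            break;
        }
    }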
There are some small performance gains to be had by doing this, but in the
general case, none of these lists are likely to be very large, so the
lookup was probably never that expensive anyway. However, some of the
calls are in fairly hot code paths, e.g. process_equivalence(), so any
small gains there are useful.
Author: Zhijie Hou and David Rowley
Discussion: https://postgr.es/m/b3517353ec7c4f87aa560678fbb1034b@G08CNEXMBPEKD05.g08.fujitsu.local
psql's \connect claims to be able to re-use previous connection
parameters, but in fact it only re-uses the database name, user name,
host name (and possibly hostaddr, depending on version), and port.
This is problematic for assorted use cases. Notably, pg_dump[all]
emits "\connect databasename" commands which we would like to have
re-use all other parameters. If such a script is loaded in a psql run
that initially had "-d connstring" with some non-default parameters,
those other parameters would be lost, potentially causing connection
failure. (Thus, this is the same kind of bug addressed in commits
a45bc8a4f and 8e5793ab6, although the details are much different.)
To fix, redesign do_connect() so that it pulls out all properties
of the old PGconn using PQconninfo(), and then replaces individual
properties in that array. In the case where we don't wish to re-use
anything, get libpq's default settings using PQconndefaults() and
replace entries in that, so that we don't need different code paths
for the two cases.
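A simplified sketch of that approach (variable names and the
replacement logic are illustrative; error handling omitted):

    PQconninfoOption *cinfo;
    PQconninfoOption *ci;

    /* Start from the old connection's full parameter set, or from
     * libpq's defaults when nothing is to be re-used. */
    cinfo = reuse_previous ? PQconninfo(o_conn) : PQconndefaults();

    for (ci = cinfo; ci->keyword != NULL; ci++)
    {
        /* Replace only the properties the user gave explicitly. */
        if (new_dbname && strcmp(ci->keyword, "dbname") == 0)
        {
            free(ci->val);
            ci->val = pg_strdup(new_dbname);
        }
        /* ... likewise for user, host, port ... */
    }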
This does result in an additional behavioral change for cases where
the original connection parameters allowed multiple hosts, say
"psql -h host1,host2", and the \connect request allows re-use of the
host setting. Because the previous coding relied on PQhost(), it
would only permit reconnection to the same host originally selected.
Although one can think of scenarios where that's a good thing, there
are others where it is not. Moreover, that behavior doesn't seem to
meet the principle of least surprise, nor was it documented; nor is
it even clear it was intended, since that coding long pre-dates the
addition of multi-host support to libpq. Hence, this patch is content
to drop it and re-use the host list as given.
Per Peter Eisentraut's comments on bug #16604. Back-patch to all
supported branches.
Discussion: https://postgr.es/m/16604-933f4b8791227b15@postgresql.org