Although clang claims to be compatible with gcc's printf format
archetypes, this appears to be a falsehood: it likes __syslog__
(which gcc does not, on most platforms) and doesn't accept
gnu_printf. This means that if you try to use gcc with clang++
or clang with g++, you get compiler warnings when compiling
printf-like calls in our C++ code. This has been true for quite
awhile, but it's gotten more annoying with the recent appearance
of several buildfarm members that are configured like this.
To fix, run separate probes for the format archetype to use with the
C and C++ compilers, and conditionally define PG_PRINTF_ATTRIBUTE
depending on __cplusplus.
(We could alternatively insist that you not mix-and-match C and
C++ compilers; but if the case works otherwise, this is a poor
reason to insist on that.)
No back-patch for now, but we may want to do that if this
patch survives buildfarm testing.
Discussion: https://postgr.es/m/986485.1764825548@sss.pgh.pa.us
Always return TIDs in descending order when returning groups of TIDs
from an nbtree posting list tuple during nbtree backwards scans. This
makes backwards scans tend to require fewer buffer hits, since the scan
is less likely to repeatedly pin and unpin the same heap page/buffer
(we'll get exactly as many buffer hits as we get with a similar forwards
scan case).
Commit 0d861bbb, which added nbtree deduplication, originally did things
this way to avoid interfering with _bt_killitems's approach to setting
LP_DEAD bits on posting list tuples. _bt_killitems makes a soft
assumption that it can always iterate through posting lists in ascending
TID order, finding corresponding killItems[]/so->currPos.items[] entries
in that same order. This worked out because of the prior _bt_readpage
backwards scan behavior. If we just changed the backwards scan posting
list logic in _bt_readpage, without altering _bt_killitems itself, it
would break its soft assumption.
Avoid that problem by sorting the so->killedItems[] array at the start
of _bt_killitems. That way the order that dead items are saved in from
btgettuple can't matter; so->killedItems[] will always be in the same
order as so->currPos.items[] in the end. Since so->currPos.items[] is
now always in leaf page order, regardless of the scan direction used
within _bt_readpage, and since so->killedItems[] is always in that same
order, the _bt_killitems loop can continue to make a uniform assumption
about everything being in page order. In fact, sorting like this makes
the previous soft assumption about item order into a hard invariant.
Also deduplicate the so->killedItems[] array after it is sorted. That
way there's no risk of the _bt_killitems loop becoming confused by a
duplicate dead item/TID. This was possible in cases that involved a
scrollable cursor that encountered the same dead TID more than once
(within the same leaf page/so->currPos context). This doesn't come up
very much in practice, but it seems best to be as consistent as possible
about how and when _bt_killitems will LP_DEAD-mark index tuples.
Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Mircea Cadariu <cadariu.mircea@gmail.com>
Reviewed-By: Victor Yegorov <vyegorov@gmail.com>
Discussion: https://postgr.es/m/CAH2-Wz=Wut2pKvbW-u3hJ_LXwsYeiXHiW8oN1GfbKPavcGo8Ow@mail.gmail.com
The test seemed to incorrectly think that query_safe() takes an
argument that describes what the query does, similar to e.g.
command_ok(). Until commit bd8d9c9bdf the extra arguments were
harmless and were just ignored, but when commit bd8d9c9bdf introduced
a new optional argument to query_safe(), the extra arguments started
clashing with that, causing the test to fail.
Backpatch to v17, that's the oldest branch where the test exists. The
extra arguments didn't cause any trouble on the older branches, but
they were clearly bogus anyway.
Phase I vacuum gives the page a once-over after pruning and freezing to
check that the values of all_visible and all_frozen agree with the
result of heap_page_is_all_visible(). This is meant to keep the logic in
phase I for determining visibility in sync with the logic in phase III.
Rewrite the assertion to avoid an Assert(false).
Suggested by Andres Freund.
Author: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/mhf4vkmh3j57zx7vuxp4jagtdzwhu3573pgfpmnjwqa6i6yj5y%40sy4ymcdtdklo
These functions took a ResourceOwner argument, but only checked if it
was NULL, and then used CurrentResourceOwner for the actual work.
Surely the intention was to use the passed-in resource owner. All
current callers passed CurrentResourceOwner or NULL, so this has no
consequences at the moment, but it's an accident waiting to happen for
future caller and extensions.
Author: Matthias van de Meent <boekewurm+postgres@gmail.com>
Discussion: https://www.postgresql.org/message-id/CAEze2Whnfv8VuRZaohE-Af+GxBA1SNfD_rXfm84Jv-958UCcJA@mail.gmail.com
Backpatch-through: 17
Buildfarm members skimmer and crake have reported that pg_upgrade
running from v18 fails due to the changes of d52c24b0f8, with the
expectations that the objects removed in the test module
injection_points should still be present post upgrades, but the test
module does not have them anymore.
The origin of the issue is that the following test modules depend on
injection_points, but they do not drop the extension once the tests
finish, leaving its traces in the dumps used for the upgrades:
- gin, down to v17
- typcache, down to v18
- nbtree, HEAD-only
Test modules have no upgrade requirements, as they are used only for..
Tests, so there is no point in keeping them around.
An alternative solution would be to drop the databases created by these
modules in AdjustUpgrade.pm, but the solution of this commit to drop the
extension is simpler. Note that there would be a catch if using a
solution based on AdjustUpgrade.pm as the database name used for the
test runs differs between configure and meson:
- configure relies on USE_MODULE_DB for the database name unicity, that
would build a database name based on the *first* entry of REGRESS, that
lists all the SQL tests.
- meson relies on a "name" field.
For example, for the test module "gin", the regression database is named
"regression_gin" under meson, while it is more complex for configure, as
of "contrib_regression_gin_incomplete_splits". So a AdjustUpgrade.pm
would need a set of DROP DATABASE IF EXISTS to solve this issue, to cope
with each build system.
The failure has been caused by d52c24b0f8, and the problem can happen
with upgrade dumps from v17 and v18 to HEAD. This problem is not
currently reachable in the back-branches, but it could be possible that
a future change in injection_points in stable branches invalidates this
theory, so this commit is applied down to v17 in the test modules that
matter.
Per discussion with Tom Lane and Heikki Linnakangas.
Discussion: https://postgr.es/m/2899652.1765167313@sss.pgh.pa.us
Backpatch-through: 17
The warning was showing up in the early stages of the meson build, when
the contents of Makefile.global is generated based on the configuration
of meson for PGXS.
NM is added to pgxs_empty. This declaration is only used internally for
the libpq sanity check, so there is no point in exposing it in PGXS.
Oversight in 4a8e6f43a6.
Reported-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com>
Discussion: https://postgr.es/m/4423e01f-1e52-4f47-a6ca-05cc8081c888@eisentraut.org
A comment in tuplesort.c was claiming that the code was defining
INITIAL_MEMTUPSIZE so that it *does not* exceed
ALLOCSET_SEPARATE_THRESHOLD, but the code actually ensures that we
purposefully *do* exceed ALLOCSET_SEPARATE_THRESHOLD for the initial
allocation of the tuples array, as per reasons detailed in the
commentary of grow_memtuples().
Also, there's not much need to repeat the mention about
ALLOCSET_SEPARATE_THRESHOLD in each location where INITIAL_MEMTUPSIZE is
used, so remove those comments.
Author: ChangAo Chen <cca5507@qq.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: David G. Johnston <david.g.johnston@gmail.com>
Discussion: https://postgr.es/m/tencent_6FA14F85D6B5B5291532D6789E07F4765C08%40qq.com
The idea is to encourage more the use of these new routines across the
tree, as these offer stronger type safety guarantees than palloc().
This batch of changes includes most of the trivial changes suggested by
the author for src/backend/.
A total of 334 files are updated here. Among these files, 48 of them
have their build change slightly; these are caused by line number
changes as the new allocation formulas are simpler, shaving around 100
lines of code in total.
Similar work has been done in 0c3c5c3b06 and 31d3847a37.
Author: David Geier <geidav.pg@gmail.com>
Discussion: https://postgr.es/m/ad0748d4-3080-436e-b0bc-ac8f86a3466a@gmail.com
PostgreSQL's src/port/open.c has always set bInheritHandle = TRUE
when opening files on Windows, making all file descriptors inheritable
by child processes. This meant the O_CLOEXEC flag, added to many call
sites by commit 1da569ca1f (v16), was silently ignored.
The original commit included a comment suggesting that our open()
replacement doesn't create inheritable handles, but it was a mis-
understanding of the code path. In practice, the code was creating
inheritable handles in all cases.
This hasn't caused widespread problems because most child processes
(archive_command, COPY PROGRAM, etc.) operate on file paths passed as
arguments rather than inherited file descriptors. Even if a child
wanted to use an inherited handle, it would need to learn the numeric
handle value, which isn't passed through our IPC mechanisms.
Nonetheless, the current behavior is wrong. It violates documented
O_CLOEXEC semantics, contradicts our own code comments, and makes
PostgreSQL behave differently on Windows than on Unix. It also creates
potential issues with future code or security auditing tools.
To fix, define O_CLOEXEC to _O_NOINHERIT in master, previously used by
O_DSYNC. We use different values in the back branches to preserve
existing values. In pgwin32_open_handle() we set bInheritHandle
according to whether O_CLOEXEC is specified, for the same atomic
semantics as POSIX in multi-threaded programs that create processes.
Backpatch-through: 16
Author: Bryan Green <dbryan.green@gmail.com>
Co-authored-by: Thomas Munro <thomas.munro@gmail.com> (minor adjustments)
Discussion: https://postgr.es/m/e2b16375-7430-4053-bda3-5d2194ff1880%40gmail.com
Presently, the "echo" and "quiet" variables are carted around to
various functions, which is a bit tedious. To simplify things,
this commit moves them into the vacuumingOptions struct and removes
the related function parameters. While at it, remove some
redundant initialization code in vacuumdb's main() function.
This is preparatory work for a follow-up commit that will add a
--dry-run option to vacuumdb.
Reviewed-by: Corey Huinker <corey.huinker@gmail.com>
Reviewed-by: Kirill Reshke <reshkekirill@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/CADkLM%3DckHkX7Of5SrK7g0LokPUwJ%3Dkk8JU1GXGF5pZ1eBVr0%3DQ%40mail.gmail.com
POSIX has for a long time defined the "j" length modifier for
printf conversions as meaning the size of intmax_t or uintmax_t.
We got away without supporting that so far, because we were not
using intmax_t anywhere. However, commit e6be84356 re-introduced
upstream's use of intmax_t and PRIdMAX into zic.c. It emerges
that on some platforms (at least FreeBSD and macOS), <inttypes.h>
defines PRIdMAX as "jd", so that snprintf.c falls over if that is
used. (We hadn't noticed yet because it would only be apparent
if bad data is fed to zic, resulting in an error report, and even
then the only visible symptom is a missing line number in the
error message.)
We could revert that decision from our copy of zic.c, but
on the whole it seems better to update snprintf.c to support
this standard modifier. There might well be extensions,
now or in future, that expect it to work.
I did this in the lazy man's way of translating "j" to either
"l" or "ll" depending on a compile-time sizeof() check, just
as was done long ago to support "z" for size_t. One could
imagine promoting intmax_t to have full support in snprintf.c,
for example converting fmtint()'s value argument and internal
arithmetic to use [u]intmax_t not [unsigned] long long. But
that'd be more work and I'm hesitant to do it anyway: if there
are any platforms out there where intmax_t is actually wider
than "long long", this would doubtless result in a noticeable
speed penalty to snprintf(). Let's not go there until we have
positive evidence that there's a reason to, and some way to
measure what size of penalty we're taking.
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/3210703.1765236740@sss.pgh.pa.us
This eliminates MultiXactOffset wraparound and the 2^32 limit on the
total number of multixid members. Multixids are still limited to 2^31,
but this is a nice improvement because 'members' can grow much faster
than the number of multixids. On such systems, you can now run longer
before hitting hard limits or triggering anti-wraparound vacuums.
Not having to deal with MultiXactOffset wraparound also simplifies the
code and removes some gnarly corner cases.
We no longer need to perform emergency anti-wraparound freezing
because of running out of 'members' space, so the offset stop limit is
gone. But you might still not want 'members' to consume huge amounts
of disk space. For that reason, I kept the logic for lowering vacuum's
multixid freezing cutoff if a large amount of 'members' space is
used. The thresholds for that are roughly the same as the "safe" and
"danger" thresholds used before, 2 billion transactions and 4 billion
transactions. This keeps the behavior for the freeze cutoff roughly
the same as before. It might make sense to make this smarter or
configurable, now that the threshold is only needed to manage disk
usage, but that's left for the future.
Add code to pg_upgrade to convert multitransactions from the old to
the new format, rewriting the pg_multixact SLRU files. Because
pg_upgrade now rewrites the files, we can get rid of some hacks we had
put in place to deal with old bugs and upgraded clusters. Bump catalog
version for the pg_multixact/offsets format change.
Author: Maxim Orlov <orlovmg@gmail.com>
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>
Reviewed-by: wenhui qiu <qiuwenhuifx@gmail.com>
Discussion: https://www.postgresql.org/message-id/CACG%3DezaWg7_nt-8ey4aKv2w9LcuLthHknwCawmBgEeTnJrJTcw@mail.gmail.com
The description of deferrable constraints in create_table.sgml states
that deferrable constraints cannot be used as conflict arbitrators in
an INSERT with an ON CONFLICT DO UPDATE clause, but in fact this
restriction applies to all ON CONFLICT clauses, not just those with DO
UPDATE. Fix this, and while at it, change the word "arbitrators" to
"arbiters", to match the terminology used elsewhere.
Author: Dean Rasheed <dean.a.rasheed@gmail.com>
Discussion: https://postgr.es/m/CAEZATCWsybvZP3ce8rGcVNx-QHuDOJZDz8y=p1SzqHwjRXyV4Q@mail.gmail.com
Backpatch-through: 14
query_is_distinct_for() is intended to determine whether a query never
returns duplicates of the specified columns. For queries using
grouping sets, if there are no grouping expressions, the query may
contain one or more empty grouping sets. The goal is to detect
whether there is exactly one empty grouping set, in which case the
query would return a single row and thus be distinct.
The previous logic in query_is_distinct_for() was incomplete because
the check was insufficiently thorough and could return false when it
could have returned true. It failed to consider cases where the
DISTINCT clause is used on the GROUP BY, in which case duplicate empty
grouping sets are removed, leaving only one. It also did not
correctly handle all possible structures of GroupingSet nodes that
represent a single empty grouping set.
To fix, add a check for the groupDistinct flag, and expand the query's
groupingSets tree into a flat list, then verify that the expanded list
contains only one element.
No backpatch as this could result in plan changes.
Author: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/CAMbWs480Z04NtP8-O55uROq2Zego309+h3hhaZhz6ztmgWLEBw@mail.gmail.com
Similar to the issue with constraint and statistics expressions fixed
in 317c117d6, index expressions and predicate can also suffer from
incorrect reduction of NullTest clauses during const-simplification,
due to unfixed varnos and the use of a NULL root. It has been
reported that this issue can cause the planner to fail to pick up a
partial index that it previously matched successfully.
Because we need to cache the const-simplified index expressions and
predicate in the relcache entry, we cannot fix the Vars before
applying eval_const_expressions. To ensure proper reduction of
NullTest clauses, this patch runs eval_const_expressions a second time
-- after the Vars have been fixed and with a valid root.
It could be argued that the additional call to eval_const_expressions
might increase planning time, but I don't think that's a concern. It
only runs when index expressions and predicate are present; it is
relatively cheap when run on small expression trees (which is
typically the case for index expressions and predicate), and it runs
on expressions that have already been const-simplified once, making
the second pass even cheaper. In return, in cases like the one
reported, it allows the planner to match and use partial indexes,
which can lead to significant execution-time improvements.
Bug: #19007
Reported-by: Bryan Fox <bryfox@gmail.com>
Author: Richard Guo <guofenglinux@gmail.com>
Discussion: https://postgr.es/m/19007-4cc6e252ed8aa54a@postgresql.org
Previously, the slotsync worker relied on SIGINT for graceful shutdown
during promotion. However, SIGINT is also used by the LOCK_TIMEOUT handler
to cancel queries. Since the slotsync worker can lock catalog tables while
parsing libpq tuples, this overlap caused it to ignore LOCK_TIMEOUT
signals and potentially wait indefinitely on locks.
This patch replaces the slotsync worker's SIGINT handler with
StatementCancelHandler to correctly process query-cancel interrupts.
Additionally, the startup process now uses SIGUSR1 to signal the slotsync
worker to stop during promotion. The worker exits after detecting that the
shared memory flag stopSignaled is set.
Author: Hou Zhijie <houzj.fnst@fujitsu.com>
Reviewed-by: shveta malik <shveta.malik@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Backpatch-through: 17, here it was introduced
Discussion: https://postgr.es/m/TY4PR01MB169078F33846E9568412D878C94A2A@TY4PR01MB16907.jpnprd01.prod.outlook.com
The idea is to encourage more the use of these new routines across the
tree, as these offer stronger type safety guarantees than palloc().
The following paths are included in this batch, treating all the areas
proposed by the author for the most trivial changes, except src/backend
(by far the largest batch):
src/bin/
src/common/
src/fe_utils/
src/include/
src/pl/
src/test/
src/tutorial/
Similar work has been done in 31d3847a37.
The code compiles the same before and after this commit, with the
following exceptions due to changes in line numbers because some of the
new allocation formulas are shorter:
blkreftable.c
pgfnames.c
pl_exec.c
Author: David Geier <geidav.pg@gmail.com>
Discussion: https://postgr.es/m/ad0748d4-3080-436e-b0bc-ac8f86a3466a@gmail.com
This commit refactors the sanity check done by libpq to ensure that
there is no exit() reference in the build, moving the check from a
standalone Makefile rule to a perl script.
Platform-specific checks are now part of the script, avoiding most of
the duplication created by the introduction of this check for meson, but
not all of them:
- Solaris and Windows skipped in the script.
- Whitelist of symbols is in the script.
- nm availability, with its path given as an option of the script. Its
execution is checked in the script.
- Check is disabled if coverage reports are enabled. This part is not
pushed down to the script.
- Check is disabled for static builds of libpq. This part is filtered
out in each build script.
A trick is required for the stamp file, in the shape of an optional
argument that can be given to the script. Meson expects the stamp in
output and uses this argument, generating the stamp file in the script.
Meson is able to handle the removal of the stamp file internally when
libpq needs to be rebuilt and the check done again.
This refactoring piece has come up while discussing the addition of more
items in the symbols considered as acceptable.
This sanity check has never been run by meson since its introduction in
dc227eb82e, so it is possible that this fails in some of the buildfarm
members. At least the CI is happy with it, but let's see how it goes.
Author: Nazir Bilal Yavuz <byavuz81@gmail.com>
Co-authored-by: VASUKI M <vasukim1992002@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/19095-6d8256d0c37d4be2@postgresql.org
The argument of isspace() (like other <ctype.h> functions)
must be cast to unsigned char to ensure portable results.
Per NetBSD buildfarm members. Oversight in 636c1914b.
Make _bt_readpage pass down the current scan direction to various
utility functions within its pstate variable. Also have _bt_readpage
work off of a local copy of scan->ignore_killed_tuples within its
per-tuple loop (rather than using scan->ignore_killed_tuples directly).
Testing has shown that this significantly benefits large range scans,
which are naturally able to take full advantage of the pstate.startikey
optimization added by commit 8a510275. Running a pgbench script with a
"SELECT abalance FROM pgbench_accounts WHERE aid BETWEEN ..." query
shows an increase in transaction throughput of over 5%. There also
appears to be a small performance benefit when running pgbench's
built-in select-only script.
Follow-up to commit 65d6acbc.
Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Victor Yegorov <vyegorov@gmail.com>
Discussion: https://postgr.es/m/CAH2-WzmwMwcwKFgaf+mYPwiz3iL4AqpXnwtW_O0vqpWPXRom9Q@mail.gmail.com
Quite a bit of code within nbtutils.c is only called by _bt_readpage.
Move _bt_readpage and all of the nbtutils.c functions it depends on into
a new .c file, nbtreadpage.c. Also reorder some of the functions within
the new file for clarity.
This commit has no functional impact. It is strictly mechanical.
Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Victor Yegorov <vyegorov@gmail.com>
Discussion: https://postgr.es/m/CAH2-WzmwMwcwKFgaf+mYPwiz3iL4AqpXnwtW_O0vqpWPXRom9Q@mail.gmail.com
Currently, we use special values that are otherwise invalid for each
option to indicate "option was not given". Replace that with separate
boolean variables for each option. It seems more clear to be explicit.
We were already doing that for the -m option, because there were no
invalid values for nextMulti that we could use (since commit
94939c5f3a).
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://www.postgresql.org/message-id/81adf5f3-36ad-4bcd-9ba5-1b95c7b7a807@iki.fi
The strtoul() function that we used to parse many of the options
accepts negative values, and silently wraps them to the equivalent
unsigned values. For example, -1 becomes 0xFFFFFFFF, on platforms
where unsigned long is 32 bits wide. Also, on platforms where
"unsigned long" is 64 bits wide, we silently casted values larger than
UINT32_MAX to the equivalent 32-bit value. Both of those behaviors
seem undesirable, so tighten up the parsing to reject them.
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://www.postgresql.org/message-id/81adf5f3-36ad-4bcd-9ba5-1b95c7b7a807@iki.fi
When parse.pl processes braces, it does not take into account that
braces could also be their own token if single quoted ('{', '}').
This is not currently used but a future patch wants to make use of it.
This fixes that by using lookaround assertions to detect the quotes.
To make sure all Perl versions in play support this and to avoid
surprises later on, let's give this a spin on the buildfarm now. It
can exist independently of future work.
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/a855795d-e697-4fa5-8698-d20122126567@eisentraut.org
The code in BootStrapXLOG() and in pg_test_fsync.c tried to align WAL
buffers in complicated ways. Also, they still used XLOG_BLCKSZ for
the alignment, even though that should now be PG_IO_ALIGN_SIZE. This
can now be simplified and made more consistent by using
PGAlignedXLogBlock, either directly in BootStrapXLOG() and using
alignas in pg_test_fsync.c.
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/f462a175-b608-44a1-b428-bdf351e914f4%40eisentraut.org