Commit f7f70d5e2 left one inconsistency behind: we're still creating
pg_type entries for the composite types of sequences and toast tables,
but not arrays over those composites. But there seems precious little
reason to have named composite types for toast tables, and not much more
to have them for sequences (especially given the thought that sequences
may someday not be standalone relations at all).
So, let's close that inconsistency by removing these composite types,
rather than adding arrays for them. This buys back a little bit of
the initial pg_type bloat added by the previous patch, and could be
a significant savings in a large database with many toast tables.
Aside from a small logic rearrangement in heap_create_with_catalog,
this patch mostly needs to clean up some places that were assuming that
pg_class.reltype always has a valid value. Those are really pre-existing
bugs, given that it's documented otherwise; notably, the plpgsql changes
fix code that gives "cache lookup failed for type 0" on indexes today.
But none of these seem interesting enough to back-patch.
Also, remove the pg_dump/pg_upgrade infrastructure for propagating
a toast table's pg_type OID into the new database, since we no longer
need that.
Discussion: https://postgr.es/m/761F1389-C6A8-4C15-80CE-950C961F5341@gmail.com
The previous definition of the column was almost universally disliked,
so provide this updated definition which is more useful for monitoring
purposes: a large positive value is good, while zero or a negative value
means danger. This should be operationally more convenient.
Backpatch to 13, where the new column to pg_replication_slots (and the
feature it represents) were added.
Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reported-by: Fujii Masao <masao.fujii@oss.nttdata.com>
Discussion: https://postgr.es/m/9ddfbf8c-2f67-904d-44ed-cf8bc5916228@oss.nttdata.com
Enabling pg_stat_statements.track_plaanning may incur a noticeable
performance penalty, especially when a fewer kinds of queries are executed
on many concurrent connections. This commit documents this note.
Back-patch to v13 where pg_stat_statements.track_plaanning was added.
Suggested-by: Pavel Stehule
Author: Fujii Masao
Reviewed-by: Pavel Stehule
Discussion: https://postgr.es/m/CAFj8pRC9Jxa8r5i0TNBWLb8mzuaYzEoLq3QOvip0jVpHPOLbVA@mail.gmail.com
After running GetForeignRelSize for a foreign table, adjust rel->tuples
to be at least as large as rel->rows. This prevents bizarre behavior
in estimate_num_groups() and perhaps other places, especially in the
scenario where rel->tuples is zero because pg_class.reltuples is
(suggesting that ANALYZE has never been run for the table). As things
stood, we'd end up estimating one group out of any GROUP BY on such a
table, whereas the default group-count estimate is more likely to result
in a sane plan.
Also, clarify in the documentation that GetForeignRelSize has the option
to override the rel->tuples value if it has a better idea of what to use
than what is in pg_class.reltuples.
Per report from Jeff Janes. Back-patch to all supported branches.
Patch by me; thanks to Etsuro Fujita for review
Discussion: https://postgr.es/m/CAMkU=1xNo9cnan+Npxgz0eK7394xmjmKg-QEm8wYG9P5-CcaqQ@mail.gmail.com
Previously the document explained that restart_lsn indicates the LSN of
oldest WAL won't be automatically removed during checkpoints. But
since v13 this was no longer true thanks to max_slot_wal_keep_size.
Back-patch to v13 where max_slot_wal_keep_size was added.
Author: Fujii Masao
Discussion: https://postgr.es/m/6497f1e9-3148-c5da-7e49-b2fddad9a42f@oss.nttdata.com
Since v13 pg_stat_statements is allowed to track the planning time of
statements when track_planning option is enabled. Its default was on.
But this feature could cause more terrible spinlock contentions in
pg_stat_statements. As a result of this, Robins Tharakan reported that
v13 beta1 showed ~45% performance drop at high DB connection counts
(when compared with v12.3) during fully-cached SELECT-only test using
pgbench.
To avoid this performance regression by the default setting,
this commit changes default of pg_stat_statements.track_planning to off.
Back-patch to v13 where pg_stat_statements.track_planning was introduced.
Reported-by: Robins Tharakan
Author: Fujii Masao
Reviewed-by: Julien Rouhaud
Discussion: https://postgr.es/m/2895b53b033c47ccb22972b589050dd9@EX13D05UWC001.ant.amazon.com
The IANA tzcode library has a feature to read a time zone file named
"posixrules" and apply the daylight-savings transition dates and times
therein, when it is given a POSIX-style time zone specification that
lacks an explicit transition rule. However, there's a problem with
that code: it doesn't work for dates past the Y2038 time_t rollover.
(Effectively, all times beyond that point are treated as standard
time.) The IANA crew regard this feature as legacy, so their plan is
to remove it not fix it. The time frame in which that will happen
is unclear, but presumably it'll happen well before 2038.
Moreover, effective with the next IANA data update (probably this
fall), the recommended default will be to not install a "posixrules"
file in the first place. The time frame in which tzdata packagers
might adopt that suggestion is likewise unclear, but at least some
platforms will probably do it in the next year or so. While we could
ignore that recommendation so far as PG-supplied tzdata trees are
concerned, builds using --with-system-tzdata will be subject to
whatever the platform's tzdata packager decides to do.
Thus, whether or not we do anything, some increasing fraction of
Postgres users will be exposed to the behavior observed when there
is no "posixrules" file; and if we do nothing, we'll have essentially
no control over the timing of that change.
The best thing to do to ameliorate the uncertainty seems to be to
proactively remove the posixrules-reading feature. If we do that in
a scheduled release then at least we can release-note the behavioral
change, rather than having users be surprised by it after a routine
tzdata update.
The change in question is fairly minor anyway: to be affected,
you have to be using a POSIX-style timezone spec, it has to not
have an explicit rule, and it has to not be one of the four traditional
continental-USA zone names (EST5EDT, CST6CDT, MST7MDT, or PST8PDT),
as those are special-cased. Since the default "posixrules" file
provides USA DST rules, the number of people who are likely to find
such a zone spec useful is probably quite small. Moreover, the
fallback behavior with no explicit rule and no "posixrules" file is to
apply current USA rules, so the only thing that really breaks is the
DST transitions in years before 2007 (and you get the countervailing
fix that transitions after 2038 will be applied).
Now, some installations might have replaced the "posixrules" file,
allowing e.g. EU rules to be applied to a POSIX-style timezone spec.
That won't work anymore. But it's not exactly clear why this solution
would be preferable to using a regular named zone. In any case, given
the Y2038 issue, we need to be pushing users to stop depending on this.
Back-patch into v13; it hasn't been released yet, so it seems OK to
change its behavior. (Personally I think we ought to back-patch
further, but I've been outvoted.)
Discussion: https://postgr.es/m/1390.1562258309@sss.pgh.pa.us
Discussion: https://postgr.es/m/20200621211855.6211-1-eggert@cs.ucla.edu
SQL:1999 had syntax
SUBSTRING(text FROM pattern FOR escapechar)
but this was replaced in SQL:2003 by the more clear
SUBSTRING(text SIMILAR pattern ESCAPE escapechar)
but this was never implemented in PostgreSQL. This patch adds that
new syntax as an alternative in the parser, and updates documentation
and tests to indicate that this is the preferred alternative now.
Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com>
Reviewed-by: Vik Fearing <vik@postgresfriends.org>
Reviewed-by: Fabien COELHO <coelho@cri.ensmp.fr>
Discussion: https://www.postgresql.org/message-id/flat/a15db31c-d0f8-8ce0-9039-578a31758adb%402ndquadrant.com
Warnings start 10M transactions before xidStopLimit, which is 11M
transactions before wraparound. The sample WARNING output showed a
value greater than 11M, and its HINT message predated commit
25ec228ef760eb91c094cc3b6dea7257cc22ffb5. Hence, the sample was
impossible. Back-patch to 9.5 (all supported versions).
When we initially created this parameter, in commit ff8ca5fad, we left
the default as "allow any protocol version" on grounds of backwards
compatibility. However, that's inconsistent with the backend's default
since b1abfec82; protocol versions prior to 1.2 are not considered very
secure; and OpenSSL has had TLSv1.2 support since 2012, so the number
of PG servers that need a lesser minimum is probably quite small.
On top of those things, it emerges that some popular distros (including
Debian and RHEL) set MinProtocol=TLSv1.2 in openssl.cnf. Thus, far
from having "allow any protocol version" behavior in practice, what
we actually have as things stand is a platform-dependent lower limit.
So, change our minds and set the min version to TLSv1.2. Anybody
wanting to connect with a new libpq to a pre-2012 server can either
set ssl_min_protocol_version=TLSv1 or accept the fallback to non-SSL.
Back-patch to v13 where the aforementioned patches appeared.
Patch by me, reviewed by Daniel Gustafsson
Discussion: https://postgr.es/m/a9408304-4381-a5af-d259-e55d349ae4ce@2ndquadrant.com
In pg_replication_slot, change output from normal/reserved/lost to
reserved/extended/unreserved/ lost, which better expresses the possible
states particularly near the time where segments are no longer safe but
checkpoint has not run yet.
Under the new definition, reserved means the slot is consuming WAL
that's still under the normal WAL size constraints; extended means it's
consuming WAL that's being protected by wal_keep_segments or the slot
itself, whose size is below max_slot_wal_keep_size; unreserved means the
WAL is no longer safe, but checkpoint has not yet removed those files.
Such as slot is in imminent danger, but can still continue for a little
while and may catch up to the reserved WAL space.
Also, there were some bugs in the calculations used to report the
status; fixed those.
Backpatch to 13.
Reported-by: Fujii Masao <masao.fujii@oss.nttdata.com>
Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com>
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/20200616.120236.1809496990963386593.horikyota.ntt@gmail.com
Both INDEX_CLEANUP and TRUNCATE have been available since v12, and are
enabled by default except if respectively vacuum_index_cleanup and
vacuum_truncate are disabled for a given relation. This change adds
support for disabling these options from vacuumdb.
Author: Nathan Bossart
Reviewed-by: Michael Paquier, Masahiko Sawada
Discussion: https://postgr.es/m/6F7F17EF-B1F2-4681-8D03-BA96365717C0@amazon.com
Defining duplicates as "close by" to each other was unclear. Simplify
the definition.
Backpatch: 13-, where deduplication was introduced (by commit 0d861bbb)
911e7020770 added opclass options and adjusted documentation for each
particular affected opclass. However, documentation for extendability was
not adjusted. This commit adjusts documentation for interfaces of index AMs
and opclasses.
Discussion: https://postgr.es/m/CAH2-WzmQnW6%2Bz5F9AW%2BSz%2BzEcEvXofTwh_A9J3%3D_WA-FBP0wYg%40mail.gmail.com
Author: Alexander Korotkov
Reported-by: Peter Geoghegan
Reviewed-by: Peter Geoghegan
The IANA time zone folk have deprecated use of a "posixrules" file in
the tz database. While for now it's our choice whether to keep
supplying one in our own builds, installations built with
--with-system-tzdata will soon be needing to cope with that file not
being present, at least on some platforms.
This causes a problem for the horology test, which expected the
nonstandard POSIX zone spec "CST7CDT" to apply pre-2007 US daylight
savings rules. That does happen if the posixrules file supplies such
information, but otherwise the test produces undesired results.
To fix, add an explicit transition date rule that matches 2005 practice.
(We could alternatively have switched the test to use some real time
zone, but it seems useful to have coverage of this type of zone spec.)
While at it, update a documentation example that also relied on
"CST7CDT"; use a real-world zone name instead. Also, document why
the zone names EST5EDT, CST6CDT, MST7MDT, PST8PDT aren't subject to
similar failures when "posixrules" is missing.
Back-patch to all supported branches, since the hazard is the same
for all.
Discussion: https://postgr.es/m/1665379.1592581287@sss.pgh.pa.us
Mostly in response to Jürgen Purtz critique of previous definitions,
though I added many other changes.
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Jürgen Purtz <juergen@purtz.de>
Reviewed-by: Justin Pryzby <pryzby@telsasoft.com>
Reviewed-by: Erik Rijkers <er@xs4all.nl>
Discussion: https://postgr.es/m/c1e06008-2132-30f4-9b38-877e8683d418@purtz.de
We'd glossed over most of this complexity for years, but it's hard
to avoid writing it all down now, so that we can explain what happens
when there's no "posixrules" file in the IANA time zone database.
That was at best a tiny minority situation till now, but it's likely
to become quite common in the future, so we'd better explain it.
Nonetheless, we don't really encourage people to use POSIX zone specs;
picking a named zone is almost always what you really want, unless
perhaps you're stuck with an out-of-date zone database. Therefore,
let's shove all this detail into an appendix.
Patch by me; thanks to Robert Haas for help with some awkward wording.
Discussion: https://postgr.es/m/1390.1562258309@sss.pgh.pa.us
Don't use fread(), since that doesn't necessarily set errno. We could
use read() instead, but it's even better to use pg_pread(), which
allows us to avoid some extra calls to seek to the desired location in
the file.
Also, advertise a wait event while reading from a file, as we do for
most other places where we're reading data from files.
Patch by me, reviewed by Hamid Akhtar.
Discussion: http://postgr.es/m/CA+TgmobBw-3573vMosGj06r72ajHsYeKtksT_oTxH8XvTL7DxA@mail.gmail.com
Our documentation failed to point out that REPEATABLE READ is really
snapshot isolation, which might be important to some users. Point to
the standard reference paper for this complicated topic.
Likewise, add a reference to the VLDB paper about PostgreSQL SSI, for
technical information about our SSI implementation and how it compares
to S2PL.
While here, add a note about catalog access using a lower isolation
level, per recent user complaint.
Back-patch to all releases.
Reported-by: Kyle Kingsbury <aphyr@jepsen.io>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Peter Geoghegan <pg@bowt.ie>
Reviewed-by: Tatsuo Ishii <ishii@sraoss.co.jp>
Discussion: https://postgr.es/m/db7b729d-0226-d162-a126-8a8ab2dc4443%40jepsen.io
Discussion: https://postgr.es/m/16454-9408996bb1750faf%40postgresql.org
This patch removes the hardcoded check for superuser privileges when
executing replication origin functions. Instead, execution is revoked
from public, meaning that those functions can be executed by a superuser
and that access to them can be granted.
Author: Martín Marqués
Reviewed-by: Kyotaro Horiguchi, Michael Paquier, Masahiko Sawada
Discussion: https:/postgr.es/m/CAPdiE1xJMZOKQL3dgHMUrPqysZkgwzSMXETfKkHYnBAB7-0VRQ@mail.gmail.com
plpgsql has always executed the query given in a RETURN QUERY command
by opening it as a cursor and then fetching a few rows at a time,
which it turns around and dumps into the function's result tuplestore.
The point of this was to keep from blowing out memory with an oversized
SPITupleTable result (note that while a tuplestore can spill tuples
to disk, SPITupleTable cannot). However, it's rather inefficient, both
because of extra data copying and because of executor entry/exit
overhead. In recent versions, a new performance problem has emerged:
use of a cursor prevents use of a parallel plan for the executed query.
We can improve matters by skipping use of a cursor and having the
executor push result tuples directly into the function's result
tuplestore. However, a moderate amount of new infrastructure is needed
to make that idea work:
* We can use the existing tstoreReceiver.c DestReceiver code to funnel
executor output to the tuplestore, but it has to be extended to support
plpgsql's requirement for possibly applying a tuple conversion map.
* SPI needs to be extended to allow use of a caller-supplied
DestReceiver instead of its usual receiver that puts tuples into
a SPITupleTable. Two new API calls are needed to handle both the
RETURN QUERY and RETURN QUERY EXECUTE cases.
I also felt that I didn't want these new API calls to use the legacy
method of specifying query parameter values with "char" null flags
(the old ' '/'n' convention); rather they should accept ParamListInfo
objects containing the parameter type and value info. This required
a bit of additional new infrastructure since we didn't yet have any
parse analysis callback that would interpret $N parameter symbols
according to type data supplied in a ParamListInfo. There seems to be
no harm in letting makeParamList install that callback by default,
rather than leaving a new ParamListInfo's parserSetup hook as NULL.
(Indeed, as of HEAD, I couldn't find anyplace that was using the
parserSetup field at all; plpgsql was using parserSetupArg for its
own purposes, but parserSetup seemed to be write-only.)
We can actually get plpgsql out of the business of using legacy null
flags altogether, and using ParamListInfo instead of its ad-hoc
PreparedParamsData structure; but this requires inventing one more
SPI API call that can replace SPI_cursor_open_with_args. That seems
worth doing, though.
SPI_execute_with_args and SPI_cursor_open_with_args are now unused
anywhere in the core PG distribution. Perhaps someday we could
deprecate/remove them. But cleaning up the crufty bits of the SPI
API is a task for a different patch.
Per bug #16040 from Jeremy Smith. This is unfortunately too invasive to
consider back-patching. Patch by me; thanks to Hamid Akhtar for review.
Discussion: https://postgr.es/m/16040-eaacad11fecfb198@postgresql.org
Eliminate enable_groupingsets_hash_disk, which was primarily useful
for testing grouping sets that use HashAgg and spill. Instead, hack
the table stats to convince the planner to choose hashed aggregation
for grouping sets that will spill to disk. Suggested by Melanie
Plageman.
Rename enable_hashagg_disk to hashagg_avoid_disk_plan, and invert the
meaning of on/off. The new name indicates more strongly that it only
affects the planner. Also, the word "avoid" is less definite, which
should avoid surprises when HashAgg still needs to use the
disk. Change suggested by Justin Pryzby, though I chose a different
GUC name.
Discussion: https://postgr.es/m/CAAKRu_aisiENMsPM2gC4oUY1hHG3yrCwY-fXUg22C6_MJUwQdA%40mail.gmail.com
Discussion: https://postgr.es/m/20200610021544.GA14879@telsasoft.com
Backpatch-through: 13
In PostgreSQL 10, we stopped using System V semaphores on Linux
systems. Update the example we give of an error message from a
misconfigured system to show what people are most likely to see these
days.
Back-patch to 10, where PREFERRED_SEMAPHORES=UNNAMED_POSIX arrived.
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CA%2BhUKGLmJUSwybaPQv39rB8ABpqJq84im2UjZvyUY4feYhpWMw%40mail.gmail.com