For an EXISTS subquery, the only thing that matters is whether it
returns zero or more than zero rows. Therefore, we remove certain SQL
features that won't affect that, among them the GROUP BY clauses.
After we drop the groupClause, we'd better remove the RTE_GROUP RTE
and clear the hasGroupRTE flag, as they depend on the groupClause.
Failing to do so could result in a bogus RTE_GROUP entry in the parent
query, leading to an assertion failure on the hasGroupRTE flag.
Reported-by: David Rowley
Author: Richard Guo
Discussion: https://postgr.es/m/CAApHDvp2_yht8uPLyWO-kVGWZhYvx5zjGfSrg4fBQ9fsC13V0g@mail.gmail.com
Similar to the pg_set_*_stats() functions, except with a variadic
signature that's designed to be more future-proof. Additionally, most
problems are reported as WARNINGs rather than ERRORs, allowing most
stats to be restored even if some cannot.
These functions are intended to be called from pg_dump to avoid the
need to run ANALYZE after an upgrade.
Author: Corey Huinker
Discussion: https://postgr.es/m/CADkLM=eErgzn7ECDpwFcptJKOk9SxZEk5Pot4d94eVTZsvj3gw@mail.gmail.com
Enable manipulation of attribute statistics. Only superficial
validation is performed, so it's possible to add nonsense, and it's up
to the planner (or other users of statistics) to behave reasonably in
that case.
Bump catalog version.
Author: Corey Huinker
Discussion: https://postgr.es/m/CADkLM=eErgzn7ECDpwFcptJKOk9SxZEk5Pot4d94eVTZsvj3gw@mail.gmail.com
... to fix bugs when the referenced table is partitioned.
The catalog representation we chose for foreign keys connecting
partitioned tables (in commit f56f8f8da6) is inconvenient, in the
sense that a standalone table has a different way to represent the
constraint when referencing a partitioned table, than when the same
table becomes a partition (and vice versa). Because of this, we need to
create additional catalog rows on detach (pg_constraint and pg_trigger),
and remove them on attach. We were doing some of those things, but not
all of them, leading to missing catalog rows in certain cases.
The worst problem seems to be that we are missing action triggers after
detaching a partition, which means that you could update/delete rows
from the referenced partitioned table that still had referencing rows on
that table, the server failing to throw the required errors.
!!!
Note that this means existing databases with FKs that reference
partitioned tables might have rows that break relational integrity, on
tables that were once partitions on the referencing side of the FK.
Another possible problem is that trying to reattach a table
that had been detached would fail indicating that internal triggers
cannot be found, which from the user's point of view is nonsensical.
In branches 15 and above, we fix this by creating a new helper function
addFkConstraint() which is in charge of creating a standalone
pg_constraint row, and repurposing addFkRecurseReferencing() and
addFkRecurseReferenced() so that they're only the recursive routine for
each side of the FK, and they call addFkConstraint() to create
pg_constraint at each partitioning level and add the necessary triggers.
These new routines can be used during partition creation, partition
attach and detach, and foreign key creation. This reduces redundant
code and simplifies the flow.
In branches 14 and 13, we have a much simpler fix that consists on
simply removing the constraint on detach. The reason is that those
branches are missing commit f4566345cf, which reworked the way this
works in a way that we didn't consider back-patchable at the time.
We opted to leave branch 12 alone, because it's different from branch 13
enough that the fix doesn't apply; and because it is going in EOL mode
very soon, patching it now might be worse since there's no way to undo
the damage if it goes wrong.
Existing databases might need to be repaired.
In the future we might want to rethink the catalog representation to
avoid this problem, but for now the code seems to do what's required to
make the constraints operate correctly.
Co-authored-by: Jehan-Guillaume de Rorthais <jgdr@dalibo.com>
Co-authored-by: Tender Wang <tndrwang@gmail.com>
Co-authored-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Reported-by: Guillaume Lelarge <guillaume@lelarge.info>
Reported-by: Jehan-Guillaume de Rorthais <jgdr@dalibo.com>
Reported-by: Thomas Baehler (SBB CFF FFS) <thomas.baehler2@sbb.ch>
Discussion: https://postgr.es/m/20230420144344.40744130@karst
Discussion: https://postgr.es/m/20230705233028.2f554f73@karst
Discussion: https://postgr.es/m/GVAP278MB02787E7134FD691861635A8BC9032@GVAP278MB0278.CHEP278.PROD.OUTLOOK.COM
Discussion: https://postgr.es/m/18541-628a61bc267cd2d3@postgresql.org
If the query is rewritten into a NOTIFY command by a DO INSTEAD
rule, we'd get an assertion failure, or in non-assert builds
issue a rather confusing error message. Improve that.
Also fix a longstanding grammar mistake in a nearby error message.
Per bug #18664 from Alexander Lakhin. Back-patch to all supported
branches.
Tender Wang and Tom Lane
Discussion: https://postgr.es/m/18664-ffd0ebc2386598df@postgresql.org
The decision in b6e1157e7 to ignore raw_expr when evaluating a
JsonValueExpr was incorrect. While its value is not ultimately
used (since formatted_expr's value is), failing to initialize it
can lead to problems, for instance, when the expression tree in
raw_expr contains Aggref nodes, which must be initialized to
ensure the parent Agg node works correctly.
Also, optimize eval_const_expressions_mutator()'s handling of
JsonValueExpr a bit. Currently, when formatted_expr cannot be folded
into a constant, we end up processing it twice -- once directly in
eval_const_expressions_mutator() and again recursively via
ece_generic_processing(). This recursive processing is required to
handle raw_expr. To avoid the redundant processing of formatted_expr,
we now process raw_expr directly in eval_const_expressions_mutator().
Finally, update the comment of JsonValueExpr to describe the roles of
raw_expr and formatted_expr more clearly.
Bug: #18657
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Diagnosed-by: Fabio R. Sluzala <fabio3rs@gmail.com>
Diagnosed-by: Tender Wang <tndrwang@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/18657-1b90ccce2b16bdb8@postgresql.org
Backpatch-through: 16
adf97c156 made it so ExprStates could support hashing and changed Hash
Join to use that instead of manually extracting Datums from tuples and
hashing them one column at a time.
When hashing multiple columns or expressions, the code added in that
commit stored the intermediate hash value in the ExprState's resvalue
field. That was a mistake as steps may be injected into the ExprState
between each hashing step that look at or overwrite the stored
intermediate hash value. EEOP_PARAM_SET is an example of such a step.
Here we fix this by adding a new dedicated field for storing
intermediate hash values and adjust the code so that all apart from the
final hashing step store their result in the intermediate field.
In passing, rename a variable so that it's more aligned to the
surrounding code and also so a few lines stay within the 80 char margin.
Reported-by: Andres Freund
Reviewed-by: Alena Rybakina <a.rybakina@postgrespro.ru>
Discussion: https://postgr.es/m/CAApHDvqo9eenEFXND5zZ9JxO_k4eTA4jKMGxSyjdTrsmYvnmZw@mail.gmail.com
This commit adds missing checks for COPY FORCE_NOT_NULL and FORCE_NULL
when applied to all columns via "*". These options now correctly
require CSV mode and are disallowed in COPY TO, making their behavior
consistent with FORCE_QUOTE.
Some regression tests are added to verify the correct behavior for the
all-columns case, including FORCE_QUOTE, which was not tested.
Backpatch down to 17, where support for the all-column grammar with
FORCE_NOT_NULL and FORCE_NULL has been added.
Author: Joel Jacobson
Reviewed-by: Zhang Mingli
Discussion: https://postgr.es/m/65030d1d-5f90-4fa4-92eb-f5f50389858e@app.fastmail.com
Backpatch-through: 17
Some queries in copy2 are there to check various option combinations,
and used "stdin" or "stdout" incompatible with the COPY TO or FROM
clauses combined with them, which was confusing. This commit rewrites
these queries to use a compatible grammar.
The coverage of the tests is unchanged. Like the original commit
451d1164b9, backpatch down to 16 where these have been introduced. A
follow-up commit will rely on this area of the tests for a bug fix.
Author: Joel Jacobson
Reviewed-by: Zhang Mingli
Discussion: https://postgr.es/m/65030d1d-5f90-4fa4-92eb-f5f50389858e@app.fastmail.com
Backpatch-through: 16
find_computable_ec_member() had the wrong mental model of what
its primary caller prepare_sort_from_pathkeys() would do with
the selected EquivalenceClass member expression. We will not
compute the EC expression in a plan node atop the one returning
the passed-in targetlist; rather, the EC expression will be
computed as an additional column of that targetlist. So any
Var or quasi-Var used in the given tlist is also available to the
EC expression. In simple cases this makes no difference because
the given tlist is just a list of Vars or quasi-Vars --- but if
we are considering an appendrel member produced by flattening
a UNION ALL, the tlist may contain expressions, resulting in
failure to match and a "could not find pathkey item to sort"
error.
To fix, we can flatten both the tlist and the EC members with
pull_var_clause(), and then just check for subset-ness, so
that the code is actually shorter than before.
While this bug is quite old, the present patch only works back to
v13. We could possibly make it work in v12 by back-patching parts
of 375398244. On the whole though I don't like the risk/reward
ratio of that idea. v12's final release is next month, meaning
there would be no chance to correct matters if the patch causes a
regression. Since this failure has escaped notice for 14 years,
it's likely nobody will hit it in the field with v12.
Per bug #18652 from Alexander Lakhin.
Andrei Lepikhov and Tom Lane
Discussion: https://postgr.es/m/18652-deaa782ebcca85d1@postgresql.org
This function returns the name, size, and last modification time of
each regular file in pg_wal/summaries. This allows administrators
to grant privileges to view the contents of this directory without
granting privileges on pg_ls_dir(), which allows listing the
contents of many other directories. This commit also gives the
pg_monitor predefined role EXECUTE privileges on the new
pg_ls_summariesdir() function.
Bumps catversion.
Author: Yushi Ogiwara
Reviewed-by: Michael Paquier, Fujii Masao
Discussion: https://postgr.es/m/a0a3af15a9b9daa107739eb45aa9a9bc%40oss.nttdata.com
The name extracted from the record of the GUC tables is applied to more
internal places of guc.c. This change has the advantage to simplify
parse_and_validate_value(), where the "name" was only used in elog
messages, while it was required to match with the name from the GUC
record.
pg_parameter_aclcheck() now passes the name of the GUC from its record
in two places rather than the caller's argument. The value given to
this function goes through convert_GUC_name_for_parameter_acl() that
does a simple ASCII downcasing.
Few GUCs mix character casing in core; one test is added for one of
these code paths with "IntervalStyle".
Author: Peter Smith, Michael Paquier
Discussion: https://postgr.es/m/ZwNh4vkc2NHJHnND@paquier.xyz
In some cases, we may want to transfer a HAVING clause into WHERE in
hopes of eliminating tuples before aggregation instead of after.
Previously, we couldn't do this if there were any nonempty grouping
sets, because we didn't have a way to tell if the HAVING clause
referenced any columns that were nullable by the grouping sets, and
moving such a clause into WHERE could potentially change the results.
Now, with expressions marked nullable by grouping sets with the RT
index of the RTE_GROUP RTE, it is much easier to identify those
clauses that reference any nullable-by-grouping-sets columns: we just
need to check if the RT index of the RTE_GROUP RTE is present in the
clause. For other HAVING clauses, they can be safely pushed down.
Author: Richard Guo
Discussion: https://postgr.es/m/CAMbWs4-NpzPgtKU=hgnvyn+J-GanxQCjrUi7piNzZ=upiCV=2Q@mail.gmail.com
For a mergejoin, if the given outer path or inner path is not already
well enough ordered, we need to do an explicit sort. Currently, we
only consider explicit full sort and do not account for incremental
sort.
In this patch, for the outer path of a mergejoin, we choose to use
explicit incremental sort if it is enabled and there are presorted
keys. For the inner path, though, we cannot use incremental sort
because it does not support mark/restore at present.
The rationale is based on the assumption that incremental sort is
always faster than full sort when there are presorted keys, a premise
that has been applied in various parts of the code. In addition, the
current cost model tends to favor incremental sort as being cheaper
than full sort in the presence of presorted keys, making it reasonable
not to consider full sort in such cases.
It could be argued that what if a mergejoin with an incremental sort
as the outer path is selected as the inner path of another mergejoin.
However, this should not be a problem, because mergejoin itself does
not support mark/restore either, and we will add a Material node on
top of it anyway in this case (see final_cost_mergejoin).
There is one ensuing plan change in the regression tests, and we have
to modify that test case to ensure that it continues to test what it
is intended to.
No backpatch as this could result in plan changes.
Author: Richard Guo
Reviewed-by: David Rowley, Tomas Vondra
Discussion: https://postgr.es/m/CAMbWs49x425QrX7h=Ux05WEnt8GS757H-jOP3_xsX5t1FoUsZw@mail.gmail.com
Previously, when ON_ERROR was set to 'ignore', the COPY command
would skip all rows with data type conversion errors, with no way to
limit the number of skipped rows before failing.
This commit introduces the REJECT_LIMIT option, allowing users to
specify the maximum number of erroneous rows that can be skipped.
If more rows encounter data type conversion errors than allowed by
REJECT_LIMIT, the COPY command will fail with an error, even when
ON_ERROR = 'ignore'.
Author: Atsushi Torikoshi
Reviewed-by: Junwang Zhao, Kirill Reshke, jian he, Fujii Masao
Discussion: https://postgr.es/m/63f99327aa6b404cc951217fa3e61fe4@oss.nttdata.com
Commit 6aa44060a3 removed pg_authid's TOAST table because the only
varlena column is rolpassword, which cannot be de-TOASTed during
authentication because we haven't selected a database yet and
cannot read pg_class. Since that change, attempts to set password
hashes that require out-of-line storage will fail with a "row is
too big" error. This error message might be confusing to users.
This commit places a limit on the length of password hashes so that
attempts to set long password hashes will fail with a more
user-friendly error. The chosen limit of 512 bytes should be
sufficient to avoid "row is too big" errors independent of BLCKSZ,
but it should also be lenient enough for all reasonable use-cases
(or at least all the use-cases we could imagine).
Reviewed-by: Tom Lane, Jonathan Katz, Michael Paquier, Jacob Champion
Discussion: https://postgr.es/m/89e8649c-eb74-db25-7945-6d6b23992394%40gmail.com
When instantiating an existing partitioned index for a new child
partition, we use generateClonedIndexStmt to build a suitable
IndexStmt to pass to DefineIndex. However, when DefineIndex needs
to recurse to instantiate a newly created partitioned index on an
existing child partition, it was doing copyObject on the given
IndexStmt and then applying a bunch of ad-hoc fixups. This has
a number of problems, primarily that it implies fresh lookups of
referenced objects such as opclasses and collations. Since commit
2af07e2f7 caused DefineIndex to restrict search_path internally, those
lookups could fail or deliver different results than the original one.
We can avoid those problems and save a few dozen lines of code by
using generateClonedIndexStmt in this code path too.
Another thing this fixes is incorrect propagation of parent-index
comments to child indexes (because the copyObject approach copies
the idxcomment field while generateClonedIndexStmt doesn't). I had
noticed this in connection with commit c01eb619a, but not run the
problem to ground.
I'm tempted to back-patch this further than v17, but the only thing
it's known to fix in older branches is the comment issue, which is
pretty minor and doesn't seem worth the risk of introducing new
issues in stable branches. (If anyone does care about that,
clearing idxcomment in the copied IndexStmt would be a safer fix.)
Per bug #18637 from usamoi. Back-patch to v17 where the search_path
change came in.
Discussion: https://postgr.es/m/18637-f51e314546e3ba2a@postgresql.org
Formerly there were two internal functions in numeric.c to perform
numeric division, div_var() and div_var_fast(). div_var() performed
division exactly to a specified rscale using Knuth's long division
algorithm, while div_var_fast() used the algorithm from the "FM"
library, which approximates each quotient digit using floating-point
arithmetic, and computes a truncated quotient with DIV_GUARD_DIGITS
extra digits. div_var_fast() could be many times faster than
div_var(), but did not guarantee correct results in all cases, and was
therefore only suitable for use in transcendental functions, where
small errors are acceptable.
This commit merges div_var() and div_var_fast() together into a single
function with an extra "exact" boolean parameter, which can be set to
false if the caller is OK with an approximate result. The new function
uses the faster algorithm from the "FM" library, except that when
"exact" is true, it does not truncate the computation with
DIV_GUARD_DIGITS extra digits, but instead performs the full-precision
computation, subtracting off complete multiples of the divisor for
each quotient digit. However, it is able to retain most of the
performance benefits of div_var_fast(), by delaying the propagation of
carries, allowing the inner loop to be auto-vectorized.
Since this may still lead to an inaccurate result, when "exact" is
true, it then inspects the remainder and uses that to adjust the
quotient, if necessary, to make it correct. In practice, the quotient
rarely needs to be adjusted, and never by more than one in the final
digit, though it's difficult to prove that, so the code allows for
larger adjustments, just in case.
In addition, use base-NBASE^2 arithmetic and a 64-bit dividend array,
similar to mul_var(), so that the number of iterations of the outer
loop is roughly halved. Together with the faster algorithm, this makes
div_var() up to around 20 times as fast as the old Knuth algorithm
when "exact" is true, and up to 2 or 3 times as fast as the old
div_var_fast() function when "exact" is false.
Dean Rasheed, reviewed by Joel Jacobson.
Discussion: https://postgr.es/m/CAEZATCVHR10BPDJSANh0u2+Sg6atO3mD0G+CjKDNRMD-C8hKzQ@mail.gmail.com
If a MERGE command contains WHEN NOT MATCHED BY SOURCE actions, the
source relation appears on the outer side of the join. Thus, any Vars
referring to the source in the merge join condition, actions, and
RETURNING list should be marked as nullable by the join, since they
are used in the ModifyTable node above the join. Note that this only
applies to the copy of join condition used in the executor to
distinguish MATCHED from NOT MATCHED BY SOURCE cases. Vars in the
original join condition, inside the join node itself, should not be
marked.
Failure to correctly mark these Vars led to a "wrong varnullingrels"
error in the final stage of query planning, in some circumstances. We
happened to get away without this in all previous tests, since they
all involved a ModifyTable node directly on top of the join node, so
that the top plan targetlist coincided with the output of the join,
and the varnullingrels check was more lax. However, if another plan
node, such as a one-time filter Result node, gets inserted between the
ModifyTable node and the join node, then a stricter check is applied,
which fails.
Per bug #18634 from Alexander Lakhin. Thanks to Tom Lane and Richard
Guo for review and analysis.
Back-patch to v17, where WHEN NOT MATCHED BY SOURCE support was added
to MERGE.
Discussion: https://postgr.es/m/18634-db5299c937877f2b%40postgresql.org
If a MERGE command contains WHEN NOT MATCHED BY SOURCE actions, the
merge join condition is used by the executor to distinguish MATCHED
from NOT MATCHED BY SOURCE cases. However, this qual is executed using
the output from the join subplan node, which nulls the output from the
source relation in the not matched case, and so the result may be
incorrect if the join condition is "non-strict" -- for example,
something like "src.col IS NOT DISTINCT FROM tgt.col".
Fix this by enhancing the join recheck condition with an additional
"src IS NOT NULL" check, so that it does the right thing when
evaluated using the output from the join subplan.
Noted by Tom Lane while investigating bug #18634 from Alexander
Lakhin.
Back-patch to v17, where WHEN NOT MATCHED BY SOURCE support was added
to MERGE.
Discussion: https://postgr.es/m/18634-db5299c937877f2b%40postgresql.org
Previously, when the on_error option was set to ignore, the COPY command
would always log NOTICE messages for input rows discarded due to
data type incompatibility. Users had no way to suppress these messages.
This commit introduces a new log_verbosity setting, 'silent',
which prevents the COPY command from emitting NOTICE messages
when on_error = 'ignore' is used, even if rows are discarded.
This feature is particularly useful when processing malformed files
frequently, where a flood of NOTICE messages can be undesirable.
For example, when frequently loading malformed files via the COPY command
or querying foreign tables using file_fdw (with an upcoming patch to
add on_error support for file_fdw), users may prefer to suppress
these messages to reduce log noise and improve clarity.
Author: Atsushi Torikoshi
Reviewed-by: Masahiko Sawada, Fujii Masao
Discussion: https://postgr.es/m/ab59dad10490ea3734cf022b16c24cfd@oss.nttdata.com
The following commands were allowed on partitioned tables, with
different effects:
1) ALTER TABLE SET [UN]LOGGED did not issue an error, and did not update
pg_class.relpersistence.
2) CREATE UNLOGGED TABLE was working with pg_class.relpersistence marked
as initially defined, but partitions did not inherit the UNLOGGED
property, which was confusing.
This commit causes the commands mentioned above to fail for partitioned
tables, instead.
pg_dump is tweaked so as partitioned tables marked as UNLOGGED ignore
the option when dumped from older server versions. pgbench needs a
tweak for --unlogged and --partitions=N to ignore the UNLOGGED option on
the partitioned tables created, its partitions still being unlogged.
Author: Michael Paquier
Reviewed-by: Nathan Bossart
Discussion: https://postgr.es/m/ZiiyGFTBNkqcMQi_@paquier.xyz
We have always documented that a copy EOF marker (\.) must appear
by itself on a line, and that is how psql interprets the rule.
However, the backend's actual COPY FROM logic only insists that
there not be data between the \. and the following newline.
Any data ahead of the \. is parsed as a final line of input.
It's hard to interpret this as anything but an ancient mistake
that we've faithfully carried forward. Continuing to allow it
is not cost-free, since it could mask client-side bugs that
unnecessarily backslash-escape periods (and thereby risk
accidentally creating an EOF marker). So, let's remove that
provision and throw error if the EOF marker isn't alone on its
line, matching what the documentation has said right along.
Adjust the relevant error messages to be clearer, too.
Discussion: https://postgr.es/m/ed659f37-a9dd-42a7-82b9-0da562cc4006@manitou-mail.org
As formulated, the assertion added in the executor by 24f5205948 to
check that a query ID is set had two problems:
- track_activities may be disabled while compute_query_id is enabled,
causing the query ID to not be reported to pg_stat_activity.
- debug_query_string may not be set in some context. The only path
where this would matter is visibly autovacuum, should parallel workers
be enabled there at some point. This is not the case currently.
There was no test showing the interactions between the query ID and
track_activities, so let's add one based on a scan of pg_stat_activity.
This assertion is still an experimentation at this stage, but let's see
if this shows more paths where query IDs are not properly set while they
should.
Discussion: https://postgr.es/m/Zvn5616oYXmpXyHI@paquier.xyz
Since backslash is (typically) not special in CSV data, we should
not be treating \. as special either. The server historically did
this to keep CSV and TEXT modes more alike and to support V2 protocol;
but V2 protocol is long dead, and the inconsistency with CSV standards
is annoying. Remove that behavior in CopyReadLineText, and make some
minor consequent code simplifications.
On the client side, we need to fix psql so that it does not check
for \. except when reading data from STDIN (that is, the script
source). We must do that regardless of TEXT/CSV mode or there is
no way to end the COPY short of script EOF. Also, be careful
not to send the \. to the server in that case.
This is a small compatibility break in that other applications
beside psql may need similar adjustment. Also, using an older
version of psql with a v18 server may result in misbehavior
during CSV-mode COPY IN.
Daniel Vérité, reviewed by vignesh C, Robert Haas, and myself
Discussion: https://postgr.es/m/ed659f37-a9dd-42a7-82b9-0da562cc4006@manitou-mail.org
This restriction seems to have come about due to some fuzzy thinking: in
commit 9139aa1942 we were adding a restriction against ADD constraint
ONLY on partitioned tables (which is sensible) and apparently we thought
the DROP case had to be symmetrical. However, it isn't, and the
comments about it are mistaken about the effect it would have. Remove
this limitation.
There have been no reports of users bothered by this limitation, so I'm
not backpatching it just yet. We can revisit this decision later, as needed.
Reviewed-by: Amit Langote <amitlangote09@gmail.com>
Discussion: https://postgr.es/m/202409261752.nbvlawkxsttf@alvherre.pgsql
Discussion: https://postgr.es/m/7682253a-6f79-6a92-00aa-267c4c412870@lab.ntt.co.jp
(about commit 9139aa1942, previously not registered)
All these code paths use their own entry point when starting parallel
workers, but failed to set a query ID, even if they set a text query.
Hence, this data would be missed in pg_stat_activity for the worker
processes. The main entry point for parallel query processing,
ParallelQueryMain(), is already doing that by saving its query ID in a
dummy PlannedStmt, but not the others. The code is changed so as the
query ID of these queries is set in their shared state, and reported
back once the parallel workers start.
Some tests are added to show how the failures can happen for btree and
BRIN with a parallel build enforced, which are able to trigger a failure
in an assertion added by 24f5205948 in the recovery TAP test
027_stream_regress.pl where pg_stat_statements is always loaded. In
this case, the executor path was taken because the index expression
needs to be flattened when building its IndexInfo.
Alexander Lakhin has noticed the problem in btree, and I have noticed
that the issue was more spread. This is arguably a bug, but nobody has
complained about that until now, so no backpatch is done out of caution.
If folks would like to see a backpatch, well, let me know.
Reported-by: Alexander Lakhin
Reviewed-by: Sami Imseih
Discussion: https://postgr.es/m/cf3547c1-498a-6a61-7b01-819f902a251f@gmail.com
Up to now, remove_rel_from_query() has done a pretty shoddy job
of updating our where-needed bitmaps (per-Var attr_needed and
per-PlaceHolderVar ph_needed relid sets). It removed direct mentions
of the to-be-removed baserel and outer join, which is the minimum
amount of effort needed to keep the data structures self-consistent.
But it didn't account for the fact that the removed join ON clause
probably mentioned Vars of other relations, and those Vars might now
not be needed as high up in the join tree as before. It's easy to
show cases where this results in failing to remove a lower outer join
that could also have been removed.
To fix, recalculate the where-needed bitmaps from scratch after
each successful join removal. This sounds expensive, but it seems
to add only negligible planner runtime. (We cheat a little bit
by preserving "relation 0" entries in the bitmaps, allowing us to
skip re-scanning the targetlist and HAVING qual.)
The submitted test case drew attention because we had successfully
optimized away the lower join prior to v16. I suspect that that's
somewhat accidental and there are related cases that were never
optimized before and now can be. I've not tried to come up with
one, though.
Perhaps we should back-patch this into v16 and v17 to repair the
performance regression. However, since it took a year for anyone
to notice the problem, it can't be affecting too many people. Let's
let the patch bake awhile in HEAD, and see if we get more complaints.
Per bug #18627 from Mikaël Gourlaouen. No back-patch for now.
Discussion: https://postgr.es/m/18627-44f950eb6a8416c2@postgresql.org
Creating, reindexing, and dropping an index concurrently could
entail accessing pg_index's TOAST table, which was recently added
in commit b52c4fc3c0. These code paths start and commit their own
transactions, but they do not always set an active snapshot. This
rightfully leads to assertion failures and ERRORs when trying to
access pg_index's TOAST table, such as the following:
ERROR: cannot fetch toast data without an active snapshot
To fix, push an active snapshot just before each section of code
that might require accessing pg_index's TOAST table, and pop it
shortly afterwards.
Reported-by: Alexander Lakhin
Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/a97d7401-e7c9-f771-6a00-037379f0a8bb%40gmail.com
int_to_roman() only accepts plain "int" input, which is fine since
we're going to produce '###############' for any value above 3999
anyway. However, the numeric and int8 variants of to_char() would
throw an error if the given input exceeded the integer range, while
the float-input variants invoked undefined-per-C-standard behavior.
Fix things so that you uniformly get '###############' for out of
range input.
Also add test cases covering this code, plus the equally-untested
EEEE, V, and PL format codes.
Discussion: https://postgr.es/m/2956175.1725831136@sss.pgh.pa.us
Since autovacuum does not trigger an ANALYZE for partitioned tables,
users must perform these manually. However, performing a manual ANALYZE
on a partitioned table would always result in recursively analyzing each
partition and that could be undesirable as autovacuum takes care of that.
For partitioned tables that contain a large number of partitions, having
to analyze each partition could take an unreasonably long time, especially
so for tables with a large number of columns.
Here we allow the ONLY keyword to prefix the name of the table to allow
users to have ANALYZE skip processing partitions. This option can also
be used with VACUUM, but there is no work to do if VACUUM ONLY is used on
a partitioned table.
This commit also changes the behavior of VACUUM and ANALYZE for
inheritance parents. Previously inheritance child tables would not be
processed when operating on the parent. Now, by default we *do* operate
on the child tables. ONLY can be used to obtain the old behavior.
The release notes should note this as an incompatibility. The default
behavior has not changed for partitioned tables as these always
recursively processed the partitions.
Author: Michael Harris <harmic@gmail.com>
Discussion: https://postgr.es/m/CADofcAWATx_haD=QkSxHbnTsAe6+e0Aw8Eh4H8cXyogGvn_kOg@mail.gmail.com
Discussion: https://postgr.es/m/CADofcAXVbD0yGp_EaC9chmzsOoSai3jcfBCnyva3j0RRdRvMVA@mail.gmail.com
Reviewed-by: Jelte Fennema-Nio <postgres@jeltef.nl>
Reviewed-by: Melih Mutlu <m.melihmutlu@gmail.com>
Reviewed-by: Atsushi Torikoshi <torikoshia@oss.nttdata.com>
Reviewed-by: jian he <jian.universality@gmail.com>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Attempting these commands for a non-partitioned table would result in a
failure when creating the relation in transformPartitionCmd(). This
gives the possibility to throw an error earlier with a much better error
message, thanks to d69a3f4d70.
The extra test cases are from me. Note that FINALIZE uses a different
subcommand and it had no coverage for its failure path with
non-partitioned tables.
Author: Álvaro Herrera, Michael Paquier
Reviewed-by: Nathan Bossart
Discussion: https://postgr.es/m/202409190803.tnis52adt2n5@alvherre.pgsql
pg_authid's only varlena column is rolpassword, which unfortunately
cannot be de-TOASTed during authentication because we haven't
selected a database yet and cannot read pg_class. By removing
pg_authid's TOAST table, attempts to set password hashes that
require out-of-line storage will fail with a "row is too big"
error instead. We may want to provide a more user-friendly error
in the future, but for now let's just remove the useless TOAST
table.
Bumps catversion.
Reported-by: Alexander Lakhin
Reviewed-by: Tom Lane, Michael Paquier
Discussion: https://postgr.es/m/89e8649c-eb74-db25-7945-6d6b23992394%40gmail.com
The implementation assumes that on multiple calls of these meta-commands
the last one wins. Multiple \g calls in-between mean multiple
executions.
There were no tests to check these properties, hence let's add
something.
Author: Jelte Fennema-Nio, Michael Paquier
Discussion: https://postgr.es/m/CAGECzQSTE7CoM=Gst56Xj8pOvjaPr09+7jjtWqTC40pGETyAuA@mail.gmail.com
This fixes a couple of issues with the psql meta-commands mentioned
above when called repeatedly:
- The statement name is reset for each call. If a command errors out,
its send_mode would still be set, causing an incorrect path to be taken
when processing a query. For \bind_named, this could trigger an
assertion failure as a statement name is always expected for this
meta-command. This issue has been introduced by d55322b0da.
- The memory allocated for bind parameters can be leaked. This is a bug
enlarged by d55322b0da that exists since 5b66de3433, as it is also
possible to leak memory with \bind in v16 and v17. This requires a fix
that will be done on the affected branches separately. This issue is
taken care of here for HEAD.
This patch tightens the cleanup of the state used for the extended
protocol meta-commands (bind parameters, send mode, statement name) by
doing it before running each meta-command on top of doing it once a
query has been processed, avoiding any leaks and the inconsistencies
when mixing calls, by refactoring the cleanup in a single routine used
in all the code paths where this step is required.
Reported-by: Alexander Lakhin
Author: Anthonin Bonnefoy
Discussion: https://postgr.es/m/2e5b89af-a351-ff0a-000c-037ac28314ab@gmail.com
This change allows pg_index rows to use out-of-line storage for the
"indexprs" and "indpred" columns, which enables use-cases such as
very large index expressions.
This system catalog was previously not given a TOAST table due to a
fear of circularity issues (see commit 96cdeae07f). Testing has
not revealed any such problems, and it seems unlikely that the
entries for system indexes could ever need out-of-line storage. In
any case, it is still early in the v18 development cycle, so
committing this now will hopefully increase the chances of finding
any unexpected problems prior to release.
Bumps catversion.
Reported-by: Jonathan Katz
Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/b611015f-b423-458c-aa2d-be0e655cc1b4%40postgresql.org
This opens the possibility to define keys for more types of statistics
kinds in PgStat_HashKey, the first case being 8-byte query IDs for
statistics like pg_stat_statements.
This increases the size of PgStat_HashKey from 12 to 16 bytes, while
PgStatShared_HashEntry, entry stored in the dshash for pgstats, keeps
the same size due to alignment.
xl_xact_stats_item, that tracks the stats items to drop in commit WAL
records, is increased from 12 to 16 bytes. Note that individual chunks
in commit WAL records should be multiples of sizeof(int), hence 8-byte
object IDs are stored as two uint32, based on a suggestion from Heikki
Linnakangas.
While on it, the field of PgStat_HashKey is renamed from "objoid" to
"objid", as for some stats kinds this field does not refer to OIDs but
just IDs, like for replication slot stats.
This commit bumps the following format variables:
- PGSTAT_FILE_FORMAT_ID, as PgStat_HashKey is written to the stats file
for non-serialized stats kinds in the dshash table.
- XLOG_PAGE_MAGIC for the changes in xl_xact_stats_item.
- Catalog version, for the SQL function pg_stat_have_stats().
Reviewed-by: Bertrand Drouvot
Discussion: https://postgr.es/m/ZsvTS9EW79Up8I62@paquier.xyz
Since we introduced unlogged sequences in v15, identity sequences
have defaulted to having the same persistence as their owning table.
However, it is possible to change that with ALTER SEQUENCE, and
pg_dump tries to preserve the logged-ness of sequences when it doesn't
match (as indeed it wouldn't for an unlogged table from before v15).
The fly in the ointment is that ALTER SEQUENCE SET [UN]LOGGED fails
in binary-upgrade mode, because it needs to assign a new relfilenode
which we cannot permit in that mode. Thus, trying to pg_upgrade a
database containing a mismatching identity sequence failed.
To fix, add syntax to ADD/ALTER COLUMN GENERATED AS IDENTITY to allow
the sequence's persistence to be set correctly at creation, and use
that instead of ALTER SEQUENCE SET [UN]LOGGED in pg_dump. (I tried to
make SET [UN]LOGGED work without any pg_dump modifications, but that
seems too fragile to be a desirable answer. This way should be
markedly faster anyhow.)
In passing, document the previously-undocumented SEQUENCE NAME option
that pg_dump also relies on for identity sequences; I see no value
in trying to pretend it doesn't exist.
Per bug #18618 from Anthony Hsu.
Back-patch to v15 where we invented this stuff.
Discussion: https://postgr.es/m/18618-d4eb26d669ed110a@postgresql.org
Add PERIOD clause to foreign key constraint definitions. This is
supported for range and multirange types. Temporal foreign keys check
for range containment instead of equality.
This feature matches the behavior of the SQL standard temporal foreign
keys, but it works on PostgreSQL's native ranges instead of SQL's
"periods", which don't exist in PostgreSQL (yet).
Reference actions ON {UPDATE,DELETE} {CASCADE,SET NULL,SET DEFAULT}
are not supported yet.
(previously committed as 34768ee361, reverted by 8aee330af55; this is
essentially unchanged from those)
Author: Paul A. Jungwirth <pj@illuminatedcomputing.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: jian he <jian.universality@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CA+renyUApHgSZF9-nd-a0+OPGharLQLO=mDHcY4_qQ0+noCUVg@mail.gmail.com
Add WITHOUT OVERLAPS clause to PRIMARY KEY and UNIQUE constraints.
These are backed by GiST indexes instead of B-tree indexes, since they
are essentially exclusion constraints with = for the scalar parts of
the key and && for the temporal part.
(previously committed as 46a0cd4cef, reverted by 46a0cd4cefb; the new
part is this:)
Because 'empty' && 'empty' is false, the temporal PK/UQ constraint
allowed duplicates, which is confusing to users and breaks internal
expectations. For instance, when GROUP BY checks functional
dependencies on the PK, it allows selecting other columns from the
table, but in the presence of duplicate keys you could get the value
from any of their rows. So we need to forbid empties.
This all means that at the moment we can only support ranges and
multiranges for temporal PK/UQs, unlike the original patch (above).
Documentation and tests for this are added. But this could
conceivably be extended by introducing some more general support for
the notion of "empty" for other types.
Author: Paul A. Jungwirth <pj@illuminatedcomputing.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: jian he <jian.universality@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CA+renyUApHgSZF9-nd-a0+OPGharLQLO=mDHcY4_qQ0+noCUVg@mail.gmail.com
This is support function 12 for the GiST AM and translates
"well-known" RT*StrategyNumber values into whatever strategy number is
used by the opclass (since no particular numbers are actually
required). We will use this to support temporal PRIMARY
KEY/UNIQUE/FOREIGN KEY/FOR PORTION OF functionality.
This commit adds two implementations, one for internal GiST opclasses
(just an identity function) and another for btree_gist opclasses. It
updates btree_gist from 1.7 to 1.8, adding the support function for
all its opclasses.
(previously committed as 6db4598fcb, reverted by 8aee330af55; this is
essentially unchanged from those)
Author: Paul A. Jungwirth <pj@illuminatedcomputing.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: jian he <jian.universality@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CA+renyUApHgSZF9-nd-a0+OPGharLQLO=mDHcY4_qQ0+noCUVg@mail.gmail.com
This commit is similar to 1eff8279d and expands the idea to Window
aggregate nodes so that users can know how much memory or disk the
tuplestore used.
This commit uses newly introduced tuplestore_get_stats() to inquire this
information and add some additional output in EXPLAIN ANALYZE to
display the information for the Window aggregate node.
Reviewed-by: David Rowley, Ashutosh Bapat, Maxim Orlov, Jian He
Discussion: https://postgr.es/m/20240706.202254.89740021795421286.ishii%40postgresql.org
Historically we've used timezone "PST8PDT", but the recent release
2024b of tzdb changes the definition of that zone in a way that
breaks many test cases concerned with dates before 1970. Although
we've not yet adopted 2024b into our own tree, this is already
problematic for people using --with-system-tzdata if their platform
has already adopted 2024b. To work with both older and newer
versions of tzdb, switch to using "America/Los_Angeles", accepting
the ensuing changes in regression test results.
Back-patch to all supported branches.
Per report and patch from Wolfgang Walther.
Discussion: https://postgr.es/m/0a997455-5aba-4cf2-a354-d26d8bcbfae6@technowledgy.de