postgres

mirror of https://github.com/postgres/postgres.git synced 2025-12-04 12:02:48 +03:00

Author	SHA1	Message	Date
Tom Lane	d9a81e8671	Suppress variable-set-but-not-used warning from clang 13. In the normal configuration where GEQO_DEBUG isn't defined, recent clang versions have started to complain that geqo_main.c accumulates the edge_failures count but never does anything with it. As a minimal back-patchable fix, insert a void cast to silence this warning. (I'd speculated about ripping out the GEQO_DEBUG logic altogether, but I don't think we'd wish to back-patch that.) Per recently-established project policy, this is a candidate for back-patching into out-of-support branches: it suppresses an annoying compiler warning but changes no behavior. Hence, back-patch all the way to 9.2. Discussion: https://postgr.es/m/CA+hUKGLTSZQwES8VNPmWO9AO0wSeLt36OCPDAZTccT1h7Q7kTQ@mail.gmail.com	2022-01-23 11:09:43 -05:00
Tomas Vondra	a3eb08b809	Correct type of front_pathkey to PathKey In sort_inner_and_outer we iterate a list of PathKey elements, but the variable is declared as (List *). This mistake is benign, because we only pass the pointer to lcons() and never dereference it. This exists since ~2004, but it's confusing. So fix and backpatch to all supported branches. Backpatch-through: 10 Discussion: https://postgr.es/m/bf3a6ea1-a7d8-7211-0669-189d5c169374%40enterprisedb.com	2022-01-23 03:54:13 +01:00
Andres Freund	f862cc09fa	fsync pg_logical/mappings in CheckPointLogicalRewriteHeap(). While individual logical rewrite files were synced to disk, the directory was not. On some filesystems that could lead to losing directory entries after a crash. Reported-By: Tom Lane <tgl@sss.pgh.pa.us> Author: Nathan Bossart <bossartn@amazon.com> Discussion: https://postgr.es/m/867F2E29-2782-4869-970E-B984C6D35A8F@amazon.com Backpatch: 10-	2022-01-21 11:24:12 -08:00
Michael Paquier	919be95c6f	Fix one-off bug causing missing commit timestamps for subtransactions The logic in charge of writing commit timestamps (enabled with track_commit_timestamp) for subtransactions had a one-off bug, where it would be possible that commit timestamps go missing for the last subtransaction committed. While on it, simplify a bit the iteration logic in the loop writing the commit timestamps, as per suggestions from Kyotaro Horiguchi and Tom Lane, so as some variable initializations are not part of the loop itself. Issue introduced in `73c986a`. Analyzed-by: Alex Kingsborough Author: Alex Kingsborough, Kyotaro Horiguchi Discussion: https://postgr.es/m/73A66172-4050-4F2A-B7F1-13508EDA2144@amazon.com Backpatch-through: 10	2022-01-21 14:55:04 +09:00
Tomas Vondra	9211c2e38f	Build inherited extended stats on partitioned tables Commit `859b3003de` disabled building of extended stats for inheritance trees, to prevent updating the same catalog row twice. While that resolved the issue, it also means there are no extended stats for declaratively partitioned tables, because there are no data in the non-leaf relations. That also means declaratively partitioned tables were not affected by the issue `859b3003de` addressed, which means this is a regression affecting queries that calculate estimates for the inheritance tree as a whole (which includes e.g. GROUP BY queries). But because partitioned tables are empty, we can invert the condition and build statistics only for the case with inheritance, without losing anything. And we can consider them when calculating estimates. It may be necessary to run ANALYZE on partitioned tables, to collect proper statistics. For declarative partitioning there should be no prior statistics, and it might take time before autoanalyze is triggered. For tables partitioned by inheritance the statistics may include data from child relations (if built `859b3003de`), contradicting the current code. Report and patch by Justin Pryzby, minor fixes and cleanup by me. Backpatch all the way back to PostgreSQL 10, where extended statistics were introduced (same as `859b3003de`). Author: Justin Pryzby Reported-by: Justin Pryzby Backpatch-through: 10 Discussion: https://postgr.es/m/20210923212624.GI831%40telsasoft.com	2022-01-15 18:30:45 +01:00
Tomas Vondra	ff0e7c7e84	Ignore extended statistics for inheritance trees Since commit `859b3003de` we only build extended statistics for individual relations, ignoring the child relations. This resolved the issue with updating the catalog tuple twice, but we still tried to use the statistics when calculating estimates for the whole inheritance tree. When the relations contain very distinct data, it may produce bogus estimates. This is roughly the same issue `427c6b5b9` addressed ~15 years ago, and we fix it the same way - by ignoring extended statistics when calculating estimates for the inheritance tree as a whole. We still consider extended statistics when calculating estimates for individual child relations, of course. This may result in plan changes due to different estimates, but if the old statistics were not describing the inheritance tree particularly well it's quite likely the new plans are actually better. Report and patch by Justin Pryzby, minor fixes and cleanup by me. Backpatch all the way back to PostgreSQL 10, where extended statistics were introduced (same as `859b3003de`). Author: Justin Pryzby Reported-by: Justin Pryzby Backpatch-through: 10 Discussion: https://postgr.es/m/20210923212624.GI831%40telsasoft.com	2022-01-15 03:05:06 +01:00
Tom Lane	3433a1fc76	Fix ruleutils.c's dumping of whole-row Vars in more contexts. Commit `7745bc352` intended to ensure that whole-row Vars would be printed with "::type" decoration in all contexts where plain "var.*" notation would result in star-expansion, notably in ROW() and VALUES() constructs. However, it missed the case of INSERT with a single-row VALUES, as reported by Timur Khanjanov. Nosing around ruleutils.c, I found a second oversight: the code for RowCompareExpr generates ROW() notation without benefit of an actual RowExpr, and naturally it wasn't in sync :-(. (The code for FieldStore also does this, but we don't expect that to generate strictly parsable SQL anyway, so I left it alone.) Back-patch to all supported branches. Discussion: https://postgr.es/m/efaba6f9-4190-56be-8ff2-7a1674f9194f@intrans.baku.az	2022-01-13 17:49:26 -05:00
Tom Lane	e5b044c84e	Prevent altering partitioned table's rowtype, if it's used elsewhere. We disallow altering a column datatype within a regular table, if the table's rowtype is used as a column type elsewhere, because we lack code to go around and rewrite the other tables. This restriction should apply to partitioned tables as well, but it was not checked because ATRewriteTables and ATPrepAlterColumnType were not on the same page about who should do it for which relkinds. Per bug #17351 from Alexander Lakhin. Back-patch to all supported branches. Discussion: https://postgr.es/m/17351-6db1870f3f4f612a@postgresql.org	2022-01-06 16:46:46 -05:00
Alvaro Herrera	4a8282425f	Fix silly mistake in Assert	2022-01-04 13:21:23 -03:00
Alvaro Herrera	026a93727c	Allow special SKIP LOCKED condition in Assert() Under concurrency, it is possible for two sessions to be merrily locking and releasing a tuple and marking it again as HEAP_XMAX_INVALID all the while a third session attempts to lock it, miserably fails at it, and then contemplates life, the universe and everything only to eventually fail an assertion that said bit is not set. Before SKIP LOCKED that was indeed a reasonable expectation, but alas! commit `df630b0dd5` falsified it. This bug is as old as time itself, and even older, if you think time begins with the oldest supported branch. Therefore, backpatch to all supported branches. Author: Simon Riggs <simon.riggs@enterprisedb.com> Discussion: https://postgr.es/m/CANbhV-FeEwMnN8yuMyss7if1ZKjOKfjcgqB26n8pqu1e=q0ebg@mail.gmail.com	2022-01-04 13:01:05 -03:00
Tom Lane	7d344f0041	Fix index-only scan plans, take 2. Commit `4ace45677` failed to fix the problem fully, because the same issue of attempting to fetch a non-returnable index column can occur when rechecking the indexqual after using a lossy index operator. Moreover, it broke EXPLAIN for such indexquals (which indicates a gap in our test cases :-(). Revert the code changes of `4ace45677` in favor of adding a new field to struct IndexOnlyScan, containing a version of the indexqual that can be executed against the index-returned tuple without using any non-returnable columns. (The restrictions imposed by check_index_only guarantee this is possible, although we may have to recompute indexed expressions.) Support construction of that during setrefs.c processing by marking IndexOnlyScan.indextlist entries as resjunk if they can't be returned, rather than removing them entirely. (We could alternatively require setrefs.c to look up the IndexOptInfo again, but abusing resjunk this way seems like a reasonably safe way to avoid needing to do that.) This solution isn't great from an API-stability standpoint: if there are any extensions out there that build IndexOnlyScan structs directly, they'll be broken in the next minor releases. However, only a very invasive extension would be likely to do such a thing. There's no change in the Path representation, so typical planner extensions shouldn't have a problem. As before, back-patch to all supported branches. Discussion: https://postgr.es/m/3179992.1641150853@sss.pgh.pa.us Discussion: https://postgr.es/m/17350-b5bdcf476e5badbb@postgresql.org	2022-01-03 15:42:27 -05:00
Tom Lane	70a31a0e34	Fix index-only scan plans when not all index columns can be returned. If an index has both returnable and non-returnable columns, and one of the non-returnable columns is an expression using a Var that is in a returnable column, then a query returning that expression could result in an index-only scan plan that attempts to read the non-returnable column, instead of recomputing the expression from the returnable column as intended. To fix, redefine the "indextlist" list of an IndexOnlyScan plan node as containing null Consts in place of any non-returnable columns. This solves the problem by preventing setrefs.c from falsely matching to such entries. The executor is happy since it only cares about the exposed types of the entries, and ruleutils.c doesn't care because a correct plan won't reference those entries. I considered some other ways to prevent setrefs.c from doing the wrong thing, but this way seems good since (a) it allows a very localized fix, (b) it makes the indextlist structure more compact in many cases, and (c) the indextlist is now a more faithful representation of what the index AM will actually produce, viz. nulls for any non-returnable columns. This is easier to hit since we introduced included columns, but it's possible to construct failing examples without that, as per the added regression test. Hence, back-patch to all supported branches. Per bug #17350 from Louis Jachiet. Discussion: https://postgr.es/m/17350-b5bdcf476e5badbb@postgresql.org	2022-01-01 16:12:03 -05:00
Tom Lane	1acf345869	Ensure casting to typmod -1 generates a RelabelType. Fix the code changed by commit `5c056b0c2` so that we always generate RelabelType, not something else, for a cast to unspecified typmod. Otherwise planner optimizations might not happen. It appears we missed this point because the previous experiments were done on type numeric: the parser undesirably generates a call on the numeric() length-coercion function, but then numeric_support() optimizes that down to a RelabelType, so that everything seems fine. It misbehaves for types that have a non-optimized length coercion function, such as bpchar. Per report from John Naylor. Back-patch to all supported branches, as the previous patch eventually was. Unfortunately, that no longer includes 9.6 ... we really shouldn't put this type of change into a nearly-EOL branch. Discussion: https://postgr.es/m/CAFBsxsEfbFHEkouc+FSj+3K1sHipLPbEC67L0SAe-9-da8QtYg@mail.gmail.com	2021-12-16 15:36:02 -05:00
Tom Lane	878f38b80e	On Windows, also call shutdown() while closing the client socket. Further experimentation shows that commit `6051857fc` is not sufficient when using (some versions of?) OpenSSL. The reason is obscure, but calling shutdown(socket, SD_SEND) improves matters. Per testing by Andrew Dunstan and Alexander Lakhin. Back-patch as before. Discussion: https://postgr.es/m/af5e0bf3-6a61-bb97-6cba-061ddf22ff6b@dunslane.net	2021-12-07 13:34:32 -05:00
Tom Lane	00cd81723c	On Windows, close the client socket explicitly during backend shutdown. It turns out that this is necessary to keep Winsock from dropping any not-yet-sent data, such as an error message explaining the reason for process termination. It's pretty weird that the implicit close done by the kernel acts differently from an explicit close, but it's hard to argue with experimental results. Independently submitted by Alexander Lakhin and Lars Kanis (comments by me, though). Back-patch to all supported branches. Discussion: https://postgr.es/m/90b34057-4176-7bb0-0dbb-9822a5f6425b@greiz-reinsdorf.de Discussion: https://postgr.es/m/16678-253e48d34dc0c376@postgresql.org	2021-12-02 17:15:14 -05:00
Tom Lane	fec187dc3c	Avoid leaking memory during large-scale REASSIGN OWNED BY operations. The various ALTER OWNER routines tend to leak memory in CurrentMemoryContext. That's not a problem when they're only called once per command; but in this usage where we might be touching many objects, it can amount to a serious memory leak. Fix that by running each call in a short-lived context. (DROP OWNED BY likely has a similar issue, except that you'll probably run out of lock table space before noticing. REASSIGN is worth fixing since for most non-table object types, it won't take any lock.) Back-patch to all supported branches. Unfortunately, in the back branches this helps to only a limited extent, since the sinval message queue bloats quite a lot in this usage before commit `3aafc030a`, consuming memory more or less comparable to what's actually leaked. Still, it's clearly a leak with a simple fix, so we might as well fix it. Justin Pryzby, per report from Guillaume Lelarge Discussion: https://postgr.es/m/CAECtzeW2DAoioEGBRjR=CzHP6TdL=yosGku8qZxfX9hhtrBB0Q@mail.gmail.com	2021-12-01 13:44:47 -05:00
Alvaro Herrera	72cf39d51a	Fix determination of broken LSN in OVERWRITTEN_CONTRECORD In commit `ff9f111bce` I mixed up inconsistent definitions of the LSN of the first record in a page, when the previous record ends exactly at the page boundary. The correct LSN is adjusted to skip the WAL page header; I failed to use that when setting XLogReaderState->overwrittenRecPtr, so at WAL replay time VerifyOverwriteContrecord would refuse to let replay continue past that record. Backpatch to 10. 9.6 also contains this bug, but it's no longer being maintained. Discussion: https://postgr.es/m/45597.1637694259@sss.pgh.pa.us	2021-11-26 11:14:27 -03:00
Michael Paquier	817c469c2a	Block ALTER TABLE .. DROP NOT NULL on columns in replica identity index Replica identities that depend directly on an index rely on a set of properties, one of them being that all the columns defined in this index have to be marked as NOT NULL. There was a hole in the logic with ALTER TABLE DROP NOT NULL, where it was possible to remove the NOT NULL property of a column that is part of an index used as replica identity, so block it to avoid problems with logical decoding down the road. The same check was already done for columns that are part of a primary key, so the fix is straightforward. Author: Haiying Tang, Hou Zhijie Reviewed-by: Dilip Kumar, Michael Paquier Discussion: https://postgr.es/m/OS0PR01MB6113338C102BEE8B2FFC5BD9FB619@OS0PR01MB6113.jpnprd01.prod.outlook.com Backpatch-through: 10	2021-11-25 15:05:37 +09:00
Amit Kapila	2c0443c595	Invalidate relcache when changing REPLICA IDENTITY index. When changing REPLICA IDENTITY INDEX to another one, the target table's relcache was not being invalidated. This leads to skipping update/delete operations during apply on the subscriber side as the columns required to search corresponding rows won't get logged. Author: Tang Haiying, Hou Zhijie Reviewed-by: Euler Taveira, Amit Kapila Backpatch-through: 10 Discussion: https://postgr.es/m/OS0PR01MB61133CA11630DAE45BC6AD95FB939@OS0PR01MB6113.jpnprd01.prod.outlook.com	2021-11-16 09:44:00 +05:30
Noah Misch	2f60fd647d	Report any XLogReadRecord() error in XlogReadTwoPhaseData(). Buildfarm members kittiwake and tadarida have witnessed errors at this site. The site discarded key facts. Back-patch to v10 (all supported versions). Reviewed by Michael Paquier and Tom Lane. Discussion: https://postgr.es/m/20211107013157.GB790288@rfd.leadboat.com	2021-11-11 17:11:19 -08:00
Tom Lane	fcfb40dcc1	Doc: improve protocol spec for logical replication Type messages. protocol.sgml documented the layout for Type messages, but completely dropped the ball otherwise, failing to explain what they are, when they are sent, or what they're good for. While at it, do a little copy-editing on the description of Relation messages. In passing, adjust the comment for apply_handle_type() to make it clearer that we choose not to do anything when receiving a Type message, not that we think it has no use whatsoever. Per question from Stefen Hillman. Discussion: https://postgr.es/m/CAPgW8pMknK5pup6=T4a_UG=Cz80Rgp=KONqJmTdHfaZb0RvnFg@mail.gmail.com	2021-11-10 13:12:58 -05:00
Tom Lane	9ae0f11129	Reject extraneous data after SSL or GSS encryption handshake. The server collects up to a bufferload of data whenever it reads data from the client socket. When SSL or GSS encryption is requested during startup, any additional data received with the initial request message remained in the buffer, and would be treated as already-decrypted data once the encryption handshake completed. Thus, a man-in-the-middle with the ability to inject data into the TCP connection could stuff some cleartext data into the start of a supposedly encryption-protected database session. This could be abused to send faked SQL commands to the server, although that would only work if the server did not demand any authentication data. (However, a server relying on SSL certificate authentication might well not do so.) To fix, throw a protocol-violation error if the internal buffer is not empty after the encryption handshake. Our thanks to Jacob Champion for reporting this problem. Security: CVE-2021-23214	2021-11-08 11:01:43 -05:00
Alvaro Herrera	c3bda112eb	Fix typo Introduced in `1d97d3d086`. Co-authored-by: Alexander Lakhin <exclusion@gmail.com> Discussion: https://postgr.es/m/83641f59-d566-b33e-ef21-a272a98675aa@gmail.com	2021-11-08 09:17:24 -03:00
Peter Eisentraut	992d0c3a9c	Translation updates Source-Git-URL: git://git.postgresql.org/git/pgtranslation/messages.git Source-Git-Hash: 3f8ccab66ae01c89727b0284ac600ae6648c1adf	2021-11-08 10:09:21 +01:00
Alexander Korotkov	774d005739	Reset lastOverflowedXid on standby when needed Currently, lastOverflowedXid is never reset. It's just adjusted on new transactions known to be overflowed. But if there are no overflowed transactions for a long time, snapshots could be mistakenly marked as suboverflowed due to wraparound. This commit fixes this issue by resetting lastOverflowedXid when needed, together with KnownAssignedXids. Backpatch to all supported versions. Reported-by: Stan Hu Discussion: https://postgr.es/m/CAMBWrQ%3DFp5UAsU_nATY7EMY7NHczG4-DTDU%3DmCvBQZAQ6wa2xQ%40mail.gmail.com Author: Kyotaro Horiguchi, Alexander Korotkov Reviewed-by: Stan Hu, Simon Riggs, Nikolay Samokhvalov, Andrey Borodin, Dmitry Dolgov	2021-11-06 18:34:26 +03:00
Alvaro Herrera	58b600f64b	Avoid crash in rare case of concurrent DROP When a role being dropped is referenced by catalog objects that are concurrently also being dropped, a crash can result while trying to construct the string that describes the objects. Suppress that by ignoring objects whose descriptions are returned as NULL. The majority of relevant codesites were already cautious about this; we had just missed a couple. This is an old bug, so backpatch all the way back. Reported-by: Alexander Lakhin <exclusion@gmail.com> Discussion: https://postgr.es/m/17126-21887f04508cb5c8@postgresql.org	2021-11-05 12:29:34 -03:00
Heikki Linnakangas	7b55bb892a	Fix snapshot reference leak if lo_export fails. If lo_export() fails to open the target file or to write to it, it leaks the created LargeObjectDesc and its snapshot in the top-transaction context and resource owner. That's pretty harmless, it's a small leak after all, but it gives the user a "Snapshot reference leak" warning. Fix by using a short-lived memory context and no resource owner for transient LargeObjectDescs that are opened and closed within one function call. The leak is easiest to reproduce with lo_export() on a directory that doesn't exist, but in principle the other lo_* functions could also fail. Backpatch to all supported versions. Reported-by: Andrew B Reviewed-by: Alvaro Herrera Discussion: https://www.postgresql.org/message-id/32bf767a-2d65-71c4-f170-122f416bab7e@iki.fi	2021-11-03 11:09:08 +02:00
Alvaro Herrera	656312c2ac	Handle XLOG_OVERWRITE_CONTRECORD in DecodeXLogOp Failing to do so results in inability of logical decoding to process the WAL stream. Handle it by doing nothing. Backpatch all the way back. Reported-by: Petr Jelínek <petr.jelinek@enterprisedb.com>	2021-11-01 13:07:23 -03:00
Noah Misch	560124a37c	Fix CREATE INDEX CONCURRENTLY for the newest prepared transactions. The purpose of commit `8a54e12a38` was to fix this, and it sufficed when the PREPARE TRANSACTION completed before the CIC looked for lock conflicts. Otherwise, things still broke. As before, in a cluster having used CIC while having enabled prepared transactions, queries that use the resulting index can silently fail to find rows. It may be necessary to reindex to recover from past occurrences; REINDEX CONCURRENTLY suffices. Fix this for future index builds by making CIC wait for arbitrarily-recent prepared transactions and for ordinary transactions that may yet PREPARE TRANSACTION. As part of that, have PREPARE TRANSACTION transfer locks to its dummy PGPROC before it calls ProcArrayClearTransaction(). Back-patch to 9.6 (all supported versions). Andrey Borodin, reviewed (in earlier versions) by Andres Freund. Discussion: https://postgr.es/m/01824242-AA92-4FE9-9BA7-AEBAFFEA3D0C@yandex-team.ru	2021-10-23 18:36:43 -07:00
Noah Misch	db86746fd1	Avoid race in RelationBuildDesc() affecting CREATE INDEX CONCURRENTLY. CIC and REINDEX CONCURRENTLY assume backends see their catalog changes no later than each backend's next transaction start. That failed to hold when a backend absorbed a relevant invalidation in the middle of running RelationBuildDesc() on the CIC index. Queries that use the resulting index can silently fail to find rows. Fix this for future index builds by making RelationBuildDesc() loop until it finishes without accepting a relevant invalidation. It may be necessary to reindex to recover from past occurrences; REINDEX CONCURRENTLY suffices. Back-patch to 9.6 (all supported versions). Noah Misch and Andrey Borodin, reviewed (in earlier versions) by Andres Freund. Discussion: https://postgr.es/m/20210730022548.GA1940096@gust.leadboat.com	2021-10-23 18:36:43 -07:00
Amit Kapila	13e52d7c55	Back-patch "Add parent table name in an error in reorderbuffer.c." This was originally done in commit `5e77625b26` for 15 only, as a troubleshooting aid but multiple people showed interest in back-patching this. Author: Jeremy Schneider Reviewed-by: Amit Kapila Backpatch-through: 9.6 Discussion: https://postgr.es/m/808ed65b-994c-915a-361c-577f088b837f@amazon.com	2021-10-21 10:12:59 +05:30
Tom Lane	9681c8fd5f	Remove bogus assertion in transformExpressionList(). I think when I added this assertion (in commit `8f889b108`), I was only thinking of the use of transformExpressionList at top level of INSERT and VALUES. But it's also called by transformRowExpr(), which can certainly occur in an UPDATE targetlist, so it's inappropriate to suppose that p_multiassign_exprs must be empty. Besides, since the input is not expected to contain ResTargets, there's no reason it should contain MultiAssignRefs either. Hence this code need not be concerned about the state of p_multiassign_exprs, and we should just drop the assertion. Per bug #17236 from ocean_li_996. It's been wrong for years, so back-patch to all supported branches. Discussion: https://postgr.es/m/17236-3210de9bcba1d7ca@postgresql.org	2021-10-19 11:35:15 -04:00
Alvaro Herrera	d36bdc4e9d	Invalidate partitions of table being attached/detached Failing to do that, any direct inserts/updates of those partitions would fail to enforce the correct constraint, that is, one that considers the new partition constraint of their parent table. Backpatch to 10. Reported by: Hou Zhijie <houzj.fnst@fujitsu.com> Author: Amit Langote <amitlangote09@gmail.com> Author: Álvaro Herrera <alvherre@alvh.no-ip.org> Reviewed-by: Nitin Jadhav <nitinjadhavpostgres@gmail.com> Reviewed-by: Pavel Borisov <pashkin.elfe@gmail.com> Discussion: https://postgr.es/m/OS3PR01MB5718DA1C4609A25186D1FBF194089%40OS3PR01MB5718.jpnprd01.prod.outlook.com	2021-10-18 19:08:25 -03:00
Michael Paquier	d1a6a08dfa	Reset properly snapshot export state during transaction abort During a replication slot creation, an ERROR generated in the same transaction as the one creating a to-be-exported snapshot would have left the backend in an inconsistent state, as the associated static export snapshot state was not being reset on transaction abort, but only on the follow-up command received by the WAL sender that created this snapshot on replication slot creation. This would trigger inconsistency failures if this session tried to export again a snapshot, like during the creation of a replication slot. Note that a snapshot export cannot happen in a transaction block, so there is no need to worry about resetting this state for subtransaction aborts. Also, this inconsistent state would be very unlikely to show up to users. For example, one case where this could happen is an out-of-memory error when building the initial snapshot to-be-exported. Dilip found this problem while poking at a different patch, that caused an error in this code path for reasons unrelated to HEAD. Author: Dilip Kumar Reviewed-by: Michael Paquier, Zhihong Yu Discussion: https://postgr.es/m/CAFiTN-s0zA1Kj0ozGHwkYkHwa5U0zUE94RSc_g81WrpcETB5=w@mail.gmail.com Backpatch-through: 9.6	2021-10-18 11:57:02 +09:00
Jeff Davis	9364f64a2a	Check criticalSharedRelcachesBuilt in GetSharedSecurityLabel(). An extension may want to call GetSecurityLabel() on a shared object before the shared relcaches are fully initialized. For instance, a ClientAuthentication_hook might want to retrieve the security label on a role. Discussion: https://postgr.es/m/ecb7af0b26e3be1d96d291c8453a86f1f82d9061.camel@j-davis.com Backpatch-through: 9.6	2021-10-14 12:25:48 -07:00
Dean Rasheed	4853baaaca	Fix corner-case loss of precision in numeric_power(). This fixes a loss of precision that occurs when the first input is very close to 1, so that its logarithm is very small. Formerly, during the initial low-precision calculation to estimate the result weight, the logarithm was computed to a local rscale that was capped to NUMERIC_MAX_DISPLAY_SCALE (1000). However, the base may be as close as 1e-16383 to 1, hence its logarithm may be as small as 1e-16383, and so the local rscale needs to be allowed to exceed 16383, otherwise all precision is lost, leading to a poor choice of rscale for the full-precision calculation. Fix this by removing the cap on the local rscale during the initial low-precision calculation, as we already do in the full-precision calculation. This doesn't change the fact that the initial calculation is a low-precision approximation, computing the logarithm to around 8 significant digits, which is very fast, especially when the base is very close to 1. Patch by me, reviewed by Alvaro Herrera. Discussion: https://postgr.es/m/CAEZATCV-Ceu%2BHpRMf416yUe4KKFv%3DtdgXQAe5-7S9tD%3D5E-T1g%40mail.gmail.com	2021-10-06 13:23:13 +01:00
Michael Paquier	8a6a1fe07e	Fix snapshot builds during promotion of hot standby node with 2PC Some specific logic is done at the end of recovery when involving 2PC transactions: 1) Call RecoverPreparedTransactions(), to recover the state of 2PC transactions into memory (re-acquire locks, etc.). 2) ShutdownRecoveryTransactionEnvironment(), to move back to normal operations, mainly cleaning up recovery locks and KnownAssignedXids (including any 2PC transaction tracked previously). 3) Switch XLogCtl->SharedRecoveryState to RECOVERY_STATE_DONE, which is the tipping point for any process calling RecoveryInProgress() to check if the cluster is still in recovery or not. Any snapshot taken between steps 2) and 3) would be empty, causing any transaction relying on a snapshot at this point to potentially corrupt data as there could still be some 2PC transactions to track, with RecentXmin moving backwards on successive calls to GetSnapshotData() in the same transaction. As SharedRecoveryState is the point to take into account to know if it is safe to discard KnownAssignedXids, this commit moves step 2) after step 3), so as we can never finish with empty snapshots. This exists since the introduction of hot standby, so backpatch all the way down. The window with incorrect snapshots is extremely small, but I have seen it when running 023_pitr_prepared_xact.pl, as did buildfarm member fairywren. Thomas Munro also found it independently. Special thanks to Andres Freund for taking the time to analyze this issue. Reported-by: Thomas Munro, Michael Paquier Analyzed-by: Andres Freund Discussion: https://postgr.es/m/20210422203603.fdnh3fu2mmfp2iov@alap3.anarazel.de Backpatch-through: 9.6	2021-10-04 14:06:03 +09:00
Tom Lane	f951ea3a2a	Avoid believing incomplete MCV-only stats in get_variable_range(). get_variable_range() would incautiously believe that statistics containing only an MCV list are sufficient to derive a range estimate. That's okay for an enum-like column that contains only MCVs, but otherwise the estimate could be pretty bad. Make it report that the range is indeterminate unless the MCVs plus nullfrac account for the whole table. I don't think this needs a dedicated test case, since a quick code coverage check verifies that the existing regression tests traverse all the alternatives. There is room to doubt that a future-proof test case could be built anyway, given that the submitted example accidentally doesn't fail before v11. Per bug #17207 from Simon Perepelitsa. Back-patch to v10. In principle this has been broken all along, but I'm hesitant to make such changes in 9.6, since if anyone is unhappy with 9.6.24's behavior there will be no second chance to fix it. Discussion: https://postgr.es/m/17207-5265aefa79e333b4@postgresql.org	2021-10-01 14:59:35 -04:00
Alvaro Herrera	d9fe2cc7dd	Fix WAL replay in presence of an incomplete record Physical replication always ships WAL segment files to replicas once they are complete. This is a problem if one WAL record is split across a segment boundary and the primary server crashes before writing down the segment with the next portion of the WAL record: WAL writing after crash recovery would happily resume at the point where the broken record started, overwriting that record ... but any standby or backup may have already received a copy of that segment, and they are not rewinding. This causes standbys to stop following the primary after the latter crashes: LOG: invalid contrecord length 7262 at A8/D9FFFBC8 because the standby is still trying to read the continuation record (contrecord) for the original long WAL record, but it is not there and it will never be. A workaround is to stop the replica, delete the WAL file, and restart it -- at which point a fresh copy is brought over from the primary. But that's pretty labor intensive, and I bet many users would just give up and re-clone the standby instead. A fix for this problem was already attempted in commit `515e3d84a0`, but it only addressed the case for the scenario of WAL archiving, so streaming replication would still be a problem (as well as other things such as taking a filesystem-level backup while the server is down after having crashed), and it had performance scalability problems too; so it had to be reverted. This commit fixes the problem using an approach suggested by Andres Freund, whereby the initial portion(s) of the split-up WAL record are kept, and a special type of WAL record is written where the contrecord was lost, so that WAL replay in the replica knows to skip the broken parts. With this approach, we can continue to stream/archive segment files as soon as they are complete, and replay of the broken records will proceed across the crash point without a hitch. Because a new type of WAL record is added, users should be careful to upgrade standbys first, primaries later. Otherwise they risk the standby being unable to start if the primary happens to write such a record. A new TAP test that exercises this is added, but the portability of it is yet to be seen. This has been wrong since the introduction of physical replication, so backpatch all the way back. In stable branches, keep the new XLogReaderState members at the end of the struct, to avoid an ABI break. Author: Álvaro Herrera <alvherre@alvh.no-ip.org> Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Reviewed-by: Nathan Bossart <bossartn@amazon.com> Discussion: https://postgr.es/m/202108232252.dh7uxf6oxwcy@alvherre.pgsql	2021-09-29 11:21:51 -03:00
Tomas Vondra	d77e085afd	Release memory allocated by dependency_degree Calculating degree of a functional dependency may allocate a lot of memory - we have released most of the explicitly allocated memory, but e.g. detoasted varlena values were left behind. That may be an issue, because we consider a lot of dependencies (all combinations), and the detoasting may happen for each one again. Fixed by calling dependency_degree() in a dedicated context, and resetting it after each call. We only need the calculated dependency degree, so we don't need to copy anything. Backpatch to PostgreSQL 10, where extended statistics were introduced. Backpatch-through: 10 Discussion: https://www.postgresql.org/message-id/20210915200928.GP831%40telsasoft.com	2021-09-23 18:55:22 +02:00
Tomas Vondra	3aac99068c	Free memory after building each statistics object Until now, all extended statistics on a given relation were built in the same memory context, without resetting. Some of the memory was released explicitly, but not all of it - for example memory allocated while detoasting values is hard to free. This is how it worked since extended statistics were introduced in PostgreSQL 10, but adding support for extended stats on expressions made the issue somewhat worse as it increases the number of statistics to build. Fixed by adding a memory context which gets reset after building each statistics object (all the statistics kinds included in it). Resetting it after building each statistics kind would be even better, but it would require more invasive changes and copying of results, making it harder to backpatch. Backpatch to PostgreSQL 10, where extended statistics were introduced. Author: Justin Pryzby Reported-by: Justin Pryzby Reviewed-by: Tomas Vondra Backpatch-through: 10 Discussion: https://www.postgresql.org/message-id/20210915200928.GP831%40telsasoft.com	2021-09-23 18:54:30 +02:00
Tom Lane	923b7efc25	Don't elide casting to typmod -1. Casting a value that's already of a type with a specific typmod to an unspecified typmod doesn't do anything so far as run-time behavior is concerned. However, it really ought to change the exposed type of the expression to match. Up to now, coerce_type_typmod hasn't bothered with that, which creates gotchas in contexts such as recursive unions. If for example one side of the union is numeric(18,3), but it needs to be plain numeric to match the other side, there's no direct way to express that. This is easy enough to fix, by inserting a RelabelType to update the exposed type of the expression. However, it's a bit nervous-making to change this behavior, because it's stood for a really long time. But no complaints have emerged about 14beta3, so go ahead and back-patch. Back-patch of `5c056b0c2` into previous supported branches. Discussion: https://postgr.es/m/CABNQVagu3bZGqiTjb31a8D5Od3fUMs7Oh3gmZMQZVHZ=uWWWfQ@mail.gmail.com Discussion: https://postgr.es/m/1488389.1631984807@sss.pgh.pa.us	2021-09-20 11:48:52 -04:00
Fujii Masao	639d731acc	Fix variable shadowing in procarray.c. ProcArrayGroupClearXid function has a parameter named "proc", but the same name was used for its local variables. This commit fixes this variable shadowing, to improve code readability. Back-patch to all supported versions, to make future back-patching easy though this patch is classified as refactoring only. Reported-by: Ranier Vilela Author: Ranier Vilela, Aleksander Alekseev https://postgr.es/m/CAEudQAqyoTZC670xWi6w-Oe2_Bk1bfu2JzXz6xRfiOUzm7xbyQ@mail.gmail.com	2021-09-16 13:08:12 +09:00
Tom Lane	daac97eb0b	Make pg_regexec() robust against out-of-range search_start. If search_start is greater than the length of the string, we should just return REG_NOMATCH immediately. (Note that the equality case should not be rejected, since the pattern might be able to match zero characters.) This guards various internal assumptions that the min of a range of string positions is not more than the max. Violation of those assumptions could allow an attempt to fetch string[search_start-1], possibly causing a crash. Jaime Casanova pointed out that this situation is reachable with the new regexp_xxx functions that accept a user-specified start position. I don't believe it's reachable via any in-core call site in v14 and below. However, extensions could possibly call pg_regexec with an out-of-range search_start, so let's back-patch the fix anyway. Discussion: https://postgr.es/m/20210911180357.GA6870@ahch-to	2021-09-11 15:20:04 -04:00
Tom Lane	ca1dd62340	Check for relation length overrun soon enough. We don't allow relations to exceed 2^32-1 blocks, because block numbers are 32 bits and the last possible block number is reserved to mean InvalidBlockNumber. There is a check for this in mdextend, but that's really way too late, because the smgr API requires us to create a buffer for the block-to-be-added, and we do not want to have any buffer with blocknum InvalidBlockNumber. (Such a case can trigger assertions in bufmgr.c, plus I think it might confuse ReadBuffer's logic for data-past-EOF later on.) So put the check into ReadBuffer. Per report from Christoph Berg. It's been like this forever, so back-patch to all supported branches. Discussion: https://postgr.es/m/YTn1iTkUYBZfcODk@msg.credativ.de	2021-09-09 11:45:48 -04:00
Fujii Masao	f77489046d	Fix issue with WAL archiving in standby. Previously, walreceiver always closed the currently-opened WAL segment and created its archive notification file, after it finished writing the current segment up and received any WAL data that should be written into the next segment. If walreceiver exited just before any WAL data in the next segment arrived at standby, it did not create the archive notification file of the current segment even though that's known completed. This behavior could cause WAL archiving of the segment to be delayed until subsequent restartpoints or checkpoints created its notification file. To fix the issue, this commit changes walreceiver so that it creates an archive notification file of a current WAL segment immediately if that's known completed before receiving next WAL data. Back-patch to all supported branches. Reported-by: Kyotaro Horiguchi Author: Fujii Masao Reviewed-by: Kyotaro Horiguchi Discussion: https://postgr.es/m/20200630.165503.1465894182551545886.horikyota.ntt@gmail.com	2021-09-09 23:59:40 +09:00
Tom Lane	9de082399c	Fix rewriter to set hasModifyingCTE correctly on rewritten queries. If we copy data-modifying CTEs from the original query to a replacement query (from a DO INSTEAD rule), we must set hasModifyingCTE properly in the replacement query. Failure to do this can cause various unpleasantness, such as unsafe usage of parallel plans. The code also neglected to propagate hasRecursive, though that's only cosmetic at the moment. A difficulty arises if the rule action is an INSERT...SELECT. We attach the original query's RTEs and CTEs to the sub-SELECT Query, but data-modifying CTEs are only allowed to appear in the topmost Query. For the moment, throw an error in such cases. It would probably be possible to avoid this error by attaching the CTEs to the top INSERT Query instead; but that would require a bunch of new code to adjust ctelevelsup references. Given the narrowness of the use-case, and the need to back-patch this fix, it does not seem worth the trouble for now. We can revisit this if we get field complaints. Per report from Greg Nancarrow. Back-patch to all supported branches. (The test case added here does not fail before v10, but there are plenty of places checking top-level hasModifyingCTE in 9.6, so I have no doubt that this code change is necessary there too.) Greg Nancarrow and Tom Lane Discussion: https://postgr.es/m/CAJcOf-f68DT=26YAMz_i0+Au3TcLO5oiHY5=fL6Sfuits6r+_w@mail.gmail.com Discussion: https://postgr.es/m/CAJcOf-fAdj=nDKMsRhQzndm-O13NY4dL6xGcEvdX5Xvbbi0V7g@mail.gmail.com	2021-09-08 12:05:43 -04:00
Amit Kapila	28cde380c1	Invalidate relcache for publications defined for all tables. Updates/Deletes on a relation were allowed even without replica identity after we define the publication for all tables. This would later lead to an error on subscribers. The reason was that for such publications we were not invalidating the relcache and the publication information for relations was not getting rebuilt. Similarly, we were not invalidating the relcache after dropping of such publications which will prohibit Updates/Deletes without replica identity even without any publication. Author: Vignesh C and Hou Zhijie Reviewed-by: Hou Zhijie, Kyotaro Horiguchi, Amit Kapila Backpatch-through: 10, where it was introduced Discussion: https://postgr.es/m/CALDaNm0pF6zeWqCA8TCe2sDuwFAy8fCqba=nHampCKag-qLixg@mail.gmail.com	2021-09-08 11:23:01 +05:30
Tom Lane	b28c862a6c	Fix bogus timetz_zone() results for DYNTZ abbreviations. timetz_zone() delivered completely wrong answers if the zone was specified by a dynamic TZ abbreviation, because it failed to account for the difference between the POSIX conventions for field values in struct pg_tm and the conventions used in PG-specific datetime code. As a stopgap fix, just adjust the tm_year and tm_mon fields to match PG conventions. This is fixed in a different way in HEAD (`388e71af8`) but I don't want to back-patch the change of reference point. Discussion: https://postgr.es/m/CAJ7c6TOMG8zSNEZtCn5SPe+cCk3Lfxb71ZaQwT2F4T7PJ_t=KA@mail.gmail.com	2021-09-06 11:29:52 -04:00
Tom Lane	70354dd560	Further portability tweaks for float4/float8 hash functions. Attempting to make hashfloat4() look as much as possible like hashfloat8(), I'd figured I could replace NaNs with get_float4_nan() before widening to float8. However, results from protosciurus and topminnow show that on some platforms that produces a different bit-pattern from get_float8_nan(), breaking the intent of `ce773f230`. Rearrange so that we use the result of get_float8_nan() for all NaN cases. As before, back-patch.	2021-09-04 16:29:08 -04:00


18357 Commits