postgres

mirror of https://github.com/postgres/postgres.git synced 2025-11-24 00:23:06 +03:00

Author	SHA1	Message	Date
Michael Paquier	b199eb89c6	Fix some typos Author: Yongtao Huang Discussion: https://postgr.es/m/CAOe1Go1F99o5JsphtXdDC5bxm7AzetU8q3AxLh4AAVGKu1AzEQ@mail.gmail.com	2024-01-22 13:55:25 +09:00
Peter Eisentraut	6db4598fcb	Add stratnum GiST support function This is support function 12 for the GiST AM and translates "well-known" RT*StrategyNumber values into whatever strategy number is used by the opclass (since no particular numbers are actually required). We will use this to support temporal PRIMARY KEY/UNIQUE/FOREIGN KEY/FOR PORTION OF functionality. This commit adds two implementations, one for internal GiST opclasses (just an identity function) and another for btree_gist opclasses. It updates btree_gist from 1.7 to 1.8, adding the support function for all its opclasses. Author: Paul A. Jungwirth <pj@illuminatedcomputing.com> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Reviewed-by: jian he <jian.universality@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/CA+renyUApHgSZF9-nd-a0+OPGharLQLO=mDHcY4_qQ0+noCUVg@mail.gmail.com	2024-01-19 15:42:13 +01:00
Robert Haas	e313a61137	Remove LVPagePruneState. Commit `cb970240f1` moved some code from lazy_scan_heap() to lazy_scan_prune(), and now some things that used to need to be passed back and forth are completely local to lazy_scan_prune(). Hence, this struct is mostly obsolete. The only thing that still needs to be passed back to the caller is has_lpdead_items, and that's also passed back by lazy_scan_noprune(), so do it the same way in both cases. Melanie Plageman, reviewed and slightly revised by me. Discussion: http://postgr.es/m/CAAKRu_aM=OL85AOr-80wBsCr=vLVzhnaavqkVPRkFBtD0zsuLQ@mail.gmail.com	2024-01-18 15:17:09 -05:00
Robert Haas	cb970240f1	Move VM update code from lazy_scan_heap() to lazy_scan_prune(). Most of the output parameters of lazy_scan_prune() were being used to update the VM in lazy_scan_heap(). Moving that code into lazy_scan_prune() simplifies lazy_scan_heap() and requires less communication between the two. This change permits some further code simplification, but that is left for a separate commit. Melanie Plageman, reviewed by me. Discussion: http://postgr.es/m/CAAKRu_aM=OL85AOr-80wBsCr=vLVzhnaavqkVPRkFBtD0zsuLQ@mail.gmail.com	2024-01-18 14:44:57 -05:00
Robert Haas	c120550edb	Optimize vacuuming of relations with no indexes. If there are no indexes on a relation, items can be marked LP_UNUSED instead of LP_DEAD when pruning. This significantly reduces WAL volume, since we no longer need to emit one WAL record for pruning and a second to change the LP_DEAD line pointers thus created to LP_UNUSED. Melanie Plageman, reviewed by Andres Freund, Peter Geoghegan, and me Discussion: https://postgr.es/m/CAAKRu_bgvb_k0gKOXWzNKWHt560R0smrGe3E8zewKPs8fiMKkw%40mail.gmail.com	2024-01-18 10:03:42 -05:00
Michael Paquier	8013850c85	Add try_index_open(), conditional variant of index_open() try_index_open() is able to open an index if its relkind fits, except that it would return NULL instead of generated an error if the relation does not exist. This new routine will be used by an upcoming patch to make REINDEX on partitioned relations more robust when an index in a partition tree is dropped. Extracted from a larger patch by the same author. Author: Fei Changhong Discussion: https://postgr.es/m/tencent_6A52106095ACDE55333E3AD33F304C0C3909@qq.com Backpatch-through: 14	2024-01-18 15:04:24 +09:00
Robert Haas	45d395cd75	Be more consistent about whether to update the FSM while vacuuming. Previously, when lazy_scan_noprune() was called and returned true, we would update the FSM immediately if the relation had no indexes or if the page contained no dead items. On the other hand, when lazy_scan_prune() was called, we would update the FSM if either of those things was true or if index vacuuming was disabled. Eliminate that behavioral difference by considering vacrel->do_index_vacuuming in both cases. Also, make lazy_scan_heap() responsible for deciding whether to update the FSM, instead of doing it inside lazy_scan_noprune(). This is more consistent with the lazy_scan_prune() case. lazy_scan_noprune() still needs an output parameter for whether there are LP_DEAD items on the page, but the real decision-making now happens in the caller. Patch by me, reviewed by Peter Geoghegan and Melanie Plageman. Discussion: http://postgr.es/m/CA+TgmoaOzvN1TcHd9iej=PR3fY40En1USxzOnXSR2CxCLaRM0g@mail.gmail.com	2024-01-16 14:16:57 -05:00
Peter Eisentraut	4f622503d6	Make attstattarget nullable This changes the pg_attribute field attstattarget into a nullable field in the variable-length part of the row. If no value is set by the user for attstattarget, it is now null instead of previously -1. This saves space in pg_attribute and tuple descriptors for most practical scenarios. (ATTRIBUTE_FIXED_PART_SIZE is reduced from 108 to 104.) Also, null is the semantically more correct value. The ANALYZE code internally continues to represent the default statistics target by -1, so that that code can avoid having to deal with null values. But that is now contained to the ANALYZE code. Only the DDL code deals with attstattarget possibly null. For system columns, the field is now always null. The ANALYZE code skips system columns anyway. To set a column's statistics target to the default value, the new command form ALTER TABLE ... SET STATISTICS DEFAULT can be used. (SET STATISTICS -1 still works.) Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org> Discussion: https://www.postgresql.org/message-id/flat/4da8d211-d54d-44b9-9847-f2a9f1184c76@eisentraut.org	2024-01-13 18:14:53 +01:00
Robert Haas	e2d5b3b9b6	Remove hastup from LVPagePruneState. Instead, just have lazy_scan_prune() and lazy_scan_noprune() update LVRelState->nonempty_pages directly. This makes the two functions more similar and also removes makes lazy_scan_noprune need one fewer output parameters. Melanie Plageman, reviewed by Andres Freund, Michael Paquier, and me Discussion: http://postgr.es/m/CAAKRu_btji_wQdg=ok-5E4v_bGVxKYnnFFe7RA6Frc1EcOwtSg@mail.gmail.com	2024-01-11 13:30:12 -05:00
Bruce Momjian	29275b1d17	Update copyright for 2024 Reported-by: Michael Paquier Discussion: https://postgr.es/m/ZZKTDPxBBMt3C0J9@paquier.xyz Backpatch-through: 12	2024-01-03 20:49:05 -05:00
Robert Haas	e62e73f3a2	gist: fix typo "split(t)ed" -> "split" Dagfinn Ilmari Mannsåker, reviewed by Shubham Khanna. Discussion: http://postgr.es/m/87le9fmi01.fsf@wibble.ilmari.org	2024-01-02 12:24:28 -05:00
Robert Haas	0d9937d118	Fix typos in comments and in one isolation test. Dagfinn Ilmari Mannsåker, reviewed by Shubham Khanna. Some subtractions by me. Discussion: http://postgr.es/m/87le9fmi01.fsf@wibble.ilmari.org	2024-01-02 12:05:41 -05:00
Tomas Vondra	cb44a8345e	Fix parallel BRIN builds with synchronized scans The brinbuildCallbackParallel callback used by parallel BRIN builds did not consider that the parallel table scans may be synchronized, starting from an arbitrary block and then wrap around. If this happened and the scan actually did wrap around, tuples from the beginning of the table were added to the last range produced by the same worker. The index would be missing range at the beginning of the table, while the last range would be too wide. This would not produce incorrect query results, but it'd be less efficient. Fixed by checking for both past and future ranges in the callback. The worker may produce multiple summaries for the same page range, but the leader will merge them as if the summaries came from different workers. Discussion: https://postgr.es/m/c2ee7d69-ce17-43f2-d1a0-9811edbda6e6%40enterprisedb.com	2023-12-30 23:17:01 +01:00
Tomas Vondra	6c63bcbf3c	Minor cleanup of the BRIN parallel build code Commit `b437571714` added support for parallel builds for BRIN indexes, using code similar to BTREE parallel builds, and also a new tuplesort variant. This commit simplifies the new code in two ways: * The "spool" grouping tuplesort and the heap/index is not necessary. The heap/index are available as separate arguments, causing confusion. So remove the spool, and use the tuplesort directly. * The new tuplesort variant does not need the heap/index, as it sorts simply by the range block number, without accessing the tuple data. So simplify that too. Initial report and patch by Ranier Vilela, further cleanup by me. Author: Ranier Vilela Discussion: https://postgr.es/m/CAEudQAqD7f2i4iyEaAz-5o-bf6zXVX-AkNUBm-YjUXEemaEh6A%40mail.gmail.com	2023-12-30 23:15:04 +01:00
Tom Lane	98c6231d19	Fix incorrect data type choices in some read and write calls. Recently-introduced code in reconstruct.c was using "unsigned" to store the result of read(), pg_pread(), or write(). This is completely bogus: it breaks subsequent tests for the result being negative, as we're being reminded of by a chorus of buildfarm warnings. Switch to "int" as was doubtless intended. (There are several other uses of "unsigned" in this file that also look poorly chosen to me, but for now I'm just trying to clean up the buildfarm.) A larger problem is that "int" is not necessarily wide enough to hold the result: per POSIX, all these functions return ssize_t. In places where the requested read or write length clearly fits in int, that's academic. It may be academic anyway as long as we constrain individual data files to 1GB, since even a readv or writev-like operation would then not be responsible for transferring more than 1GB. Nonetheless it seems like trouble waiting to happen, so I made a pass over readv and writev calls and fixed the result variables where that seemed appropriate. We might want to think about changing some of the fd.c functions to return ssize_t too, for future-proofing; but I didn't tackle that here. Discussion: https://postgr.es/m/1672202.1703441340@sss.pgh.pa.us	2023-12-27 11:02:53 -05:00
Alexander Korotkov	7e6fb5da41	Improvements and fixes for `e0b1ee17dc` `e0b1ee17dc` introduced optimization for matching B-tree scan keys required for the directional scan. However, it incorrectly assumed that all keys required for opposite direction scan are satisfied by _bt_first(). It has been illustrated that with multiple scan keys over the same column, a lesser one (according to the scan direction) could win leaving the other one unsatisfied. Instead of relying on _bt_first() this commit introduces code that memorizes whether there was at least one match on the page. If that's true we know that keys required for opposite-direction scan are satisfied as soon as corresponding values are not NULLs. Also, this commit simplifies the description for the optimization of keys required for the current direction scan. Now the flag used for this is named continuescanPrechecked and means exactly that *continuescan flag is known to be true for the last item on the page. Reported-by: Peter Geoghegan Discussion: https://postgr.es/m/CAH2-Wzn0LeLcb1PdBnK0xisz8NpHkxRrMr3NWJ%2BKOK-WZ%2BQtTQ%40mail.gmail.com Reviewed-by: Pavel Borisov	2023-12-27 14:35:08 +02:00
Alexander Korotkov	06b10f80ba	Remove BTScanOpaqueData.firstPage It's not necessary to keep the firstPage flag as a field of BTScanOpaqueData. This commit makes it an argument of the _bt_readpage() function. We can easily distinguish first-time and repeated calls (within the scan) of this function. Reported-by: Peter Geoghegan Discussion: https://postgr.es/m/CAH2-Wzk4SOsw%2BtHuTFiz8U9Jqj-R77rYPkhWKODCBb1mdHACXA%40mail.gmail.com Reviewed-by: Pavel Borisov	2023-12-27 14:21:49 +02:00
Tom Lane	903737c5bf	Avoid trying to fetch metapage of an SPGist partitioned index. This is necessary when spgcanreturn() is invoked on a partitioned index, and the failure might be reachable in other scenarios as well. The rest of what spgGetCache() does is perfectly sensible for a partitioned index, so we should allow it to go through. I think the main takeaway from this is that we lack sufficient test coverage for non-btree partitioned indexes. Therefore, I added simple test cases for brin and gin as well as spgist (hash and gist AMs were covered already in indexing.sql). Per bug #18256 from Alexander Lakhin. Although the known test case only fails since v16 (`3c569049b`), I've got no faith at all that there aren't other ways to reach this problem; so back-patch to all supported branches. Discussion: https://postgr.es/m/18256-0b0e1b6e4a620f1b@postgresql.org	2023-12-21 12:43:36 -05:00
Masahiko Sawada	bf6260b39d	Show isCatalogRel in several rmgr descriptions. Commit `6af179395` added isCatalogRel field to some WAL record types, but this field was not shown in the rmgr descriptions. This commit changes the several rmgr descriptions to display the isCatalogRel field. Author: Bertrand Drouvot Reviewed-by: Michael Paquier, Masahiko Sawada Discussion: https://postgr.es/m/957dc8f9-2a02-4640-9c01-9dcbf97c4187%40gmail.com	2023-12-21 10:09:38 +09:00
Robert Haas	dc21234005	Add support for incremental backup. To take an incremental backup, you use the new replication command UPLOAD_MANIFEST to upload the manifest for the prior backup. This prior backup could either be a full backup or another incremental backup. You then use BASE_BACKUP with the INCREMENTAL option to take the backup. pg_basebackup now has an --incremental=PATH_TO_MANIFEST option to trigger this behavior. An incremental backup is like a regular full backup except that some relation files are replaced with files with names like INCREMENTAL.${ORIGINAL_NAME}, and the backup_label file contains additional lines identifying it as an incremental backup. The new pg_combinebackup tool can be used to reconstruct a data directory from a full backup and a series of incremental backups. Patch by me. Reviewed by Matthias van de Meent, Dilip Kumar, Jakub Wartak, Peter Eisentraut, and Álvaro Herrera. Thanks especially to Jakub for incredibly helpful and extensive testing. Discussion: http://postgr.es/m/CA+TgmoYOYZfMCyOXFyC-P+-mdrZqm5pP2N7S-r0z3_402h9rsA@mail.gmail.com	2023-12-20 09:49:12 -05:00
Robert Haas	174c480508	Add a new WAL summarizer process. When active, this process writes WAL summary files to $PGDATA/pg_wal/summaries. Each summary file contains information for a certain range of LSNs on a certain TLI. For each relation, it stores a "limit block" which is 0 if a relation is created or destroyed within a certain range of WAL records, or otherwise the shortest length to which the relation was truncated during that range of WAL records, or otherwise InvalidBlockNumber. In addition, it stores a list of blocks which have been modified during that range of WAL records, but excluding blocks which were removed by truncation after they were modified and never subsequently modified again. In other words, it tells us which blocks need to copied in case of an incremental backup covering that range of WAL records. But this doesn't yet add the capability to actually perform an incremental backup; the next patch will do that. A new parameter summarize_wal enables or disables this new background process. The background process also automatically deletes summary files that are older than wal_summarize_keep_time, if that parameter has a non-zero value and the summarizer is configured to run. Patch by me, with some design help from Dilip Kumar and Andres Freund. Reviewed by Matthias van de Meent, Dilip Kumar, Jakub Wartak, Peter Eisentraut, and Álvaro Herrera. Discussion: http://postgr.es/m/CA+TgmoYOYZfMCyOXFyC-P+-mdrZqm5pP2N7S-r0z3_402h9rsA@mail.gmail.com	2023-12-20 08:42:28 -05:00
Jeff Davis	766571be16	Additional write barrier in AdvanceXLInsertBuffer(). First, mark the xlblocks member with InvalidXLogRecPtr, then issue a write barrier, then initialize it. That ensures that the xlblocks member doesn't appear valid while the contents are being initialized. In preparation for reading WAL buffer contents without a lock. Author: Bharath Rupireddy Discussion: https://postgr.es/m/CALj2ACVfFMfqD5oLzZSQQZWfXiJqd-NdX0_317veP6FuB31QWA@mail.gmail.com Reviewed-by: Andres Freund	2023-12-19 17:35:54 -08:00
Jeff Davis	c3a8e2a7cb	Use 64-bit atomics for xlblocks array elements. In preparation for reading the contents of WAL buffers without a lock. Also, avoids the previously-needed comment in GetXLogBuffer() explaining why it's safe from torn reads. Author: Bharath Rupireddy Discussion: https://postgr.es/m/CALj2ACVfFMfqD5oLzZSQQZWfXiJqd-NdX0_317veP6FuB31QWA@mail.gmail.com Reviewed-by: Andres Freund	2023-12-19 17:35:42 -08:00
Michael Paquier	8a7cbfce13	Prevent tuples to be marked as dead in subtransactions on standbys Dead tuples are ignored and are not marked as dead during recovery, as it can lead to MVCC issues on a standby because its xmin may not match with the primary. This information is tracked by a field called "xactStartedInRecovery" in the transaction state data, switched on when starting a transaction in recovery. Unfortunately, this information was not correctly tracked when starting a subtransaction, because the transaction state used for the subtransaction did not update "xactStartedInRecovery" based on the state of its parent. This would cause index scans done in subtransactions to return inconsistent data, depending on how the xmin of the primary and/or the standby evolved. This is broken since the introduction of hot standby in `efc16ea520`, so backpatch all the way down. Author: Fei Changhong Reviewed-by: Kyotaro Horiguchi Discussion: https://postgr.es/m/tencent_C4D907A5093C071A029712E73B43C6512706@qq.com Backpatch-through: 12	2023-12-12 17:05:18 +01:00
Michael Paquier	c7a3e6b46d	Remove trace_recovery_messages This GUC was intended as a debugging help in the 9.0 area when hot standby and streaming replication were being developped, able to offer more information at LOG level rather than DEBUGn. There are more tools available these days that are able to offer rather equivalent information, like pg_waldump introduced in 9.3. It is not obvious how this facility is useful these days, so let's remove it. Author: Bharath Rupireddy Discussion: https://postgr.es/m/ZXEXEAUVFrvpquSd@paquier.xyz	2023-12-11 11:49:02 +01:00
Peter Geoghegan	aa210e0c12	Fix nbtree backward scan race condition comments. Remove comments that supposed that holding a pin was a useful interlock for _bt_walk_left(). There are times when _bt_walk_left() doesn't hold either a lock or a pin on any page, so clearly this can't be true. _bt_walk_left() is even prepared to deal with concurrent deletion of both the original page and any pages to its left. Oversight in commit `2ed5b87f96`.	2023-12-08 15:37:53 -08:00
Peter Geoghegan	c9c0589fda	Optimize nbtree backward scan boundary cases. Teach _bt_binsrch (and related helper routines like _bt_search and _bt_compare) about the initial positioning requirements of backward scans. Routines like _bt_binsrch already know all about "nextkey" searches, so it seems natural to teach them about "goback"/backward searches, too. These concepts are closely related, and are much easier to understand when discussed together. Now that certain implementation details are hidden from _bt_first, it's straightforward to add a new optimization: backward scans using the < strategy now avoid extra leaf page accesses in certain "boundary cases". Consider the following example, which uses the tenk1 table (and its tenk1_hundred index) from the standard regression tests: SELECT * FROM tenk1 WHERE hundred < 12 ORDER BY hundred DESC LIMIT 1; Before this commit, nbtree would scan two leaf pages, even though it was only really necessary to scan one leaf page. We'll now descend straight to the leaf page containing a (12, -inf) high key instead. The scan will locate matching non-pivot tuples with "hundred" values starting from the value 11. The scan won't waste a page access on the right sibling leaf page, which cannot possibly contain any matching tuples. You can think of the optimization added by this commit as disabling an optimization (the _bt_compare "!pivotsearch" behavior that was added to Postgres 12 in commit `dd299df8`) for a small subset of cases where it was always counterproductive. Equivalently, you can think of the new optimization as extending the "pivotsearch" behavior that page deletion by VACUUM has long required (since the aforementioned Postgres 12 commit went in) to other, similar cases. Obviously, this isn't strictly necessary for these new cases (unlike VACUUM, _bt_first is prepared to move the scan to the left once on the leaf level), but the underlying principle is the same. Author: Peter Geoghegan <pg@bowt.ie> Reviewed-By: Matthias van de Meent <boekewurm+postgres@gmail.com> Discussion: https://postgr.es/m/CAH2-Wz=XPzM8HzaLPq278Vms420mVSHfgs9wi5tjFKHcapZCEw@mail.gmail.com	2023-12-08 11:05:17 -08:00
Tomas Vondra	b437571714	Allow parallel CREATE INDEX for BRIN indexes Allow using multiple worker processes to build BRIN index, which until now was supported only for BTREE indexes. For large tables this often results in significant speedup when the build is CPU-bound. The work is split in a simple way - each worker builds BRIN summaries on a subset of the table, determined by the regular parallel scan used to read the data, and feeds them into a shared tuplesort which sorts them by blkno (start of the range). The leader then reads this sorted stream of ranges, merges duplicates (which may happen if the parallel scan does not align with BRIN pages_per_range), and adds the resulting ranges into the index. The number of duplicate results produced by workers (requiring merging in the leader process) should be fairly small, thanks to how parallel scans assign chunks to workers. The likelihood of duplicate results may increase for higher pages_per_range values, but then there are fewer page ranges in total. In any case, we expect the merging to be much cheaper than summarization, so this should be a win. Most of the parallelism infrastructure is a simplified copy of the code used by BTREE indexes, omitting the parts irrelevant for BRIN indexes (e.g. uniqueness checks). This also introduces a new index AM flag amcanbuildparallel, determining whether to attempt to start parallel workers for the index build. Original patch by me, with reviews and substantial reworks by Matthias van de Meent, certainly enough to make him a co-author. Author: Tomas Vondra, Matthias van de Meent Reviewed-by: Matthias van de Meent Discussion: https://postgr.es/m/c2ee7d69-ce17-43f2-d1a0-9811edbda6e6%40enterprisedb.com	2023-12-08 18:15:26 +01:00
Tomas Vondra	dae761a87e	Add empty BRIN ranges during CREATE INDEX When building BRIN indexes, the brinbuildCallback only advances to the next page range when seeing a tuple that doesn't belong to the current one. This means that the index may end up missing ranges at the end of the table, if those pages do not contain any indexable tuples. We tend not to have completely empty pages at the end of a relation, but this also applies to partial indexes, where the tuples may simply not match the index predicate. This results in inefficient scans using the affected BRIN index - without the summaries, the page ranges have to be read and processed, which consumes I/O and possibly also CPU time. The existing code already added empty ranges for earlier parts of the table, this commit makes sure we add them for the ranges at the end of the table too. Patch by Matthias van de Meent, with review/improvements by me. Author: Matthias van de Meent Reviewed-by: Tomas Vondra Discussion: https://postgr.es/m/CAEze2WiMsPZg%3DxkvSF_jt4%3D69k6K7gz5B8V2wY3gCGZ%2B1BzCbQ%40mail.gmail.com	2023-12-08 17:14:32 +01:00
Heikki Linnakangas	b31ba5310b	Rename ShmemVariableCache to TransamVariables The old name was misleading: It's not a cache, the values kept in the struct are the authoritative source. Reviewed-by: Tristan Partin, Richard Guo Discussion: https://www.postgresql.org/message-id/6537d63d-4bb5-46f8-9b5d-73a8ba4720ab@iki.fi	2023-12-08 09:47:15 +02:00
Heikki Linnakangas	15916ffb04	Initialize ShmemVariableCache like other shmem areas For sake of consistency. Reviewed-by: Tristan Partin, Richard Guo Discussion: https://www.postgresql.org/message-id/6537d63d-4bb5-46f8-9b5d-73a8ba4720ab@iki.fi	2023-12-08 09:46:59 +02:00
Thomas Munro	cd7f19da34	Fix potential pointer overflow in xlogreader.c. While checking if a record could fit in the circular WAL decoding buffer, the coding from commit `3f1ce973` used arithmetic that could overflow. 64 bit systems were unaffected for various technical reasons, which probably explains the lack of problem reports. Likewise for 32 bit systems running known 32 bit kernels. The systems at risk of problems appear to be 32 bit processes running on 64 bit kernels, with unlucky placement in memory. Per complaint from GCC -fsanitize=undefined -m32, while testing variations of 039_end_of_wal.pl. Back-patch to 15. Reviewed-by: Nathan Bossart <nathandbossart@gmail.com> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Discussion: https://postgr.es/m/CA%2BhUKGKH0oRPOX7DhiQ_b51sM8HqcPp2J3WA-Oen%3DdXog%2BAGGQ%40mail.gmail.com	2023-12-08 16:09:03 +13:00
Michael Paquier	7636725b92	Fix compilation on Windows with WAL_DEBUG This has been broken since `b060dbe000` that has reworked the callback mechanism of XLogReader, most likely unnoticed because any form of development involving WAL happens on platforms where this compiles fine. Author: Bharath Rupireddy Discussion: https://postgr.es/m/CALj2ACVF14WKQMFwcJ=3okVDhiXpuK5f7YdT+BdYXbbypMHqWA@mail.gmail.com Backpatch-through: 13	2023-12-06 14:10:39 +09:00
Amit Kapila	f66fcc5cd6	Fix an uninitialized access in hash_xlog_squeeze_page(). Commit `861f86beea` changed hash_xlog_squeeze_page() to start reading the write buffer conditionally but forgot to initialize it leading to an uninitialized access. Reported-by: Alexander Lakhin Author: Hayato Kuroda Reviewed-by: Alexander Lakhin, Amit Kapila Discussion: http://postgr.es/m/62ed1a9f-746a-8e86-904b-51b9b806a1d9@gmail.com	2023-12-01 10:22:13 +05:30
Alexander Korotkov	ae2ccf66a2	Fix typo in `5a1dfde833` Reported-by: Alexander Lakhin Discussion: https://postgr.es/m/55d8800f-4a80-5256-1e84-246fbe79acd0@gmail.com	2023-11-30 13:46:23 +02:00
Alexander Korotkov	b589f211e0	Fix warning due non-standard inline declaration in `4ed8f0913b` Reported-by: Alexander Lakhin, Tom Lane Author: Pavel Borisov Discussion: https://postgr.es/m/55d8800f-4a80-5256-1e84-246fbe79acd0@gmail.com	2023-11-30 11:34:45 +02:00
Michael Paquier	8d9978a717	Apply quotes more consistently to GUC names in logs Quotes are applied to GUCs in a very inconsistent way across the code base, with a mix of double quotes or no quotes used. This commit removes double quotes around all the GUC names that are obviously referred to as parameters with non-English words (use of underscore, mixed case, etc). This is the result of a discussion with Álvaro Herrera, Nathan Bossart, Laurenz Albe, Peter Eisentraut, Tom Lane and Daniel Gustafsson. Author: Peter Smith Discussion: https://postgr.es/m/CAHut+Pv-kSN8SkxSdoHano_wPubqcg5789ejhCDZAcLFceBR-w@mail.gmail.com	2023-11-30 14:11:45 +09:00
Alexander Korotkov	5a1dfde833	Make use FullTransactionId in 2PC filenames Switch from using TransactionId to FullTransactionId in naming of 2PC files. Transaction state file in the pg_twophase directory now have extra 8 bytes in the name to address an epoch of a given xid. Author: Maxim Orlov, Aleksander Alekseev, Alexander Korotkov, Teodor Sigaev Author: Nikita Glukhov, Pavel Borisov, Yura Sokolov Reviewed-by: Jacob Champion, Heikki Linnakangas, Alexander Korotkov Reviewed-by: Japin Li, Pavel Borisov, Tom Lane, Peter Eisentraut, Andres Freund Reviewed-by: Andrey Borodin, Dilip Kumar, Aleksander Alekseev Discussion: https://postgr.es/m/CACG%3DezZe1NQSCnfHOr78AtAZxJZeCvxrts0ygrxYwe%3DpyyjVWA%40mail.gmail.com Discussion: https://postgr.es/m/CAJ7c6TPDOYBYrnCAeyndkBktO0WG2xSdYduTF0nxq%2BvfkmTF5Q%40mail.gmail.com	2023-11-29 01:43:36 +02:00
Alexander Korotkov	4ed8f0913b	Index SLRUs by 64-bit integers rather than by 32-bit integers We've had repeated bugs in the area of handling SLRU wraparound in the past, some of which have caused data loss. Switching to an indexing system for SLRUs that does not wrap around should allow us to get rid of a whole bunch of problems and improve the overall reliability of the system. This particular patch however only changes the indexing and doesn't address the wraparound per se. This is going to be done in the following patches. Author: Maxim Orlov, Aleksander Alekseev, Alexander Korotkov, Teodor Sigaev Author: Nikita Glukhov, Pavel Borisov, Yura Sokolov Reviewed-by: Jacob Champion, Heikki Linnakangas, Alexander Korotkov Reviewed-by: Japin Li, Pavel Borisov, Tom Lane, Peter Eisentraut, Andres Freund Reviewed-by: Andrey Borodin, Dilip Kumar, Aleksander Alekseev Discussion: https://postgr.es/m/CACG%3DezZe1NQSCnfHOr78AtAZxJZeCvxrts0ygrxYwe%3DpyyjVWA%40mail.gmail.com Discussion: https://postgr.es/m/CAJ7c6TPDOYBYrnCAeyndkBktO0WG2xSdYduTF0nxq%2BvfkmTF5Q%40mail.gmail.com	2023-11-29 01:40:56 +02:00
Heikki Linnakangas	60f227316c	Fix assertions with RI triggers in heap_update and heap_delete. If the tuple being updated is not visible to the crosscheck snapshot, we return TM_Updated but the assertions would not hold in that case. Move them to before the cross-check. Fixes bug #17893. Backpatch to all supported versions. Author: Alexander Lakhin Backpatch-through: 12 Discussion: https://www.postgresql.org/message-id/17893-35847009eec517b5%40postgresql.org	2023-11-28 12:00:14 +02:00
Tomas Vondra	a82ee7ef3a	Check if ii_AmCache is NULL in aminsertcleanup Fix a bug introduced by `c1ec02be1d`. It may happen that the executor opens indexes on the result relation, but no rows end up being inserted. Then the index_insert_cleanup still gets executed, but passes down NULL to the AM callback. The AM callback may not expect this, as is the case of brininsertcleanup, leading to a crash. Fixed by only calling the cleanup callback if (ii_AmCache != NULL). This way the AM can simply assume to only see a valid cache. Reported-by: Richard Guo Discussion: https://postgr.es/m/CAMbWs4-w9qC-o9hQox9UHvdVZAYTp8OrPQOKtwbvzWaRejTT=Q@mail.gmail.com	2023-11-27 16:53:06 +01:00
Heikki Linnakangas	1f395354d8	Reduce rate of walwriter wakeups due to async commits. XLogSetAsyncXactLSN(), called at asynchronous commit, would wake up walwriter every time the LSN advances, but walwriter doesn't actually do anything unless it has at least 'wal_writer_flush_after' full blocks of WAL to write. Repeatedly waking up walwriter to do nothing is a waste of CPU cycles in both walwriter and the backends doing the wakeups. To fix, apply the same logic in XLogSetAsyncXactLSN() to decide whether to wake up walwriter, as walwriter uses to determine if it has any work to do. In the passing, rename misleadingly named 'flushbytes' local variable to 'flushblocks'. Author: Andres Freund, Heikki Linnakangas Discussion: https://www.postgresql.org/message-id/20231024230929.vsc342baqs7kmbte@awork3.anarazel.de	2023-11-27 17:42:39 +02:00
Tomas Vondra	b2caf7c0e1	Fix brin.c indentation issues introduced by `c1ec02be1d` Per buildfarm member koel.	2023-11-26 21:35:32 +01:00
Tomas Vondra	c1ec02be1d	Reuse BrinDesc and BrinRevmap in brininsert The brininsert code used to initialize (and destroy) BrinDesc and BrinRevmap for each tuple, which is not free. This patch initializes these structures only once, and reuses them for all inserts in the same command. The data is passed through indexInfo->ii_AmCache. This also introduces an optional AM callback "aminsertcleanup" that allows performing custom cleanup in case simply pfree-ing ii_AmCache is not sufficient (which is the case when the cache contains TupleDesc, Buffers, and so on). Author: Soumyadeep Chakraborty Reviewed-by: Alvaro Herrera, Matthias van de Meent, Tomas Vondra Discussion: https://postgr.es/m/CAE-ML%2B9r2%3DaO1wwji1sBN9gvPz2xRAtFUGfnffpd0ZqyuzjamA%40mail.gmail.com	2023-11-25 20:27:28 +01:00
Bruce Momjian	8d981341a5	C comment: clarify that WAL files can be _recycled_ or removed Reported-by: Michael Paquier Discussion: https://postgr.es/m/CAB7nPqSDdF0heotQU3gsepgqx+9c+6KjLd3R6aNYH7KKfDd2ig@mail.gmail.com Author: Michael Paquier Backpatch-through: master	2023-11-25 10:48:18 -05:00
Bruce Momjian	344afc7769	modify segno. for pg_walfile_name() and pg_walfile_name_offset() Previously these functions returned the previous segment number if the LSN was on a segment boundary. We now always return the current segment number for an LSN. Docs updated to reflect this change. Regression tests added, author Andres Freund. Also mentioned in thread https://postgr.es/m/flat/20220204225057.GA1535307%40nathanxps13#d964275c9540d8395e138efc0a75f7e8 BACKWARD INCOMPATIBILITY Reported-by: Kyotaro Horiguchi Discussion: https://postgr.es/m/20190726.172120.101752680.horikyota.ntt@gmail.com Co-authored-by: Kyotaro Horiguchi Backpatch-through: master	2023-11-24 19:44:09 -05:00
Andres Freund	b2e237afdd	Release lock on heap buffer before vacuuming FSM When there are no indexes on a table, we vacuum each heap block after pruning it and then update the freespace map. Periodically, we also vacuum the freespace map. This was done while unnecessarily holding a lock on the heap page. Release the lock before calling FreeSpaceMapVacuumRange() and, while we're at it, ensure the range includes the heap block we just vacuumed. There are no known deadlocks or other similar issues, therefore don't backpatch. It's certainly not good to do all this work under a lock, but it's not frequently reached, making it not worth the risk of backpatching. Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/CAAKRu_YiL%3D44GvGnt1dpYouDSSoV7wzxVoXs8m3p311rp-TVQQ%40mail.gmail.com	2023-11-17 12:46:55 -08:00
Nathan Bossart	6a72c42fd5	Retire MemoryContextResetAndDeleteChildren() macro. As of commit `eaa5808e8e`, MemoryContextResetAndDeleteChildren() is just a backwards compatibility macro for MemoryContextReset(). Now that some time has passed, this macro seems more likely to create confusion. This commit removes the macro and replaces all remaining uses with calls to MemoryContextReset(). Any third-party code that use this macro will need to be adjusted to call MemoryContextReset() instead. Since the two have behaved the same way since v9.5, such adjustments won't produce any behavior changes for all currently-supported versions of PostgreSQL. Reviewed-by: Amul Sul, Tom Lane, Alvaro Herrera, Dagfinn Ilmari Mannsåker Discussion: https://postgr.es/m/20231113185950.GA1668018%40nathanxps13	2023-11-15 13:42:30 -06:00
Heikki Linnakangas	c21e6e2fd4	Clear CurrentResourceOwner earlier in CommitTransaction. Alexander reported a crash with repeated create + drop database, after the ResourceOwner rewrite (commit `b8bff07daa`). That was fixed by the previous commit, but it nevertheless seems like a good idea clear CurrentResourceOwner earlier, because you're not supposed to use it for anything after we start releasing it. Reviewed-by: Alexander Lakhin Discussion: https://www.postgresql.org/message-id/11b70743-c5f3-3910-8e5b-dd6c115ff829%40gmail.com	2023-11-15 11:03:49 +01:00
Tom Lane	5c62ecf6ec	Don't release index root page pin in ginFindParents(). It's clearly stated in the comments that ginFindParents() must keep the pin on the index's root page that's associated with the topmost GinBtreeStack item. However, the code path for the case that the desired downlink has been pushed down to the next index level ignored this proviso, and would release the pin anyway if we were still examining the root level. That led to an assertion failure or "buffer NNNN is not owned by resource owner" error later, when we try to release the pin again at the end of the insertion. This is quite hard to reproduce, since it can only happen if an index root page split occurs concurrently with our own insertion. Thanks to Jeff Janes for finding a test case that triggers it often enough to allow investigation. This has been there since the beginning of GIN, so back-patch to all supported branches. Discussion: https://postgr.es/m/CAMkU=1yCAKtv86dMrD__Ja-7KzjE=uMeKX8y__cx5W-OEWy2ow@mail.gmail.com	2023-11-13 11:44:35 -05:00

1 2 3 4 5 ...

4794 Commits