postgres

mirror of https://github.com/postgres/postgres.git synced 2025-12-21 05:21:08 +03:00

Author	SHA1	Message	Date
Michael Paquier	b2abcfa33a	Fix comment in pg_get_shmem_allocations_numa() The comment fixed in this commit described the function as dealing with database blocks, but in reality it processes shared memory allocations. Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://postgr.es/m/aH4DDhdiG9Gi0rG7@ip-10-97-1-34.eu-west-3.compute.internal Backpatch-through: 18	2025-10-21 16:12:34 +09:00
Álvaro Herrera	47da174524	Don't include execnodes.h in replication/conflict.h ... which silently propagates a lot of headers into many places via pgstat.h, as evidenced by the variety of headers that this patch needs to add to seemingly random places. Add a minimum of typedefs to conflict.h to be able to remove execnodes.h, and fix the fallout. Backpatch to 18, where conflict.h first appeared. Discussion: https://postgr.es/m/202509191927.uj2ijwmho7nv@alvherre.pgsql	2025-09-25 14:52:19 +02:00
Michael Paquier	dd74e599b8	Fix shared memory calculation size of PgAioCtl The shared memory size was calculated based on an offset of io_handles, which is itself a pointer included in the structure. We tend to overestimate the shared memory size overall, so this was unlikely an issue in practice, but let's be correct and use the full size of the structure in the calculation, so as the pointer for io_handles is included. Oversight in `da7226993f`. Author: Madhukar Prasad <madhukarprasad@google.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Matthias van de Meent <boekewurm+postgres@gmail.com> Discussion: https://postgr.es/m/CAKi+wrbC2dTzh_vKJoAZXV5wqTbhY0n4wRNpCjJ=e36aoo0kFw@mail.gmail.com Backpatch-through: 18	2025-09-17 09:33:35 +09:00
Jeff Davis	a7024398b8	meson: build checksums with extra optimization flags. Use -funroll-loops and -ftree-vectorize when building checksum.c to match what autoconf does. Missed backport of `9af672bcb2`, noticed by Nathan Bossart. Reported-by: Nathan Bossart <nathandbossart@gmail.com> Discussion: https://postgr.es/m/a81f2f7ef34afc24a89c613671ea017e3651329c.camel@j-davis.com Reviewed-by: Andres Freund <andres@anarazel.de> Backpatch-through: 16	2025-09-09 16:04:04 -07:00
Andres Freund	ce161b194e	aio: Stop using enum bitfields due to bad code generation During an investigation into rather odd aio related errors on macos, observed by Alexander and Konstantin, we started to wonder if bitfield access is related to the error. At the moment it looks like it is related, we cannot reproduce the failures when replacing the bitfields. In addition, the problem can only be reproduced with some compiler [versions] and not everyone has been able to reproduce the issue. The observed problem is that, very rarely, PgAioHandle->{state,target} are in an inconsistent state, after having been checked to be in a valid state not long before, triggering an assertion failure. Unfortunately, this could be caused by wrong compiler code generation or somehow of missing memory barriers - we don't really know. In theory there should not be any concurrent write access to the handle in the state the bug is triggered, as the handle was idle and is just being initialized. Separately from the bug, we observed that at least gcc and clang generate rather terrible code for the bitfield access. Even if it's not clear if the observed assertion failure is actually caused by the bitfield somehow, the bad code generation alone is sufficient reason to stop using bitfields. Therefore, replace the enum bitfields with uint8s and instead cast in each switch statement. Reported-by: Alexander Lakhin <exclusion@gmail.com> Reported-by: Konstantin Knizhnik <knizhnik@garret.ru> Discussion: https://postgr.es/m/1500090.1745443021@sss.pgh.pa.us Backpatch-through: 18	2025-08-27 19:12:50 -04:00
Peter Eisentraut	952b5a4a95	Message style improvements Mostly adding some quoting.	2025-08-26 22:48:00 +02:00
Peter Eisentraut	b942360f82	Use consistent type for pgaio_io_get_id() result The result of pgaio_io_get_id() was being assigned to a mix of int and uint32 variables. This fixes it to use int consistently, which seems the most correct. Also change the queue empty special value in method_worker.c to -1 from UINT32_MAX. Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://www.postgresql.org/message-id/70c784b3-f60b-4652-b8a6-75e5f051243e%40eisentraut.org	2025-08-21 19:46:16 +02:00
Thomas Munro	9110d81641	Fix rare bug in read_stream.c's split IO handling. The internal queue of buffers could become corrupted in a rare edge case that failed to invalidate an entry, causing a stale buffer to be "forwarded" to StartReadBuffers(). This is a simple fix for the immediate problem. A small API change might be able to remove this and related fragility entirely, but that will have to wait a bit. Defect in commit `ed0b87ca`. Bug: 19006 Backpatch-through: 18 Reported-by: Alexander Lakhin <exclusion@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Xuneng Zhou <xunengzhou@gmail.com> Discussion: https://postgr.es/m/19006-80fcaaf69000377e%40postgresql.org	2025-08-09 13:06:46 +12:00
Thomas Munro	4cd9d5fc15	Remove obsolete comment. Remove a comment about potential for AIO in StartReadBuffersImpl(), because that change happened.	2025-08-09 01:46:19 +12:00
Heikki Linnakangas	fce7da1e73	Handle cancel requests with PID 0 gracefully If the client sent a query cancel request with backend PID 0, it tripped an assertion. With assertions disabled, you got this in the log instead: LOG: invalid cancel request with PID 0 LOG: wrong key in cancel request for process 0 Query cancellations don't even require authentication, so we better tolerate bogus requests. Fix by turning the assertion into a regular runtime check. Spotted while testing libpq behavior with a modified server that didn't send BackendKeyData to the client. Backpatch-through: 18	2025-07-30 00:40:15 +03:00
Michael Paquier	f7dfccf960	Fix assertion failure with latch wait in single-user mode LatchWaitSetPostmasterDeathPos, the latch event position for the postmaster death event, is initialized under IsUnderPostmaster. WaitLatch() considered it as a valid wait target in single-user mode (!IsUnderPostmaster), which was incorrect. One code path found to fail with an assertion failure is a database drop in single-user mode while waiting in WaitForProcSignalBarrier() after the drop. Oversight in commit `84e5b2f07a`. Author: Patrick Stählin <me@packi.ch> Co-authored-by: Ronan Dunklau <ronan.dunklau@aiven.io> Discussion: https://postgr.es/m/18996-3a2744c8140488de@postgresql.org Backpatch-through: 18	2025-07-25 16:17:31 +09:00
Andres Freund	7b98c55368	aio: Fix assertion, clarify README The assertion wouldn't have triggered for a long while yet, but this won't accidentally fail to detect the issue if/when it occurs. Author: Matthias van de Meent <boekewurm+postgres@gmail.com> Discussion: https://postgr.es/m/CAEze2Wj-43JV4YufW23gm=Uwr7Lkj+p0yKctKHxNm1rwFC+_DQ@mail.gmail.com Backpatch-through: 18	2025-07-22 08:32:14 -04:00
Michael Paquier	4fcbe06aa8	Fix inconsistent LWLock tranche names for MultiXact* The terms used in wait_event_names.txt and lwlock.c were inconsistent for MultiXactOffsetSLRU and MultiXactMemberSLRU, which could cause joins between pg_wait_events and pg_stat_activity to fail. lwlock.c is adjusted in this commit to what the historical name of the event has always been, and what is documented. Oversight in `53c2a97a92`. `08b9b9e043` has fixed a similar inconsistency some time ago. Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://postgr.es/m/aHdxN0D0hKXzHFQG@ip-10-97-1-34.eu-west-3.compute.internal Backpatch-through: 17	2025-07-17 09:32:49 +09:00
Thomas Munro	7d11f36e71	aio: Fix configuration reload in IO workers. method_worker.c installed SignalHandlerForConfigReload, but it failed to actually process reload requests. That hasn't yet produced any concrete problem reports in terms of GUC changes it should have cared about in v18, but it was inconsistent. It did cause problems for a couple of patches in development that need IO workers to react to ALTER SYSTEM + pg_reload_conf(). Fix extracted from one of those patches. Back-patch to 18. Reported-by: Dmitry Dolgov <9erthalion6@gmail.com> Discussion: https://postgr.es/m/sh5uqe4a4aqo5zkkpfy5fobe2rg2zzouctdjz7kou4t74c66ql%40yzpkxb7pgoxf	2025-07-12 16:34:06 +12:00
Thomas Munro	b2afb06763	aio: Regularize IO worker internal naming. Adopt PgAioXXX convention for pgaio module type names. Rename a function that didn't use a pgaio_worker_ submodule prefix. Rename the internal submit function's arguments to match the indirectly relevant function pointer declaration and nearby examples. Rename the array of handle IDs in PgAioSubmissionQueue to sqes, a term of art seen in the systems it emulates, also clarifying that they're not IO handle pointers as the old name might imply. No change in behavior, just type, variable and function name cleanup. Back-patch to 18. Discussion: https://postgr.es/m/CA%2BhUKG%2BwbaZZ9Nwc_bTopm4f-7vDmCwLk80uKDHj9mq%2BUp0E%2Bg%40mail.gmail.com	2025-07-12 14:45:34 +12:00
Thomas Munro	20b8b5dab9	Fix stale idle flag when IO workers exit. Otherwise we could choose a worker that has exited and crash while trying to wake it up. Back-patch to 18. Reported-by: Tomas Vondra <tomas@vondra.me> Reported-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/t5aqjhkj6xdkido535pds7fk5z4finoxra4zypefjqnlieevbg%40357aaf6u525j	2025-07-12 13:14:22 +12:00
Andres Freund	9a5334c0b4	aio: Combine io_uring memory mappings, if supported By default io_uring creates a shared memory mapping for each io_uring instance, leading to a large number of memory mappings. Unfortunately a large number of memory mappings slows things down, backend exit is particularly affected. To address that, newer kernels (6.5) support using user-provided memory for the memory. By putting the relevant memory into shared memory we don't need any additional mappings. On a system with a new enough kernel and liburing, there is no discernible overhead when doing a pgbench -S -C anymore. Reported-by: MARK CALLAGHAN <mdcallag@gmail.com> Reviewed-by: "Burd, Greg" <greg@burd.me> Reviewed-by: Jim Nasby <jnasby@upgrade.com> Discussion: https://postgr.es/m/CAFbpF8OA44_UG+RYJcWH9WjF7E3GA6gka3gvH6nsrSnEe9H0NA@mail.gmail.com Backpatch-through: 18	2025-07-07 21:04:03 -04:00
Tom Lane	45c5276628	Make safeguard against incorrect flags for fsync more portable. The existing code assumed that O_RDONLY is defined as 0, but this is not required by POSIX and is not true on GNU Hurd. We can avoid the assumption by relying on O_ACCMODE to mask the fcntl() result. (Hopefully, all supported platforms define that.) Author: Michael Banck <mbanck@gmx.net> Co-authored-by: Samuel Thibault Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/6862e8d1.050a0220.194b8d.76fa@mx.google.com Discussion: https://postgr.es/m/68480868.5d0a0220.1e214d.68a6@mx.google.com Backpatch-through: 13	2025-07-01 12:08:20 -04:00
Tomas Vondra	14e52227e5	Silence valgrind about pg_numa_touch_mem_if_required When querying NUMA status of pages in shared memory, we need to touch the memory first to get valid results. This may trigger valgrind reports, because some of the memory (e.g. unpinned buffers) may be marked as noaccess. Solved by adding a valgrind suppresion. An alternative would be to adjust the access/noaccess status before touching the memory, but that seems far too invasive. It would require all those places to have detailed knowledge of what the shared memory stores. The pg_numa_touch_mem_if_required() macro is replaced with a function. Macros are invisible to suppressions, so it'd have to suppress reports for the caller - e.g. pg_get_shmem_allocations_numa(). So we'd suppress reports for the whole function, and that seems to heavy-handed. It might easily hide other valid issues. Reviewed-by: Christoph Berg <myon@debian.org> Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://postgr.es/m/aEtDozLmtZddARdB@msg.df7cb.de Backpatch-through: 18	2025-07-01 12:33:29 +02:00
Andres Freund	e9a3615a52	aio: Add missing memory barrier when waiting for IO handle Previously there was no memory barrier enforcing correct memory ordering when waiting for a free IO handle. However, in the much more common case of waiting for IO to complete, memory barriers already were present. On strongly ordered architectures like x86 this had no negative consequences, but on some armv8 hardware (observed on Apple hardware), it was possible for the update, in the IO worker, to PgAioHandle->state to become visible before ->distilled_result becoming visible, leading to rather confusing assertion failures. The failures were rare enough that the bug sometimes took days to reproduce when running 027_stream_regress in a loop. Once finally debugged, it was easy enough to come up with a much quicker repro: Trigger a lot of very fast IO by limiting io_combine_limit to 1 and ensure that we always have to wait for a free handle by setting io_max_concurrency to 1. Triggering lots of concurrent seqscans in that setup triggers the issue within seconds. One reason this was hard to debug was that the assertion failure most commonly happened in WaitReadBuffers(), rather than in the AIO subsystem itself. The assertions added in this commit make problems like this easier to understand. Also add a comment to the IO worker explaining that we rely on the lwlock acquisition for correct memory ordering. I think it'd be good to add a tap test that stress tests buffer IO, but that's material for a separate patch. Thanks a lot to Alexander and Konstantin for all the debugging help. Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Reported-by: Alexander Lakhin <exclusion@gmail.com> Investigated-by: Andres Freund <andres@anarazel.de> Investigated-by: Alexander Lakhin <exclusion@gmail.com> Investigated-by: Konstantin Knizhnik <knizhnik@garret.ru> Discussion: https://postgr.es/m/2dkz7azclpeiqcmouamdixyn5xhlzy4rvikxrbovyzvi6rnv5c@pz7o7osv2ahf	2025-06-16 12:36:01 -04:00
Michael Paquier	2c76c6ac47	Replace %llu by PRIu64 in AIO io_uring code This is a continuation of `15a79c7311`, cleaning up the AIO io_uring code that has been committed after that while still using %llu. The code changed here is new in v18, so cleaning things now means less conflicts if this area of the code changes on backpatch once the 18 stable branch is created. Reviewed-by: Nathan Bossart <nathandbossart@gmail.com> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://postgr.es/m/aEZcGCnYFq642q8k@paquier.xyz	2025-06-13 08:59:47 +09:00
Michael Paquier	b87163e5f3	Fix copy-pasto with process count calculation in method_io_uring.c This commit replaces the formula used for "TotalProcs" with a call to pgaio_uring_procs() in pgaio_uring_shmem_init() for the shared memory initialization, which is exactly the same, removing a duplication. pgaio_uring_procs() is used for shared memory sizing and a sanity check, and it has some documentation explaining some reasoning behind the formula. Author: Japin Li <japinli@hotmail.com> Discussion: https://postgr.es/m/ME0P300MB044521067A1EDDA9EDEC3793B66DA@ME0P300MB0445.AUSP300.PROD.OUTLOOK.COM	2025-06-05 09:39:24 +09:00
Peter Eisentraut	58fbfde152	Fix incorrect format placeholders	2025-06-03 21:38:04 +02:00
Fujii Masao	73bdcfab35	Rename log_lock_failure GUC to log_lock_failures for consistency. This commit renames the GUC log_lock_failure to log_lock_failures to align with the existing similar setting log_lock_waits, which uses the plural form. This improves naming consistency across related GUCs. Suggested-by: Peter Eisentraut <peter@eisentraut.org> Author: Fujii Masao <masao.fujii@gmail.com Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://postgr.es/m/7a8198b6-d5b8-4910-b41e-8d3efcbb015d@eisentraut.org	2025-06-03 10:02:55 +09:00
Peter Eisentraut	fc32be3c94	Fix incorrect format placeholders Fixes for return type of dclist_count().	2025-06-02 10:12:58 +02:00
Fujii Masao	961553daf5	Make XactLockTableWait() and ConditionalXactLockTableWait() interruptable more. Previously, XactLockTableWait() and ConditionalXactLockTableWait() could enter a non-interruptible loop when they successfully acquired a lock on a transaction but the transaction still appeared to be running. Since this loop continued until the transaction completed, it could result in long, uninterruptible waits. Although this scenario is generally unlikely since XactLockTableWait() and ConditionalXactLockTableWait() can basically acquire a transaction lock only when the transaction is not running, it can occur in a hot standby. In such cases, the transaction may still appear active due to the KnownAssignedXids list, even while no lock on the transaction exists. For example, this situation can happen when creating a logical replication slot on a standby. The cause of the non-interruptible loop was the absence of CHECK_FOR_INTERRUPTS() within it. This commit adds CHECK_FOR_INTERRUPTS() to the loop in both functions, ensuring they can be interrupted safely. Back-patch to all supported branches. Author: Kevin K Biju <kevinkbiju@gmail.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/CAM45KeELdjhS-rGuvN=ZLJ_asvZACucZ9LZWVzH7bGcD12DDwg@mail.gmail.com Backpatch-through: 13	2025-05-31 00:08:40 +09:00
Daniel Gustafsson	fb844b9f06	Revert function to get memory context stats for processes Due to concerns raised about the approach, and memory leaks found in sensitive contexts the functionality is reverted. This reverts commits `45e7e8ca9`, `f8c115a6c`, `d2a1ed172`, `55ef7abf8` and `042a66291` for v18 with an intent to revisit this patch for v19. Discussion: https://postgr.es/m/594293.1747708165@sss.pgh.pa.us	2025-05-23 15:44:54 +02:00
Michael Paquier	3d0c3a418f	Adjust operation names of pg_aios to match the documentation pg_aios used the terms "read" and "write" for vectored I/O read and write operations, respectively. The documentation refers to them as "readv" and "writev", and the code uses internally the terms PGAIO_OP_READV and PGAIO_OP_WRITEV for them, as of "vectored". This commit adjusts these operation names to match with the code and the documentation. Oversight in `8e293e689b`. Author: Atsushi Torikoshi <torikoshia@oss.nttdata.com> Discussion: https://postgr.es/m/6df1e949d1d759ad2767c18e5845963e@oss.nttdata.com	2025-05-21 15:58:03 +09:00
Andres Freund	acad909321	aio: Fix possible state confusions due to interrupt processing elog()/ereport() process interrupts, iff the log message is < ERROR and the log message will be emitted. aio's debug messages are emitted via ereport(), but in some places the code is not ready for interrupts to be processed. Fix the issue using a few different methods: 1) handle interrupts arriving concurrently - in some places it's easy to detect that by fetching the handle's generation a bit earlier 2) Check if interrupts made the work needing to be done obsolete 3) Disallow interrupts, as there's no sane way to make interrupt processing safe To prevent some similar issues from being re-introduced, assert that interrupts are held in pgaio_io_update_state(). This commit also fixes the contents of a debug message I added in `039bfc457e`. Reported-by: Alexander Lakhin <exclusion@gmail.com> Reviewed-by: Noah Misch <noah@leadboat.com> Discussion: https://postgr.es/m/mvpm7ga3dfgz7bvum22hmuz26cariylmcppb3irayftc7bwk3r@l7gb6gr7azhc	2025-05-19 21:07:06 -04:00
Noah Misch	4a4ee0c2c1	Remove GLOBALTABLESPACE_OID assert for locked buffers. Commit `f4ece891fc` added the assertion in an attempt to catch some defects even after VACUUM FULL or REINDEX. However, IsCatalogTextUniqueIndexOid(tag.relNumber) always returns false after a relfilenode change, provoking unintended assertion failures. Reported-by: Adam Guo <adamguo@amazon.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Bug: #18912 Discussion: https://postgr.es/m/18912-a41c9bd0e0ad19b1@postgresql.org	2025-05-10 07:36:27 -07:00
Michael Paquier	c259ba881c	aio: Use runtime arguments with injections points in tests This cleans up the code related to the testing infrastructure of AIO that used injection points, switching the test code to use the new facility for injection points added by `371f2db8b0` rather than tweaks to pass and reset arguments to the callbacks run. This removes all the dependencies to USE_INJECTION_POINTS in the AIO code. pgaio_io_call_inj(), pgaio_inj_io_get() and pgaio_inj_cur_handle are now gone. Reviewed-by: Greg Burd <greg@burd.me> Discussion: https://postgr.es/m/Z_y9TtnXubvYAApS@paquier.xyz	2025-05-10 12:36:57 +09:00
Michael Paquier	371f2db8b0	Add support for runtime arguments in injection points The macros INJECTION_POINT() and INJECTION_POINT_CACHED() are extended with an optional argument that can be passed down to the callback attached when an injection point is run, giving to callbacks the possibility to manipulate a stack state given by the caller. The existing callbacks in modules injection_points and test_aio have their declarations adjusted based on that. `da7226993f` (core AIO infrastructure) and `93bc3d75d8` (test_aio) and been relying on a set of workarounds where a static variable called pgaio_inj_cur_handle is used as runtime argument in the injection point callbacks used by the AIO tests, in combination with a TRY/CATCH block to reset the argument value. The infrastructure introduced in this commit will be reused for the AIO tests, simplifying them. Reviewed-by: Greg Burd <greg@burd.me> Discussion: https://postgr.es/m/Z_y9TtnXubvYAApS@paquier.xyz	2025-05-10 06:56:26 +09:00
Heikki Linnakangas	b28c59a6cd	Use 'void ' for arbitrary buffers, 'uint8 ' for byte arrays A 'void ' argument suggests that the caller might pass an arbitrary struct, which is appropriate for functions like libc's read/write, or pq_sendbytes(). 'uint8 ' is more appropriate for byte arrays that have no structure, like the cancellation keys or SCRAM tokens. Some places used 'char ', but 'uint8 ' is better because 'char *' is commonly used for null-terminated strings. Change code around SCRAM, MD5 authentication, and cancellation key handling to follow these conventions. Discussion: https://www.postgresql.org/message-id/61be9e31-7b7d-49d5-bc11-721800d89d64@eisentraut.org	2025-05-08 22:01:25 +03:00
Michael Paquier	c4c236ab5c	Fix some comments related to IO workers IO workers are treated as auxiliary processes. The comments fixed in this commit stated that there could be only one auxiliary process of each BackendType at the same time. This is not true for IO workers, as up to MAX_IO_WORKERS of them can co-exist at the same time. Author: Cédric Villemain <Cedric.Villemain@data-bene.io> Co-authored-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/e4a3ac45-abce-4b58-a043-b4a31cd11113@Data-Bene.io	2025-05-07 14:55:57 +09:00
Tom Lane	94b84a6072	Don't use double-quotes in #include's of system headers, redux. This cleans up some loose ends left by commit `e8ca9ed1d`. I hadn't looked closely enough at these places before, but now I have. The use of double-quoted #includes for Perl headers in plperl_system.h seems to be simply a mistake introduced in `6c944bf3c` and faithfully copied forward since then. (I had thought possibly it was required by some weird Windows build setup, but there's no evidence of that in our history.) The occurrences in SectionMemoryManager.h and SectionMemoryManager.cpp evidently stem from those files' origin as LLVM code. It's understandable that LLVM would treat their own files as needing double-quoted #includes; but they're still system headers to us. I also applied the same check to *.c files, and found a few other random incorrect usages in both directions. Our ECPG headers and test files routinely use angle brackets to refer to ECPG headers. I left those usages alone, since it seems reasonable for an ECPG user to regard those headers as system headers.	2025-04-27 13:23:19 -04:00
David Rowley	936457419d	Eliminate divide in new fast-path locking code `c4d5cb71d2` adjusted the fast-path locking code to allow some configuration of the number of fast-path locking slots via the max_locks_per_transaction GUC. In that commit the FAST_PATH_REL_GROUP() macro used integer division to determine the fast-path locking group slot to use for the lock. The divisor in this case is always a power-of-two value. Here we swap out the divide by a bitwise-AND, which is a significantly faster operation to perform. In passing, adjust the code that's setting FastPathLockGroupsPerBackend so that it's more clear that the value being set is a power-of-two. Also, adjust some comments in the area which contained some magic numbers. It seems better to justify the 1024 upper limit in the location where the #define is made instead of where it is used. Author: David Rowley <drowleyml@gmail.com> Reviewed-by: Tomas Vondra <tomas@vondra.me> Discussion: https://postgr.es/m/CAApHDvodr3bcnpxcs7+k-3cFwYR0tP-BYhyd2PpDhe-bCx9i=g@mail.gmail.com	2025-04-27 11:53:40 +12:00
Andres Freund	039bfc457e	aio: Improve debug logging around waiting for IOs Trying to investigate a bug report by Alexander Lakhin made it apparent that the debug logging around waiting for IO completion is insufficient. Fix that. Discussion: https://postgr.es/m/h4in2db37vepagmi2oz5vvqymjasc5gyb4lpqkunj4eusu274i@37jpd3c2spd3	2025-04-25 13:31:25 -04:00
Andres Freund	500b61769f	Fix bug allowing io_combine_limit > io_max_combine_combine limit `10f6646847` intended to limit the value of io_combine_limit to the minimum of io_combine_limit and io_max_combine_limit. To avoid issues with interdependent GUCs, it introduced io_combine_limit_guc and set io_combine_limit in assign hooks. That plan was thwarted by guc_tables.c accidentally still referencing io_combine_limit, instead of io_combine_limit_guc. That lead to the GUC machinery overriding the work done in the assign hooks, potentially leaving io_combine_limit with a too high value. The consequence of this bug was that when running with io_combine_limit > io_combine_limit_guc the AIO machinery would not have reserved large enough iovec and IO data arrays, with one IO's arrays overlapping with another IO's, leading to total confusion. To make such a problem easier to detect in the future, add assertions to pgaio_io_set_handle_data_* checking the length is smaller than io_max_combine_limit (not just PG_IOV_MAX). It'd be nice to have a few tests for this, but it's not entirely obvious how to do so portably. As remarked upon by Tom, the GUC assignment hooks really shouldn't set the underlying variable, that's the job of the GUC machinery. Change that as well. Discussion: https://postgr.es/m/c5jyqnuwrpigd35qe7xdypxsisdjrdba5iw63mhcse4mzjogxo@qdjpv22z763f	2025-04-25 13:31:24 -04:00
Andres Freund	0d9114b704	aio: Fix crash potential for pg_aios views due to late state update pgaio_io_reclaim() reset the fields in PgAioHandle before updating the state to IDLE or incrementing the generation. For most things that's OK, but for pg_get_aios() it is not - if it copied the PgAioHandle while fields were being reset, we wouldn't detect that and could call pgaio_io_get_target_description() with ioh->target == PGAIO_TID_INVALID, leading to a crash. Fix this issue by incrementing the generation and state earlier, before resetting. Also add an assertion to pgaio_io_get_target_description() for the target to be valid - that'd have made this case a bit easier to debug. While at it, add/update a few related assertions. Author: Alexander Lakhin <exclusion@gmail.com> Discussion: https://postgr.es/m/062daca9-dfad-4750-9da8-b13388301ad9@gmail.com	2025-04-25 13:31:13 -04:00
David Rowley	78eda9e264	Fix a few more duplicate words in comments Similar to `84fd3bc14` but these ones were found using a regex that can span multiple lines. Author: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/CAApHDvrMcr8XD107H3NV=WHgyBcu=sx5+7=WArr-n_cWUqdFXQ@mail.gmail.com	2025-04-21 13:50:50 +12:00
Michael Paquier	88e947136b	Fix typos and grammar in the code The large majority of these have been introduced by recent commits done in the v18 development cycle. Author: Alexander Lakhin <exclusion@gmail.com> Discussion: https://postgr.es/m/9a7763ab-5252-429d-a943-b28941e0e28b@gmail.com	2025-04-19 19:17:42 +09:00
Michael Paquier	114f7fa81c	Rename injection points used in AIO tests The format of the injection point names used by the AIO code does not match the existing naming convention used everywhere else in the code, so let's be consistent. These points are used in test_aio. Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com> Discussion: https://postgr.es/m/Z_yTB80bdu1sYDqJ@paquier.xyz	2025-04-19 18:53:35 +09:00
Noah Misch	f4ece891fc	Assert lack of hazardous buffer locks before possible catalog read. Commit `0bada39c83` fixed a bug of this kind, which existed in all branches for six days before detection. While the probability of reaching the trouble was low, the disruption was extreme. No new backends could start, and service restoration needed an immediate shutdown. Hence, add this to catch the next bug like it. The new check in RelationIdGetRelation() suffices to make autovacuum detect the bug in commit `243e9b40f1` that led to commit `0bada39`. This also checks in a number of similar places. It replaces each Assert(IsTransactionState()) that pertained to a conditional catalog read. No back-patch for now, but a back-patch of commit `243e9b4` should back-patch this, too. A back-patch could omit the src/test/regress changes, since back branches won't gain new index columns. Reported-by: Alexander Lakhin <exclusion@gmail.com> Discussion: https://postgr.es/m/20250410191830.0e.nmisch@google.com Discussion: https://postgr.es/m/10ec0bc3-5933-1189-6bb8-5dec4114558e@gmail.com	2025-04-17 05:00:30 -07:00
Peter Geoghegan	a6cab6a78e	Harmonize function parameter names for Postgres 18. Make sure that function declarations use names that exactly match the corresponding names from function definitions in a few places. These inconsistencies were all introduced during Postgres 18 development. This commit was written with help from clang-tidy, by mechanically applying the same rules as similar clean-up commits (the earliest such commit was commit `035ce1fe`).	2025-04-12 12:07:36 -04:00
Daniel Gustafsson	847bbb21f8	Fix recently introduced typos This fixes typos in docs and comments introduced during the v18 development cycle, to keep them from ending up in backbranches. Author: Jacob Brazeal <jacob.brazeal@gmail.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Discussion: https://postgr.es/m/CA+COZaCgGua25f2hSrjrDLJcJJAHkwoKgTTqUy-wyL1=64JNjw@mail.gmail.com	2025-04-11 22:17:12 +02:00
Tomas Vondra	3887d0cfeb	Cleanup of pg_numa.c This moves/renames some of the functions defined in pg_numa.c: * pg_numa_get_pagesize() is renamed to pg_get_shmem_pagesize(), and moved to src/backend/storage/ipc/shmem.c. The new name better reflects that the page size is not related to NUMA, and it's specifically about the page size used for the main shared memory segment. * move pg_numa_available() to src/backend/storage/ipc/shmem.c, i.e. into the backend (which more appropriate for functions callable from SQL). While at it, improve the comment to explain what page size it returns. * remove unnecessary includes from src/port/pg_numa.c, adding unnecessary dependencies (src/port should be suitable for frontent). These were either leftovers or unnecessary thanks to the other changes in this commit. This eliminates unnecessary dependencies on backend symbols, which we don't want in src/port. Reported-by: Kirill Reshke <reshkekirill@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> https://postgr.es/m/CALdSSPi5fj0a7UG7Fmw2cUD1uWuckU_e8dJ+6x-bJEokcSXzqA@mail.gmail.com	2025-04-09 21:50:17 +02:00
Thomas Munro	f78ca6f3eb	Introduce file_copy_method setting. It can be set to either COPY (the default) or CLONE if the system supports it. CLONE causes callers of copydir(), currently CREATE DATABASE ... STRATEGY=FILE_COPY and ALTER DATABASE ... SET TABLESPACE = ..., to use copy_file_range (Linux, FreeBSD) or copyfile (macOS) to copy files instead of a read-write loop over the contents. CLONE gives the kernel the opportunity to share block ranges on copy-on-write file systems and push copying down to storage on others, depending on configuration. On some systems CLONE can be used to clone large databases quickly with CREATE DATABASE ... TEMPLATE=source STRATEGY=FILE_COPY. Other operating systems could be supported; patches welcome. Co-authored-by: Nazir Bilal Yavuz <byavuz81@gmail.com> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Reviewed-by: Ranier Vilela <ranier.vf@gmail.com> Discussion: https://postgr.es/m/CA%2BhUKGLM%2Bt%2BSwBU-cHeMUXJCOgBxSHLGZutV5zCwY4qrCcE02w%40mail.gmail.com	2025-04-08 21:35:38 +12:00
Daniel Gustafsson	042a66291b	Add function to get memory context stats for processes This adds a function for retrieving memory context statistics and information from backends as well as auxiliary processes. The intended usecase is cluster debugging when under memory pressure or unanticipated memory usage characteristics. When calling the function it sends a signal to the specified process to submit statistics regarding its memory contexts into dynamic shared memory. Each memory context is returned in detail, followed by a cumulative total in case the number of contexts exceed the max allocated amount of shared memory. Each process is limited to use at most 1Mb memory for this. A summary can also be explicitly requested by the user, this will return the TopMemoryContext and a cumulative total of all lower contexts. In order to not block on busy processes the caller specifies the number of seconds during which to retry before timing out. In the case where no statistics are published within the set timeout, the last known statistics are returned, or NULL if no previously published statistics exist. This allows dash- board type queries to continually publish even if the target process is temporarily congested. Context records contain a timestamp to indicate when they were submitted. Author: Rahila Syed <rahilasyed90@gmail.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Tomas Vondra <tomas@vondra.me> Reviewed-by: Atsushi Torikoshi <torikoshia@oss.nttdata.com> Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com> Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com> Discussion: https://postgr.es/m/CAH2L28v8mc9HDt8QoSJ8TRmKau_8FM_HKS41NeO9-6ZAkuZKXw@mail.gmail.com	2025-04-08 11:06:56 +02:00
Andres Freund	15f0cb26b5	Increase BAS_BULKREAD based on effective_io_concurrency Before, BAS_BULKREAD was always of size 256kB. With the default io_combine_limit of 16, that only allowed 1-2 IOs to be in flight - insufficient even on very low latency storage. We don't just want to increase the size to a much larger hardcoded value, as very large rings (10s of MBs of of buffers), appear to have negative performance effects when reading in data that the OS has cached (but not when actually needing to do IO). To address this, increase the size of BAS_BULKREAD to allow for io_combine_limit * effective_io_concurrency buffers getting read in. To prevent the ring being much larger than useful, limit the increased size with GetPinLimit(). The formula outlined above keeps the ring size to sizes for which we have not observed performance regressions, unless very large effective_io_concurrency values are used together with large shared_buffers setting. Reviewed-by: Thomas Munro <thomas.munro@gmail.com> Discussion: https://postgr.es/m/lqwghabtu2ak4wknzycufqjm5ijnxhb4k73vzphlt2a3wsemcd@gtftg44kdim6 Discussion: https://postgr.es/m/uvrtrknj4kdytuboidbhwclo4gxhswwcpgadptsjvjqcluzmah@brqs62irg4dt	2025-04-08 02:41:03 -04:00
Andres Freund	dcf7e1697b	Add pg_buffercache_evict_{relation,all} functions In addition to the added functions, the pg_buffercache_evict() function now shows whether the buffer was flushed. pg_buffercache_evict_relation(): Evicts all shared buffers in a relation at once. pg_buffercache_evict_all(): Evicts all shared buffers at once. Both functions provide mechanism to evict multiple shared buffers at once. They are designed to address the inefficiency of repeatedly calling pg_buffercache_evict() for each individual buffer, which can be time-consuming when dealing with large shared buffer pools. (e.g., ~477ms vs. ~2576ms for 16GB of fully populated shared buffers). These functions are intended for developer testing and debugging purposes and are available to superusers only. Minimal tests for the new functions are included. Also, there was no test for pg_buffercache_evict(), test for this added too. No new extension version is needed, as it was already increased this release by `ba2a3c2302`. Author: Nazir Bilal Yavuz <byavuz81@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Aidar Imamov <a.imamov@postgrespro.ru> Reviewed-by: Joseph Koshakow <koshy44@gmail.com> Discussion: https://postgr.es/m/CAN55FZ0h_YoSqqutxV6DES1RW8ig6wcA8CR9rJk358YRMxZFmw%40mail.gmail.com	2025-04-08 02:19:32 -04:00

1 2 3 4 5 ...

2898 Commits