postgres

mirror of https://github.com/postgres/postgres.git synced 2025-06-27 23:21:58 +03:00

Author	SHA1	Message	Date
Peter Eisentraut	c53f3b9cc8	Improve file_copy_method entry in postgresql.conf.sample Improve the wording of the comment a bit, fix whitespace. Also move the entry so that the section order is consistent with config.sgml.	2025-05-26 14:52:00 +02:00
Peter Eisentraut	70a13c528b	Move oauth_validator_libraries in postgresql.conf.sample Move oauth_validator_libraries in postgresql.conf.sample to be grouped with the other CONN_AUTH_AUTH settings, rather than making up a new ad-hoc category. This matches the internal categorization and also how it is listed in the documentation.	2025-05-23 09:03:09 +02:00
Thomas Munro	f78ca6f3eb	Introduce file_copy_method setting. It can be set to either COPY (the default) or CLONE if the system supports it. CLONE causes callers of copydir(), currently CREATE DATABASE ... STRATEGY=FILE_COPY and ALTER DATABASE ... SET TABLESPACE = ..., to use copy_file_range (Linux, FreeBSD) or copyfile (macOS) to copy files instead of a read-write loop over the contents. CLONE gives the kernel the opportunity to share block ranges on copy-on-write file systems and push copying down to storage on others, depending on configuration. On some systems CLONE can be used to clone large databases quickly with CREATE DATABASE ... TEMPLATE=source STRATEGY=FILE_COPY. Other operating systems could be supported; patches welcome. Co-authored-by: Nazir Bilal Yavuz <byavuz81@gmail.com> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Reviewed-by: Ranier Vilela <ranier.vf@gmail.com> Discussion: https://postgr.es/m/CA%2BhUKGLM%2Bt%2BSwBU-cHeMUXJCOgBxSHLGZutV5zCwY4qrCcE02w%40mail.gmail.com	2025-04-08 21:35:38 +12:00
Tom Lane	3516ea768c	Add local-address escape "%L" to log_line_prefix. This escape shows the numeric server IP address that the client has connected to. Unix-socket connections will show "[local]". Non-client processes (e.g. background processes) will show "[none]". We expect that this option will be of interest to only a fairly small number of users. Therefore the implementation is optimized for the case where it's not used (that is, we don't do the string conversion until we have to), and we've not added the field to csvlog or jsonlog formats. Author: Greg Sabino Mullane <htamfids@gmail.com> Reviewed-by: Cary Huang <cary.huang@highgo.ca> Reviewed-by: David Steele <david@pgmasters.net> Reviewed-by: Jim Jones <jim.jones@uni-muenster.de> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CAKAnmmK-U+UicE-qbNU23K--Q5XTLdM6bj+gbkZBZkjyjrd3Ow@mail.gmail.com	2025-04-07 11:06:05 -04:00
Alexander Korotkov	717d0e8dd9	Remove GUC_NOT_IN_SAMPLE from enable_self_join_elimination `fc069a3a63` implements Self-Join Elimination (SJE) and provides a new GUC variable: enable_self_join_elimination. This new GUC variable was marked as GUC_NOT_IN_SAMPLE. However, enable_self_join_elimination is documented and is not different from any other enable_* GUCs. Thus, remove GUC_NOT_IN_SAMPLE from it and add it to the postgresql.conf.sample. Discussion: https://postgr.es/m/CAPpHfdsqMTEsmxk3aQwt6xPz%2BKpUELO%3D6fzmER9ZRGrbs4uMfA%40mail.gmail.com Author: Tender Wang <tndrwang@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>	2025-04-07 16:28:54 +03:00
Alexander Korotkov	3a1a7c5a70	Revert "Put enable_self_join_elimination into postgresql.conf.sample" This reverts commit `c2d329260c`. Reported-by: Daniel Gustafsson <daniel@yesql.se> Discussion: https://postgr.es/m/D292EB44-806E-439A-82A4-491A1BA59E7A%40yesql.se	2025-04-06 14:30:20 +03:00
Alexander Korotkov	c2d329260c	Put enable_self_join_elimination into postgresql.conf.sample `fc069a3a63` implements Self-Join Elimination (SJE) and provides a new GUC variable: enable_self_join_elimination. This commit adds enable_self_join_elimination to the postgresql.conf.sample, as it was forgotten in the original commit. Discussion: https://postgr.es/m/CAHewXN%3D%2Bghd6O6im46q7j2u6c3H6vkXtXmF%3D_v4CfGSnjje8PA%40mail.gmail.com Author: Tender Wang <tndrwang@gmail.com>	2025-04-06 13:24:16 +03:00
Andres Freund	2a5e709e72	Enable IO concurrency on all systems Previously effective_io_concurrency and maintenance_io_concurrency could not be set above 0 on machines without fadvise support. AIO enables IO concurrency without such support, via io_method=worker. Currently only subsystems using the read stream API will take advantage of this. Other users of maintenance_io_concurrency (like recovery prefetching) which leverage OS advice directly will not benefit from this change. In those cases, maintenance_io_concurrency will have no effect on I/O behavior. Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Noah Misch <noah@leadboat.com> Discussion: https://postgr.es/m/CAAKRu_atGgZePo=_g6T3cNtfMf0QxpvoUh5OUqa_cnPdhLd=gw@mail.gmail.com	2025-03-30 19:16:47 -04:00
Álvaro Herrera	9fbd53dea5	Remove the query_id_squash_values GUC Commit `62d712ecfd` introduced the capability to calculate the same queryId for queries with different lengths of constants in a list for an IN clause. This behavior was originally enabled with a GUC query_id_squash_values. After a discussion about the value of such a GUC, it was decided to back out of the use of a GUC and make the squashing behavior the only available option. Author: Sami Imseih <samimseih@gmail.com> Discussion: https://postgr.es/m/Z-LZyygkkNyA8-kR@msg.df7cb.de Discussion: https://postgr.es/m/CA+q6zcVTK-3C-8NWV1oY2NZrvtnMCDqnyYYyk1T7WMUG65MeOQ@mail.gmail.com	2025-03-27 13:33:37 +01:00
Andres Freund	c325a7633f	aio: Add io_method=io_uring Performing AIO using io_uring can be considerably faster than io_method=worker, particularly when lots of small IOs are issued, as a) the context-switch overhead for worker based AIO becomes more significant b) the number of IO workers can become limiting io_uring, however, is linux specific and requires an additional compile-time dependency (liburing). This implementation is fairly simple and there are substantial optimization opportunities. The description of the existing AIO_IO_COMPLETION wait event is updated to make the difference between it and the new AIO_IO_URING_EXECUTION clearer. Reviewed-by: Noah Misch <noah@leadboat.com> Reviewed-by: Jakub Wartak <jakub.wartak@enterprisedb.com> Discussion: https://postgr.es/m/uvrtrknj4kdytuboidbhwclo4gxhswwcpgadptsjvjqcluzmah%40brqs62irg4dt Discussion: https://postgr.es/m/20210223100344.llw5an2aklengrmn@alap3.anarazel.de Discussion: https://postgr.es/m/stj36ea6yyhoxtqkhpieia2z4krnam7qyetc57rfezgk4zgapf@gcnactj4z56m	2025-03-26 19:49:13 -04:00
Masahiko Sawada	04ff636cbc	Add GUC option to control maximum active replication origins. This commit introduces a new GUC option max_active_replication_origins to control the maximum number of active replication origins. Previously, this was controlled by 'max_replication_slots'. Having a separate GUC option provides better flexibility for setting up subscribers, as they may not require replication slots (for cascading replication) but always require replication origins. Author: Euler Taveira <euler@eulerto.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Reviewed-by: vignesh C <vignesh21@gmail.com> Discussion: https://postgr.es/m/b81db436-8262-4575-b7c4-bc0c1551000b@app.fastmail.com	2025-03-21 12:20:15 -07:00
Nathan Bossart	0164a0f9ee	Add vacuum_truncate configuration parameter. This new parameter works just like the storage parameter of the same name: if set to true (which is the default), autovacuum and VACUUM attempt to truncate any empty pages at the end of the table. It is primarily intended to help users avoid locking issues on hot standbys. The setting can be overridden with the storage parameter or VACUUM's TRUNCATE option. Since there's presently no way to determine whether a Boolean storage parameter is explicitly set or has just picked up the default value, this commit also introduces an isset_offset member to relopt_parse_elt. Suggested-by: Will Storey <will@summercat.com> Author: Nathan Bossart <nathandbossart@gmail.com> Co-authored-by: Gurjeet Singh <gurjeet@singh.im> Reviewed-by: Laurenz Albe <laurenz.albe@cybertec.at> Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com> Reviewed-by: Robert Treat <rob@xzilla.net> Discussion: https://postgr.es/m/Z2DE4lDX4tHqNGZt%40dev.null	2025-03-20 10:16:50 -05:00
Peter Eisentraut	4f7f7b0375	extension_control_path The new GUC extension_control_path specifies a path to look for extension control files. The default value is $system, which looks in the compiled-in location, as before. The path search uses the same code and works in the same way as dynamic_library_path. Some use cases of this are: (1) testing extensions during package builds, (2) installing extensions outside security-restricted containers like Python.app (on macOS), (3) adding extensions to PostgreSQL running in a Kubernetes environment using operators such as CloudNativePG without having to rebuild the base image for each new extension. There is also a tweak in Makefile.global so that it is possible to install extensions using PGXS into an different directory than the default, using 'make install prefix=/else/where'. This previously only worked when specifying the subdirectories, like 'make install datadir=/else/where/share pkglibdir=/else/where/lib', for purely implementation reasons. (Of course, without the path feature, installing elsewhere was rarely useful.) Author: Peter Eisentraut <peter@eisentraut.org> Co-authored-by: Matheus Alcantara <matheusssilv97@gmail.com> Reviewed-by: David E. Wheeler <david@justatheory.com> Reviewed-by: Gabriele Bartolini <gabriele.bartolini@enterprisedb.com> Reviewed-by: Marco Nenciarini <marco.nenciarini@enterprisedb.com> Reviewed-by: Niccolò Fei <niccolo.fei@enterprisedb.com> Discussion: https://www.postgresql.org/message-id/flat/E7C7BFFB-8857-48D4-A71F-88B359FADCFD@justatheory.com	2025-03-19 07:03:20 +01:00
Thomas Munro	06fb5612c9	Increase io_combine_limit range to 1MB. The default of 128kB is unchanged, but the upper limit is changed from 32 blocks to 128 blocks, unless the operating system's IOV_MAX is too low. Some other RDBMSes seem to cap their multi-block buffer pool I/O around this number, and it seems useful to allow experimentation. The concrete change is to our definition of PG_IOV_MAX, which provides the maximum for io_combine_limit and io_max_combine_limit. It also affects a couple of other places that work with arrays of struct iovec or smaller objects on the stack, so we still don't want to use the system IOV_MAX directly without a clamp: it is not under our control and likely to be 1024. 128 seems acceptable for our current usage. For Windows, we can't use real scatter/gather yet, so we continue to define our own IOV_MAX value of 16 and emulate preadv()/pwritev() with loops. Someone would need to research the trade-offs of raising that number. NB if trying to see this working: you might temporarily need to hack BAS_BULKREAD to be bigger, since otherwise the obvious way of "a very big SELECT" is limited by that for now. Suggested-by: Tomas Vondra <tomas@vondra.me> Discussion: https://postgr.es/m/CA%2BhUKG%2B2T9p-%2BzM6Eeou-RAJjTML6eit1qn26f9twznX59qtCA%40mail.gmail.com	2025-03-19 15:40:35 +13:00
Thomas Munro	10f6646847	Introduce io_max_combine_limit. The existing io_combine_limit can be changed by users. The new io_max_combine_limit is fixed at server startup time, and functions as a silent clamp on the user setting. That in itself is probably quite useful, but the primary motivation is: aio_init.c allocates shared memory for all asynchronous IOs including some per-block data, and we didn't want to waste memory you'd never used by assuming they could be up to PG_IOV_MAX. This commit already halves the size of 'AioHandleIov' and 'AioHandleData'. A follow-up commit can now expand PG_IOV_MAX without affecting that. Since our GUC system doesn't support dependencies or cross-checks between GUCs, the user-settable one now assigns a "raw" value to io_combine_limit_guc, and the lower of io_combine_limit_guc and io_max_combine_limit is maintained in io_combine_limit. Reviewed-by: Andres Freund <andres@anarazel.de> (earlier version) Discussion: https://postgr.es/m/CA%2BhUKG%2B2T9p-%2BzM6Eeou-RAJjTML6eit1qn26f9twznX59qtCA%40mail.gmail.com	2025-03-19 15:23:54 +13:00
Álvaro Herrera	62d712ecfd	Introduce squashing of constant lists in query jumbling pg_stat_statements produces multiple entries for queries like SELECT something FROM table WHERE col IN (1, 2, 3, ...) depending on the number of parameters, because every element of ArrayExpr is individually jumbled. Most of the time that's undesirable, especially if the list becomes too large. Fix this by introducing a new GUC query_id_squash_values which modifies the node jumbling code to only consider the first and last element of a list of constants, rather than each list element individually. This affects both the query_id generated by query jumbling, as well as pg_stat_statements query normalization so that it suppresses printing of the individual elements of such a list. The default value is off, meaning the previous behavior is maintained. Author: Dmitry Dolgov <9erthalion6@gmail.com> Reviewed-by: Sergey Dudoladov (mysterious, off-list) Reviewed-by: David Geier <geidav.pg@gmail.com> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org> Reviewed-by: Sami Imseih <samimseih@gmail.com> Reviewed-by: Sutou Kouhei <kou@clear-code.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Marcos Pegoraro <marcos@f10.com.br> Reviewed-by: Julien Rouhaud <rjuju123@gmail.com> Reviewed-by: Zhihong Yu <zyu@yugabyte.com> Tested-by: Yasuo Honda <yasuo.honda@gmail.com> Tested-by: Sergei Kornilov <sk@zsrv.org> Tested-by: Maciek Sakrejda <m.sakrejda@gmail.com> Tested-by: Chengxi Sun <sunchengxi@highgo.com> Tested-by: Jakub Wartak <jakub.wartak@enterprisedb.com> Discussion: https://postgr.es/m/CA+q6zcWtUbT_Sxj0V6HY6EZ89uv5wuG5aefpe_9n0Jr3VwntFg@mail.gmail.com	2025-03-18 18:56:11 +01:00
Andres Freund	247ce06b88	aio: Add io_method=worker The previous commit introduced the infrastructure to start io_workers. This commit actually makes the workers execute IOs. IO workers consume IOs from a shared memory submission queue, run traditional synchronous system calls, and perform the shared completion handling immediately. Client code submits most requests by pushing IOs into the submission queue, and waits (if necessary) using condition variables. Some IOs cannot be performed in another process due to lack of infrastructure for reopening the file, and must processed synchronously by the client code when submitted. For now the default io_method is changed to "worker". We should re-evaluate that around beta1, we might want to be careful and set the default to "sync" for 18. Reviewed-by: Noah Misch <noah@leadboat.com> Co-authored-by: Thomas Munro <thomas.munro@gmail.com> Co-authored-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/uvrtrknj4kdytuboidbhwclo4gxhswwcpgadptsjvjqcluzmah%40brqs62irg4dt Discussion: https://postgr.es/m/20210223100344.llw5an2aklengrmn@alap3.anarazel.de Discussion: https://postgr.es/m/stj36ea6yyhoxtqkhpieia2z4krnam7qyetc57rfezgk4zgapf@gcnactj4z56m	2025-03-18 11:54:01 -04:00
Andres Freund	55b454d0e1	aio: Infrastructure for io_method=worker This commit contains the basic, system-wide, infrastructure for io_method=worker. It does not yet actually execute IO, this commit just provides the infrastructure for running IO workers, kept separate for easier review. The number of IO workers can be adjusted with a PGC_SIGHUP GUC. Eventually we'd like to make the number of workers dynamically scale up/down based on the current "IO load". To allow the number of IO workers to be increased without a restart, we need to reserve PGPROC entries for the workers unconditionally. This has been judged to be worth the cost. If it turns out to be problematic, we can introduce a PGC_POSTMASTER GUC to control the maximum number. As io workers might be needed during shutdown, e.g. for AIO during the shutdown checkpoint, a new PMState phase is added. IO workers are shut down after the shutdown checkpoint has been performed and walsender/archiver have shut down, but before the checkpointer itself shuts down. See also `87a6690cc6`. Updates PGSTAT_FILE_FORMAT_ID due to the addition of a new BackendType. Reviewed-by: Noah Misch <noah@leadboat.com> Co-authored-by: Thomas Munro <thomas.munro@gmail.com> Co-authored-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/uvrtrknj4kdytuboidbhwclo4gxhswwcpgadptsjvjqcluzmah%40brqs62irg4dt Discussion: https://postgr.es/m/20210223100344.llw5an2aklengrmn@alap3.anarazel.de Discussion: https://postgr.es/m/stj36ea6yyhoxtqkhpieia2z4krnam7qyetc57rfezgk4zgapf@gcnactj4z56m	2025-03-18 11:54:01 -04:00
Daniel Gustafsson	daa02c6bd9	Add X25519 to the default set of curves Since many clients default to the X25519 curve in the TLS handshake, the fact that the server by defualt doesn't support it cause an extra roundtrip for each TLS connection. By adding multiple curves, which is supported since `3d1ef3a15c`, we can reduce the risk of extra roundtrips. Author: Daniel Gustafsson <daniel@yesql.se> Co-authored-by: Jacob Champion <jacob.champion@enterprisedb.com> Reported-by: Andres Freund <andres@anarazel.de> Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com> Discussion: https://postgr.es/m/20240616234612.6cslu7nqexquvwj7@awork3.anarazel.de	2025-03-18 15:26:27 +01:00
Melanie Plageman	cc6be07ebd	Increase default maintenance_io_concurrency to 16 Since its introduction in `fc34b0d9de`, the default maintenance_io_concurrency has been larger than the default effective_io_concurrency. maintenance_io_concurrency primarily controlled prefetching done on behalf of the whole system, for operations like recovery. Therefore it makes sense for it to have a value equal to or greater than effective_io_concurrency, which controls I/O concurrency for reading a relation in a bitmap heap scan. `ff79b5b2ab` increased effective_io_concurrency to 16, so we'll increase maintenance_io_concurrency as well. For now, though, we'll keep the defaults of effective_io_concurrency and maintenance_io_concurrency equal to one another (16). On fast, high IOPs systems, significantly higher values of maintenance_io_concurrency are observably beneficial [1]. However, such values would flood low IOPs systems and increase overall system I/O latency. It is worth mentioning that since `9256822608` and `c3e775e608`, maintenance_io_concurrency also controls the I/O concurrency of each vacuum worker. Since many autovacuum workers may be simultaneously issuing I/Os, we want to keep maintenance_io_concurrency appropriately conservative. [1] https://postgr.es/m/c5d52837-6256-0556-ac8c-d6d3d558820a%40enterprisedb.com Suggested-by: Jakub Wartak <jakub.wartak@enterprisedb.com> Discussion: https://postgr.es/m/CAKZiRmxdHQaU%2B2Zpe6d%3Dx%3D0vigJ1sfWwwVYLJAf%3Dud_wQ_VcUw%40mail.gmail.com	2025-03-18 09:08:10 -04:00
Andres Freund	02844012b3	aio: Basic subsystem initialization This commit just does the minimal wiring up of the AIO subsystem, added in the next commit, to the rest of the system. The next commit contains more details about motivation and architecture. This commit is kept separate to make it easier to review, separating the changes across the tree, from the implementation of the new subsystem. We discussed squashing this commit with the main commit before merging AIO, but there has been a mild preference for keeping it separate. Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Reviewed-by: Noah Misch <noah@leadboat.com> Discussion: https://postgr.es/m/uvrtrknj4kdytuboidbhwclo4gxhswwcpgadptsjvjqcluzmah%40brqs62irg4dt	2025-03-17 18:51:33 -04:00
Fujii Masao	6d376c3b0d	Add GUC option to log lock acquisition failures. This commit introduces a new GUC, log_lock_failure, which controls whether a detailed log message is produced when a lock acquisition fails. Currently, it only supports logging lock failures caused by SELECT ... NOWAIT. The log message includes information about all processes holding or waiting for the lock that couldn't be acquired, helping users analyze and diagnose the causes of lock failures. Currently, this option does not log failures from SELECT ... SKIP LOCKED, as that could generate excessive log messages if many locks are skipped, causing unnecessary noise. This mechanism can be extended in the future to support for logging lock failures from other commands, such as LOCK TABLE ... NOWAIT. Author: Yuki Seino <seinoyu@oss.nttdata.com> Co-authored-by: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Jelte Fennema-Nio <postgres@jeltef.nl> Discussion: https://postgr.es/m/411280a186cc26ef7034e0f2dfe54131@oss.nttdata.com	2025-03-14 23:14:12 +09:00
Melanie Plageman	ff79b5b2ab	Increase default effective_io_concurrency to 16 The default effective_io_concurrency has been 1 since it was introduced in `b7b8f0b609`. Referencing the associated discussion [1], it seems 1 was chosen as a conservative value that seemed unlikely to cause regressions. Experimentation on high latency cloud storage as well as fast, local nvme storage (see Discussion link) shows that even slightly higher values improve query timings substantially. 1 actually performs worse than 0 [2]. With effective_io_concurrency 1, we are not prefetching enough to avoid I/O stalls, but we are issuing extra syscalls. The new default is 16, which should be more appropriate for common hardware while still avoiding flooding low IOPs devices with I/O requests. [1] https://www.postgresql.org/message-id/flat/FDDBA24E-FF4D-4654-BA75-692B3BA71B97%40enterprisedb.com [2] https://www.postgresql.org/message-id/CAAKRu_Zv08Cic%3DqdCfzrQabpEXGrd9Z9UOW5svEVkCM6%3DFXA9g%40mail.gmail.com Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/CAAKRu_Z%2BJa-mwXebOoOERMMUMvJeRhzTjad4dSThxG0JLXESxw%40mail.gmail.com	2025-03-12 15:57:44 -04:00
Melanie Plageman	18cd15e706	Add connection establishment duration logging Add log_connections option 'setup_durations' which logs durations of several key parts of connection establishment and backend setup. For an incoming connection, starting from when the postmaster gets a socket from accept() and ending when the forked child backend is first ready for query, there are multiple steps that could each take longer than expected due to external factors. This logging provides visibility into authentication and fork duration as well as the end-to-end connection establishment and backend initialization time. To make this portable, the timings captured in the postmaster (socket creation time, fork initiation time) are passed through the BackendStartupData. Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com> Reviewed-by: Jelte Fennema-Nio <postgres@jeltef.nl> Reviewed-by: Guillaume Lelarge <guillaume.lelarge@dalibo.com> Discussion: https://postgr.es/m/flat/CAAKRu_b_smAHK0ZjrnL5GRxnAVWujEXQWpLXYzGbmpcZd3nLYw%40mail.gmail.com	2025-03-12 11:35:27 -04:00
Melanie Plageman	9219093cab	Modularize log_connections output Convert the boolean log_connections GUC into a list GUC comprised of the connection aspects to log. This gives users more control over the volume and kind of connection logging. The current log_connections options are 'receipt', 'authentication', and 'authorization'. The empty string disables all connection logging. 'all' enables all available connection logging. For backwards compatibility, the most common values for the log_connections boolean are still supported (on, off, 1, 0, true, false, yes, no). Note that previously supported substrings of on, off, true, false, yes, and no are no longer supported. Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Discussion: https://postgr.es/m/flat/CAAKRu_b_smAHK0ZjrnL5GRxnAVWujEXQWpLXYzGbmpcZd3nLYw%40mail.gmail.com	2025-03-12 11:35:21 -04:00
Melanie Plageman	06eae9e621	Trigger more frequent autovacuums with relallfrozen Calculate the insert threshold for triggering an autovacuum of a relation based on the number of unfrozen pages. By only considering the unfrozen portion of the table when calculating how many tuples to add to the insert threshold, we can trigger more frequent vacuums of insert-heavy tables. This increases the chances of vacuuming those pages when they still reside in shared buffers This also increases the number of autovacuums triggered by tuples inserted and not by wraparound risk. We prefer to freeze these pages during insert-triggered autovacuums, as anti-wraparound vacuums are not automatically canceled by conflicting lock requests. We calculate the unfrozen percentage of the table using the recently added (`99f8f3fbbc`) relallfrozen column of pg_class. Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Nathan Bossart <nathandbossart@gmail.com> Reviewed-by: Greg Sabino Mullane <htamfids@gmail.com> Reviewed-by: Robert Treat <rob@xzilla.net> Reviewed-by: wenhui qiu <qiuwenhuifx@gmail.com> Discussion: https://postgr.es/m/flat/CAAKRu_aj-P7YyBz_cPNwztz6ohP%2BvWis%3Diz3YcomkB3NpYA--w%40mail.gmail.com	2025-03-03 14:42:00 -05:00
Michael Paquier	6c349d83b6	Re-add GUC track_wal_io_timing This commit is a rework of `2421e9a51d`, about which Andres Freund has raised some concerns as it is valuable to have both track_io_timing and track_wal_io_timing in some cases, as the WAL write and fsync paths can be a major bottleneck for some workloads. Hence, it can be relevant to not calculate the WAL timings in environments where pg_test_timing performs poorly while capturing some IO data under track_io_timing for the non-WAL IO paths. The opposite can be also true: it should be possible to disable the non-WAL timings and enable the WAL timings (the previous GUC setups allowed this possibility). track_wal_io_timing is added back in this commit, controlling if WAL timings should be calculated in pg_stat_io for the read, fsync and write paths, as done previously with pg_stat_wal. pg_stat_wal previously tracked only the sync and write parts (now removed), read stats is new data tracked in pg_stat_io, all three are aggregated if track_wal_io_timing is enabled. The read part matters during recovery or if a XLogReader is used. Extra note: more control over if the types of timings calculated in pg_stat_io could be done with a GUC that lists pairs of (IOObject,IOOp). Reported-by: Andres Freund <andres@anarazel.de> Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Co-authored-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/3opf2wh2oljco6ldyqf7ukabw3jijnnhno6fjb4mlu6civ5h24@fcwmhsgmlmzu	2025-02-26 09:49:59 +09:00
Michael Paquier	2421e9a51d	Remove read/sync fields from pg_stat_wal and GUC track_wal_io_timing The four following attributes are removed from pg_stat_wal: * wal_write * wal_sync * wal_write_time * wal_sync_time `a051e71e28` has added an equivalent of this information in pg_stat_io with more granularity as this now spreads across the backend types, IO context and IO objects. So, keeping the same information in pg_stat_wal has little benefits. Another benefit of this commit is the removal of PendingWalStats, simplifying an upcoming patch to add per-backend WAL statistics, which already support IO statistics and which have access to the write/sync stats data of WAL. The GUC track_wal_io_timing, that was used to enable or disable the aggregation of the write and sync timings for WAL, is also removed. pgstat_prepare_io_time() is simplified. Bump catalog version. Bump PGSTAT_FILE_FORMAT_ID, due to the update of PgStat_WalStats. Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://postgr.es/m/Z7RkQ0EfYaqqjgz/@ip-10-97-1-34.eu-west-3.compute.internal	2025-02-24 09:51:56 +09:00
Daniel Gustafsson	b3f0be788a	Add support for OAUTHBEARER SASL mechanism This commit implements OAUTHBEARER, RFC 7628, and OAuth 2.0 Device Authorization Grants, RFC 8628. In order to use this there is a new pg_hba auth method called oauth. When speaking to a OAuth- enabled server, it looks a bit like this: $ psql 'host=example.org oauth_issuer=... oauth_client_id=...' Visit https://oauth.example.org/login and enter the code: FPQ2-M4BG Device authorization is currently the only supported flow so the OAuth issuer must support that in order for users to authenticate. Third-party clients may however extend this and provide their own flows. The built-in device authorization flow is currently not supported on Windows. In order for validation to happen server side a new framework for plugging in OAuth validation modules is added. As validation is implementation specific, with no default specified in the standard, PostgreSQL does not ship with one built-in. Each pg_hba entry can specify a specific validator or be left blank for the validator installed as default. This adds a requirement on libcurl for the client side support, which is optional to build, but the server side has no additional build requirements. In order to run the tests, Python is required as this adds a https server written in Python. Tests are gated behind PG_TEST_EXTRA as they open ports. This patch has been a multi-year project with many contributors involved with reviews and in-depth discussions: Michael Paquier, Heikki Linnakangas, Zhihong Yu, Mahendrakar Srinivasarao, Andrey Chudnovsky and Stephen Frost to name a few. While Jacob Champion is the main author there have been some levels of hacking by others. Daniel Gustafsson contributed the validation module and various bits and pieces; Thomas Munro wrote the client side support for kqueue. Author: Jacob Champion <jacob.champion@enterprisedb.com> Co-authored-by: Daniel Gustafsson <daniel@yesql.se> Co-authored-by: Thomas Munro <thomas.munro@gmail.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Reviewed-by: Antonin Houska <ah@cybertec.at> Reviewed-by: Kashif Zeeshan <kashi.zeeshan@gmail.com> Discussion: https://postgr.es/m/d1b467a78e0e36ed85a09adf979d04cf124a9d4b.camel@vmware.com	2025-02-20 16:25:17 +01:00
Amit Kapila	ac0e33136a	Invalidate inactive replication slots. This commit introduces idle_replication_slot_timeout GUC that allows inactive slots to be invalidated at the time of checkpoint. Because checkpoints happen checkpoint_timeout intervals, there can be some lag between when the idle_replication_slot_timeout was exceeded and when the slot invalidation is triggered at the next checkpoint. To avoid such lags, users can force a checkpoint to promptly invalidate inactive slots. Note that the idle timeout invalidation mechanism is not applicable for slots that do not reserve WAL or for slots on the standby server that are synced from the primary server (i.e., standby slots having 'synced' field 'true'). Synced slots are always considered to be inactive because they don't perform logical decoding to produce changes. The slots can become inactive for a long period if a subscriber is down due to a system error or inaccessible because of network issues. If such a situation persists, it might be more practical to recreate the subscriber rather than attempt to recover the node and wait for it to catch up which could be time-consuming. Then, external tools could create replication slots (e.g., for migrations or upgrades) that may fail to remove them if an error occurs, leaving behind unused slots that take up space and resources. Manually cleaning them up can be tedious and error-prone, and without intervention, these lingering slots can cause unnecessary WAL retention and system bloat. As the duration of idle_replication_slot_timeout is in minutes, any test using that would be time-consuming. We are planning to commit a follow up patch for tests by using the injection point framework. Author: Nisha Moond <nisha.moond412@gmail.com> Author: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> Reviewed-by: Peter Smith <smithpb2250@gmail.com> Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Vignesh C <vignesh21@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: Hou Zhijie <houzj.fnst@fujitsu.com> Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://postgr.es/m/CALj2ACW4aUe-_uFQOjdWCEN-xXoLGhmvRFnL8SNw_TZ5nJe+aw@mail.gmail.com Discussion: https://postgr.es/m/OS0PR01MB5716C131A7D80DAE8CB9E88794FC2@OS0PR01MB5716.jpnprd01.prod.outlook.com	2025-02-19 09:29:50 +05:30
Nathan Bossart	bb8dff9995	Add cost-based vacuum delay time to progress views. This commit adds the amount of time spent sleeping due to cost-based delay to the pg_stat_progress_vacuum and pg_stat_progress_analyze system views. A new configuration parameter named track_cost_delay_timing, which is off by default, controls whether this information is gathered. For vacuum, the reported value includes the sleep time of any associated parallel workers. However, parallel workers only report their sleep time once per second to avoid overloading the leader process. Bumps catversion. Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Co-authored-by: Nathan Bossart <nathandbossart@gmail.com> Reviewed-by: Sami Imseih <samimseih@gmail.com> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Masahiro Ikeda <ikedamsh@oss.nttdata.com> Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com> Reviewed-by: Sergei Kornilov <sk@zsrv.org> Discussion: https://postgr.es/m/ZmaXmWDL829fzAVX%40ip-10-97-1-34.eu-west-3.compute.internal	2025-02-11 16:38:14 -06:00
Melanie Plageman	052026c9b9	Eagerly scan all-visible pages to amortize aggressive vacuum Aggressive vacuums must scan every unfrozen tuple in order to advance the relfrozenxid/relminmxid. Because data is often vacuumed before it is old enough to require freezing, relations may build up a large backlog of pages that are set all-visible but not all-frozen in the visibility map. When an aggressive vacuum is triggered, all of these pages must be scanned. These pages have often been evicted from shared buffers and even from the kernel buffer cache. Thus, aggressive vacuums often incur large amounts of extra I/O at the expense of foreground workloads. To amortize the cost of aggressive vacuums, eagerly scan some all-visible but not all-frozen pages during normal vacuums. All-visible pages that are eagerly scanned and set all-frozen in the visibility map are counted as successful eager freezes and those not frozen are counted as failed eager freezes. If too many eager scans fail in a row, eager scanning is temporarily suspended until a later portion of the relation. The number of failures tolerated is configurable globally and per table. To effectively amortize aggressive vacuums, we cap the number of successes as well. Capping eager freeze successes also limits the amount of potentially wasted work if these pages are modified again before the next aggressive vacuum. Once we reach the maximum number of blocks successfully eager frozen, eager scanning is disabled for the remainder of the vacuum of the relation. Original design idea from Robert Haas, with enhancements from Andres Freund, Tomas Vondra, and me Reviewed-by: Robert Haas <robertmhaas@gmail.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Robert Treat <rob@xzilla.net> Reviewed-by: Bilal Yavuz <byavuz81@gmail.com> Discussion: https://postgr.es/m/flat/CAAKRu_ZF_KCzZuOrPrOqjGVe8iRVWEAJSpzMgRQs%3D5-v84cXUg%40mail.gmail.com	2025-02-11 13:53:48 -05:00
Andres Freund	4dd09a1d41	config: Rename "Asynchronous Behavior" to "I/O" "I/O" seems more descriptive than "Asynchronous Behavior", given that some of the GUCs in the section don't relate to anything asynchronous. Most other abbreviations in the config sections are un-abbreviated, but "Input/Output" seems less likely to be helpful than just IO or I/O. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/x3tlw2jk5gm3r3mv47hwrshffyw7halpczkfbk3peksxds7bvc@lguk43z3bsyq	2025-02-11 12:53:40 -05:00
Andres Freund	740766d37c	config: Split "Worker Processes" out of "Asynchronous Behavior" Having all the worker related GUCs in the same section as IO controlling GUCs doesn't really make sense. Create a separate section for them. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/x3tlw2jk5gm3r3mv47hwrshffyw7halpczkfbk3peksxds7bvc@lguk43z3bsyq	2025-02-11 12:53:40 -05:00
Nathan Bossart	306dc520b9	Introduce autovacuum_vacuum_max_threshold. One way autovacuum chooses tables to vacuum is by comparing the number of updated or deleted tuples with a value calculated using autovacuum_vacuum_threshold and autovacuum_vacuum_scale_factor. The threshold specifies the base value for comparison, and the scale factor specifies the fraction of the table size to add to it. This strategy ensures that smaller tables are vacuumed after fewer updates/deletes than larger tables, which is reasonable in many cases but can result in infrequent vacuums on very large tables. This is undesirable for a couple of reasons, such as very large tables incurring a huge amount of bloat between vacuums. This new parameter provides a way to set a limit on the value calculated with autovacuum_vacuum_threshold and autovacuum_vacuum_scale_factor so that very large tables are vacuumed more frequently. By default, it is set to 100,000,000 tuples, but it can be disabled by setting it to -1. It can also be adjusted for individual tables by changing storage parameters. Author: Nathan Bossart <nathandbossart@gmail.com> Co-authored-by: Frédéric Yhuel <frederic.yhuel@dalibo.com> Reviewed-by: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Reviewed-by: Laurenz Albe <laurenz.albe@cybertec.at> Reviewed-by: Michael Banck <mbanck@gmx.net> Reviewed-by: Joe Conway <mail@joeconway.com> Reviewed-by: Sami Imseih <samimseih@gmail.com> Reviewed-by: David Rowley <dgrowleyml@gmail.com> Reviewed-by: wenhui qiu <qiuwenhuifx@gmail.com> Reviewed-by: Vinícius Abrahão <vinnix.bsd@gmail.com> Reviewed-by: Robert Treat <rob@xzilla.net> Reviewed-by: Alena Rybakina <a.rybakina@postgrespro.ru> Discussion: https://postgr.es/m/956435f8-3b2f-47a6-8756-8c54ded61802%40dalibo.com	2025-02-05 15:48:18 -06:00
Melanie Plageman	af2317652d	Reorder vacuum GUCs in postgresql.conf.sample to match docs `ca9c6a5680` consolidated most of vacuum-related GUCs' documentation into a new subsection. It neglected, however, to reorganize postgresql.conf.sample to match the new order. Do this now. Reported-by: Álvaro Herrera Discussion: https://postgr.es/m/202501110902.5banlseavz7c%40alvherre.pgsql	2025-01-13 15:21:04 -05:00
Nathan Bossart	c758119e5b	Allow changing autovacuum_max_workers without restarting. This commit introduces a new parameter named autovacuum_worker_slots that controls how many autovacuum worker slots to reserve during server startup. Modifying this new parameter's value does require a server restart, but it should typically be set to the upper bound of what you might realistically need to set autovacuum_max_workers. With that new parameter in place, autovacuum_max_workers can now be changed with a SIGHUP (e.g., pg_ctl reload). If autovacuum_max_workers is set higher than autovacuum_worker_slots, a WARNING is emitted, and the server will only start up to autovacuum_worker_slots workers at a given time. If autovacuum_max_workers is set to a value less than the number of currently-running autovacuum workers, the existing workers will continue running, but no new workers will be started until the number of running autovacuum workers drops below autovacuum_max_workers. Reviewed-by: Sami Imseih, Justin Pryzby, Robert Haas, Andres Freund, Yogesh Sharma Discussion: https://postgr.es/m/20240410212344.GA1824549%40nathanxps13	2025-01-06 15:01:22 -06:00
Nathan Bossart	db6a4a985b	Deprecate MD5 passwords. MD5 has been considered to be unsuitable for use as a cryptographic hash algorithm for some time. Furthermore, MD5 password hashes in PostgreSQL are vulnerable to pass-the-hash attacks, i.e., knowing the username and hashed password is sufficient to authenticate. The SCRAM-SHA-256 method added in v10 is not subject to these problems and is considered to be superior to MD5. This commit marks MD5 password support in PostgreSQL as deprecated and to be removed in a future release. The documentation now contains several deprecation notices, and CREATE ROLE and ALTER ROLE now emit deprecation warnings when setting MD5 passwords. The warnings can be disabled by setting the md5_password_warnings parameter to "off". Reviewed-by: Greg Sabino Mullane, Jim Nasby Discussion: https://postgr.es/m/ZwbfpJJol7lDWajL%40nathan	2024-12-02 13:30:07 -06:00
Richard Guo	a8ccf4e93a	Reordering DISTINCT keys to match input path's pathkeys The ordering of DISTINCT items is semantically insignificant, so we can reorder them as needed. In fact, in the parser, we absorb the sorting semantics of the sortClause as much as possible into the distinctClause, ensuring that one clause is a prefix of the other. This can help avoid a possible need to re-sort. In this commit, we attempt to adjust the DISTINCT keys to match the input path's pathkeys. This can likewise help avoid re-sorting, or allow us to use incremental-sort to save efforts. For DISTINCT ON expressions, the parser already ensures that they match the initial ORDER BY expressions. When reordering the DISTINCT keys, we must ensure that the resulting pathkey list matches the initial distinctClause pathkeys. This introduces a new GUC, enable_distinct_reordering, which allows the optimization to be disabled if needed. Author: Richard Guo Reviewed-by: Andrei Lepikhov Discussion: https://postgr.es/m/CAMbWs48dR26cCcX0f=8bja2JKQPcU64136kHk=xekHT9xschiQ@mail.gmail.com	2024-11-26 09:25:18 +09:00
Daniel Gustafsson	45188c2ea2	Support configuring TLSv1.3 cipher suites The ssl_ciphers GUC can only set cipher suites for TLSv1.2, and lower, connections. For TLSv1.3 connections a different OpenSSL API must be used. This adds a new GUC, ssl_tls13_ciphers, which can be used to configure a colon separated list of cipher suites to support when performing a TLSv1.3 handshake. Original patch by Erica Zhang with additional hacking by me. Author: Erica Zhang <ericazhangy2021@qq.com> Author: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Reviewed-by: Jelte Fennema-Nio <postgres@jeltef.nl> Discussion: https://postgr.es/m/tencent_063F89FA72CCF2E48A0DF5338841988E9809@qq.com	2024-10-24 15:20:32 +02:00
Daniel Gustafsson	3d1ef3a15c	Support configuring multiple ECDH curves The ssl_ecdh_curve GUC only accepts a single value, but the TLS handshake can list multiple curves in the groups extension (the extension has been renamed to contain more than elliptic curves). This changes the GUC to accept a colon-separated list of curves. This commit also renames the GUC to ssl_groups to match the new nomenclature for the TLS extension. Original patch by Erica Zhang with additional hacking by me. Author: Erica Zhang <ericazhangy2021@qq.com> Author: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Reviewed-by: Jelte Fennema-Nio <postgres@jeltef.nl> Discussion: https://postgr.es/m/tencent_063F89FA72CCF2E48A0DF5338841988E9809@qq.com	2024-10-24 15:20:28 +02:00
John Naylor	bbf668d66f	Lower minimum maintenance_work_mem to 64kB Since the introduction of TID store, vacuum uses far less memory in the common case than in versions 16 and earlier. Invoking multiple rounds of index vacuuming in turn requires a much larger table. It'd be a good idea anyway to cover this case in regression testing, and a lower limit is less painful for slow buildfarm animals. The reason to do it now is to re-enable coverage of the bugfix in commit `83c39a1f7f`. For consistency, give autovacuum_work_mem the same treatment. Suggested by Andres Freund Tested by Melanie Plageman Backpatch to v17, where TID store was introduced Discussion: https://postgr.es/m/20240516205458.ohvlzis5b5tvejru@awork3.anarazel.de Discussion: https://postgr.es/m/20240722164745.fvaoh6g6zprisqgp%40awork3.anarazel.de	2024-08-10 14:52:56 +07:00
Amit Kapila	2357c9223b	Rename standby_slot_names to synchronized_standby_slots. The standby_slot_names GUC allows the specification of physical standby slots that must be synchronized before the logical walsenders associated with logical failover slots. However, for this purpose, the GUC name is too generic. Author: Hou Zhijie Reviewed-by: Bertrand Drouvot, Masahiko Sawada Backpatch-through: 17 Discussion: https://postgr.es/m/ZnWeUgdHong93fQN@momjian.us	2024-07-01 11:36:56 +05:30
Peter Eisentraut	b6dbbaa302	Move new SLRU buffers GUCs to a better place in postgresql.conf.sample They were under "File Locations", which doesn't make sense. Move them to Resource Usage / Memory, which matches their categorization in the source code and in the documentation.	2024-06-05 20:24:41 +02:00
Bruce Momjian	4712177a6c	postgresql.conf: align variable comments, mostly new ones Backpatch-through: master	2024-05-06 21:16:06 -04:00
Alexander Korotkov	ff9f72c68f	revert: Transform OR clauses to ANY expression This commit reverts `72bd38cc99` due to implementation and design issues. Reported-by: Tom Lane Discussion: https://postgr.es/m/3604469.1712628736%40sss.pgh.pa.us	2024-04-10 02:28:09 +03:00
Alexander Korotkov	2af75e1174	Fix the value of or_to_any_transform_limit in postgresql.conf.sample Reported-by: Justin Pryzby Discussion: https://postgr.es/m/ZhM8jH8gsKm5Q-9p%40pryzbyj2023	2024-04-08 08:51:07 +03:00
Alexander Korotkov	72bd38cc99	Transform OR clauses to ANY expression Replace (expr op C1) OR (expr op C2) ... with expr op ANY(ARRAY[C1, C2, ...]) on the preliminary stage of optimization when we are still working with the expression tree. Here Cn is a n-th constant expression, 'expr' is non-constant expression, 'op' is an operator which returns boolean result and has a commuter (for the case of reverse order of constant and non-constant parts of the expression, like 'Cn op expr'). Sometimes it can lead to not optimal plan. This is why there is a or_to_any_transform_limit GUC. It specifies a threshold value of length of arguments in an OR expression that triggers the OR-to-ANY transformation. Generally, more groupable OR arguments mean that transformation will be more likely to win than to lose. Discussion: https://postgr.es/m/567ED6CA.2040504%40sigaev.ru Author: Alena Rybakina <lena.ribackina@yandex.ru> Author: Andrey Lepikhov <a.lepikhov@postgrespro.ru> Reviewed-by: Peter Geoghegan <pg@bowt.ie> Reviewed-by: Ranier Vilela <ranier.vf@gmail.com> Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Reviewed-by: Jian He <jian.universality@gmail.com>	2024-04-08 01:27:52 +03:00
Thomas Munro	98f320eb2e	Increase default vacuum_buffer_usage_limit to 2MB. The BAS_VACUUM ring size has been 256kB since commit `d526575f` introduced the mechanism 17 years ago. Commit `1cbbee03` recently made it configurable but retained the traditional default. The correct default size has been debated for years, but 256kB is certainly very small. VACUUM soon needs to write back data it dirtied only 32 blocks ago, which usually requires flushing the WAL. New experiments in prefetching pages for VACUUM exacerbated the problem by crashing into dirty data even sooner. Let's make the default 2MB. That's 1.6% of the default toy buffer pool size, and 0.2% of 1GB, which would be a considered a small shared_buffers setting for a real system these days. Users are still free to set the GUC to a different value. Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/20240403221257.md4gfki3z75cdyf6%40awork3.anarazel.de Discussion: https://postgr.es/m/CA%2BhUKGLY4Q4ZY4f1rvnFtv6%2BPkjNf8MejdPkcju3Qii9DYqqcQ%40mail.gmail.com	2024-04-06 23:12:03 +13:00
Thomas Munro	210622c60e	Provide vectored variant of ReadBuffer(). Break ReadBuffer() up into two steps. StartReadBuffers() and WaitReadBuffers() give us two main advantages: 1. Multiple consecutive blocks can be read with one system call. 2. Advice (hints of future reads) can optionally be issued to the kernel ahead of time. The traditional ReadBuffer() function is now implemented in terms of those functions, to avoid duplication. A new GUC io_combine_limit is defined, and the functions for limiting per-backend pin counts are made into public APIs. Those are provided for use by callers of StartReadBuffers(), when deciding how many buffers to read at once. The following commit will add a higher level mechanism for doing that automatically with a practical interface. With some more infrastructure in later work, StartReadBuffers() could be extended to start real asynchronous I/O instead of just issuing advice and leaving WaitReadBuffers() to do the work synchronously. Author: Thomas Munro <thomas.munro@gmail.com> Author: Andres Freund <andres@anarazel.de> (some optimization tweaks) Reviewed-by: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com> Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Tested-by: Tomas Vondra <tomas.vondra@enterprisedb.com> Discussion: https://postgr.es/m/CA+hUKGJkOiOCa+mag4BF+zHo7qo=o9CFheB8=g6uT5TUm2gkvA@mail.gmail.com	2024-04-03 00:23:20 +13:00

1 2 3 4 5 ...

647 Commits