postgres

mirror of https://github.com/postgres/postgres.git synced 2026-01-27 21:43:08 +03:00

Author	SHA1	Message	Date
Michael Paquier	f9a09aa295	Add wal_fpi_bytes to pg_stat_wal and pg_stat_get_backend_wal() This new counter, called "wal_fpi_bytes", tracks the total amount in bytes of full page images (FPIs) generated in WAL. This data becomes available globally via pg_stat_wal, and for backend statistics via pg_stat_get_backend_wal(). Previously, this information could only be retrieved with pg_waldump or pg_walinspect, which may not be available depending on the environment, and are expensive to execute. It offers hints about how much FPIs impact the WAL generated, which could be a large percentage for some workloads, as well as the effects of wal_compression or page holes. Bump catalog version. Bump PGSTAT_FILE_FORMAT_ID, due to the addition of wal_fpi_bytes in PgStat_WalCounters. Author: Shinya Kato <shinya11.kato@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/CAOzEurQtZEAfg6P0kU3Wa-f9BWQOi0RzJEMPN56wNTOmJLmfaQ@mail.gmail.com	2025-10-28 16:21:51 +09:00
Amit Kapila	f0b3573c3a	Introduce "REFRESH SEQUENCES" for subscriptions. This patch adds support for a new SQL command: ALTER SUBSCRIPTION ... REFRESH SEQUENCES This command updates the sequence entries present in the pg_subscription_rel catalog table with the INIT state to trigger resynchronization. In addition to the new command, the following subscription commands have been enhanced to automatically refresh sequence mappings: ALTER SUBSCRIPTION ... REFRESH PUBLICATION ALTER SUBSCRIPTION ... ADD PUBLICATION ALTER SUBSCRIPTION ... DROP PUBLICATION ALTER SUBSCRIPTION ... SET PUBLICATION These commands will perform the following actions: Add newly published sequences that are not yet part of the subscription. Remove sequences that are no longer included in the publication. This ensures that sequence replication remains aligned with the current state of the publication on the publisher side. Note that the actual synchronization of sequence data/values will be handled in a subsequent patch that introduces a dedicated sequence sync worker. Author: Vignesh C <vignesh21@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: shveta malik <shveta.malik@gmail.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com> Reviewed-by: Peter Smith <smithpb2250@gmail.com> Reviewed-by: Nisha Moond <nisha.moond412@gmail.com> Reviewed-by: Shlok Kyal <shlok.kyal.oss@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Hou Zhijie <houzj.fnst@fujitsu.com> Discussion: https://postgr.es/m/CAA4eK1LC+KJiAkSrpE_NwvNdidw9F2os7GERUeSxSKv71gXysQ@mail.gmail.com	2025-10-23 08:30:27 +00:00
Michael Paquier	2519fa8362	Bump catalog version for new function error_on_null() Oversight in `2b75c38b70`. No comments. Discussion: https://postgr.es/m/aPgu7kwiT4iGo6Ya@paquier.xyz	2025-10-22 10:08:47 +09:00
Michael Paquier	2b75c38b70	Add error_on_null(), checking if the input is the null value This polymorphic function produces an error if the input value is detected as being the null value; otherwise it returns the input value unchanged. This function can for example become handy in SQL function bodies, to enforce that exactly one row was returned. Author: Joel Jacobson <joel@compiler.org> Reviewed-by: Vik Fearing <vik@postgresfriends.org> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/ece8c6d1-2ab1-45d5-ba12-8dec96fc8886@app.fastmail.com Discussion: https://postgr.es/m/de94808d-ed58-4536-9e28-e79b09a534c7@app.fastmail.com	2025-10-22 09:55:17 +09:00
Amit Kapila	41c674d2e3	Refactor logical worker synchronization code into a separate file. To support the upcoming addition of a sequence synchronization worker, this patch extracts common synchronization logic shared by table sync workers and the new sequence sync worker into a dedicated file. This modularization improves code reuse, maintainability, and clarity in the logical workers framework. Author: vignesh C <vignesh21@gmail.com> Author: Hou Zhijie <houzj.fnst@fujitsu.com> Reviewed-by: shveta malik <shveta.malik@gmail.com> Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com> Reviewed-by: Peter Smith <smithpb2250@gmail.com> Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://postgr.es/m/CAA4eK1LC+KJiAkSrpE_NwvNdidw9F2os7GERUeSxSKv71gXysQ@mail.gmail.com	2025-10-16 05:10:50 +00:00
Amit Kapila	96b3784973	Add "ALL SEQUENCES" support to publications. This patch adds support for the ALL SEQUENCES clause in publications, enabling synchronization/replication of all sequences that is useful for upgrades. Publications can now include all sequences via FOR ALL SEQUENCES. psql enhancements: \d shows publications for a given sequence. \dRp indicates if a publication includes all sequences. ALL SEQUENCES can be combined with ALL TABLES, but not with other options like TABLE or TABLES IN SCHEMA. We can extend support for more granular clauses in future. The view pg_publication_sequences provides information about the mapping between publications and sequences. This patch enables publishing of sequences; subscriber-side support will be added in upcoming patches. Author: vignesh C <vignesh21@gmail.com> Author: Tomas Vondra <tomas@vondra.me> Reviewed-by: shveta malik <shveta.malik@gmail.com> Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com> Reviewed-by: Peter Smith <smithpb2250@gmail.com> Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Nisha Moond <nisha.moond412@gmail.com> Reviewed-by: Shlok Kyal <shlok.kyal.oss@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://postgr.es/m/CAA4eK1LC+KJiAkSrpE_NwvNdidw9F2os7GERUeSxSKv71gXysQ@mail.gmail.com	2025-10-09 03:48:54 +00:00
Masahiko Sawada	d3b6183dd9	Add mem_exceeded_count column to pg_stat_replication_slots. This commit introduces a new column mem_exceeded_count to the pg_stat_replication_slots view. This counter tracks how often the memory used by logical decoding exceeds the logical_decoding_work_mem limit. The new statistic helps users determine whether exceeding the logical_decoding_work_mem limit is a rare occurrences or a frequent issue, information that wasn't available through existing statistics. Bumps catversion. Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: shveta malik <shveta.malik@gmail.com> Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/978D21E8-9D3B-40EA-A4B1-F87BABE7868C@yesql.se	2025-10-08 10:05:04 -07:00
Richard Guo	185e304263	Allow negative aggtransspace to indicate unbounded state size This patch reuses the existing aggtransspace in pg_aggregate to signal that an aggregate's transition state can grow unboundedly. If aggtransspace is set to a negative value, it now indicates that the transition state may consume unpredictable or large amounts of memory, such as in aggregates like array_agg or string_agg that accumulate input rows. This information can be used by the planner to avoid applying memory-sensitive optimizations (e.g., eager aggregation) when there is a risk of excessive memory usage during partial aggregation. Bump catalog version. Per idea from Robert Haas, though applied differently than originally suggested. Discussion: https://postgr.es/m/CA+TgmoYbkvYwLa+1vOP7RDY7kO2=A7rppoPusoRXe44VDOGBPg@mail.gmail.com	2025-10-08 17:01:48 +09:00
Michael Paquier	b71bae41a0	Add stats_reset to pg_stat_user_functions It is possible to call pg_stat_reset_single_function_counters() for a single function, but the reset time was missing the system view showing its statistics. Like all the fields of pg_stat_user_functions, the GUC track_functions needs to be enabled to show the statistics about function executions. Bump catalog version. Bump PGSTAT_FILE_FORMAT_ID, as a result of the new field added to PgStat_StatFuncEntry. Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://postgr.es/m/aONjnsaJSx-nEdfU@paquier.xyz	2025-10-08 12:43:40 +09:00
Amit Kapila	b93172ca59	Expose sequence page LSN via pg_get_sequence_data. This patch enhances the pg_get_sequence_data function to include the page-level LSN (Log Sequence Number) of the sequence. This additional metadata will be used by upcoming patches to support synchronization of sequences during logical replication. By exposing the LSN, we enable more accurate tracking of sequence changes, which is essential for maintaining consistency across replicated nodes. Author: vignesh C <vignesh21@gmail.com> Reviewed-by: shveta malik <shveta.malik@gmail.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://www.postgresql.org/message-id/CAA4eK1LC+KJiAkSrpE_NwvNdidw9F2os7GERUeSxSKv71gXysQ@mail.gmail.com	2025-10-06 08:30:16 +00:00
Michael Paquier	a5b543258a	Add stats_reset to pg_stat_all_{tables,indexes} and related views It is possible to call pg_stat_reset_single_table_counters() on a relation (index or table) but the reset time was missing from the system views showing their statistics. This commit adds the reset time as an attribute of pg_stat_all_tables, pg_stat_all_indexes, and other relations related to them. Bump catalog version. Bump PGSTAT_FILE_FORMAT_ID, as a result of the new field added to PgStat_StatTabEntry. Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Reviewed-by: Sami Imseih <samimseih@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/aN8l182jKxEq1h9f@paquier.xyz	2025-10-06 15:31:21 +09:00
Tom Lane	ef38a4d975	Add GROUP BY ALL. GROUP BY ALL is a form of GROUP BY that adds any TargetExpr that does not contain an aggregate or window function into the groupClause of the query, making it exactly equivalent to specifying those same expressions in an explicit GROUP BY list. This feature is useful for certain kinds of data exploration. It's already present in some other DBMSes, and the SQL committee recently accepted it into the standard, so we can be reasonably confident in the syntax being stable. We do have to invent part of the semantics, as the standard doesn't allow for expressions in GROUP BY, so they haven't specified what to do with window functions. We assume that those should be treated like aggregates, i.e., left out of the constructed GROUP BY list. In passing, wordsmith some existing documentation about GROUP BY, and update some neglected synopsis entries in select_into.sgml. Author: David Christensen <david@pgguru.net> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CAHM0NXjz0kDwtzoe-fnHAqPB1qA8_VJN0XAmCgUZ+iPnvP5LbA@mail.gmail.com	2025-09-29 16:55:17 -04:00
Tom Lane	261f89a976	Track the maximum possible frequency of non-MCE array elements. The lossy-counting algorithm that ANALYZE uses to identify most-common array elements has a notion of cutoff frequency: elements with frequency greater than that are guaranteed to be collected, elements with smaller frequencies are not. In cases where we find fewer MCEs than the stats target would permit us to store, the cutoff frequency provides valuable additional information, to wit that there are no non-MCEs with frequency greater than that. What the selectivity estimation functions actually use the "minfreq" entry for is as a ceiling on the possible frequency of non-MCEs, so using the cutoff rather than the lowest stored MCE frequency provides a tighter bound and more accurate estimates. Therefore, instead of redundantly storing the minimum observed MCE frequency, store the cutoff frequency when there are fewer tracked values than we want. (When there are more, then of course we cannot assert that no non-stored elements are above the cutoff frequency, since we're throwing away some that are; so we still use the minimum stored frequency in that case.) Notably, this works even when none of the values are common enough to be called MCEs. In such cases we previously stored nothing in the STATISTIC_KIND_MCELEM pg_statistic slot, which resulted in the selectivity functions falling back to default estimates. So in that case we want to construct a STATISTIC_KIND_MCELEM entry that contains no "values" but does have "numbers", to wit the three extra numbers that the MCELEM entry type defines. A small obstacle is that update_attstats() has traditionally stored a null, not an empty array, when passed zero "values" for a slot. That gives rise to an MCELEM entry that get_attstatsslot() will spit up on. The least risky solution seems to be to adjust update_attstats() so that it will emit a non-null (but possibly empty) array when the passed stavalues array pointer isn't NULL, rather than conditioning that on numvalues > 0. In other existing cases I don't believe that that changes anything. For consistency, handle the stanumbers array the same way. In passing, improve the comments in routines that use STATISTIC_KIND_MCELEM data. Particularly, explain why we use minfreq / 2 not minfreq as the estimate for non-MCE values. Thanks to Matt Long for the suggestion that we could apply this idea even when there are more than zero MCEs. Reported-by: Mark Frost <FROSTMAR@uk.ibm.com> Reported-by: Matt Long <matt@mattlong.org> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/PH3PPF1C905D6E6F24A5C1A1A1D8345B593E16FA@PH3PPF1C905D6E6.namprd15.prod.outlook.com	2025-09-20 14:48:16 -04:00
Amit Kapila	5b148706c5	Add optional pid parameter to pg_replication_origin_session_setup(). Commit `216a784829` introduced parallel apply workers, allowing multiple processes to share a replication origin. To support this, replorigin_session_setup() was extended to accept a pid argument identifying the process using the origin. This commit exposes that capability through the SQL interface function pg_replication_origin_session_setup() by adding an optional pid parameter. This enables multiple processes to coordinate replication using the same origin when using SQL-level replication functions. This change allows the non-builtin logical replication solutions to implement parallel apply for large transactions. Additionally, an existing internal error was made user-facing, as it can now be triggered via the exposed SQL API. Author: Doruk Yilmaz <doruk@mixrank.com> Author: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: Euler Taveira <euler@eulerto.com> Discussion: https://postgr.es/m/CAMPB6wfe4zLjJL8jiZV5kjjpwBM2=rTRme0UCL7Ra4L8MTVdOg@mail.gmail.com Discussion: https://postgr.es/m/CAE2gYzyTSNvHY1+iWUwykaLETSuAZsCWyryokjP6rG46ZvRgQA@mail.gmail.com	2025-09-19 05:38:40 +00:00
Tom Lane	83a5641945	Provide more-specific error details/hints for function lookup failures. Up to now we've contented ourselves with a one-size-fits-all error hint when we fail to find any match to a function or procedure call. That was mostly okay in the beginning, but it was never great, and since the introduction of named arguments it's really not adequate. We at least ought to distinguish "function name doesn't exist" from "function name exists, but not with those argument names". And the rules for named-argument matching are arcane enough that some more detail seems warranted if we match the argument names but the call still doesn't work. This patch creates a framework for dealing with these problems: FuncnameGetCandidates and related code will now pass back a bitmask of flags showing how far the match succeeded. This allows a considerable amount of granularity in the reports. The set-bits-in-a-bitmask approach means that when there are multiple candidate functions, the report will reflect the match(es) that got the furthest, which seems correct. Also, we can avoid mentioning "maybe add casts" unless failure to match argument types is actually the issue. Extend the same return-a-bitmask approach to OpernameGetCandidates. The issues around argument names don't apply to operator syntax, but it still seems worth distinguishing between "there is no operator of that name" and "we couldn't match the argument types". While at it, adjust these messages and related ones to more strictly separate "detail" from "hint", following our message style guidelines' distinction between those. Reported-by: Dominique Devienne <ddevienne@gmail.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Discussion: https://postgr.es/m/1756041.1754616558@sss.pgh.pa.us	2025-09-16 12:17:02 -04:00
Dean Rasheed	faf071b553	Add date and timestamp variants of random(min, max). This adds 3 new variants of the random() function: random(min date, max date) returns date random(min timestamp, max timestamp) returns timestamp random(min timestamptz, max timestamptz) returns timestamptz Each returns a random value x in the range min <= x <= max. Author: Damien Clochard <damien@dalibo.info> Reviewed-by: Greg Sabino Mullane <htamfids@gmail.com> Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com> Reviewed-by: Vik Fearing <vik@postgresfriends.org> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/f524d8cab5914613d9e624d9ce177d3d@dalibo.info	2025-09-09 10:39:30 +01:00
Amit Kapila	a850be2fe6	Add max_retention_duration option to subscriptions. This commit introduces a new subscription parameter, max_retention_duration, aimed at mitigating excessive accumulation of dead tuples when retain_dead_tuples is enabled and the apply worker lags behind the publisher. When the time spent advancing a non-removable transaction ID exceeds the max_retention_duration threshold, the apply worker will stop retaining conflict detection information. In such cases, the conflict slot's xmin will be set to InvalidTransactionId, provided that all apply workers associated with the subscription (with retain_dead_tuples enabled) confirm the retention duration has been exceeded. To ensure retention status persists across server restarts, a new column subretentionactive has been added to the pg_subscription catalog. This prevents unnecessary reactivation of retention logic after a restart. The conflict detection slot will not be automatically re-initialized unless a new subscription is created with retain_dead_tuples = true, or the user manually re-enables retain_dead_tuples. A future patch will introduce support for automatic slot re-initialization once at least one apply worker confirms that the retention duration is within the configured max_retention_duration. Author: Zhijie Hou <houzj.fnst@fujitsu.com> Reviewed-by: shveta malik <shveta.malik@gmail.com> Reviewed-by: Nisha Moond <nisha.moond412@gmail.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://postgr.es/m/OS0PR01MB5716BE80DAEB0EE2A6A5D1F5949D2@OS0PR01MB5716.jpnprd01.prod.outlook.com	2025-09-02 03:20:18 +00:00
Álvaro Herrera	325fc0ab14	Avoid including commands/dbcommands.h in so many places This has been done historically because of get_database_name (which since commit `cb98e6fb8f` belongs in lsyscache.c/h, so let's move it there) and get_database_oid (which is in the right place, but whose declaration should appear in pg_database.h rather than dbcommands.h). Clean this up. Also, xlogreader.h and stringinfo.h are no longer needed by dbcommands.h since commit `f1fd515b39`, so remove them. Author: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/202508191031.5ipojyuaswzt@alvherre.pgsql	2025-08-28 12:39:04 +02:00
Peter Eisentraut	16d434d53d	Add src/include/catalog/README This just includes a link to the bki documentation, to help people get started. Before commit `372728b0d4`, there was a README at src/backend/catalog/README, but then this was moved to the SGML documentation. So this effectively puts back a link to what was moved. But src/include/catalog/ is probably a better location, because that's where all the interesting files are. Co-authored-by: Florents Tselai <florents.tselai@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/CA+v5N400GJFJ9RyXAX7hFKbtF7vVQGvWdFWEfcSQmvVhi9xfrA@mail.gmail.com	2025-08-19 08:41:42 +02:00
Tom Lane	ee54046601	Grab the low-hanging fruit from forcing USE_FLOAT8_BYVAL to true. Remove conditionally-compiled code for the other case. Replace uses of FLOAT8PASSBYVAL with constant "true", mainly because it was quite confusing in cases where the type we were dealing with wasn't float8. I left the associated pg_control and Pg_magic_struct fields in place. Perhaps we should get rid of them, but it would save little, so it doesn't seem worth thinking hard about the compatibility implications. I just labeled them "vestigial" in places where that seemed helpful. Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://postgr.es/m/1749799.1752797397@sss.pgh.pa.us	2025-08-13 17:18:22 -04:00
Tom Lane	2a600a93c7	Make type Datum be 8 bytes wide everywhere. This patch makes sizeof(Datum) be 8 on all platforms including 32-bit ones. The objective is to allow USE_FLOAT8_BYVAL to be true everywhere, and in consequence to remove a lot of code that is specific to pass-by-reference handling of float8, int8, etc. The code for abbreviated sort keys can be simplified similarly. In this way we can reduce the maintenance effort involved in supporting 32-bit platforms, without going so far as to actually desupport them. Since Datum is strictly an in-memory concept, this has no impact on on-disk storage, though an initdb or pg_upgrade will be needed to fix affected catalog entries. We have required platforms to support [u]int64 for ages, so this breaks no supported platform. We can expect that this change will make 32-bit builds a bit slower and more memory-hungry, although being able to use pass-by-value handling of 8-byte types may buy back some of that. But we stopped optimizing for 32-bit cases a long time ago, and this seems like just another step on that path. This initial patch simply forces the correct type definition and USE_FLOAT8_BYVAL setting, and cleans up a couple of minor compiler complaints that ensued. This is sufficient for testing purposes. In the wake of a bunch of Datum-conversion cleanups by Peter Eisentraut, this now compiles cleanly with gcc on a 32-bit platform. (I'd only tested the previous version with clang, which it turns out is less picky than gcc about width-changing coercions.) There is a good deal of now-dead code that I'll remove in separate follow-up patches. A catversion bump is required because this affects initial catalog contents (on 32-bit machines) in two ways: pg_type.typbyval changes for some built-in types, and Const nodes in stored views/rules will now have 8 bytes not 4 for pass-by-value types. Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://postgr.es/m/1749799.1752797397@sss.pgh.pa.us	2025-08-13 17:18:22 -04:00
Masahiko Sawada	deb674454c	Add backup_type column to pg_stat_progress_basebackup. This commit introduces a new column backup_type that indicates the type of backup being performed: either 'full' or 'incremental'. Bump catalog version. Author: Shinya Kato <shinya11.kato@gmail.com> Reviewed-by: Yugo Nagata <nagata@sraoss.co.jp> Discussion: https://postgr.es/m/CAOzEurQuzbHwTj1ehk1a+eeQDidJPyrE5s6mYumkjwjZnurhkQ@mail.gmail.com	2025-08-05 10:50:45 -07:00
Amit Kapila	fd5a1a0c3e	Detect and report update_deleted conflicts. This enhancement builds upon the infrastructure introduced in commit `228c370868`, which enables the preservation of deleted tuples and their origin information on the subscriber. This capability is crucial for handling concurrent transactions replicated from remote nodes. The update introduces support for detecting update_deleted conflicts during the application of update operations on the subscriber. When an update operation fails to locate the target row-typically because it has been concurrently deleted-we perform an additional table scan. This scan uses the SnapshotAny mechanism and we do this additional scan only when the retain_dead_tuples option is enabled for the relevant subscription. The goal of this scan is to locate the most recently deleted tuple-matching the old column values from the remote update-that has not yet been removed by VACUUM and is still visible according to our slot (i.e., its deletion is not older than conflict-detection-slot's xmin). If such a tuple is found, the system reports an update_deleted conflict, including the origin and transaction details responsible for the deletion. This provides a groundwork for more robust and accurate conflict resolution process, preventing unexpected behavior by correctly identifying cases where a remote update clashes with a deletion from another origin. Author: Zhijie Hou <houzj.fnst@fujitsu.com> Reviewed-by: shveta malik <shveta.malik@gmail.com> Reviewed-by: Nisha Moond <nisha.moond412@gmail.com> Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com> Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://postgr.es/m/OS0PR01MB5716BE80DAEB0EE2A6A5D1F5949D2@OS0PR01MB5716.jpnprd01.prod.outlook.com	2025-08-04 04:02:47 +00:00
Amit Kapila	2ab2d6f970	Fix a deadlock during ALTER SUBSCRIPTION ... DROP PUBLICATION. A deadlock can occur when the DDL command and the apply worker acquire catalog locks in different orders while dropping replication origins. The issue is rare in PG16 and higher branches because, in most cases, the tablesync worker performs the origin drop in those branches, and its locking sequence does not conflict with DDL operations. This patch ensures consistent lock acquisition to prevent such deadlocks. As per buildfarm. Reported-by: Alexander Lakhin <exclusion@gmail.com> Author: Ajin Cherian <itsajin@gmail.com> Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: vignesh C <vignesh21@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Backpatch-through: 14, where it was introduced Discussion: https://postgr.es/m/bab95e12-6cc5-4ebb-80a8-3e41956aa297@gmail.com	2025-08-01 07:58:48 +00:00
Amit Kapila	228c370868	Preserve conflict-relevant data during logical replication. Logical replication requires reliable conflict detection to maintain data consistency across nodes. To achieve this, we must prevent premature removal of tuples deleted by other origins and their associated commit_ts data by VACUUM, which could otherwise lead to incorrect conflict reporting and resolution. This patch introduces a mechanism to retain deleted tuples on the subscriber during the application of concurrent transactions from remote nodes. Retaining these tuples allows us to correctly ignore concurrent updates to the same tuple. Without this, an UPDATE might be misinterpreted as an INSERT during resolutions due to the absence of the original tuple. Additionally, we ensure that origin metadata is not prematurely removed by vacuum freeze, which is essential for detecting update_origin_differs and delete_origin_differs conflicts. To support this, a new replication slot named pg_conflict_detection is created and maintained by the launcher on the subscriber. Each apply worker tracks its own non-removable transaction ID, which the launcher aggregates to determine the appropriate xmin for the slot, thereby retaining necessary tuples. Conflict information retention (deleted tuples and commit_ts) can be enabled per subscription via the retain_conflict_info option. This is disabled by default to avoid unnecessary overhead for configurations that do not require conflict resolution or logging. During upgrades, if any subscription on the old cluster has retain_conflict_info enabled, a conflict detection slot will be created to protect relevant tuples from deletion when the new cluster starts. This is a foundational work to correctly detect update_deleted conflict which will be done in a follow-up patch. Author: Zhijie Hou <houzj.fnst@fujitsu.com> Reviewed-by: shveta malik <shveta.malik@gmail.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com> Reviewed-by: Nisha Moond <nisha.moond412@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://postgr.es/m/OS0PR01MB5716BE80DAEB0EE2A6A5D1F5949D2@OS0PR01MB5716.jpnprd01.prod.outlook.com	2025-07-23 02:56:00 +00:00
Nathan Bossart	167ed8082f	Introduce pg_dsm_registry_allocations view. This commit adds a new system view that provides information about entries in the dynamic shared memory (DSM) registry. Specifically, it returns the name, type, and size of each entry. Note that since we cannot discover the size of dynamic shared memory areas (DSAs) and hash tables backed by DSAs (dshashes) without first attaching to them, the size column is left as NULL for those. Bumps catversion. Author: Florents Tselai <florents.tselai@gmail.com> Reviewed-by: Sungwoo Chang <swchangdev@gmail.com> Discussion: https://postgr.es/m/4D445D3E-81C5-4135-95BB-D414204A0AB4%40gmail.com	2025-07-09 09:17:56 -05:00
Amit Kapila	b5cd0ecd4d	Fix typo in pg_publication.h. Author: shveta malik <shveta.malik@gmail.com> Discussion: https://postgr.es/m/CAJpy0uAyFN9o7vU_ZkZFv5-6ysXDNKNx_fC0gwLLKg=8==E3ow@mail.gmail.com	2025-07-01 15:17:03 +05:30
Nathan Bossart	bd09f024a1	Add new OID alias type regdatabase. This provides a convenient way to look up a database's OID. For example, the query SELECT * FROM pg_shdepend WHERE dbid = (SELECT oid FROM pg_database WHERE datname = current_database()); can now be simplified to SELECT * FROM pg_shdepend WHERE dbid = current_database()::regdatabase; Like the regrole type, regdatabase has cluster-wide scope, so we disallow regdatabase constants from appearing in stored expressions. Bumps catversion. Author: Ian Lawrence Barwick <barwick@gmail.com> Reviewed-by: Greg Sabino Mullane <htamfids@gmail.com> Reviewed-by: Jian He <jian.universality@gmail.com> Reviewed-by: Fabrízio de Royes Mello <fabriziomello@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/aBpjJhyHpM2LYcG0%40nathan	2025-06-30 15:38:54 -05:00
Joe Conway	9c5b9a280c	Do pre-release housekeeping on catalog data. Run renumber_oids.pl to move high-numbered OIDs down, as per pre-beta tasks specified by RELEASE_CHANGES. For reference, the command was ./renumber_oids.pl --first-mapped-oid 8000 --target-oid 6300 This should have been done prior to beta1, but it was forgotten. This will ensure we get the correct numbering for beta2 onward.	2025-06-29 21:43:39 -04:00
Joe Conway	0ebd242555	Run pgperltidy This is required before the creation of a new branch. pgindent is clean, as well as is reformat-dat-files. perltidy version is v20230309, as documented in pgindent's README.	2025-06-29 21:14:21 -04:00
Peter Eisentraut	0cd69b3d7e	Restrict virtual columns to use built-in functions and types Just like selecting from a view is exploitable (CVE-2024-7348), selecting from a table with virtual generated columns is exploitable. Users who are concerned about this can avoid selecting from views, but telling them to avoid selecting from tables is less practical. To address this, this changes it so that generation expressions for virtual generated columns are restricted to using built-in functions and types, and the columns are restricted to having a built-in type. We assume that built-in functions and types cannot be exploited for this purpose. In the future, this could be expanded by some new mechanism to declare other functions and types as safe or trusted for this purpose, but that is to be designed. (An alternative approach might have been to expand the restrict_nonsystem_relation_kind GUC to handle this, like the fix for CVE-2024-7348. But that is kind of an ugly approach. That fix had to fit in the constraints of fixing an ancient vulnerability in all branches. Since virtual generated columns are new, we're free from the constraints of the past, and we can and should use cleaner options.) Reported-by: Feike Steenbergen <feikesteenbergen@gmail.com> Reviewed-by: jian he <jian.universality@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/CAK_s-G2Q7de8Q0qOYUR%3D_CTB5FzzVBm5iZjOp%2BmeVWpMpmfO0w%40mail.gmail.com	2025-06-25 09:56:49 +02:00
Álvaro Herrera	0f65f3eec4	Fix squashing algorithm for query texts The algorithm to squash lists of constants added by commit `62d712ecfd` was a bit too simplistic; we wanted to avoid adding unnecessary complexity, but cases like direct function calls of typecasting functions (and others) were missed, and bogus SQL syntax was being shown in pg_stat_statements normalized query text field. To fix normalization for those cases, we need the parser to transmit information about were each list of constant values starts and ends, so add that to a couple of nodes. Also add a few more test cases to make sure we're doing the right thing. The patch initially submitted by Sami added a new private struct in gram.y to carry the start/end information for A_Expr, but I (Álvaro) decided that a better fix was to remove the parser indirection via the in_expr production, and instead create separate components in the a_expr rule. I'm surprised that this works and doesn't require more changes, but I assume (without checking) that the grammar used to be more complex and got simplified at some point. Bump catversion. Author: Sami Imseih <samimseih@gmail.com> Author: Dmitry Dolgov <9erthalion6@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/CAA5RZ0tRXoPG2y6bMgBCWNDt0Tn=unRerbzYM=oW0syi1=C1OA@mail.gmail.com	2025-06-12 14:21:21 +02:00
Peter Eisentraut	32edf732e8	Rename gist stratnum support function Commit `7406ab623f` added a gist support function that we internally refer to by the symbol GIST_STRATNUM_PROC. This translated from "well-known" strategy numbers to opfamily-specific strategy numbers. However, we later (commit `630f9a43ce`) changed this to fit into index-AM-level compare type mapping, so this function actually now maps from compare type to opfamily-specific strategy numbers. So this name is no longer fitting. Moreover, the index AM level also supports the opposite, a function to map from strategy number to compare type. This is currently not supported in gist, but one might wonder what this function is supposed to be called when it is added. This patch changes the naming of the gist-level functionality to be more in line with the index-AM-level functionality. This makes sense because these are essentially the same thing on different levels. This also changes the names of the externally visible functions that are provided for use as such a support function. Reviewed-by: Paul A Jungwirth <pj@illuminatedcomputing.com> Discussion: https://www.postgresql.org/message-id/37ebb1d9-9036-485f-a215-e55435689917%40eisentraut.org	2025-06-02 08:41:27 +02:00
Daniel Gustafsson	fb844b9f06	Revert function to get memory context stats for processes Due to concerns raised about the approach, and memory leaks found in sensitive contexts the functionality is reverted. This reverts commits `45e7e8ca9`, `f8c115a6c`, `d2a1ed172`, `55ef7abf8` and `042a66291` for v18 with an intent to revisit this patch for v19. Discussion: https://postgr.es/m/594293.1747708165@sss.pgh.pa.us	2025-05-23 15:44:54 +02:00
Nathan Bossart	16bf24e0e4	Remove pg_replication_origin's TOAST table. A few places that access this catalog don't set up an active snapshot before potentially accessing its TOAST table. However, roname (the replication origin name) is the only varlena column, so this is only a problem if the name requires out-of-line storage. This commit removes its TOAST table to avoid needing to set up a snapshot. It also places a limit on replication origin names so that attempts to set long names will fail with a more user-friendly error. Those chosen limit of 512 bytes should be sufficient to avoid "row is too big" errors independent of BLCKSZ, but it should also be lenient enough for all reasonable use-cases. Bumps catversion. Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: Euler Taveira <euler@eulerto.com> Reviewed-by: Nisha Moond <nisha.moond412@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/ZvMSUPOqUU-VNADN%40nathan	2025-05-07 14:47:36 -05:00
Noah Misch	f4ece891fc	Assert lack of hazardous buffer locks before possible catalog read. Commit `0bada39c83` fixed a bug of this kind, which existed in all branches for six days before detection. While the probability of reaching the trouble was low, the disruption was extreme. No new backends could start, and service restoration needed an immediate shutdown. Hence, add this to catch the next bug like it. The new check in RelationIdGetRelation() suffices to make autovacuum detect the bug in commit `243e9b40f1` that led to commit `0bada39`. This also checks in a number of similar places. It replaces each Assert(IsTransactionState()) that pertained to a conditional catalog read. No back-patch for now, but a back-patch of commit `243e9b4` should back-patch this, too. A back-patch could omit the src/test/regress changes, since back branches won't gain new index columns. Reported-by: Alexander Lakhin <exclusion@gmail.com> Discussion: https://postgr.es/m/20250410191830.0e.nmisch@google.com Discussion: https://postgr.es/m/10ec0bc3-5933-1189-6bb8-5dec4114558e@gmail.com	2025-04-17 05:00:30 -07:00
Daniel Gustafsson	ef366b7d7e	Perform missed catversion bump Commit `c57971034e` renamed an argument for a function but missed to bump the catversion to reflect this. Reported-by: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/CAApHDvqOega=dPtu3h2C5fJWJEuaGCMDib_sVfhKQqgUNJVmFA@mail.gmail.com	2025-04-09 09:29:12 +02:00
Daniel Gustafsson	c57971034e	Rename argument in pg_get_process_memory_contexts(). During development the third argument to pg_get_process_memory_contexts was a retry count, but it was changed to a timeout instead. The param name was accidentally left in pg_proc.dat though. Fix by renaming to the correct parameter name. Author: Fujii Masao <masao.fujii@oss.nttdata.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Discussion: https://postgr.es/m/3eb40b3e-45c7-426a-b7f8-81f7d05a9b53@oss.nttdata.com	2025-04-08 23:09:13 +02:00
Daniel Gustafsson	042a66291b	Add function to get memory context stats for processes This adds a function for retrieving memory context statistics and information from backends as well as auxiliary processes. The intended usecase is cluster debugging when under memory pressure or unanticipated memory usage characteristics. When calling the function it sends a signal to the specified process to submit statistics regarding its memory contexts into dynamic shared memory. Each memory context is returned in detail, followed by a cumulative total in case the number of contexts exceed the max allocated amount of shared memory. Each process is limited to use at most 1Mb memory for this. A summary can also be explicitly requested by the user, this will return the TopMemoryContext and a cumulative total of all lower contexts. In order to not block on busy processes the caller specifies the number of seconds during which to retry before timing out. In the case where no statistics are published within the set timeout, the last known statistics are returned, or NULL if no previously published statistics exist. This allows dash- board type queries to continually publish even if the target process is temporarily congested. Context records contain a timestamp to indicate when they were submitted. Author: Rahila Syed <rahilasyed90@gmail.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Tomas Vondra <tomas@vondra.me> Reviewed-by: Atsushi Torikoshi <torikoshia@oss.nttdata.com> Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com> Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com> Discussion: https://postgr.es/m/CAH2L28v8mc9HDt8QoSJ8TRmKau_8FM_HKS41NeO9-6ZAkuZKXw@mail.gmail.com	2025-04-08 11:06:56 +02:00
Tomas Vondra	8cc139bec3	Introduce pg_shmem_allocations_numa view Introduce new pg_shmem_alloctions_numa view with information about how shared memory is distributed across NUMA nodes. For each shared memory segment, the view returns one row for each NUMA node backing it, with the total amount of memory allocated from that node. The view may be relatively expensive, especially when executed for the first time in a backend, as it has to touch all memory pages to get reliable information about the NUMA node. This may also force allocation of the shared memory. Unlike pg_shmem_allocations, the view does not show anonymous shared memory allocations. It also does not show memory allocated using the dynamic shared memory infrastructure. Author: Jakub Wartak <jakub.wartak@enterprisedb.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Reviewed-by: Tomas Vondra <tomas@vondra.me> Discussion: https://postgr.es/m/CAKZiRmxh6KWo0aqRqvmcoaX2jUxZYb4kGp3N%3Dq1w%2BDiH-696Xw%40mail.gmail.com	2025-04-07 23:08:17 +02:00
Tomas Vondra	65c298f61f	Add support for basic NUMA awareness Add basic NUMA awareness routines, using a minimal src/port/pg_numa.c portability wrapper and an optional build dependency, enabled by --with-libnuma configure option. For now this is Linux-only, other platforms may be supported later. A built-in SQL function pg_numa_available() allows checking NUMA support, i.e. that the server was built/linked with the NUMA library. The main function introduced is pg_numa_query_pages(), which allows determining the NUMA node for individual memory pages. Internally the function uses move_pages(2) syscall, as it allows batching, and is more efficient than get_mempolicy(2). Author: Jakub Wartak <jakub.wartak@enterprisedb.com> Co-authored-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org> Reviewed-by: Tomas Vondra <tomas@vondra.me> Discussion: https://postgr.es/m/CAKZiRmxh6KWo0aqRqvmcoaX2jUxZYb4kGp3N%3Dq1w%2BDiH-696Xw%40mail.gmail.com	2025-04-07 23:08:17 +02:00
Tom Lane	b73e6d71a8	Fix erroneous construction of functions' dependencies on transforms. The list of transform objects that a function should use is specified in CREATE FUNCTION's TRANSFORM clause, and then represented indirectly in pg_proc.protrftypes. However, ProcedureCreate completely ignored that for purposes of constructing pg_depend entries, and instead made the function depend on any transforms that exist for its parameter or return data types. This is bad in both directions: the function could be made dependent on a transform it does not actually use, or it could try to use a transform that's since been dropped. (The latter scenario would require use of a transform that's not for any of the parameter or return types, but that seems legit for cases where the function performs SQL operations internally.) To fix, pass in the list of transform objects that CreateFunction identified, and build pg_depend entries from that not from the parameter/return types. This results in changes in the expected test outputs in contrib/bool_plperl, which I guess are due to different ordering of pg_depend entries -- that test case is surely not exercising either of the problem scenarios. This fix is not back-patchable as-is: changing the signature of ProcedureCreate seems too risky in stable branches. We could do something like making ProcedureCreate a wrapper around ProcedureCreateExt or so. However, I'm more inclined to do nothing in the back branches. We had no field complaints up to now, so the hazards don't seem to be a big issue in practice. And we couldn't do anything about existing pg_depend entries, so a back-patched fix would result in a mishmash of dependencies created according to different rules. That cure could be worse than the disease, perhaps. I bumped catversion just to lay down a marker that the expected contents of pg_depend are a bit different than before. Reported-by: Chapman Flack <jcflack@acm.org> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/3112950.1743984111@sss.pgh.pa.us	2025-04-07 13:31:37 -04:00
Álvaro Herrera	a379061a22	Allow NOT NULL constraints to be added as NOT VALID This allows them to be added without scanning the table, and validating them afterwards without holding access exclusive lock on the table after any violating rows have been deleted or fixed. Doing ALTER TABLE ... SET NOT NULL for a column that has an invalid not-null constraint validates that constraint. ALTER TABLE .. VALIDATE CONSTRAINT is also supported. There are various checks on whether an invalid constraint is allowed in a child table when the parent table has a valid constraint; this should match what we do for enforced/not enforced constraints. pg_attribute.attnotnull is now only an indicator for whether a not-null constraint exists for the column; whether it's valid or invalid must be queried in pg_constraint. Applications can continue to query pg_attribute.attnotnull as before, but now it's possible that NULL rows are present in the column even when that's set to true. For backend internal purposes, we cache the nullability status in CompactAttribute->attnullability that each tuple descriptor carries (replacing CompactAttribute.attnotnull, which was a mirror of Form_pg_attribute.attnotnull). During the initial tuple descriptor creation, based on the pg_attribute scan, we set this to UNRESTRICTED if pg_attribute.attnotnull is false, or to UNKNOWN if it's true; then we update the latter to VALID or INVALID depending on the pg_constraint scan. This flag is also copied when tupledescs are copied. Comparing tuple descs for equality must also compare the CompactAttribute.attnullability flag and return false in case of a mismatch. pg_dump deals with these constraints by storing the OIDs of invalid not-null constraints in a separate array, and running a query to obtain their properties. The regular table creation SQL omits them entirely. They are then dealt with in the same way as "separate" CHECK constraints, and dumped after the data has been loaded. Because no additional pg_dump infrastructure was required, we don't bump its version number. I decided not to bump catversion either, because the old catalog state works perfectly in the new world. (Trying to run with new catalog state and the old server version would likely run into issues, however.) System catalogs do not support invalid not-null constraints (because commit `14e87ffa5c` didn't allow them to have pg_constraint rows anyway.) Author: Rushabh Lathia <rushabh.lathia@gmail.com> Author: Jian He <jian.universality@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org> Tested-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Discussion: https://postgr.es/m/CAGPqQf0KitkNack4F5CFkFi-9Dqvp29Ro=EpcWt=4_hs-Rt+bQ@mail.gmail.com	2025-04-07 19:19:50 +02:00
Peter Geoghegan	92fe23d93a	Add nbtree skip scan optimization. Teach nbtree multi-column index scans to opportunistically skip over irrelevant sections of the index given a query with no "=" conditions on one or more prefix index columns. When nbtree is passed input scan keys derived from a predicate "WHERE b = 5", new nbtree preprocessing steps output "WHERE a = ANY(<every possible 'a' value>) AND b = 5" scan keys. That is, preprocessing generates a "skip array" (and an output scan key) for the omitted prefix column "a", which makes it safe to mark the scan key on "b" as required to continue the scan. The scan is therefore able to repeatedly reposition itself by applying both the "a" and "b" keys. A skip array has "elements" that are generated procedurally and on demand, but otherwise works just like a regular ScalarArrayOp array. Preprocessing can freely add a skip array before or after any input ScalarArrayOp arrays. Index scans with a skip array decide when and where to reposition the scan using the same approach as any other scan with array keys. This design builds on the design for array advancement and primitive scan scheduling added to Postgres 17 by commit `5bf748b8`. Testing has shown that skip scans of an index with a low cardinality skipped prefix column can be multiple orders of magnitude faster than an equivalent full index scan (or sequential scan). In general, the cardinality of the scan's skipped column(s) limits the number of leaf pages that can be skipped over. The core B-Tree operator classes on most discrete types generate their array elements with the help of their own custom skip support routine. This infrastructure gives nbtree a way to generate the next required array element by incrementing (or decrementing) the current array value. It can reduce the number of index descents in cases where the next possible indexable value frequently turns out to be the next value stored in the index. Opclasses that lack a skip support routine fall back on having nbtree "increment" (or "decrement") a skip array's current element by setting the NEXT (or PRIOR) scan key flag, without directly changing the scan key's sk_argument. These sentinel values behave just like any other value from an array -- though they can never locate equal index tuples (they can only locate the next group of index tuples containing the next set of non-sentinel values that the scan's arrays need to advance to). A skip array's range is constrained by "contradictory" inequality keys. For example, a skip array on "x" will only generate the values 1 and 2 given a qual such as "WHERE x BETWEEN 1 AND 2 AND y = 66". Such a skip array qual usually has near-identical performance characteristics to a comparable SAOP qual "WHERE x = ANY('{1, 2}') AND y = 66". However, improved performance isn't guaranteed. Much depends on physical index characteristics. B-Tree preprocessing is optimistic about skipping working out: it applies static, generic rules when determining where to generate skip arrays, which assumes that the runtime overhead of maintaining skip arrays will pay for itself -- or lead to only a modest performance loss. As things stand, these assumptions are much too optimistic: skip array maintenance will lead to unacceptable regressions with unsympathetic queries (queries whose scan can't skip over many irrelevant leaf pages). An upcoming commit will address the problems in this area by enhancing _bt_readpage's approach to saving cycles on scan key evaluation, making it work in a way that directly considers the needs of = array keys (particularly = skip array keys). Author: Peter Geoghegan <pg@bowt.ie> Reviewed-By: Masahiro Ikeda <masahiro.ikeda@nttdata.com> Reviewed-By: Heikki Linnakangas <heikki.linnakangas@iki.fi> Reviewed-By: Matthias van de Meent <boekewurm+postgres@gmail.com> Reviewed-By: Tomas Vondra <tomas@vondra.me> Reviewed-By: Aleksander Alekseev <aleksander@timescale.com> Reviewed-By: Alena Rybakina <a.rybakina@postgrespro.ru> Discussion: https://postgr.es/m/CAH2-Wzmn1YsLzOGgjAQZdn1STSG_y8qP__vggTaPAYXJP+G4bw@mail.gmail.com	2025-04-04 12:27:04 -04:00
Fujii Masao	0d6c477664	Extend ALTER DEFAULT PRIVILEGES to define default privileges for large objects. Previously, ALTER DEFAULT PRIVILEGES did not support large objects. This meant that to grant privileges to users other than the owner, permissions had to be manually assigned each time a large object was created, which was inconvenient. This commit extends ALTER DEFAULT PRIVILEGES to allow defining default access privileges for large objects. With this change, specified privileges will automatically apply to newly created large objects, making privilege management more efficient. As a side effect, this commit introduces the new keyword OBJECTS since it's used in the syntax of ALTER DEFAULT PRIVILEGES. Original patch by Haruka Takatsuka, with some fixes and tests by Yugo Nagata, and rebased by Laurenz Albe. Author: Takatsuka Haruka <harukat@sraoss.co.jp> Co-authored-by: Yugo Nagata <nagata@sraoss.co.jp> Co-authored-by: Laurenz Albe <laurenz.albe@cybertec.at> Reviewed-by: Masao Fujii <masao.fujii@gmail.com> Discussion: https://postgr.es/m/20240424115242.236b499b2bed5b7a27f7a418@sraoss.co.jp	2025-04-04 19:02:17 +09:00
Amit Kapila	4868c96bc8	Fix slot synchronization for two_phase enabled slots. The issue is that the transactions prepared before two-phase decoding is enabled can fail to replicate to the subscriber after being committed on a promoted standby following a failover. This is because the two_phase_at field of a slot, which tracks the LSN from which two-phase decoding starts, is not synchronized to standby servers. Without two_phase_at, the logical decoding might incorrectly identify prepared transaction as already replicated to the subscriber after promotion of standby server, causing them to be skipped. To address the issue on HEAD, the two_phase_at field of the slot is exposed by the pg_replication_slots view and allows the slot synchronization to copy this value to the corresponding synced slot on the standby server. This bug is likely to occur if the user toggles the two_phase option to true after initial slot creation. Given that altering the two_phase option of a replication slot is not allowed in PostgreSQL 17, this bug is less likely to occur. We can't change the view/function definition in backbranch so we can't push the same fix but we are brainstorming an appropriate solution for PG17. Author: Zhijie Hou <houzj.fnst@fujitsu.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Discussion: https://postgr.es/m/TYAPR01MB5724CC7C288535BBCEEE65DA94A72@TYAPR01MB5724.jpnprd01.prod.outlook.com	2025-04-03 12:26:54 +05:30
Heikki Linnakangas	e9e7b66044	Add GiST and btree sortsupport routines for range types For GiST, having a sortsupport function allows building the index using the "sorted build" method, which is much faster. For b-tree, the sortsupport routine doesn't give any new functionality, but speeds up sorting a tiny bit. The difference is not very significant, about 2% in cursory testing on my laptop, because the range type comparison function has quite a lot of overhead from detoasting. In any case, since we have the function for GiST anyway, we might as well register it for the btree opfamily too. Author: Bernd Helmle <mailings@oopsware.de> Discussion: https://www.postgresql.org/message-id/64d324ce2a6d535d3f0f3baeeea7b25beff82ce4.camel@oopsware.de	2025-04-02 19:51:28 +03:00
Tom Lane	6c12ae09f5	Introduce a SQL-callable function array_sort(anyarray). Create a function that will sort the elements of an array according to the element type's sort order. If the array has more than one dimension, the sub-arrays of the first dimension are sorted per normal array-comparison rules, leaving their contents alone. In support of this, add pg_type.typarray to the set of fields cached by the typcache. Author: Junwang Zhao <zhjwpku@gmail.com> Co-authored-by: Jian He <jian.universality@gmail.com> Reviewed-by: Aleksander Alekseev <aleksander@timescale.com> Discussion: https://postgr.es/m/CAEG8a3J41a4dpw_-F94fF-JPRXYxw-GfsgoGotKcjs9LVfEEvw@mail.gmail.com	2025-04-01 18:03:55 -04:00
Andres Freund	60f566b4f2	aio: Add pg_aios view The new view lists all IO handles that are currently in use and is mainly useful for PG developers, but may also be useful when tuning PG. Bumps catversion. Reviewed-by: Noah Misch <noah@leadboat.com> Discussion: https://postgr.es/m/uvrtrknj4kdytuboidbhwclo4gxhswwcpgadptsjvjqcluzmah%40brqs62irg4dt	2025-04-01 13:30:33 -04:00
Tom Lane	9324c8c580	Introduce PG_MODULE_MAGIC_EXT macro. This macro allows dynamically loaded shared libraries (modules) to provide a wired-in module name and version, and possibly other compile-time-constant fields in future. This information can be retrieved with the new pg_get_loaded_modules() function. This feature is expected to be particularly useful for modules that do not have any exposed SQL functionality and thus are not associated with a SQL-level extension object. But even for modules that do belong to extensions, being able to verify the actual code version can be useful. Author: Andrei Lepikhov <lepihov@gmail.com> Reviewed-by: Yurii Rashkovskii <yrashk@omnigres.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/dd4d1b59-d0fe-49d5-b28f-1e463b68fa32@gmail.com	2025-03-26 11:06:12 -04:00

1 2 3 4 5 ...

2911 Commits