postgres

mirror of https://github.com/postgres/postgres.git synced 2025-11-21 00:42:43 +03:00

Author	SHA1	Message	Date
Peter Eisentraut	83ea6c5402	Virtual generated columns This adds a new variant of generated columns that are computed on read (like a view, unlike the existing stored generated columns, which are computed on write, like a materialized view). The syntax for the column definition is ... GENERATED ALWAYS AS (...) VIRTUAL and VIRTUAL is also optional. VIRTUAL is the default rather than STORED to match various other SQL products. (The SQL standard makes no specification about this, but it also doesn't know about VIRTUAL or STORED.) (Also, virtual views are the default, rather than materialized views.) Virtual generated columns are stored in tuples as null values. (A very early version of this patch had the ambition to not store them at all. But so much stuff breaks or gets confused if you have tuples where a column in the middle is completely missing. This is a compromise, and it still saves space over being forced to use stored generated columns. If we ever find a way to improve this, a bit of pg_upgrade cleverness could allow for upgrades to a newer scheme.) The capabilities and restrictions of virtual generated columns are mostly the same as for stored generated columns. In some cases, this patch keeps virtual generated columns more restricted than they might technically need to be, to keep the two kinds consistent. Some of that could maybe be relaxed later after separate careful considerations. Some functionality that is currently not supported, but could possibly be added as incremental features, some easier than others: - index on or using a virtual column - hence also no unique constraints on virtual columns - extended statistics on virtual columns - foreign-key constraints on virtual columns - not-null constraints on virtual columns (check constraints are supported) - ALTER TABLE / DROP EXPRESSION - virtual column cannot have domain type - virtual columns are not supported in logical replication The tests in generated_virtual.sql have been copied over from generated_stored.sql with the keyword replaced. This way we can make sure the behavior is mostly aligned, and the differences can be visible. Some tests for currently not supported features are currently commented out. Reviewed-by: Jian He <jian.universality@gmail.com> Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com> Tested-by: Shlok Kyal <shlok.kyal.oss@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/a368248e-69e4-40be-9c07-6c3b5880b0a6@eisentraut.org	2025-02-07 09:46:59 +01:00
Amit Langote	cbc127917e	Track unpruned relids to avoid processing pruned relations This commit introduces changes to track unpruned relations explicitly, making it possible for top-level plan nodes, such as ModifyTable and LockRows, to avoid processing partitions pruned during initial pruning. Scan-level nodes, such as Append and MergeAppend, already avoid the unnecessary processing by accessing partition pruning results directly via part_prune_index. In contrast, top-level nodes cannot access pruning results directly and need to determine which partitions remain unpruned. To address this, this commit introduces a new bitmapset field, es_unpruned_relids, which the executor uses to track the set of unpruned relations. This field is referenced during plan initialization to skip initializing certain nodes for pruned partitions. It is initialized with PlannedStmt.unprunableRelids, a new field that the planner populates with RT indexes of relations that cannot be pruned during runtime pruning. These include relations not subject to partition pruning and those required for execution regardless of pruning. PlannedStmt.unprunableRelids is computed during set_plan_refs() by removing the RT indexes of runtime-prunable relations, identified from PartitionPruneInfos, from the full set of relation RT indexes. ExecDoInitialPruning() then updates es_unpruned_relids by adding partitions that survive initial pruning. To support this, PartitionedRelPruneInfo and PartitionedRelPruningData now include a leafpart_rti_map[] array that maps partition indexes to their corresponding RT indexes. The former is used in set_plan_refs() when constructing unprunableRelids, while the latter is used in ExecDoInitialPruning() to convert partition indexes returned by get_matching_partitions() into RT indexes, which are then added to es_unpruned_relids. These changes make it possible for ModifyTable and LockRows nodes to process only relations that remain unpruned after initial pruning. ExecInitModifyTable() trims lists, such as resultRelations, withCheckOptionLists, returningLists, and updateColnosLists, to consider only unpruned partitions. It also creates ResultRelInfo structs only for these partitions. Similarly, child RowMarks for pruned relations are skipped. By avoiding unnecessary initialization of structures for pruned partitions, these changes improve the performance of updates and deletes on partitioned tables during initial runtime pruning. Due to ExecInitModifyTable() changes as described above, EXPLAIN on a plan for UPDATE and DELETE that uses runtime initial pruning no longer lists partitions pruned during initial pruning. Reviewed-by: Robert Haas <robertmhaas@gmail.com> (earlier versions) Reviewed-by: Tomas Vondra <tomas@vondra.me> Discussion: https://postgr.es/m/CA+HiwqFGkMSge6TgC9KQzde0ohpAycLQuV7ooitEEpbKB0O_mg@mail.gmail.com	2025-02-07 17:15:09 +09:00
John Naylor	128897b101	Fix grammatical typos around possessive "its" Some places spelled it "it's", which is short for "it is". In passing, fix a couple other nearby grammatical errors. Author: Jacob Brazeal <jacob.brazeal@gmail.com> Discussion: https://postgr.es/m/CA+COZaAO8g1KJCV0T48=CkJMjAnnfTGLWOATz+2aCh40c2Nm+g@mail.gmail.com	2025-01-29 14:39:14 +07:00
Amit Kapila	e65dbc9927	Change publication's publish_generated_columns option type to enum. The current boolean publish_generated_columns option only supports a binary choice, which is insufficient for future enhancements where generated columns can be of different types (e.g., stored or virtual). The supported values for the publish_generated_columns option are 'none' and 'stored'. Author: Vignesh C <vignesh21@gmail.com> Reviewed-by: Peter Smith <smithpb2250@gmail.com> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://postgr.es/m/d718d219-dd47-4a33-bb97-56e8fc4da994@eisentraut.org Discussion: https://postgr.es/m/B80D17B2-2C8E-4C7D-87F2-E5B4BE3C069E@gmail.com	2025-01-23 15:28:37 +05:30
Bruce Momjian	50e6eb731d	Update copyright for 2025 Backpatch-through: 13	2025-01-01 11:21:55 -05:00
Michael Paquier	c9b3d4909b	Fix memory leak in pgoutput with relation attribute map pgoutput caches the attribute map of a relation, that is free()'d only when validating a RelationSyncEntry. However, this code path is not taken when calling any of the SQL functions able to do some logical decoding, like pg_logical_slot_{get,peek}_changes(), leaking some memory into CacheMemoryContext on repeated calls. To address this, a relation's attribute map is allocated in PGOutputData's cachectx, free()'d at the end of the execution of these SQL functions when logical decoding ends. This is available down to 15. v13 and v14 have a similar leak, which will be dealt with later. Reported-by: Masahiko Sawada Author: Vignesh C Reviewed-by: Hou Zhijie Discussion: https://postgr.es/m/CAD21AoDkAhQVSukOfH3_reuF-j4EU0-HxMqU3dU+bSTxsqT14Q@mail.gmail.com Discussion: https://postgr.es/m/CALDaNm1hewNAsZ_e6FF52a=9drmkRJxtEPrzCB6-9mkJyeBBqA@mail.gmail.com Backpatch-through: 15	2024-12-30 13:33:09 +09:00
David Rowley	5983a4cffc	Introduce CompactAttribute array in TupleDesc, take 2 The new compact_attrs array stores a few select fields from FormData_pg_attribute in a more compact way, using only 16 bytes per column instead of the 104 bytes that FormData_pg_attribute uses. Using CompactAttribute allows performance-critical operations such as tuple deformation to be performed without looking at the FormData_pg_attribute element in TupleDesc which means fewer cacheline accesses. For some workloads, tuple deformation can be the most CPU intensive part of processing the query. Some testing with 16 columns on a table where the first column is variable length showed around a 10% increase in transactions per second for an OLAP type query performing aggregation on the 16th column. However, in certain cases, the increases were much higher, up to ~25% on one AMD Zen4 machine. This also makes pg_attribute.attcacheoff redundant. A follow-on commit will remove it, thus shrinking the FormData_pg_attribute struct by 4 bytes. Author: David Rowley Reviewed-by: Andres Freund, Victor Yegorov Discussion: https://postgr.es/m/CAApHDvrBztXP3yx=NKNmo3xwFAFhEdyPnvrDg3=M0RhDs+4vYw@mail.gmail.com	2024-12-20 22:31:26 +13:00
Michael Paquier	f0c569d715	Fix memory leak in pgoutput with publication list cache The pgoutput module caches publication names in a list and frees it upon invalidation. However, the code forgot to free the actual publication names within the list elements, as publication names are pstrdup()'d in GetPublication(). This would cause memory to leak in CacheMemoryContext, bloating it over time as this context is not cleaned. This is a problem for WAL senders running for a long time, as an accumulation of invalidation requests would bloat its cache memory usage. A second case, where this leak is easier to see, involves a backend calling SQL functions like pg_logical_slot_{get,peek}_changes() which create a new decoding context with each execution. More publications create more bloat. To address this, this commit adds a new memory context within the logical decoding context and resets it each time the publication names cache is invalidated, based on a suggestion from Amit Kapila. This ensures that the lifespan of the publication names aligns with that of the logical decoding context. This solution changes PGOutputData, which is fine for HEAD but it could cause an ABI breakage in stable branches as the structure size would change, so these are left out for now. Analyzed-by: Michael Paquier, Jeff Davis Author: Zhijie Hou Reviewed-by: Michael Paquier, Masahiko Sawada, Euler Taveira Discussion: https://postgr.es/m/Z0khf9EVMVLOc_YY@paquier.xyz	2024-12-09 16:41:46 +09:00
Michael Paquier	ea792bfd93	Fix memory leak in pgoutput for the WAL sender RelationSyncCache, the hash table in charge of tracking the relation schemas sent through pgoutput, was forgetting to free the TupleDesc associated to the two slots used to store the new and old tuples, causing some memory to be leaked each time a relation is invalidated when the slots of an existing relation entry are cleaned up. This is rather hard to notice as the bloat is pretty minimal, but a long-running WAL sender would be in trouble over time depending on the workload. sysbench has proved to be pretty good at showing the problem, coupled with some memory monitoring of the WAL sender. Issue introduced in `52e4f0cd47`, that has added row filters for tables logically replicated. Author: Boyu Yang Reviewed-by: Michael Paquier, Hou Zhijie Discussion: https://postgr.es/m/DM3PR84MB3442E14B340E553313B5C816E3252@DM3PR84MB3442.NAMPRD84.PROD.OUTLOOK.COM Backpatch-through: 15	2024-11-21 15:14:02 +09:00
Amit Kapila	7054186c4e	Replicate generated columns when 'publish_generated_columns' is set. This patch builds on the work done in commit `745217a051` by enabling the replication of generated columns alongside regular column changes through a new publication parameter: publish_generated_columns. Example usage: CREATE PUBLICATION pub1 FOR TABLE tab_gencol WITH (publish_generated_columns = true); The column list takes precedence. If the generated columns are specified in the column list, they will be replicated even if 'publish_generated_columns' is set to false. Conversely, if generated columns are not included in the column list (assuming the user specifies a column list), they will not be replicated even if 'publish_generated_columns' is true. Author: Vignesh C, Shubham Khanna Reviewed-by: Peter Smith, Amit Kapila, Hayato Kuroda, Shlok Kyal, Ajin Cherian, Hou Zhijie, Masahiko Sawada Discussion: https://postgr.es/m/B80D17B2-2C8E-4C7D-87F2-E5B4BE3C069E@gmail.com	2024-11-07 08:58:49 +05:30
Amit Kapila	745217a051	Replicate generated columns when specified in the column list. This commit allows logical replication to publish and replicate generated columns when explicitly listed in the column list. We also ensured that the generated columns were copied during the initial tablesync when they were published. We will allow to replicate generated columns even when they are not specified in the column list (via a new publication option) in a separate commit. The motivation of this work is to allow replication for cases where the client doesn't have generated columns. For example, the case where one is trying to replicate data from Postgres to the non-Postgres database. Author: Shubham Khanna, Vignesh C, Hou Zhijie Reviewed-by: Peter Smith, Hayato Kuroda, Shlok Kyal, Amit Kapila Discussion: https://postgr.es/m/B80D17B2-2C8E-4C7D-87F2-E5B4BE3C069E@gmail.com	2024-10-30 12:36:26 +05:30
Peter Eisentraut	edee0c621d	Message style improvements	2024-08-29 14:43:34 +02:00
Amit Kapila	3e53492aa7	Drop the temporary tuple slots allocated by pgoutput. In pgoutput, when converting the child table's tuple format to match the parent table's, we temporarily create a new slot to store the converted tuple. However, we missed to drop such temporary slots, leading to resource leakage. Reported-by: Bowen Shi Author: Hou Zhijie Reviewed-by: Amit Kapila Backpatch-through: 15 Discussion: https://postgr.es/m/CAM_vCudv8dc3sjWiPkXx5F2b27UV7_YRKRbtSCcE-pv=cVACGA@mail.gmail.com	2024-06-27 11:35:00 +05:30
Peter Eisentraut	dbbca2cf29	Remove unused #include's from backend .c files as determined by include-what-you-use (IWYU) While IWYU also suggests to add a bunch of #include's (which is its main purpose), this patch does not do that. In some cases, a more specific #include replaces another less specific one. Some manual adjustments of the automatic result: - IWYU currently doesn't know about includes that provide global variable declarations (like -Wmissing-variable-declarations), so those includes are being kept manually. - All includes for port(ability) headers are being kept for now, to play it safe. - No changes of catalog/pg_foo.h to catalog/pg_foo_d.h, to keep the patch from exploding in size. Note that this patch touches just *.c files, so nothing declared in header files changes in hidden ways. As a small example, in src/backend/access/transam/rmgr.c, some IWYU pragma annotations are added to handle a special case there. Discussion: https://www.postgresql.org/message-id/flat/af837490-6b2f-46df-ba05-37ea6a6653fc%40eisentraut.org	2024-03-04 12:02:20 +01:00
Masahiko Sawada	08e6344fd6	Remove ReorderBufferTupleBuf structure. Since commit `a4ccc1cef`, the 'node' and 'alloc_tuple_size' fields of the ReorderBufferTupleBuf structure are no longer used. This leaves only the 'tuple' field in the structure. Since keeping a single-field structure makes little sense, the ReorderBufferTupleBuf is removed entirely. The code is refactored accordingly. No back-patching since these are ABI changes in an exposed structure and functions, and there would be some risk of breaking extensions. Author: Aleksander Alekseev Reviewed-by: Amit Kapila, Masahiko Sawada, Reid Thompson Discussion: https://postgr.es/m/CAD21AoCvnuxiXXfRecp7g9+CeC35POQfhuQeJFr7_9u_Q5jc_Q@mail.gmail.com	2024-01-29 10:37:16 +09:00
Nathan Bossart	14dd0f27d7	Add macros for looping through a List without a ListCell. Many foreach loops only use the ListCell pointer to retrieve the content of the cell, like so: ListCell *lc; foreach(lc, mylist) { int myint = lfirst_int(lc); ... } This commit adds a few convenience macros that automatically declare the loop variable and retrieve the current cell's contents. This allows us to rewrite the previous loop like this: foreach_int(myint, mylist) { ... } This commit also adjusts a few existing loops in order to add coverage for the new/adjusted macros. There is presently no plan to bulk update all foreach loops, as that could introduce a significant amount of back-patching pain. Instead, these macros are primarily intended for use in new code. Author: Jelte Fennema-Nio Reviewed-by: David Rowley, Alvaro Herrera, Vignesh C, Tom Lane Discussion: https://postgr.es/m/CAGECzQSwXKnxGwW1_Q5JE%2B8Ja20kyAbhBHO04vVrQsLcDciwXA%40mail.gmail.com	2024-01-04 16:09:34 -06:00
Bruce Momjian	29275b1d17	Update copyright for 2024 Reported-by: Michael Paquier Discussion: https://postgr.es/m/ZZKTDPxBBMt3C0J9@paquier.xyz Backpatch-through: 12	2024-01-03 20:49:05 -05:00
Amit Kapila	c8bc807cf8	pgoutput: Raise an error for missing protocol version parameter. Currently, we give a misleading error if the user omits to pass the required parameter 'proto_version' in SQL API pg_logical_slot_get_changes() or during START_REPLICATION protocol message. The error raised is as follows which indicates that the wrong version is passed. ERROR: client sent proto_version=0 but server only supports protocol 1 or higher Author: Emre Hasegeli Reviewed-by: Peter Smith, Amit Kapila Discussion: https://postgr.es/m/CAE2gYzwdwtUbs-tPSV-QBwgTubiyGD2ZGsSnAVsDfAGGLDrGOA@mail.gmail.com	2023-12-19 09:53:33 +05:30
Amit Kapila	360392fa2a	Avoid unconditionally filling in missing values with NULL in pgoutput. `52e4f0cd4` introduced a bug in pgoutput in which missing values in tuples were incorrectly filled in with NULL. The problem was the use of CreateTupleDescCopy where CreateTupleDescCopyConstr was required, as the former drops the constraints in the tuple description (specifically, the default value constraint) on the floor. The bug could result in incorrectness when a table replicated via `REPLICA IDENTITY FULL` underwent a schema change that added a column with a default value. The problem is that in such cases updates fill NULL values in old tuples for missing columns for default values. Then on the subscriber, we failed to find a matching tuple and missed updating the required row. Author: Nikhil Benesch Reviewed-by: Hou Zhijie, Amit Kapila Backpatch-through: 15 Discussion: http://postgr.es/m/CAPWqQZTEpZQamYsGMn6ZDRvVywwpVPiKH6OY4KSgA+NmeqFNzA@mail.gmail.com	2023-11-27 08:49:55 +05:30
Peter Eisentraut	721856ff24	Remove distprep A PostgreSQL release tarball contains a number of prebuilt files, in particular files produced by bison, flex, perl, and well as html and man documentation. We have done this consistent with established practice at the time to not require these tools for building from a tarball. Some of these tools were hard to get, or get the right version of, from time to time, and shipping the prebuilt output was a convenience to users. Now this has at least two problems: One, we have to make the build system(s) work in two modes: Building from a git checkout and building from a tarball. This is pretty complicated, but it works so far for autoconf/make. It does not currently work for meson; you can currently only build with meson from a git checkout. Making meson builds work from a tarball seems very difficult or impossible. One particular problem is that since meson requires a separate build directory, we cannot make the build update files like gram.h in the source tree. So if you were to build from a tarball and update gram.y, you will have a gram.h in the source tree and one in the build tree, but the way things work is that the compiler will always use the one in the source tree. So you cannot, for example, make any gram.y changes when building from a tarball. This seems impossible to fix in a non-horrible way. Second, there is increased interest nowadays in precisely tracking the origin of software. We can reasonably track contributions into the git tree, and users can reasonably track the path from a tarball to packages and downloads and installs. But what happens between the git tree and the tarball is obscure and in some cases non-reproducible. The solution for both of these issues is to get rid of the step that adds prebuilt files to the tarball. The tarball now only contains what is in the git tree (). Getting the additional build dependencies is no longer a problem nowadays, and the complications to keep these dual build modes working are significant. And of course we want to get the meson build system working universally. This commit removes the make distprep target altogether. The make dist target continues to do its job, it just doesn't call distprep anymore. () - The tarball also contains the INSTALL file that is built at make dist time, but not by distprep. This is unchanged for now. The make maintainer-clean target, whose job it is to remove the prebuilt files in addition to what make distclean does, is now just an alias to make distprep. (In practice, it is probably obsolete given that git clean is available.) The following programs are now hard build requirements in configure (they were already required by meson.build): - bison - flex - perl Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://www.postgresql.org/message-id/flat/e07408d9-e5f2-d9fd-5672-f53354e9305e@eisentraut.org	2023-11-06 15:18:04 +01:00
Peter Eisentraut	611806cd72	Add trailing commas to enum definitions Since C99, there can be a trailing comma after the last value in an enum definition. A lot of new code has been introducing this style on the fly. Some new patches are now taking an inconsistent approach to this. Some add the last comma on the fly if they add a new last value, some are trying to preserve the existing style in each place, some are even dropping the last comma if there was one. We could nudge this all in a consistent direction if we just add the trailing commas everywhere once. I omitted a few places where there was a fixed "last" value that will always stay last. I also skipped the header files of libpq and ecpg, in case people want to use those with older compilers. There were also a small number of cases where the enum type wasn't used anywhere (but the enum values were), which ended up confusing pgindent a bit, so I left those alone. Discussion: https://www.postgresql.org/message-id/flat/386f8c45-c8ac-4681-8add-e3b0852c1620%40eisentraut.org	2023-10-26 09:20:54 +02:00
Michael Paquier	9210afd3bc	Move tracking of in_streaming to PGOutputData "in_streaming" is a flag used to track if an instance of pgoutput is streaming changes. When pgoutput is started, the flag was always reset, switched it back and forth in the stream start/stop callbacks. Before this commit, it was a global variable, which is confusing as it is actually attached to a state of PGOutputData. Per my analysis, using a global variable did not lead to an active bug like in `54ccfd6586`, but it makes the code more consistent. Note that we cannot backpatch this change anyway as it requires the addition of a new field to PGOutputData, exposed in pgoutput.h. Author: Hou Zhijie Reviewed-by: Amit Kapila, Michael Paquier, Peter Smith Discussion: https://postgr.es/m/OS0PR01MB571690EF24F51F51EFFCBB0E94FAA@OS0PR01MB5716.jpnprd01.prod.outlook.com	2023-09-28 09:33:51 +09:00
Amit Kapila	54ccfd6586	Fix the misuse of origin filter across multiple pg_logical_slot_get_changes() calls. The pgoutput module uses a global variable (publish_no_origin) to cache the action for the origin filter, but we didn't reset the flag when shutting down the output plugin, so subsequent retries may access the previous publish_no_origin value. We fix this by storing the flag in the output plugin's private data. Additionally, the patch removes the currently unused origin string from the structure. For the back branch, to avoid changing the exposed structure, we eliminated the global variable and instead directly used the origin string for change filtering. Author: Hou Zhijie Reviewed-by: Amit Kapila, Michael Paquier Backpatch-through: 16 Discussion: http://postgr.es/m/OS0PR01MB571690EF24F51F51EFFCBB0E94FAA@OS0PR01MB5716.jpnprd01.prod.outlook.com	2023-09-27 14:32:51 +05:30
Michael Paquier	c868cbfef7	Fix typos in pgoutput.c RelationSyncCache was mentioned in two comments under a different name. Issue noticed while reviewing a different patch touching the same area. Introduced by `665d1fad99`. Discussion: https://postgr.es/m/ZQk1Ca_eFDTmBiZy@paquier.xyz	2023-09-20 10:02:12 +09:00
Tom Lane	0245f8db36	Pre-beta mechanical code beautification. Run pgindent, pgperltidy, and reformat-dat-files. This set of diffs is a bit larger than typical. We've updated to pg_bsd_indent 2.1.2, which properly indents variable declarations that have multi-line initialization expressions (the continuation lines are now indented one tab stop). We've also updated to perltidy version 20230309 and changed some of its settings, which reduces its desire to add whitespace to lines to make assignments etc. line up. Going forward, that should make for fewer random-seeming changes to existing code. Discussion: https://postgr.es/m/20230428092545.qfb3y5wcu4cm75ur@alvherre.pgsql	2023-05-19 17:24:48 -04:00
David Rowley	eef231e816	Fix some typos and some incorrectly duplicated words Author: Justin Pryzby Reviewed-by: David Rowley Discussion: https://postgr.es/m/ZD3D1QxoccnN8A1V@telsasoft.com	2023-04-18 14:03:49 +12:00
Amit Kapila	da324d6cd4	Refactor pgoutput_change(). Instead of mostly-duplicate code for different operation (insert/update/delete) types, write a common code to compute old/new tuples, and check the row filter. Author: Hou Zhijie Reviewed-by: Peter Smith, Amit Kapila Discussion: https://postgr.es/m/OS0PR01MB5716194A47FFA8D91133687D94BF9@OS0PR01MB5716.jpnprd01.prod.outlook.com	2023-03-30 11:10:38 +05:30
Amit Kapila	e709596b25	Add macros for ReorderBufferTXN toptxn. Currently, there are quite a few places in reorderbuffer.c that tries to access top-transaction for a subtransaction. This makes the code to access top-transaction consistent and easier to follow. Author: Peter Smith Reviewed-by: Vignesh C, Sawada Masahiko Discussion: https://postgr.es/m/CAHut+PuCznOyTqBQwjRUu-ibG-=KHyCv-0FTcWQtZUdR88umfg@mail.gmail.com	2023-03-17 08:29:41 +05:30
Tom Lane	b803b7d132	Fill EState.es_rteperminfos more systematically. While testing a fix for bug #17823, I discovered that EvalPlanQualStart failed to copy es_rteperminfos from the parent EState, resulting in failure if anything in EPQ execution wanted to consult that information. This led me to conclude that commit `a61b1f748` had been too haphazard about where to fill es_rteperminfos, and that we need to be sure that that happens exactly where es_range_table gets filled. So I changed the signature of ExecInitRangeTable to help ensure that this new requirement doesn't get missed. (Indeed, pgoutput.c was also failing to fill it. Maybe we don't ever need it there, but I wouldn't bet on that.) No test case yet; one will arrive with the fix for #17823. But that needs to be back-patched, while this fix is HEAD-only. Discussion: https://postgr.es/m/17823-b64909cf7d63de84@postgresql.org	2023-03-06 13:10:57 -05:00
Tom Lane	05172f1f37	Don't repeatedly register cache callbacks in pgoutput plugin. Multiple cycles of starting up and shutting down the plugin within a single session would eventually lead to "out of relcache_callback_list slots", because pgoutput_startup blindly re-registered its cache callbacks each time. Fix it to register them only once, as all other users of cache callbacks already take care to do. This has been broken all along, so back-patch to all supported branches. Shi Yu Discussion: https://postgr.es/m/OSZPR01MB631004A78D743D68921FFAD3FDA79@OSZPR01MB6310.jpnprd01.prod.outlook.com	2023-02-23 15:40:42 -05:00
Andres Freund	30b789eafe	Remove uses of AssertVariableIsOfType() obsoleted by `f2b73c8` Author: Nathan Bossart <nathandbossart@gmail.com> Discussion: https://postgr.es/m/20230208172705.GA451849@nathanxps13	2023-02-08 21:06:46 -08:00
Amit Kapila	8c58624df4	Fix the logical replication timeout during large DDLs. The DDLs like Refresh Materialized views that generate lots of temporary data due to rewrite rules may not be processed by output plugins (for example pgoutput). So, we won't send keep-alive messages for a long time while processing such commands and that can lead the subscriber side to timeout. We have previously fixed a similar case for large transactions in commit `f95d53eded` where the output plugin filters all or most of the changes but missed to handle the DDLs. We decided not to backpatch this as this adds a new callback in the existing exposed structure and moreover, users can increase the wal_sender_timeout and wal_receiver_timeout to avoid this problem. Author: Wang wei, Hou Zhijie Reviewed-by: Peter Smith, Ashutosh Bapat, Shi yu, Amit Kapila Discussion: https://postgr.es/m/OS3PR01MB6275478E5D29E4A563302D3D9E2B9@OS3PR01MB6275.jpnprd01.prod.outlook.com Discussion: https://postgr.es/m/CAA5-nLARN7-3SLU_QUxfy510pmrYK6JJb=bk3hcgemAM_pAv+w@mail.gmail.com	2023-02-08 07:58:25 +05:30
Peter Eisentraut	54a177a948	Remove useless casts to (void ) in hash_search() calls Some of these appear to be leftovers from when hash_search() took a char argument (changed in `5999e78fc4`). Since after this there is some more horizontal space available, do some light reformatting where suitable. Reviewed-by: Corey Huinker <corey.huinker@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/fd9adf5d-b1aa-e82f-e4c7-263c30145807%40enterprisedb.com	2023-02-06 09:41:01 +01:00
Amit Kapila	b7ae039536	Ignore dropped and generated columns from the column list. We don't allow different column lists for the same table in the different publications of the single subscription. A publication with a column list except for dropped and generated columns should be considered the same as a publication with no column list (which implicitly includes all columns as part of the columns list). However, as we were not excluding the dropped and generated columns from the column list combining such publications leads to an error "cannot use different column lists for table ...". We decided not to backpatch this fix as there is a risk of users seeing this as a behavior change and also we didn't see any field report of this case. Author: Shi yu Reviewed-by: Amit Kapila Discussion: https://postgr.es/m/OSZPR01MB631091CCBC56F195B1B9ACB0FDFE9@OSZPR01MB6310.jpnprd01.prod.outlook.com	2023-01-13 14:49:23 +05:30
Amit Kapila	216a784829	Perform apply of large transactions by parallel workers. Currently, for large transactions, the publisher sends the data in multiple streams (changes divided into chunks depending upon logical_decoding_work_mem), and then on the subscriber-side, the apply worker writes the changes into temporary files and once it receives the commit, it reads from those files and applies the entire transaction. To improve the performance of such transactions, we can instead allow them to be applied via parallel workers. In this approach, we assign a new parallel apply worker (if available) as soon as the xact's first stream is received and the leader apply worker will send changes to this new worker via shared memory. The parallel apply worker will directly apply the change instead of writing it to temporary files. However, if the leader apply worker times out while attempting to send a message to the parallel apply worker, it will switch to "partial serialize" mode - in this mode, the leader serializes all remaining changes to a file and notifies the parallel apply workers to read and apply them at the end of the transaction. We use a non-blocking way to send the messages from the leader apply worker to the parallel apply to avoid deadlocks. We keep this parallel apply assigned till the transaction commit is received and also wait for the worker to finish at commit. This preserves commit ordering and avoid writing to and reading from files in most cases. We still need to spill if there is no worker available. This patch also extends the SUBSCRIPTION 'streaming' parameter so that the user can control whether to apply the streaming transaction in a parallel apply worker or spill the change to disk. The user can set the streaming parameter to 'on/off', or 'parallel'. The parameter value 'parallel' means the streaming will be applied via a parallel apply worker, if available. The parameter value 'on' means the streaming transaction will be spilled to disk. The default value is 'off' (same as current behaviour). In addition, the patch extends the logical replication STREAM_ABORT message so that abort_lsn and abort_time can also be sent which can be used to update the replication origin in parallel apply worker when the streaming transaction is aborted. Because this message extension is needed to support parallel streaming, parallel streaming is not supported for publications on servers < PG16. Author: Hou Zhijie, Wang wei, Amit Kapila with design inputs from Sawada Masahiko Reviewed-by: Sawada Masahiko, Peter Smith, Dilip Kumar, Shi yu, Kuroda Hayato, Shveta Mallik Discussion: https://postgr.es/m/CAA4eK1+wyN6zpaHUkCLorEWNx75MG0xhMwcFhvjqm2KURZEAGw@mail.gmail.com	2023-01-09 07:52:45 +05:30
Tom Lane	cd4b2334db	Invalidate pgoutput's replication-decisions cache upon schema rename. A schema rename should cause reporting the new qualified names of tables to logical replication subscribers, but that wasn't happening. Flush the RelationSyncCache to make it happen. (If you ask me, the new test case shows that the behavior in this area is still pretty dubious, but apparently it's operating as designed.) Vignesh C Discussion: https://postgr.es/m/CALDaNm32vLRv5KdrDFeVC-CU+4Wg1daA55hMqOxDGJBzvd76-w@mail.gmail.com	2023-01-06 11:11:51 -05:00
Bruce Momjian	c8e1ba736b	Update copyright for 2023 Backpatch-through: 11	2023-01-02 15:00:37 -05:00
Andrew Dunstan	8284cf5f74	Add copyright notices to meson files Discussion: https://postgr.es/m/222b43a5-2fb3-2c1b-9cd0-375d376c8246@dunslane.net	2022-12-20 07:54:39 -05:00
Amit Kapila	40b1491357	Fix incorrect output from pgoutput when using column lists. For Updates and Deletes, we were not honoring the columns list for old tuple values while sending tuple data via pgoutput. This results in pgoutput emitting more columns than expected. This is not a problem for built-in logical replication as we simply ignore additional columns based on the relation information sent previously which didn't have those columns. However, some other users of pgoutput plugin may expect the columns as per the column list. Also, sending extra columns unnecessarily consumes network bandwidth defeating the purpose of the column list feature. Reported-by: Gunnar Morling Author: Hou Zhijie Reviewed-by: Amit Kapila Backpatch-through: 15 Discussion: https://postgr.es/m/CADGJaX9kiRZ-OH0EpWF5Fkyh1ZZYofoNRCrhapBfdk02tj5EKg@mail.gmail.com	2022-12-02 10:52:58 +05:30
Alvaro Herrera	ad86d159b6	Add 'missing_ok' argument to build_attrmap_by_name When it's given as true, return a 0 in the position of the missing column rather than raising an error. This is currently unused, but it allows us to reimplement column permission checking in a subsequent commit. It seems worth breaking into a separate commit because it affects unrelated code. Author: Amit Langote <amitlangote09@gmail.com> Discussion: https://postgr.es/m/CA+HiwqFfiai=qBxPDTjaio_ZcaqUKh+FC=prESrB8ogZgFNNNQ@mail.gmail.com	2022-11-29 09:39:36 +01:00
Andres Freund	902ab2fcef	meson: Add windows resource files The generated resource files aren't exactly the same ones as the old buildsystems generate. Previously "InternalName" and "OriginalFileName" were mostly wrong / not set (despite being required), but that was hard to fix in at least the make build. Additionally, the meson build falls back to a "auto-generated" description when not set, and doesn't set it in a few cases - unlikely that anybody looks at these descriptions in detail. Author: Andres Freund <andres@anarazel.de> Author: Nazir Bilal Yavuz <byavuz81@gmail.com> Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com>	2022-10-05 09:56:05 -07:00
Amit Kapila	13a185f54b	Allow publications with schema and table of the same schema. We previously thought that allowing such cases can confuse users when they specify DROP TABLES IN SCHEMA but that doesn't seem to be the case based on discussion. This helps to uplift the restriction during ALTER TABLE ... SET SCHEMA which used to ensure that we couldn't end up with a publication having both a schema and the same schema's table. To allow this, we need to forbid having any schema on a publication if column lists on a table are specified (and vice versa). This is because otherwise we still need a restriction during ALTER TABLE ... SET SCHEMA to forbid cases where it could lead to a publication having both a schema and the same schema's table with column list. Based on suggestions by Peter Eisentraut. Author: Hou Zhijie and Vignesh C Reviewed-By: Peter Smith, Amit Kapila Backpatch-through: 15, where it was introduced Discussion: https://postgr.es/m/2729c9e2-9aac-8cda-f2f4-34f2bcc18f4e@enterprisedb.com	2022-09-23 08:21:26 +05:30
Alvaro Herrera	790bf615dd	Remove ALL keyword from TABLES IN SCHEMA for publication This may be a bit too subtle, but removing that word from there makes this clause no longer a perfect parallel of the GRANT variant "ALL TABLES IN SCHEMA": indeed, for publications what we record is the schema itself, not the tables therein, which means that any tables added to the schema in the future are also published. This is completely different to what GRANT does, which is affect only the tables that exist when the command is executed. There isn't resounding support for this change, but there are a few positive votes and no opposition. Because the time to 15 RC1 is very short, let's get this out now. Backpatch to 15. Discussion: https://postgr.es/m/2729c9e2-9aac-8cda-f2f4-34f2bcc18f4e	2022-09-22 19:02:25 +02:00
Andres Freund	e6927270cd	meson: Add initial version of meson based build system Autoconf is showing its age, fewer and fewer contributors know how to wrangle it. Recursive make has a lot of hard to resolve dependency issues and slow incremental rebuilds. Our home-grown MSVC build system is hard to maintain for developers not using Windows and runs tests serially. While these and other issues could individually be addressed with incremental improvements, together they seem best addressed by moving to a more modern build system. After evaluating different build system choices, we chose to use meson, to a good degree based on the adoption by other open source projects. We decided that it's more realistic to commit a relatively early version of the new build system and mature it in tree. This commit adds an initial version of a meson based build system. It supports building postgres on at least AIX, FreeBSD, Linux, macOS, NetBSD, OpenBSD, Solaris and Windows (however only gcc is supported on aix, solaris). For Windows/MSVC postgres can now be built with ninja (faster, particularly for incremental builds) and msbuild (supporting the visual studio GUI, but building slower). Several aspects (e.g. Windows rc file generation, PGXS compatibility, LLVM bitcode generation, documentation adjustments) are done in subsequent commits requiring further review. Other aspects (e.g. not installing test-only extensions) are not yet addressed. When building on Windows with msbuild, builds are slower when using a visual studio version older than 2019, because those versions do not support MultiToolTask, required by meson for intra-target parallelism. The plan is to remove the MSVC specific build system in src/tools/msvc soon after reaching feature parity. However, we're not planning to remove the autoconf/make build system in the near future. Likely we're going to keep at least the parts required for PGXS to keep working around until all supported versions build with meson. Some initial help for postgres developers is at https://wiki.postgresql.org/wiki/Meson With contributions from Thomas Munro, John Naylor, Stone Tickle and others. Author: Andres Freund <andres@anarazel.de> Author: Nazir Bilal Yavuz <byavuz81@gmail.com> Author: Peter Eisentraut <peter@eisentraut.org> Reviewed-By: Peter Eisentraut <peter.eisentraut@enterprisedb.com> Discussion: https://postgr.es/m/20211012083721.hvixq4pnh2pixr3j@alap3.anarazel.de	2022-09-21 22:37:17 -07:00
Amit Kapila	6971a839cc	Improve some error messages. It is not our usual style to use "we" in the error messages. Author: Kyotaro Horiguchi Reviewed-By: Amit Kapila Discussion: https://postgr.es/m/20220914.111507.13049297635620898.horikyota.ntt@gmail.com	2022-09-21 09:43:59 +05:30
Peter Geoghegan	bfcf1b3480	Harmonize parameter names in storage and AM code. Make sure that function declarations use names that exactly match the corresponding names from function definitions in storage, catalog, access method, executor, and logical replication code, as well as in miscellaneous utility/library code. Like other recent commits that cleaned up function parameter names, this commit was written with help from clang-tidy. Later commits will do the same for other parts of the codebase. Author: Peter Geoghegan <pg@bowt.ie> Reviewed-By: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/CAH2-WznJt9CMM9KJTMjJh_zbL5hD9oX44qdJ4aqZtjFi-zA3Tg@mail.gmail.com	2022-09-19 19:18:36 -07:00
Tom Lane	efd0c16bec	Avoid using list_length() to test for empty list. The standard way to check for list emptiness is to compare the List pointer to NIL; our list code goes out of its way to ensure that that is the only representation of an empty list. (An acceptable alternative is a plain boolean test for non-null pointer, but explicit mention of NIL is usually preferable.) Various places didn't get that memo and expressed the condition with list_length(), which might not be so bad except that there were such a variety of ways to check it exactly: equal to zero, less than or equal to zero, less than one, yadda yadda. In the name of code readability, let's standardize all those spellings as "list == NIL" or "list != NIL". (There's probably some microscopic efficiency gain too, though few of these look to be at all performance-critical.) A very small number of cases were left as-is because they seemed more consistent with other adjacent list_length tests that way. Peter Smith, with bikeshedding from a number of us Discussion: https://postgr.es/m/CAHut+PtQYe+ENX5KrONMfugf0q6NHg4hR5dAhqEXEc2eefFeig@mail.gmail.com	2022-08-17 11:12:35 -04:00
Amit Kapila	366283961a	Allow users to skip logical replication of data having origin. This patch adds a new SUBSCRIPTION parameter "origin". It specifies whether the subscription will request the publisher to only send changes that don't have an origin or send changes regardless of origin. Setting it to "none" means that the subscription will request the publisher to only send changes that have no origin associated. Setting it to "any" means that the publisher sends changes regardless of their origin. The default is "any". Usage: CREATE SUBSCRIPTION sub1 CONNECTION 'dbname=postgres port=9999' PUBLICATION pub1 WITH (origin = none); This can be used to avoid loops (infinite replication of the same data) among replication nodes. This feature allows filtering only the replication data originating from WAL but for initial sync (initial copy of table data) we don't have such a facility as we can only distinguish the data based on origin from WAL. As a follow-up patch, we are planning to forbid the initial sync if the origin is specified as none and we notice that the publication tables were also replicated from other publishers to avoid duplicate data or loops. We forbid to allow creating origin with names 'none' and 'any' to avoid confusion with the same name options. Author: Vignesh C, Amit Kapila Reviewed-By: Peter Smith, Amit Kapila, Dilip Kumar, Shi yu, Ashutosh Bapat, Hayato Kuroda Discussion: https://postgr.es/m/CALDaNm0gwjY_4HFxvvty01BOT01q_fJLKQ3pWP9=9orqubhjcQ@mail.gmail.com	2022-07-21 08:47:38 +05:30
Andres Freund	fd4bad1655	Remove now superfluous declarations of dlsym()ed symbols. The prior commit declared them centrally. Author: Andres Freund <andres@anarazel.de> Reviewed-By: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/20211101020311.av6hphdl6xbjbuif@alap3.anarazel.de	2022-07-17 17:29:32 -07:00
Alvaro Herrera	f10a025cfe	Implement List support for TransactionId Use it for RelationSyncEntry->streamed_txns, which is currently using an integer list. The API support is not complete, not because it is hard to write but because it's unclear that it's worth the code space, there being so little use of XID lists. Discussion: https://postgr.es/m/202205130830.g5ntonhztspb@alvherre.pgsql Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>	2022-07-04 14:52:12 +02:00

1 2 3

119 Commits