postgres

mirror of https://github.com/postgres/postgres.git synced 2025-11-10 17:42:29 +03:00

Author	SHA1	Message	Date
Peter Eisentraut	7e9a4c5c3d	Use consistent style for checking return from system calls Use if (something() != 0) error ... instead of just if (something) error ... The latter is not incorrect, but it's a bit confusing and not the common style. Discussion: https://www.postgresql.org/message-id/flat/5de61b6b-8be9-7771-0048-860328efe027%402ndquadrant.com	2019-07-07 15:28:49 +02:00
Peter Eisentraut	6a1cd8b923	Unwind some workarounds for lack of portable int64 format specifier Because there is no portable int64/uint64 format specifier and we can't stick macros like INT64_FORMAT into the middle of a translatable string, we have been using various workarounds that put the number to be printed into a string buffer first. Now that we always use our own sprintf(), we can rely on %lld and %llu to work, so we can use those. This patch undoes this workaround in a few places where it was egregiously verbose. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/CAH2-Wz%3DWbNxc5ob5NJ9yqo2RMJ0q4HXDS30GVCobeCvC9A1L9A%40mail.gmail.com	2019-07-04 17:01:43 +02:00
Michael Paquier	c74d49d41c	Fix many typos and inconsistencies Author: Alexander Lakhin Discussion: https://postgr.es/m/af27d1b3-a128-9d62-46e0-88f424397f44@gmail.com	2019-07-01 10:00:23 +09:00
Tom Lane	54100f5c60	Add an enforcement mechanism for global object names in regression tests. In commit `18555b132` we tentatively established a rule that regression tests should use names containing "regression" for databases, and names starting with "regress_" for all other globally-visible object names, so as to circumscribe the side-effects that "make installcheck" could have on an existing installation. This commit adds a simple enforcement mechanism for that rule: if the code is compiled with ENFORCE_REGRESSION_TEST_NAME_RESTRICTIONS defined, it will emit a warning (not an error) whenever a database, role, tablespace, subscription, or replication origin name is created that doesn't obey the rule. Running one or more buildfarm members with that symbol defined should be enough to catch new violations, at least in the regular regression tests. Most TAP tests wouldn't notice such warnings, but that's actually fine because TAP tests don't execute against an existing server anyway. Since it's already the case that running src/test/modules/ tests in installcheck mode is deprecated, we can use that as a home for tests that seem unsafe to run against an existing server, such as tests that might have side-effects on existing roles. Document that (though this commit doesn't in itself make it any less safe than before). Update regress.sgml to define these restrictions more clearly, and to clean up assorted lack-of-up-to-date-ness in its descriptions of the available regression tests. Discussion: https://postgr.es/m/16638.1468620817@sss.pgh.pa.us	2019-06-29 11:34:00 -04:00
Tom Lane	a1e61badf9	Disallow user-created replication origins named "pg_xxx". Since we generate such names internally, it seems like a good idea to have a policy of disallowing them for user use, as we do for many other object types. Otherwise attempts to use them will randomly fail due to collisions with internally-generated names. Discussion: https://postgr.es/m/3606.1561747369@sss.pgh.pa.us	2019-06-29 10:30:08 -04:00
Michael Paquier	3412030205	Fix more typos and inconsistencies in the tree Author: Alexander Lakhin Discussion: https://postgr.es/m/0a5419ea-1452-a4e6-72ff-545b1a5a8076@gmail.com	2019-06-17 16:13:16 +09:00
Thomas Munro	7988cb446d	Fix typos. Reviewed-by: Michael Paquier Discussion: https://postgr.es/m/CA%2BhUKGJFWXmtYo6Frd77RR8YXCHz7hJ2mRy5aHV%3D7fJOqDnBHA%40mail.gmail.com	2019-05-24 12:00:59 +12:00
Tom Lane	8255c7a5ee	Phase 2 pgindent run for v12. Switch to 2.1 version of pg_bsd_indent. This formats multiline function declarations "correctly", that is with additional lines of parameter declarations indented to match where the first line's left parenthesis is. Discussion: https://postgr.es/m/CAEepm=0P3FeTXRcU5B2W3jv3PgRVZ-kGUXLGfd42FFhUROO3ug@mail.gmail.com	2019-05-22 13:04:48 -04:00
Tom Lane	be76af171c	Initial pgindent run for v12. This is still using the 2.0 version of pg_bsd_indent. I thought it would be good to commit this separately, so as to document the differences between 2.0 and 2.1 behavior. Discussion: https://postgr.es/m/16296.1558103386@sss.pgh.pa.us	2019-05-22 12:55:34 -04:00
Alvaro Herrera	75445c1515	More message style fixes Discussion: https://postgr.es/m/20190515183005.GA26486@alvherre.pgsql	2019-05-16 19:14:31 -04:00
Michael Paquier	7e19929ea2	Fix duplicated words in comments Author: Stephen Amell Discussion: https://postgr.es/m/539fa271-21b3-777e-a468-d96cffe9c768@gmail.com	2019-05-14 09:37:35 +09:00
Tom Lane	32ebb35128	Fix logical replication's ideas about which type OIDs are built-in. Only hand-assigned type OIDs should be presumed to match across different PG servers; those assigned during genbki.pl or during initdb are likely to change due to addition or removal of unrelated objects. This means that the cutoff should be FirstGenbkiObjectId (in HEAD) or FirstBootstrapObjectId (before that), not FirstNormalObjectId. Compare postgres_fdw's is_builtin() test. It's likely that this error has no observable consequence in a normally-functioning system, since ATM the only affected type OIDs are system catalog rowtypes and information_schema types, which would not typically be interesting for logical replication. But you could probably break it if you tried hard, so back-patch. Discussion: https://postgr.es/m/15150.1557257111@sss.pgh.pa.us	2019-05-13 17:23:00 -04:00
Tom Lane	a1a789eb5a	In walreceiver, don't try to do ereport() in a signal handler. This is quite unsafe, even for the case of ereport(FATAL) where we won't return control to the interrupted code, and despite this code's use of a flag to restrict the areas where we'd try to do it. It's possible for example that we interrupt malloc or free while that's holding a lock that's meant to protect against cross-thread interference. Then, any attempt to do malloc or free within ereport() will result in a deadlock, preventing the walreceiver process from exiting in response to SIGTERM. We hypothesize that this explains some hard-to-reproduce failures seen in the buildfarm. Hence, get rid of the immediate-exit code in WalRcvShutdownHandler, as well as the logic associated with WalRcvImmediateInterruptOK. Instead, we need to take care that potentially-blocking operations in the walreceiver's data transmission logic (libpqwalreceiver.c) will respond reasonably promptly to the process's latch becoming set and then call ProcessWalRcvInterrupts. Much of the needed code for that was already present in libpqwalreceiver.c. I refactored things a bit so that all the uses of PQgetResult use latch-aware waiting, but didn't need to do much more. These changes should be enough to ensure that libpqwalreceiver.c will respond promptly to SIGTERM whenever it's waiting to receive data. In principle, it could block for a long time while waiting to send data too, and this patch does nothing to guard against that. I think that that hazard is mostly theoretical though: such blocking should occur only if we fill the kernel's data transmission buffers, and we don't generally send enough data to make that happen without waiting for input. If we find out that the hazard isn't just theoretical, we could fix it by using PQsetnonblocking, but that would require more ticklish changes than I care to make now. This is a bug fix, but it seems like too big a change to push into the back branches without much more testing than there's time for right now. Perhaps we'll back-patch once we have more confidence in the change. Patch by me; thanks to Thomas Munro for review. Discussion: https://postgr.es/m/20190416070119.GK2673@paquier.xyz	2019-04-29 12:26:07 -04:00
Michael Paquier	9010156445	Fix thinko introduced by `82a5649` in slot.c When saving a replication slot, failing to close the temporary path used to save the slot information is considered as a failure and reported as such. However the code forgot to leave immediately as other failure paths do. Noticed while looking up at this area of the code for another patch.	2019-04-17 10:01:22 +09:00
Michael Paquier	47ac2033d4	Simplify some ERROR paths clearing wait events and transient files Transient files and wait events get normally cleaned up when seeing an exception (be it in the context of a transaction for a backend or another process like the checkpointer), hence there is little point in complicating error code paths to do this work. This shaves a bit of code, and removes some extra handling with errno which needed to be preserved during the cleanup steps done. Reported-by: Masahiko Sawada Author: Michael Paquier Reviewed-by: Tom Lane, Masahiko Sawada Discussion: https://postgr.es/m/CAD21AoDhHYVq5KkXfkaHhmjA-zJYj-e4teiRAJefvXuKJz1tKQ@mail.gmail.com	2019-04-17 09:51:45 +09:00
Michael Paquier	5ed4b123b6	Remove duplicate assignment when initializing logical decoder context The private data in the WAL reader is already getting set when allocating it. Author: Antonin Houska Reviewed-by: Tom Lane Discussion: https://postgr.es/m/30563.1555329094@localhost	2019-04-16 15:08:38 +09:00
Michael Paquier	c34677fdaa	Fix SHOW ALL command for non-superusers with replication connection Since Postgres 10, SHOW commands can be triggered with replication connections in a WAL sender context, however it missed that a transaction context is needed for syscache lookups. This commit makes sure that the syscache lookups can happen correctly by setting a transaction context when running SHOW commands in a WAL sender. Superuser-only parameters can be displayed using SHOW commands not only to superusers, but also to members of system role pg_read_all_settings, which requires a syscache lookup to check if the connected role is a member of this system role or not, or the instance crashes. Superusers do not need to check the syscache so it worked correctly in this case. New tests are added to cover this issue. Reported-by: Alexander Kukushkin Author: Michael Paquier Reviewed-by: Álvaro Herrera Discussion: https://postgr.es/m/15734-2daa8761eeed8e20@postgresql.org Backpatch-through: 10	2019-04-15 12:34:32 +09:00
Magnus Hagander	77bd49adba	Show shared object statistics in pg_stat_database This adds a row to the pg_stat_database view with datoid 0 and datname NULL for those objects that are not in a database. This was added particularly for checksums, but we were already tracking more statistics for these objects, just not returning them. Also add a checksum_last_failure column that holds the timestamptz of the last checksum failure that occurred in a database (or in a non-database file), if any. Author: Julien Rouhaud <rjuju123@gmail.com>	2019-04-12 14:04:50 +02:00
Alvaro Herrera	4dba0f6dc4	Fix typo	2019-04-09 13:00:12 -04:00
Alvaro Herrera	9f06d79ef8	Add facility to copy replication slots This allows the user to create duplicates of existing replication slots, either logical or physical, and even to change properties such as whether they are temporary or the output plugin used. There are multiple uses for this, such as initializing multiple replicas using the slot for one base backup; investigating logical replication issues; and selecting a different output plugin. Author: Masahiko Sawada Reviewed-by: Michael Paquier, Andres Freund, Petr Jelinek Discussion: https://postgr.es/m/CAD21AoAm7XX8y_tOPP6j4Nzzch12FvA1wPqiO690RCk+uYVstg@mail.gmail.com	2019-04-05 18:05:18 -03:00
Peter Eisentraut	fc22b6623b	Generated columns This is an SQL-standard feature that allows creating columns that are computed from expressions rather than assigned, similar to a view or materialized view but on a column basis. This implements one kind of generated column: stored (computed on write). Another kind, virtual (computed on read), is planned for the future, and some room is left for it. Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/b151f851-4019-bdb1-699e-ebab07d2f40a@2ndquadrant.com	2019-03-30 08:15:57 +01:00
Thomas Munro	2fc7af5e96	Add basic infrastructure for 64 bit transaction IDs. Instead of inferring epoch progress from xids and checkpoints, introduce a 64 bit FullTransactionId type and use it to track xid generation. This fixes an unlikely bug where the epoch is reported incorrectly if the range of active xids wraps around more than once between checkpoints. The only user-visible effect of this commit is to correct the epoch used by txid_current() and txid_status(), also visible with pg_controldata, in those rare circumstances. It also creates some basic infrastructure so that later patches can use 64 bit transaction IDs in more places. The new type is a struct that we pass by value, as a form of strong typedef. This prevents the sort of accidental confusion between TransactionId and FullTransactionId that would be possible if we were to use a plain old uint64. Author: Thomas Munro Reported-by: Amit Kapila Reviewed-by: Andres Freund, Tom Lane, Heikki Linnakangas Discussion: https://postgr.es/m/CAA4eK1%2BMv%2Bmb0HFfWM9Srtc6MVe160WFurXV68iAFMcagRZ0dQ%40mail.gmail.com	2019-03-28 18:12:20 +13:00
Peter Eisentraut	69039fda83	Add walreceiver API to get remote server version Add a separate walreceiver API function walrcv_server_version() to get the version of the remote server, instead of doing it as part of walrcv_identify_system(). This allows the server version to be available even for uses that don't call IDENTIFY_SYSTEM, and it seems cleaner anyway. This is for an upcoming patch, not currently used. Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://www.postgresql.org/message-id/20190115071359.GF1433@paquier.xyz	2019-03-15 10:16:26 +01:00
Michael Paquier	6dd263cfaa	Rename pg_verify_checksums to pg_checksums The current tool name is too restrictive and focuses only on verifying checksums. As more options to control checksums for an offline cluster are planned to be added, switch to a more generic name. Documentation as well as all past references to the tool are updated. Author: Michael Paquier Reviewed-by: Michael Banck, Fabien Coelho, Seigei Kornilov Discussion: https://postgr.es/m/20181221201616.GD4974@nighthawk.caipicrew.dd-dns.de	2019-03-13 10:43:20 +09:00
Andres Freund	c2fe139c20	tableam: Add and use scan APIs. To allow table accesses to not be directly dependent on heap, several new abstractions are needed. Specifically: 1) Heap scans need to be generalized into table scans. Do this by introducing TableScanDesc, which will be the "base class" for individual AMs. This contains the AM independent fields from HeapScanDesc. The previous heap_{beginscan,rescan,endscan} et al. have been replaced with a table_ version. There's no direct replacement for heap_getnext(), as that returned a HeapTuple, which is undesirable for other AMs. Instead there's table_scan_getnextslot(). But note that heap_getnext() lives on, it's still used widely to access catalog tables. This is achieved by new scan_begin, scan_end, scan_rescan, scan_getnextslot callbacks. 2) The portion of parallel scans that's shared between backends needs to be able to do so without the user doing per-AM work. To achieve that new parallelscan_{estimate, initialize, reinitialize} callbacks are introduced, which operate on a new ParallelTableScanDesc, which again can be subclassed by AMs. As it is likely that several AMs are going to be block oriented, block oriented callbacks that can be shared between such AMs are provided and used by heap. table_block_parallelscan_{estimate, initialize, reinitialize} as callbacks, and table_block_parallelscan_{nextpage, init} for use in AMs. These operate on a ParallelBlockTableScanDesc. 3) Index scans need to be able to access tables to return a tuple, and there needs to be state across individual accesses to the heap to store state like buffers. That's now handled by introducing a sort-of-scan IndexFetchTable, which again is intended to be subclassed by individual AMs (for heap IndexFetchHeap). The relevant callbacks for an AM are index_fetch_{end, begin, reset} to create the necessary state, and index_fetch_tuple to retrieve an indexed tuple. Note that index_fetch_tuple implementations need to be smarter than just blindly fetching the tuples for AMs that have optimizations similar to heap's HOT - the currently alive tuple in the update chain needs to be fetched if appropriate. Similar to table_scan_getnextslot(), it's undesirable to continue to return HeapTuples. Thus index_fetch_heap (might want to rename that later) now accepts a slot as an argument. Core code doesn't have a lot of call sites performing index scans without going through the systable_* API (in contrast to loads of heap_getnext calls and working directly with HeapTuples). Index scans now store the result of a search in IndexScanDesc->xs_heaptid, rather than xs_ctup->t_self. As the target is not generally a HeapTuple anymore that seems cleaner. To be able to sensibly adapt code to use the above, two further callbacks have been introduced: a) slot_callbacks returns a TupleTableSlotOps* suitable for creating slots capable of holding a tuple of the AM's type. table_slot_callbacks() and table_slot_create() are based upon that, but have additional logic to deal with views, foreign tables, etc. While this change could have been done separately, nearly all the call sites that needed to be adapted for the rest of this commit also would have needed to be adapted for table_slot_callbacks(), making separation not worthwhile. b) tuple_satisfies_snapshot checks whether the tuple in a slot is currently visible according to a snapshot. That's required as a few places now don't have a buffer + HeapTuple around, but a slot (which in heap's case internally has that information). Additionally a few infrastructure changes were needed: I) SysScanDesc, as used by systable_{beginscan, getnext} et al. now internally uses a slot to keep track of tuples. While systable_getnext() still returns HeapTuples, and will so for the foreseeable future, the index API (see 1) above) now only deals with slots. The remainder, and largest part, of this commit is then adjusting all scans in postgres to use the new APIs. Author: Andres Freund, Haribabu Kommi, Alvaro Herrera Discussion: https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de https://postgr.es/m/20160812231527.GA690404@alvherre.pgsql	2019-03-11 12:46:41 -07:00
Magnus Hagander	6b9e875f72	Track block level checksum failures in pg_stat_database This adds a column that counts how many checksum failures have occurred on files belonging to a specific database. Both checksum failures during normal backend processing and those created when a base backup detects a checksum failure are counted. Author: Magnus Hagander Reviewed by: Julien Rouhaud	2019-03-09 10:47:30 -08:00
Michael Paquier	82a5649fb9	Tighten use of OpenTransientFile and CloseTransientFile This fixes two sets of issues related to the use of transient files in the backend: 1) OpenTransientFile() has been used in some code paths with read-write flags while read-only is sufficient, so switch those calls to be read-only where necessary. These have been reported by Joe Conway. 2) When opening transient files, it is up to the caller to close the file descriptors opened. In error code paths, CloseTransientFile() gets called to clean up things before issuing an error. However in normal exit paths, a lot of callers of CloseTransientFile() never actually reported errors, which could leave a file descriptor open without knowing about it. This is an issue I complained about a couple of times, but never had the courage to write and submit a patch, so here we go. Note that one frontend code path is impacted by this commit so as an error is issued when fetching control file data, making backend and frontend to be treated consistently. Reported-by: Joe Conway, Michael Paquier Author: Michael Paquier Reviewed-by: Álvaro Herrera, Georgios Kokolatos, Joe Conway Discussion: https://postgr.es/m/20190301023338.GD1348@paquier.xyz Discussion: https://postgr.es/m/c49b69ec-e2f7-ff33-4f17-0eaa4f2cef27@joeconway.com	2019-03-09 08:50:55 +09:00
Andres Freund	70b9bda65f	Use a virtual rather than a heap slot in two places where that suffices. Author: Andres Freund Discussion: https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de	2019-03-01 17:26:43 -08:00
Peter Eisentraut	6ae578a91e	Set fallback_application_name for a walreceiver to cluster_name By default, the fallback_application_name for a physical walreceiver is "walreceiver". This means that multiple standbys cannot be distinguished easily on a primary, for example in pg_stat_activity or synchronous_standby_names. If cluster_name is set, use that for fallback_application_name in the walreceiver. (If it's not set, it remains "walreceiver".) If someone set cluster_name to identify their instance, we might as well use that by default to identify the node remotely as well. It's still possible to specify another application_name in primary_conninfo explicitly. Reviewed-by: Euler Taveira <euler@timbira.com.br> Discussion: https://www.postgresql.org/message-id/flat/1257eaee-4874-e791-e83a-46720c72cac7@2ndquadrant.com	2019-02-27 10:59:25 +01:00
Andres Freund	ff11e7f4b9	Use slots in trigger infrastructure, except for the actual invocation. In preparation for abstracting table storage, convert trigger.c to track tuples in slots. Which also happens to make code calling triggers simpler. As the calling interface for triggers themselves is not changed in this patch, HeapTuples still are extracted from the slot at that time. But that's handled solely inside trigger.c, not visible to callers. It's quite likely that we'll want to revise the external trigger interface, but that's a separate large project. As part of this work the slots used for old/new/return tuples are moved from EState into ResultRelInfo, as different updated tables might need different slots. The slots are also now created on demand, which is good from an efficiency POV and also makes the modifying code simpler. Author: Andres Freund, Amit Khandekar and Ashutosh Bapat Discussion: https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de	2019-02-26 20:31:38 -08:00
Michael Paquier	56fadbedbd	Mark correctly initial slot snapshots with MVCC type when built When building an initial slot snapshot, snapshots are marked with the historic MVCC snapshot type, with the marker field being set in SnapBuildBuildSnapshot() but not overridden in SnapBuildInitialSnapshot(). Existing callers of SnapBuildBuildSnapshot() do not care about the type of snapshot used, but extensions calling it actually may, as reported. While on it, mark correctly the snapshot type when importing one. This is cosmetic as the field is enforced to 0. Author: Antonin Houska Reviewed-by: Álvaro Herrera, Michael Paquier Discussion: https://postgr.es/m/23215.1527665193@localhost Backpatch-through: 9.4	2019-02-20 12:31:07 +09:00
Peter Eisentraut	37d9916020	More unconstify use Replace casts whose only purpose is to cast away const with the unconstify() macro. Discussion: https://www.postgresql.org/message-id/flat/53a28052-f9f3-1808-fed9-460fd43035ab%402ndquadrant.com	2019-02-13 11:50:16 +01:00
Alvaro Herrera	8c67d29fd5	Relax overly strict assertion Ever since its birth, ReorderBufferBuildTupleCidHash() has contained an assertion that a catalog tuple cannot change Cmax after acquiring one. But that's wrong: if a subtransaction executes DDL that affects that catalog tuple, and later aborts and another DDL affects the same tuple, it will change Cmax. Relax the assertion to merely verify that the Cmax remains valid and monotonically increasing, instead. Add a test that tickles the relevant code. Diagnosed by, and initial patch submitted by: Arseny Sher Co-authored-by: Arseny Sher Discussion: https://postgr.es/m/874l9p8hyw.fsf@ars-thinkpad	2019-02-12 18:42:37 -03:00
Tom Lane	232a8e233f	Fix erroneous error reports in snapbuild.c. It's pretty unhelpful to report the wrong file name in a complaint about syscall failure, but SnapBuildSerialize managed to do that twice in a span of 50 lines. Also fix half a dozen missing or poorly-chosen errcode assignments; that's mostly cosmetic, but still wrong. Noted while studying recent failures on buildfarm member nightjar. I'm not sure whether those reports are actually giving the wrong filename, because there are two places here with identically spelled error messages. The other one is specifically coded not to report ENOENT, but if it's this one, how could we be getting ENOENT from open() with O_CREAT? Need to sit back and await results. However, these ereports are clearly broken from birth, so back-patch.	2019-02-12 01:12:52 -05:00
Michael Paquier	ea92368cd1	Move max_wal_senders out of max_connections for connection slot handling Since its introduction, max_wal_senders is counted as part of max_connections when it comes to define how many connection slots can be used for replication connections with a WAL sender context. This can lead to confusion for some users, as it could be possible to block a base backup or replication from happening because other backend sessions are already taken for other purposes by an application, and superuser-only connection slots are not a correct solution to handle that case. This commit makes max_wal_senders independent of max_connections for its handling of PGPROC entries in ProcGlobal, meaning that connection slots for WAL senders are handled using their own free queue, like autovacuum workers and bgworkers. One compatibility issue that this change creates is that a standby now requires to have a value of max_wal_senders at least equal to its primary. So, if a standby created enforces the value of max_wal_senders to be lower than that, then this could break failovers. Normally this should not be an issue though, as any settings of a standby are inherited from its primary as postgresql.conf gets normally copied as part of a base backup, so parameters would be consistent. Author: Alexander Kukushkin Reviewed-by: Kyotaro Horiguchi, Petr Jelínek, Masahiko Sawada, Oleksii Kliukin Discussion: https://postgr.es/m/CAFh8B=nBzHQeYAu0b8fjK-AF1X4+_p6GRtwG+cCgs6Vci2uRuQ@mail.gmail.com	2019-02-12 10:07:56 +09:00
Tom Lane	f09346a9c6	Refactor planner's header files. Create a new header optimizer/optimizer.h, which exposes just the planner functions that can be used "at arm's length", without need to access Paths or the other planner-internal data structures defined in nodes/relation.h. This is intended to provide the whole planner API seen by most of the rest of the system; although FDWs still need to use additional stuff, and more thought is also needed about just what selfuncs.c should rely on. The main point of doing this now is to limit the amount of new #include baggage that will be needed by "planner support functions", which I expect to introduce later, and which will be in relevant datatype modules rather than anywhere near the planner. This commit just moves relevant declarations into optimizer.h from other header files (a couple of which go away because everything got moved), and adjusts #include lists to match. There's further cleanup that could be done if we want to decide that some stuff being exposed by optimizer.h doesn't belong in the planner at all, but I'll leave that for another day. Discussion: https://postgr.es/m/11460.1548706639@sss.pgh.pa.us	2019-01-29 15:48:51 -05:00
Andres Freund	c91560defc	Move remaining code from tqual.[ch] to heapam.h / heapam_visibility.c. Given these routines are heap specific, and that there will be more generic visibility support via table AM, it makes sense to move the prototypes to heapam.h (routines like HeapTupleSatisfiesVacuum will not be exposed in a generic fashion, because they are too storage specific). Similarly, the code in tqual.c is specific to heap, so moving it into access/heap/ makes sense. Author: Andres Freund Discussion: https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de	2019-01-21 17:07:10 -08:00
Andres Freund	b7eda3e0e3	Move generic snapshot related code from tqual.h to snapmgr.h. The code in tqual.c is largely heap specific. Due to the upcoming pluggable storage work, it therefore makes sense to move it into access/heap/ (as the file's header notes, the tqual name isn't very good). But the various statically allocated snapshot and snapshot initialization functions are now (see previous commit) generic and do not depend on functions declared in tqual.h anymore. Therefore move. Also move XidInMVCCSnapshot as that's useful for future AMs, and already used outside of tqual.c. Author: Andres Freund Discussion: https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de	2019-01-21 17:06:41 -08:00
Andres Freund	63746189b2	Change snapshot type to be determined by enum rather than callback. This is in preparation for allowing the same snapshot to be used for different table AMs. With the current callback based approach we would need one callback for each supported AM, which clearly would not be extensible. Thus add a new Snapshot->snapshot_type field, and move the dispatch into HeapTupleSatisfiesVisibility() (which is now a function). Later work will then dispatch calls to HeapTupleSatisfiesVisibility() and other AMs' visibility functions depending on the type of the table. The central SnapshotType enum also seems like a good location to centralize documentation about the intended behaviour of various types of snapshots. As tqual.h isn't included by bufmgr.h any more (as HeapTupleSatisfies* isn't referenced by TestForOldSnapshot() anymore) a few files now need to include it directly. Author: Andres Freund, loosely based on earlier work by Haribabu Kommi Discussion: https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de https://postgr.es/m/20160812231527.GA690404@alvherre.pgsql	2019-01-21 17:03:15 -08:00
Andres Freund	e7cc78ad43	Remove superfluous tqual.h includes. Most of these had been obsoleted by `568d4138c` / the SnapshotNow removal. This is in preparation for moving most of tqual.[ch] into either snapmgr.h or heapam.h, which in turn is in preparation for pluggable table AMs. Author: Andres Freund Discussion: https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de	2019-01-21 12:15:02 -08:00
Andres Freund	e0c4ec0728	Replace uses of heap_open et al with the corresponding table_* function. Author: Andres Freund Discussion: https://postgr.es/m/20190111000539.xbv7s6w7ilcvm7dp@alap3.anarazel.de	2019-01-21 10:51:37 -08:00
Andres Freund	111944c5ee	Replace heapam.h includes with {table, relation}.h where applicable. A lot of files only included heapam.h for relation_open, heap_open etc - replace the heapam.h include in those files with the narrower header. Author: Andres Freund Discussion: https://postgr.es/m/20190111000539.xbv7s6w7ilcvm7dp@alap3.anarazel.de	2019-01-21 10:51:37 -08:00
Andres Freund	4c850ecec6	Don't include heapam.h from other headers. heapam.h previously was included in a number of widely used headers (e.g. execnodes.h, indirectly in executor.h, ...). That's problematic on its own, as heapam.h contains a lot of low-level details that don't need to be exposed that widely, but becomes more problematic with the upcoming introduction of pluggable table storage - it seems inappropriate for heapam.h to be included that widely afterwards. heapam.h was largely only included in other headers to get the HeapScanDesc typedef (which was defined in heapam.h, even though HeapScanDescData is defined in relscan.h). The better solution here seems to be to just use the underlying struct (forward declared where necessary). Similar for BulkInsertState. Another problem was that LockTupleMode was used in executor.h - parts of the file tried to cope without heapam.h, but due to the fact that it indirectly included it, several subsequent violations of that goal were not noticed. We could just reuse the approach of declaring parameters as int, but it seems nicer to move LockTupleMode to lockoptions.h - that's not a perfect location, but also doesn't seem bad. As a number of files relied on implicitly included heapam.h, a significant number of files grew an explicit include. It's quite probable that a few external projects will need to do the same. Author: Andres Freund Reviewed-By: Alvaro Herrera Discussion: https://postgr.es/m/20190114000701.y4ttcb74jpskkcfb@alap3.anarazel.de	2019-01-14 16:24:41 -08:00
Bruce Momjian	97c39498e5	Update copyright for 2019 Backpatch-through: certain files through 9.4	2019-01-02 12:44:25 -05:00
Michael Paquier	7fee252f6f	Add timestamp of last received message from standby to pg_stat_replication The timestamp generated by the standby at message transmission has been included in the protocol since its introduction for both the status update message and hot standby feedback message, but it has never appeared in pg_stat_replication. Seeing this timestamp does not matter much with a cluster which has a lot of activity, but on a mostly-idle cluster, this makes monitoring able to react faster than the configured timeouts. Author: MyungKyu LIM Reviewed-by: Michael Paquier, Masahiko Sawada Discussion: https://postgr.es/m/1657809367.407321.1533027417725.JavaMail.jboss@ep2ml404	2018-12-09 16:35:06 +09:00
Michael Paquier	5c99513975	Fix various checksum check problems for pg_verify_checksums and base backups Three issues are fixed in this patch: - Base backups forgot to ignore files specific to EXEC_BACKEND, leading to spurious warnings when checksums are enabled, per analysis from me. - pg_verify_checksums forgot about files specific to EXEC_BACKEND, leading to failures of the tool on any such build, particularly Windows. This error was originally found by newly-introduced TAP tests in various buildfarm members using EXEC_BACKEND. - pg_verify_checksums forgot to account for temporary files and temporary paths, which could be valid relation files, without checksums, per report from Andres Freund. More tests are added to cover this case. A new test case which emulates corruption for a file in a different tablespace is added, coming from Michael Banck, while I have coded the main code and refactored the test code. Author: Michael Banck, Michael Paquier Reviewed-by: Stephen Frost, David Steele Discussion: https://postgr.es/m/20181021134206.GA14282@paquier.xyz	2018-11-30 10:34:45 +09:00
Michael Paquier	4c703369af	Fix handling of synchronous replication for stopping WAL senders This fixes an oversight from `c6c3334` which forgot that if a subset of WAL senders are stopping and in a sync state, other WAL senders could still be waiting for a WAL position to be synced while committing a transaction. However the subset of stopping senders would not release waiters, potentially breaking synchronous replication guarantees. This commit makes sure that even WAL senders stopping are able to release waiters and are tracked properly. On 9.4, this can also trigger an assertion failure when setting for example max_wal_senders to 1 where a WAL sender is not able to find itself as in synchronous state when the instance stops. Reported-by: Paul Guo Author: Paul Guo, Michael Paquier Discussion: https://postgr.es/m/CAEET0ZEv8VFqT3C-cQm6byOB4r4VYWcef1J21dOX-gcVhCSpmA@mail.gmail.com Backpatch-through: 9.4	2018-11-29 09:12:19 +09:00
Tomas Vondra	f69c959df0	Do not decode TOAST data for table rewrites During table rewrites (VACUUM FULL and CLUSTER), the main heap is logged using XLOG / FPI records, and thus (correctly) ignored in decoding. But the associated TOAST table is WAL-logged as plain INSERT records, and so was logically decoded and passed to reorder buffer. That has severe consequences with TOAST tables of non-trivial size. Firstly, reorder buffer has to keep all those changes, possibly spilling them to a file, incurring I/O costs and disk space. Secondly, ReorderBufferCommit() was stashing all those TOAST chunks into a hash table, which got discarded only after processing the row from the main heap. But as the main heap is not decoded for rewrites, this never happened, so all the TOAST data accumulated in memory, resulting either in excessive memory consumption or OOM. The fix is simple, as commit `e9edc1ba` already introduced infrastructure (namely HEAP_INSERT_NO_LOGICAL flag) to skip logical decoding of TOAST tables, but it only applied it to system tables. So simply use it for all TOAST data in raw_heap_insert(). That would however solve only the memory consumption issue - the TOAST changes would still be decoded and added to the reorder buffer, and spilled to disk (although without TOAST tuple data, so much smaller). But we can solve that by tweaking DecodeInsert() to just ignore such INSERT records altogether, using XLH_INSERT_CONTAINS_NEW_TUPLE flag, instead of skipping them later in ReorderBufferCommit(). Review: Masahiko Sawada Discussion: https://www.postgresql.org/message-id/flat/1a17c643-e9af-3dba-486b-fbe31bc1823a%402ndquadrant.com Backpatch: 9.4-, where logical decoding was introduced	2018-11-28 01:43:08 +01:00
Thomas Munro	cfdf4dc4fc	Add WL_EXIT_ON_PM_DEATH pseudo-event. Users of the WaitEventSet and WaitLatch() APIs can now choose between asking for WL_POSTMASTER_DEATH and then handling it explicitly, or asking for WL_EXIT_ON_PM_DEATH to trigger immediate exit on postmaster death. This reduces code duplication, since almost all callers want the latter. Repair all code that was previously ignoring postmaster death completely, or requesting the event but ignoring it, or requesting the event but then doing an unconditional PostmasterIsAlive() call every time through its event loop (which is an expensive syscall on platforms for which we don't have USE_POSTMASTER_DEATH_SIGNAL support). Assert that callers of WaitLatchXXX() under the postmaster remember to ask for either WL_POSTMASTER_DEATH or WL_EXIT_ON_PM_DEATH, to prevent future bugs. The only process that doesn't handle postmaster death is syslogger. It waits until all backends holding the write end of the syslog pipe (including the postmaster) have closed it by exiting, to be sure to capture any parting messages. By using the WaitEventSet API directly it avoids the new assertion, and as a by-product it may be slightly more efficient on platforms that have epoll(). Author: Thomas Munro Reviewed-by: Kyotaro Horiguchi, Heikki Linnakangas, Tom Lane Discussion: https://postgr.es/m/CAEepm%3D1TCviRykkUb69ppWLr_V697rzd1j3eZsRMmbXvETfqbQ%40mail.gmail.com, https://postgr.es/m/CAEepm=2LqHzizbe7muD7-2yHUbTOoF7Q+qkSD5Q41kuhttRTwA@mail.gmail.com	2018-11-23 20:46:34 +13:00
Andres Freund	578b229718	Remove WITH OIDS support, change oid catalog column visibility. Previously tables declared WITH OIDS, including a significant fraction of the catalog tables, stored the oid column not as a normal column, but as part of the tuple header. This special column was not shown by default, which was somewhat odd, as it's often (consider e.g. pg_class.oid) one of the more important parts of a row. Neither pg_dump nor COPY included the contents of the oid column by default. The fact that the oid column was not an ordinary column necessitated a significant amount of special case code to support oid columns. That already was painful for the existing, but upcoming work aiming to make table storage pluggable, would have required expanding and duplicating that "specialness" significantly. WITH OIDS has been deprecated since 2005 (commit ff02d0a05280e0). Remove it. Removing includes: - CREATE TABLE and ALTER TABLE syntax for declaring the table to be WITH OIDS has been removed (WITH (oids[ = true]) will error out) - pg_dump does not support dumping tables declared WITH OIDS and will issue a warning when dumping one (and ignore the oid column). - restoring a pg_dump archive with pg_restore will warn when restoring a table with oid contents (and ignore the oid column) - COPY will refuse to load binary dump that includes oids. - pg_upgrade will error out when encountering tables declared WITH OIDS, they have to be altered to remove the oid column first. - Functionality to access the oid of the last inserted row (like plpgsql's RESULT_OID, spi's SPI_lastoid, ...) has been removed. The syntax for declaring a table WITHOUT OIDS (or WITH (oids = false) for CREATE TABLE) is still supported. While that requires a bit of support code, it seems unnecessary to break applications / dumps that do not use oids, and are explicit about not using them. The biggest user of WITH OID columns was postgres' catalog. 
This commit changes all 'magic' oid columns to be columns that are normally declared and stored. To reduce unnecessary query breakage all the newly added columns are still named 'oid', even if a table's column naming scheme would indicate 'reloid' or such. This obviously requires adapting a lot of code, mostly replacing oid access via HeapTupleGetOid() with access to the underlying Form_pg_*->oid column. The bootstrap process now assigns oids for all oid columns in genbki.pl that do not have an explicit value (starting at the largest oid previously used), only oids assigned later by oids will be above FirstBootstrapObjectId. As the oid column now is a normal column the special bootstrap syntax for oids has been removed. Oids are not automatically assigned during insertion anymore, all backend code explicitly assigns oids with GetNewOidWithIndex(). For the rare case that insertions into the catalog via SQL are called for the new pg_nextoid() function can be used (which only works on catalog tables). The fact that oid columns on system tables are now normal columns means that they will be included in the set of columns expanded by * (i.e. SELECT * FROM pg_class will now include the table's oid, previously it did not). It'd not technically be hard to hide oid column by default, but that'd mean confusing behavior would either have to be carried forward forever, or it'd cause breakage down the line. While it's not unlikely that further adjustments are needed, the scope/invasiveness of the patch makes it worthwhile to merge this now. It's painful to maintain externally, too complicated to commit after the code freeze, and a dependency of a number of other patches. Catversion bump, for obvious reasons. Author: Andres Freund, with contributions by John Naylor Discussion: https://postgr.es/m/20180930034810.ywp2c7awz7opzcfr@alap3.anarazel.de	2018-11-20 16:00:17 -08:00

1390 Commits