postgres

mirror of https://github.com/postgres/postgres.git synced 2025-08-25 20:23:07 +03:00

Author	SHA1	Message	Date
Tom Lane	d0a02bdb8c	Update configure's probe for libldap to work with OpenLDAP 2.5. The separate libldap_r is gone and libldap itself is now always thread-safe. Unfortunately there seems no easy way to tell by inspection whether libldap is thread-safe, so we have to take it on faith that libldap is thread-safe if there's no libldap_r. That should be okay, as it appears that libldap_r was a standard part of the installation going back at least 20 years. Report and patch by Adrian Ho. Back-patch to all supported branches, since people might try to build any of them with a newer OpenLDAP. Discussion: https://postgr.es/m/17083-a19190d9591946a7@postgresql.org	2021-07-09 12:38:55 -04:00
Michael Paquier	9fd85570d1	Refactor SASL code with a generic interface for its mechanisms The code of SCRAM and SASL have been tightly linked together since SCRAM exists in the core code, making hard to apprehend the addition of new SASL mechanisms, but these are by design different facilities, with SCRAM being an option for SASL. This refactors the code related to both so as the backend and the frontend use a set of callbacks for SASL mechanisms, documenting while on it what is expected by anybody adding a new SASL mechanism. The separation between both layers is neat, using two sets of callbacks for the frontend and the backend to mark the frontier between both facilities. The shape of the callbacks is now directly inspired from the routines used by SCRAM, so the code change is straight-forward, and the SASL code is moved into its own set of files. These will likely change depending on how and if new SASL mechanisms get added in the future. Author: Jacob Champion Reviewed-by: Michael Paquier Discussion: https://postgr.es/m/3d2a6f5d50e741117d6baf83eb67ebf1a8a35a11.camel@vmware.com	2021-07-07 10:55:15 +09:00
Amit Kapila	8aafb02616	Refactor function parse_subscription_options. Instead of using multiple parameters in parse_subscription_options function signature, use the struct SubOpts that encapsulate all the subscription options and their values. It will be useful for future work where we need to add other options in the subscription. Also, use bitmaps to pass the supported and retrieve the specified options much like the way it is done in the commit `a3dc926009`. Author: Bharath Rupireddy Reviewed-By: Peter Smith, Amit Kapila, Alvaro Herrera Discussion: https://postgr.es/m/CALj2ACXtoQczfNsDQWobypVvHbX2DtgEHn8DawS0eGFwuo72kw@mail.gmail.com	2021-07-06 07:46:50 +05:30
Michael Paquier	4035cd5d4e	Add support for LZ4 with compression of full-page writes in WAL The logic is implemented so as there can be a choice in the compression used when building a WAL record, and an extra per-record bit is used to track down if a block is compressed with PGLZ, LZ4 or nothing. wal_compression, the existing parameter, is changed to an enum with support for the following backward-compatible values: - "off", the default, to not use compression. - "pglz" or "on", to compress FPWs with PGLZ. - "lz4", the new mode, to compress FPWs with LZ4. Benchmarking has showed that LZ4 outclasses easily PGLZ. ZSTD would be also an interesting choice, but going just with LZ4 for now makes the patch minimalistic as toast compression is already able to use LZ4, so there is no need to worry about any build-related needs for this implementation. Author: Andrey Borodin, Justin Pryzby Reviewed-by: Dilip Kumar, Michael Paquier Discussion: https://postgr.es/m/3037310D-ECB7-4BF1-AF20-01C10BB33A33@yandex-team.ru	2021-06-29 11:17:55 +09:00
Tom Lane	14b2ffaf7f	Doc: further updates for RELEASE_CHANGES process notes. Mention expectations for email notifications of appropriate lists when a branch is made or retired. (I've been doing that informally for years, but it's better to have it written down.)	2021-06-28 18:05:46 -04:00
Peter Geoghegan	bc49ab3c27	Improve pgindent release workflow. Update RELEASE_CHANGES to direct the reader towards completing the steps outlined in the pgindent README, both as a pre-beta task and as a task to be performed when starting a new development cycle. This makes it less likely that somebody will miss updating the .git-blame-ignore-revs file when running pgindent against the tree as a routine release change task. Author: Peter Geoghegan <pg@bowt.ie> Reviewed-By: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CAH2-Wz=2PjF4As8dWECArsXxLKganYmQ-s0UeGqHHbHhqDKqeA@mail.gmail.com	2021-06-28 13:08:46 -07:00
Andrew Dunstan	596b5af1d3	Stamp HEAD as 15devel. Let the hacking begin ...	2021-06-28 11:31:16 -04:00
Andrew Dunstan	e1c1c30f63	Pre branch pgindent / pgperltidy run Along the way make a slight adjustment to src/include/utils/queryjumble.h to avoid an unused typedef.	2021-06-28 11:05:54 -04:00
Michael Paquier	d5a2c413fc	Remove non-existing variable reference in MSVC's Solution.pm The version string is grabbed from PACKAGE_VERSION in pg_config.h in the MSVC build since `8f4fb4c6`, but an error message referenced a variable that existed before that. This had no consequences except if one messes up enough with the version number of the build. Author: Anton Voloshin Discussion: https://postgr.es/m/af79ee1b-9962-b299-98e1-f90a289e19e6@postgrespro.ru Backpatch-through: 13	2021-06-26 13:52:48 +09:00
Peter Geoghegan	8e638845ff	Add list of ignorable pgindent commits for git-blame. Add a .git-blame-ignore-revs file with a list of pgindent, pgperlyidy, and reformat-dat-files commit hashes. Postgres hackers that configure git to use the ignore file will get git-blame output that avoids attributing line changes to the ignored indent commits. This makes git-blame output much easier to work with in practice. Author: Peter Geoghegan <pg@bowt.ie> Reviewed-By: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CAH2-Wz=cVh3GHTP6SdLU-Gnmt2zRdF8vZkcrFdSzXQ=WhbWm9Q@mail.gmail.com	2021-06-22 09:06:32 -07:00
Andrew Dunstan	d69fcb9cae	fix syntax error	2021-05-28 09:35:11 -04:00
Andrew Dunstan	fb424ae85f	Report configured port in MSVC built pg_config This is a long standing omission, discovered when trying to write code that relied on it. Backpatch to all live branches.	2021-05-28 09:30:16 -04:00
Michael Paquier	0251106634	Fix MSVC scripts when building with GSSAPI/Kerberos The deliverables of upstream Kerberos on Windows are installed with paths that do not match our MSVC scripts. First, the include folder was named "inc/" in our scripts, but the upstream MSIs use "include/". Second, the build would fail with 64-bit environments as the libraries are named differently. This commit adjusts the MSVC scripts to be compatible with the latest installations of upstream, and I have checked that the compilation was able to work with the 32-bit and 64-bit installations. Special thanks to Kondo Yuta for the help in investigating the situation in hamerkop, which had an incorrect configuration for the GSS compilation. Reported-by: Brian Ye Discussion: https://postgr.es/m/162128202219.27274.12616756784952017465@wrigleys.postgresql.org Backpatch-through: 9.6	2021-05-27 20:11:00 +09:00
Tom Lane	413c1ef98e	Avoid creating testtablespace directories where not wanted. Recently we refactored things so that pg_regress makes the "testtablespace" subdirectory used by the core regression tests, instead of doing that in the makefiles. That had the undesirable side effect of making such a subdirectory in every directory that has "input" or "output" test files. Since these subdirectories remain empty, git doesn't complain about them, but nonetheless they're clutter. To fix, invent an explicit --make-testtablespace-dir switch, so that pg_regress only makes the subdirectory when explicitly told to. Discussion: https://postgr.es/m/2854388.1621284789@sss.pgh.pa.us	2021-05-19 14:04:01 -04:00
Tom Lane	def5b065ff	Initial pgindent and pgperltidy run for v14. Also "make reformat-dat-files". The only change worthy of note is that pgindent messed up the formatting of launcher.c's struct LogicalRepWorkerId, which led me to notice that that struct wasn't used at all anymore, so I just took it out.	2021-05-12 13:14:10 -04:00
Tom Lane	0b85fa93e4	Fix vcregress.pl's ancient misspelling of --max-connections. I copied the existing spelling of "--max_connections", but that's just wrong :-(. Evidently setting $ENV{MAX_CONNECTIONS} has never actually worked in this script. Given the lack of complaints, it's probably not worth back-patching a fix. Per buildfarm. Discussion: https://postgr.es/m/899209.1620759506@sss.pgh.pa.us	2021-05-11 19:17:07 -04:00
Tom Lane	1df3555acc	Get rid of the separate serial_schedule list of tests. Having to maintain two lists of regression test scripts has proven annoyingly error-prone. We can achieve the effect of the serial_schedule by running the parallel_schedule with "--max_connections=1"; so do that and remove serial_schedule. This causes cosmetic differences in the progress output, but it doesn't seem worth restructuring pg_regress to avoid that. Discussion: https://postgr.es/m/899209.1620759506@sss.pgh.pa.us	2021-05-11 17:52:13 -04:00
Michael Paquier	9ca40dcd4d	Add support for LZ4 build in MSVC scripts Since its introduction in `bbe0a81`, compression of table data supports LZ4, but nothing had been done within the MSVC scripts to allow users to build the code with this library. This commit closes the gap by extending the MSVC scripts to be able to build optionally with LZ4. Getting libraries that can be used for compilation and execution is possible as LZ4 can be compiled down to MSVC 2010 using its source tarball. MinGW may require extra efforts to be able to work, and I have been able to test this only with MSVC, still this is better than nothing to give users a way to test the feature on Windows. Author: Dilip Kumar Reviewed-by: Michael Paquier Discussion: https://postgr.es/m/YJPdNeF68XpwDDki@paquier.xyz	2021-05-11 10:43:05 +09:00
Thomas Munro	c2dc19342e	Revert recovery prefetching feature. This set of commits has some bugs with known fixes, but at this late stage in the release cycle it seems best to revert and resubmit next time, along with some new automated test coverage for this whole area. Commits reverted: `dc88460c`: Doc: Review for "Optionally prefetch referenced data in recovery." `1d257577`: Optionally prefetch referenced data in recovery. `f003d9f8`: Add circular WAL decoding buffer. `323cbe7c`: Remove read_page callback from XLogReader. Remove the new GUC group WAL_RECOVERY recently added by `a55a9847`, as the corresponding section of config.sgml is now reverted. Discussion: https://postgr.es/m/CAOuzzgrn7iKnFRsB4MHp3UisEQAGgZMbk_ViTN4HV4-Ksq8zCg%40mail.gmail.com	2021-05-10 16:06:09 +12:00
Andrew Dunstan	8b82de0164	Remove extraneous newlines added by perl copyright patch	2021-05-07 11:37:37 -04:00
Andrew Dunstan	8fa6e6919c	Add a copyright notice to perl files lacking one.	2021-05-07 10:56:14 -04:00
Thomas Munro	ec48314708	Revert per-index collation version tracking feature. Design problems were discovered in the handling of composite types and record types that would cause some relevant versions not to be recorded. Misgivings were also expressed about the use of the pg_depend catalog for this purpose. We're out of time for this release so we'll revert and try again. Commits reverted: `1bf946bd`: Doc: Document known problem with Windows collation versions. `cf002008`: Remove no-longer-relevant test case. `ef387bed`: Fix bogus collation-version-recording logic. `0fb0a050`: Hide internal error for pg_collation_actual_version(<bad OID>). `ff942057`: Suppress "warning: variable 'collcollate' set but not used". `d50e3b1f`: Fix assertion in collation version lookup. `f24b1569`: Rethink extraction of collation dependencies. `257836a7`: Track collation versions for indexes. `cd6f479e`: Add pg_depend.refobjversion. `7d1297df`: Remove pg_collation.collversion. Discussion: https://postgr.es/m/CA%2BhUKGLhj5t1fcjqAu8iD9B3ixJtsTNqyCCD4V0aTO9kAKAjjA%40mail.gmail.com	2021-05-07 21:10:11 +12:00
Tom Lane	e8ce68b0b9	Doc: update RELEASE_CHANGES checklist. Update checklist to reflect current practice: * The platform-specific FAQ files are long gone. * We've never routinely updated the libbind code we borrowed, either, and there seems no reason to start now. * Explain current practice of running pgindent twice per cycle. Discussion: https://postgr.es/m/4038398.1620238684@sss.pgh.pa.us	2021-05-05 23:11:28 -04:00
Amit Kapila	3fa17d3771	Use HTAB for replication slot statistics. Previously, we used to use the array of size max_replication_slots to store stats for replication slots. But that had two problems in the cases where a message for dropping a slot gets lost: 1) the stats for the new slot are not recorded if the array is full and 2) writing beyond the end of the array if the user reduces the max_replication_slots. This commit uses HTAB for replication slot statistics, resolving both problems. Now, pgstat_vacuum_stat() search for all the dead replication slots in stats hashtable and tell the collector to remove them. To avoid showing the stats for the already-dropped slots, pg_stat_replication_slots view searches slot stats by the slot name taken from pg_replication_slots. Also, we send a message for creating a slot at slot creation, initializing the stats. This reduces the possibility that the stats are accumulated into the old slot stats when a message for dropping a slot gets lost. Reported-by: Andres Freund Author: Sawada Masahiko, test case by Vignesh C Reviewed-by: Amit Kapila, Vignesh C, Dilip Kumar Discussion: https://postgr.es/m/20210319185247.ldebgpdaxsowiflw@alap3.anarazel.de	2021-04-27 09:09:11 +05:30
Fujii Masao	8ff1c94649	Allow TRUNCATE command to truncate foreign tables. This commit introduces new foreign data wrapper API for TRUNCATE. It extends TRUNCATE command so that it accepts foreign tables as the targets to truncate and invokes that API. Also it extends postgres_fdw so that it can issue TRUNCATE command to foreign servers, by adding new routine for that TRUNCATE API. The information about options specified in TRUNCATE command, e.g., ONLY, CACADE, etc is passed to FDW via API. The list of foreign tables to truncate is also passed to FDW. FDW truncates the foreign data sources that the passed foreign tables specify, based on those information. For example, postgres_fdw constructs TRUNCATE command using them and issues it to the foreign server. For performance, TRUNCATE command invokes the FDW routine for TRUNCATE once per foreign server that foreign tables to truncate belong to. Author: Kazutaka Onishi, Kohei KaiGai, slightly modified by Fujii Masao Reviewed-by: Bharath Rupireddy, Michael Paquier, Zhihong Yu, Alvaro Herrera, Stephen Frost, Ashutosh Bapat, Amit Langote, Daniel Gustafsson, Ibrar Ahmed, Fujii Masao Discussion: https://postgr.es/m/CAOP8fzb_gkReLput7OvOK+8NHgw-RKqNv59vem7=524krQTcWA@mail.gmail.com Discussion: https://postgr.es/m/CAJuF6cMWDDqU-vn_knZgma+2GMaout68YUgn1uyDnexRhqqM5Q@mail.gmail.com	2021-04-08 20:56:08 +09:00
Thomas Munro	1d257577e0	Optionally prefetch referenced data in recovery. Introduce a new GUC recovery_prefetch, disabled by default. When enabled, look ahead in the WAL and try to initiate asynchronous reading of referenced data blocks that are not yet cached in our buffer pool. For now, this is done with posix_fadvise(), which has several caveats. Better mechanisms will follow in later work on the I/O subsystem. The GUC maintenance_io_concurrency is used to limit the number of concurrent I/Os we allow ourselves to initiate, based on pessimistic heuristics used to infer that I/Os have begun and completed. The GUC wal_decode_buffer_size is used to limit the maximum distance we are prepared to read ahead in the WAL to find uncached blocks. Reviewed-by: Alvaro Herrera <alvherre@2ndquadrant.com> (parts) Reviewed-by: Andres Freund <andres@anarazel.de> (parts) Reviewed-by: Tomas Vondra <tomas.vondra@2ndquadrant.com> (parts) Tested-by: Tomas Vondra <tomas.vondra@2ndquadrant.com> Tested-by: Jakub Wartak <Jakub.Wartak@tomtom.com> Tested-by: Dmitry Dolgov <9erthalion6@gmail.com> Tested-by: Sait Talha Nisanci <Sait.Nisanci@microsoft.com> Discussion: https://postgr.es/m/CA%2BhUKGJ4VJN8ttxScUFM8dOKX0BrBiboo5uz1cq%3DAovOddfHpA%40mail.gmail.com	2021-04-08 23:20:42 +12:00
Robert Haas	ec7ffb8096	amcheck: fix multiple problems with TOAST pointer validation First, don't perform database access while holding a buffer lock. When checking a heap, we can validate that TOAST pointers are sane by performing a scan on the TOAST index and looking up the chunks that correspond to each value ID that appears in a TOAST poiner in the main table. But, to do that while holding a buffer lock at least risks causing other backends to wait uninterruptibly, and probably can cause undetected and uninterruptible deadlocks. So, instead, make a list of checks to perform while holding the lock, and then perform the checks after releasing it. Second, adjust things so that we don't try to follow TOAST pointers for tuples that are already eligible to be pruned. The TOAST tuples become eligible for pruning at the same time that the main tuple does, so trying to check them may lead to spurious reports of corruption, as observed in the buildfarm. The necessary infrastructure to decide whether or not the tuple being checked is prunable was added by commit `3b6c1259f9`, but it wasn't actually used for its intended purpose prior to this patch. Mark Dilger, adjusted by me to avoid a memory leak. Discussion: http://postgr.es/m/AC5479E4-6321-473D-AC92-5EC36299FBC2@enterprisedb.com	2021-04-07 13:39:12 -04:00
Peter Geoghegan	8523492d4e	Remove tupgone special case from vacuumlazy.c. Retry the call to heap_prune_page() in rare cases where there is disagreement between the heap_prune_page() call and the call to HeapTupleSatisfiesVacuum() that immediately follows. Disagreement is possible when a concurrently-aborted transaction makes a tuple DEAD during the tiny window between each step. This was the only case where a tuple considered DEAD by VACUUM still had storage following pruning. VACUUM's definition of dead tuples is now uniformly simple and unambiguous: dead tuples from each page are always LP_DEAD line pointers that were encountered just after we performed pruning (and just before we considered freezing remaining items with tuple storage). Eliminating the tupgone=true special case enables INDEX_CLEANUP=off style skipping of index vacuuming that takes place based on flexible, dynamic criteria. The INDEX_CLEANUP=off case had to know about skipping indexes up-front before now, due to a subtle interaction with the special case (see commit `dd695979`) -- this was a special case unto itself. Now there are no special cases. And so now it won't matter when or how we decide to skip index vacuuming: it won't affect how pruning behaves, and it won't be affected by any of the implementation details of pruning or freezing. Also remove XLOG_HEAP2_CLEANUP_INFO records. These are no longer necessary because we now rely entirely on heap pruning taking care of recovery conflicts. There is no longer any need to generate recovery conflicts for DEAD tuples that pruning just missed. This also means that heap vacuuming now uses exactly the same strategy for recovery conflicts as index vacuuming always has: REDO routines never need to process a latestRemovedXid from the WAL record, since earlier REDO of the WAL record from pruning is sufficient in all cases. The generic XLOG_HEAP2_CLEAN record type is now split into two new record types to reflect this new division (these are called XLOG_HEAP2_PRUNE and XLOG_HEAP2_VACUUM). Also stop acquiring a super-exclusive lock for heap pages when they're vacuumed during VACUUM's second heap pass. A regular exclusive lock is enough. This is correct because heap page vacuuming is now strictly a matter of setting the LP_DEAD line pointers to LP_UNUSED. No other backend can have a pointer to a tuple located in a pinned buffer that can be invalidated by a concurrent heap page vacuum operation. Heap vacuuming can now be thought of as conceptually similar to index vacuuming and conceptually dissimilar to heap pruning. Heap pruning now has sole responsibility for anything involving the logical contents of the database (e.g., managing transaction status information, recovery conflicts, considering what to do with HOT chains). Index vacuuming and heap vacuuming are now only concerned with recycling garbage items from physical data structures that back the logical database. Bump XLOG_PAGE_MAGIC due to pruning and heap page vacuum WAL record changes. Credit for the idea of retrying pruning a page to avoid the tupgone case goes to Andres Freund. Author: Peter Geoghegan <pg@bowt.ie> Reviewed-By: Andres Freund <andres@anarazel.de> Reviewed-By: Masahiko Sawada <sawada.mshk@gmail.com> Discussion: https://postgr.es/m/CAH2-WznneCXTzuFmcwx_EyRQgfsfJAAsu+CsqRFmFXCAar=nJw@mail.gmail.com	2021-04-06 08:49:22 -07:00
Michael Paquier	e6bdfd9700	Refactor HMAC implementations Similarly to the cryptohash implementations, this refactors the existing HMAC code into a single set of APIs that can be plugged with any crypto libraries PostgreSQL is built with (only OpenSSL currently). If there is no such libraries, a fallback implementation is available. Those new APIs are designed similarly to the existing cryptohash layer, so there is no real new design here, with the same logic around buffer bound checks and memory handling. HMAC has a dependency on cryptohashes, so all the cryptohash types supported by cryptohash{_openssl}.c can be used with HMAC. This refactoring is an advantage mainly for SCRAM, that included its own implementation of HMAC with SHA256 without relying on the existing crypto libraries even if PostgreSQL was built with their support. This code has been tested on Windows and Linux, with and without OpenSSL, across all the versions supported on HEAD from 1.1.1 down to 1.0.1. I have also checked that the implementations are working fine using some sample results, a custom extension of my own, and doing cross-checks across different major versions with SCRAM with the client and the backend. Author: Michael Paquier Reviewed-by: Bruce Momjian Discussion: https://postgr.es/m/X9m0nkEJEzIPXjeZ@paquier.xyz	2021-04-03 17:30:49 +09:00
Amit Kapila	26acb54a13	Revert "Enable parallel SELECT for "INSERT INTO ... SELECT ..."." To allow inserts in parallel-mode this feature has to ensure that all the constraints, triggers, etc. are parallel-safe for the partition hierarchy which is costly and we need to find a better way to do that. Additionally, we could have used existing cached information in some cases like indexes, domains, etc. to determine the parallel-safety. List of commits reverted, in reverse chronological order: `ed62d3737c` Doc: Update description for parallel insert reloption. `c8f78b6161` Add a new GUC and a reloption to enable inserts in parallel-mode. `c5be48f092` Improve FK trigger parallel-safety check added by `05c8482f7f`. `e2cda3c20a` Fix use of relcache TriggerDesc field introduced by commit `05c8482f7f`. `e4e87a32cc` Fix valgrind issue in commit `05c8482f7f`. `05c8482f7f` Enable parallel SELECT for "INSERT INTO ... SELECT ...". Discussion: https://postgr.es/m/E1lMiB9-0001c3-SY@gemulon.postgresql.org	2021-03-24 11:29:15 +05:30
Tomas Vondra	bfa2cee784	Move bsearch_arg to src/port Until now the bsearch_arg function was used only in extended statistics code, so it was defined in that code. But we already have qsort_arg in src/port, so let's move it next to it.	2021-03-23 00:11:22 +01:00
Tom Lane	2c75f8a612	Remove useless configure probe for <lz4/lz4.h>. This seems to have been just copied-and-pasted from some other header checks. But our C code is entirely unprepared to support such a header name, so it's only wasting cycles to look for it. If we did need to support it, some #ifdefs would be required. (A quick trawl at codesearch.debian.net finds some packages that reference lz4/lz4.h; but they use only that spelling, and appear to be intending to reference their own copy rather than a system-level installation of liblz4. There's no evidence of freestanding installations that require this spelling.) Discussion: https://postgr.es/m/457962.1616362509@sss.pgh.pa.us	2021-03-22 11:20:44 -04:00
Tom Lane	4d399a6fbe	Bring configure support for LZ4 up to snuff. It's not okay to just shove the pkg_config results right into our build flags, for a couple different reasons: * This fails to maintain the separation between CPPFLAGS and CFLAGS, as well as that between LDFLAGS and LIBS. (The CPPFLAGS angle is, I believe, the reason for warning messages reported when building with MacPorts' liblz4.) * If pkg_config emits anything other than -I/-D/-L/-l switches, it's highly unlikely that we want to absorb those. That'd be more likely to break the build than do anything helpful. (Even the -D case is questionable; but we're doing that for libxml2, so I kept it.) Also, it's not okay to skip doing an AC_CHECK_LIB probe, as evidenced by recent build failure on topminnow; that should have been caught at configure time. Model fixes for this on configure's libxml2 support. It appears that somebody overlooked an autoheader run, too. Discussion: https://postgr.es/m/20210119190720.GL8560@telsasoft.com	2021-03-21 17:20:17 -04:00
Thomas Munro	61752afb26	Provide recovery_init_sync_method=syncfs. Since commit `2ce439f3` we have opened every file in the data directory and called fsync() at the start of crash recovery. This can be very slow if there are many files, leading to field complaints of systems taking minutes or even hours to begin crash recovery. Provide an alternative method, for Linux only, where we call syncfs() on every possibly different filesystem under the data directory. This is equivalent, but avoids faulting in potentially many inodes from potentially slow storage. The new mode comes with some caveats, described in the documentation, so the default value for the new setting is "fsync", preserving the older behavior. Reported-by: Michael Brown <michael.brown@discourse.org> Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com> Reviewed-by: Paul Guo <guopa@vmware.com> Reviewed-by: Bruce Momjian <bruce@momjian.us> Reviewed-by: Justin Pryzby <pryzby@telsasoft.com> Reviewed-by: David Steele <david@pgmasters.net> Discussion: https://postgr.es/m/11bc2bb7-ecb5-3ad0-b39f-df632734cd81%40discourse.org Discussion: https://postgr.es/m/CAEET0ZHGnbXmi8yF3ywsDZvb3m9CbdsGZgfTXscQ6agcbzcZAw%40mail.gmail.com	2021-03-20 12:07:28 +13:00
Robert Haas	bbe0a81db6	Allow configurable LZ4 TOAST compression. There is now a per-column COMPRESSION option which can be set to pglz (the default, and the only option in up until now) or lz4. Or, if you like, you can set the new default_toast_compression GUC to lz4, and then that will be the default for new table columns for which no value is specified. We don't have lz4 support in the PostgreSQL code, so to use lz4 compression, PostgreSQL must be built --with-lz4. In general, TOAST compression means compression of individual column values, not the whole tuple, and those values can either be compressed inline within the tuple or compressed and then stored externally in the TOAST table, so those properties also apply to this feature. Prior to this commit, a TOAST pointer has two unused bits as part of the va_extsize field, and a compessed datum has two unused bits as part of the va_rawsize field. These bits are unused because the length of a varlena is limited to 1GB; we now use them to indicate the compression type that was used. This means we only have bit space for 2 more built-in compresison types, but we could work around that problem, if necessary, by introducing a new vartag_external value for any further types we end up wanting to add. Hopefully, it won't be too important to offer a wide selection of algorithms here, since each one we add not only takes more coding but also adds a build dependency for every packager. Nevertheless, it seems worth doing at least this much, because LZ4 gets better compression than PGLZ with less CPU usage. It's possible for LZ4-compressed datums to leak into composite type values stored on disk, just as it is for PGLZ. It's also possible for LZ4-compressed attributes to be copied into a different table via SQL commands such as CREATE TABLE AS or INSERT .. SELECT. It would be expensive to force such values to be decompressed, so PostgreSQL has never done so. For the same reasons, we also don't force recompression of already-compressed values even if the target table prefers a different compression method than was used for the source data. These architectural decisions are perhaps arguable but revisiting them is well beyond the scope of what seemed possible to do as part of this project. However, it's relatively cheap to recompress as part of VACUUM FULL or CLUSTER, so this commit adjusts those commands to do so, if the configured compression method of the table happens not to match what was used for some column value stored therein. Dilip Kumar. The original patches on which this work was based were written by Ildus Kurbangaliev, and those were patches were based on even earlier work by Nikita Glukhov, but the design has since changed very substantially, since allow a potentially large number of compression methods that could be added and dropped on a running system proved too problematic given some of the architectural issues mentioned above; the choice of which specific compression method to add first is now different; and a lot of the code has been heavily refactored. More recently, Justin Przyby helped quite a bit with testing and reviewing and this version also includes some code contributions from him. Other design input and review from Tomas Vondra, Álvaro Herrera, Andres Freund, Oleg Bartunov, Alexander Korotkov, and me. Discussion: http://postgr.es/m/20170907194236.4cefce96%40wp.localdomain Discussion: http://postgr.es/m/CAFiTN-uUpX3ck%3DK0mLEk-G_kUQY%3DSNOTeqdaNRR9FMdQrHKebw%40mail.gmail.com	2021-03-19 15:10:38 -04:00
Amit Kapila	c8f78b6161	Add a new GUC and a reloption to enable inserts in parallel-mode. Commit `05c8482f7f` added the implementation of parallel SELECT for "INSERT INTO ... SELECT ..." which may incur non-negligible overhead in the additional parallel-safety checks that it performs, even when, in the end, those checks determine that parallelism can't be used. This is normally only ever a problem in the case of when the target table has a large number of partitions. A new GUC option "enable_parallel_insert" is added, to allow insert in parallel-mode. The default is on. In addition to the GUC option, the user may want a mechanism to allow inserts in parallel-mode with finer granularity at table level. The new table option "parallel_insert_enabled" allows this. The default is true. Author: "Hou, Zhijie" Reviewed-by: Greg Nancarrow, Amit Langote, Takayuki Tsunakawa, Amit Kapila Discussion: https://postgr.es/m/CAA4eK1K-cW7svLC2D7DHoGHxdAdg3P37BLgebqBOC2ZLc9a6QQ%40mail.gmail.com Discussion: https://postgr.es/m/CAJcOf-cXnB5cnMKqWEp2E2z7Mvcd04iLVmV=qpFJrR3AcrTS3g@mail.gmail.com	2021-03-18 07:25:27 +05:30
Alvaro Herrera	acb7e4eb6b	Implement pipeline mode in libpq Pipeline mode in libpq lets an application avoid the Sync messages in the FE/BE protocol that are implicit in the old libpq API after each query. The application can then insert Sync at its leisure with a new libpq function PQpipelineSync. This can lead to substantial reductions in query latency. Co-authored-by: Craig Ringer <craig.ringer@enterprisedb.com> Co-authored-by: Matthieu Garrigues <matthieu.garrigues@gmail.com> Co-authored-by: Álvaro Herrera <alvherre@alvh.no-ip.org> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Aya Iwata <iwata.aya@jp.fujitsu.com> Reviewed-by: Daniel Vérité <daniel@manitou-mail.org> Reviewed-by: David G. Johnston <david.g.johnston@gmail.com> Reviewed-by: Justin Pryzby <pryzby@telsasoft.com> Reviewed-by: Kirk Jamison <k.jamison@fujitsu.com> Reviewed-by: Michael Paquier <michael.paquier@gmail.com> Reviewed-by: Nikhil Sontakke <nikhils@2ndquadrant.com> Reviewed-by: Vaishnavi Prabakaran <VaishnaviP@fast.au.fujitsu.com> Reviewed-by: Zhihong Yu <zyu@yugabyte.com> Discussion: https://postgr.es/m/CAMsr+YFUjJytRyV4J-16bEoiZyH=4nj+sQ7JP9ajwz=B4dMMZw@mail.gmail.com Discussion: https://postgr.es/m/CAJkzx4T5E-2cQe3dtv2R78dYFvz+in8PY7A8MArvLhs_pg75gg@mail.gmail.com	2021-03-15 18:13:42 -03:00
Fujii Masao	d75288fb27	Make archiver process an auxiliary process. This commit changes WAL archiver process so that it's treated as an auxiliary process and can use shared memory. This is an infrastructure patch required for upcoming shared-memory based stats collector patch series. These patch series basically need any processes including archiver that can report the statistics to access to shared memory. Since this patch itself is useful to simplify the code and when users monitor the status of archiver, it's committed separately in advance. This commit simplifies the code for WAL archiving. For example, previously backends need to signal to archiver via postmaster when they notify archiver that there are some WAL files to archive. On the other hand, this commit removes that signal to postmaster and enables backends to notify archier directly using shared latch. Also, as the side of this change, the information about archiver process becomes viewable at pg_stat_activity view. Author: Kyotaro Horiguchi Reviewed-by: Andres Freund, Álvaro Herrera, Julien Rouhaud, Tomas Vondra, Arthur Zakirov, Fujii Masao Discussion: https://postgr.es/m/20180629.173418.190173462.horiguchi.kyotaro@lab.ntt.co.jp	2021-03-15 13:13:14 +09:00
Robert Haas	9706092839	Add pg_amcheck, a CLI for contrib/amcheck. This makes it a lot easier to run the corruption checks that are implemented by contrib/amcheck against lots of relations and get the result in an easily understandable format. It has a wide variety of options for choosing which relations to check and which checks to perform, and it can run checks in parallel if you want. Mark Dilger, reviewed by Peter Geoghegan and by me. Discussion: http://postgr.es/m/12ED3DA8-25F0-4B68-937D-D907CFBF08E7@enterprisedb.com Discussion: http://postgr.es/m/BA592F2D-F928-46FF-9516-2B827F067F57@enterprisedb.com	2021-03-12 13:00:01 -05:00
Robert Haas	f71519e545	Refactor and generalize the ParallelSlot machinery. Create a wrapper object, ParallelSlotArray, to encapsulate the number of slots and the slot array itself, plus some other relevant bits of information. This reduces the number of parameters we have to pass around all over the place. Allow for a ParallelSlotArray to contain slots connected to different databases within a single cluster. The current clients of this mechanism don't need this, but it is expected to be used by future patches. Defer connecting to databases until we actually need the connection for something. This is a slight behavior change for vacuumdb and reindexdb. If you specify a number of jobs that is larger than the number of objects, the extra connections will now not be used. But, on the other hand, if you specify a number of jobs that is so large that it's going to fail, the failure would previously have happened before any operations were actually started, and now it won't. Mark Dilger, reviewed by me. Discussion: http://postgr.es/m/12ED3DA8-25F0-4B68-937D-D907CFBF08E7@enterprisedb.com Discussion: http://postgr.es/m/BA592F2D-F928-46FF-9516-2B827F067F57@enterprisedb.com	2021-03-11 13:17:46 -05:00
Thomas Munro	d87251048a	Replace buffer I/O locks with condition variables. 1. Backends waiting for buffer I/O are now interruptible. 2. If something goes wrong in a backend that is currently performing I/O, waiting backends no longer wake up until that backend reaches AbortBufferIO() and broadcasts on the CV. Previously, any waiters would wake up (because the I/O lock was automatically released) and then busy-loop until AbortBufferIO() cleared BM_IO_IN_PROGRESS. 3. LWLockMinimallyPadded is removed, as it would now be unused. Author: Robert Haas <robertmhaas@gmail.com> Reviewed-by: Thomas Munro <thomas.munro@gmail.com> Reviewed-by: Julien Rouhaud <rjuju123@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> (earlier version, 2016) Discussion: https://postgr.es/m/CA%2BhUKGJ8nBFrjLuCTuqKN0pd2PQOwj9b_jnsiGFFMDvUxahj_A%40mail.gmail.com Discussion: https://postgr.es/m/CA+Tgmoaj2aPti0yho7FeEf2qt-JgQPRWb0gci_o1Hfr=C56Xng@mail.gmail.com	2021-03-11 10:36:17 +13:00
Michael Paquier	6c788d9f6a	Move tablespace path re-creation from the makefiles to pg_regress Moving this logic into pg_regress fixes a potential failure with parallel tests when pg_upgrade and the main regression test suite both trigger the makefile rule that cleaned up testtablespace/ under src/test/regress. Even if pg_upgrade was triggering this rule, it has no need to do so as it uses a different tablespace path. So if pg_upgrade triggered the makefile rule for the tablespace setup while the main regression test suite ran the tablespace cases, it would fail. `61be85a` was a similar attempt at achieving that, but that broke cases where the regression tests require to run under an Administrator account, like with Appveyor. Reported-by: Andres Freund, Kyotaro Horiguchi Reviewed-by: Peter Eisentraut Discussion: https://postgr.es/m/20201209012911.uk4d6nxcnkp7ehrx@alap3.anarazel.de	2021-03-10 14:50:00 +09:00
Thomas Munro	44bf3d5083	Add missing pthread_barrier_t. Supply a simple implementation of the missing pthread_barrier_t type and functions, for macOS. Discussion: https://postgr.es/m/20200227180100.zyvjwzcpiokfsqm2%40alap3.anarazel.de	2021-03-10 17:44:04 +13:00
Thomas Munro	547f04e734	pgbench: Improve time logic. Instead of instr_time (struct timespec) and the INSTR_XXX macros, introduce pg_time_usec_t and use integer arithmetic. Don't include the connection time in TPS unless using -C mode, but report it separately. Author: Fabien COELHO <coelho@cri.ensmp.fr> Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com> Discussion: https://postgr.es/m/20200227180100.zyvjwzcpiokfsqm2%40alap3.anarazel.de	2021-03-10 17:44:04 +13:00
Michael Paquier	5bca69a76b	Add support for PROVE_TESTS and PROVE_FLAGS in MSVC scripts These can be set in buildenv.pl or a "set" command within a Windows terminal. The MSVC script vcregress.pl parses the values available in the environment to build the resulting prove commands, and the parsing of PROVE_TESTS is able to handle name patterns in the same way as other platforms. Not specifying those environment values makes vcregress.pl fall back to the previous default, with no extra flags for the prove command, and all the tests run within t/. Author: Michael Paquier Reviewed-by: Juan José Santamaría Flecha, Julien Rouhaud Discussion: https://postgr.es/m/YD9GigwHoL6lFY2y@paquier.xyz	2021-03-05 10:12:49 +09:00
Thomas Munro	8eda3eba30	Use sort_template.h for qsort_tuple() and qsort_ssup(). Replace the Perl code previously used to generate specialized sort functions with sort_template.h. Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Discussion: https://postgr.es/m/CA%2BhUKGJ2-eaDqAum5bxhpMNhvuJmRDZxB_Tow0n-gse%2BHG0Yig%40mail.gmail.com	2021-03-03 17:02:32 +13:00
Thomas Munro	0a1f1d3cac	Add sort_template.h for making sort functions. Move our qsort implementation into a header that can be used to define specialized functions for better performance and reduced duplication. Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Discussion: https://postgr.es/m/CA%2BhUKGJ2-eaDqAum5bxhpMNhvuJmRDZxB_Tow0n-gse%2BHG0Yig%40mail.gmail.com	2021-03-03 17:02:22 +13:00
Amit Kapila	ce0fdbfe97	Allow multiple xacts during table sync in logical replication. For the initial table data synchronization in logical replication, we use a single transaction to copy the entire table and then synchronize the position in the stream with the main apply worker. There are multiple downsides of this approach: (a) We have to perform the entire copy operation again if there is any error (network breakdown, error in the database operation, etc.) while we synchronize the WAL position between tablesync worker and apply worker; this will be onerous especially for large copies, (b) Using a single transaction in the synchronization-phase (where we can receive WAL from multiple transactions) will have the risk of exceeding the CID limit, (c) The slot will hold the WAL till the entire sync is complete because we never commit till the end. This patch solves all the above downsides by allowing multiple transactions during the tablesync phase. The initial copy is done in a single transaction and after that, we commit each transaction as we receive. To allow recovery after any error or crash, we use a permanent slot and origin to track the progress. The slot and origin will be removed once we finish the synchronization of the table. We also remove slot and origin of tablesync workers if the user performs DROP SUBSCRIPTION .. or ALTER SUBSCRIPTION .. REFERESH and some of the table syncs are still not finished. The commands ALTER SUBSCRIPTION ... REFRESH PUBLICATION and ALTER SUBSCRIPTION ... SET PUBLICATION ... with refresh option as true cannot be executed inside a transaction block because they can now drop the slots for which we have no provision to rollback. This will also open up the path for logical replication of 2PC transactions on the subscriber side. Previously, we can't do that because of the requirement of maintaining a single transaction in tablesync workers. Bump catalog version due to change of state in the catalog (pg_subscription_rel). Author: Peter Smith, Amit Kapila, and Takamichi Osumi Reviewed-by: Ajin Cherian, Petr Jelinek, Hou Zhijie and Amit Kapila Discussion: https://postgr.es/m/CAA4eK1KHJxaZS-fod-0fey=0tq3=Gkn4ho=8N4-5HWiCfu0H1A@mail.gmail.com	2021-02-12 07:41:51 +05:30
Robert Haas	e955bd4b6c	Move some code from src/bin/scripts to src/fe_utils to permit reuse. The parallel slots infrastructure (which implements client-side multiplexing of server connections doing similar things, not threading or multiple processes or anything like that) are moved from src/bin/scripts/scripts_parallel.c to src/fe_utils/parallel_slot.c. The functions consumeQueryResult() and processQueryResult() which were previously part of src/bin/scripts/common.c are now moved into that file as well, becoming static helper functions. This might need to be changed in the future, but currently they're not used for anything else. Some other functions from src/bin/scripts/common.c are moved to to src/fe_utils and are split up among several files. connectDatabase(), connectMaintenanceDatabase(), and disconnectDatabase() are moved to connect_utils.c. executeQuery(), executeCommand(), and executeMaintenanceCommand() are move to query_utils.c. handle_help_version_opts() is moved to option_utils.c. Mark Dilger, reviewed by me. The larger patch series of which this is a part has also had review from Peter Geoghegan, Andres Freund, Álvaro Herrera, Michael Paquier, and Amul Sul, but I don't know whether any of them have reviewed this bit specifically. Discussion: http://postgr.es/m/12ED3DA8-25F0-4B68-937D-D907CFBF08E7@enterprisedb.com Discussion: http://postgr.es/m/5F743835-3399-419C-8324-2D424237E999@enterprisedb.com Discussion: http://postgr.es/m/70655DF3-33CE-4527-9A4D-DDEB582B6BA0@enterprisedb.com	2021-02-05 13:33:38 -05:00
Tom Lane	ef3d4613c0	Retire findoidjoins. In the wake of commit `62f34097c`, we no longer need this tool. Discussion: https://postgr.es/m/3240355.1612129197@sss.pgh.pa.us	2021-02-02 17:21:37 -05:00

1 2 3 4 5 ...

1819 Commits