postgres

mirror of https://github.com/postgres/postgres.git synced 2025-11-07 19:06:32 +03:00

Author	SHA1	Message	Date
Stephen Frost	c37b3d08ca	Allow group access on PGDATA Allow the cluster to be optionally init'd with read access for the group. This means a relatively non-privileged user can perform a backup of the cluster without requiring write privileges, which enhances security. The mode of PGDATA is used to determine whether group permissions are enabled for directory and file creates. This method was chosen as it's simple and works well for the various utilities that write into PGDATA. Changing the mode of PGDATA manually will not automatically change the mode of all the files contained therein. If the user would like to enable group access on an existing cluster then changing the mode of all the existing files will be required. Note that pg_upgrade will automatically change the mode of all migrated files if the new cluster is init'd with the -g option. Tests are included for the backend and all the utilities which operate on the PG data directory to ensure that the correct mode is set based on the data directory permissions. Author: David Steele <david@pgmasters.net> Reviewed-By: Michael Paquier, with discussion amongst many others. Discussion: https://postgr.es/m/ad346fe6-b23e-59f1-ecb7-0e08390ad629%40pgmasters.net	2018-04-07 17:45:39 -04:00
Stephen Frost	da9b580d89	Refactor dir/file permissions Consolidate directory and file create permissions for tools which work with the PG data directory by adding a new module (common/file_perm.c) that contains variables (pg_file_create_mode, pg_dir_create_mode) and constants to initialize them (0600 for files and 0700 for directories). Convert mkdir() calls in the backend to MakePGDirectory() if the original call used default permissions (always the case for regular PG directories). Add tests to make sure permissions in PGDATA are set correctly by the tools which modify the PG data directory. Authors: David Steele <david@pgmasters.net>, Adam Brightwell <adam.brightwell@crunchydata.com> Reviewed-By: Michael Paquier, with discussion amongst many others. Discussion: https://postgr.es/m/ad346fe6-b23e-59f1-ecb7-0e08390ad629%40pgmasters.net	2018-04-07 17:45:39 -04:00
Alvaro Herrera	499be013de	Support partition pruning at execution time Existing partition pruning is only able to work at plan time, for query quals that appear in the parsed query. This is good but limiting, as there can be parameters that appear later that can be usefully used to further prune partitions. This commit adds support for pruning subnodes of Append which cannot possibly contain any matching tuples, during execution, by evaluating Params to determine the minimum set of subnodes that can possibly match. We support more than just simple Params in WHERE clauses. Support additionally includes: 1. Parameterized Nested Loop Joins: The parameter from the outer side of the join can be used to determine the minimum set of inner side partitions to scan. 2. Initplans: Once an initplan has been executed we can then determine which partitions match the value from the initplan. Partition pruning is performed in two ways. When Params external to the plan are found to match the partition key we attempt to prune away unneeded Append subplans during the initialization of the executor. This allows us to bypass the initialization of non-matching subplans meaning they won't appear in the EXPLAIN or EXPLAIN ANALYZE output. For parameters whose value is only known during the actual execution then the pruning of these subplans must wait. Subplans which are eliminated during this stage of pruning are still visible in the EXPLAIN output. In order to determine if pruning has actually taken place, the EXPLAIN ANALYZE must be viewed. If a certain Append subplan was never executed due to the elimination of the partition then the execution timing area will state "(never executed)". Whereas, if, for example in the case of parameterized nested loops, the number of loops stated in the EXPLAIN ANALYZE output for certain subplans may appear lower than others due to the subplan having been scanned fewer times. This is due to the list of matching subnodes having to be evaluated whenever a parameter which was found to match the partition key changes. This commit required some additional infrastructure that permits the building of a data structure which is able to perform the translation of the matching partition IDs, as returned by get_matching_partitions, into the list index of a subpaths list, as exist in node types such as Append, MergeAppend and ModifyTable. This allows us to translate a list of clauses into a Bitmapset of all the subpath indexes which must be included to satisfy the clause list. Author: David Rowley, based on an earlier effort by Beena Emerson Reviewers: Amit Langote, Robert Haas, Amul Sul, Rajkumar Raghuwanshi, Jesper Pedersen Discussion: https://postgr.es/m/CAOG9ApE16ac-_VVZVvv0gePSgkg_BwYEV1NBqZFqDR2bBE0X0A@mail.gmail.com	2018-04-07 17:54:39 -03:00
Alvaro Herrera	5c0675215e	Add bms_prev_member function This works very much like the existing bms_last_member function, only it traverses through the Bitmapset in the opposite direction from the most significant bit down to the least significant bit. A special prevbit value of -1 may be used to have the function determine the most significant bit. This is useful for starting a loop. When there are no members less than prevbit, the function returns -2 to indicate there are no more members. Author: David Rowley Discussion: https://postgr.es/m/CAKJS1f-K=3d5MDASNYFJpUpc20xcBnAwNC1-AOeunhn0OtkWbQ@mail.gmail.com	2018-04-07 17:54:39 -03:00
Andres Freund	f16241bef7	Raise error when affecting tuple moved into different partition. When an update moves a row between partitions (supported since `2f17844104`), our normal logic for following update chains in READ COMMITTED mode doesn't work anymore. Cross partition updates are modeled as an delete from the old and insert into the new partition. No ctid chain exists across partitions, and there's no convenient space to introduce that link. Not throwing an error in a partitioned context when one would have been thrown without partitioning is obviously problematic. This commit introduces infrastructure to detect when a tuple has been moved, not just plainly deleted. That allows to throw an error when encountering a deletion that's actually a move, while attempting to following a ctid chain. The row deleted as part of a cross partition update is marked by pointing it's t_ctid to an invalid block, instead of self as a normal update would. That was deemed to be the least invasive and most future proof way to represent the knowledge, given how few infomask bits are there to be recycled (there's also some locking issues with using infomask bits). External code following ctid chains should be updated to check for moved tuples. The most likely consequence of not doing so is a missed error. Author: Amul Sul, editorialized by me Reviewed-By: Amit Kapila, Pavan Deolasee, Andres Freund, Robert Haas Discussion: http://postgr.es/m/CAAJ_b95PkwojoYfz0bzXU8OokcTVGzN6vYGCNVUukeUDrnF3dw@mail.gmail.com	2018-04-07 13:24:27 -07:00
Teodor Sigaev	8224de4f42	Indexes with INCLUDE columns and their support in B-tree This patch introduces INCLUDE clause to index definition. This clause specifies a list of columns which will be included as a non-key part in the index. The INCLUDE columns exist solely to allow more queries to benefit from index-only scans. Also, such columns don't need to have appropriate operator classes. Expressions are not supported as INCLUDE columns since they cannot be used in index-only scans. Index access methods supporting INCLUDE are indicated by amcaninclude flag in IndexAmRoutine. For now, only B-tree indexes support INCLUDE clause. In B-tree indexes INCLUDE columns are truncated from pivot index tuples (tuples located in non-leaf pages and high keys). Therefore, B-tree indexes now might have variable number of attributes. This patch also provides generic facility to support that: pivot tuples contain number of their attributes in t_tid.ip_posid. Free 13th bit of t_info is used for indicating that. This facility will simplify further support of index suffix truncation. The changes of above are backward-compatible, pg_upgrade doesn't need special handling of B-tree indexes for that. Bump catalog version Author: Anastasia Lubennikova with contribition by Alexander Korotkov and me Reviewed by: Peter Geoghegan, Tomas Vondra, Antonin Houska, Jeff Janes, David Rowley, Alexander Korotkov Discussion: https://www.postgresql.org/message-id/flat/56168952.4010101@postgrespro.ru	2018-04-07 23:00:39 +03:00
Teodor Sigaev	1c1791e000	Add json(b)_to_tsvector function Jsonb has a complex nature so there isn't best-for-everything way to convert it to tsvector for full text search. Current to_tsvector(json(b)) suggests to convert only string values, but it's possible to index keys, numerics and even booleans value. To solve that json(b)_to_tsvector has a second required argument contained a list of desired types of json fields. Second argument is a jsonb scalar or array right now with possibility to add new options in a future. Bump catalog version Author: Dmitry Dolgov with some editorization by me Reviewed by: Teodor Sigaev Discussion: https://www.postgresql.org/message-id/CA+q6zcXJQbS1b4kJ_HeAOoOc=unfnOrUEL=KGgE32QKDww7d8g@mail.gmail.com	2018-04-07 20:58:03 +03:00
Peter Eisentraut	039eb6e92f	Logical replication support for TRUNCATE Update the built-in logical replication system to make use of the previously added logical decoding for TRUNCATE support. Add the required truncate callback to pgoutput and a new logical replication protocol message. Publications get a new attribute to determine whether to replicate truncate actions. When updating a publication via pg_dump from an older version, this is not set, thus preserving the previous behavior. Author: Simon Riggs <simon@2ndquadrant.com> Author: Marco Nenciarini <marco.nenciarini@2ndquadrant.it> Author: Peter Eisentraut <peter.eisentraut@2ndquadrant.com> Reviewed-by: Petr Jelinek <petr.jelinek@2ndquadrant.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>	2018-04-07 11:34:11 -04:00
Peter Eisentraut	5dfd1e5a66	Logical decoding of TRUNCATE Add a new WAL record type for TRUNCATE, which is only used when wal_level >= logical. (For physical replication, TRUNCATE is already replicated via SMGR records.) Add new callback for logical decoding output plugins to receive TRUNCATE actions. Author: Simon Riggs <simon@2ndquadrant.com> Author: Marco Nenciarini <marco.nenciarini@2ndquadrant.it> Author: Peter Eisentraut <peter.eisentraut@2ndquadrant.com> Reviewed-by: Petr Jelinek <petr.jelinek@2ndquadrant.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>	2018-04-07 11:34:10 -04:00
Teodor Sigaev	b508a56f2f	Predicate locking in hash indexes. Hash index searches acquire predicate locks on the primary page of a bucket. It acquires a lock on both the old and new buckets for scans that happen concurrently with page splits. During a bucket split, a predicate lock is copied from the primary page of an old bucket to the primary page of a new bucket. Author: Shubham Barai, Amit Kapila Reviewed by: Amit Kapila, Alexander Korotkov, Thomas Munro Discussion: https://www.postgresql.org/message-id/flat/CALxAEPvNsM2GTiXdRgaaZ1Pjd1bs+sxfFsf7Ytr+iq+5JJoYXA@mail.gmail.com	2018-04-07 16:59:14 +03:00
Alvaro Herrera	971d7ddbe1	Document partprune.c a little better Author: Amit Langote Reviewed-by: Álvaro Herrera, David Rowley Discussion: https://postgr.es/m/CA+HiwqGzq4D6z=8R0AP+XhbTFCQ-4Ct+t2ekqjE9Fpm84_JUGg@mail.gmail.com	2018-04-07 10:35:38 -03:00
Andres Freund	8c3debbbf6	Fix and improve pg_atomic_flag fallback implementation. The atomics fallback implementation for pg_atomic_flag was broken, returning the inverted value from pg_atomic_test_set_flag(). This was unnoticed because a) atomic flags were unused until recently b) the test code wasn't run when the fallback implementation was in use (because it didn't allow to test for some edge cases). Fix the bug, and improve the fallback so it has the same behaviour as the non-fallback implementation in the problematic edge cases. That breaks ABI compatibility in the back branches when fallbacks are in use, but given they were broken until now... Author: Andres Freund Reported-by: Daniel Gustafsson Discussion: https://postgr.es/m/FB948276-7B32-4B77-83E6-D00167F8EEB4@yesql.se https://postgr.es/m/20180406233854.uni2h3mbnveczl32@alap3.anarazel.de Backpatch: 9.5-, where the atomics abstraction was introduced.	2018-04-06 19:55:32 -07:00
Robert Haas	47cb9ca49a	Fix possible failure in parallel index build. Report and proposed fix by David Rowley, put in patch form by Peter Geoghegan. Discussion: http://postgr.es/m/CAKJS1f91kq1wfYR8rnRRfKtxyhU2woEA+=whd640UxMyU+O0EQ@mail.gmail.com	2018-04-06 19:28:48 -04:00
Robert Haas	3d956d9562	Allow insert and update tuple routing and COPY for foreign tables. Also enable this for postgres_fdw. Etsuro Fujita, based on an earlier patch by Amit Langote. The larger patch series of which this is a part has been reviewed by Amit Langote, David Fetter, Maksim Milyutin, Álvaro Herrera, Stephen Frost, and me. Minor documentation changes to the final version by me. Discussion: http://postgr.es/m/29906a26-da12-8c86-4fb9-d8f88442f2b9@lab.ntt.co.jp	2018-04-06 19:22:03 -04:00
Alvaro Herrera	9fdb675fc5	Faster partition pruning Add a new module backend/partitioning/partprune.c, implementing a more sophisticated algorithm for partition pruning. The new module uses each partition's "boundinfo" for pruning instead of constraint exclusion, based on an idea proposed by Robert Haas of a "pruning program": a list of steps generated from the query quals which are run iteratively to obtain a list of partitions that must be scanned in order to satisfy those quals. At present, this targets planner-time partition pruning, but there exist further patches to apply partition pruning at execution time as well. This commit also moves some definitions from include/catalog/partition.h to a new file include/partitioning/partbounds.h, in an attempt to rationalize partitioning related code. Authors: Amit Langote, David Rowley, Dilip Kumar Reviewers: Robert Haas, Kyotaro Horiguchi, Ashutosh Bapat, Jesper Pedersen. Discussion: https://postgr.es/m/098b9c71-1915-1a2a-8d52-1a7a50ce79e8@lab.ntt.co.jp	2018-04-06 16:44:05 -03:00
Stephen Frost	11523e860f	Support new default roles with adminpack This provides a newer version of adminpack which works with the newly added default roles to support GRANT'ing to non-superusers access to read and write files, along with related functions (unlinking files, getting file length, renaming/removing files, scanning the log file directory) which are supported through adminpack. Note that new versions of the functions are required because an environment might have an updated version of the library but still have the old adminpack 1.0 catalog definitions (where EXECUTE is GRANT'd to PUBLIC for the functions). This patch also removes the long-deprecated alternative names for functions that adminpack used to include and which are now included in the backend, in adminpack v1.1. Applications using the deprecated names should be updated to use the backend functions instead. Existing installations which continue to use adminpack v1.0 should continue to function until/unless adminpack is upgraded. Reviewed-By: Michael Paquier Discussion: https://postgr.es/m/20171231191939.GR2416%40tamriel.snowman.net	2018-04-06 14:47:10 -04:00
Stephen Frost	0fdc8495bf	Add default roles for file/program access This patch adds new default roles named 'pg_read_server_files', 'pg_write_server_files', 'pg_execute_server_program' which allow an administrator to GRANT to a non-superuser role the ability to access server-side files or run programs through PostgreSQL (as the user the database is running as). Having one of these roles allows a non-superuser to use server-side COPY to read, write, or with a program, and to use file_fdw (if installed by a superuser and GRANT'd USAGE on it) to read from files or run a program. The existing misc file functions are also changed to allow a user with the 'pg_read_server_files' default role to read any files on the filesystem, matching the privileges given to that role through COPY and file_fdw from above. Reviewed-By: Michael Paquier Discussion: https://postgr.es/m/20171231191939.GR2416%40tamriel.snowman.net	2018-04-06 14:47:10 -04:00
Stephen Frost	e79350fef2	Remove explicit superuser checks in favor of ACLs This removes the explicit superuser checks in the various file-access functions in the backend, specifically pg_ls_dir(), pg_read_file(), pg_read_binary_file(), and pg_stat_file(). Instead, EXECUTE is REVOKE'd from public for these, meaning that only a superuser is able to run them by default, but access to them can be GRANT'd to other roles. Reviewed-By: Michael Paquier Discussion: https://postgr.es/m/20171231191939.GR2416%40tamriel.snowman.net	2018-04-06 14:47:10 -04:00
Peter Eisentraut	94c1f9ba11	Add memory context identifier to portal context Discussion: https://www.postgresql.org/message-id/6421.1522194949@sss.pgh.pa.us	2018-04-06 12:37:54 -04:00
Peter Eisentraut	bbca77623f	Rename MemoryContextCopySetIdentifier() for clarity MemoryContextCopySetIdentifier -> MemoryContextCopyAndSetIdentifier Discussion: https://www.postgresql.org/message-id/6421.1522194949@sss.pgh.pa.us	2018-04-06 12:37:54 -04:00
Robert Haas	cfbecf8100	Enforce child constraints during COPY TO a partitioned table. The previous coding inadvertently checked the constraints for the partitioned table rather than the target partition, which could lead to data in a partition that fails to satisfy some constraint on that partition. This problem seems to date back to when table partitioning was introduced; prior to that, there was only one target table for a COPY, so the problem didn't occur, and the code just didn't get updated. Etsuro Fujita, reviewed by Amit Langote and Ashutosh Bapat Discussion: https://postgr.es/message-id/5ABA4074.1090500%40lab.ntt.co.jp	2018-04-06 11:42:28 -04:00
Peter Eisentraut	bcf79b5bb6	Split the SetSubscriptionRelState function into two We don't actually need the insert-or-update logic, so it's clearer to have separate functions for the inserting and updating. Author: Petr Jelinek <petr.jelinek@2ndquadrant.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>	2018-04-06 10:00:26 -04:00
Peter Eisentraut	c25304a945	Improve messaging during logical replication worker startup In case the subscription is removed before the worker is fully started, give a specific error message instead of the generic "cache lookup" error. Author: Petr Jelinek <petr.jelinek@2ndquadrant.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>	2018-04-06 09:07:09 -04:00
Simon Riggs	f1464c5380	Improve parse representation for MERGE Separation of parser data structures from executor, as requested by Tom Lane. Further improvements possible. While there, implement error for multiple VALUES clauses via parser to allow line number of error, as requested by Andres Freund. Author: Pavan Deolasee Discussion: https://www.postgresql.org/message-id/CABOikdPpqjectFchg0FyTOpsGXyPoqwgC==OLKWuxgBOsrDDZw@mail.gmail.com	2018-04-06 09:38:59 +01:00
Magnus Hagander	1fde38beaa	Allow on-line enabling and disabling of data checksums This makes it possible to turn checksums on in a live cluster, without the previous need for dump/reload or logical replication (and to turn it off). Enabling checkusm starts a background process in the form of a launcher/worker combination that goes through the entire database and recalculates checksums on each and every page. Only when all pages have been checksummed are they fully enabled in the cluster. Any failure of the process will revert to checksums off and the process has to be started. This adds a new WAL record that indicates the state of checksums, so the process works across replicated clusters. Authors: Magnus Hagander and Daniel Gustafsson Review: Tomas Vondra, Michael Banck, Heikki Linnakangas, Andrey Borodin	2018-04-05 22:04:48 +02:00
Simon Riggs	530e69e59b	Allow cpluspluscheck to pass by renaming variable Use of a C++ keyword as a function name caused problems Reported-by: Álvaro Herrera	2018-04-05 20:06:02 +01:00
Magnus Hagander	eed1ce72e1	Allow background workers to bypass datallowconn THis adds a "flags" field to the BackgroundWorkerInitializeConnection() and BackgroundWorkerInitializeConnectionByOid(). For now only one flag, BGWORKER_BYPASS_ALLOWCONN, is defined, which allows the worker to ignore datallowconn.	2018-04-05 19:02:45 +02:00
Teodor Sigaev	1664ae1978	Add websearch_to_tsquery Error-tolerant conversion function with web-like syntax for search query, it simplifies constraining search engine with close to habitual interface for users. Bump catalog version Authors: Victor Drobny, Dmitry Ivanov with editorization by me Reviewed by: Aleksander Alekseev, Tomas Vondra, Thomas Munro, Aleksandr Parfenov Discussion: https://www.postgresql.org/message-id/flat/fe931111ff7e9ad79196486ada79e268@postgrespro.ru	2018-04-05 19:55:11 +03:00
Teodor Sigaev	0a64b45152	Fix handling of non-upgraded B-tree metapages `857f9c36` bumps B-tree metapage version while upgrade is performed "on the fly" when needed. However, some asserts fired when old version metapage was cached to rel->rd_amcache. Despite new metadata fields are never used from rel->rd_amcache, that needs to be fixed. This patch introduces metadata upgrade during its caching, which fills unavailable fields with their default values. contrib/pageinspect is also patched to handle non-upgraded metapages in the same way. Author: Alexander Korotkov	2018-04-05 17:56:00 +03:00
Simon Riggs	01b88b4df5	MERGE minor errata	2018-04-05 13:19:13 +01:00
Simon Riggs	3af7b2b0d4	MERGE fix variable warning in non-assert builds Author: Jesper Pedersen	2018-04-05 13:02:29 +01:00
Teodor Sigaev	17d8beb4f5	Remove unused vars and mark assert-only vars Kyotaro HORIGUCHI	2018-04-05 13:16:15 +03:00
Teodor Sigaev	51e6562324	Fix typo Masahiko Sawada	2018-04-05 13:04:18 +03:00
Simon Riggs	4b2d44031f	MERGE post-commit review Review comments from Andres Freund * Consolidate code into AfterTriggerGetTransitionTable() * Rename nodeMerge.c to execMerge.c * Rename nodeMerge.h to execMerge.h * Move MERGE handling in ExecInitModifyTable() into a execMerge.c ExecInitMerge() * Move mt_merge_subcommands flags into execMerge.h * Rename opt_and_condition to opt_merge_when_and_condition * Wordsmith various comments Author: Pavan Deolasee Reviewer: Simon Riggs	2018-04-05 09:54:07 +01:00
Andrew Gierth	1fd8690668	Install errcodes.txt for use by extensions. Maintainers of out-of-tree PLs typically need access to the set of error codes. To avoid the need to duplicate that information in some form in PL source trees, provide errcodes.txt as part of a server installation. Thomas Munro, based on a suggestion from Andrew Gierth Discussion: https://postgr.es/m/87woykk7mu.fsf%40news-spur.riddles.org.uk	2018-04-05 04:05:40 +01:00
Alvaro Herrera	7d7c99790b	Restore erroneously removed ONLY from PK check This is a blind fix, since I don't have SE-Linux to verify it. Per unwanted change in rhinoceros, running sepgsql tests. Noted by Tom Lane. Discussion: https://postgr.es/m/32347.1522865050@sss.pgh.pa.us	2018-04-04 16:38:11 -03:00
Tom Lane	1383e2a1a9	Improve FSM management for BRIN indexes. BRIN indexes like to propagate additions of free space into the upper pages of their free space maps as soon as the new space is known, even when it's just on one individual index page. Previously this required calling FreeSpaceMapVacuum, which is quite an expensive thing if the map is large. Use the FreeSpaceMapVacuumRange function recently added by commit `c79f6df75` to reduce the amount of work done for this purpose. Fix a couple of places that neglected to do the upper-page vacuuming at all after recording new free space. If the policy is to be that BRIN should do that, it should do it everywhere. Do RecordPageWithFreeSpace unconditionally in brin_page_cleanup, and do FreeSpaceMapVacuum unconditionally in brin_vacuum_scan. Because of the FSM's imprecise storage of free space, the old complications here seldom bought anything, they just slowed things down. This approach also provides a predictable path for FSM corruption to be repaired. Remove premature RecordPageWithFreeSpace call in brin_getinsertbuffer where it's about to return an extended page to the caller. The caller should do that, instead, after it's inserted its new tuple. Fix the one caller that forgot to do so. Simplify logic in brin_doupdate's same-page-update case by postponing brin_initialize_empty_new_buffer to after the critical section; I see little point in doing it before. Avoid repeat calls of RelationGetNumberOfBlocks in brin_vacuum_scan. Avoid duplicate BufferGetBlockNumber and BufferGetPage calls in a couple of places where we already had the right values. Move a BRIN_elog debug logging call out of a critical section; that's pretty unsafe and I don't think it buys us anything to not wait till after the critical section. Move the "extended = false" step in brin_getinsertbuffer into the routine's main loop. There's no actual bug there, since the loop can't iterate with extended still true, but it doesn't seem very future-proof as coded; and it's certainly not documented as a loop invariant. This is all from follow-on investigation inspired by commit `c79f6df75`. Discussion: https://postgr.es/m/5801.1522429460@sss.pgh.pa.us	2018-04-04 14:26:04 -04:00
Alvaro Herrera	3de241dba8	Foreign keys on partitioned tables Author: Álvaro Herrera Discussion: https://postgr.es/m/20171231194359.cvojcour423ulha4@alvherre.pgsql Reviewed-by: Peter Eisentraut	2018-04-04 14:02:49 -03:00
Teodor Sigaev	857f9c36cd	Skip full index scan during cleanup of B-tree indexes when possible Vacuum of index consists from two stages: multiple (zero of more) ambulkdelete calls and one amvacuumcleanup call. When workload on particular table is append-only, then autovacuum isn't intended to touch this table. However, user may run vacuum manually in order to fill visibility map and get benefits of index-only scans. Then ambulkdelete wouldn't be called for indexes of such table (because no heap tuples were deleted), only amvacuumcleanup would be called In this case, amvacuumcleanup would perform full index scan for two objectives: put recyclable pages into free space map and update index statistics. This patch allows btvacuumclanup to skip full index scan when two conditions are satisfied: no pages are going to be put into free space map and index statistics isn't stalled. In order to check first condition, we store oldest btpo_xact in the meta-page. When it's precedes RecentGlobalXmin, then there are some recyclable pages. In order to check second condition we store number of heap tuples observed during previous full index scan by cleanup. If fraction of newly inserted tuples is less than vacuum_cleanup_index_scale_factor, then statistics isn't considered to be stalled. vacuum_cleanup_index_scale_factor can be defined as both reloption and GUC (default). This patch bumps B-tree meta-page version. Upgrade of meta-page is performed "on the fly": during VACUUM meta-page is rewritten with new version. No special handling in pg_upgrade is required. Author: Masahiko Sawada, Alexander Korotkov Review by: Peter Geoghegan, Kyotaro Horiguchi, Alexander Korotkov, Yura Sokolov Discussion: https://www.postgresql.org/message-id/flat/CAD21AoAX+d2oD_nrd9O2YkpzHaFr=uQeGr9s1rKC3O4ENc568g@mail.gmail.com	2018-04-04 19:29:00 +03:00
Alvaro Herrera	851f4b4e14	Don't clone internal triggers to partitions Trigger cloning to partitions was supposed to occur for user-visible triggers only, but during development the protection that prevented it from occurring to internal triggers was lost. Reinstate it, as well as add a test case to ensure internal triggers (in the tested case, triggers implementing a deferred unique constraint) are not cloned. Without the code fix, the partitions in the test end up with different numbers of triggers, which is clearly wrong ... Bug in `86f575948c`. Discussion: https://postgr.es/m/20180403214903.ozfagwjcpk337uw7@alvherre.pgsql	2018-04-03 19:08:25 -03:00
Alvaro Herrera	cd5005bc12	Pass correct TupDesc to ri_NullCheck() in Assert Previous coding was passing the wrong table's tuple descriptor, which accidentally fails to fail because no existing test case exercises a foreign key in which the referenced attributes are further to the right of the referencing attributes. Add a test so that further breakage is visible. This got broken in `16828d5c02`. Discussion: https://postgr.es/m/20180403204723.fqte755nukgm42uf@alvherre.pgsql	2018-04-03 18:04:50 -03:00
Tom Lane	dddfc4cb2e	Prevent accidental linking of system-supplied copies of libpq.so etc. We were being careless in some places about the order of -L switches in link command lines, such that -L switches referring to external directories could come before those referring to directories within the build tree. This made it possible to accidentally link a system-supplied library, for example /usr/lib/libpq.so, in place of the one built in the build tree. Hilarity ensued, the more so the older the system-supplied library is. To fix, break LDFLAGS into two parts, a sub-variable LDFLAGS_INTERNAL and the main LDFLAGS variable, both of which are "recursively expanded" so that they can be incrementally adjusted by different makefiles. Establish a policy that -L switches for directories in the build tree must always be added to LDFLAGS_INTERNAL, while -L switches for external directories must always be added to LDFLAGS. This is sufficient to ensure a safe search order. For simplicity, we typically also put -l switches for the respective libraries into those same variables. (Traditional make usage would have us put -l switches into LIBS, but cleaning that up is a project for another day, as there's no clear need for it.) This turns out to also require separating SHLIB_LINK into two variables, SHLIB_LINK and SHLIB_LINK_INTERNAL, with a similar rule about which switches go into which variable. And likewise for PG_LIBS. Although this change might appear to affect external users of pgxs.mk, I think it doesn't; they shouldn't have any need to touch the _INTERNAL variables. In passing, tweak src/common/Makefile so that the value of CPPFLAGS recorded in pg_config lacks "-DFRONTEND" and the recorded value of LDFLAGS lacks "-L../../../src/common". Both of those things are mistakes, apparently introduced during prior code rearrangements, as old versions of pg_config don't print them. In general we don't want anything that's specific to the src/common subdirectory to appear in those outputs. This is certainly a bug fix, but in view of the lack of field complaints, I'm unsure whether it's worth the risk of back-patching. In any case it seems wise to see what the buildfarm makes of it first. Discussion: https://postgr.es/m/25214.1522604295@sss.pgh.pa.us	2018-04-03 16:26:05 -04:00
Bruce Momjian	242408dbef	C comment: mention null handling in BuildTupleFromCStrings() Discussion: https://postgr.es/m/CAFjFpRcF-wNbe0w-m3NpkEwr9shmOZ=GoESOzd2Wog9h55J8sA@mail.gmail.com Author: Ashutosh Bapat	2018-04-03 14:01:14 -04:00
Teodor Sigaev	710d90da1f	Add prefix operator for TEXT type. The prefix operator along with SP-GiST indexes can be used as an alternative for LIKE 'word%' commands and it doesn't have a limitation of string/prefix length as B-Tree has. Bump catalog version Author: Ildus Kurbangaliev with some editorization by me Review by: Arthur Zakirov, Alexander Korotkov, and me Discussion: https://www.postgresql.org/message-id/flat/20180202180327.222b04b3@wp.localdomain	2018-04-03 19:46:45 +03:00
Magnus Hagander	10d62d1065	Properly use INT64_FORMAT in output Per buildfarm animal prairiedog, suggestion solution from Tom.	2018-04-03 16:39:29 +02:00
Magnus Hagander	a08dc71195	Fix for checksum validation patch Reorder the check for non-BLCKSZ size reads to make sure we don't abort sending the file in this case. Missed in the previous commit.	2018-04-03 13:57:49 +02:00
Magnus Hagander	4eb77d50c2	Validate page level checksums in base backups When base backups are run over the replication protocol (for example using pg_basebackup), verify the checksums of all data blocks if checksums are enabled. If checksum failures are encountered, log them as warnings but don't abort the backup. This becomes the default behaviour in pg_basebackup (provided checksums are enabled on the server), so add a switch (-k) to disable the checks if necessary. Author: Michael Banck Reviewed-By: Magnus Hagander, David Steele Discussion: https://postgr.es/m/20180228180856.GE13784@nighthawk.caipicrew.dd-dns.de	2018-04-03 13:47:16 +02:00
Simon Riggs	aa3faa3c7a	WITH support in MERGE Author: Peter Geoghegan Recursive support removed, no tests Docs added by me	2018-04-03 12:13:59 +01:00
Simon Riggs	83454e3c2b	New files for MERGE	2018-04-03 10:22:21 +01:00
Simon Riggs	d204ef6377	MERGE SQL Command following SQL:2016 MERGE performs actions that modify rows in the target table using a source table or query. MERGE provides a single SQL statement that can conditionally INSERT/UPDATE/DELETE rows a task that would other require multiple PL statements. e.g. MERGE INTO target AS t USING source AS s ON t.tid = s.sid WHEN MATCHED AND t.balance > s.delta THEN UPDATE SET balance = t.balance - s.delta WHEN MATCHED THEN DELETE WHEN NOT MATCHED AND s.delta > 0 THEN INSERT VALUES (s.sid, s.delta) WHEN NOT MATCHED THEN DO NOTHING; MERGE works with regular and partitioned tables, including column and row security enforcement, as well as support for row, statement and transition triggers. MERGE is optimized for OLTP and is parameterizable, though also useful for large scale ETL/ELT. MERGE is not intended to be used in preference to existing single SQL commands for INSERT, UPDATE or DELETE since there is some overhead. MERGE can be used statically from PL/pgSQL. MERGE does not yet support inheritance, write rules, RETURNING clauses, updatable views or foreign tables. MERGE follows SQL Standard per the most recent SQL:2016. Includes full tests and documentation, including full isolation tests to demonstrate the concurrent behavior. This version written from scratch in 2017 by Simon Riggs, using docs and tests originally written in 2009. Later work from Pavan Deolasee has been both complex and deep, leaving the lead author credit now in his hands. Extensive discussion of concurrency from Peter Geoghegan, with thanks for the time and effort contributed. Various issues reported via sqlsmith by Andreas Seltenreich Authors: Pavan Deolasee, Simon Riggs Reviewer: Peter Geoghegan, Amit Langote, Tomas Vondra, Simon Riggs Discussion: https://postgr.es/m/CANP8+jKitBSrB7oTgT9CY2i1ObfOt36z0XMraQc+Xrz8QB0nXA@mail.gmail.com https://postgr.es/m/CAH2-WzkJdBuxj9PO=2QaO9-3h3xGbQPZ34kJH=HukRekwM-GZg@mail.gmail.com	2018-04-03 09:28:16 +01:00

1 2 3 4 5 ...

18194 Commits