postgres

mirror of https://github.com/postgres/postgres.git synced 2025-12-06 00:02:13 +03:00

Author	SHA1	Message	Date
Greg Stark	9e6cd794c2	Fix ADD IF NOT EXISTS used in conjunction with ALTER TABLE ONLY The flag for IF NOT EXISTS was only being passed down in the normal recursing case. It's been this way since originally added in 9.6 in commit `2cd40adb85` so backpatch back to 9.6.	2018-12-19 19:41:06 -05:00
Amit Kapila	0fe0b2f474	Remove extra semicolons. Reported-by: David Rowley Author: David Rowley Reviewed-by: Amit Kapila Backpatch-through: 10 Discussion: https://postgr.es/m/CAKJS1f8EneeYyzzvdjahVZ6gbAHFkHbSFB5m_C0Y6TUJs9Dgdg@mail.gmail.com	2018-12-17 14:29:49 +05:30
Michael Paquier	91fc2a0883	Fix use-after-free bug when renaming constraints This is an oversight from recent commit `b13fd344`. While on it, tweak the previous test with a better name for the renamed primary key. Detected by buildfarm member prion which forces relation cache release with -DRELCACHE_FORCE_RELEASE. Back-patch down to 9.4 as the previous commit.	2018-12-17 12:43:48 +09:00
Michael Paquier	da13d90a5f	Make constraint rename issue relcache invalidation on target relation When a constraint gets renamed, it may have associated with it a target relation (for example domain constraints don't have one). Not invalidating the target relation cache when issuing the renaming can result in issues with subsequent commands that refer to the old constraint name using the relation cache, causing various failures. One pattern spotted was using CREATE TABLE LIKE after a constraint renaming. Reported-by: Stuart <sfbarbee@gmail.com> Author: Amit Langote Reviewed-by: Michael Paquier Discussion: https://postgr.es/m/2047094.V130LYfLq4@station53.ousa.org	2018-12-17 10:36:21 +09:00
Tom Lane	34010ac2fa	Improve detection of child-process SIGPIPE failures. Commit `ffa4cbd62` added logic to detect SIGPIPE failure of a COPY child process, but it only worked correctly if the SIGPIPE occurred in the immediate child process. Depending on the shell in use and the complexity of the shell command string, we might instead get back an exit code of 128 + SIGPIPE, representing a shell error exit reporting SIGPIPE in the child process. We could just hack up ClosePipeToProgram() to add the extra case, but it seems like this is a fairly general issue deserving a more general and better-documented solution. I chose to add a couple of functions in src/common/wait_error.c, which is a natural place to know about wait-result encodings, that will test for either a specific child-process signal type or any child-process signal failure. Then, adjust other places that were doing ad-hoc tests of this type to use the common functions. In RestoreArchivedFile, this fixes a race condition affecting whether the process will report an error or just silently proc_exit(1): before, that depended on whether the intermediate shell got SIGTERM'd itself or reported a child process failing on SIGTERM. Like the previous patch, back-patch to v10; we could go further but there seems no real need to. Per report from Erik Rijkers. Discussion: https://postgr.es/m/f3683f87ab1701bea5d86a7742b22432@xs4all.nl	2018-12-16 14:32:14 -05:00
Tom Lane	54f24ab76d	Fix misapplication of pgstat_count_truncate to wrong relation. The stanza of ExecuteTruncate[Guts] that truncates a target table's toast relation re-used the loop local variable "rel" to reference the toast rel. This was safe enough when written, but commit `d42358efb` added code below that that supposed "rel" still pointed to the parent table. Therefore, the stats counter update was applied to the wrong relcache entry (the toast rel not the user rel); and if we were unlucky and that relcache entry had been flushed during reindex_relation, very bad things could ensue. (I'm surprised that CLOBBER_CACHE_ALWAYS testing hasn't found this. I'm even more surprised that the problem wasn't detected during the development of d42358efb; it must not have been tested in any case with a toast table, as the incorrect stats counts are very obvious.) To fix, replace use of "rel" in that code branch with a more local variable. Adjust test cases added by `d42358efb` so that some of them use tables with toast tables. Per bug #15540 from Pan Bian. Back-patch to 9.5 where `d42358efb` came in. Discussion: https://postgr.es/m/15540-01078812338195c0@postgresql.org	2018-12-07 12:12:00 -05:00
Tom Lane	aaf7b3bd6f	Clean up sloppy coding in publicationcmds.c's OpenTableList(). Remove dead code (which would be incorrect if it weren't dead), per report from Pan Bian. Add a CHECK_FOR_INTERRUPTS in the inner loop over child relations, because there's little point in having one in the outer loop if there's not one here too. Minor stylistic adjustments and comment improvements. Seems to be aboriginal to this code (cf commit `665d1fad9`). Back-patch to v10 where that came in, not because any of this is significant, but just to keep the branches looking similar. Discussion: https://postgr.es/m/15539-06d00ef6b1e2e1bb@postgresql.org	2018-12-07 11:02:39 -05:00
Tom Lane	0064d0e9f4	Add needed #include. Per POSIX, WIFSIGNALED and related macros are provided by <sys/wait.h>. Apparently on Linux they're also pulled in by some other inclusions, but BSD-ish systems are pickier. Fixes portability issue in `ffa4cbd62`. Per buildfarm.	2018-11-19 17:28:05 -05:00
Tom Lane	8285fae070	Handle EPIPE more sanely when we close a pipe reading from a program. Previously, any program launched by COPY TO/FROM PROGRAM inherited the server's setting of SIGPIPE handling, i.e. SIG_IGN. Hence, if we were doing COPY FROM PROGRAM and closed the pipe early, the child process would see EPIPE on its output file and typically would treat that as a fatal error, in turn causing the COPY to report error. Similarly, one could get a failure report from a query that didn't read all of the output from a contrib/file_fdw foreign table that uses file_fdw's PROGRAM option. To fix, ensure that child programs inherit SIG_DFL not SIG_IGN processing of SIGPIPE. This seems like an all-around better situation since if the called program wants some non-default treatment of SIGPIPE, it would expect to have to set that up for itself. Then in COPY, if it's COPY FROM PROGRAM and we stop reading short of detecting EOF, treat a SIGPIPE exit from the called program as a non-error condition. This still allows us to report an error for any case where the called program gets SIGPIPE on some other file descriptor. As coded, we won't report a SIGPIPE if we stop reading as a result of seeing an in-band EOF marker (e.g. COPY BINARY EOF marker). It's somewhat debatable whether we should complain if the called program continues to transmit data after an EOF marker. However, it seems like we should avoid throwing error in any questionable cases, especially in a back-patched fix, and anyway it would take additional code to make such an error get reported consistently. Back-patch to v10. We could go further back, since COPY FROM PROGRAM has been around awhile, but AFAICS the only way to reach this situation using core or contrib is via file_fdw, which has only supported PROGRAM sources since v10. The COPY statement per se has no feature whereby it'd stop reading without having hit EOF or an error already. Therefore, I don't see any upside to back-patching further that'd outweigh the risk of complaints about behavioral change. Per bug #15449 from Eric Cyr. Patch by me, review by Etsuro Fujita and Kyotaro Horiguchi Discussion: https://postgr.es/m/15449-1cf737dd5929450e@postgresql.org	2018-11-19 17:02:25 -05:00
Alvaro Herrera	85efd1a041	Disallow COPY FREEZE on partitioned tables This didn't actually work: COPY would fail to flush the right files, and instead would try to flush a non-existing file, causing the whole transaction to fail. Cope by raising an error as soon as the command is sent instead, to avoid a nasty later surprise. Of course, it would be much better to make it work, but we don't have a patch for that yet, and we don't know if we'll want to backpatch one when we do. Reported-by: Tomas Vondra Author: David Rowley Reviewed-by: Amit Langote, Steve Singer, Tomas Vondra	2018-11-19 11:16:28 -03:00
Tom Lane	2d83863ea2	Fix missing role dependencies for some schema and type ACLs. This patch fixes several related cases in which pg_shdepend entries were never made, or were lost, for references to roles appearing in the ACLs of schemas and/or types. While that did no immediate harm, if a referenced role were later dropped, the drop would be allowed and would leave a dangling reference in the object's ACL. That still wasn't a big problem for normal database usage, but it would cause obscure failures in subsequent dump/reload or pg_upgrade attempts, taking the form of attempts to grant privileges to all-numeric role names. (I think I've seen field reports matching that symptom, but can't find any right now.) Several cases are fixed here: 1. ALTER DOMAIN SET/DROP DEFAULT would lose the dependencies for any existing ACL entries for the domain. This case is ancient, dating back as far as we've had pg_shdepend tracking at all. 2. If a default type privilege applies, CREATE TYPE recorded the ACL properly but forgot to install dependency entries for it. This dates to the addition of default privileges for types in 9.2. 3. If a default schema privilege applies, CREATE SCHEMA recorded the ACL properly but forgot to install dependency entries for it. This dates to the addition of default privileges for schemas in v10 (commit `ab89e465c`). Another somewhat-related problem is that when creating a relation rowtype or implicit array type, TypeCreate would apply any available default type privileges to that type, which we don't really want since such an object isn't supposed to have privileges of its own. (You can't, for example, drop such privileges once they've been added to an array type.) `ab89e465c` is also to blame for a race condition in the regression tests: privileges.sql transiently installed globally-applicable default privileges on schemas, which sometimes got absorbed into the ACLs of schemas created by concurrent test scripts. This should have resulted in failures when privileges.sql tried to drop the role holding such privileges; but thanks to the bug fixed here, it instead led to dangling ACLs in the final state of the regression database. We'd managed not to notice that, but it became obvious in the wake of commit `da906766c`, which allowed the race condition to occur in pg_upgrade tests. To fix, add a function recordDependencyOnNewAcl to encapsulate what callers of get_user_default_acl need to do; while the original call sites got that right via ad-hoc code, none of the later-added ones have. Also change GenerateTypeDependencies to generate these dependencies, which requires adding the typacl to its parameter list. (That might be annoying if there are any extensions calling that function directly; but if there are, they're most likely buggy in the same way as the core callers were, so they need work anyway.) While I was at it, I changed GenerateTypeDependencies to accept most of its parameters in the form of a Form_pg_type pointer, making its parameter list a bit less unwieldy and mistake-prone. The test race condition is fixed just by wrapping the addition and removal of default privileges into a single transaction, so that that state is never visible externally. We might eventually prefer to separate out tests of default privileges into a script that runs by itself, but that would be a bigger change and would make the tests run slower overall. Back-patch relevant parts to all supported branches. Discussion: https://postgr.es/m/15719.1541725287@sss.pgh.pa.us	2018-11-09 20:42:03 -05:00
Michael Paquier	52ea6a8209	Fix dependency handling of partitions and inheritance for ON COMMIT This commit fixes a set of issues with ON COMMIT actions when used on partitioned tables and tables with inheritance children: - Applying ON COMMIT DROP on a partitioned table with partitions or on a table with inheritance children caused a failure at commit time, with complaints about the children being already dropped as all relations are dropped one at a time. - Applying ON COMMIT DELETE on a partition relying on a partitioned table which uses ON COMMIT DROP would cause the partition truncation to fail as the parent is removed first. The solution to the first problem is to handle the removal of all the dependencies in one go instead of dropping relations one-by-one, based on a suggestion from Álvaro Herrera. So instead all the relation OIDs to remove are gathered and then processed in one round of multiple deletions. The solution to the second problem is to reorder the actions, with truncation happening first and relation drop done after. Even if it means that a partition could be first truncated, then immediately dropped if its partitioned table is dropped, this has the merit of keeping the code simple as there is no need to do existence checks on the relations to drop. Contrary to a manual TRUNCATE on a partitioned table, ON COMMIT DELETE does not cascade to its partitions. The ON COMMIT action defined on each partition gets the priority. Author: Michael Paquier Reviewed-by: Amit Langote, Álvaro Herrera, Robert Haas Discussion: https://postgr.es/m/68f17907-ec98-1192-f99f-8011400517f5@lab.ntt.co.jp Backpatch-through: 10	2018-11-09 10:03:39 +09:00
Alvaro Herrera	21c9e4973c	Revise attribute handling code on partition creation The original code to propagate NOT NULL and default expressions specified when creating a partition was mostly copy-pasted from typed-tables creation, but not being a great match it contained some duplicity, inefficiency and bugs. This commit fixes the bug that NOT NULL constraints declared in the parent table would not be honored in the partition. One reported issue that is not fixed is that a DEFAULT declared in the child is not used when inserting through the parent. That would amount to a behavioral change that's better not back-patched. This rewrite makes the code simpler: 1. instead of checking for duplicate column names in its own block, reuse the original one that already did that; 2. instead of concatenating the list of columns from parent and the one declared in the partition and scanning the result to (incorrectly) propagate defaults and not-null constraints, just scan the latter searching the former for a match, and merging sensibly. This works because we know the list in the parent is already correct and there can only be one parent. This rewrite makes ColumnDef->is_from_parent unused, so it's removed on branch master; on released branches, it's kept as an unused field in order not to cause ABI incompatibilities. This commit also adds a test case for creating partitions with collations mismatching that on the parent table, something that is closely related to the code being patched. No code change is introduced though, since that'd be a behavior change that could break some (broken) working applications. Amit Langote wrote a less invasive fix for the original NOT NULL/defaults bug, but while I kept the tests he added, I ended up not using his original code. Ashutosh Bapat reviewed Amit's fix. Amit reviewed mine. Author: Álvaro Herrera, Amit Langote Reviewed-by: Ashutosh Bapat, Amit Langote Reported-by: Jürgen Strobel (bug #15212) Discussion: https://postgr.es/m/152746742177.1291.9847032632907407358@wrigleys.postgresql.org	2018-11-08 16:22:09 -03:00
Michael Paquier	8aad248f7c	Block creation of partitions with open references to its parent When a partition is created as part of a trigger processing, it is possible that the partition which just gets created changes the properties of the table the executor of the ongoing command relies on, causing a subsequent crash. This has been found possible when for example using a BEFORE INSERT which creates a new partition for a partitioned table being inserted to. Any attempt to do so is blocked when working on a partition, with regression tests added for both CREATE TABLE PARTITION OF and ALTER TABLE ATTACH PARTITION. Reported-by: Dmitry Shalashov Author: Amit Langote Reviewed-by: Michael Paquier, Tom Lane Discussion: https://postgr.es/m/15437-3fe01ee66bd1bae1@postgresql.org Backpatch-through: 10	2018-11-05 11:04:20 +09:00
Alvaro Herrera	6b6b59b38e	Silence compiler warning in Assert() gcc 6.3 does not whine about this mistake I made in `39808e8868` but evidently lots of other compilers do, according to Michael Paquier, Peter Eisentraut, Arthur Zakirov, Tomas Vondra. Discussion: too many to list	2018-10-08 10:37:21 -03:00
Alvaro Herrera	101b21ead3	Fix event triggers for partitioned tables Index DDL cascading on partitioned tables introduced a way for ALTER TABLE to be called reentrantly. This caused an important deficiency in event trigger support to be exposed: on exiting the reentrant call, the alter table state object was clobbered, causing a crash when the outer alter table tries to finalize its processing. Fix the crash by creating a stack of event trigger state objects. There are still ways to cause things to misbehave (and probably other crashers) with more elaborate tricks, but at least it now doesn't crash in the obvious scenario. Backpatch to 9.5, where DDL deparsing of event triggers was introduced. Reported-by: Marco Slot Authors: Michaël Paquier, Álvaro Herrera Discussion: https://postgr.es/m/CANNhMLCpi+HQ7M36uPfGbJZEQLyTy7XvX=5EFkpR-b1bo0uJew@mail.gmail.com	2018-10-06 19:17:46 -03:00
Tom Lane	db01fc97ad	Fix ALTER COLUMN TYPE to not open a relation without any lock. If the column being modified is referenced by a foreign key constraint of another table, ALTER TABLE would open the other table (to re-parse the constraint's definition) without having first obtained a lock on it. This was evidently intentional, but that doesn't mean it's really safe. It's especially not safe in 9.3, which pre-dates use of MVCC scans for catalog reads, but even in current releases it doesn't seem like a good idea. We know we'll need AccessExclusiveLock shortly to drop the obsoleted constraint, so just get that a little sooner to close the hole. Per testing with a patch that complains if we open a relation without holding any lock on it. I don't plan to back-patch that patch, but we should close the holes it identifies in all supported branches. Discussion: https://postgr.es/m/2038.1538335244@sss.pgh.pa.us	2018-10-01 11:39:14 -04:00
Peter Eisentraut	5f6b0e6d69	Recurse to sequences on ownership change for all relkinds When a table ownership is changed, we must apply that also to any owned sequences. (Otherwise, it would result in a situation that cannot be restored, because linked sequences must have the same owner as the table.) But this was previously only applied to regular tables and materialized views. But it should also apply to at least foreign tables. This patch removes the relkind check altogether, because it doesn't save very much and just introduces the possibility of similar omissions. Bug: #15238 Reported-by: Christoph Berg <christoph.berg@credativ.de>	2018-09-26 20:19:44 +02:00
Tom Lane	10b9af3ebb	Avoid using potentially-under-aligned page buffers. There's a project policy against using plain "char buf[BLCKSZ]" local or static variables as page buffers; preferred style is to palloc or malloc each buffer to ensure it is MAXALIGN'd. However, that policy's been ignored in an increasing number of places. We've apparently got away with it so far, probably because (a) relatively few people use platforms on which misalignment causes core dumps and/or (b) the variables chance to be sufficiently aligned anyway. But this is not something to rely on. Moreover, even if we don't get a core dump, we might be paying a lot of cycles for misaligned accesses. To fix, invent new union types PGAlignedBlock and PGAlignedXLogBlock that the compiler must allocate with sufficient alignment, and use those in place of plain char arrays. I used these types even for variables where there's no risk of a misaligned access, since ensuring proper alignment should make kernel data transfers faster. I also changed some places where we had been palloc'ing short-lived buffers, for coding style uniformity and to save palloc/pfree overhead. Since this seems to be a live portability hazard (despite the lack of field reports), back-patch to all supported versions. Patch by me; thanks to Michael Paquier for review. Discussion: https://postgr.es/m/1535618100.1286.3.camel@credativ.de	2018-09-01 15:27:13 -04:00
Michael Paquier	ecf56dc5e5	Fix set of NLS translation issues While monitoring the code, a couple of issues related to string translation have shown up: - Some routines for auto-updatable views return an error string, which sometimes missed the shot. A comment regarding string translation is added for each routine to help with future features. - GSSAPI authentication missed two translations. - vacuumdb handles non-translated strings. Reported-by: Kyotaro Horiguchi Author: Kyotaro Horiguchi Reviewed-by: Michael Paquier, Tom Lane Discussion: https://postgr.es/m/20180810.152131.31921918.horiguchi.kyotaro@lab.ntt.co.jp Backpatch-through: 9.3	2018-08-21 15:17:38 +09:00
Peter Eisentraut	6c206de559	Remove obsolete comment The sequence name is no longer stored in the sequence relation, since `1753b1b027`.	2018-08-13 21:08:15 +02:00
Tom Lane	9446d71577	Don't record FDW user mappings as members of extensions. CreateUserMapping has a recordDependencyOnCurrentExtension call that's been there since extensions were introduced (very possibly my fault). However, there's no support anywhere else for user mappings as members of extensions, nor are they listed as a possible member object type in the documentation. Nor does it really seem like a good idea for user mappings to belong to extensions when roles don't. Hence, remove the bogus call. (As we saw in bug #15310, the lack of any pg_dump support for this case ensures that any such membership record would silently disappear during pg_upgrade. So there's probably no need for us to do anything else about cleaning up after this mistake.) Discussion: https://postgr.es/m/27952.1533667213@sss.pgh.pa.us	2018-08-07 16:33:00 -04:00
Tom Lane	2131d4501f	Remove undocumented restriction against duplicate partition key columns. transformPartitionSpec rejected duplicate simple partition columns (e.g., "PARTITION BY RANGE (x,x)") but paid no attention to expression columns, resulting in inconsistent behavior. Worse, cases like "PARTITION BY RANGE (x,(x))" were accepted but would then result in dump/reload failures, since the expression (x) would get simplified to a plain column later. There seems no better reason for this restriction than there was for the one against duplicate included index columns (cf commit `701fd0bbc`), so let's just remove it. Back-patch to v10 where this code was added. Report and patch by Yugo Nagata. Discussion: https://postgr.es/m/20180712165939.36b12aff.nagata@sraoss.co.jp	2018-07-19 15:41:46 -04:00
Heikki Linnakangas	ed529faf7e	Fix misc typos, mostly in comments. A collection of typos I happened to spot while reading code, as well as grepping for common mistakes. Backpatch to all supported versions, as applicable, to avoid conflicts when backporting other commits in the future.	2018-07-18 16:18:27 +03:00
Michael Paquier	5862174ec7	Clarify use of temporary tables within partition trees Since their introduction, partition trees have been a bit lossy regarding temporary relations. Inheritance trees respect the following patterns: 1) a child relation can be temporary if the parent is permanent. 2) a child relation can be temporary if the parent is temporary. 3) a child relation cannot be permanent if the parent is temporary. 4) The use of temporary relations also implies that both parent and child need to be from the same session. Partitions share many similar patterns with inheritance, however the handling of the partition bounds makes the situation a bit tricky for case 1) as the partition code bases a lot of its lookup code upon PartitionDesc which does not really look after relpersistence. This causes for example a temporary partition created by session A to be visible by another session B, preventing session B from creating an extra partition which overlaps with the temporary one created by A, with a non-intuitive error message. There could be use-cases where mixing permanent partitioned tables with temporary partitions makes sense, but that would be a new feature. Partitions respect 2), 3) and 4) already. It is a bit depressing to see those error checks happening in MergeAttributes() whose purpose is different, but that's left as future refactoring work. Back-patch down to 10, which is where partitioning has been introduced, except that default partitions do not apply there. Documentation also includes limitations related to the use of temporary tables with partition trees. Reported-by: David Rowley Author: Amit Langote, Michael Paquier Reviewed-by: Ashutosh Bapat, Amit Langote, Michael Paquier Discussion: https://postgr.es/m/CAKJS1f94Ojk0og9GMkRHGt8wHTW=ijq5KzJKuoBoqWLwSVwGmw@mail.gmail.com	2018-06-20 10:48:28 +09:00
Tom Lane	b10edaf4bb	Fix access to just-closed relcache entry. It might be impossible for this to cause a problem in non-debug builds, since there'd be no opportunity for the relcache entry to get recycled before the fetch. It blows up nicely with -DRELCACHE_FORCE_RELEASE plus valgrind, though. Evidently introduced by careless refactoring in commit `f0e44751d`. Back-patch accordingly. Discussion: https://postgr.es/m/27543.1528758304@sss.pgh.pa.us	2018-06-11 19:17:50 -04:00
Tom Lane	c92d1461e1	Widen COPY FROM's current-line-number counter from 32 to 64 bits. Because the code for the HEADER option skips a line when this counter is zero, a very long COPY FROM WITH HEADER operation would drop a line every 2^32 lines. A lesser but still unfortunate problem is that errors would show a wrong input line number for errors occurring beyond the 2^31'st input line. While such large input streams seemed impractical when this code was first written, they're not any more. Widening the counter (and some associated variables) to uint64 should be enough to prevent problems for the foreseeable future. David Rowley Discussion: https://postgr.es/m/CAKJS1f88yh-6wwEfO6QLEEvH3BEugOq2QX1TOja0vCauoynmOQ@mail.gmail.com	2018-05-22 13:32:52 -04:00
Tom Lane	fab4ecacc4	Fix race conditions when an event trigger is added concurrently with DDL. EventTriggerTableRewrite crashed if there were table_rewrite triggers present, but there had not been when the calling command started. EventTriggerDDLCommandEnd called ddl_command_end triggers if present, even if there had been no such triggers when the calling command started, which would lead to a failure in pg_event_trigger_ddl_commands. In both cases, fix by doing nothing; it's better to wait till the next command when things will be properly initialized. In passing, remove an elog(DEBUG1) call that might have seemed interesting four years ago but surely isn't today. We found this because of intermittent failures in the buildfarm. Thanks to Alvaro Herrera and Andrew Gierth for analysis. Back-patch to 9.5; some of this code exists before that, but the specific hazards we need to guard against don't. Discussion: https://postgr.es/m/5767.1523995174@sss.pgh.pa.us	2018-04-20 17:15:31 -04:00
Tom Lane	94a898f69c	Better fix for deadlock hazard in CREATE INDEX CONCURRENTLY. Commit `54eff5311` did not account for the possibility that we'd have a transaction snapshot due to default_transaction_isolation being set high enough to require one. The transaction snapshot is enough to hold back our advertised xmin and thus risk deadlock anyway. The only way to get rid of that snap is to start a new transaction, so let's do that instead. Also throw in an assert checking that we really have gotten to a state where no xmin is being advertised. Back-patch to 9.4, like the previous commit. Discussion: https://postgr.es/m/CAMkU=1ztk3TpQdcUNbxq93pc80FrXUjpDWLGMeVBDx71GHNwZQ@mail.gmail.com	2018-04-18 12:07:37 -04:00
Robert Haas	29ab1e24a6	Enforce child constraints during COPY TO a partitioned table. The previous coding inadvertently checked the constraints for the partitioned table rather than the target partition, which could lead to data in a partition that fails to satisfy some constraint on that partition. This problem seems to date back to when table partitioning was introduced; prior to that, there was only one target table for a COPY, so the problem didn't occur, and the code just didn't get updated. Etsuro Fujita, reviewed by Amit Langote and Ashutosh Bapat Discussion: https://postgr.es/message-id/5ABA4074.1090500%40lab.ntt.co.jp	2018-04-06 11:52:38 -04:00
Tom Lane	e17e9055f5	Fix some corner-case issues in REFRESH MATERIALIZED VIEW CONCURRENTLY. refresh_by_match_merge() has some issues in the way it builds a SQL query to construct the "diff" table: 1. It doesn't require the selected unique index(es) to be indimmediate. 2. It doesn't pay attention to the particular equality semantics enforced by a given index, but just assumes that they must be those of the column datatype's default btree opclass. 3. It doesn't check that the indexes are btrees. 4. It's insufficiently careful to ensure that the parser will pick the intended operator when parsing the query. (This would have been a security bug before CVE-2018-1058.) 5. It's not careful about indexes on system columns. The way to fix #4 is to make use of the existing code in ri_triggers.c for generating an arbitrary binary operator clause. I chose to move that to ruleutils.c, since that seems a more reasonable place to be exporting such functionality from than ri_triggers.c. While #1, #3, and #5 are just latent given existing feature restrictions, and #2 doesn't arise in the core system for lack of alternate opclasses with different equality behaviors, #4 seems like an issue worth back-patching. That's the bulk of the change anyway, so just back-patch the whole thing to 9.4 where this code was introduced. Discussion: https://postgr.es/m/13836.1521413227@sss.pgh.pa.us	2018-03-19 18:49:53 -04:00
Tom Lane	1568156d8f	Fix performance hazard in REFRESH MATERIALIZED VIEW CONCURRENTLY. Jeff Janes discovered that commit `7ca25b7de` made one of the queries run by REFRESH MATERIALIZED VIEW CONCURRENTLY perform badly. The root cause is bad cardinality estimation for correlated quals, but a principled solution to that problem is some way off, especially since the planner lacks any statistics about whole-row variables. Moreover, in non-error cases this query produces no rows, meaning it must be run to completion; but use of LIMIT 1 encourages the planner to pick a fast-start, slow-completion plan, exactly not what we want. Remove the LIMIT clause, and instead rely on the count parameter we pass to SPI_execute() to prevent excess work if the query does return some rows. While we've heard no field reports of planner misbehavior with this query, it could be that people are having performance issues that haven't reached the level of pain needed to cause a bug report. In any case, that LIMIT clause can't possibly do anything helpful with any existing version of the planner, and it demonstrably can cause bad choices in some cases, so back-patch to 9.4 where the code was introduced. Thomas Munro Discussion: https://postgr.es/m/CAMkU=1z-JoGymHneGHar1cru4F1XDfHqJDzxP_CtK5cL3DOfmg@mail.gmail.com	2018-03-19 17:23:23 -04:00
Alvaro Herrera	e3faddf537	Fix state reversal after partition tuple routing We make some changes to ModifyTableState and the EState it uses whenever we route tuples to partitions; but we weren't restoring properly in all cases, possibly causing crashes when partitions with different tuple descriptors are targeted by tuples inserted in the same command. Refactor some code, creating ExecPrepareTupleRouting, to encapsulate the needed state changing logic, and have it invoked one level above its current place (ie. put it in ExecModifyTable instead of ExecInsert); this makes it all more readable. Add a test case to exercise this. We don't support having views as partitions; and since only views can have INSTEAD OF triggers, there is no point in testing for INSTEAD OF when processing insertions into a partitioned table. Remove code that appears to support this (but which is actually never relevant.) In passing, fix location of some very confusing comments in ModifyTableState. Reported-by: Amit Langote Author: Etsuro Fujita, Amit Langote Discussion: https://postgr.es/m/0473bf5c-57b1-f1f7-3d58-455c2230bc5f@lab.ntt.co.jp	2018-03-19 17:43:55 -03:00
Tom Lane	1bfb567230	When updating reltuples after ANALYZE, just extrapolate from our sample. The existing logic for updating pg_class.reltuples trusted the sampling results only for the pages ANALYZE actually visited, preferring to believe the previous tuple density estimate for all the unvisited pages. While there's some rationale for doing that for VACUUM (first that VACUUM is likely to visit a very nonrandom subset of pages, and second that we know for sure that the unvisited pages did not change), there's no such rationale for ANALYZE: by assumption, it's looked at an unbiased random sample of the table's pages. Furthermore, in a very large table ANALYZE will have examined only a tiny fraction of the table's pages, meaning it cannot slew the overall density estimate very far at all. In a table that is physically growing, this causes reltuples to increase nearly proportionally to the change in relpages, regardless of what is actually happening in the table. This has been observed to cause reltuples to become so much larger than reality that it effectively shuts off autovacuum, whose threshold for doing anything is a fraction of reltuples. (Getting to the point where that would happen seems to require some additional, not well understood, conditions. But it's undeniable that if reltuples is seriously off in a large table, ANALYZE alone will not fix it in any reasonable number of iterations, especially not if the table is continuing to grow.) Hence, restrict the use of vac_estimate_reltuples() to VACUUM alone, and in ANALYZE, just extrapolate from the sample pages on the assumption that they provide an accurate model of the whole table. If, by very bad luck, they don't, at least another ANALYZE will fix it; in the old logic a single bad estimate could cause problems indefinitely. In HEAD, let's remove vac_estimate_reltuples' is_analyze argument altogether; it was never used for anything and now it's totally pointless. 
But keep it in the back branches, in case any third-party code is calling this function. Per bug #15005. Back-patch to all supported branches. David Gould, reviewed by Alexander Kuzmenkov, cosmetic changes by me Discussion: https://postgr.es/m/20180117164916.3fdcf2e9@engels	2018-03-13 13:24:27 -04:00
Peter Eisentraut	c32f44c4a5	Fix CREATE TABLE / LIKE with bigint identity column CREATE TABLE / LIKE with a bigint identity column would fail on platforms where long is 32 bits. Copying the sequence values used makeInteger(), which would truncate the 64-bit sequence data to 32 bits. To fix, use makeFloat() instead, like the parser. (This does not actually make use of floats, but stores the values as strings.) Bug: #15096 Reviewed-by: Michael Paquier <michael@paquier.xyz>	2018-03-13 09:41:36 -04:00
Tom Lane	e2ed3c4a30	Fix improper uses of canonicalize_qual(). One of the things canonicalize_qual() does is to remove constant-NULL subexpressions of top-level AND/OR clauses. It does that on the assumption that what it's given is a top-level WHERE clause, so that NULL can be treated like FALSE. Although this is documented down inside a subroutine of canonicalize_qual(), it wasn't mentioned in the documentation of that function itself, and some callers hadn't gotten that memo. Notably, commit `d007a9505` caused get_relation_constraints() to apply canonicalize_qual() to CHECK constraints. That allowed constraint exclusion to misoptimize situations in which a CHECK constraint had a provably-NULL subclause, as seen in the regression test case added here, in which a child table that should be scanned is not. (Although this thinko is ancient, the test case doesn't fail before 9.2, for reasons I've not bothered to track down in detail. There may be related cases that do fail before that.) More recently, commit `f0e44751d` added an independent bug by applying canonicalize_qual() to index expressions, which is even sillier since those might not even be boolean. If they are, though, I think this could lead to making incorrect index entries for affected index expressions in v10. I haven't attempted to prove that though. To fix, add an "is_check" parameter to canonicalize_qual() to specify whether it should assume WHERE or CHECK semantics, and make it perform NULL-elimination accordingly. Adjust the callers to apply the right semantics, or remove the call entirely in cases where it's not known that the expression has one or the other semantics. I also removed the call in some cases involving partition expressions, where it should be a no-op because such expressions should be canonical already ... and was a no-op, independently of whether it could in principle have done something, because it was being handed the qual in implicit-AND format which isn't what it expects. 
In HEAD, add an Assert to catch that type of mistake in future. This represents an API break for external callers of canonicalize_qual(). While that's intentional in HEAD to make such callers think about which case applies to them, it seems like something we probably wouldn't be thanked for in released branches. Hence, in released branches, the extra parameter is added to a new function canonicalize_qual_ext(), and canonicalize_qual() is a wrapper that retains its old behavior. Patch by me with suggestions from Dean Rasheed. Back-patch to all supported branches. Discussion: https://postgr.es/m/24475.1520635069@sss.pgh.pa.us	2018-03-11 18:10:42 -04:00
Alvaro Herrera	e20dd6a13d	Fix bogus Name assignment in CreateStatistics Apparently, it doesn't work to use a plain cstring as a Name datum: you may end up having random bytes because of failing to zero the bytes after the terminating \0, as indicated by valgrind. I introduced this bug in `5564c11815`, so backpatch this fix to REL_10_STABLE, like that commit. While at it, fix a slightly misleading comment, pointed out by David Rowley.	2018-03-06 13:21:04 -03:00
Alvaro Herrera	911e6236ba	Clone extended stats in CREATE TABLE (LIKE INCLUDING ALL) The LIKE INCLUDING ALL clause to CREATE TABLE intuitively indicates cloning of extended statistics on the source table, but it failed to do so. Patch it up so that it does. Also include an INCLUDING STATISTICS option to the LIKE clause, so that the behavior can be requested individually, or excluded individually. While at it, reorder the INCLUDING options, both in code and in docs, in alphabetical order which makes more sense than feature-implementation order that was previously used. Backpatch this to Postgres 10, where extended statistics were introduced, because this is seen as an oversight in a fresh feature which is better to get consistent from the get-go instead of changing only in pg11. In pg11, comments on statistics objects are cloned too. In pg10 they are not, because I (Álvaro) was too coward to change the parse node as required to support it. Also, in pg10 I chose not to renumber the parser symbols for the various INCLUDING options in LIKE, for the same reason. Any corresponding user-visible changes (docs) are backpatched, though. Reported-by: Stephen Froehlich Author: David Rowley Reviewed-by: Álvaro Herrera, Tomas Vondra Discussion: https://postgr.es/m/CY1PR0601MB1927315B45667A1B679D0FD5E5EF0@CY1PR0601MB1927.namprd06.prod.outlook.com	2018-03-05 19:37:19 -03:00
Tom Lane	b45f821e22	Prevent dangling-pointer access when update trigger returns old tuple. A before-update row trigger may choose to return the "new" or "old" tuple unmodified. ExecBRUpdateTriggers failed to consider the second possibility, and would proceed to free the "old" tuple even if it was the one returned, leading to subsequent access to already-deallocated memory. In debug builds this reliably leads to an "invalid memory alloc request size" failure; in production builds it might accidentally work, but data corruption is also possible. This is a very old bug. There are probably a couple of reasons it hasn't been noticed up to now. It would be more usual to return NULL if one wanted to suppress the update action; returning "old" is significantly less efficient since the update will occur anyway. Also, none of the standard PLs would ever cause this because they all returned freshly-manufactured tuples even if they were just copying "old". But commit `4b93f5799` changed that for plpgsql, making it possible to see the bug with a plpgsql trigger. Still, this is certainly legal behavior for a trigger function, so it's ExecBRUpdateTriggers's fault not plpgsql's. It seems worth creating a test case that exercises returning "old" directly with a C-language trigger; testing this through plpgsql seems unreliable because its behavior might change again. Report and fix by Rushabh Lathia; regression test case by me. Back-patch to all supported branches. Discussion: https://postgr.es/m/CAGPqQf1P4pjiNPrMof=P_16E-DFjt457j+nH2ex3=nBTew7tXw@mail.gmail.com	2018-02-27 13:27:38 -05:00
Peter Eisentraut	1597948c96	Fix application of identity values in some cases Investigation of `2d2d06b7e2` revealed that identity values were not applied in some further cases, including logical replication subscribers, VALUES RTEs, and ALTER TABLE ... ADD COLUMN. To fix all that, apply the identity column expression in build_column_default() instead of repeating the same logic at each call site. For ALTER TABLE ... ADD COLUMN ... IDENTITY, the previous coding completely ignored that existing rows for the new column should have values filled in from the identity sequence. The coding using build_column_default() fails for this because the sequence ownership isn't registered until after ALTER TABLE, and we can't do it before because we don't have the column in the catalog yet. So we specially remember in ColumnDef the sequence name that we decided on and build a custom NextValueExpr using that. Reviewed-by: Michael Paquier <michael.paquier@gmail.com>	2018-02-02 15:06:52 -05:00
Alvaro Herrera	61f08c0163	Fix StoreCatalogInheritance1 to use 32bit inhseqno For no apparent reason, this function was using a 16bit-wide inhseqno value, rather than the correct 32 bit width which is what is stored in the pg_inherits catalog. This becomes evident if you try to create a table with more than 65535 parents, because this error appears: ERROR: duplicate key value violates unique constraint «pg_inherits_relid_seqno_index» DETAIL: Key (inhrelid, inhseqno)=(329371, 0) already exists. Needless to say, having so many parents is an uncommon situation, which explains why this error has never been reported despite having been introduced with the Postgres95 1.01 sources in commit `d31084e9d1`: https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/commands/creatinh.c;hb=d31084e9d111#l349 Backpatch all the way back. David Rowley noticed this while reviewing a patch of mine. Discussion: https://postgr.es/m/CAKJS1f8Dn7swSEhOWwzZzssW7747YB=2Hi+T7uGud40dur69-g@mail.gmail.com	2018-01-19 10:15:08 -03:00
Alvaro Herrera	6d2a9ae0ed	Fix deadlock hazard in CREATE INDEX CONCURRENTLY Multiple sessions doing CREATE INDEX CONCURRENTLY simultaneously are supposed to be able to work in parallel, as evidenced by fixes in commit `c3d09b3bd2` specifically to support this case. In reality, one of the sessions would be aborted by a mysterious "deadlock detected" error. Jeff Janes diagnosed that this is because of leftover snapshots used for system catalog scans -- this was broken by `8aa3e47510` keeping track of (registering) the catalog snapshot. To fix the deadlocks, it's enough to de-register that snapshot prior to waiting. Backpatch to 9.4, which introduced MVCC catalog scans. Include an isolationtester spec that 8 out of 10 times reproduces the deadlock with the unpatched code for me (Álvaro). Author: Jeff Janes Diagnosed-by: Jeff Janes Reported-by: Jeremy Finzel Discussion: https://postgr.es/m/CAMa1XUhHjCv8Qkx0WOr1Mpm_R4qxN26EibwCrj0Oor2YBUFUTg%40mail.gmail.com	2018-01-02 19:16:16 -03:00
Teodor Sigaev	bdbf29aaef	Update relation's stats in pg_class during vacuum full. Hash indexes depend on estimates of the number of tuples and pages in a relation; an incorrect value can cause the index to grow significantly. VACUUM FULL recreates the heap and reindexes all indexes before refreshing the stats. This patch fixes that, so indexes will see correct values. Backpatch to v10 only, because earlier versions do not have usable hash indexes and hash index growth is the only user-visible symptom. Author: Amit Kapila Reviewed-by: Ashutosh Sharma, me Discussion: https://www.postgresql.org/message-id/flat/20171115232922.5tomkxnw3iq6jsg7@inml.weebeastie.net	2017-12-27 18:26:58 +03:00
Andres Freund	d3044f8b07	Perform a lot more sanity checks when freezing tuples. The previous commit has shown that the sanity checks around freezing aren't strong enough. Strengthening them seems especially important because the existence of the bug has caused corruption that we don't want to make even worse during future vacuum cycles. The errors are emitted with ereport rather than elog, despite being "should never happen" messages, so a proper error code is emitted. To avoid superfluous translations, mark messages as internal. Author: Andres Freund and Alvaro Herrera Reviewed-By: Alvaro Herrera, Michael Paquier Discussion: https://postgr.es/m/20171102112019.33wb7g5wp4zpjelu@alap3.anarazel.de Backpatch: 9.3-	2017-12-14 18:20:48 -08:00
Peter Eisentraut	ee5b595493	Apply identity sequence values on COPY A COPY into a table should apply identity sequence values just like it does for ordinary defaults. This was previously forgotten, leading to null values being inserted, which in turn would fail because identity columns have not-null constraints. Author: Michael Paquier <michael.paquier@gmail.com> Reported-by: Steven Winfield <steven.winfield@cantabcapital.com> Bug: #14952	2017-12-08 09:39:55 -05:00
Tom Lane	6448704496	Fix assorted syscache lookup sloppiness in partition-related code. heap_drop_with_catalog and ATExecDetachPartition neglected to check for SearchSysCache failures, as noted in bugs #14927 and #14928 from Pan Bian. Such failures are pretty unlikely, since we should already have some sort of lock on the rel at these points, but it's neither a good idea nor per project style to omit a check for failure. Also, StorePartitionKey contained a syscache lookup that it never did anything with, including never releasing the result. Presumably the reason why we don't see refcount-leak complaints is that the lookup always fails; but in any case it's pretty useless, so remove it. All of these errors were evidently introduced by the relation partitioning feature. Back-patch to v10 where that came in. Amit Langote and Tom Lane Discussion: https://postgr.es/m/20171127090105.1463.3962@wrigleys.postgresql.org Discussion: https://postgr.es/m/20171127091341.1468.72696@wrigleys.postgresql.org	2017-11-27 19:22:08 -05:00
Noah Misch	2168f37c4d	Ignore CatalogSnapshot when checking COPY FREEZE prerequisites. This restores the ability, essentially lost in commit `ffaa44cb55`, to use COPY FREEZE under REPEATABLE READ isolation. Back-patch to 9.4, like that commit. Reviewed by Tom Lane. Discussion: https://postgr.es/m/CA+TgmoahWDm-7fperBxzU9uZ99LPMUmEpSXLTw9TmrOgzwnORw@mail.gmail.com	2017-11-05 09:25:59 -08:00
Alvaro Herrera	7a95966bc0	Revert bogus fixes of HOT-freezing bug It turns out we misdiagnosed what the real problem was. Revert the previous changes, because they may have worse consequences going forward. A better fix is forthcoming. The simplistic test case is kept, though disabled. Discussion: https://postgr.es/m/20171102112019.33wb7g5wp4zpjelu@alap3.anarazel.de	2017-11-02 15:51:05 +01:00
Tom Lane	f4cdf781a1	Fix low-probability loss of NOTIFY messages due to XID wraparound. Up to now async.c has used TransactionIdIsInProgress() to detect whether a notify message's source transaction is still running. However, that function has a quick-exit path that reports that XIDs before RecentXmin are no longer running. If a listening backend is doing nothing but listening, and not running any queries, there is nothing that will advance its value of RecentXmin. Once 2 billion transactions elapse, the RecentXmin check causes active transactions to be reported as not running. If they aren't committed yet according to CLOG, async.c decides they aborted and discards their messages. The timing for that is a bit tight but it can happen when multiple backends are sending notifies concurrently. The net symptom therefore is that a sufficiently-long-surviving listen-only backend starts to miss some fraction of NOTIFY traffic, but only under heavy load. The only function that updates RecentXmin is GetSnapshotData(). A brute-force fix would therefore be to take a snapshot before processing incoming notify messages. But that would add cycles, as well as contention for the ProcArrayLock. We can be smarter: having taken the snapshot, let's use that to check for running XIDs, and not call TransactionIdIsInProgress() at all. In this way we reduce the number of ProcArrayLock acquisitions from one per message to one per notify interrupt; that's the same under light load but should be a benefit under heavy load. Light testing says that this change is a wash performance-wise for normal loads. I looked around for other callers of TransactionIdIsInProgress() that might be at similar risk, and didn't find any; all of them are inside transactions that presumably have already taken a snapshot. Problem report and diagnosis by Marko Tiikkaja, patch by me. Back-patch to all supported branches, since it's been like this since 9.0. 
Discussion: https://postgr.es/m/20170926182935.14128.65278@wrigleys.postgresql.org	2017-10-11 14:28:33 -04:00
Tom Lane	2aab70205b	Fix inadequate locking during get_rel_oids(). get_rel_oids used to not take any relation locks at all, but that stopped being a good idea with commit `3c3bb9933`, which inserted a syscache lookup into the function. A concurrent DROP TABLE could now produce "cache lookup failed", which we don't want to have happen in normal operation. The best solution seems to be to transiently take a lock on the relation named by the RangeVar (which also makes the result of RangeVarGetRelid a lot less spongy). But we shouldn't hold the lock beyond this function, because we don't want VACUUM to lock more than one table at a time. (That would not be a big problem right now, but it will become one after the pending feature patch to allow multiple tables to be named in VACUUM.) In passing, adjust vacuum_rel and analyze_rel to document that we don't trust the passed RangeVar to be accurate, and allow the RangeVar to possibly be NULL --- which it is anyway for a whole-database VACUUM, though we accidentally didn't crash for that case. The passed RangeVar is in fact inaccurate when dealing with a child partition, as of v10, and it has been wrong for a whole long time in the case of vacuum_rel() recursing to a TOAST table. None of these things present visible bugs up to now, because the passed RangeVar is in fact only consulted for autovacuum logging, and in that particular context it's always accurate because autovacuum doesn't let vacuum.c expand partitions nor recurse to toast tables. Still, this seems like trouble waiting to happen, so let's nail the door at least partly shut. (Further cleanup is planned, in HEAD only, as part of the pending feature patch.) Fix some sadly inaccurate/obsolete comments too. Back-patch to v10. Michael Paquier and Tom Lane Discussion: https://postgr.es/m/25023.1506107590@sss.pgh.pa.us	2017-09-29 16:26:21 -04:00
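
The reasoning in commit `1bfb567230` above (ANALYZE should extrapolate from its sample rather than trust the old density for unvisited pages) can be illustrated with a small numeric sketch. This is a simplified model written from the commit message, not PostgreSQL's C code: the function names and all numbers are hypothetical, and the "blended" formula is only the behaviour as the message describes it.

```python
def blended_estimate(old_density, total_pages, scanned_pages, scanned_tuples):
    """Old ANALYZE behaviour as described in the commit message (simplified):
    trust the sample only for the pages actually visited, and keep the
    previous tuples-per-page density for every unvisited page."""
    unvisited_pages = total_pages - scanned_pages
    return old_density * unvisited_pages + scanned_tuples


def extrapolated_estimate(total_pages, scanned_pages, scanned_tuples):
    """New ANALYZE behaviour: assume the sampled pages model the whole table."""
    if scanned_pages == 0:
        return 0.0
    return scanned_tuples / scanned_pages * total_pages


# Hypothetical table: the stored density is stale (100 tuples/page),
# while the sampled pages actually hold 10 tuples/page.
old_density = 100.0
total_pages = 1_000_000
scanned_pages = 30_000
scanned_tuples = 300_000

blended = blended_estimate(old_density, total_pages, scanned_pages, scanned_tuples)
extrapolated = extrapolated_estimate(total_pages, scanned_pages, scanned_tuples)

print(f"blended:      {blended:,.0f}")
print(f"extrapolated: {extrapolated:,.0f}")
```

With these made-up inputs the blended estimate stays close to the stale value because only 3% of the pages were visited, which matches the message's point that ANALYZE "cannot slew the overall density estimate very far at all" on a very large table.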

3346 Commits
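
Commit `e2ed3c4a30` in the list above turns on the difference between WHERE and CHECK semantics for NULL: a WHERE clause accepts a row only when the condition is true, while a CHECK constraint rejects a row only when the condition is false. The toy model below, using `None` for SQL NULL, shows why dropping a constant-NULL arm from an OR is safe under WHERE semantics but not under CHECK semantics. It is an illustration only; the helper names are hypothetical and it does not reproduce `canonicalize_qual()` itself, which operates on expression trees in C.

```python
def or3(a, b):
    """SQL three-valued OR, with None standing in for NULL."""
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None
    return False


def where_accepts(value):
    """WHERE semantics: NULL is treated like FALSE."""
    return value is True


def check_accepts(value):
    """CHECK semantics: only an outright FALSE violates the constraint."""
    return value is not False


def dropping_null_is_safe(accepts):
    """Does simplifying `NULL OR b` to `b` preserve the outcome for every b?"""
    return all(
        accepts(or3(None, b)) == accepts(b)
        for b in (True, False, None)
    )


print("safe under WHERE:", dropping_null_is_safe(where_accepts))
print("safe under CHECK:", dropping_null_is_safe(check_accepts))
```

The case that differs is `b = False`: `NULL OR FALSE` evaluates to NULL, which a CHECK constraint lets through, whereas the simplified expression `FALSE` would reject the row.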