postgres

mirror of https://github.com/postgres/postgres.git synced 2025-11-01 21:31:19 +03:00

Author	SHA1	Message	Date
Tom Lane	0a459cec96	Support all SQL:2011 options for window frame clauses. This patch adds the ability to use "RANGE offset PRECEDING/FOLLOWING" frame boundaries in window functions. We'd punted on that back in the original patch to add window functions, because it was not clear how to do it in a reasonably data-type-extensible fashion. That problem is resolved here by adding the ability for btree operator classes to provide an "in_range" support function that defines how to add or subtract the RANGE offset value. Factoring it this way also allows the operator class to avoid overflow problems near the ends of the datatype's range, if it wishes to expend effort on that. (In the committed patch, the integer opclasses handle that issue, but it did not seem worth the trouble to avoid overflow failures for datetime types.) The patch includes in_range support for the integer_ops opfamily (int2/int4/int8) as well as the standard datetime types. Support for other numeric types has been requested, but that seems like suitable material for a follow-on patch. In addition, the patch adds GROUPS mode which counts the offset in ORDER-BY peer groups rather than rows, and it adds the frame_exclusion options specified by SQL:2011. As far as I can see, we are now fully up to spec on window framing options. Existing behaviors remain unchanged, except that I changed the errcode for a couple of existing error reports to meet the SQL spec's expectation that negative "offset" values should be reported as SQLSTATE 22013. Internally and in relevant parts of the documentation, we now consistently use the terminology "offset PRECEDING/FOLLOWING" rather than "value PRECEDING/FOLLOWING", since the term "value" is confusingly vague. Oliver Ford, reviewed and whacked around some by me Discussion: https://postgr.es/m/CAGMVOdu9sivPAxbNN0X+q19Sfv9edEPv=HibOJhB14TJv_RCQg@mail.gmail.com	2018-02-07 00:06:56 -05:00
Tom Lane	35a528062c	Add stack-overflow guards in set-operation planning. create_plan_recurse lacked any stack depth check. This is not per our normal coding rules, but I'd supposed it was safe because earlier planner processing is more complex and presumably should eat more stack. But bug #15033 from Andrew Grossman shows this isn't true, at least not for queries having the form of a many-thousand-way INTERSECT stack. Further testing showed that recurse_set_operations is also capable of being crashed in this way, since it likewise will recurse to the bottom of a parsetree before calling any support functions that might themselves contain any stack checks. However, its stack consumption is only perhaps a third of create_plan_recurse's. It's possible that this particular problem with create_plan_recurse can only manifest in 9.6 and later, since before that we didn't build a Path tree for set operations. But having seen this example, I now have no faith in the proposition that create_plan_recurse doesn't need a stack check, so back-patch to all supported branches. Discussion: https://postgr.es/m/20180127050845.28812.58244@wrigleys.postgresql.org	2018-01-28 13:39:07 -05:00
Robert Haas	2f17844104	Allow UPDATE to move rows between partitions. When an UPDATE causes a row to no longer match the partition constraint, try to move it to a different partition where it does match the partition constraint. In essence, the UPDATE is split into a DELETE from the old partition and an INSERT into the new one. This can lead to surprising behavior in concurrency scenarios because EvalPlanQual rechecks won't work as they normally did; the known problems are documented. (There is a pending patch to improve the situation further, but it needs more review.) Amit Khandekar, reviewed and tested by Amit Langote, David Rowley, Rajkumar Raghuwanshi, Dilip Kumar, Amul Sul, Thomas Munro, Álvaro Herrera, Amit Kapila, and me. A few final revisions by me. Discussion: http://postgr.es/m/CAJ3gD9do9o2ccQ7j7+tSgiE1REY65XRiMb=yJO3u3QhyP8EEPQ@mail.gmail.com	2018-01-19 15:33:06 -05:00
Bruce Momjian	9d4649ca49	Update copyright for 2018 Backpatch-through: certain files through 9.3	2018-01-02 23:30:12 -05:00
Tom Lane	c4c2885cbb	Fix UNION/INTERSECT/EXCEPT over no columns. Since 9.4, we've allowed the syntax "select union select" and variants of that. However, the planner wasn't expecting a no-column set operation and ended up treating the set operation as if it were UNION ALL. Turns out it's trivial to fix in v10 and later; we just need to be careful about not generating a Sort node with no sort keys. However, since a weird corner case like this is never going to be exercised by developers, we'd better have thorough regression tests if we want to consider it supported. Per report from Victor Yegorov. Discussion: https://postgr.es/m/CAGnEbojGJrRSOgJwNGM7JSJZpVAf8xXcVPbVrGdhbVEHZ-BUMw@mail.gmail.com	2017-12-22 12:08:06 -05:00
Andres Freund	1804284042	Add parallel-aware hash joins. Introduce parallel-aware hash joins that appear in EXPLAIN plans as Parallel Hash Join with Parallel Hash. While hash joins could already appear in parallel queries, they were previously always parallel-oblivious and had a partial subplan only on the outer side, meaning that the work of the inner subplan was duplicated in every worker. After this commit, the planner will consider using a partial subplan on the inner side too, using the Parallel Hash node to divide the work over the available CPU cores and combine its results in shared memory. If the join needs to be split into multiple batches in order to respect work_mem, then workers process different batches as much as possible and then work together on the remaining batches. The advantages of a parallel-aware hash join over a parallel-oblivious hash join used in a parallel query are that it: * avoids wasting memory on duplicated hash tables * avoids wasting disk space on duplicated batch files * divides the work of building the hash table over the CPUs One disadvantage is that there is some communication between the participating CPUs which might outweigh the benefits of parallelism in the case of small hash tables. This is avoided by the planner's existing reluctance to supply partial plans for small scans, but it may be necessary to estimate synchronization costs in future if that situation changes. Another is that outer batch 0 must be written to disk if multiple batches are required. A potential future advantage of parallel-aware hash joins is that right and full outer joins could be supported, since there is a single set of matched bits for each hashtable, but that is not yet implemented. A new GUC enable_parallel_hash is defined to control the feature, defaulting to on. Author: Thomas Munro Reviewed-By: Andres Freund, Robert Haas Tested-By: Rafia Sabih, Prabhat Sahu Discussion: https://postgr.es/m/CAEepm=2W=cOkiZxcg6qiFQP-dHUe09aqTrEMM7yJDrHMhDv_RA@mail.gmail.com https://postgr.es/m/CAEepm=37HKyJ4U6XOLi=JgfSHM3o6B-GaeO-6hkOmneTDkH+Uw@mail.gmail.com	2017-12-21 00:43:41 -08:00
Robert Haas	ab72716778	Support Parallel Append plan nodes. When we create an Append node, we can spread out the workers over the subplans instead of piling on to each subplan one at a time, which should typically be a bit more efficient, both because the startup cost of any plan executed entirely by one worker is paid only once and also because of reduced contention. We can also construct Append plans using a mix of partial and non-partial subplans, which may allow for parallelism in places that otherwise couldn't support it. Unfortunately, this patch doesn't handle the important case of parallelizing UNION ALL by running each branch in a separate worker; the executor infrastructure is added here, but more planner work is needed. Amit Khandekar, Robert Haas, Amul Sul, reviewed and tested by Ashutosh Bapat, Amit Langote, Rafia Sabih, Amit Kapila, and Rajkumar Raghuwanshi. Discussion: http://postgr.es/m/CAJ3gD9dy0K_E8r727heqXoBmWZ83HwLFwdcaSSmBQ1+S+vRuUQ@mail.gmail.com	2017-12-05 17:28:39 -05:00
Robert Haas	e89a71fb44	Pass InitPlan values to workers via Gather (Merge). If a PARAM_EXEC parameter is used below a Gather (Merge) but the InitPlan that computes it is attached to or above the Gather (Merge), force the value to be computed before starting parallelism and pass it down to all workers. This allows us to use parallelism in cases where it previously would have had to be rejected as unsafe. We do - in this case - lose the optimization that the value is only computed if it's actually used. An alternative strategy would be to have the first worker that needs the value compute it, but one downside of that approach is that we'd then need to select a parallel-safe path to compute the parameter value; it couldn't for example contain a Gather (Merge) node. At some point in the future, we might want to consider both approaches. Independent of that consideration, there is a great deal more work that could be done to make more kinds of PARAM_EXEC parameters parallel-safe. This infrastructure could be used to allow a Gather (Merge) on the inner side of a nested loop (although that's not a very appealing plan) and cases where the InitPlan is attached below the Gather (Merge) could be addressed as well using various techniques. But this is a good start. Amit Kapila, reviewed and revised by me. Reviewing and testing from Kuntal Ghosh, Haribabu Kommi, and Tushar Ahuja. Discussion: http://postgr.es/m/CAA4eK1LV0Y1AUV4cUCdC+sYOx0Z0-8NAJ2Pd9=UKsbQ5Sr7+JQ@mail.gmail.com	2017-11-16 12:06:14 -05:00
Peter Eisentraut	2eb4a831e5	Change TRUE/FALSE to true/false The lower case spellings are C and C++ standard and are used in most parts of the PostgreSQL sources. The upper case spellings are only used in some files/modules. So standardize on the standard spellings. The APIs for ICU, Perl, and Windows define their own TRUE and FALSE, so those are left as is when using those APIs. In code comments, we use the lower-case spelling for the C concepts and keep the upper-case spelling for the SQL concepts. Reviewed-by: Michael Paquier <michael.paquier@gmail.com>	2017-11-08 11:37:28 -05:00
Tom Lane	7c70996ebf	Allow bitmap scans to operate as index-only scans when possible. If we don't have to return any columns from heap tuples, and there's no need to recheck qual conditions, and the heap page is all-visible, then we can skip fetching the heap page altogether. Skip prefetching pages too, when possible, on the assumption that the recheck flag will remain the same from one page to the next. While that assumption is hardly bulletproof, it seems like a good bet most of the time, and better than prefetching pages we don't need. This commit installs the executor infrastructure, but doesn't change any planner cost estimates, thus possibly causing bitmap scans to not be chosen in cases where this change renders them the best choice. I (tgl) am not entirely convinced that we need to account for this behavior in the planner, because I think typically the bitmap scan would get chosen anyway if it's the best bet. In any case the submitted patch took way too many shortcuts, resulting in too many clearly-bad choices, to be committable. Alexander Kuzmenkov, reviewed by Alexey Chernyshov, and whacked around rather heavily by me. Discussion: https://postgr.es/m/239a8955-c0fc-f506-026d-c837e86c827b@postgrespro.ru	2017-11-01 17:38:20 -04:00
Robert Haas	6393613b6a	Fix possible crash with Parallel Bitmap Heap Scan. If a Parallel Bitmap Heap scan's chain of leftmost descendents includes a BitmapOr whose first child is a BitmapAnd, the prior coding would mistakenly create a non-shared TIDBitmap and then try to perform shared iteration. Report by Tomas Vondra. Patch by Dilip Kumar. Discussion: http://postgr.es/m/50e89684-8ad9-dead-8767-c9545bafd3b6@2ndquadrant.com	2017-10-13 15:02:45 -04:00
Robert Haas	f49842d1ee	Basic partition-wise join functionality. Instead of joining two partitioned tables in their entirety we can, if it is an equi-join on the partition keys, join the matching partitions individually. This involves teaching the planner about "other join" rels, which are related to regular join rels in the same way that other member rels are related to baserels. This can use significantly more CPU time and memory than regular join planning, because there may now be a set of "other" rels not only for every base relation but also for every join relation. In most practical cases, this probably shouldn't be a problem, because (1) it's probably unusual to join many tables each with many partitions using the partition keys for all joins and (2) if you do that scenario then you probably have a big enough machine to handle the increased memory cost of planning and (3) the resulting plan is highly likely to be better, so what you spend in planning you'll make up on the execution side. All the same, for now, turn this feature off by default. Currently, we can only perform joins between two tables whose partitioning schemes are absolutely identical. It would be nice to cope with other scenarios, such as extra partitions on one side or the other with no match on the other side, but that will have to wait for a future patch. Ashutosh Bapat, reviewed and tested by Rajkumar Raghuwanshi, Amit Langote, Rafia Sabih, Thomas Munro, Dilip Kumar, Antonin Houska, Amit Khandekar, and by me. A few final adjustments by me. Discussion: http://postgr.es/m/CAFjFpRfQ8GrQvzp3jA2wnLqrHmaXna-urjm_UY9BqXj=EaDTSA@mail.gmail.com Discussion: http://postgr.es/m/CAFjFpRcitjfrULr5jfuKWRPsGUX0LQ0k8-yG0Qw2+1LBGNpMdw@mail.gmail.com	2017-10-06 11:11:10 -04:00
Tom Lane	7df2c1f8da	Force rescanning of parallel-aware scan nodes below a Gather[Merge]. The ExecReScan machinery contains various optimizations for postponing or skipping rescans of plan subtrees; for example a HashAgg node may conclude that it can re-use the table it built before, instead of re-reading its input subtree. But that is wrong if the input contains a parallel-aware table scan node, since the portion of the table scanned by the leader process is likely to vary from one rescan to the next. This explains the timing-dependent buildfarm failures we saw after commit `a2b70c89c`. The established mechanism for showing that a plan node's output is potentially variable is to mark it as depending on some runtime Param. Hence, to fix this, invent a dummy Param (one that has a PARAM_EXEC parameter number, but carries no actual value) associated with each Gather or GatherMerge node, mark parallel-aware nodes below that node as dependent on that Param, and arrange for ExecReScanGather[Merge] to flag that Param as changed whenever the Gather[Merge] node is rescanned. This solution breaks an undocumented assumption made by the parallel executor logic, namely that all rescans of nodes below a Gather[Merge] will happen synchronously during the ReScan of the top node itself. But that's fundamentally contrary to the design of the ExecReScan code, and so was doomed to fail someday anyway (even if you want to argue that the bug being fixed here wasn't a failure of that assumption). A follow-on patch will address that issue. In the meantime, the worst that's expected to happen is that given very bad timing luck, the leader might have to do all the work during a rescan, because workers think they have nothing to do, if they are able to start up before the eventual ReScan of the leader's parallel-aware table scan node has reset the shared scan state. Although this problem exists in 9.6, there does not seem to be any way for it to manifest there. Without GatherMerge, it seems that a plan tree that has a rescan-short-circuiting node below Gather will always also have one above it that will short-circuit in the same cases, preventing the Gather from being rescanned. Hence we won't take the risk of back-patching this change into 9.6. But v10 needs it. Discussion: https://postgr.es/m/CAA4eK1JkByysFJNh9M349u_nNjqETuEnY_y1VUc_kJiU0bxtaQ@mail.gmail.com	2017-08-30 09:29:55 -04:00
Robert Haas	7086be6e36	When WCOs are present, disable direct foreign table modification. If the user modifies a view that has CHECK OPTIONs and this gets translated into a modification to an underlying relation which happens to be a foreign table, the check options should be enforced. In the normal code path, that was happening properly, but it was not working properly for "direct" modification because the whole operation gets pushed to the remote side in that case and we never have an option to enforce the constraint against individual tuples. Fix by disabling direct modification when there is a need to enforce CHECK OPTIONs. Etsuro Fujita, reviewed by Kyotaro Horiguchi and by me. Discussion: http://postgr.es/m/f8a48f54-6f02-9c8a-5250-9791603171ee@lab.ntt.co.jp	2017-07-24 15:57:24 -04:00
Tom Lane	278cb43411	Be more consistent about errors for opfamily member lookup failures. Add error checks in some places that were calling get_opfamily_member or get_opfamily_proc and just assuming that the call could never fail. Also, standardize the wording for such errors in some other places. None of these errors are expected in normal use, hence they're just elog not ereport. But they may be handy for diagnosing omissions in custom opclasses. Rushabh Lathia found the oversight in RelationBuildPartitionKey(); I found the others by grepping for all callers of these functions. Discussion: https://postgr.es/m/CAGPqQf2R9Nk8htpv0FFi+FP776EwMyGuORpc9zYkZKC8sFQE3g@mail.gmail.com	2017-07-24 11:23:27 -04:00
Tom Lane	382ceffdf7	Phase 3 of pgindent updates. Don't move parenthesized lines to the left, even if that means they flow past the right margin. By default, BSD indent lines up statement continuation lines that are within parentheses so that they start just to the right of the preceding left parenthesis. However, traditionally, if that resulted in the continuation line extending to the right of the desired right margin, then indent would push it left just far enough to not overrun the margin, if it could do so without making the continuation line start to the left of the current statement indent. That makes for a weird mix of indentations unless one has been completely rigid about never violating the 80-column limit. This behavior has been pretty universally panned by Postgres developers. Hence, disable it with indent's new -lpl switch, so that parenthesized lines are always lined up with the preceding left paren. This patch is much less interesting than the first round of indent changes, but also bulkier, so I thought it best to separate the effects. Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us	2017-06-21 15:35:54 -04:00
Tom Lane	c7b8998ebb	Phase 2 of pgindent updates. Change pg_bsd_indent to follow upstream rules for placement of comments to the right of code, and remove pgindent hack that caused comments following #endif to not obey the general rule. Commit `e3860ffa4d` wasn't actually using the published version of pg_bsd_indent, but a hacked-up version that tried to minimize the amount of movement of comments to the right of code. The situation of interest is where such a comment has to be moved to the right of its default placement at column 33 because there's code there. BSD indent has always moved right in units of tab stops in such cases --- but in the previous incarnation, indent was working in 8-space tab stops, while now it knows we use 4-space tabs. So the net result is that in about half the cases, such comments are placed one tab stop left of before. This is better all around: it leaves more room on the line for comment text, and it means that in such cases the comment uniformly starts at the next 4-space tab stop after the code, rather than sometimes one and sometimes two tabs after. Also, ensure that comments following #endif are indented the same as comments following other preprocessor commands such as #else. That inconsistency turns out to have been self-inflicted damage from a poorly-thought-through post-indent "fixup" in pgindent. This patch is much less interesting than the first round of indent changes, but also bulkier, so I thought it best to separate the effects. Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us	2017-06-21 15:19:25 -04:00
Tom Lane	9ef2dbefc7	Final pgindent run with old pg_bsd_indent (version 1.3). This is just to have a clean basis for comparison with the results of the new version (which will indeed end up reverting some of these changes...) Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us	2017-06-21 14:09:24 -04:00
Robert Haas	b08df9cab7	Teach predtest.c about CHECK clauses to fix partitioning bugs. In a CHECK clause, a null result means true, whereas in a WHERE clause it means false. predtest.c provided different functions depending on which set of semantics applied to the predicate being proved, but had no option to control what a null meant in the clauses provided as axioms. Add one. Use that in the partitioning code when figuring out whether the validation scan on a new partition can be skipped. Rip out the old logic that attempted (not very successfully) to compensate for the absence of the necessary support in predtest.c. Ashutosh Bapat and Robert Haas, reviewed by Amit Langote and incorporating feedback from Tom Lane. Discussion: http://postgr.es/m/CAFjFpReT_kq_uwU_B8aWDxR7jNGE=P0iELycdq5oupi=xSQTOw@mail.gmail.com	2017-06-14 13:13:11 -04:00
Bruce Momjian	a6fd7b7a5f	Post-PG 10 beta1 pgindent run perltidy run not included.	2017-05-17 16:31:56 -04:00
Tom Lane	f674743487	Remove no-longer-needed fields of Hash plan nodes. skewColType/skewColTypmod are no longer used in the wake of commit `9aab83fc5`, and seem unlikely to be wanted in future, so let's drop 'em. Discussion: https://postgr.es/m/16364.1494520862@sss.pgh.pa.us	2017-05-14 11:07:40 -04:00
Robert Haas	e180c8aa8c	Fire per-statement triggers on partitioned tables. Even though no actual tuples are ever inserted into a partitioned table (the actual tuples are in the partitions, not the partitioned table itself), we still need to have a ResultRelInfo for the partitioned table, or per-statement triggers won't get fired. Amit Langote, per a report from Rajkumar Raghuwanshi. Reviewed by me. Discussion: http://postgr.es/m/CAKcux6%3DwYospCRY2J4XEFuVy0L41S%3Dfic7rmkbsU-GXhhSbmBg%40mail.gmail.com	2017-05-01 08:23:01 -04:00
Tom Lane	76799fc89d	Always build a custom plan node's targetlist from the path's pathtarget. We were applying the use_physical_tlist optimization to all relation scan plans, even those implemented by custom scan providers. However, that's a bad idea for a couple of reasons. The custom provider might be unable to provide columns that it hadn't expected to be asked for (for example, the custom scan might depend on an index-only scan). Even more to the point, there's no good reason to suppose that this "optimization" is a win for a custom scan; whatever the custom provider is doing is likely not based on simply returning physical heap tuples. (As a counterexample, if the custom scan is an interface to a column store, demanding all columns would be a huge loss.) If it is a win, the custom provider could make that decision for itself and insert a suitable pathtarget into the path, anyway. Per discussion with Dmitry Ivanov. Back-patch to 9.5 where custom scan support was introduced. The argument that the custom provider can adjust the behavior by changing the pathtarget only applies to 9.6+, but on balance it seems more likely that use_physical_tlist will hurt custom scans than help them. Discussion: https://postgr.es/m/e29ddd30-8ef9-4da5-a50b-2bb7b8c7198d@postgrespro.ru	2017-04-17 15:29:15 -04:00
Tom Lane	003d80f3df	Mark finished Plan nodes with parallel_safe flags. We'd managed to avoid doing this so far, but it seems pretty obvious that it would be forced on us some day, and this is much the cleanest way of approaching the open problem that parallel-unsafe subplans are being transmitted to parallel workers. Anyway there's no space cost due to alignment considerations, and the time cost is pretty minimal since we're just copying the flag from the corresponding Path node. (At least in most cases ... some of the klugier spots in createplan.c have to work a bit harder.) In principle we could perhaps get rid of SubPlan.parallel_safe, but I thought it better to keep that in case there are reasons to consider a SubPlan unsafe even when its child plan is parallel-safe. This patch doesn't actually do anything with the new flags, but I thought I'd commit it separately anyway. Note: although this touches outfuncs/readfuncs, there's no need for a catversion bump because Plan trees aren't stored on disk. Discussion: https://postgr.es/m/87tw5x4vcu.fsf@credativ.de	2017-04-12 15:13:34 -04:00
Tom Lane	8f0530f580	Improve castNode notation by introducing list-extraction-specific variants. This extends the castNode() notation introduced by commit `5bcab1114` to provide, in one step, extraction of a list cell's pointer and coercion to a concrete node type. For example, "lfirst_node(Foo, lc)" is the same as "castNode(Foo, lfirst(lc))". Almost half of the uses of castNode that have appeared so far include a list extraction call, so this is pretty widely useful, and it saves a few more keystrokes compared to the old way. As with the previous patch, back-patch the addition of these macros to pg_list.h, so that the notation will be available when back-patching. Patch by me, after an idea of Andrew Gierth's. Discussion: https://postgr.es/m/14197.1491841216@sss.pgh.pa.us	2017-04-10 13:51:53 -04:00
Tom Lane	9c7f5229ad	Optimize joins when the inner relation can be proven unique. If there can certainly be no more than one matching inner row for a given outer row, then the executor can move on to the next outer row as soon as it's found one match; there's no need to continue scanning the inner relation for this outer row. This saves useless scanning in nestloop and hash joins. In merge joins, it offers the opportunity to skip mark/restore processing, because we know we have not advanced past the first possible match for the next outer row. Of course, the devil is in the details: the proof of uniqueness must depend only on joinquals (not otherquals), and if we want to skip mergejoin mark/restore then it must depend only on merge clauses. To avoid adding more planning overhead than absolutely necessary, the present patch errs in the conservative direction: there are cases where inner_unique or skip_mark_restore processing could be used, but it will not do so because it's not sure that the uniqueness proof depended only on "safe" clauses. This could be improved later. David Rowley, reviewed and rather heavily editorialized on by me Discussion: https://postgr.es/m/CAApHDvqF6Sw-TK98bW48TdtFJ+3a7D2mFyZ7++=D-RyPsL76gw@mail.gmail.com	2017-04-07 22:20:13 -04:00
Robert Haas	7a39b5e4d1	Abstract logic to allow for multiple kinds of child rels. Currently, the only type of child relation is an "other member rel", which is the child of a baserel, but in the future joins and even upper relations may have child rels. To facilitate that, introduce macros that test to test for particular RelOptKind values, and use them in various places where they help to clarify the sense of a test. (For example, a test may allow RELOPT_OTHER_MEMBER_REL either because it intends to allow child rels, or because it intends to allow simple rels.) Also, remove find_childrel_top_parent, which will not work for a child rel that is not a baserel. Instead, add a new RelOptInfo member top_parent_relids to track the same kind of information in a more generic manner. Ashutosh Bapat, slightly tweaked by me. Review and testing of the patch set from which this was taken by Rajkumar Raghuwanshi and Rafia Sabih. Discussion: http://postgr.es/m/CA+TgmoagTnF2yqR3PT2rv=om=wJiZ4-A+ATwdnriTGku1CLYxA@mail.gmail.com	2017-04-03 22:41:31 -04:00
Kevin Grittner	18ce3a4ab2	Add infrastructure to support EphemeralNamedRelation references. A QueryEnvironment concept is added, which allows new types of objects to be passed into queries from parsing on through execution. At this point, the only thing implemented is a collection of EphemeralNamedRelation objects -- relations which can be referenced by name in queries, but do not exist in the catalogs. The only type of ENR implemented is NamedTuplestore, but provision is made to add more types fairly easily. An ENR can carry its own TupleDesc or reference a relation in the catalogs by relid. Although these features can be used without SPI, convenience functions are added to SPI so that ENRs can easily be used by code run through SPI. The initial use of all this is going to be transition tables in AFTER triggers, but that will be added to each PL as a separate commit. An incidental effect of this patch is to produce a more informative error message if an attempt is made to modify the contents of a CTE from a referencing DML statement. No tests previously covered that possibility, so one is added. Kevin Grittner and Thomas Munro Reviewed by Heikki Linnakangas, David Fetter, and Thomas Munro with valuable comments and suggestions from many others	2017-03-31 23:17:18 -05:00
Robert Haas	7d8f6986b8	Fix parallel query so it doesn't spoil row estimates above Gather. Commit `45be99f8cd` removed GatherPath's num_workers field, but this is entirely bogus. Normally, a path's parallel_workers flag is supposed to indicate the number of workers that it wants, and should be 0 for a non-partial path. In that commit, I mistakenly thought that GatherPath could also use that field to indicate the number of workers that it would try to start, but that's disastrous, because then it can propagate up to higher nodes in the plan tree, which will then get incorrect rowcounts because the parallel_workers flag is involved in computing those values. Repair by putting the separate field back. Report by Tomas Vondra. Patch by me, reviewed by Amit Kapila. Discussion: http://postgr.es/m/f91b4a44-f739-04bd-c4b6-f135bd643669@2ndquadrant.com	2017-03-31 21:01:20 -04:00
Peter Eisentraut	4cb824699e	Cast result of copyObject() to correct type copyObject() is declared to return void , which allows easily assigning the result independent of the input, but it loses all type checking. If the compiler supports typeof or something similar, cast the result to the input type. This creates a greater amount of type safety. In some cases, where the result is assigned to a generic type such as Node or Expr *, new casts are now necessary, but in general casts are now unnecessary in the normal case and indicate that something unusual is happening. Reviewed-by: Mark Dilger <hornschnorter@gmail.com>	2017-03-28 21:59:23 -04:00
Andrew Gierth	b5635948ab	Support hashed aggregation with grouping sets. This extends the Aggregate node with two new features: HashAggregate can now run multiple hashtables concurrently, and a new strategy MixedAggregate populates hashtables while doing sorted grouping. The planner will now attempt to save as many sorts as possible when planning grouping sets queries, while not exceeding work_mem for the estimated combined sizes of all hashtables used. No SQL-level changes are required. There should be no user-visible impact other than the new EXPLAIN output and possible changes to result ordering when ORDER BY was not used (which affected a few regression tests). The enable_hashagg option is respected. Author: Andrew Gierth Reviewers: Mark Dilger, Andres Freund Discussion: https://postgr.es/m/87vatszyhj.fsf@news-spur.riddles.org.uk	2017-03-27 04:20:54 +01:00
Robert Haas	d3cc37f1d8	Don't scan partitioned tables. Partitioned tables do not contain any data; only their unpartitioned descendents need to be scanned. However, the partitioned tables still need to be locked, even though they're not scanned. To make that work, Append and MergeAppend relations now need to carry a list of (unscanned) partitioned relations that must be locked, and InitPlan must lock all partitioned result relations. Aside from the obvious advantage of avoiding some work at execution time, this has two other advantages. First, it may improve the planner's decision-making in some cases since the empty relation might throw things off. Second, it paves the way to getting rid of the storage for partitioned tables altogether. Amit Langote, reviewed by me. Discussion: http://postgr.es/m/6837c359-45c4-8044-34d1-736756335a15@lab.ntt.co.jp	2017-03-21 09:48:04 -04:00
Peter Eisentraut	f97a028d8e	Spelling fixes in code comments From: Josh Soref <jsoref@gmail.com>	2017-03-14 12:58:39 -04:00
Robert Haas	a82178020d	Update overlooked comment for Gather Merge. Commit `355d3993c5` probably should have done this, but nobody noticed that it was needed.	2017-03-14 07:52:11 -04:00
Robert Haas	bce352fb46	Remove some bogus logic from create_gather_merge_plan. This logic was adapated from create_merge_append_plan, but the two cases aren't really analogous, because create_merge_append_plan is not projection-capable and must therefore have a tlist identical to that of the underlying paths. Overwriting the tlist of Gather Merge with whatever the underlying plan happens to produce is no good at all. Patch by me, reviewed by Rushabh Lathia, who also reported the issue and made an initial attempt at a fix. Discussion: http://postgr.es/m/CA+Tgmob_-oHEOBfT9S25bjqokdqv8e8xEmh9zOY+3MPr_LmuhA@mail.gmail.com	2017-03-14 07:43:45 -04:00
Robert Haas	0ee92e1c9b	Fix a couple of planner bugs in Gather Merge. Neha Sharma reported these to Rushabh Lathia just after I commit `355d3993c5` went in. The patch is Rushabh's, with input from me.	2017-03-09 12:06:49 -05:00
Robert Haas	355d3993c5	Add a Gather Merge executor node. Like Gather, we spawn multiple workers and run the same plan in each one; however, Gather Merge is used when each worker produces the same output ordering and we want to preserve that output ordering while merging together the streams of tuples from various workers. (In a way, Gather Merge is like a hybrid of Gather and MergeAppend.) This works out to a win if it saves us from having to perform an expensive Sort. In cases where only a small amount of data would need to be sorted, it may actually be faster to use a regular Gather node and then sort the results afterward, because Gather Merge sometimes needs to wait synchronously for tuples whereas a pure Gather generally doesn't. But if this avoids an expensive sort then it's a win. Rushabh Lathia, reviewed and tested by Amit Kapila, Thomas Munro, and Neha Sharma, and reviewed and revised by me. Discussion: http://postgr.es/m/CAGPqQf09oPX-cQRpBKS0Gq49Z+m6KBxgxd_p9gX8CKk_d75HoQ@mail.gmail.com	2017-03-09 07:49:29 -05:00
Robert Haas	f35742ccb7	Support parallel bitmap heap scans. The index is scanned by a single process, but then all cooperating processes can iterate jointly over the resulting set of heap blocks. In the future, we might also want to support using a parallel bitmap index scan to set up for a parallel bitmap heap scan, but that's a job for another day. Dilip Kumar, with some corrections and cosmetic changes by me. The larger patch set of which this is a part has been reviewed and tested by (at least) Andres Freund, Amit Khandekar, Tushar Ahuja, Rafia Sabih, Haribabu Kommi, Thomas Munro, and me. Discussion: http://postgr.es/m/CAFiTN-uc4=0WxRGfCzs-xfkMYcSEWUC-Fon6thkJGjkh9i=13A@mail.gmail.com	2017-03-08 12:05:43 -05:00
Alvaro Herrera	fcec6caafa	Support XMLTABLE query expression XMLTABLE is defined by the SQL/XML standard as a feature that allows turning XML-formatted data into relational form, so that it can be used as a <table primary> in the FROM clause of a query. This new construct provides significant simplicity and performance benefit for XML data processing; what in a client-side custom implementation was reported to take 20 minutes can be executed in 400ms using XMLTABLE. (The same functionality was said to take 10 seconds using nested PostgreSQL XPath function calls, and 5 seconds using XMLReader under PL/Python). The implemented syntax deviates slightly from what the standard requires. First, the standard indicates that the PASSING clause is optional and that multiple XML input documents may be given to it; we make it mandatory and accept a single document only. Second, we don't currently support a default namespace to be specified. This implementation relies on a new executor node based on a hardcoded method table. (Because the grammar is fixed, there is no extensibility in the current approach; further constructs can be implemented on top of this such as JSON_TABLE, but they require changes to core code.) Author: Pavel Stehule, Álvaro Herrera Extensively reviewed by: Craig Ringer Discussion: https://postgr.es/m/CAFj8pRAgfzMD-LoSmnMGybD0WsEznLHWap8DO79+-GTRAPR4qA@mail.gmail.com	2017-03-08 12:40:26 -03:00
Peter Eisentraut	38d103763d	Make more use of castNode()	2017-02-21 11:59:09 -05:00
Tom Lane	555494d1bc	Fix placement of initPlans when forcibly materializing a subplan. If we forcibly place a Material node atop a finished subplan, we need to move any initPlans attached to the subplan up to the Material node, in order to keep SS_finalize_plan() happy. I'd figured this out in commit `7b67a0a49` for the case of materializing a cursor plan, but out of an abundance of caution, I put the initPlan movement hack at the call site for that case, rather than inside materialize_finished_plan(). That was the wrong thing, because it turns out to also be necessary for the only other caller of materialize_finished_plan(), ie subselect.c. We lacked any test cases that exposed the mistake, but bug#14524 from Wei Congrui shows that it's possible to get an initPlan reference into the top tlist in that case too, and then SS_finalize_plan() complains. Hence, move the hack into materialize_finished_plan(). In HEAD, also relocate some recently-added tests in subselect.sql, which I'd unthinkingly dropped into the middle of a sequence of related tests. Report: https://postgr.es/m/20170202060020.1400.89021@wrigleys.postgresql.org	2017-02-02 19:11:32 -05:00
Andres Freund	69f4b9c85f	Move targetlist SRF handling from expression evaluation to new executor node. Evaluation of set returning functions (SRFs_ in the targetlist (like SELECT generate_series(1,5)) so far was done in the expression evaluation (i.e. ExecEvalExpr()) and projection (i.e. ExecProject/ExecTargetList) code. This meant that most executor nodes performing projection, and most expression evaluation functions, had to deal with the possibility that an evaluated expression could return a set of return values. That's bad because it leads to repeated code in a lot of places. It also, and that's my (Andres's) motivation, made it a lot harder to implement a more efficient way of doing expression evaluation. To fix this, introduce a new executor node (ProjectSet) that can evaluate targetlists containing one or more SRFs. To avoid the complexity of the old way of handling nested expressions returning sets (e.g. having to pass up ExprDoneCond, and dealing with arguments to functions returning sets etc.), those SRFs can only be at the top level of the node's targetlist. The planner makes sure (via split_pathtarget_at_srfs()) that SRF evaluation is only necessary in ProjectSet nodes and that SRFs are only present at the top level of the node's targetlist. If there are nested SRFs the planner creates multiple stacked ProjectSet nodes. The ProjectSet nodes always get input from an underlying node. We also discussed and prototyped evaluating targetlist SRFs using ROWS FROM(), but that turned out to be more complicated than we'd hoped. While moving SRF evaluation to ProjectSet would allow to retain the old "least common multiple" behavior when multiple SRFs are present in one targetlist (i.e. continue returning rows until all SRFs are at the end of their input at the same time), we decided to instead only return rows till all SRFs are exhausted, returning NULL for already exhausted ones. We deemed the previous behavior to be too confusing, unexpected and actually not particularly useful. As a side effect, the previously prohibited case of multiple set returning arguments to a function, is now allowed. Not because it's particularly desirable, but because it ends up working and there seems to be no argument for adding code to prohibit it. Currently the behavior for COALESCE and CASE containing SRFs has changed, returning multiple rows from the expression, even when the SRF containing "arm" of the expression is not evaluated. That's because the SRFs are evaluated in a separate ProjectSet node. As that's quite confusing, we're likely to instead prohibit SRFs in those places. But that's still being discussed, and the code would reside in places not touched here, so that's a task for later. There's a lot of, now superfluous, code dealing with set return expressions around. But as the changes to get rid of those are verbose largely boring, it seems better for readability to keep the cleanup as a separate commit. Author: Tom Lane and Andres Freund Discussion: https://postgr.es/m/20160822214023.aaxz5l4igypowyri@alap3.anarazel.de	2017-01-18 13:40:27 -08:00
Tom Lane	215b43cdc8	Improve RLS planning by marking individual quals with security levels. In an RLS query, we must ensure that security filter quals are evaluated before ordinary query quals, in case the latter contain "leaky" functions that could expose the contents of sensitive rows. The original implementation of RLS planning ensured this by pushing the scan of a secured table into a sub-query that it marked as a security-barrier view. Unfortunately this results in very inefficient plans in many cases, because the sub-query cannot be flattened and gets planned independently of the rest of the query. To fix, drop the use of sub-queries to enforce RLS qual order, and instead mark each qual (RestrictInfo) with a security_level field establishing its priority for evaluation. Quals must be evaluated in security_level order, except that "leakproof" quals can be allowed to go ahead of quals of lower security_level, if it's helpful to do so. This has to be enforced within the ordering of any one list of quals to be evaluated at a table scan node, and we also have to ensure that quals are not chosen for early evaluation (i.e., use as an index qual or TID scan qual) if they're not allowed to go ahead of other quals at the scan node. This is sufficient to fix the problem for RLS quals, since we only support RLS policies on simple tables and thus RLS quals will always exist at the table scan level only. Eventually these qual ordering rules should be enforced for join quals as well, which would permit improving planning for explicit security-barrier views; but that's a task for another patch. Note that FDWs would need to be aware of these rules --- and not, for example, send an insecure qual for remote execution --- but since we do not yet allow RLS policies on foreign tables, the case doesn't arise. This will need to be addressed before we can allow such policies. Patch by me, reviewed by Stephen Frost and Dean Rasheed. Discussion: https://postgr.es/m/8185.1477432701@sss.pgh.pa.us	2017-01-18 12:58:20 -05:00
Bruce Momjian	1d25779284	Update copyright via script for 2017	2017-01-03 13:48:53 -05:00
Robert Haas	7012b132d0	postgres_fdw: Push down aggregates to remote servers. Now that the upper planner uses paths, and now that we have proper hooks to inject paths into the upper planning process, it's possible for foreign data wrappers to arrange to push aggregates to the remote side instead of fetching all of the rows and aggregating them locally. This figures to be a massive win for performance, so teach postgres_fdw to do it. Jeevan Chalke and Ashutosh Bapat. Reviewed by Ashutosh Bapat with additional testing by Prabhat Sahu. Various mostly cosmetic changes by me.	2016-10-21 09:54:29 -04:00
Tom Lane	2c00fad286	Fix improper repetition of previous results from a hashed aggregate. ExecReScanAgg's check for whether it could re-use a previously calculated hashtable neglected the possibility that the Agg node might reference PARAM_EXEC Params that are not referenced by its input plan node. That's okay if the Params are in upper tlist or qual expressions; but if one appears in aggregate input expressions, then the hashtable contents need to be recomputed when the Param's value changes. To avoid unnecessary performance degradation in the case of a Param that isn't within an aggregate input, add logic to the planner to determine which Params are within aggregate inputs. This requires a new field in struct Agg, but fortunately we never write plans to disk, so this isn't an initdb-forcing change. Per report from Jeevan Chalke. This has been broken since forever, so back-patch to all supported branches. Andrew Gierth, with minor adjustments by me Report: <CAM2+6=VY8ykfLT5Q8vb9B6EbeBk-NGuLbT6seaQ+Fq4zXvrDcA@mail.gmail.com>	2016-08-24 14:38:12 -04:00
Tom Lane	45639a0525	Avoid invalidating all foreign-join cached plans when user mappings change. We must not push down a foreign join when the foreign tables involved should be accessed under different user mappings. Previously we tried to enforce that rule literally during planning, but that meant that the resulting plans were dependent on the current contents of the pg_user_mapping catalog, and we had to blow away all cached plans containing any remote join when anything at all changed in pg_user_mapping. This could have been improved somewhat, but the fact that a syscache inval callback has very limited info about what changed made it hard to do better within that design. Instead, let's change the planner to not consider user mappings per se, but to allow a foreign join if both RTEs have the same checkAsUser value. If they do, then they necessarily will use the same user mapping at runtime, and we don't need to know specifically which one that is. Post-plan-time changes in pg_user_mapping no longer require any plan invalidation. This rule does give up some optimization ability, to wit where two foreign table references come from views with different owners or one's from a view and one's directly in the query, but nonetheless the same user mapping would have applied. We'll sacrifice the first case, but to not regress more than we have to in the second case, allow a foreign join involving both zero and nonzero checkAsUser values if the nonzero one is the same as the prevailing effective userID. In that case, mark the plan as only runnable by that userID. The plancache code already had a notion of plans being userID-specific, in order to support RLS. It was a little confused though, in particular lacking clarity of thought as to whether it was the rewritten query or just the finished plan that's dependent on the userID. Rearrange that code so that it's clearer what depends on which, and so that the same logic applies to both RLS-injected role dependency and foreign-join-injected role dependency. Note that this patch doesn't remove the other issue mentioned in the original complaint, which is that while we'll reliably stop using a foreign join if it's disallowed in a new context, we might fail to start using a foreign join if it's now allowed, but we previously created a generic cached plan that didn't use one. It was agreed that the chance of winning that way was not high enough to justify the much larger number of plan invalidations that would have to occur if we tried to cause it to happen. In passing, clean up randomly-varying spelling of EXPLAIN commands in postgres_fdw.sql, and fix a COSTS ON example that had been allowed to leak into the committed tests. This reverts most of commits `fbe5a3fb7` and `5d4171d1c`, which were the previous attempt at ensuring we wouldn't push down foreign joins that span permissions contexts. Etsuro Fujita and Tom Lane Discussion: <d49c1e5b-f059-20f4-c132-e9752ee0113e@lab.ntt.co.jp>	2016-07-15 17:23:02 -04:00
Tom Lane	7b67a0a49c	Fix some interrelated planner issues with initPlans and Param munging. In commit `68fa28f77` I tried to teach SS_finalize_plan() to cope with initPlans attached anywhere in the plan tree, by dint of moving its handling of those into the recursion in finalize_plan(). It turns out that that doesn't really work: if a lower-level plan node emits an initPlan output parameter in its targetlist, it's legitimate for upper levels to reference those Params --- and at the point where this code runs, those references look just like the Param itself, so finalize_plan() quite properly rejects them as being in the wrong place. We could lobotomize the checks enough to allow that, probably, but then it's not clear that we'd have any meaningful check for misplaced Params at all. What seems better, at least in the near term, is to tweak standard_planner() a bit so that initPlans are never placed anywhere but the topmost plan node for a query level, restoring the behavior that occurred pre-9.6. Possibly we can do better if this code is ever merged into setrefs.c: then it would be possible to check a Param's placement only when we'd failed to replace it with a Var referencing a child plan node's targetlist. BTW, I'm now suspicious that finalize_plan is doing the wrong thing by returning the node's allParam rather than extParam to be incorporated in the parent node's set of used parameters. However, it makes no difference given that initPlans only appear at top level, so I'll leave that alone for now. Another thing that emerged from this is that standard_planner() needs to check for initPlans before deciding that it's safe to stick a Gather node on top in force_parallel_mode mode. We previously guarded against that by deciding the plan wasn't wholePlanParallelSafe if any subplans had been found, but after commit `5ce5e4a12` it's necessary to have this substitute test, because path parallel_safe markings don't account for initPlans. (Normally, we'd have decided the paths weren't safe anyway due to appearances of SubPlan nodes, Params, or CTE scans somewhere in the tree --- but it's possible for those all to be optimized away while initPlans still remain.) Per fuzz testing by Andreas Seltenreich. Report: <874m89rw7x.fsf@credativ.de>	2016-07-01 20:06:05 -04:00
Robert Haas	5ce5e4a12e	Set consider_parallel correctly for upper planner rels. Commit `3fc6e2d7f5` introduced new "upper" RelOptInfo structures but didn't set consider_parallel for them correctly, a point I completely missed when reviewing it. Later, commit `e06a38965b` made the situation worse by doing it incorrectly for the grouping relation. Try to straighten all of that out. Along the way, get rid of the annoying wholePlanParallelSafe flag, which was only necessarily because of the fact that upper planning stages didn't use paths at the time that code was written. The most important immediate impact of these changes is that force_parallel_mode will provide useful test coverage in quite a few more scenarios than it did previously, but it's also necessary preparation for fixing some problems related to subqueries. Patch by me, reviewed by Tom Lane.	2016-07-01 11:52:56 -04:00
Tom Lane	c12f02ffc9	Don't apply sortgroupref labels to a tlist that might not match. If we need to use a gating Result node for pseudoconstant quals, create_scan_plan() intentionally suppresses use_physical_tlist's checks on whether there are matches for sortgroupref labels, on the grounds that we don't need matches because we can label the Result's projection output properly. However, it then called apply_pathtarget_labeling_to_tlist anyway. This oversight was harmless when written, but in commit `aeb9ae645` I made that function throw an error if there was no match. Thus, the combination of a table scan, pseudoconstant quals, and a non-simple-Var sortgroupref column threw the dreaded "ORDER/GROUP BY expression not found in targetlist" error. To fix, just skip applying the labeling in this case. Per report from Rushabh Lathia. Report: <CAGPqQf2iLB8t6t-XrL-zR233DFTXxEsfVZ4WSqaYfLupEoDxXA@mail.gmail.com>	2016-06-28 10:43:11 -04:00

1 2 3 4 5 ...

435 Commits