postgres

mirror of https://github.com/postgres/postgres.git synced 2025-07-02 09:02:37 +03:00

Author	SHA1	Message	Date
Robert Haas	04ae11f62e	Remove bogus code to apply PathTargets to partial paths. The partial paths that get modified may already have been used as part of a GatherPath which appears in the path list, so modifying them is not a good idea at this stage - especially because this code has no check that the PathTarget is in fact parallel-safe. When partial aggregation is being performed, this is actually harmless because we'll end up replacing the pathtargets here with the correct ones within create_grouping_paths(). But if we've got a query tree containing only scan/join operations then this can result in incorrectly pushing down parallel-restricted target list entries. If those are, for example, references to subqueries, that can crash the server; but it's wrong in any event. Amit Kapila	2016-06-03 14:27:33 -04:00
Tom Lane	26e66184d6	Fix assorted missing infrastructure for ON CONFLICT. subquery_planner() failed to apply expression preprocessing to the arbiterElems and arbiterWhere fields of an OnConflictExpr. No doubt the theory was that this wasn't necessary because we don't actually try to execute those expressions; but that's wrong, because it results in failure to match to index expressions or index predicates that are changed at all by preprocessing. Per bug #14132 from Reynold Smith. Also add pullup_replace_vars processing for onConflictWhere. Perhaps it's impossible to have a subquery reference there, but I'm not exactly convinced; and even if true today it's a failure waiting to happen. Also add some comments to other places where one or another field of OnConflictExpr is intentionally ignored, with explanation as to why it's okay to do so. Also, catalog/dependency.c failed to record any dependency on the named constraint in ON CONFLICT ON CONSTRAINT, allowing such a constraint to be dropped while rules exist that depend on it, and allowing pg_dump to dump such a rule before the constraint it refers to. The normal execution path managed to error out reasonably for a dangling constraint reference, but ruleutils.c dumped core; so in addition to fixing the omission, add a protective check in ruleutils.c, since we can't retroactively add a dependency in existing databases. Back-patch to 9.5 where this code was introduced. Report: <20160510190350.2608.48667@wrigleys.postgresql.org>	2016-05-11 16:20:23 -04:00
Magnus Hagander	ba8fe38f58	Fix typo in comment	2016-04-15 13:32:54 +02:00
Robert Haas	deb71fa971	Fix costing for parallel aggregation. The original patch kind of ignored the fact that we were doing something different from a costing point of view, but nobody noticed. This patch fixes that oversight. David Rowley	2016-04-12 16:25:55 -04:00
Tom Lane	f1f01de145	Redefine create_upper_paths_hook as being invoked once per upper relation. Per discussion, this gives potential users of the hook more flexibility, because they can build custom Paths that implement only one stage of upper processing atop core-provided Paths for earlier stages.	2016-04-12 15:23:14 -04:00
Robert Haas	5fe5a2cee9	Allow aggregate transition states to be serialized and deserialized. This is necessary infrastructure for supporting parallel aggregation for aggregates whose transition type is "internal". Such values can't be passed between cooperating processes, because they are just pointers. David Rowley, reviewed by Tomas Vondra and by me.	2016-03-29 15:04:05 -04:00
Tom Lane	76281aa964	Avoid a couple of zero-divide scenarios in the planner. cost_subplan() supposed that the given subplan must have plan_rows > 0, which as far as I can tell was true until recent refactoring of the code in createplan.c; but now that code allows the Result for a provably empty subquery to have plan_rows = 0. Rather than undo that change, put in a clamp to prevent zero divide. get_cheapest_fractional_path() likewise supposed that best_path->rows > 0. This assumption has been wrong for longer. It's actually harmless given IEEE float math, because a positive value divided by zero gives +Infinity and compare_fractional_path_costs() will do the right thing with that. Still, best not to assume that. final_cost_nestloop() also seems to have some risks in this area, so borrow the clamping logic already present in the mergejoin cost functions. Lastly, remove unnecessary clamp_row_est() in planner.c's calls to get_number_of_groups(). The only thing that function does with path_rows is pass it to estimate_num_groups() which already has an internal clamp, so we don't need the extra call; and if we did, the callers are arguably the wrong place for it anyway. First two items reported by Piotr Stefaniak, the others are products of my nosing around for similar problems. No back-patch since there's no evidence that problems arise in the back branches.	2016-03-26 12:03:12 -04:00
Tom Lane	d543170f2f	Don't split up SRFs when choosing to postpone SELECT output expressions. In commit `9118d03a8c` we taught the planner to postpone evaluation of set-returning functions in a SELECT's targetlist until after any sort done to satisfy ORDER BY. However, if we postpone some SRFs this way while others do not get postponed (because they're sort or group key columns) we will break the traditional behavior by which all SRFs in the tlist run in-step during ExecTargetList(), so that you get the least common multiple of their periods not the product. Fix make_sort_input_target() so it will not split up SRF evaluation in such cases. There is still a hazard of similar odd behavior if there's a SRF in a grouping column and another one that isn't, but that was true before and we're just trying to preserve bug-compatibility with the traditional behavior. This whole area is overdue to be rethought and reimplemented, but we'll try to avoid changing behavior until then. Per report from Regina Obe.	2016-03-25 11:19:51 -04:00
Robert Haas	e06a38965b	Support parallel aggregation. Parallel workers can now partially aggregate the data and pass the transition values back to the leader, which can combine the partial results to produce the final answer. David Rowley, based on earlier work by Haribabu Kommi. Reviewed by Álvaro Herrera, Tomas Vondra, Amit Kapila, James Sewell, and me.	2016-03-21 09:30:18 -04:00
Tom Lane	101fd9349e	Add a GetForeignUpperPaths callback function for FDWs. This is basically like the just-added create_upper_paths_hook, but control is funneled only to the FDW responsible for all the baserels of the current query; so providing such a callback is much less likely to add useless overhead than using the hook function is. The documentation is a bit sketchy. We'll likely want to improve it, and/or adjust the call conventions, when we get some experience with actually using this callback. Hopefully somebody will find time to experiment with it before 9.6 feature freeze.	2016-03-14 20:04:48 -04:00
Tom Lane	5864d6a4b6	Provide a planner hook at a suitable place for creating upper-rel Paths. In the initial revision of the upper-planner pathification work, the only available way for an FDW or custom-scan provider to inject Paths representing post-scan-join processing was to insert them during scan-level GetForeignPaths or similar processing. While that's not impossible, it'd require quite a lot of duplicative processing to look forward and see if the extension would be capable of implementing the whole query. To improve matters for custom-scan providers, provide a hook function at the point where the core code is about to start filling in upperrel Paths. At this point Paths are available for the whole scan/join tree, which should reduce the amount of redundant effort considerably. (An alternative design that was suggested was to provide a separate hook for each post-scan-join processing step, but that seems messy and not clearly more useful.) Following our time-honored tradition, there's no documentation for this hook outside the source code. As-is, this hook is only meant for custom scan providers, which we can't assume very much about. A followon patch will implement an FDW callback to let FDWs do the same thing in a somewhat more structured fashion.	2016-03-14 19:23:29 -04:00
Tom Lane	307c78852f	Rethink representation of PathTargets. In commit `19a541143a` I did not make PathTarget a subtype of Node, and embedded a RelOptInfo's reltarget directly into it rather than having a separately-allocated Node. In hindsight that was misguided micro-optimization, enabled by the fact that at that point we didn't have any Paths with custom PathTargets. Now that PathTarget processing has been fleshed out some more, it's easier to see that it's better to have PathTarget as an independent Node type, even if it does cost us one more palloc to create a RelOptInfo. So change it while we still can. This commit just changes the representation, without doing anything more interesting than that.	2016-03-14 16:59:59 -04:00
Tom Lane	9118d03a8c	When appropriate, postpone SELECT output expressions till after ORDER BY. It is frequently useful for volatile, set-returning, or expensive functions in a SELECT's targetlist to be postponed till after ORDER BY and LIMIT are done. Otherwise, the functions might be executed for every row of the table despite the presence of LIMIT, and/or be executed in an unexpected order. For example, in SELECT x, nextval('seq') FROM tab ORDER BY x LIMIT 10; it's probably desirable that the nextval() values are ordered the same as x, and that nextval() is not run more than 10 times. In the past, Postgres was inconsistent in this area: you would get the desirable behavior if the ordering were performed via an indexscan, but not if it had to be done by an explicit sort step. Getting the desired behavior reliably required contortions like SELECT x, nextval('seq') FROM (SELECT x FROM tab ORDER BY x) ss LIMIT 10; This patch conditionally postpones evaluation of pure-output target expressions (that is, those that are not used as DISTINCT, ORDER BY, or GROUP BY columns) so that they effectively occur after sorting, even if an explicit sort step is necessary. Volatile expressions and set-returning expressions are always postponed, so as to provide consistent semantics. Expensive expressions (costing more than 10 times typical operator cost, which by default would include any user-defined function) are postponed if there is a LIMIT or if there are expressions that must be postponed. We could be more aggressive and postpone any nontrivial expression, but there are costs associated with doing so: it requires an extra Result plan node which adds some overhead, and postponement changes the volume of data going through the sort step, perhaps for the worse. 
Since we tend not to have very good estimates of the output width of nontrivial expressions, it's hard to have much confidence in our ability to predict whether postponement would increase or decrease the cost of the sort; therefore this patch doesn't attempt to make decisions conditionally on that. Between these factors and a general desire not to change query behavior when there's not a demonstrable benefit, it seems best to be conservative about applying postponement. We might tweak the decision rules in the future, though. Konstantin Knizhnik, heavily rewritten by me	2016-03-11 12:27:50 -05:00
Tom Lane	49635d7b3e	Minor additional refactoring of planner.c's PathTarget handling. Teach make_group_input_target() and make_window_input_target() to work entirely with the PathTarget representation of tlists, rather than constructing a tlist and immediately deconstructing it into PathTarget format. In itself this only saves a few palloc's; the bigger picture is that it opens the door for sharing cost_qual_eval work across all of planner.c's constructions of PathTargets. I'll come back to that later. In support of this, flesh out tlist.c's infrastructure for PathTargets a bit more.	2016-03-11 10:24:55 -05:00
Tom Lane	c82c92b111	Give pull_var_clause() reject/recurse/return behavior for WindowFuncs too. All along, this function should have treated WindowFuncs in a manner similar to Aggrefs, ie with an option whether or not to recurse into them. By not considering the case, it was always recursing, which is OK for most callers (although I suspect that the case in prepare_sort_from_pathkeys might represent a bug). But now we need return-without-recursing behavior as well. There are also more than a few callers that should never see a WindowFunc, and now we'll get some error checking on that.	2016-03-10 16:23:52 -05:00
Tom Lane	364a9f47ab	Refactor pull_var_clause's API to make it less tedious to extend. In commit `1d97c19a0f` and later `c1d9579dd8`, we extended pull_var_clause's API by adding enum-type arguments. That's sort of a pain to maintain, though, because it means every time we add a new behavior we must touch every last one of the call sites, even if there's a reasonable default behavior that most of them could use. Let's switch over to using a bitmask of flags, instead; that seems more maintainable and might save a nanosecond or two as well. This commit changes no behavior in itself, though I'm going to follow it up with one that does add a new behavior. In passing, remove flatten_tlist(), which has not been used since 9.1 and would otherwise need the same API changes. Removing these enums means that optimizer/tlist.h no longer needs to depend on optimizer/var.h. Changing that caused a number of C files to need addition of #include "optimizer/var.h" (probably we can thank old runs of pgrminclude for that); but on balance it seems like a good change anyway.	2016-03-10 15:53:07 -05:00
Tom Lane	51c0f63e4d	Improve handling of pathtargets in planner.c. Refactor so that the internal APIs in planner.c deal in PathTargets not targetlists, and establish a more regular structure for deriving the targets needed for successive steps. There is more that could be done here; calculating the eval costs of each successive target independently is both inefficient and wrong in detail, since we won't actually recompute values available from the input node's tlist. But it's no worse than what happened before the pathification rewrite. In any case this seems like a good starting point for considering how to handle Konstantin Knizhnik's function-evaluation-postponement patch.	2016-03-09 01:12:16 -05:00
Tom Lane	9e8b99420f	Improve handling of group-column indexes in GroupingSetsPath. Instead of having planner.c compute a groupColIdx array and store it in GroupingSetsPaths, make create_groupingsets_plan() find the grouping columns by searching in the child plan node's tlist. Although that's probably a bit slower for create_groupingsets_plan(), it's more like the way every other plan node type does this, and it provides positive confirmation that we know which child output columns we're supposed to be grouping on. (Indeed, looking at this now, I'm not at all sure that it wasn't broken before, because create_groupingsets_plan() isn't demanding an exact tlist match from its child node.) Also, this allows substantial simplification in planner.c, because it no longer needs to compute the groupColIdx array at all; no other cases were using it. I'd intended to put off this refactoring until later (like 9.7), but in view of the likely bug fix and the need to rationalize planner.c's tlist handling so we can do something sane with Konstantin Knizhnik's function-evaluation-postponement patch, I think it can't wait.	2016-03-08 22:32:11 -05:00
Tom Lane	8c314b9853	Finish refactoring make_foo() functions in createplan.c. This patch removes some redundant cost calculations that I left for later cleanup in commit `3fc6e2d7f5`. There's now a uniform policy that the make_foo() convenience functions don't do any cost calculations. Most of their callers copy costs from the source Path node, and for those that don't, the calculation in the make_foo() function wasn't necessarily right anyhow. (make_result() was particularly a mess, as it was serving multiple callers using cost calcs designed for only the first one or two that had ever existed.) Aside from saving a few cycles, this ensures that what EXPLAIN prints matches the costs we used for planning purposes. It does not change any planner decisions, since the decisions are already made.	2016-03-08 16:28:34 -05:00
Tom Lane	3fc6e2d7f5	Make the upper part of the planner work by generating and comparing Paths. I've been saying we needed to do this for more than five years, and here it finally is. This patch removes the ever-growing tangle of spaghetti logic that grouping_planner() used to use to try to identify the best plan for post-scan/join query steps. Now, there is (nearly) independent consideration of each execution step, and entirely separate construction of Paths to represent each of the possible ways to do that step. We choose the best Path or set of Paths using the same add_path() logic that's been used inside query_planner() for years. In addition, this patch removes the old restriction that subquery_planner() could return only a single Plan. It now returns a RelOptInfo containing a set of Paths, just as query_planner() does, and the parent query level can use each of those Paths as the basis of a SubqueryScanPath at its level. This allows finding some optimizations that we missed before, wherein a subquery was capable of returning presorted data and thereby avoiding a sort in the parent level, making the overall cost cheaper even though delivering sorted output was not the cheapest plan for the subquery in isolation. (A couple of regression test outputs change in consequence of that. However, there is very little change in visible planner behavior overall, because the point of this patch is not to get immediate planning benefits but to create the infrastructure for future improvements.) There is a great deal left to do here. This patch unblocks a lot of planner work that was basically impractical in the old code structure, such as allowing FDWs to implement remote aggregation, or rewriting plan_set_operations() to allow consideration of multiple implementation orders for set operations. (The latter will likely require a full rewrite of plan_set_operations(); what I've done here is only to fix it to return Paths not Plans.) 
I have also left unfinished some localized refactoring in createplan.c and planner.c, because it was not necessary to get this patch to a working state. Thanks to Robert Haas, David Rowley, and Amit Kapila for review.	2016-03-07 15:58:22 -05:00
Tom Lane	19a541143a	Add an explicit representation of the output targetlist to Paths. Up to now, there's been an assumption that all Paths for a given relation compute the same output column set (targetlist). However, there are good reasons to remove that assumption. For example, an indexscan on an expression index might be able to return the value of an expensive function "for free". While we have the ability to generate such a plan today in simple cases, we don't have a way to model that it's cheaper than a plan that computes the function from scratch, nor a way to create such a plan in join cases (where the function computation would normally happen at the topmost join node). Also, we need this so that we can have Paths representing post-scan/join steps, where the targetlist may well change from one step to the next. Therefore, invent a "struct PathTarget" representing the columns we expect a plan step to emit. It's convenient to include the output tuple width and tlist evaluation cost in this struct, and there will likely be additional fields in future. While Path nodes that actually do have custom outputs will need their own PathTargets, it will still be true that most Paths for a given relation will compute the same tlist. To reduce the overhead added by this patch, keep a "default PathTarget" in RelOptInfo, and allow Paths that compute that column set to just point to their parent RelOptInfo's reltarget. (In the patch as committed, actually every Path is like that, since we do not yet have any cases of custom PathTargets.) I took this opportunity to provide some more-honest costing of PlaceHolderVar evaluation. Up to now, the assumption that "scan/join reltargetlists have cost zero" was applied not only to Vars, where it's reasonable, but also PlaceHolderVars where it isn't. 
Now, we add the eval cost of a PlaceHolderVar's expression to the first plan level where it can be computed, by including it in the PathTarget cost field and adding that to the cost estimates for Paths. This isn't perfect yet but it's much better than before, and there is a way forward to improve it more. This costing change affects the join order chosen for a couple of the regression tests, changing expected row ordering.	2016-02-18 20:02:03 -05:00
Tom Lane	d4c3a156cb	Remove GROUP BY columns that are functionally dependent on other columns. If a GROUP BY clause includes all columns of a non-deferred primary key, as well as other columns of the same relation, those other columns are redundant and can be dropped from the grouping; the pkey is enough to ensure that each row of the table corresponds to a separate group. Getting rid of the excess columns will reduce the cost of the sorting or hashing needed to implement GROUP BY, and can indeed remove the need for a sort step altogether. This seems worth testing for since many query authors are not aware of the GROUP-BY-primary-key exception to the rule about queries not being allowed to reference non-grouped-by columns in their targetlists or HAVING clauses. Thus, redundant GROUP BY items are not uncommon. Also, we can make the test pretty cheap in most queries where it won't help by not looking up a rel's primary key until we've found that at least two of its columns are in GROUP BY. David Rowley, reviewed by Julien Rouhaud	2016-02-11 17:34:59 -05:00
Tom Lane	2564be360a	Fix typo in comment.	2016-02-11 15:20:14 -05:00
Andres Freund	a6897efab9	Fix overeager pushdown of HAVING clauses when grouping sets are used. In `61444bfb` we started to allow HAVING clauses to be fully pushed down into WHERE, even when grouping sets are in use. That turns out not to work correctly, because grouping sets can "produce" NULLs, meaning that filtering in WHERE and HAVING can have different results, even when no aggregates or volatile functions are involved. Instead only allow pushdown of empty grouping sets. It'd be nice to do better, but the exact mechanics of deciding which cases are safe are still being debated. It's important to give correct results till we find a good solution, and such a solution might not be appropriate for backpatching anyway. Bug: #13863 Reported-By: 'wrb' Diagnosed-By: Dean Rasheed Author: Andrew Gierth Reviewed-By: Dean Rasheed and Andres Freund Discussion: 20160113183558.12989.56904@wrigleys.postgresql.org Backpatch: 9.5, where grouping sets were introduced	2016-02-08 11:03:31 +01:00
Robert Haas	7c944bd903	Introduce a new GUC force_parallel_mode for testing purposes. When force_parallel_mode = true, we enable the parallel mode restrictions for all queries for which this is believed to be safe. For the subset of those queries believed to be safe to run entirely within a worker, we spin up a worker and run the query there instead of running it in the original process. When force_parallel_mode = regress, make additional changes to allow the regression tests to run cleanly even though parallel workers have been injected under the hood. Taken together, this facilitates both better user testing and better regression testing of the parallelism code. Robert Haas, with help from Amit Kapila and Rushabh Lathia.	2016-02-07 11:41:33 -05:00
Robert Haas	fbe5a3fb73	Only try to push down foreign joins if the user mapping OIDs match. Previously, the foreign join pushdown infrastructure left the question of security entirely up to individual FDWs, but it would be easy for a foreign data wrapper to inadvertently open up subtle security holes that way. So, make it the core code's job to determine which user mapping OID is relevant, and don't attempt join pushdown unless it's the same for all relevant relations. Per a suggestion from Tom Lane. Shigeru Hanada and Ashutosh Bapat, reviewed by Etsuro Fujita and KaiGai Kohei, with some further changes by me.	2016-01-28 14:05:36 -05:00
Robert Haas	a7de3dc5c3	Support multi-stage aggregation. Aggregate nodes now have two new modes: a "partial" mode where they output the unfinalized transition state, and a "finalize" mode where they accept unfinalized transition states rather than individual values as input. These new modes are not used anywhere yet, but they will be necessary for parallel aggregation. The infrastructure also figures to be useful for cases where we want to aggregate local data and remote data via the FDW interface, and want to bring back partial aggregates from the remote side that can then be combined with locally generated partial aggregates to produce the final value. It may also be useful even when neither FDWs nor parallelism are in play, as explained in the comments in nodeAgg.c. David Rowley and Simon Riggs, reviewed by KaiGai Kohei, Heikki Linnakangas, Haribabu Kommi, and me.	2016-01-20 13:46:50 -05:00
Tom Lane	a923af382c	Fix build_grouping_chain() to not clobber its input lists. There's no good reason for stomping on the input data; it makes the logic in this function no simpler, in fact probably the reverse. And it makes it impossible to separate path generation from plan generation, as I'm working towards doing; that will require more than one traversal of these lists.	2016-01-14 11:51:57 -05:00
Robert Haas	950ab82c3d	Remove obsolete comment. Noted while reviewing a question from Dickson S. Guedes.	2016-01-10 21:35:33 -05:00
Tom Lane	a54676acad	Marginal cleanup of GROUPING SETS code in grouping_planner(). Improve comments and make it a shade less messy. I think we might want to move all of this somewhere else later, but it needs to be more readable first. In passing, re-pgindent the file, affecting some recently-added comments concerning parallel query planning.	2016-01-07 20:32:35 -05:00
Tom Lane	c44d013835	Delay creation of subplan tlist until after create_plan(). Once upon a time it was necessary for grouping_planner() to determine the tlist it wanted from the scan/join plan subtree before it called query_planner(), because query_planner() would actually make a Plan using that. But we refactored things a long time ago to delay construction of the Plan tree till later, so there's no need to build that tlist until (and indeed unless) we're ready to plaster it onto the Plan. The only thing query_planner() cares about is what Vars are going to be needed for the tlist, and it can perfectly well get that by looking at the real tlist rather than some masticated version. Well, actually, there is one minor glitch in that argument, which is that make_subplanTargetList also adds Vars appearing only in HAVING to the tlist it produces. So now we have to account for HAVING explicitly in build_base_rel_tlists. But that just adds a few lines of code, and I doubt it moves the needle much on processing time; we might be doing pull_var_clause() twice on the havingQual, but before we had it scanning dummy tlist entries instead. This is a very small down payment on rationalizing grouping_planner enough so it can be refactored.	2016-01-07 20:23:57 -05:00
Bruce Momjian	ee94300446	Update copyright for 2016. Backpatch certain files through 9.1.	2016-01-02 13:33:40 -05:00
Stephen Frost	e5e11c8cca	Collect the global OR of hasRowSecurity flags for plancache. We carry around information about whether a given query has row security or not to allow the plancache to use that information to invalidate a planned query in the event that the environment changes. Previously, the flag of one of the subqueries was simply being copied into place to indicate if the query overall included RLS components. That's wrong as we need the global OR of all subqueries. Fix by changing the code to match how fireRIRules works, which results in OR'ing all of the flags. Noted by Tom. Back-patch to 9.5 where RLS was introduced.	2015-12-14 20:05:43 -05:00
Tom Lane	4fcf48450d	Get rid of the planner's LateralJoinInfo data structure. I originally modeled this data structure on SpecialJoinInfo, but after commit `acfcd45cac` that looks like a pretty poor decision. All we really need is relid sets identifying laterally-referenced rels; and most of the time, what we want to know about includes indirect lateral references, a case the LateralJoinInfo data was unsuited to compute with any efficiency. The previous commit redefined RelOptInfo.lateral_relids as the transitive closure of lateral references, so that it easily supports checking indirect references. For the places where we really do want just direct references, add a new RelOptInfo field direct_lateral_relids, which is easily set up as a copy of lateral_relids before we perform the transitive closure calculation. Then we can just drop lateral_info_list and LateralJoinInfo and the supporting code. This makes the planner's handling of lateral references noticeably more efficient, and shorter too. Such a change can't be back-patched into stable branches for fear of breaking extensions that might be looking at the planner's data structures; but it seems not too late to push it into 9.5, so I've done so.	2015-12-11 15:52:38 -05:00
Robert Haas	80558c1f5a	Generate parallel sequential scan plans in simple cases. Add a new flag, consider_parallel, to each RelOptInfo, indicating whether a plan for that relation could conceivably be run inside of a parallel worker. Right now, we're pretty conservative: for example, it might be possible to defer applying a parallel-restricted qual in a worker, and later do it in the leader, but right now we just don't try to parallelize access to that relation. That's probably the right decision in most cases, anyway. Using the new flag, generate parallel sequential scan plans for plain baserels, meaning that we now have parallel sequential scan in PostgreSQL. The logic here is pretty unsophisticated right now: the costing model probably isn't right in detail, and we can't push joins beneath Gather nodes, so the number of plans that can actually benefit from this is pretty limited right now. Lots more work is needed. Nevertheless, it seems time to enable this functionality so that all this code can actually be tested easily by users and developers. Note that, if you wish to test this functionality, it will be necessary to set max_parallel_degree to a value greater than the default of 0. Once a few more loose ends have been tidied up here, we might want to consider changing the default value of this GUC, but I'm leaving it alone for now. Along the way, fix a bug in cost_gather: the previous coding thought that a Gather node's transfer overhead should be costed on the basis of the relation size rather than the number of tuples that actually need to be passed off to the leader. Patch by me, reviewed in earlier versions by Amit Kapila.	2015-11-11 09:02:52 -05:00
Robert Haas	f0661c4e8c	Make sequential scans parallel-aware. In addition, this patch fills in a number of missing bits and pieces in the parallel infrastructure. Paths and plans now have a parallel_aware flag indicating whether whatever parallel-aware logic they have should be engaged. It is believed that we will need this flag for a number of path/plan types, not just sequential scans, which is why the flag is generic rather than part of the SeqScan structures specifically. Also, execParallel.c now gives parallel nodes a chance to initialize their PlanState nodes from the DSM during parallel worker startup. Amit Kapila, with a fair amount of adjustment by me. Review of previous patch versions by Haribabu Kommi and others.	2015-11-11 08:57:52 -05:00
Robert Haas	a53c06a13e	Prohibit parallel query when the isolation level is serializable. In order for this to be safe, the code which handles true serializability will need to be taught that the SIRead locks taken by a parallel worker pertain to the same transaction as those taken by the parallel leader. Some further changes may be needed as well. Until the necessary adaptations are made, don't generate parallel plans in serializable mode, and if a previously-generated parallel plan is used after serializable mode has been activated, run it serially. This fixes a bug in commit `7aea8e4f2d`.	2015-10-16 11:58:27 -04:00
Robert Haas	d1b7c1ffe7	Parallel executor support. This code provides infrastructure for a parallel leader to start up parallel workers to execute subtrees of the plan tree being executed in the master. User-supplied parameters from ParamListInfo are passed down, but PARAM_EXEC parameters are not. Various other constructs, such as initplans, subplans, and CTEs, are also not currently shared. Nevertheless, there's enough here to support a basic implementation of parallel query, and we can lift some of the current restrictions as needed. Amit Kapila and Robert Haas	2015-09-28 21:55:57 -04:00
Robert Haas	7aea8e4f2d	Determine whether it's safe to attempt a parallel plan for a query. Commit `924bcf4f16` introduced a framework for parallel computation in PostgreSQL that makes most but not all built-in functions safe to execute in parallel mode. In order to have parallel query, we'll need to be able to determine whether that query contains functions (either built-in or user-defined) that cannot be safely executed in parallel mode. This requires those functions to be labeled, so this patch introduces an infrastructure for that. Some functions currently labeled as safe may need to be revised depending on how pending issues related to heavyweight locking under parallelism are resolved. Parallel plans can't be used except for the case where the query will run to completion. If portal execution were suspended, the parallel mode restrictions would need to remain in effect during that time, but that might make other queries fail. Therefore, this patch introduces a framework that enables consideration of parallel plans only when it is known that the plan will be run to completion. This probably needs some refinement; for example, at bind time, we do not know whether a query run via the extended protocol will be executed to completion or run with a limited fetch count. Having the client indicate its intentions at bind time would constitute a wire protocol break. Some contexts in which parallel mode would be safe are not adjusted by this patch; the default is not to try parallel plans except from call sites that have been updated to say that such plans are OK. This commit doesn't introduce any parallel paths or plans; it just provides a way to determine whether they could potentially be used. I'm committing it on the theory that the remaining parallel sequential scan patches will also get committed to this release, hopefully in the not-too-distant future. Robert Haas and Amit Kapila. Reviewed (in earlier versions) by Noah Misch.	2015-09-16 15:38:47 -04:00
Tom Lane	68fa28f771	Postpone extParam/allParam calculations until the very end of planning. Until now we computed these Param ID sets at the end of subquery_planner, but that approach depends on subquery_planner returning a concrete Plan tree. We would like to switch over to returning one or more Paths for a subquery, and in that representation the necessary details aren't fully fleshed out (not to mention that we don't really want to do this work for Paths that end up getting discarded). Hence, refactor so that we can compute the param ID sets at the end of planning, just before set_plan_references is run. The main change necessary to make this work is that we need to capture the set of outer-level Param IDs available to the current query level before exiting subquery_planner, since the outer levels' plan_params lists are transient. (That's not going to pose a problem for returning Paths, since all the work involved in producing that data is part of expression preprocessing, which will continue to happen before Paths are produced.) On the plus side, this change gets rid of several existing kluges. Eventually I'd like to get rid of SS_finalize_plan altogether in favor of doing this work during set_plan_references, but that will require some complex rejiggering because SS_finalize_plan needs to visit subplans and initplans before the main plan. So leave that idea for another day.	2015-08-11 23:48:37 -04:00
Tom Lane	8693ebe37d	Avoid some zero-divide hazards in the planner. Although I think on all modern machines floating division by zero results in Infinity not SIGFPE, we still don't want infinities running around in the planner's costing estimates; too much risk of that leading to insane behavior. grouping_planner() failed to consider the possibility that final_rel might be known dummy and hence have zero rowcount. (I wonder if it would be better to set a rows estimate of 1 for dummy relations? But at least in the back branches, changing this convention seems like a bad idea, so I'll leave that for another day.) Make certain that get_variable_numdistinct() produces a nonzero result. The case that can be shown to be broken is with stadistinct < 0.0 and small ntuples; we did not prevent the result from rounding to zero. For good luck I applied clamp_row_est() to all the nonconstant return values. In ExecChooseHashTableSize(), Assert that we compute positive nbuckets and nbatch. I know of no reason to think this isn't the case, but it seems like a good safety check. Per reports from Piotr Stefaniak. Back-patch to all active branches.	2015-07-30 12:11:23 -04:00
Andres Freund	61444bfb80	Allow to push down clauses from HAVING to WHERE when grouping sets are used. Previously we disallowed pushing down quals to WHERE in the presence of grouping sets. That's overly restrictive. We now instead copy quals to WHERE if applicable, leaving the one in HAVING in place. That's because, at that stage of the planning process, it's nontrivial to determine if it's safe to remove the one in HAVING. Author: Andrew Gierth Discussion: 874mkt3l59.fsf@news-spur.riddles.org.uk Backpatch: 9.5, where grouping sets were introduced. This isn't exactly a bugfix, but it seems better to keep the branches in sync at this point.	2015-07-26 16:50:20 +02:00
Andres Freund	144666f65b	Build column mapping for grouping sets in all required cases. The previous coding frequently failed to fail because for one it's unusual to have rollup clauses with one column, and for another sometimes the wrong mapping didn't cause obvious problems. Author: Jeevan Chalke Reviewed-By: Andrew Gierth Discussion: CAM2+6=W=9=hQOipH0HAPbkun3Z3TFWij_EiHue0_6UX=oR=1kw@mail.gmail.com Backpatch: 9.5, where grouping sets were introduced	2015-07-26 16:46:27 +02:00
Tom Lane	dd7a8f66ed	Redesign tablesample method API, and do extensive code review. The original implementation of TABLESAMPLE modeled the tablesample method API on index access methods, which wasn't a good choice because, without specialized DDL commands, there's no way to build an extension that can implement a TSM. (Raw inserts into system catalogs are not an acceptable thing to do, because we can't undo them during DROP EXTENSION, nor will pg_upgrade behave sanely.) Instead adopt an API more like procedural language handlers or foreign data wrappers, wherein the only SQL-level support object needed is a single handler function identified by having a special return type. This lets us get rid of the supporting catalog altogether, so that no custom DDL support is needed for the feature. Adjust the API so that it can support non-constant tablesample arguments (the original coding assumed we could evaluate the argument expressions at ExecInitSampleScan time, which is undesirable even if it weren't outright unsafe), and discourage sampling methods from looking at invisible tuples. Make sure that the BERNOULLI and SYSTEM methods are genuinely repeatable within and across queries, as required by the SQL standard, and deal more honestly with methods that can't support that requirement. Make a full code-review pass over the tablesample additions, and fix assorted bugs, omissions, infelicities, and cosmetic issues (such as failure to put the added code stanzas in a consistent ordering). Improve EXPLAIN's output of tablesample plans, too. Back-patch to 9.5 so that we don't have to support the original API in production.	2015-07-25 14:39:00 -04:00
Tom Lane	2cb9ec1bcb	Improve inheritance_planner()'s performance for large inheritance sets. Commit `c03ad5602f` introduced a planner performance regression for UPDATE/DELETE on large inheritance sets. It required copying the append_rel_list (which is of size proportional to the number of inherited tables) once for each inherited table, thus resulting in O(N^2) time and memory consumption. While it's difficult to avoid that in general, the extra work only has to be done for append_rel_list entries that actually reference subquery RTEs, which inheritance-set entries will not. So we can buy back essentially all of the loss in cases without subqueries in FROM; and even for those, the added work is mainly proportional to the number of UNION ALL subqueries. Back-patch to 9.2, like the previous commit. Tom Lane and Dean Rasheed, per a complaint from Thomas Munro.	2015-06-22 18:53:27 -04:00
Tom Lane	2aa0476dc3	Manual cleanup of pgindent results. Fix some places where pgindent did silly stuff, often because project style wasn't followed to begin with. (I've not touched the atomics headers, though.)	2015-05-24 15:04:10 -04:00
Bruce Momjian	807b9e0dff	pgindent run for 9.5	2015-05-23 21:35:49 -04:00
Andres Freund	631d749007	Remove the new UPSERT command tag and use INSERT instead. Previously, INSERT with ON CONFLICT DO UPDATE specified used a new command tag -- UPSERT. It was introduced out of concern that INSERT as a command tag would be a misrepresentation for ON CONFLICT DO UPDATE, as some affected rows may actually have been updated. Alvaro Herrera noticed that the implementation of that new command tag was incomplete; in subsequent discussion we concluded that having it doesn't provide benefits that are in line with the compatibility breaks it requires. Catversion bump due to the removal of PlannedStmt->isUpsert. Author: Peter Geoghegan Discussion: 20150520215816.GI5885@postgresql.org	2015-05-23 00:58:45 +02:00
Andres Freund	f3d3118532	Support GROUPING SETS, CUBE and ROLLUP. This SQL standard functionality allows aggregating data by different GROUP BY clauses at once. Each grouping set returns rows with columns grouped by in other sets set to NULL. This could previously be achieved by doing each grouping as a separate query, conjoined by UNION ALLs. Besides being considerably more concise, grouping sets will in many cases be faster, requiring only one scan over the underlying data. The current implementation of grouping sets only supports using sorting for input. Individual sets that share a sort order are computed in one pass. If there are sets that don't share a sort order, additional sort & aggregation steps are performed. These additional passes are sourced by the previous sort step, thus avoiding repeated scans of the source data. The code is structured in a way that adding support for purely using hash aggregation or a mix of hashing and sorting is possible. Sorting was chosen to be supported first, as it is the most generic method of implementation. Instead of, as in earlier versions of the patch, representing the chain of sort and aggregation steps as full blown planner and executor nodes, all but the first sort are performed inside the aggregation node itself. This avoids the need to do some unusual gymnastics to handle having to return aggregated and non-aggregated tuples from underlying nodes, as well as having to shut down underlying nodes early to limit memory usage. The optimizer still builds Sort/Agg nodes to describe each phase, but they're not part of the plan tree, but instead additional data for the aggregation node. They're a convenient and preexisting way to describe aggregation and sorting. The first (and possibly only) sort step is still performed as a separate execution step. That retains similarity with existing group by plans, makes rescans fairly simple, avoids very deep plans (leading to slow explains) and easily allows avoiding the sorting step if the underlying data is sorted by other means. A somewhat ugly side of this patch is having to deal with a grammar ambiguity between the new CUBE keyword and the cube extension/functions named cube (and rollup). To avoid breaking existing deployments of the cube extension it has not been renamed, neither has cube been made a reserved keyword. Instead precedence hacking is used to make GROUP BY cube(..) refer to the CUBE grouping sets feature, and not the function cube(). To actually group by a function cube(), unlikely as that might be, the function name has to be quoted. Needs a catversion bump because stored rules may change. Author: Andrew Gierth and Atri Sharma, with contributions from Andres Freund Reviewed-By: Andres Freund, Noah Misch, Tom Lane, Svenne Krap, Tomas Vondra, Erik Rijkers, Marti Raudsepp, Pavel Stehule Discussion: CAOeZVidmVRe2jU6aMk_5qkxnB7dfmPROzM7Ur8JPW5j8Y5X-Lw@mail.gmail.com	2015-05-16 03:46:31 +02:00
Simon Riggs	f6d208d6e5	TABLESAMPLE, SQL Standard and extensible. Add a TABLESAMPLE clause to SELECT statements that allows the user to specify random BERNOULLI sampling or block level SYSTEM sampling. Implementation allows for extensible sampling functions to be written, using a standard API. Basic version follows the SQL standard exactly. Usable concrete use cases for the sampling API follow in later commits. Petr Jelinek. Reviewed by Michael Paquier and Simon Riggs	2015-05-15 14:37:10 -04:00
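
Note on commit 8693ebe37d in the list above: its message describes keeping zero row counts out of the planner's cost arithmetic by clamping estimates before they are used as divisors, and asserting that computed sizes are positive. The following is a minimal sketch of that idea only, not the code from the commit. The names clamp_rows_sketch and rows_per_group_sketch are hypothetical and chosen for illustration, and the rule "never less than one row" is an assumption about how such a clamp might behave.

```c
#include <assert.h>

/*
 * Hypothetical stand-in for a planner row-estimate clamp (an assumption,
 * not PostgreSQL's actual function): never return less than one row, and
 * map NaN to one row as well.
 */
static double
clamp_rows_sketch(double nrows)
{
    if (nrows != nrows || nrows < 1.0)  /* NaN, zero, negative or fractional */
        return 1.0;
    return nrows;
}

/*
 * Hypothetical estimate of rows per group. Clamping the divisor first
 * means a zero or tiny distinct-value estimate cannot produce a division
 * by zero, so no infinity leaks into later arithmetic for finite inputs.
 */
static double
rows_per_group_sketch(double input_rows, double ndistinct)
{
    double groups = clamp_rows_sketch(ndistinct);
    double per_group;

    /* Cheap safety check, in the spirit of the Assert the commit adds. */
    assert(groups > 0.0);

    per_group = clamp_rows_sketch(input_rows) / groups;
    return clamp_rows_sketch(per_group);
}
```

Clamping at the point where the estimate is produced, rather than at each division site, keeps the guard in one place. That appears to be the design the commit message leans toward when it mentions applying a clamp to all the nonconstant return values.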

398 Commits