1
0
mirror of https://github.com/postgres/postgres.git synced 2025-07-02 09:02:37 +03:00

Enable parallel SELECT for "INSERT INTO ... SELECT ...".

Parallel SELECT can't be utilized for INSERT in the following cases:
- INSERT statement uses the ON CONFLICT DO UPDATE clause
- Target table has a parallel-unsafe: trigger, index expression or
  predicate, column default expression or check constraint
- Target table has a parallel-unsafe domain constraint on any column
- Target table is a partitioned table with a parallel-unsafe partition key
  expression or support function

The planner is updated to perform additional parallel-safety checks for
the cases listed above, for determining whether it is safe to run INSERT
in parallel-mode with an underlying parallel SELECT. The planner will
consider using parallel SELECT for "INSERT INTO ... SELECT ...", provided
nothing unsafe is found from the additional parallel-safety checks, or
from the existing parallel-safety checks for SELECT.

While checking parallel-safety, we need to check it for all the partitions
on the table which can be costly especially when we decide not to use a
parallel plan. So, in a separate patch, we will introduce a GUC and or a
reloption to enable/disable parallelism for Insert statements.

Prior to entering parallel-mode for the execution of INSERT with parallel
SELECT, a TransactionId is acquired and assigned to the current
transaction state. This is necessary to prevent the INSERT from attempting
to assign the TransactionId whilst in parallel-mode, which is not allowed.
This approach has a disadvantage in that if the underlying SELECT does not
return any rows, then the TransactionId is not used, however that
shouldn't happen in practice in many cases.

Author: Greg Nancarrow, Amit Langote, Amit Kapila
Reviewed-by: Amit Langote, Hou Zhijie, Takayuki Tsunakawa, Antonin Houska, Bharath Rupireddy, Dilip Kumar, Vignesh C, Zhihong Yu, Amit Kapila
Tested-by: Tang, Haiying
Discussion: https://postgr.es/m/CAJcOf-cXnB5cnMKqWEp2E2z7Mvcd04iLVmV=qpFJrR3AcrTS3g@mail.gmail.com
Discussion: https://postgr.es/m/CAJcOf-fAdj=nDKMsRhQzndm-O13NY4dL6xGcEvdX5Xvbbi0V7g@mail.gmail.com
This commit is contained in:
Amit Kapila
2021-03-10 07:38:58 +05:30
parent 0ba71107ef
commit 05c8482f7f
17 changed files with 1531 additions and 21 deletions

View File

@ -305,6 +305,7 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
glob->resultRelations = NIL;
glob->appendRelations = NIL;
glob->relationOids = NIL;
glob->partitionOids = NIL;
glob->invalItems = NIL;
glob->paramExecTypes = NIL;
glob->lastPHId = 0;
@ -316,16 +317,16 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
/*
* Assess whether it's feasible to use parallel mode for this query. We
* can't do this in a standalone backend, or if the command will try to
* modify any data, or if this is a cursor operation, or if GUCs are set
* to values that don't permit parallelism, or if parallel-unsafe
* functions are present in the query tree.
* modify any data (except for Insert), or if this is a cursor operation,
* or if GUCs are set to values that don't permit parallelism, or if
* parallel-unsafe functions are present in the query tree.
*
* (Note that we do allow CREATE TABLE AS, SELECT INTO, and CREATE
* MATERIALIZED VIEW to use parallel plans, but as of now, only the leader
* backend writes into a completely new table. In the future, we can
* extend it to allow workers to write into the table. However, to allow
* parallel updates and deletes, we have to solve other problems,
* especially around combo CIDs.)
* (Note that we do allow CREATE TABLE AS, INSERT INTO...SELECT, SELECT
* INTO, and CREATE MATERIALIZED VIEW to use parallel plans. However, as
* of now, only the leader backend writes into a completely new table. In
* the future, we can extend it to allow workers to write into the table.
* However, to allow parallel updates and deletes, we have to solve other
* problems, especially around combo CIDs.)
*
* For now, we don't try to use parallel mode if we're running inside a
* parallel worker. We might eventually be able to relax this
@ -334,13 +335,14 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
*/
if ((cursorOptions & CURSOR_OPT_PARALLEL_OK) != 0 &&
IsUnderPostmaster &&
parse->commandType == CMD_SELECT &&
(parse->commandType == CMD_SELECT ||
is_parallel_allowed_for_modify(parse)) &&
!parse->hasModifyingCTE &&
max_parallel_workers_per_gather > 0 &&
!IsParallelWorker())
{
/* all the cheap tests pass, so scan the query tree */
glob->maxParallelHazard = max_parallel_hazard(parse);
glob->maxParallelHazard = max_parallel_hazard(parse, glob);
glob->parallelModeOK = (glob->maxParallelHazard != PROPARALLEL_UNSAFE);
}
else
@ -521,6 +523,19 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
result->rewindPlanIDs = glob->rewindPlanIDs;
result->rowMarks = glob->finalrowmarks;
result->relationOids = glob->relationOids;
/*
* Register the Oids of parallel-safety-checked partitions as plan
* dependencies. This is only really needed in the case of a parallel plan
* so that if parallel-unsafe properties are subsequently defined on the
* partitions, the cached parallel plan will be invalidated, and a
* non-parallel plan will be generated.
*
* We also use this list to acquire locks on partitions before executing
* cached plan. See AcquireExecutorLocks().
*/
if (glob->partitionOids != NIL && glob->parallelModeNeeded)
result->partitionOids = glob->partitionOids;
result->invalItems = glob->invalItems;
result->paramExecTypes = glob->paramExecTypes;
/* utilityStmt should be null, but we might as well copy it */