1
0
mirror of https://github.com/postgres/postgres.git synced 2025-06-25 01:02:05 +03:00

Avoid using a cursor in plpgsql's RETURN QUERY statement.

plpgsql has always executed the query given in a RETURN QUERY command
by opening it as a cursor and then fetching a few rows at a time,
which it turns around and dumps into the function's result tuplestore.
The point of this was to keep from blowing out memory with an oversized
SPITupleTable result (note that while a tuplestore can spill tuples
to disk, SPITupleTable cannot).  However, it's rather inefficient, both
because of extra data copying and because of executor entry/exit
overhead.  In recent versions, a new performance problem has emerged:
use of a cursor prevents use of a parallel plan for the executed query.

We can improve matters by skipping use of a cursor and having the
executor push result tuples directly into the function's result
tuplestore.  However, a moderate amount of new infrastructure is needed
to make that idea work:

* We can use the existing tstoreReceiver.c DestReceiver code to funnel
executor output to the tuplestore, but it has to be extended to support
plpgsql's requirement for possibly applying a tuple conversion map.

* SPI needs to be extended to allow use of a caller-supplied
DestReceiver instead of its usual receiver that puts tuples into
a SPITupleTable.  Two new API calls are needed to handle both the
RETURN QUERY and RETURN QUERY EXECUTE cases.

I also felt that I didn't want these new API calls to use the legacy
method of specifying query parameter values with "char" null flags
(the old ' '/'n' convention); rather they should accept ParamListInfo
objects containing the parameter type and value info.  This required
a bit of additional new infrastructure since we didn't yet have any
parse analysis callback that would interpret $N parameter symbols
according to type data supplied in a ParamListInfo.  There seems to be
no harm in letting makeParamList install that callback by default,
rather than leaving a new ParamListInfo's parserSetup hook as NULL.
(Indeed, as of HEAD, I couldn't find anyplace that was using the
parserSetup field at all; plpgsql was using parserSetupArg for its
own purposes, but parserSetup seemed to be write-only.)

We can actually get plpgsql out of the business of using legacy null
flags altogether, and using ParamListInfo instead of its ad-hoc
PreparedParamsData structure; but this requires inventing one more
SPI API call that can replace SPI_cursor_open_with_args.  That seems
worth doing, though.

SPI_execute_with_args and SPI_cursor_open_with_args are now unused
anywhere in the core PG distribution.  Perhaps someday we could
deprecate/remove them.  But cleaning up the crufty bits of the SPI
API is a task for a different patch.

Per bug #16040 from Jeremy Smith.  This is unfortunately too invasive to
consider back-patching.  Patch by me; thanks to Hamid Akhtar for review.

Discussion: https://postgr.es/m/16040-eaacad11fecfb198@postgresql.org
This commit is contained in:
Tom Lane
2020-06-12 12:14:32 -04:00
parent aaf8c99050
commit 2f48ede080
9 changed files with 800 additions and 151 deletions

View File

@ -785,6 +785,133 @@ int SPI_execute_with_args(const char *<parameter>command</parameter>,
<!-- *********************************************** -->
<refentry id="spi-spi-execute-with-receiver">
<indexterm><primary>SPI_execute_with_receiver</primary></indexterm>
<refmeta>
<refentrytitle>SPI_execute_with_receiver</refentrytitle>
<manvolnum>3</manvolnum>
</refmeta>
<refnamediv>
<refname>SPI_execute_with_receiver</refname>
<refpurpose>execute a command with out-of-line parameters</refpurpose>
</refnamediv>
<refsynopsisdiv>
<synopsis>
int SPI_execute_with_receiver(const char *<parameter>command</parameter>,
ParamListInfo <parameter>params</parameter>,
bool <parameter>read_only</parameter>,
long <parameter>count</parameter>,
DestReceiver *<parameter>dest</parameter>)
</synopsis>
</refsynopsisdiv>
<refsect1>
<title>Description</title>
<para>
<function>SPI_execute_with_receiver</function> executes a command that might
include references to externally supplied parameters. The command text
refers to a parameter as <literal>$<replaceable>n</replaceable></literal>,
and the <parameter>params</parameter> object provides values and type
information for each such symbol.
<parameter>read_only</parameter> and <parameter>count</parameter> have
the same interpretation as in <function>SPI_execute</function>.
</para>
<para>
If <parameter>dest</parameter> is not NULL, then result tuples are passed
to that object as they are generated by the executor, instead of being
accumulated in <varname>SPI_tuptable</varname>. Using a
caller-supplied <literal>DestReceiver</literal> object is particularly
helpful for queries that might generate many tuples, since the data can
be processed on-the-fly instead of being accumulated in memory.
</para>
<para>
The <parameter>params</parameter> object should normally mark each
parameter with the <literal>PARAM_FLAG_CONST</literal> flag, since
a one-shot plan is always used for the query.
</para>
</refsect1>
<refsect1>
<title>Arguments</title>
<variablelist>
<varlistentry>
<term><literal>const char * <parameter>command</parameter></literal></term>
<listitem>
<para>
command string
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><literal>ParamListInfo <parameter>params</parameter></literal></term>
<listitem>
<para>
data structure containing parameter types and values; NULL if none
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><literal>bool <parameter>read_only</parameter></literal></term>
<listitem>
<para><literal>true</literal> for read-only execution</para>
</listitem>
</varlistentry>
<varlistentry>
<term><literal>long <parameter>count</parameter></literal></term>
<listitem>
<para>
maximum number of rows to return,
or <literal>0</literal> for no limit
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><literal>DestReceiver * <parameter>dest</parameter></literal></term>
<listitem>
<para>
<literal>DestReceiver</literal> object that will receive any tuples
emitted by the query; if NULL, tuples are returned
in <varname>SPI_tuptable</varname>
</para>
</listitem>
</varlistentry>
</variablelist>
</refsect1>
<refsect1>
<title>Return Value</title>
<para>
The return value is the same as for <function>SPI_execute</function>.
</para>
<para>
When <parameter>dest</parameter> is NULL,
<varname>SPI_processed</varname> and
<varname>SPI_tuptable</varname> are set as in
<function>SPI_execute</function>.
When <parameter>dest</parameter> is not NULL,
<varname>SPI_processed</varname> is set to zero and
<varname>SPI_tuptable</varname> is set to NULL. If a tuple count
is required, the caller's <literal>DestReceiver</literal> object must
calculate it.
</para>
</refsect1>
</refentry>
<!-- *********************************************** -->
<refentry id="spi-spi-prepare">
<indexterm><primary>SPI_prepare</primary></indexterm>
@ -1564,6 +1691,120 @@ int SPI_execute_plan_with_paramlist(SPIPlanPtr <parameter>plan</parameter>,
<!-- *********************************************** -->
<refentry id="spi-spi-execute-plan-with-receiver">
<indexterm><primary>SPI_execute_plan_with_receiver</primary></indexterm>
<refmeta>
<refentrytitle>SPI_execute_plan_with_receiver</refentrytitle>
<manvolnum>3</manvolnum>
</refmeta>
<refnamediv>
<refname>SPI_execute_plan_with_receiver</refname>
<refpurpose>execute a statement prepared by <function>SPI_prepare</function></refpurpose>
</refnamediv>
<refsynopsisdiv>
<synopsis>
int SPI_execute_plan_with_receiver(SPIPlanPtr <parameter>plan</parameter>,
ParamListInfo <parameter>params</parameter>,
bool <parameter>read_only</parameter>,
long <parameter>count</parameter>,
DestReceiver *<parameter>dest</parameter>)
</synopsis>
</refsynopsisdiv>
<refsect1>
<title>Description</title>
<para>
<function>SPI_execute_plan_with_receiver</function> executes a statement
prepared by <function>SPI_prepare</function>. This function is
equivalent to <function>SPI_execute_plan_with_paramlist</function>
except that, instead of always accumulating the result tuples into a
<varname>SPI_tuptable</varname> structure, tuples can be passed to a
caller-supplied <literal>DestReceiver</literal> object as they are
generated by the executor. This is particularly helpful for queries
that might generate many tuples, since the data can be processed
on-the-fly instead of being accumulated in memory.
</para>
</refsect1>
<refsect1>
<title>Arguments</title>
<variablelist>
<varlistentry>
<term><literal>SPIPlanPtr <parameter>plan</parameter></literal></term>
<listitem>
<para>
prepared statement (returned by <function>SPI_prepare</function>)
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><literal>ParamListInfo <parameter>params</parameter></literal></term>
<listitem>
<para>
data structure containing parameter types and values; NULL if none
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><literal>bool <parameter>read_only</parameter></literal></term>
<listitem>
<para><literal>true</literal> for read-only execution</para>
</listitem>
</varlistentry>
<varlistentry>
<term><literal>long <parameter>count</parameter></literal></term>
<listitem>
<para>
maximum number of rows to return,
or <literal>0</literal> for no limit
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><literal>DestReceiver * <parameter>dest</parameter></literal></term>
<listitem>
<para>
<literal>DestReceiver</literal> object that will receive any tuples
emitted by the query; if NULL, this function is exactly equivalent to
<function>SPI_execute_plan_with_paramlist</function>
</para>
</listitem>
</varlistentry>
</variablelist>
</refsect1>
<refsect1>
<title>Return Value</title>
<para>
The return value is the same as for <function>SPI_execute_plan</function>.
</para>
<para>
When <parameter>dest</parameter> is NULL,
<varname>SPI_processed</varname> and
<varname>SPI_tuptable</varname> are set as in
<function>SPI_execute_plan</function>.
When <parameter>dest</parameter> is not NULL,
<varname>SPI_processed</varname> is set to zero and
<varname>SPI_tuptable</varname> is set to NULL. If a tuple count
is required, the caller's <literal>DestReceiver</literal> object must
calculate it.
</para>
</refsect1>
</refentry>
<!-- *********************************************** -->
<refentry id="spi-spi-execp">
<indexterm><primary>SPI_execp</primary></indexterm>
@ -2041,6 +2282,114 @@ Portal SPI_cursor_open_with_paramlist(const char *<parameter>name</parameter>,
<!-- *********************************************** -->
<refentry id="spi-spi-cursor-parse-open-with-paramlist">
<indexterm><primary>SPI_cursor_parse_open_with_paramlist</primary></indexterm>
<refmeta>
<refentrytitle>SPI_cursor_parse_open_with_paramlist</refentrytitle>
<manvolnum>3</manvolnum>
</refmeta>
<refnamediv>
<refname>SPI_cursor_parse_open_with_paramlist</refname>
<refpurpose>set up a cursor using a query and parameters</refpurpose>
</refnamediv>
<refsynopsisdiv>
<synopsis>
Portal SPI_cursor_parse_open_with_paramlist(const char *<parameter>name</parameter>,
const char *<parameter>command</parameter>,
ParamListInfo <parameter>params</parameter>,
bool <parameter>read_only</parameter>,
int <parameter>cursorOptions</parameter>)
</synopsis>
</refsynopsisdiv>
<refsect1>
<title>Description</title>
<para>
<function>SPI_cursor_parse_open_with_paramlist</function> sets up a cursor
(internally, a portal) that will execute the specified query. This
function is equivalent to <function>SPI_cursor_open_with_args</function>
except that any parameters referenced by the query are provided by
a <literal>ParamListInfo</literal> object, rather than in ad-hoc arrays.
</para>
<para>
The <parameter>params</parameter> object should normally mark each
parameter with the <literal>PARAM_FLAG_CONST</literal> flag, since
a one-shot plan is always used for the query.
</para>
<para>
The passed-in parameter data will be copied into the cursor's portal, so it
can be freed while the cursor still exists.
</para>
</refsect1>
<refsect1>
<title>Arguments</title>
<variablelist>
<varlistentry>
<term><literal>const char * <parameter>name</parameter></literal></term>
<listitem>
<para>
name for portal, or <symbol>NULL</symbol> to let the system
select a name
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><literal>const char * <parameter>command</parameter></literal></term>
<listitem>
<para>
command string
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><literal>ParamListInfo <parameter>params</parameter></literal></term>
<listitem>
<para>
data structure containing parameter types and values; NULL if none
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><literal>bool <parameter>read_only</parameter></literal></term>
<listitem>
<para><literal>true</literal> for read-only execution</para>
</listitem>
</varlistentry>
<varlistentry>
<term><literal>int <parameter>cursorOptions</parameter></literal></term>
<listitem>
<para>
integer bit mask of cursor options; zero produces default behavior
</para>
</listitem>
</varlistentry>
</variablelist>
</refsect1>
<refsect1>
<title>Return Value</title>
<para>
Pointer to portal containing the cursor. Note there is no error
return convention; any error will be reported via <function>elog</function>.
</para>
</refsect1>
</refentry>
<!-- *********************************************** -->
<refentry id="spi-spi-cursor-find">
<indexterm><primary>SPI_cursor_find</primary></indexterm>