mirror of
https://github.com/postgres/postgres.git
synced 2025-05-02 11:44:50 +03:00
Standard English uses "may", "can", and "might" in different ways: may - permission, "You may borrow my rake." can - ability, "I can lift that log." might - possibility, "It might rain today." Unfortunately, in conversational English, their use is often mixed, as in, "You may use this variable to do X", when in fact, "can" is a better choice. Similarly, "It may crash" is better stated, "It might crash". Also update two error messages mentioned in the documentation to match.
541 lines
18 KiB
Plaintext
<!-- $PostgreSQL: pgsql/doc/src/sgml/plpython.sgml,v 1.37 2007/01/31 20:56:18 momjian Exp $ -->
|
|
|
|
<chapter id="plpython">
|
|
<title>PL/Python - Python Procedural Language</title>
|
|
|
|
<indexterm zone="plpython"><primary>PL/Python</></>
|
|
<indexterm zone="plpython"><primary>Python</></>
|
|
|
|
<para>
|
|
The <application>PL/Python</application> procedural language allows
|
|
<productname>PostgreSQL</productname> functions to be written in the
|
|
<ulink url="http://www.python.org">Python language</ulink>.
|
|
</para>
|
|
|
|
<para>
|
|
To install PL/Python in a particular database, use
|
|
<literal>createlang plpythonu <replaceable>dbname</></literal>.
|
|
</para>
|
|
|
|
<tip>
|
|
<para>
|
|
If a language is installed into <literal>template1</>, all subsequently
|
|
created databases will have the language installed automatically.
|
|
</para>
|
|
</tip>
|
|
|
|
<para>
|
|
As of <productname>PostgreSQL</productname> 7.4, PL/Python is only
|
|
available as an <quote>untrusted</> language (meaning it does not
|
|
offer any way of restricting what users can do in it). It has
|
|
therefore been renamed to <literal>plpythonu</>. The trusted
|
|
variant <literal>plpython</> might become available again in the future,
|
|
if a new secure execution mechanism is developed in Python.
|
|
</para>
|
|
|
|
<note>
|
|
<para>
|
|
Users of source packages must specially enable the build of
|
|
PL/Python during the installation process. (Refer to the
|
|
installation instructions for more information.) Users of binary
|
|
packages might find PL/Python in a separate subpackage.
|
|
</para>
|
|
</note>
|
|
|
|
<sect1 id="plpython-funcs">
|
|
<title>PL/Python Functions</title>
|
|
|
|
<para>
|
|
Functions in PL/Python are declared via the standard <xref
|
|
linkend="sql-createfunction" endterm="sql-createfunction-title">
|
|
syntax:
|
|
|
|
<programlisting>
CREATE FUNCTION <replaceable>funcname</replaceable> (<replaceable>argument-list</replaceable>)
  RETURNS <replaceable>return-type</replaceable>
AS $$
  # PL/Python function body
$$ LANGUAGE plpythonu;
</programlisting>
|
|
</para>
|
|
|
|
<para>
|
|
The body of a function is simply a Python script. When the function
|
|
is called, its arguments are passed as elements of the array
|
|
<varname>args[]</varname>; named arguments are also passed as ordinary
|
|
variables to the Python script. The result is returned from the Python code
|
|
in the usual way, with <literal>return</literal> or
|
|
<literal>yield</literal> (in the case of a set-returning function).
|
|
</para>
|
|
|
|
<para>
|
|
For example, a function to return the greater of two integers can be
|
|
defined as:
|
|
|
|
<programlisting>
CREATE FUNCTION pymax (a integer, b integer)
  RETURNS integer
AS $$
  if a > b:
    return a
  return b
$$ LANGUAGE plpythonu;
</programlisting>
|
|
|
|
The Python code that is given as the body of the function definition
|
|
is transformed into a Python function. For example, the above results in
|
|
|
|
<programlisting>
def __plpython_procedure_pymax_23456():
  if a > b:
    return a
  return b
</programlisting>
|
|
|
|
assuming that 23456 is the OID assigned to the function by
|
|
<productname>PostgreSQL</productname>.
|
|
</para>
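<para>
The wrapping step described above can be imitated outside the server.
The following is a minimal sketch in standalone Python 3, not
PL/Python's actual implementation: the helper
<function>wrap_body</function> is hypothetical, and supplying
<varname>a</varname>, <varname>b</varname>, and <varname>args</varname>
through a plain dictionary of globals is an assumption made for
illustration.
</para>

```python
import textwrap


def wrap_body(name, oid, body):
    """Hypothetical helper: wrap a function body in a zero-argument def."""
    func_name = "__plpython_procedure_%s_%d" % (name, oid)
    indented = textwrap.indent(textwrap.dedent(body), "    ")
    source = "def %s():\n%s" % (func_name, indented)
    return func_name, source


body = """
if a > b:
    return a
return b
"""

func_name, source = wrap_body("pymax", 23456, body)

# Assumption: arguments are visible to the generated function as globals.
env = {"a": 3, "b": 7, "args": [3, 7]}
exec(source, env)
result = env[func_name]()
```

<para>
Because the generated function takes no parameters, the body finds
<varname>a</varname> and <varname>b</varname> by ordinary global name
lookup, which is why both the named form and
<varname>args</varname> can be offered side by side.
</para>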
|
|
|
|
<para>
|
|
The <productname>PostgreSQL</> function parameters are available in
|
|
the global <varname>args</varname> list. In the
|
|
<function>pymax</function> example, <varname>args[0]</varname> contains
|
|
whatever was passed in as the first argument and
|
|
<varname>args[1]</varname> contains the second argument's
|
|
value. Alternatively, one can use named parameters as shown in the example
|
|
above. Use of named parameters is usually more readable.
|
|
</para>
|
|
|
|
<para>
|
|
If an SQL null value<indexterm><primary>null value</primary><secondary
|
|
sortas="PL/Python">PL/Python</secondary></indexterm> is passed to a
|
|
function, the argument value will appear as <symbol>None</symbol> in
|
|
Python. The above function definition will return the wrong answer for null
|
|
inputs. We could add <literal>STRICT</literal> to the function definition
|
|
to make <productname>PostgreSQL</productname> do something more reasonable:
|
|
if a null value is passed, the function will not be called at all,
|
|
but will just return a null result automatically. Alternatively,
|
|
we could check for null inputs in the function body:
|
|
|
|
<programlisting>
CREATE FUNCTION pymax (a integer, b integer)
  RETURNS integer
AS $$
  if (a is None) or (b is None):
    return None
  if a > b:
    return a
  return b
$$ LANGUAGE plpythonu;
</programlisting>
|
|
|
|
As shown above, to return an SQL null value from a PL/Python
|
|
function, return the value <symbol>None</symbol>. This can be done whether the
|
|
function is strict or not.
|
|
</para>
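<para>
The two approaches to null inputs can be compared in plain Python.
This is only a sketch under stated assumptions: the
<function>strict</function> decorator is a hypothetical stand-in for
what <literal>STRICT</literal> makes the server do, and nothing here
runs inside <productname>PostgreSQL</productname>.
</para>

```python
def strict(func):
    """Hypothetical decorator imitating STRICT: skip the call on any None."""
    def wrapper(*args):
        if any(arg is None for arg in args):
            return None
        return func(*args)
    return wrapper


def pymax_checked(a, b):
    # Explicit check in the function body, as in the example above.
    if (a is None) or (b is None):
        return None
    if a > b:
        return a
    return b


@strict
def pymax_strict(a, b):
    # The body never sees None because the wrapper returns early.
    if a > b:
        return a
    return b


checked = [pymax_checked(2, 3), pymax_checked(None, 3), pymax_checked(2, None)]
via_strict = [pymax_strict(2, 3), pymax_strict(None, 3), pymax_strict(2, None)]
```

<para>
Both variants give the same results here. The explicit check is the
one to use when only some null inputs should produce a null result.
</para>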
|
|
|
|
<para>
|
|
Composite-type arguments are passed to the function as Python mappings. The
|
|
element names of the mapping are the attribute names of the composite type.
|
|
If an attribute in the passed row has the null value, it has the value
|
|
<symbol>None</symbol> in the mapping. Here is an example:
|
|
|
|
<programlisting>
CREATE TABLE employee (
  name text,
  salary integer,
  age integer
);

CREATE FUNCTION overpaid (e employee)
  RETURNS boolean
AS $$
  if e["salary"] > 200000:
    return True
  if (e["age"] < 30) and (e["salary"] > 100000):
    return True
  return False
$$ LANGUAGE plpythonu;
</programlisting>
|
|
</para>
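<para>
Since a null attribute arrives as <symbol>None</symbol>, a function
body that compares attributes may want to guard against it first. The
sketch below is plain Python with hand-built dictionaries standing in
for <literal>employee</literal> rows. The guard logic is an
illustrative assumption, not part of the example above.
</para>

```python
def overpaid(e):
    # A null salary gives no basis for a decision, so return None (SQL null).
    if e["salary"] is None:
        return None
    if e["salary"] > 200000:
        return True
    # Only apply the age rule when the age is known.
    if e["age"] is not None and e["age"] < 30 and e["salary"] > 100000:
        return True
    return False


# Hand-built mappings; the keys are the attribute names of the composite type.
rows = [
    {"name": "Ann", "salary": 250000, "age": 45},
    {"name": "Bob", "salary": 120000, "age": 28},
    {"name": "Cy", "salary": None, "age": 33},
    {"name": "Di", "salary": 120000, "age": None},
]
results = [overpaid(row) for row in rows]
```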
|
|
|
|
<para>
|
|
There are multiple ways to return row or composite types from a Python
|
|
function. The following examples assume we have:
|
|
|
|
<programlisting>
CREATE TYPE named_value AS (
  name text,
  value integer
);
</programlisting>
|
|
|
|
A composite result can be returned as a:
|
|
|
|
<variablelist>
|
|
<varlistentry>
|
|
<term>Sequence type (a tuple or list, but not a set because
|
|
it is not indexable)</term>
|
|
<listitem>
|
|
<para>
|
|
Returned sequence objects must have the same number of items as the
|
|
composite result type has fields. The item with index 0 is assigned to
|
|
the first field of the composite type, 1 to the second and so on. For
|
|
example:
|
|
|
|
<programlisting>
CREATE FUNCTION make_pair (name text, value integer)
  RETURNS named_value
AS $$
  return [ name, value ]
  # or alternatively, as tuple: return ( name, value )
$$ LANGUAGE plpythonu;
</programlisting>
|
|
|
|
To return an SQL null for any column, insert <symbol>None</symbol> at
|
|
the corresponding position.
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term>Mapping (dictionary)</term>
|
|
<listitem>
|
|
<para>
|
|
The value for each result type column is retrieved from the mapping
|
|
with the column name as key. Example:
|
|
|
|
<programlisting>
CREATE FUNCTION make_pair (name text, value integer)
  RETURNS named_value
AS $$
  return { "name": name, "value": value }
$$ LANGUAGE plpythonu;
</programlisting>
|
|
|
|
Any extra dictionary key/value pairs are ignored. Missing keys are
|
|
treated as errors.
|
|
To return an SQL null value for any column, insert
|
|
<symbol>None</symbol> with the corresponding column name as the key.
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term>Object (any object providing method <literal>__getattr__</literal>)</term>
|
|
<listitem>
|
|
<para>
|
|
This works the same as a mapping.
|
|
Example:
|
|
|
|
<programlisting>
CREATE FUNCTION make_pair (name text, value integer)
  RETURNS named_value
AS $$
  class named_value:
    def __init__ (self, n, v):
      self.name = n
      self.value = v
  return named_value(name, value)

  # or simply
  class nv: pass
  nv.name = name
  nv.value = value
  return nv
$$ LANGUAGE plpythonu;
</programlisting>
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
</variablelist>
|
|
</para>
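<para>
The rules for the three return styles can be summarized in a short
plain-Python sketch. The helper <function>to_row</function> is
hypothetical and only mirrors the behavior described in the list
above. It is not how PL/Python itself is implemented.
</para>

```python
def to_row(result, columns):
    """Hypothetical helper: map a returned object onto the result columns."""
    if isinstance(result, dict):
        # Mapping: look up each column by name; extra keys are ignored.
        missing = [c for c in columns if c not in result]
        if missing:
            raise KeyError("missing keys: %s" % ", ".join(missing))
        return tuple(result[c] for c in columns)
    if isinstance(result, (list, tuple)):
        # Sequence: item 0 goes to the first column, item 1 to the second, ...
        if len(result) != len(columns):
            raise ValueError(
                "expected %d items, got %d" % (len(columns), len(result)))
        return tuple(result)
    # Any other object: read one attribute per column name.
    return tuple(getattr(result, c) for c in columns)


class NamedValue:
    def __init__(self, n, v):
        self.name = n
        self.value = v


columns = ["name", "value"]
from_sequence = to_row(["x", 1], columns)
from_mapping = to_row({"name": "x", "value": 1, "extra": "ignored"}, columns)
from_object = to_row(NamedValue("x", None), columns)  # None stands for SQL null
```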
|
|
|
|
<para>
|
|
If you do not provide a return value, Python returns the default
|
|
<symbol>None</symbol>. <application>PL/Python</application> translates
|
|
Python's <symbol>None</symbol> into the SQL null value.
|
|
</para>
|
|
|
|
<para>
|
|
A <application>PL/Python</application> function can also return sets of
|
|
scalar or composite types. There are several ways to achieve this because
|
|
the returned object is internally turned into an iterator. The following
|
|
examples assume we have the composite type:
|
|
|
|
<programlisting>
CREATE TYPE greeting AS (
  how text,
  who text
);
</programlisting>
|
|
|
|
A set result can be returned from a:
|
|
|
|
<variablelist>
|
|
<varlistentry>
|
|
<term>Sequence type (tuple, list, set)</term>
|
|
<listitem>
|
|
<para>
|
|
<programlisting>
CREATE FUNCTION greet (how text)
  RETURNS SETOF greeting
AS $$
  # return tuple containing lists as composite types
  # all other combinations work also
  return ( [ how, "World" ], [ how, "PostgreSQL" ], [ how, "PL/Python" ] )
$$ LANGUAGE plpythonu;
</programlisting>
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term>Iterator (any object providing <symbol>__iter__</symbol> and
|
|
<symbol>next</symbol> methods)</term>
|
|
<listitem>
|
|
<para>
|
|
<programlisting>
CREATE FUNCTION greet (how text)
  RETURNS SETOF greeting
AS $$
  class producer:
    def __init__ (self, how, who):
      self.how = how
      self.who = who
      self.ndx = -1

    def __iter__ (self):
      return self

    def next (self):
      self.ndx += 1
      if self.ndx == len(self.who):
        raise StopIteration
      return ( self.how, self.who[self.ndx] )

  return producer(how, [ "World", "PostgreSQL", "PL/Python" ])
$$ LANGUAGE plpythonu;
</programlisting>
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term>Generator (<literal>yield</literal>)</term>
|
|
<listitem>
|
|
<para>
|
|
<programlisting>
CREATE FUNCTION greet (how text)
  RETURNS SETOF greeting
AS $$
  for who in [ "World", "PostgreSQL", "PL/Python" ]:
    yield ( how, who )
$$ LANGUAGE plpythonu;
</programlisting>
|
|
|
|
<warning>
|
|
<para>
|
|
Currently, due to Python
|
|
<ulink url="http://sourceforge.net/tracker/index.php?func=detail&aid=1483133&group_id=5470&atid=105470">bug #1483133</ulink>,
|
|
some debug versions of Python 2.4
|
|
(configured and compiled with option <literal>--with-pydebug</literal>)
|
|
are known to crash the <productname>PostgreSQL</productname> server
|
|
when using an iterator to return a set result.
|
|
Unpatched versions of Fedora 4 contain this bug.
|
|
It does not happen in production versions of Python or on patched
|
|
versions of Fedora 4.
|
|
</para>
|
|
</warning>
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
</variablelist>
|
|
</para>
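<para>
That all three styles work because the returned object is turned into
an iterator can be checked in plain Python. The sketch below is written
for standalone Python 3, where the iterator method is spelled
<symbol>__next__</symbol> rather than <symbol>next</symbol>. The helper
<function>collect</function> is hypothetical and merely stands in for
the server consuming the rows.
</para>

```python
NAMES = ["World", "PostgreSQL", "PL/Python"]


class Producer:
    def __init__(self, how, who):
        self.how = how
        self.who = who
        self.ndx = -1

    def __iter__(self):
        return self

    def __next__(self):  # spelled "next" in the Python version described above
        self.ndx += 1
        if self.ndx == len(self.who):
            raise StopIteration
        return (self.how, self.who[self.ndx])


def greet_generator(how):
    for who in NAMES:
        yield (how, who)


def collect(returned):
    """Hypothetical helper: consume any return style through iter()."""
    return [tuple(row) for row in iter(returned)]


from_sequence = collect(tuple([how, who] for how, who in
                              [("Hello", n) for n in NAMES]))
from_iterator = collect(Producer("Hello", NAMES))
from_generator = collect(greet_generator("Hello"))
```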
|
|
|
|
<para>
|
|
The global dictionary <varname>SD</varname> is available to store
|
|
data between function calls. This variable is private static data.
|
|
The global dictionary <varname>GD</varname> is public data,
|
|
available to all Python functions within a session. Use with
|
|
care.<indexterm><primary>global data</><secondary>in
|
|
PL/Python</></indexterm>
|
|
</para>
|
|
|
|
<para>
|
|
Each function gets its own execution environment in the
|
|
Python interpreter, so that global data and function arguments from
|
|
<function>myfunc</function> are not available to
|
|
<function>myfunc2</function>. The exception is the data in the
|
|
<varname>GD</varname> dictionary, as mentioned above.
|
|
</para>
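<para>
The difference between <varname>SD</varname> and <varname>GD</varname>
can be pictured with ordinary dictionaries. The following plain-Python
sketch is an analogy only: <function>make_function</function> is a
hypothetical helper, and passing the dictionaries as parameters is an
assumption made so that the code runs outside the server.
</para>

```python
GD = {}  # one dictionary shared by every function in the simulated session


def make_function(body):
    """Hypothetical helper: give each function its own private SD."""
    SD = {}

    def call(*args):
        return body(SD, GD, *args)

    return call


def count_calls(SD, GD):
    # SD survives between calls of the same function only.
    SD["calls"] = SD.get("calls", 0) + 1
    # GD is visible to, and modifiable by, every function.
    GD["total_calls"] = GD.get("total_calls", 0) + 1
    return SD["calls"]


myfunc = make_function(count_calls)
myfunc2 = make_function(count_calls)

history = [myfunc(), myfunc(), myfunc2()]
```

<para>
Each function keeps its own count, while the shared total reflects
every call. This is also why <varname>GD</varname> needs care: any
function can overwrite an entry that another one relies on.
</para>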
|
|
</sect1>
|
|
|
|
<sect1 id="plpython-trigger">
|
|
<title>Trigger Functions</title>
|
|
|
|
<indexterm zone="plpython-trigger">
|
|
<primary>trigger</primary>
|
|
<secondary>in PL/Python</secondary>
|
|
</indexterm>
|
|
|
|
<para>
|
|
When a function is used as a trigger, the dictionary
|
|
<literal>TD</literal> contains trigger-related values. The trigger
|
|
rows are in <literal>TD["new"]</> and/or <literal>TD["old"]</>
|
|
depending on the trigger event. <literal>TD["event"]</> contains
|
|
the event as a string (<literal>INSERT</>, <literal>UPDATE</>,
|
|
<literal>DELETE</>, or <literal>UNKNOWN</>).
|
|
<literal>TD["when"]</> contains one of <literal>BEFORE</>,
<literal>AFTER</>, or <literal>UNKNOWN</>.
<literal>TD["level"]</> contains one of <literal>ROW</>,
<literal>STATEMENT</>, or <literal>UNKNOWN</>.
|
|
<literal>TD["name"]</> contains the trigger name,
|
|
<literal>TD["table_name"]</> contains the name of the table on which the trigger occurred,
|
|
<literal>TD["table_schema"]</> contains the schema of the table on which the trigger occurred, and
<literal>TD["relid"]</> contains the OID of the table on
|
|
which the trigger occurred. If the <command>CREATE TRIGGER</> command
|
|
included arguments, they are available in <literal>TD["args"][0]</> to
|
|
<literal>TD["args"][(<replaceable>n</>-1)]</>.
|
|
</para>
|
|
|
|
<para>
|
|
If <literal>TD["when"]</literal> is <literal>BEFORE</>, you can
|
|
return <literal>None</literal> or <literal>"OK"</literal> from the
|
|
Python function to indicate the row is unmodified,
|
|
<literal>"SKIP"</> to abort the event, or <literal>"MODIFY"</> to
|
|
indicate you've modified the row.
|
|
</para>
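<para>
A trigger body that uses these return values might look like the
following sketch. It is plain Python with a hand-built
<literal>TD</literal> dictionary so that it can run outside the server.
The column <literal>name</literal>, the trigger and table names, and
the trimming rule are illustrative assumptions.
</para>

```python
def trigger_body(TD):
    # Only row-level BEFORE triggers can influence the row in this sketch.
    if TD["when"] != "BEFORE" or TD["level"] != "ROW":
        return None
    if TD["event"] == "DELETE":
        return "OK"          # leave the row as it is
    row = TD["new"]
    if row["name"] is None:
        return "SKIP"        # abort the event for this row
    stripped = row["name"].strip()
    if stripped != row["name"]:
        row["name"] = stripped
        return "MODIFY"      # tell the caller the row was changed
    return "OK"


def make_td(event, new):
    """Hypothetical helper: build a TD dictionary by hand."""
    return {
        "event": event, "when": "BEFORE", "level": "ROW",
        "name": "trim_name", "table_name": "people",
        "table_schema": "public", "relid": 0, "args": [],
        "new": new, "old": None,
    }


padded = make_td("INSERT", {"name": "  Ann "})
outcomes = [
    trigger_body(padded),
    trigger_body(make_td("INSERT", {"name": "Bob"})),
    trigger_body(make_td("UPDATE", {"name": None})),
]
```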
|
|
</sect1>
|
|
|
|
<sect1 id="plpython-database">
|
|
<title>Database Access</title>
|
|
|
|
<para>
|
|
The PL/Python language module automatically imports a Python module
|
|
called <literal>plpy</literal>. The functions and constants in
|
|
this module are available to you in the Python code as
|
|
<literal>plpy.<replaceable>foo</replaceable></literal>. At present
|
|
<literal>plpy</literal> implements the functions
|
|
<literal>plpy.debug(<replaceable>msg</>)</literal>,
|
|
<literal>plpy.log(<replaceable>msg</>)</literal>,
|
|
<literal>plpy.info(<replaceable>msg</>)</literal>,
|
|
<literal>plpy.notice(<replaceable>msg</>)</literal>,
|
|
<literal>plpy.warning(<replaceable>msg</>)</literal>,
|
|
<literal>plpy.error(<replaceable>msg</>)</literal>, and
|
|
<literal>plpy.fatal(<replaceable>msg</>)</literal>.<indexterm><primary>elog</><secondary>in PL/Python</></indexterm>
|
|
<function>plpy.error</function> and
|
|
<function>plpy.fatal</function> actually raise a Python exception
|
|
which, if uncaught, propagates out to the calling query, causing
|
|
the current transaction or subtransaction to be aborted.
|
|
<literal>raise plpy.ERROR(<replaceable>msg</>)</literal> and
|
|
<literal>raise plpy.FATAL(<replaceable>msg</>)</literal> are
|
|
equivalent to calling
|
|
<function>plpy.error</function> and
|
|
<function>plpy.fatal</function>, respectively.
|
|
The other functions only generate messages of different
|
|
priority levels.
|
|
Whether messages of a particular priority are reported to the client,
|
|
written to the server log, or both is controlled by the
|
|
<xref linkend="guc-log-min-messages"> and
|
|
<xref linkend="guc-client-min-messages"> configuration
|
|
variables. See <xref linkend="runtime-config"> for more information.
|
|
</para>
|
|
|
|
<para>
|
|
Additionally, the <literal>plpy</literal> module provides two
|
|
functions called <function>execute</function> and
|
|
<function>prepare</function>. Calling
|
|
<function>plpy.execute</function> with a query string and an
|
|
optional limit argument causes that query to be run and the result
|
|
to be returned in a result object. The result object emulates a
|
|
list or dictionary object. The result object can be accessed by
|
|
row number and column name. It has these additional methods:
|
|
<function>nrows</function>, which returns the number of rows
returned by the query, and <function>status</function>, which returns the
<function>SPI_execute()</function> return value. The result object
can be modified.
|
|
</para>
|
|
|
|
<para>
|
|
For example,
|
|
<programlisting>
|
|
rv = plpy.execute("SELECT * FROM my_table", 5)
|
|
</programlisting>
|
|
returns up to 5 rows from <literal>my_table</literal>. If
|
|
<literal>my_table</literal> has a column
|
|
<literal>my_column</literal>, it would be accessed as
|
|
<programlisting>
|
|
foo = rv[i]["my_column"]
|
|
</programlisting>
|
|
</para>
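<para>
The access pattern for a result object can be tried out with a small
stand-in. Everything in this plain-Python sketch is hypothetical:
<literal>FakeResult</literal> and <function>fake_execute</function>
only imitate the behavior described above, the status value is an
arbitrary placeholder, and treating a missing limit as "no limit" is an
assumption.
</para>

```python
PLACEHOLDER_STATUS = 0  # arbitrary placeholder, not a real SPI return code


class FakeResult:
    """Hypothetical stand-in for the object returned by plpy.execute."""

    def __init__(self, rows, status):
        self._rows = rows
        self._status = status

    def __getitem__(self, index):
        return self._rows[index]   # row number first, then column name

    def nrows(self):
        return len(self._rows)

    def status(self):
        return self._status


def fake_execute(table_rows, limit=None):
    # Assumption: no limit (None or 0) means all rows are returned.
    rows = table_rows[:limit] if limit else list(table_rows)
    return FakeResult(rows, PLACEHOLDER_STATUS)


my_table = [{"my_column": n} for n in range(10)]
rv = fake_execute(my_table, 5)
values = [rv[i]["my_column"] for i in range(rv.nrows())]
```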
|
|
|
|
<para>
|
|
<indexterm><primary>preparing a query</><secondary>in PL/Python</></indexterm>
|
|
The second function, <function>plpy.prepare</function>, prepares
|
|
the execution plan for a query. It is called with a query string
|
|
and a list of parameter types, if you have parameter references in
|
|
the query. For example:
|
|
<programlisting>
|
|
plan = plpy.prepare("SELECT last_name FROM my_users WHERE first_name = $1", [ "text" ])
|
|
</programlisting>
|
|
<literal>text</literal> is the type of the variable you will be
|
|
passing for <literal>$1</literal>. After preparing a statement, you
|
|
use the function <function>plpy.execute</function> to run it:
|
|
<programlisting>
|
|
rv = plpy.execute(plan, [ "name" ], 5)
|
|
</programlisting>
|
|
The third argument is the limit and is optional.
|
|
</para>
|
|
|
|
<para>
|
|
When you prepare a plan using the PL/Python module, it is
automatically saved. Read the SPI documentation (<xref
linkend="spi">) for a description of what this means.
To make effective use of this across function calls,
you need to use one of the persistent storage dictionaries
|
|
<literal>SD</literal> or <literal>GD</literal> (see
|
|
<xref linkend="plpython-funcs">). For example:
|
|
<programlisting>
CREATE FUNCTION usesavedplan() RETURNS trigger AS $$
  if "plan" in SD:
    plan = SD["plan"]
  else:
    plan = plpy.prepare("SELECT 1")
    SD["plan"] = plan
  # rest of function
$$ LANGUAGE plpythonu;
</programlisting>
|
|
</para>
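<para>
The effect of caching the plan in <literal>SD</literal> can be observed
with a stub. In this plain-Python sketch,
<literal>FakePlpy</literal> is a hypothetical stand-in for the
<literal>plpy</literal> module that only counts calls to
<function>prepare</function>. The membership test uses the
<literal>in</literal> operator so that the code runs on Python 3.
</para>

```python
class FakePlpy:
    """Hypothetical stand-in for plpy; it only counts prepare() calls."""

    def __init__(self):
        self.prepare_calls = 0

    def prepare(self, query, argtypes=None):
        self.prepare_calls += 1
        # A tuple stands in for the opaque plan object.
        return ("plan", query, tuple(argtypes or ()))


plpy = FakePlpy()
SD = {}  # stands in for the function's private storage dictionary


def usesavedplan():
    if "plan" in SD:
        plan = SD["plan"]
    else:
        plan = plpy.prepare("SELECT 1")
        SD["plan"] = plan
    return plan


first = usesavedplan()
second = usesavedplan()
```

<para>
The second call reuses the stored plan, so the query is prepared only
once no matter how often the function runs in the session.
</para>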
|
|
</sect1>
|
|
|
|
<![IGNORE[
|
|
<!-- NOT CURRENTLY SUPPORTED -->
|
|
|
|
<sect1 id="plpython-trusted">
|
|
<title>Restricted Environment</title>
|
|
|
|
<para>
|
|
The current version of <application>PL/Python</application>
|
|
functions as a trusted language only; access to the file system and
|
|
other local resources is disabled. Specifically,
|
|
<application>PL/Python</application> uses the Python restricted
|
|
execution environment, further restricts it to prevent the use of
|
|
the file <function>open</> call, and allows only modules from a
|
|
specific list to be imported. Presently, that list includes:
|
|
<literal>array</>, <literal>bisect</>, <literal>binascii</>,
|
|
<literal>calendar</>, <literal>cmath</>, <literal>codecs</>,
|
|
<literal>errno</>, <literal>marshal</>, <literal>math</>, <literal>md5</>,
|
|
<literal>mpz</>, <literal>operator</>, <literal>pcre</>,
|
|
<literal>pickle</>, <literal>random</>, <literal>re</>, <literal>regex</>,
|
|
<literal>sre</>, <literal>sha</>, <literal>string</>, <literal>StringIO</>,
|
|
<literal>struct</>, <literal>time</>, <literal>whrandom</>, and
|
|
<literal>zlib</>.
|
|
</para>
|
|
</sect1>
|
|
|
|
]]>
|
|
|
|
</chapter>
|