PL/Python - Python Procedural Language
The PL/Python procedural language allows
PostgreSQL functions to be written in the
Python language.
To install PL/Python in a particular database, use
createlang plpythonu dbname.
If a language is installed into template1, all subsequently
created databases will have the language installed automatically.
As of PostgreSQL 7.4, PL/Python is only
available as an untrusted language (meaning it does not
offer any way of restricting what users can do in it). It has
therefore been renamed to plpythonu. The trusted
variant plpython may become available again in the future,
if a new secure execution mechanism is developed in Python.
Users of source packages must specially enable the build of
PL/Python during the installation process. (Refer to the
installation instructions for more information.) Users of binary
packages might find PL/Python in a separate subpackage.
PL/Python Functions
Functions in PL/Python are declared via the standard
syntax:
CREATE FUNCTION funcname (argument-list)
  RETURNS return-type
AS $$
  # PL/Python function body
$$ LANGUAGE plpythonu;
The body of a function is simply a Python script. When the function
is called, its arguments are passed as elements of the list
args; named arguments are additionally available as ordinary
variables in the Python script. The result is returned from the Python code
in the usual way, with return, or with yield in the
case of a function that returns a set.
For example, a function to return the greater of two integers can be
defined as:
CREATE FUNCTION pymax (a integer, b integer)
  RETURNS integer
AS $$
  if a > b:
    return a
  return b
$$ LANGUAGE plpythonu;
The Python code that is given as the body of the function definition
is transformed into a Python function. For example, the above results in
def __plpython_procedure_pymax_23456():
  if a > b:
    return a
  return b
assuming that 23456 is the OID assigned to the function by
PostgreSQL.
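The wrapping step described above can be imitated in ordinary Python outside the server. The following is only an illustrative sketch, not PL/Python's actual implementation: the helper wrap_body is hypothetical, and the named arguments are supplied as global variables of the generated function purely for demonstration.

```python
import textwrap

def wrap_body(name, oid, body):
    """Build the source of a Python function from a function body.

    Hypothetical helper: it only mimics the naming scheme shown above.
    """
    func_name = "__plpython_procedure_%s_%d" % (name, oid)
    indented = textwrap.indent(body.strip("\n"), "    ")
    source = "def %s():\n%s\n" % (func_name, indented)
    return func_name, source

body = """
if a > b:
    return a
return b
"""

func_name, source = wrap_body("pymax", 23456, body)

# Assumption for this sketch: named arguments are visible to the
# generated function as global variables.
env = {"a": 3, "b": 7}
exec(source, env)
result = env[func_name]()
```

Here result holds the greater of the two values placed in env, and source contains the generated def statement as text.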
The PostgreSQL function parameters are available in
the global args list. In the
pymax example, args[0] contains
whatever was passed in as the first argument and
args[1] contains the second argument's value. Alternatively,
one can use named parameters as shown in the example above. This greatly simplifies
the reading and writing of PL/Python code.
If an SQL null value is passed to a
function, the argument value will appear as None in
Python. The above function definition will return the wrong answer for null
inputs. We could add STRICT to the function definition
to make PostgreSQL do something more reasonable:
if a null value is passed, the function will not be called at all,
but will just return a null result automatically. Alternatively,
we could check for null inputs in the function body:
CREATE FUNCTION pymax (a integer, b integer)
  RETURNS integer
AS $$
  if (a is None) or (b is None):
    return None
  if a > b:
    return a
  return b
$$ LANGUAGE plpythonu;
As shown above, to return an SQL null value from a PL/Python
function, return the value None. This can be done whether the
function is strict or not.
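The two approaches to null inputs described above can be compared in plain Python, with None standing in for the SQL null value. This is a sketch only: the strict decorator is a hypothetical stand-in for what the server does for a function declared STRICT, and the function names are chosen for illustration.

```python
import functools

def strict(func):
    """Hypothetical imitation of STRICT: skip the call if any input is None."""
    @functools.wraps(func)
    def wrapper(*args):
        if any(arg is None for arg in args):
            return None  # the function body is never executed
        return func(*args)
    return wrapper

@strict
def pymax_strict(a, b):
    # No null check needed: the wrapper filters out None inputs.
    if a > b:
        return a
    return b

def pymax_checked(a, b):
    # Explicit null check inside the body, as in the example above.
    if (a is None) or (b is None):
        return None
    if a > b:
        return a
    return b
```

Both variants give the same results for the same inputs; the difference is only where the null check takes place.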
Composite-type arguments are passed to the function as Python mappings. The
element names of the mapping are the attribute names of the composite type.
If an attribute in the passed row has the null value, it has the value
None in the mapping. Here is an example:
CREATE TABLE employee (
  name text,
  salary integer,
  age integer
);
CREATE FUNCTION overpaid (e employee)
  RETURNS boolean
AS $$
  if e["salary"] > 200000:
    return True
  if (e["age"] < 30) and (e["salary"] > 100000):
    return True
  return False
$$ LANGUAGE plpythonu;
There are multiple ways to return row or composite types from a Python
function. The following examples assume that we have:
CREATE TABLE named_value (
  name text,
  value integer
);
or
CREATE TYPE named_value AS (
  name text,
  value integer
);
Sequence types (tuple or list), but not set (because
it is not indexable)
A returned sequence object must have the same number of items as the
composite result type has fields. The item with index 0 is assigned to the
first field of the composite type, 1 to the second, and so on. For example:
CREATE FUNCTION make_pair (name text, value integer)
  RETURNS named_value
AS $$
  return [ name, value ]
  # or alternatively, as tuple: return ( name, value )
$$ LANGUAGE plpythonu;
To return SQL null in any column, insert None at
the corresponding position.
Mapping (dictionary)
The value for each column of the composite type is retrieved from the
mapping with the column name as key. Example:
CREATE FUNCTION make_pair (name text, value integer)
  RETURNS named_value
AS $$
  return { "name": name, "value": value }
$$ LANGUAGE plpythonu;
Additional dictionary key/value pairs are ignored. Missing keys are
treated as errors. To return an SQL null value for any column, insert
None with the corresponding column name as the key.
Object (any object providing method __getattr__)
Example:
CREATE FUNCTION make_pair (name text, value integer)
  RETURNS named_value
AS $$
  class named_value:
    def __init__ (self, n, v):
      self.name = n
      self.value = v
  return named_value(name, value)

  # or simply
  class nv: pass
  nv.name = name
  nv.value = value
  return nv
$$ LANGUAGE plpythonu;
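The three return forms listed above (sequence, mapping, and object) can be pictured as different ways of filling the same row. The following plain-Python sketch shows one way such a conversion could work; to_row and NamedValue are hypothetical names, and the code is not taken from PL/Python itself.

```python
def to_row(value, columns):
    """Convert a returned value to a tuple ordered like `columns`."""
    if isinstance(value, dict):
        # Mapping: look up each column by name. Extra keys are ignored,
        # a missing key raises KeyError.
        return tuple(value[col] for col in columns)
    if isinstance(value, (list, tuple)):
        # Sequence: items are assigned to the fields by position.
        if len(value) != len(columns):
            raise ValueError("expected %d items, got %d"
                             % (len(columns), len(value)))
        return tuple(value)
    # Any other object: read one attribute per column.
    return tuple(getattr(value, col) for col in columns)

class NamedValue:
    def __init__(self, name, value):
        self.name = name
        self.value = value

columns = ("name", "value")
rows = [
    to_row(["pi", 3], columns),
    to_row({"name": "pi", "value": 3, "extra": "ignored"}, columns),
    to_row(NamedValue("pi", 3), columns),
]
```

All three entries of rows describe the same named_value row; placing None in any position, key, or attribute would correspond to an SQL null in that column.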
If you do not provide a return value, Python returns the default
None. PL/Python translates
Python's None into the SQL null value.
A PL/Python function can also return sets of
scalar or composite types. There are several ways to achieve this because
the returned object is internally turned into an iterator. The following
examples assume that we have the composite type:
CREATE TYPE greeting AS (
  how text,
  who text
);
Currently known iterable types are:
Sequence types (tuple, list, set)
CREATE FUNCTION greet (how text)
  RETURNS SETOF greeting
AS $$
  # return tuple containing lists as composite types
  # all other combinations work also
  return ( [ how, "World" ], [ how, "PostgreSQL" ], [ how, "PL/Python" ] )
$$ LANGUAGE plpythonu;
Iterator (any object providing __iter__ and
next methods)
CREATE FUNCTION greet (how text)
  RETURNS SETOF greeting
AS $$
  class producer:
    def __init__ (self, how, who):
      self.how = how
      self.who = who
      self.ndx = -1

    def __iter__ (self):
      return self

    def next (self):
      self.ndx += 1
      if self.ndx == len(self.who):
        raise StopIteration
      return ( self.how, self.who[self.ndx] )

  return producer(how, [ "World", "PostgreSQL", "PL/Python" ])
$$ LANGUAGE plpythonu;
Generator (yield)
CREATE FUNCTION greet (how text)
  RETURNS SETOF greeting
AS $$
  for who in [ "World", "PostgreSQL", "PL/Python" ]:
    yield ( how, who )
$$ LANGUAGE plpythonu;
Currently, due to Python
bug #1483133,
some debug versions of Python 2.4
(configured and compiled with option --with-pydebug)
are known to crash the PostgreSQL server.
Unpatched versions of Fedora 4 contain this bug.
It does not happen in production versions of Python or on patched
versions of Fedora 4.
Whenever new iterable types are added to the Python language,
PL/Python is ready to use them.
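Because every returned object is turned into an iterator, the three forms shown above yield the same rows. The sketch below demonstrates this in plain Python run outside the server. It assumes a Python 3 interpreter, where the iterator method is spelled __next__ rather than next; collect_rows, Producer, and greet_generator are hypothetical names used only here.

```python
NAMES = ["World", "PostgreSQL", "PL/Python"]

class Producer:
    """Iterator object, following the class-based example above."""
    def __init__(self, how, who):
        self.how = how
        self.who = who
        self.ndx = -1

    def __iter__(self):
        return self

    def __next__(self):  # named next() in the Python 2 example above
        self.ndx += 1
        if self.ndx == len(self.who):
            raise StopIteration
        return (self.how, self.who[self.ndx])

def greet_generator(how):
    for who in NAMES:
        yield (how, who)

def collect_rows(returned):
    # iter() works uniformly on sequences, iterator objects and generators.
    return [tuple(row) for row in iter(returned)]

from_sequence = collect_rows(tuple(["Hello", who] for who in NAMES))
from_iterator = collect_rows(Producer("Hello", NAMES))
from_generator = collect_rows(greet_generator("Hello"))
```

The three result lists are identical, which is why any iterable type can serve as the return value of a set-returning function.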
The global dictionary SD is available to store
data between function calls. This variable is private static data.
The global dictionary GD is public data,
available to all Python functions within a session. Use with
care.
Each function gets its own execution environment in the
Python interpreter, so that global data and function arguments from
myfunc are not available to
myfunc2. The exception is the data in the
GD dictionary, as mentioned above.
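The difference between the two dictionaries can be modeled in plain Python. In this sketch, which is a simulation under stated assumptions rather than the real mechanism, make_function is a hypothetical helper that gives each function its own SD while all functions share one GD, and the dictionaries are passed as explicit parameters instead of being globals.

```python
GD = {}  # shared by every function in the simulated session

def make_function(body):
    """Bind `body` to a private SD dictionary and the shared GD dictionary."""
    SD = {}  # private to this one function, kept between calls
    def call(*args):
        return body(SD, GD, *args)
    return call

def count_calls(SD, GD):
    SD["calls"] = SD.get("calls", 0) + 1   # per-function counter
    GD["total"] = GD.get("total", 0) + 1   # session-wide counter
    return SD["calls"]

myfunc = make_function(count_calls)
myfunc2 = make_function(count_calls)

first = myfunc()
second = myfunc()
other = myfunc2()
```

After these calls, myfunc has counted two calls and myfunc2 one, while GD["total"] reflects all three, which illustrates why data placed in GD should be used with care.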
Trigger Functions
When a function is used as a trigger, the dictionary
TD contains trigger-related values. The trigger
rows are in TD["new"] and/or TD["old"]
depending on the trigger event. TD["event"] contains
the event as a string (INSERT, UPDATE,
DELETE, or UNKNOWN).
TD["when"] contains one of BEFORE,
AFTER, and UNKNOWN.
TD["level"] contains one of ROW,
STATEMENT, and UNKNOWN.
TD["name"] contains the trigger name,
TD["table_name"] contains the name of the table on which the trigger occurred,
TD["table_schema"] contains the schema of the table on which the trigger occurred, and
TD["relid"] contains the OID of the table on
which the trigger occurred. If the CREATE TRIGGER command
included arguments, they are available in TD["args"][0] to
TD["args"][n-1].
If TD["when"] is BEFORE, you may
return None or "OK" from the
Python function to indicate the row is unmodified,
"SKIP" to abort the event, or "MODIFY" to
indicate you've modified the row.
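The return conventions for a BEFORE trigger can be sketched in plain Python. The function check_salary and its business rule are hypothetical, the row layout follows the employee table used earlier, and TD is passed as an explicit parameter here, whereas inside a real trigger function it is available as a global dictionary.

```python
def check_salary(TD):
    """Hypothetical BEFORE ROW trigger logic operating on a TD-like dict."""
    if TD["when"] != "BEFORE" or TD["level"] != "ROW":
        return None              # nothing to do; row is left unmodified
    new = TD["new"]
    if new["salary"] is None:
        return "SKIP"            # abort the event for this row
    if new["salary"] < 0:
        new["salary"] = 0        # change the row in place ...
        return "MODIFY"          # ... and report that it was modified
    return "OK"                  # row is unmodified

negative = {"event": "INSERT", "when": "BEFORE", "level": "ROW",
            "new": {"name": "Ann", "salary": -5, "age": 41}}
result = check_salary(negative)
```

In this sketch result is "MODIFY" and the salary in negative["new"] has been set to 0; a row with a null salary would produce "SKIP" instead.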
Database Access
The PL/Python language module automatically imports a Python module
called plpy. The functions and constants in
this module are available to you in the Python code as
plpy.foo. At present
plpy implements the functions
plpy.debug(msg),
plpy.log(msg),
plpy.info(msg),
plpy.notice(msg),
plpy.warning(msg),
plpy.error(msg), and
plpy.fatal(msg).
plpy.error and
plpy.fatal actually raise a Python exception
which, if uncaught, propagates out to the calling query, causing
the current transaction or subtransaction to be aborted.
raise plpy.ERROR(msg) and
raise plpy.FATAL(msg) are
equivalent to calling
plpy.error and
plpy.fatal, respectively.
The other functions only generate messages of different
priority levels.
Whether messages of a particular priority are reported to the client,
written to the server log, or both is controlled by the
log_min_messages and client_min_messages configuration
variables. See the server configuration documentation for more information.
Additionally, the plpy module provides two
functions called execute and
prepare. Calling
plpy.execute with a query string and an
optional limit argument causes that query to be run and the result
to be returned in a result object. The result object emulates a
list or dictionary object. The result object can be accessed by
row number and column name. It has these additional methods:
nrows which returns the number of rows
returned by the query, and status which is the
SPI_execute() return value. The result object
can be modified.
For example,
rv = plpy.execute("SELECT * FROM my_table", 5)
returns up to 5 rows from my_table. If
my_table has a column
my_column, it would be accessed as
foo = rv[i]["my_column"]
The second function, plpy.prepare, prepares
the execution plan for a query. It is called with a query string
and a list of parameter types, if you have parameter references in
the query. For example:
plan = plpy.prepare("SELECT last_name FROM my_users WHERE first_name = $1", [ "text" ])
text is the type of the variable you will be
passing for $1. After preparing a statement, you
use the function plpy.execute to run it:
rv = plpy.execute(plan, [ "name" ], 5)
The third argument is the limit and is optional.
When you prepare a plan using the PL/Python module it is
automatically saved. Read the SPI documentation for a description of what this means.
In order to make effective use of this across function calls
one needs to use one of the persistent storage dictionaries
SD or GD (described under
PL/Python Functions). For example:
CREATE FUNCTION usesavedplan() RETURNS trigger AS $$
  if "plan" in SD:
    plan = SD["plan"]
  else:
    plan = plpy.prepare("SELECT 1")
    SD["plan"] = plan
  # rest of function
$$ LANGUAGE plpythonu;
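The caching pattern above can be tried outside the server with a stand-in for plpy. Everything in this sketch is an assumption made for illustration: FakePlpy is not the real module, it only mimics the prepare and execute call shapes described in this section and returns fixed rows, and find_last_names is a hypothetical function.

```python
class FakePlpy:
    """Stand-in for the plpy module; NOT the real implementation."""
    def __init__(self):
        self.prepare_calls = 0

    def prepare(self, query, argtypes=None):
        self.prepare_calls += 1
        return ("plan", query, tuple(argtypes or ()))

    def execute(self, plan, args=None, limit=None):
        # Fixed sample rows instead of a real query result.
        rows = [{"last_name": "Smith"}, {"last_name": "Jones"}]
        return rows if limit is None else rows[:limit]

plpy = FakePlpy()
SD = {}  # plays the role of the function's private dictionary

def find_last_names(first_name, limit=None):
    # Prepare the plan only on the first call, then reuse it from SD.
    if "plan" not in SD:
        SD["plan"] = plpy.prepare(
            "SELECT last_name FROM my_users WHERE first_name = $1",
            ["text"])
    rv = plpy.execute(SD["plan"], [first_name], limit)
    return [row["last_name"] for row in rv]

first = find_last_names("Ann")
second = find_last_names("Bob", 1)
```

Although find_last_names is called twice, the stand-in records a single prepare call, which is the effect the SD-based caching is meant to achieve.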
Restricted Environment
The trusted variant of PL/Python, which was available before
PostgreSQL 7.4, disabled access to the file system and
other local resources. Specifically, it
used the Python restricted
execution environment, further restricted it to prevent the use of
the file open call, and allowed only modules from a
specific list to be imported. That list included:
array, bisect, binascii,
calendar, cmath, codecs,
errno, marshal, math, md5,
mpz, operator, pcre,
pickle, random, re, regex,
sre, sha, string, StringIO,
struct, time, whrandom, and
zlib.