mirror of
https://github.com/postgres/postgres.git
synced 2025-05-17 06:41:24 +03:00
928 lines
31 KiB
Plaintext
<!-- $PostgreSQL: pgsql/doc/src/sgml/plperl.sgml,v 2.66 2007/05/04 14:55:32 adunstan Exp $ -->

 <chapter id="plperl">
  <title>PL/Perl - Perl Procedural Language</title>

  <indexterm zone="plperl">
   <primary>PL/Perl</primary>
  </indexterm>

  <indexterm zone="plperl">
   <primary>Perl</primary>
  </indexterm>

  <para>
   PL/Perl is a loadable procedural language that enables you to write
   <productname>PostgreSQL</productname> functions in the
   <ulink url="http://www.perl.com">Perl programming language</ulink>.
  </para>

  <para>
   The usual advantage to using PL/Perl is that this allows use,
   within stored functions, of the manifold <quote>string
   munging</quote> operators and functions available for Perl.  Parsing
   complex strings might be easier using Perl than it is with the
   string functions and control structures provided in PL/pgSQL.
  </para>

  <para>
   To install PL/Perl in a particular database, use
   <literal>createlang plperl <replaceable>dbname</></literal>.
  </para>

  <tip>
   <para>
    If a language is installed into <literal>template1</>, all subsequently
    created databases will have the language installed automatically.
   </para>
  </tip>

  <note>
   <para>
    Users of source packages must specially enable the build of
    PL/Perl during the installation process.  (Refer to <xref
    linkend="install-short"> for more information.)  Users of
    binary packages might find PL/Perl in a separate subpackage.
   </para>
  </note>

 <sect1 id="plperl-funcs">
  <title>PL/Perl Functions and Arguments</title>

  <para>
   To create a function in the PL/Perl language, use the standard
   <xref linkend="sql-createfunction" endterm="sql-createfunction-title">
   syntax:

<programlisting>
CREATE FUNCTION <replaceable>funcname</replaceable> (<replaceable>argument-types</replaceable>) RETURNS <replaceable>return-type</replaceable> AS $$
    # PL/Perl function body
$$ LANGUAGE plperl;
</programlisting>
   The body of the function is ordinary Perl code. In fact, the PL/Perl
   glue code wraps it inside a Perl subroutine.  A PL/Perl function must
   always return a scalar value.  You can return more complex structures
   (arrays, records, and sets) by returning a reference, as discussed below.
   Never return a list.
  </para>

  <note>
   <para>
    The use of named nested subroutines is dangerous in Perl, especially if
    they refer to lexical variables in the enclosing scope. Because a PL/Perl
    function is wrapped in a subroutine, any named subroutine you create will
    be nested. In general, it is far safer to create anonymous subroutines
    which you call via a coderef. See the <literal>perldiag</literal>
    man page for more details.
   </para>
</note>
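
  <para>
   As a minimal sketch of the advice in the note above, the following
   function keeps its helper in an anonymous subroutine and calls it
   through a coderef instead of declaring a named nested subroutine.
   The names <function>perl_sum_squares</function> and
   <varname>$square</varname> are made up for this illustration:

<programlisting>
-- perl_sum_squares is a hypothetical name used only for illustration.
CREATE FUNCTION perl_sum_squares(integer, integer) RETURNS integer AS $$
    my ($x, $y) = @_;
    # Anonymous subroutine stored in a lexical and called via its coderef,
    # rather than a named "sub square { ... }" nested in the function body.
    my $square = sub { my ($n) = @_; return $n * $n; };
    return $square->($x) + $square->($y);
$$ LANGUAGE plperl STRICT;

SELECT perl_sum_squares(3, 4);
</programlisting>
   Because the helper is anonymous, it is created afresh on each call and
   cannot capture stale copies of the enclosing function's lexical variables.
  </para>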

  <para>
   The syntax of the <command>CREATE FUNCTION</command> command requires
   the function body to be written as a string constant.  It is usually
   most convenient to use dollar quoting (see <xref
   linkend="sql-syntax-dollar-quoting">) for the string constant.
   If you choose to use escape string syntax <literal>E''</>,
   you must double the single quote marks (<literal>'</>) and backslashes
   (<literal>\</>) used in the body of the function
   (see <xref linkend="sql-syntax-strings">).
</para>
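
  <para>
   To make the quoting rule concrete, here is a sketch of the same function
   body written both ways.  The function names
   <function>perl_underscore</function> and
   <function>perl_underscore_e</function> are hypothetical; the two
   definitions are intended to behave identically:

<programlisting>
-- Dollar quoting: the body reads exactly as Perl will see it.
CREATE FUNCTION perl_underscore(text) RETURNS text AS $$
    my $s = shift;
    $s =~ s/\s+/_/g;
    return 'result: ' . $s;
$$ LANGUAGE plperl STRICT;

-- Escape string syntax: every ' and \ in the body is doubled.
CREATE FUNCTION perl_underscore_e(text) RETURNS text AS E'
    my $s = shift;
    $s =~ s/\\s+/_/g;
    return ''result: '' . $s;
' LANGUAGE plperl STRICT;
</programlisting>
   The dollar-quoted form is usually easier to maintain, since Perl code
   tends to contain many quotes and backslashes.
  </para>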

  <para>
   Arguments and results are handled as in any other Perl subroutine:
   arguments are passed in <varname>@_</varname>, and a result value
   is returned with <literal>return</> or as the last expression
   evaluated in the function.
  </para>

  <para>
   For example, a function returning the greater of two integer values
   could be defined as:

<programlisting>
CREATE FUNCTION perl_max (integer, integer) RETURNS integer AS $$
    if ($_[0] > $_[1]) { return $_[0]; }
    return $_[1];
$$ LANGUAGE plperl;
</programlisting>
  </para>

  <para>
   If an SQL null value<indexterm><primary>null value</><secondary
   sortas="PL/Perl">in PL/Perl</></indexterm> is passed to a function,
   the argument value will appear as <quote>undefined</> in Perl.  The
   above function definition will not behave very nicely with null
   inputs (in fact, it will act as though they are zeroes).  We could
   add <literal>STRICT</> to the function definition to make
   <productname>PostgreSQL</productname> do something more reasonable:
   if a null value is passed, the function will not be called at all,
   but will just return a null result automatically.  Alternatively,
   we could check for undefined inputs in the function body.  For
   example, suppose that we wanted <function>perl_max</function> with
   one null and one nonnull argument to return the nonnull argument,
   rather than a null value:

<programlisting>
CREATE FUNCTION perl_max (integer, integer) RETURNS integer AS $$
    my ($x,$y) = @_;
    if (! defined $x) {
        if (! defined $y) { return undef; }
        return $y;
    }
    if (! defined $y) { return $x; }
    if ($x > $y) { return $x; }
    return $y;
$$ LANGUAGE plperl;
</programlisting>
   As shown above, to return an SQL null value from a PL/Perl
   function, return an undefined value.  This can be done whether the
   function is strict or not.
</para>
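
  <para>
   For comparison, the <literal>STRICT</> alternative mentioned above might
   look like the following sketch.  The name
   <function>perl_max_strict</function> is hypothetical and is used only so
   that it does not clash with <function>perl_max</function>:

<programlisting>
-- perl_max_strict is a hypothetical name used only for illustration.
CREATE FUNCTION perl_max_strict (integer, integer) RETURNS integer AS $$
    my ($x, $y) = @_;
    # With STRICT, neither argument can be undefined here.
    return $x > $y ? $x : $y;
$$ LANGUAGE plperl STRICT;

SELECT perl_max_strict(1, NULL);   -- the body is not run; the result is null
</programlisting>
  </para>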

  <para>
   Anything in a function argument that is not a reference is
   a string, which is in the standard <productname>PostgreSQL</productname>
   external text representation for the relevant data type. In the case of
   ordinary numeric or text types, Perl will just do the right thing and
   the programmer will normally not have to worry about it. However, in
   other cases the argument will need to be converted into a form that is
   more usable in Perl. For example, here is how to convert an argument of
   type <type>bytea</> into unescaped binary
   data:

<programlisting>
    my $arg = shift;
    $arg =~ s!\\(\d{3})!chr(oct($1))!ge;
</programlisting>

  </para>

  <para>
   Similarly, values passed back to <productname>PostgreSQL</productname>
   must be in the external text representation format. For example, here
   is how to escape binary data for a return value of type <type>bytea</>:

<programlisting>
    $retval =~ s!([^ -~])!sprintf("\\%03o",ord($1))!ge;
    return $retval;
</programlisting>

  </para>

  <para>
   Perl can return <productname>PostgreSQL</productname> arrays as
   references to Perl arrays.  Here is an example:

<programlisting>
CREATE OR REPLACE FUNCTION returns_array()
RETURNS text[][] AS $$
    return [['a"b','c,d'],['e\\f','g']];
$$ LANGUAGE plperl;

SELECT returns_array();
</programlisting>
  </para>

  <para>
   Composite-type arguments are passed to the function as references
   to hashes.  The keys of the hash are the attribute names of the
   composite type.  Here is an example:

<programlisting>
CREATE TABLE employee (
    name text,
    basesalary integer,
    bonus integer
);

CREATE FUNCTION empcomp(employee) RETURNS integer AS $$
    my ($emp) = @_;
    return $emp->{basesalary} + $emp->{bonus};
$$ LANGUAGE plperl;

SELECT name, empcomp(employee.*) FROM employee;
</programlisting>
  </para>

  <para>
   A PL/Perl function can return a composite-type result using the same
   approach: return a reference to a hash that has the required attributes.
   For example:

<programlisting>
CREATE TYPE testrowperl AS (f1 integer, f2 text, f3 text);

CREATE OR REPLACE FUNCTION perl_row() RETURNS testrowperl AS $$
    return {f2 => 'hello', f1 => 1, f3 => 'world'};
$$ LANGUAGE plperl;

SELECT * FROM perl_row();
</programlisting>

   Any columns in the declared result data type that are not present in the
   hash will be returned as null values.
  </para>

  <para>
    PL/Perl functions can also return sets of either scalar or
    composite types.  Usually you'll want to return rows one at a
    time, both to speed up startup time and to keep from queueing up
    the entire result set in memory.  You can do this with
    <function>return_next</function> as illustrated below.  Note that
    after the last <function>return_next</function>, you must put
    either <literal>return</literal> or (better) <literal>return
    undef</literal>.

<programlisting>
CREATE OR REPLACE FUNCTION perl_set_int(int)
RETURNS SETOF INTEGER AS $$
    foreach (0..$_[0]) {
        return_next($_);
    }
    return undef;
$$ LANGUAGE plperl;

SELECT * FROM perl_set_int(5);

CREATE OR REPLACE FUNCTION perl_set()
RETURNS SETOF testrowperl AS $$
    return_next({ f1 => 1, f2 => 'Hello', f3 => 'World' });
    return_next({ f1 => 2, f2 => 'Hello', f3 => 'PostgreSQL' });
    return_next({ f1 => 3, f2 => 'Hello', f3 => 'PL/Perl' });
    return undef;
$$ LANGUAGE plperl;
</programlisting>

    For small result sets, you can return a reference to an array that
    contains either scalars, references to arrays, or references to
    hashes for simple types, array types, and composite types,
    respectively.  Here are some simple examples of returning the entire
    result set as an array reference:

<programlisting>
CREATE OR REPLACE FUNCTION perl_set_int(int) RETURNS SETOF INTEGER AS $$
    return [0..$_[0]];
$$ LANGUAGE plperl;

SELECT * FROM perl_set_int(5);

CREATE OR REPLACE FUNCTION perl_set() RETURNS SETOF testrowperl AS $$
    return [
        { f1 => 1, f2 => 'Hello', f3 => 'World' },
        { f1 => 2, f2 => 'Hello', f3 => 'PostgreSQL' },
        { f1 => 3, f2 => 'Hello', f3 => 'PL/Perl' }
    ];
$$ LANGUAGE plperl;

SELECT * FROM perl_set();
</programlisting>
  </para>

  <para>
   If you wish to use the <literal>strict</> pragma with your code,
   the easiest way to do so is to <command>SET</>
   <literal>plperl.use_strict</literal> to true.  This parameter affects
   subsequent compilations of <application>PL/Perl</> functions, but not
   functions already compiled in the current session.  To set the
   parameter before <application>PL/Perl</> has been loaded, it is
   necessary to have added <quote><literal>plperl</></> to the <xref
   linkend="guc-custom-variable-classes"> list in
   <filename>postgresql.conf</filename>.
</para>
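
  <para>
   A session using this parameter might look like the following sketch.
   It assumes that <application>PL/Perl</> is already loaded or that
   <literal>plperl</> is listed in <literal>custom_variable_classes</>;
   the function name <function>perl_strict_demo</function> is hypothetical:

<programlisting>
-- Assumed postgresql.conf entry, needed only if the parameter is set
-- before PL/Perl has been loaded:
--   custom_variable_classes = 'plperl'

SET plperl.use_strict = true;

-- perl_strict_demo is a hypothetical name used only for illustration.
CREATE OR REPLACE FUNCTION perl_strict_demo(integer) RETURNS integer AS $$
    my ($n) = @_;    # declared with "my", as the strict pragma requires
    return $n + 1;
$$ LANGUAGE plperl;
</programlisting>
   Under <literal>strict</>, a body that used an undeclared variable
   would be rejected when it is compiled, rather than silently creating
   a global variable.
  </para>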

  <para>
   Another way to use the <literal>strict</> pragma is to put:
<programlisting>
use strict;
</programlisting>
   in the function body.  But this only works in <application>PL/PerlU</>
   functions, since <literal>use</> is not a trusted operation.  In
   <application>PL/Perl</> functions you can instead do:
<programlisting>
BEGIN { strict->import(); }
</programlisting>
  </para>
 </sect1>

 <sect1 id="plperl-database">
  <title>Database Access from PL/Perl</title>

  <para>
   Access to the database itself from your Perl function can be done
   via the function <function>spi_exec_query</function> described
   below, or via an experimental module
   <ulink url="http://www.cpan.org/modules/by-module/DBD/APILOS/">
   <literal>DBD::PgSPI</literal></ulink>
   (also available at <ulink url="http://www.cpan.org/SITES.html">
   <acronym>CPAN mirror sites</></ulink>).  This module makes available a
   <acronym>DBI</>-compliant database-handle named
   <varname>$pg_dbh</varname> that can be used to perform queries with
   normal <acronym>DBI</>
   syntax.<indexterm><primary>DBI</></indexterm>
  </para>

  <para>
   PL/Perl provides additional Perl commands:

   <variablelist>
    <varlistentry>
     <indexterm>
      <primary>spi_exec_query</primary>
      <secondary>in PL/Perl</secondary>
     </indexterm>

     <term><literal><function>spi_exec_query</>(<replaceable>query</replaceable> [, <replaceable>max-rows</replaceable>])</literal></term>
     <term><literal><function>spi_query</>(<replaceable>command</replaceable>)</literal></term>
     <term><literal><function>spi_fetchrow</>(<replaceable>cursor</replaceable>)</literal></term>
     <term><literal><function>spi_prepare</>(<replaceable>command</replaceable>, <replaceable>argument types</replaceable>)</literal></term>
     <term><literal><function>spi_exec_prepared</>(<replaceable>plan</replaceable> [, <replaceable>attributes</replaceable>], <replaceable>arguments</replaceable>)</literal></term>
     <term><literal><function>spi_query_prepared</>(<replaceable>plan</replaceable>, <replaceable>arguments</replaceable>)</literal></term>
     <term><literal><function>spi_cursor_close</>(<replaceable>cursor</replaceable>)</literal></term>
     <term><literal><function>spi_freeplan</>(<replaceable>plan</replaceable>)</literal></term>

     <listitem>
      <para>
       <literal>spi_exec_query</literal> executes an SQL command and
       returns the entire row set as a reference to an array of hash
       references.  <emphasis>You should only use this command when you know
       that the result set will be relatively small.</emphasis>  Here is an
       example of a query (<command>SELECT</command> command) with the
       optional maximum number of rows:

<programlisting>
$rv = spi_exec_query('SELECT * FROM my_table', 5);
</programlisting>
        This returns up to 5 rows from the table
        <literal>my_table</literal>.  If <literal>my_table</literal>
        has a column <literal>my_column</literal>, you can get that
        value from row <literal>$i</literal> of the result like this:
<programlisting>
$foo = $rv->{rows}[$i]->{my_column};
</programlisting>
       The total number of rows returned from a <command>SELECT</command>
       query can be accessed like this:
<programlisting>
$nrows = $rv->{processed};
</programlisting>
      </para>

      <para>
       Here is an example using a different command type:
<programlisting>
$query = "INSERT INTO my_table VALUES (1, 'test')";
$rv = spi_exec_query($query);
</programlisting>
       You can then access the command status (e.g.,
       <literal>SPI_OK_INSERT</literal>) like this:
<programlisting>
$res = $rv->{status};
</programlisting>
       To get the number of rows affected, do:
<programlisting>
$nrows = $rv->{processed};
</programlisting>
      </para>

      <para>
       Here is a complete example:
<programlisting>
CREATE TABLE test (
    i int,
    v varchar
);

INSERT INTO test (i, v) VALUES (1, 'first line');
INSERT INTO test (i, v) VALUES (2, 'second line');
INSERT INTO test (i, v) VALUES (3, 'third line');
INSERT INTO test (i, v) VALUES (4, 'immortal');

CREATE OR REPLACE FUNCTION test_munge() RETURNS SETOF test AS $$
    my $rv = spi_exec_query('select i, v from test;');
    my $status = $rv->{status};
    my $nrows = $rv->{processed};
    foreach my $rn (0 .. $nrows - 1) {
        my $row = $rv->{rows}[$rn];
        $row->{i} += 200 if defined($row->{i});
        $row->{v} =~ tr/A-Za-z/a-zA-Z/ if (defined($row->{v}));
        return_next($row);
    }
    return undef;
$$ LANGUAGE plperl;

SELECT * FROM test_munge();
</programlisting>
    </para>
    <para>
    <literal>spi_query</literal> and <literal>spi_fetchrow</literal>
    work together as a pair for row sets which might be large, or for cases
    where you wish to return rows as they arrive.
    <literal>spi_fetchrow</literal> works <emphasis>only</emphasis> with
    <literal>spi_query</literal>. The following example illustrates how
    you use them together:

<programlisting>
CREATE TYPE foo_type AS (the_num INTEGER, the_text TEXT);

CREATE OR REPLACE FUNCTION lotsa_md5 (INTEGER) RETURNS SETOF foo_type AS $$
    use Digest::MD5 qw(md5_hex);
    my $file = '/usr/share/dict/words';
    my $t = localtime;
    elog(NOTICE, "opening file $file at $t" );
    open my $fh, '<', $file # ooh, it's a file access!
        or elog(ERROR, "cannot open $file for reading: $!");
    my @words = <$fh>;
    close $fh;
    $t = localtime;
    elog(NOTICE, "closed file $file at $t");
    chomp(@words);
    my $row;
    my $sth = spi_query("SELECT * FROM generate_series(1,$_[0]) AS b(a)");
    while (defined ($row = spi_fetchrow($sth))) {
        return_next({
            the_num => $row->{a},
            the_text => md5_hex($words[rand @words])
        });
    }
    return;
$$ LANGUAGE plperlu;

SELECT * from lotsa_md5(500);
</programlisting>
    </para>

    <para>
    <literal>spi_prepare</literal>, <literal>spi_query_prepared</literal>,
    <literal>spi_exec_prepared</literal>, and <literal>spi_freeplan</literal>
    implement the same functionality for prepared queries.  Once a query plan
    has been prepared by a call to <literal>spi_prepare</literal>, the plan can
    be used in place of the query string.  <literal>spi_exec_prepared</literal>
    returns the same result as <literal>spi_exec_query</literal>, while
    <literal>spi_query_prepared</literal> returns a cursor, exactly as
    <literal>spi_query</literal> does, which can later be passed to
    <literal>spi_fetchrow</literal>.
</para>
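
    <para>
    As a sketch of the cursor-returning variant, the following function
    prepares a plan, opens a cursor on it with
    <literal>spi_query_prepared</literal>, and streams the rows back.  It
    assumes the <literal>test</literal> table from the complete example
    above; the function name <function>test_rows_above</function> is
    hypothetical:

<programlisting>
-- test_rows_above is a hypothetical name used only for illustration.
CREATE OR REPLACE FUNCTION test_rows_above(integer) RETURNS SETOF test AS $$
    my $plan = spi_prepare('SELECT i, v FROM test WHERE i > $1 ORDER BY i',
                           'INTEGER');
    my $cursor = spi_query_prepared($plan, $_[0]);
    # Reading until spi_fetchrow returns undef frees the cursor.
    while (defined (my $row = spi_fetchrow($cursor))) {
        return_next($row);
    }
    spi_freeplan($plan);
    return;
$$ LANGUAGE plperl;

SELECT * FROM test_rows_above(2);
</programlisting>
    </para>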

    <para>
    The advantage of prepared queries is that it is possible to use one prepared plan for more
    than one query execution.  Once the plan is no longer needed, it can be freed with
    <literal>spi_freeplan</literal>:
    </para>

    <para>
    <programlisting>
CREATE OR REPLACE FUNCTION init() RETURNS VOID AS $$
    $_SHARED{my_plan} = spi_prepare( 'SELECT (now() + $1)::date AS now', 'INTERVAL');
$$ LANGUAGE plperl;

CREATE OR REPLACE FUNCTION add_time( INTERVAL ) RETURNS TEXT AS $$
    return spi_exec_prepared(
        $_SHARED{my_plan},
        $_[0],
    )->{rows}->[0]->{now};
$$ LANGUAGE plperl;

CREATE OR REPLACE FUNCTION done() RETURNS VOID AS $$
    spi_freeplan( $_SHARED{my_plan});
    undef $_SHARED{my_plan};
$$ LANGUAGE plperl;

SELECT init();
SELECT add_time('1 day'), add_time('2 days'), add_time('3 days');
SELECT done();

  add_time  |  add_time  |  add_time
------------+------------+------------
 2005-12-10 | 2005-12-11 | 2005-12-12
</programlisting>
    </para>

    <para>
    Note that parameters in <literal>spi_prepare</literal> are written as
    <literal>$1</>, <literal>$2</>, <literal>$3</>, etc.  Avoid declaring
    query strings in double quotes, where Perl would interpolate these as its
    own variables; that can easily lead to hard-to-catch bugs.
</para>
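
    <para>
    The following sketch shows the safe single-quoted form.  It assumes the
    <literal>test</literal> table from the complete example above; the
    function name <function>count_above</function> is hypothetical:

<programlisting>
-- count_above is a hypothetical name used only for illustration.
CREATE OR REPLACE FUNCTION count_above(integer) RETURNS bigint AS $$
    # Single quotes: Perl leaves $1 untouched, so SPI receives the placeholder.
    my $plan = spi_prepare('SELECT count(*) AS n FROM test WHERE i > $1',
                           'INTEGER');
    # Had the query been in double quotes, Perl would have substituted its own
    # $1 (a regex capture variable) before SPI ever saw the string.
    my $n = spi_exec_prepared($plan, $_[0])->{rows}->[0]->{n};
    spi_freeplan($plan);
    return $n;
$$ LANGUAGE plperl;
</programlisting>
    </para>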

    <para>
     Normally, <function>spi_fetchrow</> should be repeated until it
     returns <literal>undef</literal>, indicating that there are no more
     rows to read.  The cursor is automatically freed when
     <function>spi_fetchrow</> returns <literal>undef</literal>.
     If you do not wish to read all the rows, instead call
     <function>spi_cursor_close</> to free the cursor.
     Failure to do so will result in memory leaks.
</para>
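
    <para>
     A sketch of stopping early might look like this.  It assumes the
     <literal>test</literal> table from the complete example above; the
     function name <function>first_long_value</function> is hypothetical:

<programlisting>
-- first_long_value is a hypothetical name used only for illustration.
CREATE OR REPLACE FUNCTION first_long_value(integer) RETURNS text AS $$
    my ($min_len) = @_;
    my $cursor = spi_query('SELECT v FROM test ORDER BY i');
    my $found;
    while (defined (my $row = spi_fetchrow($cursor))) {
        next unless defined $row->{v};
        if (length($row->{v}) >= $min_len) {
            $found = $row->{v};
            # We stop before spi_fetchrow has returned undef,
            # so the cursor must be closed explicitly.
            spi_cursor_close($cursor);
            last;
        }
    }
    return $found;    # undef (SQL null) if no value was long enough
$$ LANGUAGE plperl STRICT;
</programlisting>
     If the loop instead runs to the end, <function>spi_fetchrow</> has
     already returned <literal>undef</literal> and no explicit close is needed.
    </para>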
     </listitem>
    </varlistentry>

    <varlistentry>
     <indexterm>
      <primary>elog</primary>
      <secondary>in PL/Perl</secondary>
     </indexterm>

     <term><literal><function>elog</>(<replaceable>level</replaceable>, <replaceable>msg</replaceable>)</literal></term>
     <listitem>
      <para>
       Emit a log or error message. Possible levels are
       <literal>DEBUG</>, <literal>LOG</>, <literal>INFO</>,
       <literal>NOTICE</>, <literal>WARNING</>, and <literal>ERROR</>.
       <literal>ERROR</>
        raises an error condition; if this is not trapped by the surrounding
        Perl code, the error propagates out to the calling query, causing
        the current transaction or subtransaction to be aborted.  This
        is effectively the same as the Perl <literal>die</> command.
        The other levels only generate messages of different
        priority levels.
        Whether messages of a particular priority are reported to the client,
        written to the server log, or both is controlled by the
        <xref linkend="guc-log-min-messages"> and
        <xref linkend="guc-client-min-messages"> configuration
        variables. See <xref linkend="runtime-config"> for more
        information.
      </para>
     </listitem>
    </varlistentry>
   </variablelist>
</para>
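
  <para>
   As an illustration of the two kinds of <function>elog</> levels, the
   following sketch emits a <literal>NOTICE</> on every call and raises an
   <literal>ERROR</> for a zero divisor.  The function name
   <function>perl_safe_div</function> is hypothetical:

<programlisting>
-- perl_safe_div is a hypothetical name used only for illustration.
CREATE OR REPLACE FUNCTION perl_safe_div(integer, integer) RETURNS integer AS $$
    my ($num, $den) = @_;
    elog(NOTICE, "perl_safe_div called with $num and $den");
    if ($den == 0) {
        # Aborts the calling query, much as die would.
        elog(ERROR, 'division by zero in perl_safe_div');
    }
    return int($num / $den);
$$ LANGUAGE plperl STRICT;

SELECT perl_safe_div(7, 2);
SELECT perl_safe_div(7, 0);   -- fails with the ERROR message above
</programlisting>
  </para>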
 </sect1>

 <sect1 id="plperl-data">
  <title>Data Values in PL/Perl</title>

  <para>
   The argument values supplied to a PL/Perl function's code are
   simply the input arguments converted to text form (just as if they
   had been displayed by a <command>SELECT</command> statement).
   Conversely, the <literal>return</> command will accept any string
   that is acceptable input format for the function's declared return
   type.  So, within the PL/Perl function,
   all values are just text strings.
</para>
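
  <para>
   One consequence worth a short sketch: a <type>boolean</> argument
   arrives as text, so it should be compared as a string rather than
   tested for Perl truth.  This assumes the usual text output of
   <type>boolean</>, <literal>t</> or <literal>f</>; the function name
   <function>perl_flag_label</function> is hypothetical:

<programlisting>
-- perl_flag_label is a hypothetical name used only for illustration.
CREATE FUNCTION perl_flag_label(boolean) RETURNS text AS $$
    my ($flag) = @_;
    # The value is the string 't' or 'f'.  Both are true in Perl,
    # so "if ($flag)" would not distinguish them.
    return $flag eq 't' ? 'yes' : 'no';
$$ LANGUAGE plperl STRICT;

SELECT perl_flag_label(false);
</programlisting>
  </para>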
 </sect1>

 <sect1 id="plperl-global">
  <title>Global Values in PL/Perl</title>

  <para>
    You can use the global hash <varname>%_SHARED</varname> to store
    data, including code references, between function calls for the
    lifetime of the current session.
  </para>

  <para>
    Here is a simple example for shared data:
<programlisting>
CREATE OR REPLACE FUNCTION set_var(name text, val text) RETURNS text AS $$
    if ($_SHARED{$_[0]} = $_[1]) {
        return 'ok';
    } else {
        return "cannot set shared variable $_[0] to $_[1]";
    }
$$ LANGUAGE plperl;

CREATE OR REPLACE FUNCTION get_var(name text) RETURNS text AS $$
    return $_SHARED{$_[0]};
$$ LANGUAGE plperl;

SELECT set_var('sample', 'Hello, PL/Perl!  How''s tricks?');
SELECT get_var('sample');
</programlisting>
  </para>

  <para>
   Here is a slightly more complicated example using a code reference:

<programlisting>
CREATE OR REPLACE FUNCTION myfuncs() RETURNS void AS $$
    $_SHARED{myquote} = sub {
        my $arg = shift;
        $arg =~ s/(['\\])/\\$1/g;
        return "'$arg'";
    };
$$ LANGUAGE plperl;

SELECT myfuncs(); /* initializes the function */

/* Set up a function that uses the quote function */

CREATE OR REPLACE FUNCTION use_quote(TEXT) RETURNS text AS $$
    my $text_to_quote = shift;
    my $qfunc = $_SHARED{myquote};
    return &$qfunc($text_to_quote);
$$ LANGUAGE plperl;
</programlisting>

   (You could have replaced the above with the one-liner
   <literal>return $_SHARED{myquote}->($_[0]);</literal>
   at the expense of readability.)
  </para>
 </sect1>

 <sect1 id="plperl-trusted">
  <title>Trusted and Untrusted PL/Perl</title>

  <indexterm zone="plperl-trusted">
   <primary>trusted</primary>
   <secondary>PL/Perl</secondary>
  </indexterm>

  <para>
   Normally, PL/Perl is installed as a <quote>trusted</> programming
   language named <literal>plperl</>.  In this setup, certain Perl
   operations are disabled to preserve security.  In general, the
   operations that are restricted are those that interact with the
   environment. This includes file handle operations,
   <literal>require</literal>, and <literal>use</literal> (for
   external modules).  There is no way to access internals of the
   database server process or to gain OS-level access with the
   permissions of the server process,
   as a C function can do.  Thus, any unprivileged database user can
   be permitted to use this language.
  </para>

  <para>
   Here is an example of a function that will not work because file
   system operations are not allowed for security reasons:
<programlisting>
CREATE FUNCTION badfunc() RETURNS integer AS $$
    my $tmpfile = "/tmp/badfile";
    open my $fh, '>', $tmpfile
        or elog(ERROR, qq{could not open the file "$tmpfile": $!});
    print $fh "Testing writing to a file\n";
    close $fh or elog(ERROR, qq{could not close the file "$tmpfile": $!});
    return 1;
$$ LANGUAGE plperl;
</programlisting>
    The creation of this function will fail as its use of a forbidden
    operation will be caught by the validator.
  </para>

  <para>
   Sometimes it is desirable to write Perl functions that are not
   restricted.  For example, one might want a Perl function that sends
   mail.  To handle these cases, PL/Perl can also be installed as an
   <quote>untrusted</> language (usually called
   <application>PL/PerlU</application><indexterm><primary>PL/PerlU</></indexterm>).
   In this case the full Perl language is available.  If the
   <command>createlang</command> program is used to install the
   language, the language name <literal>plperlu</literal> will select
   the untrusted PL/Perl variant.
  </para>

  <para>
   The writer of a <application>PL/PerlU</> function must take care that the function
   cannot be used to do anything unwanted, since it will be able to do
   anything that could be done by a user logged in as the database
   administrator.  Note that the database system allows only database
   superusers to create functions in untrusted languages.
  </para>

  <para>
   If the above function were created by a superuser using the language
   <literal>plperlu</>, execution would succeed.
  </para>

  <note>
   <para>
    For security reasons, to stop a leak of privileged operations from
    <application>PL/PerlU</> to <application>PL/Perl</>, these two languages
    have to run in separate instances of the Perl interpreter. If your
    Perl installation has been appropriately compiled, this is not a problem.
    However, not all installations are compiled with the requisite flags.
    If <productname>PostgreSQL</> detects that this is the case then it will
    not start a second interpreter, but instead raise an error. In
    consequence, in such an installation, you cannot use both
    <application>PL/PerlU</> and <application>PL/Perl</> in the same backend
    process. The remedy for this is to obtain a Perl installation created
    with the appropriate flags, namely either <literal>usemultiplicity</> or
    both <literal>usethreads</> and <literal>useithreads</>.
    For more details, see the <literal>perlembed</> manual page.
   </para>
  </note>

 </sect1>

 <sect1 id="plperl-triggers">
  <title>PL/Perl Triggers</title>

  <para>
   PL/Perl can be used to write trigger functions.  In a trigger function,
   the hash reference <varname>$_TD</varname> contains information about the
   current trigger event.  <varname>$_TD</> is a global variable,
   which gets a separate local value for each invocation of the trigger.
   The fields of the <varname>$_TD</varname> hash reference are:

   <variablelist>
    <varlistentry>
     <term><literal>$_TD->{new}{foo}</literal></term>
     <listitem>
      <para>
       <literal>NEW</literal> value of column <literal>foo</literal>
      </para>
     </listitem>
    </varlistentry>

    <varlistentry>
     <term><literal>$_TD->{old}{foo}</literal></term>
     <listitem>
      <para>
       <literal>OLD</literal> value of column <literal>foo</literal>
      </para>
     </listitem>
    </varlistentry>

    <varlistentry>
     <term><literal>$_TD->{name}</literal></term>
     <listitem>
      <para>
       Name of the trigger being called
      </para>
     </listitem>
    </varlistentry>

    <varlistentry>
     <term><literal>$_TD->{event}</literal></term>
     <listitem>
      <para>
       Trigger event: <literal>INSERT</>, <literal>UPDATE</>, <literal>DELETE</>, or <literal>UNKNOWN</>
      </para>
     </listitem>
    </varlistentry>

    <varlistentry>
     <term><literal>$_TD->{when}</literal></term>
     <listitem>
      <para>
       When the trigger was called: <literal>BEFORE</literal>, <literal>AFTER</literal>, or <literal>UNKNOWN</literal>
      </para>
     </listitem>
    </varlistentry>

    <varlistentry>
     <term><literal>$_TD->{level}</literal></term>
     <listitem>
      <para>
       The trigger level: <literal>ROW</literal>, <literal>STATEMENT</literal>, or <literal>UNKNOWN</literal>
      </para>
     </listitem>
    </varlistentry>

    <varlistentry>
     <term><literal>$_TD->{relid}</literal></term>
     <listitem>
      <para>
       OID of the table on which the trigger fired
      </para>
     </listitem>
    </varlistentry>

    <varlistentry>
     <term><literal>$_TD->{table_name}</literal></term>
     <listitem>
      <para>
       Name of the table on which the trigger fired
      </para>
     </listitem>
    </varlistentry>

    <varlistentry>
     <term><literal>$_TD->{relname}</literal></term>
     <listitem>
      <para>
       Name of the table on which the trigger fired. This has been deprecated,
       and could be removed in a future release.
       Please use <literal>$_TD->{table_name}</literal> instead.
      </para>
     </listitem>
    </varlistentry>

    <varlistentry>
     <term><literal>$_TD->{table_schema}</literal></term>
     <listitem>
      <para>
       Name of the schema containing the table on which the trigger fired
      </para>
     </listitem>
    </varlistentry>

    <varlistentry>
     <term><literal>$_TD->{argc}</literal></term>
     <listitem>
      <para>
       Number of arguments of the trigger function
      </para>
     </listitem>
    </varlistentry>

    <varlistentry>
     <term><literal>@{$_TD->{args}}</literal></term>
     <listitem>
      <para>
       Arguments of the trigger function.  Does not exist if <literal>$_TD->{argc}</literal> is 0.
      </para>
     </listitem>
    </varlistentry>

   </variablelist>
  </para>

  <para>
   Triggers can return one of the following:

   <variablelist>
    <varlistentry>
     <term><literal>return;</literal></term>
     <listitem>
      <para>
       Execute the statement
      </para>
     </listitem>
    </varlistentry>

    <varlistentry>
     <term><literal>"SKIP"</literal></term>
     <listitem>
      <para>
       Don't execute the statement
      </para>
     </listitem>
    </varlistentry>

    <varlistentry>
     <term><literal>"MODIFY"</literal></term>
     <listitem>
      <para>
       Indicates that the <literal>NEW</literal> row was modified by
       the trigger function
      </para>
     </listitem>
    </varlistentry>
   </variablelist>
  </para>

  <para>
   Here is an example of a trigger function, illustrating some of the
   above:
<programlisting>
CREATE TABLE test (
    i int,
    v varchar
);

CREATE OR REPLACE FUNCTION valid_id() RETURNS trigger AS $$
    if (($_TD->{new}{i} >= 100) || ($_TD->{new}{i} <= 0)) {
        return "SKIP";    # skip INSERT/UPDATE command
    } elsif ($_TD->{new}{v} ne "immortal") {
        $_TD->{new}{v} .= "(modified by trigger)";
        return "MODIFY";  # modify row and execute INSERT/UPDATE command
    } else {
        return;           # execute INSERT/UPDATE command
    }
$$ LANGUAGE plperl;

CREATE TRIGGER test_valid_id_trig
    BEFORE INSERT OR UPDATE ON test
    FOR EACH ROW EXECUTE PROCEDURE valid_id();
</programlisting>
</para>
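
  <para>
   The other <varname>$_TD</varname> fields can be read in the same way.
   The following sketch reports the event, timing, level, and table of each
   firing, using the first trigger argument as a label when one is given.
   It assumes the <literal>test</literal> table defined above; the names
   <function>log_change</function> and <literal>test_log_trig</literal>
   are hypothetical:

<programlisting>
-- log_change and test_log_trig are hypothetical names used for illustration.
CREATE OR REPLACE FUNCTION log_change() RETURNS trigger AS $$
    # Use the first trigger argument as a label when one was supplied;
    # $_TD->{args} does not exist when $_TD->{argc} is 0.
    my $label = $_TD->{argc} ? $_TD->{args}[0] : 'change';
    elog(NOTICE, sprintf('%s: %s %s on %s.%s (%s level)',
                         $label, $_TD->{when}, $_TD->{event},
                         $_TD->{table_schema}, $_TD->{table_name},
                         $_TD->{level}));
    return;    # execute the statement unchanged
$$ LANGUAGE plperl;

CREATE TRIGGER test_log_trig
    AFTER INSERT OR UPDATE OR DELETE ON test
    FOR EACH ROW EXECUTE PROCEDURE log_change('audit');
</programlisting>
  </para>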
 </sect1>

 <sect1 id="plperl-missing">
  <title>Limitations and Missing Features</title>

  <para>
   The following features are currently missing from PL/Perl, but they
   would make welcome contributions.

   <itemizedlist>
    <listitem>
     <para>
      PL/Perl functions cannot call each other directly (because they
      are anonymous subroutines inside Perl).
     </para>
    </listitem>

    <listitem>
     <para>
      SPI is not yet fully implemented.
     </para>
    </listitem>

    <listitem>
     <para>
      If you are fetching very large data sets using
      <literal>spi_exec_query</literal>, you should be aware that
      these will all go into memory.  You can avoid this by using
      <literal>spi_query</literal>/<literal>spi_fetchrow</literal> as
      illustrated earlier.
     </para>
     <para>
      A similar problem occurs if a set-returning function passes a
      large set of rows back to PostgreSQL via <literal>return</literal>. You
      can avoid this problem too by instead using
      <literal>return_next</literal> for each row returned, as shown
      previously.
     </para>

    </listitem>
   </itemizedlist>
  </para>
 </sect1>

</chapter>