mirror of
https://github.com/postgres/postgres.git
synced 2025-10-24 01:29:19 +03:00
475 lines
18 KiB
HTML
475 lines
18 KiB
HTML
<HTML>
|
|
<HEAD>
|
|
<TITLE>The POSTGRES95 User Manual - EXTENDING SQL: FUNCTIONS</TITLE>
|
|
</HEAD>
|
|
|
|
<BODY>
|
|
|
|
<font size=-1>
|
|
<A HREF="pg95user.html">[ TOC ]</A>
|
|
<A HREF="extend.html">[ Previous ]</A>
|
|
<A HREF="xtypes.html">[ Next ]</A>
|
|
</font>
|
|
<HR>
|
|
<H1>7. EXTENDING <B>SQL</B>: FUNCTIONS</H1>
|
|
<HR>
|
|
As it turns out, part of defining a new type is the
|
|
definition of functions that describe its behavior.
|
|
Consequently, while it is possible to define a new
|
|
function without defining a new type, the reverse is
|
|
not true. We therefore describe how to add new functions
|
|
to POSTGRES before describing how to add new
|
|
types.
|
|
POSTGRES <B>SQL</B> provides two types of functions: query
|
|
language functions (functions written in <B>SQL</B> and
|
|
programming language functions (functions written in a
|
|
compiled programming language such as <B>C</B>.) Either kind
|
|
of function can take a base type, a composite type or
|
|
some combination as arguments (parameters). In addition,
|
|
both kinds of functions can return a base type or
|
|
a composite type. It's easier to define <B>SQL</B> functions,
|
|
so we'll start with those.
|
|
Examples in this section can also be found in <CODE>funcs.sql</CODE>
|
|
and <CODE>C-code/funcs.c</CODE>.
|
|
<p>
|
|
<H2><A NAME="query-language-sql-functions">7.1. Query Language (<B>SQL</B>) Functions</A></H2>
|
|
|
|
<H3><A NAME="sql-functions-on-base-types">7.1.1. <B>SQL</B> Functions on Base Types</A></H3>
|
|
The simplest possible <B>SQL</B> function has no arguments and
|
|
simply returns a base type, such as <B>int4</B>:
|
|
|
|
<pre> CREATE FUNCTION one() RETURNS int4
|
|
AS 'SELECT 1 as RESULT' LANGUAGE 'sql';
|
|
|
|
|
|
SELECT one() AS answer;
|
|
|
|
+-------+
|
|
|answer |
|
|
+-------+
|
|
|1 |
|
|
+-------+
|
|
</pre>
|
|
Notice that we defined a target list for the function
|
|
(with the name RESULT), but the target list of the
|
|
query that invoked the function overrode the function's
|
|
target list. Hence, the result is labelled answer
|
|
instead of one.
|
|
<p>
|
|
It's almost as easy to define <B>SQL</B> functions that take
|
|
base types as arguments. In the example below, notice
|
|
how we refer to the arguments within the function as $1
|
|
and $2.
|
|
|
|
<pre> CREATE FUNCTION add_em(int4, int4) RETURNS int4
|
|
AS 'SELECT $1 + $2;' LANGUAGE 'sql';
|
|
|
|
|
|
SELECT add_em(1, 2) AS answer;
|
|
|
|
|
|
+-------+
|
|
|answer |
|
|
+-------+
|
|
|3 |
|
|
+-------+
|
|
</pre>
|
|
|
|
<H3>7.1.2. <B>SQL</B> Functions on Composite Types</H3>
|
|
When specifying functions with arguments of composite
|
|
types (such as EMP), we must not only specify which
|
|
argument we want (as we did above with $1 and $2) but
|
|
also the attributes of that argument. For example,
|
|
take the function double_salary that computes what your
|
|
salary would be if it were doubled.
|
|
|
|
<pre> CREATE FUNCTION double_salary(EMP) RETURNS int4
|
|
AS 'SELECT $1.salary * 2 AS salary;' LANGUAGE 'sql';
|
|
|
|
SELECT name, double_salary(EMP) AS dream
|
|
FROM EMP
|
|
WHERE EMP.dept = 'toy';
|
|
|
|
|
|
+-----+-------+
|
|
|name | dream |
|
|
+-----+-------+
|
|
|Sam | 2400 |
|
|
+-----+-------+
|
|
</pre>
|
|
Notice the use of the syntax $1.salary.
|
|
Before launching into the subject of functions that
|
|
return composite types, we must first introduce the
|
|
function notation for projecting attributes. The simple way
|
|
to explain this is that we can usually use the
|
|
notation attribute(class) and class.attribute interchangably.
|
|
|
|
<pre> --
|
|
-- this is the same as:
|
|
-- SELECT EMP.name AS youngster FROM EMP WHERE EMP.age < 30
|
|
--
|
|
SELECT name(EMP) AS youngster
|
|
FROM EMP
|
|
WHERE age(EMP) < 30;
|
|
|
|
|
|
+----------+
|
|
|youngster |
|
|
+----------+
|
|
|Sam |
|
|
+----------+
|
|
</pre>
|
|
As we shall see, however, this is not always the case.
|
|
This function notation is important when we want to use
|
|
a function that returns a single instance. We do this
|
|
by assembling the entire instance within the function,
|
|
attribute by attribute. This is an example of a function
|
|
that returns a single EMP instance:
|
|
|
|
<pre> CREATE FUNCTION new_emp() RETURNS EMP
|
|
AS 'SELECT \'None\'::text AS name,
|
|
1000 AS salary,
|
|
25 AS age,
|
|
\'none\'::char16 AS dept;'
|
|
LANGUAGE 'sql';
|
|
</pre>
|
|
|
|
In this case we have specified each of the attributes
|
|
with a constant value, but any computation or expression
|
|
could have been substituted for these constants.
|
|
Defining a function like this can be tricky. Some of
|
|
the more important caveats are as follows:
|
|
|
|
|
|
<UL>
|
|
<LI>The target list order must be exactly the same as
|
|
that in which the attributes appear in the <B>CREATE
|
|
TABLE</B> statement (or when you execute a .* query).
|
|
<LI>You must be careful to typecast the expressions
|
|
(using ::) very carefully or you will see the following error:
|
|
|
|
<pre> WARN::function declared to return type EMP does not retrieve (EMP.*)
|
|
</pre>
|
|
<LI>When calling a function that returns an instance, we
|
|
cannot retrieve the entire instance. We must either
|
|
project an attribute out of the instance or pass the
|
|
entire instance into another function.
|
|
<pre> SELECT name(new_emp()) AS nobody;
|
|
|
|
|
|
+-------+
|
|
|nobody |
|
|
+-------+
|
|
|None |
|
|
+-------+
|
|
</pre>
|
|
<LI>The reason why, in general, we must use the function
|
|
syntax for projecting attributes of function return
|
|
values is that the parser just doesn't understand
|
|
the other (dot) syntax for projection when combined
|
|
with function calls.
|
|
|
|
<pre> SELECT new_emp().name AS nobody;
|
|
WARN:parser: syntax error at or near "."
|
|
</pre>
|
|
</UL>
|
|
|
|
Any collection of commands in the <B>SQL</B> query language
|
|
can be packaged together and defined as a function.
|
|
The commands can include updates (i.e., <B>insert</B>, <B>update</B>
|
|
and <B>delete</B>) as well as <B>select</B> queries. However, the
|
|
final command must be a <B>select</B> that returns whatever is
|
|
specified as the function's returntype.
|
|
|
|
<pre>
|
|
CREATE FUNCTION clean_EMP () RETURNS int4
|
|
AS 'DELETE FROM EMP WHERE EMP.salary <= 0;
|
|
SELECT 1 AS ignore_this'
|
|
LANGUAGE 'sql';
|
|
|
|
SELECT clean_EMP();
|
|
|
|
|
|
+--+
|
|
|x |
|
|
+--+
|
|
|1 |
|
|
+--+
|
|
</pre>
|
|
<p>
|
|
|
|
<H2><A NAME="programming-language-functions">7.2. Programming Language Functions</A></H2>
|
|
<H3><A NAME="programming-language-functions-on-base-types">7.2.1. Programming Language Functions on Base Types</A></H3>
|
|
Internally, POSTGRES regards a base type as a "blob of
|
|
memory." The user-defined functions that you define
|
|
over a type in turn define the way that POSTGRES can
|
|
operate on it. That is, POSTGRES will only store and
|
|
retrieve the data from disk and use your user-defined
|
|
functions to input, process, and output the data.
|
|
Base types can have one of three internal formats:
|
|
<UL>
|
|
<LI>pass by value, fixed-length
|
|
<LI>pass by reference, fixed-length
|
|
<LI>pass by reference, variable-length
|
|
</UL>
|
|
By-value types can only be 1, 2 or 4 bytes in length
|
|
(even if your computer supports by-value types of other
|
|
sizes). POSTGRES itself only passes integer types by
|
|
value. You should be careful to define your types such
|
|
that they will be the same size (in bytes) on all
|
|
architectures. For example, the <B>long</B> type is dangerous
|
|
because it is 4 bytes on some machines and 8 bytes on
|
|
others, whereas <B>int</B> type is 4 bytes on most <B>UNIX</B>
|
|
machines (though not on most personal computers). A
|
|
reasonable implementation of the <B>int4</B> type on <B>UNIX</B>
|
|
machines might be:
|
|
|
|
<pre> /* 4-byte integer, passed by value */
|
|
typedef int int4;
|
|
</pre>
|
|
|
|
On the other hand, fixed-length types of any size may
|
|
be passed by-reference. For example, here is a sample
|
|
implementation of the POSTGRES char16 type:
|
|
|
|
<pre> /* 16-byte structure, passed by reference */
|
|
typedef struct {
|
|
char data[16];
|
|
} char16;
|
|
</pre>
|
|
|
|
Only pointers to such types can be used when passing
|
|
them in and out of POSTGRES functions.
|
|
Finally, all variable-length types must also be passed
|
|
by reference. All variable-length types must begin
|
|
with a length field of exactly 4 bytes, and all data to
|
|
be stored within that type must be located in the memory
|
|
immediately following that length field. The
|
|
length field is the total length of the structure
|
|
(i.e., it includes the size of the length field
|
|
itself). We can define the text type as follows:
|
|
|
|
<pre> typedef struct {
|
|
int4 length;
|
|
char data[1];
|
|
} text;
|
|
</pre>
|
|
|
|
Obviously, the data field is not long enough to hold
|
|
all possible strings -- it's impossible to declare such
|
|
a structure in <B>C</B>. When manipulating variable-length
|
|
types, we must be careful to allocate the correct
|
|
amount of memory and initialize the length field. For
|
|
example, if we wanted to store 40 bytes in a text
|
|
structure, we might use a code fragment like this:
|
|
|
|
<pre> #include "postgres.h"
|
|
#include "utils/palloc.h"
|
|
|
|
...
|
|
|
|
char buffer[40]; /* our source data */
|
|
|
|
...
|
|
|
|
text *destination = (text *) palloc(VARHDRSZ + 40);
|
|
destination->length = VARHDRSZ + 40;
|
|
memmove(destination->data, buffer, 40);
|
|
|
|
...
|
|
|
|
</pre>
|
|
Now that we've gone over all of the possible structures
|
|
for base types, we can show some examples of real functions.
|
|
Suppose <CODE>funcs.c</CODE> look like:
|
|
|
|
<pre> #include <string.h>
|
|
#include "postgres.h" /* for char16, etc. */
|
|
#include "utils/palloc.h" /* for palloc */
|
|
|
|
int
|
|
add_one(int arg)
|
|
{
|
|
return(arg + 1);
|
|
}
|
|
|
|
char16 *
|
|
concat16(char16 *arg1, char16 *arg2)
|
|
{
|
|
char16 *new_c16 = (char16 *) palloc(sizeof(char16));
|
|
|
|
memset((void *) new_c16, 0, sizeof(char16));
|
|
(void) strncpy(new_c16, arg1, 16);
|
|
return (char16 *)(strncat(new_c16, arg2, 16));
|
|
}
|
|
<p>
|
|
text *
|
|
copytext(text *t)
|
|
{
|
|
/*
|
|
* VARSIZE is the total size of the struct in bytes.
|
|
*/
|
|
text *new_t = (text *) palloc(VARSIZE(t));
|
|
<p>
|
|
memset(new_t, 0, VARSIZE(t));
|
|
<p>
|
|
VARSIZE(new_t) = VARSIZE(t);
|
|
/*
|
|
* VARDATA is a pointer to the data region of the struct.
|
|
*/
|
|
memcpy((void *) VARDATA(new_t), /* destination */
|
|
(void *) VARDATA(t), /* source */
|
|
VARSIZE(t)-VARHDRSZ); /* how many bytes */
|
|
<p>
|
|
return(new_t);
|
|
}
|
|
</pre>
|
|
On <B>OSF/1</B> we would type:
|
|
|
|
<pre> CREATE FUNCTION add_one(int4) RETURNS int4
|
|
AS '/usr/local/postgres95/tutorial/obj/funcs.so' LANGUAGE 'c';
|
|
|
|
CREATE FUNCTION concat16(char16, char16) RETURNS char16
|
|
AS '/usr/local/postgres95/tutorial/obj/funcs.so' LANGUAGE 'c';
|
|
|
|
CREATE FUNCTION copytext(text) RETURNS text
|
|
AS '/usr/local/postgres95/tutorial/obj/funcs.so' LANGUAGE 'c';
|
|
</pre>
|
|
|
|
On other systems, we might have to make the filename
|
|
end in .sl (to indicate that it's a shared library).
|
|
<p>
|
|
<H3><A NAME="programming-language-functions-on-composite-types">7.2.2. Programming Language Functions on Composite Types</A></H3>
|
|
Composite types do not have a fixed layout like C
|
|
structures. Instances of a composite type may contain
|
|
null fields. In addition, composite types that are
|
|
part of an inheritance hierarchy may have different
|
|
fields than other members of the same inheritance hierarchy.
|
|
Therefore, POSTGRES provides a procedural
|
|
interface for accessing fields of composite types from
|
|
C.
|
|
As POSTGRES processes a set of instances, each instance
|
|
will be passed into your function as an opaque structure of type <B>TUPLE</B>.
|
|
Suppose we want to write a function to answer the query
|
|
|
|
<pre> * SELECT name, c_overpaid(EMP, 1500) AS overpaid
|
|
FROM EMP
|
|
WHERE name = 'Bill' or name = 'Sam';
|
|
</pre>
|
|
In the query above, we can define c_overpaid as:
|
|
|
|
<pre> #include "postgres.h" /* for char16, etc. */
|
|
#include "libpq-fe.h" /* for TUPLE */
|
|
<p>
|
|
bool
|
|
c_overpaid(TUPLE t,/* the current instance of EMP */
|
|
int4 limit)
|
|
{
|
|
bool isnull = false;
|
|
int4 salary;
|
|
<p>
|
|
salary = (int4) GetAttributeByName(t, "salary", &isnull);
|
|
<p>
|
|
if (isnull)
|
|
return (false);
|
|
return(salary > limit);
|
|
}
|
|
</pre>
|
|
|
|
<B>GetAttributeByName</B> is the POSTGRES system function that
|
|
returns attributes out of the current instance. It has
|
|
three arguments: the argument of type TUPLE passed into
|
|
the function, the name of the desired attribute, and a
|
|
return parameter that describes whether the attribute
|
|
is null. <B>GetAttributeByName</B> will align data properly
|
|
so you can cast its return value to the desired type.
|
|
For example, if you have an attribute name which is of
|
|
the type char16, the <B>GetAttributeByName</B> call would look
|
|
like:
|
|
|
|
<pre> char *str;
|
|
...
|
|
str = (char *) GetAttributeByName(t, "name", &isnull)
|
|
</pre>
|
|
|
|
The following query lets POSTGRES know about the
|
|
c_overpaid function:
|
|
|
|
<pre> * CREATE FUNCTION c_overpaid(EMP, int4) RETURNS bool
|
|
AS '/usr/local/postgres95/tutorial/obj/funcs.so' LANGUAGE 'c';
|
|
</pre>
|
|
While there are ways to construct new instances or modify
|
|
existing instances from within a C function, these
|
|
are far too complex to discuss in this manual.
|
|
<p>
|
|
<H3><A NAME="caveats">7.2.3. Caveats</A></H3>
|
|
We now turn to the more difficult task of writing
|
|
programming language functions. Be warned: this section
|
|
of the manual will not make you a programmer. You must
|
|
have a good understanding of <B>C</B> (including the use of
|
|
pointers and the malloc memory manager) before trying
|
|
to write <B>C</B> functions for use with POSTGRES.
|
|
While it may be possible to load functions written in
|
|
languages other than <B>C</B> into POSTGRES, this is often
|
|
difficult (when it is possible at all) because other
|
|
languages, such as <B>FORTRAN</B> and <B>Pascal</B> often do not follow
|
|
the same "calling convention" as <B>C</B>. That is, other
|
|
languages do not pass argument and return values
|
|
between functions in the same way. For this reason, we
|
|
will assume that your programming language functions
|
|
are written in <B>C</B>.
|
|
The basic rules for building <B>C</B> functions are as follows:
|
|
<OL>
|
|
<LI> Most of the header (include) files for POSTGRES
|
|
should already be installed in
|
|
/usr/local/postgres95/include (see Figure 2).
|
|
You should always include
|
|
|
|
<pre> -I/usr/local/postgres95/include
|
|
</pre>
|
|
on your cc command lines. Sometimes, you may
|
|
find that you require header files that are in
|
|
the server source itself (i.e., you need a file
|
|
we neglected to install in include). In those
|
|
cases you may need to add one or more of
|
|
<pre>
|
|
-I/usr/local/postgres95/src/backend
|
|
-I/usr/local/postgres95/src/backend/include
|
|
-I/usr/local/postgres95/src/backend/port/<PORTNAME>
|
|
-I/usr/local/postgres95/src/backend/obj
|
|
</pre>
|
|
|
|
(where <PORTNAME> is the name of the port, e.g.,
|
|
alpha or sparc).
|
|
<LI> When allocating memory, use the POSTGRES
|
|
routines palloc and pfree instead of the
|
|
corresponding <B>C</B> library routines malloc and free.
|
|
The memory allocated by palloc will be freed
|
|
automatically at the end of each transaction,
|
|
preventing memory leaks.
|
|
<LI> Always zero the bytes of your structures using
|
|
memset or bzero. Several routines (such as the
|
|
hash access method, hash join and the sort algorithm)
|
|
compute functions of the raw bits contained in
|
|
your structure. Even if you initialize all fields
|
|
of your structure, there may be
|
|
several bytes of alignment padding (holes in the
|
|
structure) that may contain garbage values.
|
|
<LI> Most of the internal POSTGRES types are declared
|
|
in postgres.h, so it's usually a good idea to
|
|
include that file as well.
|
|
<LI> Compiling and loading your object code so that
|
|
it can be dynamically loaded into POSTGRES
|
|
always requires special flags. See Appendix A
|
|
for a detailed explanation of how to do it for
|
|
your particular operating system.
|
|
</OL>
|
|
<HR>
|
|
<font size=-1>
|
|
<A HREF="pg95user.html">[ TOC ]</A>
|
|
<A HREF="extend.html">[ Previous ]</A>
|
|
<A HREF="xtypes.html">[ Next ]</A>
|
|
</font>
|
|
</BODY>
|
|
</HTML>
|