mirror of
				https://github.com/postgres/postgres.git
				synced 2025-10-31 10:30:33 +03:00 
			
		
		
		
	
		
			
				
	
	
		
			475 lines
		
	
	
		
			18 KiB
		
	
	
	
		
			HTML
		
	
	
	
	
	
			
		
		
	
	
| <HTML>
 | |
| <HEAD>
 | |
| 	<TITLE>The POSTGRES95 User Manual - EXTENDING SQL: FUNCTIONS</TITLE>
 | |
| </HEAD>
 | |
| 
 | |
| <BODY>
 | |
| 
 | |
| <font size=-1>
 | |
| <A HREF="pg95user.html">[ TOC ]</A> 
 | |
| <A HREF="extend.html">[ Previous ]</A> 
 | |
| <A HREF="xtypes.html">[ Next ]</A> 
 | |
| </font>
 | |
| <HR>
 | |
| <H1>7.  EXTENDING <B>SQL</B>: FUNCTIONS</H1>
 | |
| <HR>
 | |
|      As  it  turns  out,  part of defining a new type is the
 | |
|      definition of functions  that  describe  its  behavior.
 | |
|      Consequently,  while  it  is  possible  to define a new
 | |
|      function without defining a new type,  the  reverse  is
 | |
|      not  true.   We therefore describe how to add new functions 
 | |
|      to POSTGRES before  describing  how  to  add  new
 | |
|      types.
 | |
|      POSTGRES  <B>SQL</B>  provides  two  types of functions: query
 | |
|      language functions (functions written in <B>SQL</B>) and
 | |
|      programming  language  functions  (functions  written in a
 | |
|      compiled programming language such as <B>C</B>).  Either  kind
 | |
|      of  function  can take a base type, a composite type or
 | |
|      some combination as arguments (parameters).   In  addition, 
 | |
|      both kinds of functions can return a base type or
 | |
|      a composite type.  It's easier to define <B>SQL</B> functions,
 | |
|      so we'll start with those.
 | |
|      Examples in this section can also be found in <CODE>funcs.sql</CODE>
 | |
|      and <CODE>C-code/funcs.c</CODE>.
 | |
| <p>
 | |
| <H2><A NAME="query-language-sql-functions">7.1.  Query Language (<B>SQL</B>) Functions</A></H2>
 | |
| 
 | |
| <H3><A NAME="sql-functions-on-base-types">7.1.1.  <B>SQL</B> Functions on Base Types</A></H3>
 | |
|      The simplest possible <B>SQL</B> function has no arguments and
 | |
|      simply returns a base type, such as <B>int4</B>:
 | |
|      
 | |
| <pre>         CREATE FUNCTION one() RETURNS int4
 | |
|               AS 'SELECT 1 as RESULT' LANGUAGE 'sql';
 | |
| 
 | |
| 
 | |
|          SELECT one() AS answer;
 | |
| 
 | |
|          +-------+
 | |
|          |answer |
 | |
|          +-------+
 | |
|          |1      |
 | |
|          +-------+
 | |
| </pre>
 | |
|      Notice that we defined a target list for  the  function
 | |
|      (with  the  name  RESULT),  but  the target list of the
 | |
|      query that invoked the function overrode the function's
 | |
|      target  list.   Hence,  the  result  is labelled answer
 | |
|      instead of one.
 | |
| <p>
 | |
|      It's almost as easy to define <B>SQL</B> functions  that  take
 | |
|      base  types as arguments.  In the example below, notice
 | |
|      how we refer to the arguments within the function as $1
 | |
|      and $2.
 | |
|      
 | |
| <pre>         CREATE FUNCTION add_em(int4, int4) RETURNS int4
 | |
|               AS 'SELECT $1 + $2;' LANGUAGE 'sql';
 | |
| 
 | |
| 
 | |
|          SELECT add_em(1, 2) AS answer;
 | |
| 
 | |
| 
 | |
|          +-------+
 | |
|          |answer |
 | |
|          +-------+
 | |
|          |3      |
 | |
|          +-------+
 | |
| </pre>
 | |
| 
 | |
| <H3>7.1.2.  <B>SQL</B> Functions on Composite Types</H3>
 | |
|      When  specifying  functions with arguments of composite
 | |
|      types (such as EMP), we must  not  only  specify  which
 | |
|      argument  we  want (as we did above with $1 and $2) but
 | |
|      also the attributes of  that  argument.   For  example,
 | |
|      take the function double_salary that computes what your
 | |
|      salary would be if it were doubled.
 | |
|      
 | |
| <pre>         CREATE FUNCTION double_salary(EMP) RETURNS int4
 | |
|               AS 'SELECT $1.salary * 2 AS salary;' LANGUAGE 'sql';
 | |
| 
 | |
|          SELECT name, double_salary(EMP) AS dream
 | |
|            FROM EMP
 | |
|            WHERE EMP.dept = 'toy';
 | |
| 
 | |
| 
 | |
|          +-----+-------+
 | |
|          |name | dream |
 | |
|          +-----+-------+
 | |
|          |Sam  | 2400  |
 | |
|          +-----+-------+
 | |
| </pre>
 | |
|      Notice the use of the syntax $1.salary.
 | |
|      Before launching into the  subject  of  functions  that
 | |
|      return  composite  types,  we  must first introduce the
 | |
|      function notation for projecting attributes.  The  simplest
 | |
|      way to explain this is that we can usually use the
 | |
|      notation attribute(class)  and  class.attribute  interchangeably.
 | |
|      
 | |
| <pre>         --
 | |
|          -- this is the same as:
 | |
|          --   SELECT EMP.name AS youngster FROM EMP WHERE EMP.age < 30
 | |
|          --
 | |
|          SELECT name(EMP) AS youngster
 | |
|          FROM EMP
 | |
|          WHERE age(EMP) < 30;
 | |
| 
 | |
| 
 | |
|          +----------+
 | |
|          |youngster |
 | |
|          +----------+
 | |
|          |Sam       |
 | |
|          +----------+
 | |
| </pre>
 | |
|      As  we shall see, however, this is not always the case.
 | |
|      This function notation is important when we want to use
 | |
|      a  function that returns a single instance.  We do this
 | |
|      by assembling the entire instance within the  function,
 | |
|      attribute  by attribute.  This is an example of a function 
 | |
|      that returns a single EMP instance:
 | |
|      
 | |
| <pre>         CREATE FUNCTION new_emp() RETURNS EMP
 | |
|             AS 'SELECT \'None\'::text AS name,
 | |
|                        1000 AS salary,
 | |
|                        25 AS age,
 | |
|                        \'none\'::char16 AS dept;'
 | |
|             LANGUAGE 'sql';
 | |
| </pre>
 | |
| 
 | |
|      In this case we have specified each of  the  attributes
 | |
|      with  a  constant value, but any computation or expression 
 | |
|      could have been substituted for these constants.
 | |
|      Defining a function like this can be tricky.   Some  of
 | |
|      the more important caveats are as follows:
 | |
|      
 | |
|      
 | |
|      <UL>
 | |
|       <LI>The  target  list  order must be exactly the same as
 | |
|         that in which the attributes appear  in  the  <B>CREATE
 | |
|         TABLE</B> statement (or when you execute a .*  query).
 | |
|       <LI>You  must  typecast  the  expressions  (using ::)
 | |
|         very carefully or you will see  the  following error:
 | |
|         
 | |
| <pre>            WARN::function declared to return type EMP does not retrieve (EMP.*)
 | |
| </pre>
 | |
|       <LI>When calling a function that returns an instance, we
 | |
|         cannot retrieve the entire instance.  We must either
 | |
|         project an attribute out of the instance or pass the
 | |
|         entire instance into another function.
 | |
| <pre>            SELECT name(new_emp()) AS nobody;
 | |
| 
 | |
| 
 | |
|             +-------+
 | |
|             |nobody |
 | |
|             +-------+
 | |
|             |None   |
 | |
|             +-------+
 | |
| </pre>
 | |
|       <LI>The reason why, in general, we must use the function
 | |
|         syntax  for projecting attributes of function return
 | |
|         values is that the parser  just  doesn't  understand
 | |
|         the  other (dot) syntax for projection when combined
 | |
|         with function calls.
 | |
|         
 | |
| <pre>            SELECT new_emp().name AS nobody;
 | |
|             WARN:parser: syntax error at or near "."
 | |
| </pre>
 | |
|      </UL>
 | |
|      
 | |
|      Any collection of commands in the  <B>SQL</B>  query  language
 | |
|      can  be  packaged  together  and defined as a function.
 | |
|      The commands can include updates (i.e., <B>insert</B>,  <B>update</B>
 | |
|      and  <B>delete</B>)  as  well as <B>select</B> queries.  However, the
 | |
|      final command must be a <B>select</B> that returns whatever is
 | |
|      specified as the function's return type.
 | |
|      
 | |
| <pre>
 | |
|          CREATE FUNCTION clean_EMP () RETURNS int4
 | |
|             AS 'DELETE FROM EMP WHERE EMP.salary <= 0;
 | |
|                 SELECT 1 AS ignore_this'
 | |
|             LANGUAGE 'sql';
 | |
| 
 | |
|          SELECT clean_EMP();
 | |
| 
 | |
| 
 | |
|          +--+
 | |
|          |x |
 | |
|          +--+
 | |
|          |1 |
 | |
|          +--+
 | |
| </pre>
 | |
| <p>
 | |
| 
 | |
| <H2><A NAME="programming-language-functions">7.2.  Programming Language Functions</A></H2>
 | |
| <H3><A NAME="programming-language-functions-on-base-types">7.2.1.  Programming Language Functions on Base Types</A></H3>
 | |
|      Internally, POSTGRES regards a base type as a "blob  of
 | |
|      memory."   The  user-defined  functions that you define
 | |
|      over a type in turn define the way  that  POSTGRES  can
 | |
|      operate  on  it.  That is, POSTGRES will only store and
 | |
|      retrieve the data from disk and use  your  user-defined
 | |
|      functions to input, process, and output the data.
 | |
|      Base types can have one of three internal formats:
 | |
|      <UL>
 | |
|       <LI>pass by value, fixed-length
 | |
|       <LI>pass by reference, fixed-length
 | |
|       <LI>pass by reference, variable-length
 | |
|      </UL>
 | |
|      By-value  types  can  only be 1, 2 or 4 bytes in length
 | |
|      (even if your computer supports by-value types of other
 | |
|      sizes).   POSTGRES  itself only passes integer types by
 | |
|      value.  You should be careful to define your types such
 | |
|      that  they  will  be  the  same  size (in bytes) on all
 | |
|      architectures.  For example, the <B>long</B> type is dangerous
 | |
|      because  it  is 4 bytes on some machines and 8 bytes on
 | |
|      others, whereas the <B>int</B> type is 4  bytes  on  most  <B>UNIX</B>
 | |
|      machines  (though  not  on most personal computers).  A
 | |
|      reasonable implementation of  the  <B>int4</B>  type  on  <B>UNIX</B>
 | |
|      machines might be:
 | |
|      
 | |
| <pre>         /* 4-byte integer, passed by value */
 | |
|          typedef int int4;
 | |
| </pre>
 | |
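|      As a hedged illustration of the size concern above, the sketch
 | |
|      below pins a by-value type to exactly 4 bytes using the C99
 | |
|      header stdint.h, which postdates this manual.  The names my_int4
 | |
|      and my_int4_add_one are hypothetical and are not part of POSTGRES.
 | |

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical stand-in for int4: exactly 32 bits wide on every platform. */
typedef int32_t my_int4;

/*
 * Compile-time check: the array size becomes -1 (a compile error)
 * if my_int4 is not 32 bits wide.
 */
typedef char my_int4_must_be_32_bits[(sizeof(my_int4) * CHAR_BIT == 32) ? 1 : -1];

/* A by-value function: the argument and result are passed directly. */
my_int4
my_int4_add_one(my_int4 arg)
{
    return arg + 1;
}
```

|      With a fixed-width typedef of this kind, the stored size no
 | |
|      longer depends on how wide int or long is on a given machine.
 | |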
| 
 | |
|      On  the  other hand, fixed-length types of any size may
 | |
|      be passed by-reference.  For example, here is a  sample
 | |
|      implementation of the POSTGRES char16 type:
 | |
|      
 | |
| <pre>         /* 16-byte structure, passed by reference */
 | |
|          typedef struct {
 | |
|              char data[16];
 | |
|          } char16;
 | |
| </pre>
 | |
| 
 | |
|      Only  pointers  to  such types can be used when passing
 | |
|      them in and out of POSTGRES functions.
 | |
|      Finally, all variable-length types must also be  passed
 | |
|      by  reference.   All  variable-length  types must begin
 | |
|      with a length field of exactly 4 bytes, and all data to
 | |
|      be  stored within that type must be located in the memory 
 | |
|      immediately  following  that  length  field.   The
 | |
|      length  field  is  the  total  length  of the structure
 | |
|      (i.e.,  it  includes  the  size  of  the  length  field
 | |
|      itself).  We can define the text type as follows:
 | |
| 
 | |
| <pre>         typedef struct {
 | |
|              int4 length;
 | |
|              char data[1];
 | |
|          } text;
 | |
| </pre>
 | |
| 
 | |
|      Obviously,  the  data  field is not long enough to hold
 | |
|      all possible strings -- it's impossible to declare such
 | |
|      a  structure  in  <B>C</B>.  When manipulating variable-length
 | |
|      types, we must  be  careful  to  allocate  the  correct
 | |
|      amount  of memory and initialize the length field.  For
 | |
|      example, if we wanted to  store  40  bytes  in  a  text
 | |
|      structure, we might use a code fragment like this:
 | |
| 
 | |
| <pre>         #include "postgres.h"
 | |
|          #include "utils/palloc.h"
 | |
| 
 | |
|          ...
 | |
| 
 | |
|          char buffer[40]; /* our source data */
 | |
| 
 | |
|          ...
 | |
| 
 | |
|          text *destination = (text *) palloc(VARHDRSZ + 40);
 | |
|          destination->length = VARHDRSZ + 40;
 | |
|          memmove(destination->data, buffer, 40);
 | |
| 
 | |
|          ...
 | |
| 
 | |
| </pre>
 | |
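|      The allocation steps above can also be sketched outside the
 | |
|      server.  The following is a minimal, self-contained illustration
 | |
|      and not POSTGRES code: malloc stands in for palloc, and the names
 | |
|      my_text, MY_VARHDRSZ and make_my_text are hypothetical.
 | |

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical mirror of the text layout: a 4-byte length, then the data. */
typedef struct {
    int32_t length;   /* total size in bytes, including this field */
    char    data[1];  /* really (length - MY_VARHDRSZ) bytes long */
} my_text;

/* Size of the header that precedes the data (4 bytes with this layout). */
#define MY_VARHDRSZ ((int32_t) offsetof(my_text, data))

/*
 * Copy nbytes from src into a newly allocated my_text.
 * Returns NULL if the allocation fails; the caller frees the result.
 * (Assumption: malloc replaces palloc so the sketch runs standalone.)
 */
my_text *
make_my_text(const char *src, int32_t nbytes)
{
    my_text *t = (my_text *) malloc((size_t) MY_VARHDRSZ + (size_t) nbytes);

    if (t == NULL)
        return NULL;
    /* the length field counts the header as well as the data */
    t->length = MY_VARHDRSZ + nbytes;
    /* the data starts immediately after the length field */
    memcpy((char *) t + MY_VARHDRSZ, src, (size_t) nbytes);
    return t;
}
```

|      Storing 40 bytes this way yields a length field of 44, because
 | |
|      the length covers the 4-byte header as well as the data.
 | |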
|      Now that we've gone over all of the possible structures
 | |
|      for base types, we can show some examples of real functions. 
 | |
|      Suppose <CODE>funcs.c</CODE> looks like:
 | |
| 
 | |
| <pre>         #include &lt;string.h&gt;
 | |
|          #include "postgres.h"  /* for char16, etc. */
 | |
|          #include "utils/palloc.h" /* for palloc */
 | |
| 
 | |
|          int
 | |
|          add_one(int arg)
 | |
|          {
 | |
|              return(arg + 1);
 | |
|          }
 | |
| 
 | |
|          char16 *
 | |
|          concat16(char16 *arg1, char16 *arg2)
 | |
|          {
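 | |
|              char16 *new_c16 = (char16 *) palloc(sizeof(char16));
 | |
|              int     i, j;
 | |
| 
 | |
|              memset((void *) new_c16, 0, sizeof(char16));
 | |
|              /* char16 is zero-padded and need not be NUL-terminated, */
 | |
|              /* so copy byte by byte and stop after 16 bytes in total */
 | |
|              for (i = 0; i < 16 && arg1->data[i] != '\0'; i++)
 | |
|                  new_c16->data[i] = arg1->data[i];
 | |
|              for (j = 0; i < 16 && j < 16 && arg2->data[j] != '\0'; i++, j++)
 | |
|                  new_c16->data[i] = arg2->data[j];
 | |
|              return new_c16;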
 | |
|          }
 | |
| 
 | |
|          text *
 | |
|          copytext(text *t)
 | |
|          {
 | |
|              /*
 | |
|               * VARSIZE is the total size of the struct in bytes.
 | |
|               */
 | |
|              text *new_t = (text *) palloc(VARSIZE(t));
 | |
| 
 | |
|              memset(new_t, 0, VARSIZE(t));
 | |
| 
 | |
|              VARSIZE(new_t) = VARSIZE(t);
 | |
|              /*
 | |
|               * VARDATA is a pointer to the data region of the struct.
 | |
|               */
 | |
|              memcpy((void *) VARDATA(new_t), /* destination */
 | |
|                     (void *) VARDATA(t),     /* source */
 | |
|                     VARSIZE(t)-VARHDRSZ);        /* how many bytes */
 | |
| 
 | |
|              return(new_t);
 | |
|          }
 | |
| </pre>
 | |
|      On <B>OSF/1</B> we would type:
 | |
|      
 | |
| <pre>         CREATE FUNCTION add_one(int4) RETURNS int4
 | |
|               AS '/usr/local/postgres95/tutorial/obj/funcs.so' LANGUAGE 'c';
 | |
| 
 | |
|          CREATE FUNCTION concat16(char16, char16) RETURNS char16
 | |
|               AS '/usr/local/postgres95/tutorial/obj/funcs.so' LANGUAGE 'c';
 | |
| 
 | |
|          CREATE FUNCTION copytext(text) RETURNS text
 | |
|               AS '/usr/local/postgres95/tutorial/obj/funcs.so' LANGUAGE 'c';
 | |
| </pre>
 | |
| 
 | |
|      On  other  systems,  we might have to make the filename
 | |
|      end in .sl (to indicate that it's a shared library).
 | |
| <p>
 | |
| <H3><A NAME="programming-language-functions-on-composite-types">7.2.2.  Programming Language Functions on Composite Types</A></H3>
 | |
|      Composite types do not  have  a  fixed  layout  like  C
 | |
|      structures.   Instances of a composite type may contain
 | |
|      null fields.  In addition,  composite  types  that  are
 | |
|      part  of  an  inheritance  hierarchy may have different
 | |
|      fields than other members of the same inheritance hierarchy.    
 | |
|      Therefore,  POSTGRES  provides  a  procedural
 | |
|      interface for accessing fields of composite types  from
 | |
|      C.
 | |
|      As POSTGRES processes a set of instances, each instance
 | |
|      will be passed into your function as an  opaque  structure of type <B>TUPLE</B>.
 | |
|      Suppose we want to write a function to answer the query
 | |
| 
 | |
| <pre>         * SELECT name, c_overpaid(EMP, 1500) AS overpaid
 | |
|            FROM EMP
 | |
|            WHERE name = 'Bill' or name = 'Sam';
 | |
| </pre>
 | |
|      In the query above, we can define c_overpaid as:
 | |
|      
 | |
| <pre>         #include "postgres.h"  /* for char16, etc. */
 | |
|          #include "libpq-fe.h" /* for TUPLE */
 | |
| 
 | |
|          bool
 | |
|          c_overpaid(TUPLE t,/* the current instance of EMP */
 | |
|                     int4 limit)
 | |
|          {
 | |
|              bool isnull = false;
 | |
|              int4 salary;
 | |
| 
 | |
|              salary = (int4) GetAttributeByName(t, "salary", &isnull);
 | |
| 
 | |
|              if (isnull)
 | |
|                  return (false);
 | |
|              return(salary > limit);
 | |
|          }
 | |
| </pre>
 | |
| 
 | |
|      <B>GetAttributeByName</B> is the POSTGRES system function that
 | |
|      returns attributes out of the current instance.  It has
 | |
|      three arguments: the argument of type TUPLE passed into
 | |
|      the  function, the name of the desired attribute, and a
 | |
|      return parameter that describes whether  the  attribute
 | |
|      is  null.   <B>GetAttributeByName</B> will align data properly
 | |
|      so you can cast its return value to the  desired  type.
 | |
|      For  example, if you have an attribute name which is of
 | |
|      the type char16, the <B>GetAttributeByName</B> call would look
 | |
|      like:
 | |
| 
 | |
| <pre>         char *str;
 | |
|          ...
 | |
|          str = (char *) GetAttributeByName(t, "name", &isnull);
 | |
| </pre>
 | |
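|      To make the calling pattern above concrete without the server
 | |
|      headers, here is a self-contained mock.  It is only a sketch: the
 | |
|      real TUPLE is opaque, and mock_attr, mock_tuple,
 | |
|      mock_get_attribute_by_name and mock_overpaid are hypothetical
 | |
|      names.  Treating a missing attribute as null is a simplification
 | |
|      made for this example, not documented POSTGRES behavior.
 | |

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins; the real TUPLE is opaque and server-defined. */
typedef struct {
    const char *name;    /* attribute name */
    long        value;   /* attribute value, kept as an integer here */
    int         isnull;  /* nonzero if the attribute is null */
} mock_attr;

typedef struct {
    const mock_attr *attrs;
    size_t           nattrs;
} mock_tuple;

/*
 * Look up an attribute by name.  On return, *isnull is nonzero if the
 * attribute is null or (in this mock only) absent; the result is then 0.
 */
long
mock_get_attribute_by_name(const mock_tuple *t, const char *name, int *isnull)
{
    size_t i;

    for (i = 0; i < t->nattrs; i++)
    {
        if (strcmp(t->attrs[i].name, name) == 0)
        {
            *isnull = t->attrs[i].isnull;
            return t->attrs[i].isnull ? 0 : t->attrs[i].value;
        }
    }
    *isnull = 1;
    return 0;
}

/* Same logic as c_overpaid: a null salary is never "overpaid". */
int
mock_overpaid(const mock_tuple *t, long limit)
{
    int  isnull = 0;
    long salary = mock_get_attribute_by_name(t, "salary", &isnull);

    if (isnull)
        return 0;
    return salary > limit;
}
```

|      The point of the pattern is that the null flag is checked before
 | |
|      the returned value is used, just as c_overpaid does.
 | |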
| 
 | |
|      The  following  query  lets  POSTGRES  know  about  the
 | |
|      c_overpaid function:
 | |
|      
 | |
| <pre>         * CREATE FUNCTION c_overpaid(EMP, int4) RETURNS bool
 | |
|               AS '/usr/local/postgres95/tutorial/obj/funcs.so' LANGUAGE 'c';
 | |
| </pre>
 | |
|      While there are ways to construct new instances or modify  
 | |
|      existing instances from within a C function, these
 | |
|      are far too complex to discuss in this manual.
 | |
| <p>
 | |
| <H3><A NAME="caveats">7.2.3.  Caveats</A></H3>
 | |
|      We now turn to the more difficult task of writing  
 | |
|      programming  language  functions.  Be warned: this section
 | |
|      of the manual will not make you a programmer.  You must
 | |
|      have  a  good  understanding of <B>C</B> (including the use of
 | |
|      pointers and the malloc memory manager)  before  trying
 | |
|      to write <B>C</B> functions for use with POSTGRES.
 | |
|      While  it  may be possible to load functions written in
 | |
|      languages other than <B>C</B> into  POSTGRES,  this  is  often
 | |
|      difficult  (when  it  is possible at all) because other
 | |
|      languages, such as <B>FORTRAN</B> and <B>Pascal</B>, often do not follow 
 | |
|      the same "calling convention" as <B>C</B>.  That is, other
 | |
|      languages  do  not  pass  argument  and  return  values
 | |
|      between functions in the same way.  For this reason, we
 | |
|      will assume that your  programming  language  functions
 | |
|      are written in <B>C</B>.
 | |
|      The  basic  rules  for building <B>C</B> functions are as follows:
 | |
|      <OL>
 | |
|       <LI>   Most of the header (include) files for  POSTGRES
 | |
|             should      already      be     installed     in
 | |
|             /usr/local/postgres95/include  (see  Figure  2).
 | |
|             You should always include
 | |
|             
 | |
| <pre>                -I/usr/local/postgres95/include
 | |
| </pre>
 | |
|             on  your  cc  command lines.  Sometimes, you may
 | |
|             find that you require header files that  are  in
 | |
|             the  server source itself (i.e., you need a file
 | |
|             we neglected to install in include).   In  those
 | |
|             cases you may need to add one or more of
 | |
| <pre>
 | |
|                 -I/usr/local/postgres95/src/backend
 | |
|                 -I/usr/local/postgres95/src/backend/include
 | |
|                 -I/usr/local/postgres95/src/backend/port/&lt;PORTNAME&gt;
 | |
|                 -I/usr/local/postgres95/src/backend/obj
 | |
| </pre>
 | |
| 
 | |
|             (where &lt;PORTNAME&gt; is the name of the port, e.g.,
 | |
|             alpha or sparc).
 | |
|       <LI>   When allocating memory, use  the  POSTGRES  
 | |
|             routines  palloc  and  pfree  instead of the 
 | |
|             corresponding <B>C</B> library  routines  malloc  and  free.
 | |
|             The  memory  allocated  by  palloc will be freed
 | |
|             automatically at the end  of  each  transaction,
 | |
|             preventing memory leaks.
 | |
|       <LI>   Always  zero  the bytes of your structures using
 | |
|             memset or bzero.  Several routines (such as  the
 | |
|             hash access method, hash join and the sort algorithm) 
 | |
|             compute functions of the  raw  bits  contained  in 
 | |
|             your structure.  Even if you initialize all fields 
 | |
|             of your structure, there  may  be
 | |
|             several bytes of alignment padding (holes in the
 | |
|             structure) that may contain garbage values.
 | |
|       <LI>   Most of the internal POSTGRES types are declared
 | |
|             in  postgres.h,  so  it's usually a good idea to
 | |
|             include that file as well.
 | |
|       <LI>   Compiling and loading your object code  so  that
 | |
|             it  can  be  dynamically  loaded  into  POSTGRES
 | |
|             always requires special flags.  See  Appendix  A
 | |
|             for  a  detailed explanation of how to do it for
 | |
|             your particular operating system.
 | |
|      </OL>
 | |
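|      The third rule above (zeroing structures) can be illustrated
 | |
|      with a small standalone sketch.  The structure padded_pair and
 | |
|      both functions are hypothetical; the point is only that two
 | |
|      logically equal structures compare equal byte for byte once
 | |
|      their padding has been zeroed.
 | |

```c
#include <string.h>

/* Hypothetical structure; most ABIs insert padding bytes after "flag". */
typedef struct {
    char flag;
    int  count;
} padded_pair;

/* Zero every byte, padding included, before assigning the fields. */
void
padded_pair_init(padded_pair *p, char flag, int count)
{
    memset(p, 0, sizeof(*p));
    p->flag = flag;
    p->count = count;
}

/* Compare the raw bytes, as a hash or sort routine might. */
int
padded_pair_same_bits(const padded_pair *a, const padded_pair *b)
{
    return memcmp(a, b, sizeof(*a)) == 0;
}
```

|      Without the memset, the padding bytes would hold whatever was
 | |
|      previously in memory, so the raw comparison could fail even when
 | |
|      every field matches.  Strictly speaking, ISO C does not promise
 | |
|      that padding survives later member stores, although mainstream
 | |
|      compilers leave it untouched in code like this.
 | |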
| <HR>
 | |
| <font size=-1>
 | |
| <A HREF="pg95user.html">[ TOC ]</A> 
 | |
| <A HREF="extend.html">[ Previous ]</A> 
 | |
| <A HREF="xtypes.html">[ Next ]</A> 
 | |
| </font>
 | |
| </BODY>
 | |
| </HTML>
 |