mirror of https://github.com/postgres/postgres.git

Make an editorial pass over the newly SGML-ified contrib documentation.

Fix lots of bad markup, bad English, bad explanations.

This commit covers only about half the contrib modules, but I grow weary...
Tom Lane 2007-12-06 04:12:10 +00:00
parent a37a0a4180
commit 53e99f57fc
21 changed files with 3713 additions and 3093 deletions


@@ -1,36 +1,40 @@
<!-- $PostgreSQL: pgsql/doc/src/sgml/adminpack.sgml,v 1.3 2007/12/06 04:12:09 tgl Exp $ -->

<sect1 id="adminpack">
 <title>adminpack</title>

 <indexterm zone="adminpack">
  <primary>adminpack</primary>
 </indexterm>

 <para>
  <filename>adminpack</> provides a number of support functions which
  <application>pgAdmin</> and other administration and management tools can
  use to provide additional functionality, such as remote management
  of server log files.
 </para>

 <sect2>
  <title>Functions implemented</title>

  <para>
   The functions implemented by <filename>adminpack</> can only be run by a
   superuser. Here's a list of these functions:

<programlisting>
int8 pg_catalog.pg_file_write(fname text, data text, append bool)
bool pg_catalog.pg_file_rename(oldname text, newname text, archivename text)
bool pg_catalog.pg_file_rename(oldname text, newname text)
bool pg_catalog.pg_file_unlink(fname text)
setof record pg_catalog.pg_logdir_ls()

/* Renaming of existing backend functions for pgAdmin compatibility */
int8 pg_catalog.pg_file_read(fname text, data text, append bool)
bigint pg_catalog.pg_file_length(text)
int4 pg_catalog.pg_logfile_rotate()
</programlisting>
  </para>
 </sect2>
</sect1>
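
As a minimal usage sketch based on the signatures above (the file name is hypothetical), a superuser could do something like:

<programlisting>
-- append a line to a text file managed by the server;
-- pg_file_write returns the number of bytes written
SELECT pg_catalog.pg_file_write('adminpack_demo.txt', 'hello' || E'\n', true);

-- rename the file (two-argument form, keeping no archived copy)
SELECT pg_catalog.pg_file_rename('adminpack_demo.txt', 'adminpack_demo.old');
</programlisting>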


@@ -1,37 +1,56 @@
<!-- $PostgreSQL: pgsql/doc/src/sgml/btree-gist.sgml,v 1.4 2007/12/06 04:12:09 tgl Exp $ -->

<sect1 id="btree-gist">
 <title>btree_gist</title>

 <indexterm zone="btree-gist">
  <primary>btree_gist</primary>
 </indexterm>

 <para>
  <filename>btree_gist</> provides sample GiST operator classes that
  implement B-Tree equivalent behavior for the data types
  <type>int2</>, <type>int4</>, <type>int8</>, <type>float4</>,
  <type>float8</>, <type>numeric</>, <type>timestamp with time zone</>,
  <type>timestamp without time zone</>, <type>time with time zone</>,
  <type>time without time zone</>, <type>date</>, <type>interval</>,
  <type>oid</>, <type>money</>, <type>char</>,
  <type>varchar</>, <type>text</>, <type>bytea</>, <type>bit</>,
  <type>varbit</>, <type>macaddr</>, <type>inet</>, and <type>cidr</>.
 </para>

 <para>
  In general, these operator classes will not outperform the equivalent
  standard btree index methods, and they lack one major feature of the
  standard btree code: the ability to enforce uniqueness. However,
  they are useful for GiST testing and as a base for developing other
  GiST operator classes.
 </para>

 <sect2>
  <title>Example usage</title>

<programlisting>
CREATE TABLE test (a int4);
-- create index
CREATE INDEX testidx ON test USING gist (a);
-- query
SELECT * FROM test WHERE a &lt; 10;
</programlisting>
 </sect2>

 <sect2>
  <title>Authors</title>

  <para>
   Teodor Sigaev (<email>teodor@stack.net</email>),
   Oleg Bartunov (<email>oleg@sai.msu.su</email>), and
   Janko Richter (<email>jankorichter@yahoo.de</email>). See
   <ulink url="http://www.sai.msu.su/~megera/postgres/gist"></ulink>
   for additional information.
  </para>
 </sect2>
</sect1>
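
The same idea works for any of the listed types; a sketch with a timestamp column (table and column names are invented for illustration):

<programlisting>
-- btree_gist lets a GiST index serve the ordinary comparison operators
CREATE TABLE events (title text, happened timestamp);
CREATE INDEX events_happened_idx ON events USING gist (happened);
SELECT * FROM events WHERE happened BETWEEN '2007-01-01' AND '2007-12-31';
</programlisting>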


@@ -1,17 +1,45 @@
<!-- $PostgreSQL: pgsql/doc/src/sgml/chkpass.sgml,v 1.2 2007/12/06 04:12:09 tgl Exp $ -->

<sect1 id="chkpass">
 <title>chkpass</title>

 <!--
 <indexterm zone="chkpass">
  <primary>chkpass</primary>
 </indexterm>
 -->

 <para>
  This module implements a data type <type>chkpass</> that is
  designed for storing encrypted passwords.
  Each password is automatically converted to encrypted form upon entry,
  and is always stored encrypted. To compare, simply compare against a clear
  text password and the comparison function will encrypt it before comparing.
 </para>

 <para>
  There are provisions in the code to report an error if the password is
  determined to be easily crackable. However, this is currently just
  a stub that does nothing.
 </para>

 <para>
  If you precede an input string with a colon, it is assumed to be an
  already-encrypted password, and is stored without further encryption.
  This allows entry of previously-encrypted passwords.
 </para>

 <para>
  On output, a colon is prepended. This makes it possible to dump and reload
  passwords without re-encrypting them. If you want the encrypted password
  without the colon then use the <function>raw()</> function.
  This allows you to use the
  type with things like Apache's Auth_PostgreSQL module.
 </para>

 <para>
  The encryption uses the standard Unix function <function>crypt()</>,
  and so it suffers
  from all the usual limitations of that function; notably that only the
  first eight characters of a password are considered.
 </para>

 <para>

@@ -23,28 +51,10 @@
 </para>

 <para>
  Sample usage:
 </para>

<programlisting>
test=# create table test (p chkpass);
CREATE TABLE
test=# insert into test values ('hello');

@@ -72,13 +82,14 @@ test=# select p = 'goodbye' from test;
----------
 f
(1 row)
</programlisting>

 <sect2>
  <title>Author</title>
  <para>
   D'Arcy J.M. Cain (<email>darcy@druid.net</email>)
  </para>
 </sect2>
</sect1>
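
A sketch of the colon-prefix and <function>raw()</> behavior described above (the encrypted string shown is a made-up placeholder, not real <function>crypt()</> output):

<programlisting>
-- store an already-encrypted password verbatim by prefixing it with a colon
test=# insert into test values (':ab01FAX.bQRSU');
INSERT 0 1

-- retrieve the stored encrypted form without the leading colon
test=# select raw(p) from test;
</programlisting>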


@@ -1,4 +1,4 @@
<!-- $PostgreSQL: pgsql/doc/src/sgml/contrib-spi.sgml,v 1.2 2007/12/06 04:12:09 tgl Exp $ -->

<sect1 id="contrib-spi">
 <title>spi</title>

@@ -29,27 +29,28 @@
 <para>
  <function>check_primary_key()</> checks the referencing table.
  To use, create a <literal>BEFORE INSERT OR UPDATE</> trigger using this
  function on a table referencing another table. Specify as the trigger
  arguments: the referencing table's column name(s) which form the foreign
  key, the referenced table name, and the column names in the referenced table
  which form the primary/unique key. To handle multiple foreign
  keys, create a trigger for each reference.
 </para>

 <para>
  <function>check_foreign_key()</> checks the referenced table.
  To use, create a <literal>BEFORE DELETE OR UPDATE</> trigger using this
  function on a table referenced by other table(s). Specify as the trigger
  arguments: the number of referencing tables for which the function has to
  perform checking, the action if a referencing key is found
  (<literal>cascade</> &mdash; to delete the referencing row,
  <literal>restrict</> &mdash; to abort transaction if referencing keys
  exist, <literal>setnull</> &mdash; to set referencing key fields to null),
  the triggered table's column names which form the primary/unique key, then
  the referencing table name and column names (repeated for as many
  referencing tables as were specified by first argument). Note that the
  primary/unique key columns should be marked NOT NULL and should have a
  unique index.
 </para>
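
To make those argument lists concrete, here is a minimal sketch (all table and column names are invented for illustration):

<programlisting>
CREATE TABLE a (id int4 NOT NULL);
CREATE UNIQUE INDEX a_pkey ON a (id);
CREATE TABLE b (ref_a int4);

-- reject INSERTs/UPDATEs on b that do not match a row in a
CREATE TRIGGER b_check_pk BEFORE INSERT OR UPDATE ON b
    FOR EACH ROW EXECUTE PROCEDURE check_primary_key('ref_a', 'a', 'id');

-- cascade-delete matching rows in b when a row of a is deleted or updated
CREATE TRIGGER a_check_fk BEFORE DELETE OR UPDATE ON a
    FOR EACH ROW EXECUTE PROCEDURE check_foreign_key(1, 'cascade', 'id', 'b', 'ref_a');
</programlisting>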
 <para>

@@ -64,60 +65,65 @@
  Long ago, <productname>PostgreSQL</> had a built-in time travel feature
  that kept the insert and delete times for each tuple. This can be
  emulated using these functions. To use these functions,
  you must add to a table two columns of <type>abstime</> type to store
  the date when a tuple was inserted (start_date) and changed/deleted
  (stop_date):
<programlisting>
CREATE TABLE mytab (
        ...             ...
        start_date      abstime,
        stop_date       abstime
        ...             ...
);
</programlisting>
  The columns can be named whatever you like, but in this discussion
  we'll call them start_date and stop_date.
 </para>

 <para>
  When a new row is inserted, start_date should normally be set to
  current time, and stop_date to <literal>infinity</>. The trigger
  will automatically substitute these values if the inserted data
  contains nulls in these columns. Generally, inserting explicit
  non-null data in these columns should only be done when re-loading
  dumped data.
 </para>

 <para>
  Tuples with stop_date equal to <literal>infinity</> are <quote>valid
  now</quote>, and can be modified. Tuples with a finite stop_date cannot
  be modified anymore &mdash; the trigger will prevent it. (If you need
  to do that, you can turn off time travel as shown below.)
 </para>

 <para>
  For a modifiable row, on update only the stop_date in the tuple being
  updated will be changed (to current time) and a new tuple with the modified
  data will be inserted. Start_date in this new tuple will be set to current
  time and stop_date to <literal>infinity</>.
 </para>

 <para>
  A delete does not actually remove the tuple but only sets its stop_date
  to current time.
 </para>

 <para>
  To query for tuples <quote>valid now</quote>, include
  <literal>stop_date = 'infinity'</> in the query's WHERE condition.
  (You might wish to incorporate that in a view.) Similarly, you can
  query for tuples valid at any past time with suitable conditions on
  start_date and stop_date.
 </para>

 <para>
  <function>timetravel()</> is the general trigger function that supports
  this behavior. Create a <literal>BEFORE INSERT OR UPDATE OR DELETE</>
  trigger using this function on each time-traveled table. Specify two
  trigger arguments: the actual
  names of the start_date and stop_date columns.
  Optionally, you can specify one to three more arguments, which must refer
  to columns of type <type>text</>. The trigger will store the name of
  the current user into the first of these columns during INSERT, the

@@ -130,7 +136,9 @@ CREATE TABLE mytab (
  <literal>set_timetravel('mytab', 1)</> will turn TT ON for table mytab.
  <literal>set_timetravel('mytab', 0)</> will turn TT OFF for table mytab.
  In both cases the old status is reported. While TT is off, you can modify
  the start_date and stop_date columns freely. Note that the on/off status
  is local to the current database session &mdash; fresh sessions will
  always start out with TT ON for all tables.
 </para>
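
Tying the pieces above together, a minimal sketch (reusing the hypothetical mytab table):

<programlisting>
CREATE TABLE mytab (
        data            text,
        start_date      abstime,
        stop_date       abstime
);

CREATE TRIGGER mytab_timetravel
    BEFORE INSERT OR UPDATE OR DELETE ON mytab
    FOR EACH ROW EXECUTE PROCEDURE timetravel('start_date', 'stop_date');

-- rows that are "valid now"
SELECT * FROM mytab WHERE stop_date = 'infinity';
</programlisting>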
 <para>

@@ -156,9 +164,9 @@
 </para>

 <para>
  To use, create a <literal>BEFORE INSERT</> (or optionally <literal>BEFORE
  INSERT OR UPDATE</>) trigger using this function. Specify two
  trigger arguments: the name of the integer column to be modified,
  and the name of the sequence object that will supply values.
  (Actually, you can specify any number of pairs of such names, if
  you'd like to update more than one autoincrementing column.)
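
A sketch of wiring up the <function>autoinc()</> trigger this paragraph describes (table, column, and sequence names are invented):

<programlisting>
CREATE SEQUENCE mytab_id_seq;
CREATE TABLE mytab (id int4, data text);

CREATE TRIGGER mytab_autoinc
    BEFORE INSERT ON mytab
    FOR EACH ROW EXECUTE PROCEDURE autoinc('id', 'mytab_id_seq');
</programlisting>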
@@ -180,8 +188,8 @@ CREATE TABLE mytab (
 </para>

 <para>
  To use, create a <literal>BEFORE INSERT</> and/or <literal>UPDATE</>
  trigger using this function. Specify a single trigger
  argument: the name of the text column to be modified.
 </para>
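
A sketch for the <function>insert_username()</> trigger this describes, assuming a hypothetical text column named username:

<programlisting>
CREATE TRIGGER mytab_insert_username
    BEFORE INSERT OR UPDATE ON mytab
    FOR EACH ROW EXECUTE PROCEDURE insert_username('username');
</programlisting>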
@@ -201,8 +209,8 @@ CREATE TABLE mytab (
 </para>

 <para>
  To use, create a <literal>BEFORE UPDATE</>
  trigger using this function. Specify a single trigger
  argument: the name of the <type>timestamp</> column to be modified.
 </para>
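
And a sketch for the <function>moddatetime()</> trigger this describes, assuming a hypothetical timestamp column named last_modified:

<programlisting>
CREATE TRIGGER mytab_moddatetime
    BEFORE UPDATE ON mytab
    FOR EACH ROW EXECUTE PROCEDURE moddatetime('last_modified');
</programlisting>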


@@ -1,4 +1,4 @@
<!-- $PostgreSQL: pgsql/doc/src/sgml/contrib.sgml,v 1.8 2007/12/06 04:12:09 tgl Exp $ -->
<appendix id="contrib">
 <title>Additional Supplied Modules</title>

@@ -44,7 +44,7 @@
 <para>
  Many modules supply new user-defined functions, operators, or types.
  To make use of one of these modules, after you have installed the code
  you need to register the new objects in the database
  system by running the SQL commands in the <literal>.sql</> file
  supplied by the module. For example,

@@ -54,6 +54,7 @@ psql -d dbname -f <replaceable>SHAREDIR</>/contrib/<replaceable>module</>.sql
  Here, <replaceable>SHAREDIR</> means the installation's <quote>share</>
  directory (<literal>pg_config --sharedir</> will tell you what this is).
  In most cases the script must be run by a database superuser.
 </para>

 <para>


@@ -1,21 +1,24 @@
<!-- $PostgreSQL: pgsql/doc/src/sgml/cube.sgml,v 1.5 2007/12/06 04:12:09 tgl Exp $ -->

<sect1 id="cube">
 <title>cube</title>

 <indexterm zone="cube">
  <primary>cube</primary>
 </indexterm>

 <para>
  This module implements a data type <type>cube</> for
  representing multi-dimensional cubes.
 </para>

 <sect2>
  <title>Syntax</title>

  <para>
   The following are valid external representations for the <type>cube</>
   type. <replaceable>x</>, <replaceable>y</>, etc denote floating-point
   numbers:
  </para>

  <table>
@@ -23,289 +26,114 @@
   <tgroup cols="2">
    <tbody>
     <row>
      <entry><literal><replaceable>x</></literal></entry>
      <entry>A one-dimensional point
       (or, zero-length one-dimensional interval)
      </entry>
     </row>
     <row>
      <entry><literal>(<replaceable>x</>)</literal></entry>
      <entry>Same as above</entry>
     </row>
     <row>
      <entry><literal><replaceable>x1</>,<replaceable>x2</>,...,<replaceable>xn</></literal></entry>
      <entry>A point in n-dimensional space, represented internally as a
       zero-volume cube
      </entry>
     </row>
     <row>
      <entry><literal>(<replaceable>x1</>,<replaceable>x2</>,...,<replaceable>xn</>)</literal></entry>
      <entry>Same as above</entry>
     </row>
     <row>
      <entry><literal>(<replaceable>x</>),(<replaceable>y</>)</literal></entry>
      <entry>A one-dimensional interval starting at <replaceable>x</> and ending at <replaceable>y</> or vice versa; the
       order does not matter
      </entry>
     </row>
     <row>
      <entry><literal>[(<replaceable>x</>),(<replaceable>y</>)]</literal></entry>
      <entry>Same as above</entry>
     </row>
     <row>
      <entry><literal>(<replaceable>x1</>,...,<replaceable>xn</>),(<replaceable>y1</>,...,<replaceable>yn</>)</literal></entry>
      <entry>An n-dimensional cube represented by a pair of its diagonally
       opposite corners
      </entry>
     </row>
     <row>
      <entry><literal>[(<replaceable>x1</>,...,<replaceable>xn</>),(<replaceable>y1</>,...,<replaceable>yn</>)]</literal></entry>
      <entry>Same as above</entry>
     </row>
    </tbody>
   </tgroup>
  </table>

  <para>
   It does not matter which order the opposite corners of a cube are
   entered in. The <type>cube</> functions
   automatically swap values if needed to create a uniform
   <quote>lower left &mdash; upper right</> internal representation.
  </para>

  <para>
   White space is ignored, so <literal>[(<replaceable>x</>),(<replaceable>y</>)]</literal> is the same as
   <literal>[ ( <replaceable>x</> ), ( <replaceable>y</> ) ]</literal>.
  </para>
 </sect2>
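
A quick sketch of entering <type>cube</> values with this syntax:

<programlisting>
-- a three-dimensional point and a two-dimensional cube
SELECT '1,2,3'::cube;
SELECT '(0,0),(1,1)'::cube;
</programlisting>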
 <sect2>
  <title>Precision</title>

  <para>
   Values are stored internally as 64-bit floating point numbers. This means
   that numbers with more than about 16 significant digits will be truncated.
  </para>
 </sect2>

 <sect2>
  <title>Usage</title>

  <para>
   The <filename>cube</> module includes a GiST index operator class for
   <type>cube</> values.
   The operators supported by the GiST opclass include:
  </para>

  <itemizedlist>
   <listitem>
<programlisting>
a = b          Same as
</programlisting>
    <para>
     The cubes a and b are identical.
    </para>
   </listitem>
   <listitem>
<programlisting>
a &amp;&amp; b      Overlaps
</programlisting>
    <para>
     The cubes a and b overlap.
    </para>
   </listitem>
   <listitem>
<programlisting>
a @&gt; b         Contains
</programlisting>
    <para>
     The cube a contains the cube b.
    </para>
   </listitem>
   <listitem>
<programlisting>
a &lt;@ b         Contained in
</programlisting>
    <para>
     The cube a is contained in the cube b.
    </para>
   </listitem>
  </itemizedlist>
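
To index and search <type>cube</> columns with these operators, something like the following sketch works (table and index names are invented):

<programlisting>
CREATE TABLE boxes (name text, b cube);
CREATE INDEX boxes_b_idx ON boxes USING gist (b);

-- find stored cubes contained in the unit square
SELECT name FROM boxes WHERE b &lt;@ '(0,0),(1,1)'::cube;
</programlisting>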
 <para>
  (Before PostgreSQL 8.2, the containment operators @&gt; and &lt;@ were

@@ -316,26 +144,18 @@ a &lt;@ b Contained in
 </para>

 <para>
  The standard B-tree operators are also provided, for example

<programlisting>
[a, b] &lt; [c, d]     Less than
[a, b] &gt; [c, d]     Greater than
</programlisting>

  These operators do not make a lot of sense for any practical
  purpose but sorting. These operators first compare (a) to (c),
  and if these are equal, compare (b) to (d). That results in
  reasonably good sorting in most cases, which is useful if
  you want to use ORDER BY with this type.
 </para>

 <para>

@@ -343,49 +163,35 @@ a &lt;@ b Contained in
 </para>

 <table>
  <title>Cube functions</title>
  <tgroup cols="2">
   <tbody>
    <row>
     <entry><literal>cube(float8) returns cube</literal></entry>
     <entry>Makes a one dimensional cube with both coordinates the same.
      <literal>cube(1) == '(1)'</literal>
     </entry>
    </row>
    <row>
     <entry><literal>cube(float8, float8) returns cube</literal></entry>
     <entry>Makes a one dimensional cube.
      <literal>cube(1,2) == '(1),(2)'</literal>
     </entry>
    </row>
    <row>
     <entry><literal>cube(float8[]) returns cube</literal></entry>
     <entry>Makes a zero-volume cube using the coordinates
      defined by the array.
      <literal>cube(ARRAY[1,2]) == '(1,2)'</literal>
     </entry>
    </row>
    <row>
     <entry><literal>cube(float8[], float8[]) returns cube</literal></entry>
     <entry>Makes a cube with upper right and lower left
      coordinates as defined by the two arrays, which must be of the
      same length.
      <literal>cube('{1,2}'::float[], '{3,4}'::float[]) == '(1,2),(3,4)'
      </literal>

@@ -394,8 +200,8 @@ a &lt;@ b Contained in
    <row>
     <entry><literal>cube(cube, float8) returns cube</literal></entry>
     <entry>Makes a new cube by adding a dimension on to an
      existing cube with the same values for both parts of the new coordinate.
      This is useful for building cubes piece by piece from calculated values.
      <literal>cube('(1)',2) == '(1,2),(1,2)'</literal>
     </entry>

@@ -403,133 +209,194 @@ a &lt;@ b Contained in
    <row>
     <entry><literal>cube(cube, float8, float8) returns cube</literal></entry>
     <entry>Makes a new cube by adding a dimension on to an
      existing cube. This is useful for building cubes piece by piece from
      calculated values. <literal>cube('(1,2)',3,4) == '(1,3),(2,4)'</literal>
     </entry>
    </row>
    <row>
     <entry><literal>cube_dim(cube) returns int</literal></entry>
     <entry>Returns the number of dimensions of the cube
     </entry>
    </row>
    <row>
     <entry><literal>cube_ll_coord(cube, int) returns double</literal></entry>
     <entry>Returns the n'th coordinate value for the lower left
      corner of a cube
     </entry>
    </row>
    <row>
     <entry><literal>cube_ur_coord(cube, int) returns double
     </literal></entry>
     <entry>Returns the n'th coordinate value for the
      upper right corner of a cube
     </entry>
    </row>
    <row>
     <entry><literal>cube_is_point(cube) returns bool</literal></entry>
     <entry>Returns true if a cube is a point, that is,
      the two defining corners are the same.</entry>
    </row>
    <row>
     <entry><literal>cube_distance(cube, cube) returns double</literal></entry>
     <entry>Returns the distance between two cubes. If both
      cubes are points, this is the normal distance function.
     </entry>
    </row>
    <row>
     <entry><literal>cube_subset(cube, int[]) returns cube
     </literal></entry>
     <entry>Makes a new cube from an existing cube, using a list of
      dimension indexes from an array. Can be used to find both the LL and UR
      coordinates of a single dimension, e.g.
      <literal>cube_subset(cube('(1,3,5),(6,7,8)'), ARRAY[2]) = '(3),(7)'</>.
      Or can be used to drop dimensions, or reorder them as desired, e.g.
      <literal>cube_subset(cube('(1,3,5),(6,7,8)'), ARRAY[3,2,1,1]) = '(5, 3,
      1, 1),(8, 7, 6, 6)'</>.
     </entry>
    </row>
    <row>
     <entry><literal>cube_union(cube, cube) returns cube</literal></entry>
     <entry>Produces the union of two cubes
     </entry>
    </row>
    <row>
     <entry><literal>cube_inter(cube, cube) returns cube</literal></entry>
     <entry>Produces the intersection of two cubes
     </entry>
    </row>
    <row>
     <entry><literal>cube_enlarge(cube c, double r, int n) returns cube</literal></entry>
     <entry>Increases the size of a cube by a specified radius in at least
      n dimensions. If the radius is negative the cube is shrunk instead. This
      is useful for creating bounding boxes around a point for searching for
      nearby points. All defined dimensions are changed by the radius r.
      LL coordinates are decreased by r and UR coordinates are increased by r.
      If a LL coordinate is increased to larger than the corresponding UR
      coordinate (this can only happen when r &lt; 0) then both coordinates
      are set to their average. If n is greater than the number of defined
      dimensions and the cube is being increased (r &gt;= 0) then 0 is used
      as the base for the extra coordinates.
     </entry>
    </row>
   </tbody>
  </tgroup>
 </table>
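
A small illustration of the last two functions above (the cube_subset result is the one given in the table; the cube_enlarge call simply shows the argument order):

<programlisting>
-- keep only the second dimension of a 3-D cube: gives (3),(7)
SELECT cube_subset('(1,3,5),(6,7,8)'::cube, ARRAY[2]);

-- grow a 2-D cube by 0.5 in every dimension, padding out to 3 dimensions
SELECT cube_enlarge('(1,2),(3,4)'::cube, 0.5, 3);
</programlisting>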
 </sect2>

 <sect2>
  <title>Defaults</title>

  <para>
   I believe this union:
  </para>
<programlisting>
select cube_union('(0,5,2),(2,3,1)', '0');
 cube_union
-------------------
 (0, 0, 0),(2, 5, 2)
(1 row)
</programlisting>

  <para>
   does not contradict common sense, neither does the intersection
  </para>
<programlisting>
select cube_inter('(0,-1),(1,1)', '(-2),(2)');
 cube_inter
-------------
 (0, 0),(1, 0)
(1 row)
</programlisting>

  <para>
   In all binary operations on differently-dimensioned cubes, I assume the
   lower-dimensional one to be a cartesian projection, i. e., having zeroes
   in place of coordinates omitted in the string representation. The above
   examples are equivalent to:
  </para>
<programlisting>
cube_union('(0,5,2),(2,3,1)','(0,0,0),(0,0,0)');
cube_inter('(0,-1),(1,1)','(-2,0),(2,0)');
</programlisting>

  <para>
   The following containment predicate uses the point syntax,
   while in fact the second argument is internally represented by a box.
   This syntax makes it unnecessary to define a separate point type
   and functions for (box,point) predicates.
  </para>
<programlisting>
select cube_contains('(0,0),(1,1)', '0.5,0.5');
 cube_contains
--------------
 t
(1 row)
</programlisting>
 </sect2>

 <sect2>
  <title>Notes</title>

  <para>
   For examples of usage, see the regression test <filename>sql/cube.sql</>.
  </para>

  <para>
   To make it harder for people to break things, there
   is a limit of 100 on the number of dimensions of cubes. This is set
   in <filename>cubedata.h</> if you need something bigger.
  </para>
 </sect2>

 <sect2>
  <title>Credits</title>

  <para>
   Original author: Gene Selkov, Jr. <email>selkovjr@mcs.anl.gov</email>,
   Mathematics and Computer Science Division, Argonne National Laboratory.
  </para>

  <para>
   My thanks are primarily to Prof. Joe Hellerstein
   (<ulink url="http://db.cs.berkeley.edu/~jmh/"></ulink>) for elucidating the
   gist of the GiST (<ulink url="http://gist.cs.berkeley.edu/"></ulink>), and
   to his former student, Andy Dong (<ulink
   url="http://best.me.berkeley.edu/~adong/"></ulink>), for his example
   written for Illustra,
   <ulink url="http://garcia.me.berkeley.edu/~adong/rtree"></ulink>.
   I am also grateful to all Postgres developers, present and past, for
   enabling myself to create my own world and live undisturbed in it. And I
   would like to acknowledge my gratitude to Argonne Lab and to the
   U.S. Department of Energy for the years of faithful support of my database
   research.
  </para>

  <para>
   Minor updates to this package were made by Bruno Wolff III
   <email>bruno@wolff.to</email> in August/September of 2002. These include
   changing the precision from single precision to double precision and adding
   some new functions.
  </para>

  <para>
   Additional updates were made by Joshua Reich <email>josh@root.net</email> in
   July 2006. These include <literal>cube(float8[], float8[])</literal> and
   cleaning up the code to use the V1 call protocol instead of the deprecated
   V0 protocol.
  </para>
 </sect2>
</sect1>

(File diff suppressed because it is too large.)


@@ -1,36 +1,43 @@
<!-- $PostgreSQL: pgsql/doc/src/sgml/dict-int.sgml,v 1.2 2007/12/06 04:12:10 tgl Exp $ -->

<sect1 id="dict-int">
 <title>dict_int</title>

 <indexterm zone="dict-int">
  <primary>dict_int</primary>
 </indexterm>

 <para>
  <filename>dict_int</> is an example of an add-on dictionary template
  for full-text search. The motivation for this example dictionary is to
  control the indexing of integers (signed and unsigned), allowing such
  numbers to be indexed while preventing excessive growth in the number of
  unique words, which greatly affects the performance of searching.
 </para>

 <sect2>
  <title>Configuration</title>

  <para>
   The dictionary accepts two options:
  </para>
  <itemizedlist>
   <listitem>
    <para>
     The <literal>maxlen</> parameter specifies the maximum number of
     digits allowed in an integer word. The default value is 6.
    </para>
   </listitem>
   <listitem>
    <para>
     The <literal>rejectlong</> parameter specifies whether an overlength
     integer should be truncated or ignored. If <literal>rejectlong</> is
     <literal>false</> (the default), the dictionary returns the first
     <literal>maxlen</> digits of the integer. If <literal>rejectlong</> is
     <literal>true</>, the dictionary treats an overlength integer as a stop
     word, so that it will not be indexed. Note that this also means that
     such an integer cannot be searched for.
    </para>
   </listitem>
  </itemizedlist>
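
As a sketch of adjusting these options, assuming the module's install script created a dictionary named <literal>intdict</> (the name is an assumption here):

<programlisting>
mydb# ALTER TEXT SEARCH DICTIONARY intdict (MAXLEN = 4, REJECTLONG = true);
mydb# SELECT ts_lexize('intdict', '12345');   -- overlength, rejected as a stop word
mydb# SELECT ts_lexize('intdict', '123');     -- short enough, returned unchanged
</programlisting>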


@@ -1,33 +1,41 @@
<!-- $PostgreSQL: pgsql/doc/src/sgml/dict-xsyn.sgml,v 1.2 2007/12/06 04:12:10 tgl Exp $ -->

<sect1 id="dict-xsyn">
 <title>dict_xsyn</title>

 <indexterm zone="dict-xsyn">
  <primary>dict_xsyn</primary>
 </indexterm>

 <para>
  <filename>dict_xsyn</> (Extended Synonym Dictionary) is an example of an
  add-on dictionary template for full-text search. This dictionary type
  replaces words with groups of their synonyms, and so makes it possible to
  search for a word using any of its synonyms.
 </para>

 <sect2>
  <title>Configuration</title>

  <para>
   A <literal>dict_xsyn</> dictionary accepts the following options:
  </para>
  <itemizedlist>
   <listitem>
    <para>
     <literal>keeporig</> controls whether the original word is included (if
     <literal>true</>), or only its synonyms (if <literal>false</>). Default
     is <literal>true</>.
    </para>
   </listitem>
   <listitem>
    <para>
     <literal>rules</> is the base name of the file containing the list of
     synonyms. This file must be stored in
     <filename>$SHAREDIR/tsearch_data/</> (where <literal>$SHAREDIR</> means
     the <productname>PostgreSQL</> installation's shared-data directory).
     Its name must end in <literal>.rules</> (which is not to be included in
     the <literal>rules</> parameter).
    </para>
   </listitem>
  </itemizedlist>

@@ -38,41 +46,63 @@
   <listitem>
    <para>
     Each line represents a group of synonyms for a single word, which is
     given first on the line. Synonyms are separated by whitespace, thus:
<programlisting>
word syn1 syn2 syn3
</programlisting>
    </para>
   </listitem>
   <listitem>
    <para>
     The sharp (<literal>#</>) sign is a comment delimiter. It may appear at
     any position in a line. The rest of the line will be skipped.
    </para>
   </listitem>
  </itemizedlist>

  <para>
   Look at <filename>xsyn_sample.rules</>, which is installed in
   <filename>$SHAREDIR/tsearch_data/</>, for an example.
  </para>
 </sect2>

 <sect2>
  <title>Usage</title>

  <para>
   Running the installation script creates a text search template
   <literal>xsyn_template</> and a dictionary <literal>xsyn</>
   based on it, with default parameters. You can alter the
   parameters, for example
<programlisting>
mydb# ALTER TEXT SEARCH DICTIONARY xsyn (RULES='my_rules', KEEPORIG=false);
ALTER TEXT SEARCH DICTIONARY
</programlisting>
   or create new dictionaries based on the template.
  </para>

  <para>
   To test the dictionary, you can try
<programlisting>
mydb=# SELECT ts_lexize('xsyn', 'word');
      ts_lexize
-----------------------
 {word,syn1,syn2,syn3}
</programlisting>
   but real-world usage will involve including it in a text search
   configuration as described in <xref linkend="textsearch">.
   That might look like this:
<programlisting>
ALTER TEXT SEARCH CONFIGURATION english
    ALTER MAPPING FOR word, asciiword WITH xsyn, english_stem;
</programlisting>
  </para>
 </sect2>
</sect1>


@@ -1,133 +1,191 @@
<!-- $PostgreSQL: pgsql/doc/src/sgml/earthdistance.sgml,v 1.3 2007/12/06 04:12:10 tgl Exp $ -->

<sect1 id="earthdistance">
 <title>earthdistance</title>

 <indexterm zone="earthdistance">
  <primary>earthdistance</primary>
 </indexterm>

 <para>
  The <filename>earthdistance</> module provides two different approaches to
  calculating great circle distances on the surface of the Earth. The one
  described first depends on the <filename>cube</> package (which
  <emphasis>must</> be installed before <filename>earthdistance</> can be
  installed). The second one is based on the built-in <type>point</> datatype,
  using longitude and latitude for the coordinates.
 </para>

 <para>
  In this module, the Earth is assumed to be perfectly spherical.
  (If that's too inaccurate for you, you might want to look at the
  <application><ulink url="http://www.postgis.org/">PostGIS</ulink></>
  project.)
 </para>
<sect2>
<title>Cube-based earth distances</title>
<para>
Data is stored in cubes that are points (both corners are the same) using 3
coordinates representing the x, y, and z distance from the center of the
Earth. A domain <type>earth</> over <type>cube</> is provided, which
includes constraint checks that the value meets these restrictions and
is reasonably close to the actual surface of the Earth.
</para>
<para>
The radius of the Earth is obtained from the <function>earth()</>
function. It is given in meters. But by changing this one function you can
change the module to use some other units, or to use a different value of
    the radius that you feel is more appropriate.
</para>
<para>
This package has applications to astronomical databases as well.
Astronomers will probably want to change <function>earth()</> to return a
radius of <literal>180/pi()</> so that distances are in degrees.
</para>
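  <para>
   As an illustration only &mdash; the exact definition installed by
   <filename>earthdistance.sql</> may differ &mdash; such a change could look
   roughly like this, assuming <function>earth()</> is a plain SQL function
   returning <type>float8</>:
  </para>
<programlisting>
-- sketch: make earth() return a radius expressed in degrees
CREATE OR REPLACE FUNCTION earth() RETURNS float8
LANGUAGE SQL IMMUTABLE
AS 'SELECT 180 / pi()';
</programlisting>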
<para>
Functions are provided to support input in latitude and longitude (in
degrees), to support output of latitude and longitude, to calculate
the great circle distance between two points and to easily specify a
bounding box usable for index searches.
</para>
<para>
The following functions are provided:
</para>
<table id="earthdistance-cube-functions">
<title>Cube-based earthdistance functions</title>
<tgroup cols="3">
<thead>
<row>
<entry>Function</entry>
<entry>Returns</entry>
<entry>Description</entry>
</row>
</thead>
<tbody>
<row>
<entry><function>earth()</function></entry>
<entry><type>float8</type></entry>
<entry>Returns the assumed radius of the Earth.</entry>
</row>
<row>
<entry><function>sec_to_gc(float8)</function></entry>
<entry><type>float8</type></entry>
<entry>Converts the normal straight line
       (secant) distance between two points on the surface of the Earth
to the great circle distance between them.
</entry>
</row>
<row>
<entry><function>gc_to_sec(float8)</function></entry>
<entry><type>float8</type></entry>
<entry>Converts the great circle distance between two points on the
surface of the Earth to the normal straight line (secant) distance
between them.
</entry>
</row>
<row>
<entry><function>ll_to_earth(float8, float8)</function></entry>
<entry><type>earth</type></entry>
<entry>Returns the location of a point on the surface of the Earth given
its latitude (argument 1) and longitude (argument 2) in degrees.
</entry>
</row>
<row>
<entry><function>latitude(earth)</function></entry>
<entry><type>float8</type></entry>
<entry>Returns the latitude in degrees of a point on the surface of the
Earth.
</entry>
</row>
<row>
<entry><function>longitude(earth)</function></entry>
<entry><type>float8</type></entry>
<entry>Returns the longitude in degrees of a point on the surface of the
Earth.
</entry>
</row>
<row>
<entry><function>earth_distance(earth, earth)</function></entry>
<entry><type>float8</type></entry>
<entry>Returns the great circle distance between two points on the
surface of the Earth.
</entry>
</row>
<row>
<entry><function>earth_box(earth, float8)</function></entry>
<entry><type>cube</type></entry>
<entry>Returns a box suitable for an indexed search using the cube
<literal>@&gt;</>
operator for points within a given great circle distance of a location.
Some points in this box are further than the specified great circle
distance from the location, so a second check using
<function>earth_distance</> should be included in the query.
</entry>
</row>
</tbody>
</tgroup>
</table>
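  <para>
   To illustrate how these functions are typically combined, here is a sketch
   of a radius search; the table <literal>places</> and its
   <literal>lat</>/<literal>lon</> columns are hypothetical, not part of the
   module:
  </para>
<programlisting>
-- great circle distance in meters between two latitude/longitude points
SELECT earth_distance(ll_to_earth(40.0, -105.0), ll_to_earth(40.1, -105.2));

-- find places within 5000 meters: the earth_box test selects candidate rows
-- (and can make use of a GiST index on ll_to_earth(lat, lon)), while the
-- earth_distance test rechecks them exactly
SELECT name FROM places
WHERE earth_box(ll_to_earth(40.0, -105.0), 5000) @&gt; ll_to_earth(lat, lon)
  AND earth_distance(ll_to_earth(40.0, -105.0), ll_to_earth(lat, lon)) &lt; 5000;
</programlisting>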
</sect2>
<sect2>
<title>Point-based earth distances</title>
<para>
The second part of the module relies on representing Earth locations as
values of type <type>point</>, in which the first component is taken to
represent longitude in degrees, and the second component is taken to
represent latitude in degrees. Points are taken as (longitude, latitude)
and not vice versa because longitude is closer to the intuitive idea of
x-axis and latitude to y-axis.
</para>
<para>
A single operator is provided:
</para>
<table id="earthdistance-point-operators">
<title>Point-based earthdistance operators</title>
<tgroup cols="3">
<thead>
<row>
<entry>Operator</entry>
<entry>Returns</entry>
<entry>Description</entry>
</row>
</thead>
<tbody>
<row>
<entry><type>point</> <literal>&lt;@&gt;</literal> <type>point</></entry>
<entry><type>float8</type></entry>
<entry>Gives the distance in statute miles between
two points on the Earth's surface.
</entry>
</row>
</tbody>
</tgroup>
</table>
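  <para>
   For example (the coordinates here are arbitrary):
  </para>
<programlisting>
-- distance in statute miles between two (longitude, latitude) points
SELECT point(-103.0, 41.0) &lt;@&gt; point(-90.0, 43.0);
</programlisting>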
<para>
Note that unlike the <type>cube</>-based part of the module, units
are hardwired here: changing the <function>earth()</> function will
not affect the results of this operator.
</para>
<para>
One disadvantage of the longitude/latitude representation is that
you need to be careful about the edge conditions near the poles
and near +/- 180 degrees of longitude. The <type>cube</>-based
representation avoids these discontinuities.
</para>
</sect2>
</sect1> </sect1>


@ -1,30 +1,51 @@
<!-- $PostgreSQL: pgsql/doc/src/sgml/fuzzystrmatch.sgml,v 1.3 2007/12/06 04:12:10 tgl Exp $ -->
<sect1 id="fuzzystrmatch"> <sect1 id="fuzzystrmatch">
<title>fuzzystrmatch</title> <title>fuzzystrmatch</title>
<indexterm zone="fuzzystrmatch">
<primary>fuzzystrmatch</primary>
</indexterm>
<para> <para>
This section describes the fuzzystrmatch module which provides different The <filename>fuzzystrmatch</> module provides several
functions to determine similarities and distance between strings. functions to determine similarities and distance between strings.
</para> </para>
<sect2> <sect2>
<title>Soundex</title> <title>Soundex</title>
<para> <para>
The Soundex system is a method of matching similar sounding names The Soundex system is a method of matching similar-sounding names
(or any words) to the same code. It was initially used by the by converting them to the same code. It was initially used by the
United States Census in 1880, 1900, and 1910, but it has little use United States Census in 1880, 1900, and 1910. Note that Soundex
beyond English names (or the English pronunciation of names), and is not very useful for non-English names.
it is not a linguistic tool.
</para> </para>
<para> <para>
When comparing two soundex values to determine similarity, the The <filename>fuzzystrmatch</> module provides two functions
difference function reports how close the match is on a scale for working with Soundex codes:
from zero to four, with zero being no match and four being an
exact match.
</para> </para>
<programlisting>
soundex(text) returns text
difference(text, text) returns int
</programlisting>
<para> <para>
The following are some usage examples: The <function>soundex</> function converts a string to its Soundex code.
The <function>difference</> function converts two strings to their Soundex
codes and then reports the number of matching code positions. Since
Soundex codes have four characters, the result ranges from zero to four,
with zero being no match and four being an exact match. (Thus, the
function is misnamed &mdash; <function>similarity</> would have been
a better name.)
</para> </para>
<para>
Here are some usage examples:
</para>
<programlisting> <programlisting>
SELECT soundex('hello world!'); SELECT soundex('hello world!');
@ -41,81 +62,106 @@ INSERT INTO s VALUES ('jack');
SELECT * FROM s WHERE soundex(nm) = soundex('john'); SELECT * FROM s WHERE soundex(nm) = soundex('john');
SELECT a.nm, b.nm FROM s a, s b WHERE soundex(a.nm) = soundex(b.nm) AND a.oid &lt;&gt; b.oid;
CREATE FUNCTION text_sx_eq(text, text) RETURNS boolean AS
'select soundex($1) = soundex($2)'
LANGUAGE SQL;
CREATE FUNCTION text_sx_lt(text, text) RETURNS boolean AS
'select soundex($1) &lt; soundex($2)'
LANGUAGE SQL;
CREATE FUNCTION text_sx_gt(text, text) RETURNS boolean AS
'select soundex($1) &gt; soundex($2)'
LANGUAGE SQL;
CREATE FUNCTION text_sx_le(text, text) RETURNS boolean AS
'select soundex($1) &lt;= soundex($2)'
LANGUAGE SQL;
CREATE FUNCTION text_sx_ge(text, text) RETURNS boolean AS
'select soundex($1) &gt;= soundex($2)'
LANGUAGE SQL;
CREATE FUNCTION text_sx_ne(text, text) RETURNS boolean AS
'select soundex($1) &lt;&gt; soundex($2)'
LANGUAGE SQL;
DROP OPERATOR #= (text, text);
CREATE OPERATOR #= (leftarg=text, rightarg=text, procedure=text_sx_eq, commutator = #=);
SELECT * FROM s WHERE text_sx_eq(nm, 'john');
SELECT * FROM s WHERE s.nm #= 'john';
SELECT * FROM s WHERE difference(s.nm, 'john') &gt; 2; SELECT * FROM s WHERE difference(s.nm, 'john') &gt; 2;
</programlisting> </programlisting>
</sect2> </sect2>
<sect2> <sect2>
<title>levenshtein</title> <title>Levenshtein</title>
<para> <para>
This function calculates the levenshtein distance between two strings: This function calculates the Levenshtein distance between two strings:
</para> </para>
<programlisting> <programlisting>
int levenshtein(text source, text target) levenshtein(text source, text target) returns int
</programlisting> </programlisting>
<para> <para>
Both <literal>source</literal> and <literal>target</literal> can be any Both <literal>source</literal> and <literal>target</literal> can be any
NOT NULL string with a maximum of 255 characters. non-null string, with a maximum of 255 characters.
</para> </para>
<para> <para>
Example: Example:
</para> </para>
<programlisting> <programlisting>
SELECT levenshtein('GUMBO','GAMBOL'); test=# SELECT levenshtein('GUMBO', 'GAMBOL');
levenshtein
-------------
2
(1 row)
</programlisting> </programlisting>
</sect2> </sect2>
<sect2> <sect2>
<title>metaphone</title> <title>Metaphone</title>
<para> <para>
This function calculates and returns the metaphone code of an input string: Metaphone, like Soundex, is based on the idea of constructing a
representative code for an input string. Two strings are then
deemed similar if they have the same codes.
</para> </para>
<programlisting>
text metahpone(text source, int max_output_length)
</programlisting>
<para> <para>
<literal>source</literal> has to be a NOT NULL string with a maximum of This function calculates the metaphone code of an input string:
255 characters. <literal>max_output_length</literal> fixes the maximum </para>
length of the output metaphone code; if longer, the output is truncated
<programlisting>
metaphone(text source, int max_output_length) returns text
</programlisting>
<para>
<literal>source</literal> has to be a non-null string with a maximum of
255 characters. <literal>max_output_length</literal> sets the maximum
length of the output metaphone code; if longer, the output is truncated
to this length. to this length.
</para> </para>
<para>Example</para>
<para>
Example:
</para>
<programlisting> <programlisting>
SELECT metaphone('GUMBO',4); test=# SELECT metaphone('GUMBO', 4);
metaphone
-----------
KM
(1 row)
</programlisting>
</sect2>
<sect2>
<title>Double Metaphone</title>
<para>
The Double Metaphone system computes two <quote>sounds like</> strings
for a given input string &mdash; a <quote>primary</> and an
<quote>alternate</>. In most cases they are the same, but for non-English
names especially they can be a bit different, depending on pronunciation.
These functions compute the primary and alternate codes:
</para>
<programlisting>
dmetaphone(text source) returns text
dmetaphone_alt(text source) returns text
</programlisting>
<para>
There is no length limit on the input strings.
</para>
<para>
Example:
</para>
<programlisting>
test=# select dmetaphone('gumbo');
dmetaphone
------------
KMP
(1 row)
</programlisting> </programlisting>
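  <para>
   Like <function>soundex</>, these functions can be used for fuzzy name
   matching; for instance, reusing the <literal>s</> table from the Soundex
   examples above:
  </para>
<programlisting>
SELECT nm FROM s WHERE dmetaphone(nm) = dmetaphone('john');
</programlisting>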
</sect2> </sect2>


@ -1,229 +1,241 @@
<!-- $PostgreSQL: pgsql/doc/src/sgml/hstore.sgml,v 1.2 2007/12/06 04:12:10 tgl Exp $ -->
<sect1 id="hstore"> <sect1 id="hstore">
<title>hstore</title> <title>hstore</title>
<indexterm zone="hstore"> <indexterm zone="hstore">
<primary>hstore</primary> <primary>hstore</primary>
</indexterm> </indexterm>
<para> <para>
The <literal>hstore</literal> module is usefull for storing (key,value) pairs. This module implements a data type <type>hstore</> for storing sets of
This module can be useful in different scenarios: case with many attributes (key,value) pairs within a single <productname>PostgreSQL</> data field.
rarely searched, semistructural data or a lazy DBA. This can be useful in various scenarios, such as rows with many attributes
that are rarely examined, or semi-structured data.
</para> </para>
<sect2> <sect2>
<title>Operations</title> <title><type>hstore</> External Representation</title>
<itemizedlist>
<listitem>
<para>
<literal>hstore -> text</literal> - get value , perl analogy $h{key}
</para>
<programlisting>
select 'a=>q, b=>g'->'a';
?
------
q
</programlisting>
<para>
Note the use of parenthesis in the select below, because priority of 'is' is
higher than that of '->':
</para>
<programlisting>
SELECT id FROM entrants WHERE (info->'education_period') IS NOT NULL;
</programlisting>
</listitem>
<listitem> <para>
<para> The text representation of an <type>hstore</> value includes zero
<literal>hstore || hstore</literal> - concatenation, perl analogy %a=( %b, %c ); or more <replaceable>key</> <literal>=&gt;</> <replaceable>value</>
</para> items, separated by commas. For example:
<programlisting>
regression=# select 'a=>b'::hstore || 'c=>d'::hstore;
?column?
--------------------
"a"=>"b", "c"=>"d"
(1 row)
</programlisting>
<para> <programlisting>
but, notice k => v
</para> foo => bar, baz => whatever
"1-a" => "anything at all"
</programlisting>
<programlisting> The order of the items is not considered significant (and may not be
regression=# select 'a=>b'::hstore || 'a=>d'::hstore; reproduced on output). Whitespace between items or around the
?column? <literal>=&gt;</> sign is ignored. Use double quotes if a key or
---------- value includes whitespace, comma, <literal>=</> or <literal>&gt;</>.
"a"=>"d" To include a double quote or a backslash in a key or value, precede
(1 row) it with another backslash. (Keep in mind that depending on the
</programlisting> setting of <varname>standard_conforming_strings</>, you may need to
</listitem> double backslashes in SQL literal strings.)
</para>
<listitem> <para>
<para> A value (but not a key) can be a SQL NULL. This is represented as
<literal>text => text</literal> - creates hstore type from two text strings
</para>
<programlisting>
select 'a'=>'b';
?column?
----------
"a"=>"b"
</programlisting>
</listitem>
<listitem> <programlisting>
<para> key => NULL
<literal>hstore @> hstore</literal> - contains operation, check if left operand contains right. </programlisting>
</para>
<programlisting>
regression=# select 'a=>b, b=>1, c=>NULL'::hstore @> 'a=>c';
?column?
----------
f
(1 row)
regression=# select 'a=>b, b=>1, c=>NULL'::hstore @> 'b=>1'; The <literal>NULL</> keyword is not case-sensitive. Again, use
?column? double quotes if you want the string <literal>null</> to be treated
---------- as an ordinary data value.
t </para>
(1 row)
</programlisting> <para>
</listitem> Currently, double quotes are always used to surround key and value
strings on output, even when this is not strictly necessary.
</para>
<listitem>
<para>
<literal>hstore &lt;@ hstore</literal> - contained operation, check if
left operand is contained in right
</para>
<para>
(Before PostgreSQL 8.2, the containment operators @&gt; and &lt;@ were
respectively called @ and ~. These names are still available, but are
deprecated and will eventually be retired. Notice that the old names
are reversed from the convention formerly followed by the core geometric
datatypes!)
</para>
</listitem>
</itemizedlist>
</sect2> </sect2>
<sect2> <sect2>
<title>Functions</title> <title><type>hstore</> Operators and Functions</title>
<itemizedlist> <table id="hstore-op-table">
<listitem> <title><type>hstore</> Operators</title>
<para>
<literal>akeys(hstore)</literal> - returns all keys from hstore as array
</para>
<programlisting>
regression=# select akeys('a=>1,b=>2');
akeys
-------
{a,b}
</programlisting>
</listitem>
<listitem> <tgroup cols="4">
<para> <thead>
<literal>skeys(hstore)</literal> - returns all keys from hstore as strings <row>
</para> <entry>Operator</entry>
<programlisting> <entry>Description</entry>
regression=# select skeys('a=>1,b=>2'); <entry>Example</entry>
skeys <entry>Result</entry>
------- </row>
a </thead>
b
</programlisting>
</listitem>
<listitem> <tbody>
<para> <row>
<literal>avals(hstore)</literal> - returns all values from hstore as array <entry><type>hstore</> <literal>-&gt;</> <type>text</></entry>
</para> <entry>get value for key (null if not present)</entry>
<programlisting> <entry><literal>'a=&gt;x, b=&gt;y'::hstore -&gt; 'a'</literal></entry>
regression=# select avals('a=>1,b=>2'); <entry><literal>x</literal></entry>
avals </row>
-------
{1,2}
</programlisting>
</listitem>
<listitem> <row>
<para> <entry><type>text</> <literal>=&gt;</> <type>text</></entry>
<literal>svals(hstore)</literal> - returns all values from hstore as <entry>make single-item <type>hstore</></entry>
strings <entry><literal>'a' =&gt; 'b'</literal></entry>
</para> <entry><literal>"a"=&gt;"b"</literal></entry>
<programlisting> </row>
regression=# select svals('a=>1,b=>2');
svals
-------
1
2
</programlisting>
</listitem>
<listitem> <row>
<para> <entry><type>hstore</> <literal>||</> <type>hstore</></entry>
<literal>delete (hstore,text)</literal> - delete (key,value) from hstore if <entry>concatenation</entry>
key matches argument. <entry><literal>'a=&gt;b, c=&gt;d'::hstore || 'c=&gt;x, d=&gt;q'::hstore</literal></entry>
</para> <entry><literal>"a"=&gt;"b", "c"=&gt;"x", "d"=&gt;"q"</literal></entry>
<programlisting> </row>
regression=# select delete('a=>1,b=>2','b');
delete
----------
"a"=>"1"
</programlisting>
</listitem>
<listitem> <row>
<para> <entry><type>hstore</> <literal>?</> <type>text</></entry>
<literal>each(hstore)</literal> - return (key, value) pairs <entry>does <type>hstore</> contain key?</entry>
</para> <entry><literal>'a=&gt;1'::hstore ? 'a'</literal></entry>
<programlisting> <entry><literal>t</literal></entry>
regression=# select * from each('a=>1,b=>2'); </row>
key | value
<row>
<entry><type>hstore</> <literal>@&gt;</> <type>hstore</></entry>
<entry>does left operand contain right?</entry>
<entry><literal>'a=&gt;b, b=&gt;1, c=&gt;NULL'::hstore @&gt; 'b=&gt;1'</literal></entry>
<entry><literal>t</literal></entry>
</row>
<row>
<entry><type>hstore</> <literal>&lt;@</> <type>hstore</></entry>
<entry>is left operand contained in right?</entry>
<entry><literal>'a=&gt;c'::hstore &lt;@ 'a=&gt;b, b=&gt;1, c=&gt;NULL'</literal></entry>
<entry><literal>f</literal></entry>
</row>
</tbody>
</tgroup>
</table>
<para>
(Before PostgreSQL 8.2, the containment operators @&gt; and &lt;@ were
respectively called @ and ~. These names are still available, but are
deprecated and will eventually be retired. Notice that the old names
are reversed from the convention formerly followed by the core geometric
datatypes!)
</para>
<table id="hstore-func-table">
<title><type>hstore</> Functions</title>
<tgroup cols="5">
<thead>
<row>
<entry>Function</entry>
<entry>Return Type</entry>
<entry>Description</entry>
<entry>Example</entry>
<entry>Result</entry>
</row>
</thead>
<tbody>
<row>
<entry><function>akeys(hstore)</function></entry>
<entry><type>text[]</type></entry>
<entry>get <type>hstore</>'s keys as array</entry>
<entry><literal>akeys('a=&gt;1,b=&gt;2')</literal></entry>
<entry><literal>{a,b}</literal></entry>
</row>
<row>
<entry><function>skeys(hstore)</function></entry>
<entry><type>setof text</type></entry>
<entry>get <type>hstore</>'s keys as set</entry>
<entry><literal>skeys('a=&gt;1,b=&gt;2')</literal></entry>
<entry>
<programlisting>
a
b
</programlisting></entry>
</row>
<row>
<entry><function>avals(hstore)</function></entry>
<entry><type>text[]</type></entry>
<entry>get <type>hstore</>'s values as array</entry>
<entry><literal>avals('a=&gt;1,b=&gt;2')</literal></entry>
<entry><literal>{1,2}</literal></entry>
</row>
<row>
<entry><function>svals(hstore)</function></entry>
<entry><type>setof text</type></entry>
<entry>get <type>hstore</>'s values as set</entry>
<entry><literal>svals('a=&gt;1,b=&gt;2')</literal></entry>
<entry>
<programlisting>
1
2
</programlisting></entry>
</row>
<row>
<entry><function>each(hstore)</function></entry>
<entry><type>setof (key text, value text)</type></entry>
<entry>get <type>hstore</>'s keys and values as set</entry>
<entry><literal>select * from each('a=&gt;1,b=&gt;2')</literal></entry>
<entry>
<programlisting>
key | value
-----+------- -----+-------
a | 1 a | 1
b | 2 b | 2
</programlisting> </programlisting></entry>
</listitem> </row>
<listitem> <row>
<para> <entry><function>exist(hstore,text)</function></entry>
<literal>exist (hstore,text)</literal> <entry><type>boolean</type></entry>
</para> <entry>does <type>hstore</> contain key?</entry>
<para> <entry><literal>exist('a=&gt;1','a')</literal></entry>
<literal>hstore ? text</literal> - returns 'true if key is exists in hstore <entry><literal>t</literal></entry>
and false otherwise. </row>
</para>
<programlisting>
regression=# select exist('a=>1','a'), 'a=>1' ? 'a';
exist | ?column?
-------+----------
t | t
</programlisting>
</listitem>
<listitem> <row>
<para> <entry><function>defined(hstore,text)</function></entry>
<literal>defined (hstore,text)</literal> - returns true if key is exists in <entry><type>boolean</type></entry>
hstore and its value is not NULL. <entry>does <type>hstore</> contain non-null value for key?</entry>
</para> <entry><literal>defined('a=&gt;NULL','a')</literal></entry>
<programlisting> <entry><literal>f</literal></entry>
regression=# select defined('a=>NULL','a'); </row>
defined
--------- <row>
f <entry><function>delete(hstore,text)</function></entry>
</programlisting> <entry><type>hstore</type></entry>
</listitem> <entry>delete any item matching key</entry>
</itemizedlist> <entry><literal>delete('a=&gt;1,b=&gt;2','b')</literal></entry>
<entry><literal>"a"=>"1"</literal></entry>
</row>
</tbody>
</tgroup>
</table>
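  <para>
   For instance, the difference between <function>exist</> and
   <function>defined</> shows up only when the value stored for a key is
   <literal>NULL</>: here the first result is true and the second false.
  </para>
<programlisting>
SELECT exist('a=&gt;NULL', 'a'), defined('a=&gt;NULL', 'a');
</programlisting>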
</sect2> </sect2>
<sect2> <sect2>
<title>Indices</title> <title>Indexes</title>
<para> <para>
Module provides index support for '@>' and '?' operations. <type>hstore</> has index support for <literal>@&gt;</> and <literal>?</>
operators. You can use either GiST or GIN index types. For example:
</para> </para>
<programlisting> <programlisting>
CREATE INDEX hidx ON testhstore USING GIST(h); CREATE INDEX hidx ON testhstore USING GIST(h);
CREATE INDEX hidx ON testhstore USING GIN(h); CREATE INDEX hidx ON testhstore USING GIN(h);
</programlisting> </programlisting>
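  <para>
   Queries along these lines can then make use of the index (the key and
   value shown are, of course, just examples):
  </para>
<programlisting>
SELECT * FROM testhstore WHERE h @&gt; 'line=&gt;CA';
SELECT * FROM testhstore WHERE h ? 'line';
</programlisting>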
</sect2> </sect2>
@ -232,45 +244,53 @@ CREATE INDEX hidx ON testhstore USING GIN(h);
<title>Examples</title> <title>Examples</title>
<para> <para>
Add a key: Add a key, or update an existing key with a new value:
</para> </para>
<programlisting> <programlisting>
UPDATE tt SET h=h||'c=>3'; UPDATE tab SET h = h || ('c' => '3');
</programlisting> </programlisting>
<para> <para>
Delete a key: Delete a key:
</para> </para>
<programlisting> <programlisting>
UPDATE tt SET h=delete(h,'k1'); UPDATE tab SET h = delete(h, 'k1');
</programlisting> </programlisting>
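  <para>
   In the same vein, one can fetch the value stored under a particular key
   (the result is NULL if the key is not present):
  </para>
<programlisting>
SELECT h -&gt; 'c' FROM tab;
</programlisting>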
</sect2> </sect2>
<sect2> <sect2>
<title>Statistics</title> <title>Statistics</title>
<para> <para>
hstore type, because of its intrinsic liberality, could contain a lot of The <type>hstore</> type, because of its intrinsic liberality, could
different keys. Checking for valid keys is the task of application. contain a lot of different keys. Checking for valid keys is the task of the
Examples below demonstrate several techniques how to check keys statistics. application. Examples below demonstrate several techniques for checking
keys and obtaining statistics.
</para> </para>
<para> <para>
Simple example Simple example:
</para> </para>
<programlisting> <programlisting>
SELECT * FROM each('aaa=>bq, b=>NULL, ""=>1 '); SELECT * FROM each('aaa=>bq, b=>NULL, ""=>1');
</programlisting> </programlisting>
<para> <para>
Using table Using a table:
</para> </para>
<programlisting> <programlisting>
SELECT (each(h)).key, (each(h)).value INTO stat FROM testhstore ; SELECT (each(h)).key, (each(h)).value INTO stat FROM testhstore;
</programlisting> </programlisting>
<para>Online stat</para> <para>
Online statistics:
</para>
<programlisting> <programlisting>
SELECT key, count(*) FROM (SELECT (each(h)).key FROM testhstore) AS stat GROUP BY key ORDER BY count DESC, key; SELECT key, count(*) FROM
key | count (SELECT (each(h)).key FROM testhstore) AS stat
GROUP BY key
ORDER BY count DESC, key;
key | count
-----------+------- -----------+-------
line | 883 line | 883
query | 207 query | 207
@ -287,12 +307,14 @@ SELECT key, count(*) FROM (SELECT (each(h)).key FROM testhstore) AS stat GROUP B
<sect2> <sect2>
<title>Authors</title> <title>Authors</title>
<para> <para>
Oleg Bartunov <email>oleg@sai.msu.su</email>, Moscow, Moscow University, Russia Oleg Bartunov <email>oleg@sai.msu.su</email>, Moscow, Moscow University, Russia
</para> </para>
<para> <para>
Teodor Sigaev <email>teodor@sigaev.ru</email>, Moscow, Delta-Soft Ltd.,Russia Teodor Sigaev <email>teodor@sigaev.ru</email>, Moscow, Delta-Soft Ltd., Russia
</para> </para>
</sect2> </sect2>
</sect1>
</sect1>


@ -1,118 +1,126 @@
<!-- $PostgreSQL: pgsql/doc/src/sgml/lo.sgml,v 1.3 2007/12/06 04:12:10 tgl Exp $ -->
<sect1 id="lo"> <sect1 id="lo">
<title>lo</title> <title>lo</title>
<indexterm zone="lo"> <indexterm zone="lo">
<primary>lo</primary> <primary>lo</primary>
</indexterm> </indexterm>
<para> <para>
PostgreSQL type extension for managing Large Objects The <filename>lo</> module provides support for managing Large Objects
(also called LOs or BLOBs). This includes a data type <type>lo</>
and a trigger <function>lo_manage</>.
</para> </para>
<sect2> <sect2>
<title>Overview</title> <title>Rationale</title>
<para> <para>
One of the problems with the JDBC driver (and this affects the ODBC driver One of the problems with the JDBC driver (and this affects the ODBC driver
also), is that the specification assumes that references to BLOBS (Binary also), is that the specification assumes that references to BLOBs (Binary
Large OBjectS) are stored within a table, and if that entry is changed, the Large OBjects) are stored within a table, and if that entry is changed, the
associated BLOB is deleted from the database. associated BLOB is deleted from the database.
</para> </para>
<para> <para>
As PostgreSQL stands, this doesn't occur. Large objects are treated as As <productname>PostgreSQL</> stands, this doesn't occur. Large objects
objects in their own right; a table entry can reference a large object by are treated as objects in their own right; a table entry can reference a
OID, but there can be multiple table entries referencing the same large large object by OID, but there can be multiple table entries referencing
object OID, so the system doesn't delete the large object just because you the same large object OID, so the system doesn't delete the large object
change or remove one such entry. just because you change or remove one such entry.
</para> </para>
<para> <para>
Now this is fine for new PostgreSQL-specific applications, but existing ones Now this is fine for <productname>PostgreSQL</>-specific applications, but
using JDBC or ODBC won't delete the objects, resulting in orphaning - objects standard code using JDBC or ODBC won't delete the objects, resulting in
that are not referenced by anything, and simply occupy disk space. orphan objects &mdash; objects that are not referenced by anything, and
simply occupy disk space.
</para>
<para>
The <filename>lo</> module allows fixing this by attaching a trigger
to tables that contain LO reference columns. The trigger essentially just
does a <function>lo_unlink</> whenever you delete or modify a value
referencing a large object. When you use this trigger, you are assuming
that there is only one database reference to any large object that is
referenced in a trigger-controlled column!
</para>
<para>
The module also provides a data type <type>lo</>, which is really just
a domain of the <type>oid</> type. This is useful for differentiating
database columns that hold large object references from those that are
OIDs of other things. You don't have to use the <type>lo</> type to
use the trigger, but it may be convenient to use it to keep track of which
columns in your database represent large objects that you are managing with
the trigger. It is also rumored that the ODBC driver gets confused if you
don't use <type>lo</> for BLOB columns.
</para> </para>
</sect2> </sect2>
<sect2> <sect2>
<title>The Fix</title> <title>How to Use It</title>
<para>
I've fixed this by creating a new data type 'lo', some support functions, and
a Trigger which handles the orphaning problem. The trigger essentially just
does a 'lo_unlink' whenever you delete or modify a value referencing a large
object. When you use this trigger, you are assuming that there is only one
database reference to any large object that is referenced in a
trigger-controlled column!
</para>
<para>
The 'lo' type was created because we needed to differentiate between plain
OIDs and Large Objects. Currently the JDBC driver handles this dilemma easily,
but (after talking to Byron), the ODBC driver needed a unique type. They had
created an 'lo' type, but not the solution to orphaning.
</para>
<para>
You don't actually have to use the 'lo' type to use the trigger, but it may be
convenient to use it to keep track of which columns in your database represent
large objects that you are managing with the trigger.
</para>
</sect2>
<sect2>
<title>How to Use</title>
<para> <para>
The easiest way is by an example: Here's a simple example of usage:
</para> </para>
<programlisting> <programlisting>
CREATE TABLE image (title TEXT, raster lo); CREATE TABLE image (title TEXT, raster lo);
CREATE TRIGGER t_raster BEFORE UPDATE OR DELETE ON image CREATE TRIGGER t_raster BEFORE UPDATE OR DELETE ON image
FOR EACH ROW EXECUTE PROCEDURE lo_manage(raster); FOR EACH ROW EXECUTE PROCEDURE lo_manage(raster);
</programlisting> </programlisting>
<para> <para>
Create a trigger for each column that contains a lo type, and give the column For each column that will contain unique references to large objects,
name as the trigger procedure argument. You can have more than one trigger on create a <literal>BEFORE UPDATE OR DELETE</> trigger, and give the column
a table if you need multiple lo columns in the same table, but don't forget to name as the sole trigger argument. If you need multiple <type>lo</>
give a different name to each trigger. columns in the same table, create a separate trigger for each one,
remembering to give a different name to each trigger on the same table.
</para> </para>
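  <para>
   A reference to a large object can then be stored like any other
   <type>oid</>-like value; for instance, using the server-side
   <function>lo_import</> function (the file path shown is only
   illustrative):
  </para>
<programlisting>
INSERT INTO image (title, raster)
    VALUES ('beautiful image', lo_import('/etc/motd'));
</programlisting>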
</sect2> </sect2>
<sect2> <sect2>
<title>Issues</title> <title>Limitations</title>
<itemizedlist> <itemizedlist>
<listitem> <listitem>
<para> <para>
Dropping a table will still orphan any objects it contains, as the trigger Dropping a table will still orphan any objects it contains, as the trigger
is not executed. is not executed. You can avoid this by preceding the <command>DROP
TABLE</> with <command>DELETE FROM <replaceable>table</></command>.
</para> </para>
<para> <para>
Avoid this by preceding the 'drop table' with 'delete from {table}'. <command>TRUNCATE</> has the same hazard.
</para> </para>
<para> <para>
If you already have, or suspect you have, orphaned large objects, see If you already have, or suspect you have, orphaned large objects, see the
the contrib/vacuumlo module to help you clean them up. It's a good idea <filename>contrib/vacuumlo</> module (<xref linkend="vacuumlo">) to help
to run contrib/vacuumlo occasionally as a back-stop to the lo_manage you clean them up. It's a good idea to run <application>vacuumlo</>
trigger. occasionally as a back-stop to the <function>lo_manage</> trigger.
</para> </para>
</listitem> </listitem>
<listitem> <listitem>
<para> <para>
Some frontends may create their own tables, and will not create the Some frontends may create their own tables, and will not create the
associated trigger(s). Also, users may not remember (or know) to create associated trigger(s). Also, users may not remember (or know) to create
the triggers. the triggers.
</para> </para>
</listitem> </listitem>
</itemizedlist> </itemizedlist>
<para>
As the ODBC driver needs a permanent lo type (&amp; JDBC could be optimised to
use it if it's Oid is fixed), and as the above issues can only be fixed by
some internal changes, I feel it should become a permanent built-in type.
</para>
</sect2> </sect2>
<sect2> <sect2>
<title>Author</title> <title>Author</title>
<para> <para>
Peter Mount <email>peter@retep.org.uk</email> June 13 1998 Peter Mount <email>peter@retep.org.uk</email>
</para> </para>
</sect2> </sect2>
</sect1>
</sect1>


@ -1,19 +1,22 @@
<!-- $PostgreSQL: pgsql/doc/src/sgml/seg.sgml,v 1.4 2007/12/06 04:12:10 tgl Exp $ -->
<sect1 id="seg"> <sect1 id="seg">
<title>seg</title> <title>seg</title>
<indexterm zone="seg"> <indexterm zone="seg">
<primary>seg</primary> <primary>seg</primary>
</indexterm> </indexterm>
<para> <para>
The <literal>seg</literal> module contains the code for the user-defined This module implements a data type <type>seg</> for
type, <literal>SEG</literal>, representing laboratory measurements as representing line segments, or floating point intervals.
floating point intervals. <type>seg</> can represent uncertainty in the interval endpoints,
making it especially useful for representing laboratory measurements.
</para> </para>
<sect2> <sect2>
<title>Rationale</title> <title>Rationale</title>
<para> <para>
The geometry of measurements is usually more complex than that of a The geometry of measurements is usually more complex than that of a
point in a numeric continuum. A measurement is usually a segment of point in a numeric continuum. A measurement is usually a segment of
@ -22,26 +25,28 @@
the value being measured may naturally be an interval indicating some the value being measured may naturally be an interval indicating some
condition, such as the temperature range of stability of a protein. condition, such as the temperature range of stability of a protein.
</para> </para>
<para> <para>
Using just common sense, it appears more convenient to store such data Using just common sense, it appears more convenient to store such data
as intervals, rather than pairs of numbers. In practice, it even turns as intervals, rather than pairs of numbers. In practice, it even turns
out more efficient in most applications. out more efficient in most applications.
</para> </para>
<para> <para>
Further along the line of common sense, the fuzziness of the limits Further along the line of common sense, the fuzziness of the limits
suggests that the use of traditional numeric data types leads to a suggests that the use of traditional numeric data types leads to a
certain loss of information. Consider this: your instrument reads certain loss of information. Consider this: your instrument reads
6.50, and you input this reading into the database. What do you get 6.50, and you input this reading into the database. What do you get
when you fetch it? Watch: when you fetch it? Watch:
</para>
<programlisting> <programlisting>
test=> select 6.50 as "pH"; test=> select 6.50 :: float8 as "pH";
pH pH
--- ---
6.5 6.5
(1 row) (1 row)
</programlisting> </programlisting>
<para>
In the world of measurements, 6.50 is not the same as 6.5. It may In the world of measurements, 6.50 is not the same as 6.5. It may
sometimes be critically different. The experimenters usually write sometimes be critically different. The experimenters usually write
down (and publish) the digits they trust. 6.50 is actually a fuzzy down (and publish) the digits they trust. 6.50 is actually a fuzzy
@ -50,234 +55,171 @@ test=> select 6.50 as "pH";
share. We definitely do not want such different data items to appear the share. We definitely do not want such different data items to appear the
same. same.
</para> </para>
<para> <para>
Conclusion? It is nice to have a special data type that can record the Conclusion? It is nice to have a special data type that can record the
limits of an interval with arbitrarily variable precision. Variable in limits of an interval with arbitrarily variable precision. Variable in
a sense that each data element records its own precision. the sense that each data element records its own precision.
</para> </para>
<para> <para>
Check this out: Check this out:
</para>
<programlisting> <programlisting>
test=> select '6.25 .. 6.50'::seg as "pH"; test=> select '6.25 .. 6.50'::seg as "pH";
pH pH
------------ ------------
6.25 .. 6.50 6.25 .. 6.50
(1 row) (1 row)
</programlisting> </programlisting>
</para>
</sect2> </sect2>
<sect2> <sect2>
<title>Syntax</title> <title>Syntax</title>
<para> <para>
The external representation of an interval is formed using one or two The external representation of an interval is formed using one or two
floating point numbers joined by the range operator ('..' or '...'). floating point numbers joined by the range operator (<literal>..</literal>
Optional certainty indicators (&lt;, &gt; and ~) are ignored by the internal or <literal>...</literal>). Alternatively, it can be specified as a
logics, but are retained in the data. center point plus or minus a deviation.
Optional certainty indicators (<literal>&lt;</literal>,
<literal>&gt;</literal> and <literal>~</literal>) can be stored as well.
(Certainty indicators are ignored by all the built-in operators, however.)
</para> </para>
<table>
<title>Rules</title>
<tgroup cols="2">
<tbody>
<row>
<entry>rule 1</entry>
<entry>seg -&gt; boundary PLUMIN deviation</entry>
</row>
<row>
<entry>rule 2</entry>
<entry>seg -&gt; boundary RANGE boundary</entry>
</row>
<row>
<entry>rule 3</entry>
<entry>seg -&gt; boundary RANGE</entry>
</row>
<row>
<entry>rule 4</entry>
<entry>seg -&gt; RANGE boundary</entry>
</row>
<row>
<entry>rule 5</entry>
<entry>seg -&gt; boundary</entry>
</row>
<row>
<entry>rule 6</entry>
<entry>boundary -&gt; FLOAT</entry>
</row>
<row>
<entry>rule 7</entry>
<entry>boundary -&gt; EXTENSION FLOAT</entry>
</row>
<row>
<entry>rule 8</entry>
<entry>deviation -&gt; FLOAT</entry>
</row>
</tbody>
</tgroup>
</table>
<table>
<title>Tokens</title>
<tgroup cols="2">
<tbody>
<row>
<entry>RANGE</entry>
<entry>(\.\.)(\.)?</entry>
</row>
<row>
<entry>PLUMIN</entry>
<entry>\'\+\-\'</entry>
</row>
<row>
<entry>integer</entry>
<entry>[+-]?[0-9]+</entry>
</row>
<row>
<entry>real</entry>
<entry>[+-]?[0-9]+\.[0-9]+</entry>
</row>
<row>
<entry>FLOAT</entry>
<entry>({integer}|{real})([eE]{integer})?</entry>
</row>
<row>
<entry>EXTENSION</entry>
<entry>[&lt;&gt;~]</entry>
</row>
</tbody>
</tgroup>
</table>
<table>
<title>Examples of valid <literal>SEG</literal> representations</title>
<tgroup cols="2">
<tbody>
<row>
<entry>Any number</entry>
<entry>
(rules 5,6) -- creates a zero-length segment (a point,
if you will)
</entry>
</row>
<row>
<entry>~5.0</entry>
<entry>
(rules 5,7) -- creates a zero-length segment AND records
'~' in the data. This notation reads 'approximately 5.0',
but its meaning is not recognized by the code. It is ignored
until you get the value back. View it is a short-hand comment.
</entry>
</row>
<row>
<entry>&lt;5.0</entry>
<entry>
(rules 5,7) -- creates a point at 5.0; '&lt;' is ignored but
is preserved as a comment
</entry>
</row>
<row>
<entry>&gt;5.0</entry>
<entry>
(rules 5,7) -- creates a point at 5.0; '&gt;' is ignored but
is preserved as a comment
</entry>
</row>
<row>
<entry><para>5(+-)0.3</para><para>5'+-'0.3</para></entry>
<entry>
<para>
(rules 1,8) -- creates an interval '4.7..5.3'. As of this
writing (02/09/2000), this mechanism isn't completely accurate
in determining the number of significant digits for the
boundaries. For example, it adds an extra digit to the lower
boundary if the resulting interval includes a power of ten:
</para>
<programlisting>
postgres=> select '10(+-)1'::seg as seg;
seg
---------
9.0 .. 11 -- should be: 9 .. 11
</programlisting>
<para>
Also, the (+-) notation is not preserved: 'a(+-)b' will
always be returned as '(a-b) .. (a+b)'. The purpose of this
notation is to allow input from certain data sources without
conversion.
</para>
</entry>
</row>
<row>
<entry>50 .. </entry>
<entry>(rule 3) -- everything that is greater than or equal to 50</entry>
</row>
<row>
<entry>.. 0</entry>
<entry>(rule 4) -- everything that is less than or equal to 0</entry>
</row>
<row>
<entry>1.5e-2 .. 2E-2 </entry>
<entry>(rule 2) -- creates an interval (0.015 .. 0.02)</entry>
</row>
<row>
<entry>1 ... 2</entry>
<entry>
The same as 1...2, or 1 .. 2, or 1..2 (space is ignored).
Because of the widespread use of '...' in the data sources,
I decided to stick to is as a range operator. This, and
also the fact that the white space around the range operator
is ignored, creates a parsing conflict with numeric constants
starting with a decimal point.
</entry>
</row>
</tbody>
</tgroup>
</table>
<table>
<title>Examples</title>
<tgroup cols="2">
<tbody>
<row>
<entry>.1e7</entry>
<entry>should be: 0.1e7</entry>
</row>
<row>
<entry>.1 .. .2</entry>
<entry>should be: 0.1 .. 0.2</entry>
</row>
<row>
<entry>2.4 E4</entry>
<entry>should be: 2.4E4</entry>
</row>
</tbody>
</tgroup>
</table>
<para> <para>
The following, although it is not a syntax error, is disallowed to improve In the following table, <replaceable>x</>, <replaceable>y</>, and
the sanity of the data: <replaceable>delta</> denote
floating-point numbers. <replaceable>x</> and <replaceable>y</>, but
not <replaceable>delta</>, can be preceded by a certainty indicator:
</para> </para>
<table> <table>
<title></title> <title><type>seg</> external representations</title>
<tgroup cols="2"> <tgroup cols="2">
<tbody> <tbody>
<row> <row>
<entry>5 .. 2</entry> <entry><literal><replaceable>x</></literal></entry>
<entry>should be: 2 .. 5</entry> <entry>Single value (zero-length interval)
</entry>
</row>
<row>
<entry><literal><replaceable>x</> .. <replaceable>y</></literal></entry>
<entry>Interval from <replaceable>x</> to <replaceable>y</>
</entry>
</row>
<row>
<entry><literal><replaceable>x</> (+-) <replaceable>delta</></literal></entry>
<entry>Interval from <replaceable>x</> - <replaceable>delta</> to
<replaceable>x</> + <replaceable>delta</>
</entry>
</row>
<row>
<entry><literal><replaceable>x</> ..</literal></entry>
<entry>Open interval with lower bound <replaceable>x</>
</entry>
</row>
<row>
<entry><literal>.. <replaceable>x</></literal></entry>
<entry>Open interval with upper bound <replaceable>x</>
</entry>
</row> </row>
</tbody> </tbody>
</tgroup> </tgroup>
</table> </table>
<table>
<title>Examples of valid <type>seg</> input</title>
<tgroup cols="2">
<tbody>
<row>
<entry><literal>5.0</literal></entry>
<entry>
Creates a zero-length segment (a point, if you will)
</entry>
</row>
<row>
<entry><literal>~5.0</literal></entry>
<entry>
Creates a zero-length segment and records
<literal>~</> in the data. <literal>~</literal> is ignored
by <type>seg</> operations, but
is preserved as a comment.
</entry>
</row>
<row>
<entry><literal>&lt;5.0</literal></entry>
<entry>
Creates a point at 5.0. <literal>&lt;</literal> is ignored but
is preserved as a comment.
</entry>
</row>
<row>
<entry><literal>&gt;5.0</literal></entry>
<entry>
Creates a point at 5.0. <literal>&gt;</literal> is ignored but
is preserved as a comment.
</entry>
</row>
<row>
<entry><literal>5(+-)0.3</literal></entry>
<entry>
Creates an interval <literal>4.7 .. 5.3</literal>.
Note that the <literal>(+-)</> notation isn't preserved.
</entry>
</row>
<row>
<entry><literal>50 .. </literal></entry>
<entry>Everything that is greater than or equal to 50</entry>
</row>
<row>
<entry><literal>.. 0</literal></entry>
<entry>Everything that is less than or equal to 0</entry>
</row>
<row>
<entry><literal>1.5e-2 .. 2E-2 </literal></entry>
<entry>Creates an interval <literal>0.015 .. 0.02</literal></entry>
</row>
<row>
<entry><literal>1 ... 2</literal></entry>
<entry>
The same as <literal>1...2</literal>, or <literal>1 .. 2</literal>,
or <literal>1..2</literal>
(spaces around the range operator are ignored)
</entry>
</row>
</tbody>
</tgroup>
</table>
<para>
Because <literal>...</> is widely used in data sources, it is allowed
as an alternative spelling of <literal>..</>. Unfortunately, this
creates a parsing ambiguity: it is not clear whether the upper bound
in <literal>0...23</> is meant to be <literal>23</> or <literal>0.23</>.
This is resolved by requiring at least one digit before the decimal
point in all numbers in <type>seg</> input.
</para>
<para>
As a sanity check, <type>seg</> rejects intervals with the lower bound
greater than the upper, for example <literal>5 .. 2</>.
</para>
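  <para>
   A quick way to check how an input string will be interpreted is simply to
   cast it (the exact output formatting may differ):
  </para>
<programlisting>
test=&gt; select '5(+-)0.3'::seg, '6.25 .. 6.50'::seg, '50 ..'::seg;
</programlisting>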
</sect2> </sect2>
<sect2> <sect2>
<title>Precision</title> <title>Precision</title>
<para> <para>
The segments are stored internally as pairs of 32-bit floating point <type>seg</> values are stored internally as pairs of 32-bit floating point
numbers. It means that the numbers with more than 7 significant digits numbers. This means that numbers with more than 7 significant digits
will be truncated. will be truncated.
</para> </para>
<para> <para>
The numbers with less than or exactly 7 significant digits retain their Numbers with 7 or fewer significant digits retain their
original precision. That is, if your query returns 0.00, you will be original precision. That is, if your query returns 0.00, you will be
sure that the trailing zeroes are not the artifacts of formatting: they sure that the trailing zeroes are not the artifacts of formatting: they
reflect the precision of the original data. The number of leading reflect the precision of the original data. The number of leading
@ -288,28 +230,20 @@ postgres=> select '10(+-)1'::seg as seg;
<sect2> <sect2>
<title>Usage</title> <title>Usage</title>
<para> <para>
The access method for SEG is a GiST index (gist_seg_ops), which is a The <filename>seg</> module includes a GiST index operator class for
generalization of R-tree. GiSTs allow the postgres implementation of <type>seg</> values.
R-tree, originally encoded to support 2-D geometric types such as The operators supported by the GiST opclass include:
boxes and polygons, to be used with any data type whose data domain
can be partitioned using the concepts of containment, intersection and
equality. In other words, everything that can intersect or contain
its own kind can be indexed with a GiST. That includes, among other
things, all geometric data types, regardless of their dimensionality
(see also contrib/cube).
</para>
<para>
The operators supported by the GiST access method include:
</para> </para>
<itemizedlist> <itemizedlist>
<listitem> <listitem>
<programlisting> <programlisting>
[a, b] &lt;&lt; [c, d] Is left of [a, b] &lt;&lt; [c, d] Is left of
</programlisting> </programlisting>
<para> <para>
The left operand, [a, b], occurs entirely to the left of the [a, b] is entirely to the left of [c, d]. That is,
right operand, [c, d], on the axis (-inf, inf). It means,
[a, b] &lt;&lt; [c, d] is true if b &lt; c and false otherwise [a, b] &lt;&lt; [c, d] is true if b &lt; c and false otherwise
</para> </para>
</listitem> </listitem>
@ -318,8 +252,8 @@ postgres=> select '10(+-)1'::seg as seg;
[a, b] &gt;&gt; [c, d] Is right of [a, b] &gt;&gt; [c, d] Is right of
</programlisting> </programlisting>
<para> <para>
[a, b] is occurs entirely to the right of [c, d]. [a, b] is entirely to the right of [c, d]. That is,
[a, b] &gt;&gt; [c, d] is true if a &gt; d and false otherwise [a, b] &gt;&gt; [c, d] is true if a &gt; d and false otherwise
</para> </para>
</listitem> </listitem>
<listitem> <listitem>
@ -327,8 +261,8 @@ postgres=> select '10(+-)1'::seg as seg;
[a, b] &amp;&lt; [c, d] Overlaps or is left of [a, b] &amp;&lt; [c, d] Overlaps or is left of
</programlisting> </programlisting>
<para> <para>
This might be better read as "does not extend to right of". This might be better read as <quote>does not extend to right of</quote>.
It is true when b &lt;= d. It is true when b &lt;= d.
</para> </para>
</listitem> </listitem>
<listitem> <listitem>
@ -336,17 +270,16 @@ postgres=> select '10(+-)1'::seg as seg;
[a, b] &amp;&gt; [c, d] Overlaps or is right of [a, b] &amp;&gt; [c, d] Overlaps or is right of
</programlisting> </programlisting>
<para> <para>
This might be better read as "does not extend to left of". This might be better read as <quote>does not extend to left of</quote>.
It is true when a &gt;= c. It is true when a &gt;= c.
</para> </para>
</listitem> </listitem>
<listitem> <listitem>
<programlisting> <programlisting>
[a, b] = [c, d] Same as [a, b] = [c, d] Same as
</programlisting> </programlisting>
<para> <para>
The segments [a, b] and [c, d] are identical, that is, a == b The segments [a, b] and [c, d] are identical, that is, a = c and b = d
and c == d
</para> </para>
</listitem> </listitem>
<listitem> <listitem>
@ -354,28 +287,29 @@ postgres=> select '10(+-)1'::seg as seg;
[a, b] &amp;&amp; [c, d] Overlaps [a, b] &amp;&amp; [c, d] Overlaps
</programlisting> </programlisting>
<para> <para>
The segments [a, b] and [c, d] overlap. The segments [a, b] and [c, d] overlap.
</para> </para>
</listitem> </listitem>
<listitem> <listitem>
<programlisting> <programlisting>
[a, b] @&gt; [c, d] Contains [a, b] @&gt; [c, d] Contains
</programlisting> </programlisting>
<para> <para>
The segment [a, b] contains the segment [c, d], that is, The segment [a, b] contains the segment [c, d], that is,
a &lt;= c and b &gt;= d a &lt;= c and b &gt;= d
</para> </para>
</listitem> </listitem>
<listitem> <listitem>
<programlisting> <programlisting>
[a, b] &lt;@ [c, d] Contained in [a, b] &lt;@ [c, d] Contained in
</programlisting> </programlisting>
<para> <para>
The segment [a, b] is contained in [c, d], that is, The segment [a, b] is contained in [c, d], that is,
a &gt;= c and b &lt;= d a &gt;= c and b &lt;= d
</para> </para>
</listitem> </listitem>
</itemizedlist> </itemizedlist>
<para> <para>
(Before PostgreSQL 8.2, the containment operators @&gt; and &lt;@ were (Before PostgreSQL 8.2, the containment operators @&gt; and &lt;@ were
respectively called @ and ~. These names are still available, but are respectively called @ and ~. These names are still available, but are
@ -383,68 +317,70 @@ postgres=> select '10(+-)1'::seg as seg;
are reversed from the convention formerly followed by the core geometric are reversed from the convention formerly followed by the core geometric
datatypes!) datatypes!)
</para> </para>
<para> <para>
Although the mnemonics of the following operators is questionable, I The standard B-tree operators are also provided, for example
preserved them to maintain visual consistency with other geometric
data types defined in Postgres.
</para>
<para>
Other operators:
</para>
<programlisting> <programlisting>
[a, b] &lt; [c, d] Less than [a, b] &lt; [c, d] Less than
[a, b] &gt; [c, d] Greater than [a, b] &gt; [c, d] Greater than
</programlisting> </programlisting>
<para>
These operators do not make a lot of sense for any practical These operators do not make a lot of sense for any practical
purpose but sorting. These operators first compare (a) to (c), purpose but sorting. These operators first compare (a) to (c),
and if these are equal, compare (b) to (d). That accounts for and if these are equal, compare (b) to (d). That results in
reasonably good sorting in most cases, which is useful if reasonably good sorting in most cases, which is useful if
you want to use ORDER BY with this type you want to use ORDER BY with this type.
</para>
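  <para>
   A typical usage pattern might look like this; the table is hypothetical,
   and it is assumed that the module's GiST operator class is the default one
   for <type>seg</>, as it is when installed from <filename>seg.sql</>:
  </para>
<programlisting>
CREATE TABLE measurements (id serial, reading seg);
CREATE INDEX measurements_reading_idx ON measurements USING gist (reading);

-- find stored intervals overlapping 6.25 .. 6.50
SELECT * FROM measurements WHERE reading &amp;&amp; '6.25 .. 6.50'::seg;
</programlisting>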
</sect2>
<sect2>
<title>Notes</title>
<para>
For examples of usage, see the regression test <filename>sql/seg.sql</>.
</para> </para>
<para> <para>
There are a few other potentially useful functions defined in seg.c The mechanism that converts <literal>(+-)</> to regular ranges
that vanished from the schema because I stopped using them. Some of isn't completely accurate in determining the number of significant digits
these were meant to support type casting. Let me know if I was wrong: for the boundaries. For example, it adds an extra digit to the lower
I will then add them back to the schema. I would also appreciate boundary if the resulting interval includes a power of ten:
other ideas that would enhance the type and make it more useful.
<programlisting>
postgres=> select '10(+-)1'::seg as seg;
seg
---------
9.0 .. 11 -- should be: 9 .. 11
</programlisting>
</para> </para>
<para> <para>
For examples of usage, see sql/seg.sql The performance of an R-tree index can largely depend on the initial
</para>
<para>
NOTE: The performance of an R-tree index can largely depend on the
order of input values. It may be very helpful to sort the input table order of input values. It may be very helpful to sort the input table
on the SEG column (see the script sort-segments.pl for an example) on the <type>seg</> column; see the script <filename>sort-segments.pl</>
for an example.
</para> </para>
</sect2> </sect2>
<sect2> <sect2>
<title>Credits</title> <title>Credits</title>
<para>
Original author: Gene Selkov, Jr. <email>selkovjr@mcs.anl.gov</email>,
Mathematics and Computer Science Division, Argonne National Laboratory.
</para>
<para> <para>
My thanks are primarily to Prof. Joe Hellerstein My thanks are primarily to Prof. Joe Hellerstein
(<ulink url="http://db.cs.berkeley.edu/~jmh/"></ulink>) for elucidating the (<ulink url="http://db.cs.berkeley.edu/~jmh/"></ulink>) for elucidating the
gist of the GiST (<ulink url="http://gist.cs.berkeley.edu/"></ulink>). I am gist of the GiST (<ulink url="http://gist.cs.berkeley.edu/"></ulink>). I am
also grateful to all postgres developers, present and past, for enabling also grateful to all Postgres developers, present and past, for enabling
myself to create my own world and live undisturbed in it. And I would like myself to create my own world and live undisturbed in it. And I would like
to acknowledge my gratitude to Argonne Lab and to the U.S. Department of to acknowledge my gratitude to Argonne Lab and to the U.S. Department of
Energy for the years of faithful support of my database research. Energy for the years of faithful support of my database research.
</para> </para>
<programlisting>
Gene Selkov, Jr.
Computational Scientist
Mathematics and Computer Science Division
Argonne National Laboratory
9700 S Cass Ave.
Building 221
Argonne, IL 60439-4844
</programlisting>
<para>
<email>selkovjr@mcs.anl.gov</email>
</para>
</sect2> </sect2>
</sect1> </sect1>


@ -1,111 +1,126 @@
<!-- $PostgreSQL: pgsql/doc/src/sgml/sslinfo.sgml,v 1.3 2007/12/06 04:12:10 tgl Exp $ -->
<sect1 id="sslinfo"> <sect1 id="sslinfo">
<title>sslinfo</title> <title>sslinfo</title>
<indexterm zone="sslinfo"> <indexterm zone="sslinfo">
<primary>sslinfo</primary> <primary>sslinfo</primary>
</indexterm> </indexterm>
<para> <para>
This modules provides information about current SSL certificate for PostgreSQL. The <filename>sslinfo</> module provides information about the SSL
certificate that the current client provided when connecting to
<productname>PostgreSQL</>. The module is useless (most functions
will return NULL) if the current connection does not use SSL.
</para>
<para>
This extension won't build at all unless the installation was
configured with <literal>--with-openssl</>.
</para> </para>
<sect2> <sect2>
<title>Notes</title> <title>Functions Provided</title>
<para>
This extension won't build unless your PostgreSQL server is configured
with --with-openssl. Information provided with these functions would
be completely useless if you don't use SSL to connect to database.
</para>
</sect2>
<sect2> <variablelist>
<title>Functions Description</title> <varlistentry>
<term><function>
<itemizedlist> ssl_is_used() returns boolean
<listitem> </function></term>
<programlisting> <listitem>
ssl_is_used() RETURNS boolean;
</programlisting>
<para> <para>
Returns TRUE, if current connection to server uses SSL and FALSE Returns TRUE if current connection to server uses SSL, and FALSE
otherwise. otherwise.
</para> </para>
</listitem> </listitem>
</varlistentry>
<listitem> <varlistentry>
<programlisting> <term><function>
ssl_client_cert_present() RETURNS boolean ssl_client_cert_present() returns boolean
</programlisting> </function></term>
<listitem>
<para> <para>
Returns TRUE if current client have presented valid SSL client Returns TRUE if current client has presented a valid SSL client
certificate to the server and FALSE otherwise (e.g., no SSL, certificate to the server, and FALSE otherwise. (The server
certificate hadn't be requested by server). might or might not be configured to require a client certificate.)
</para> </para>
</listitem> </listitem>
</varlistentry>
<listitem>
<programlisting>
ssl_client_serial() RETURNS numeric
</programlisting>
<para>
Returns serial number of current client certificate. The combination
of certificate serial number and certificate issuer is guaranteed to
uniquely identify certificate (but not its owner -- the owner ought to
regularily change his keys, and get new certificates from the issuer).
</para>
<para>
So, if you run you own CA and allow only certificates from this CA to
be accepted by server, the serial number is the most reliable (albeit
not very mnemonic) means to indentify user.
</para>
</listitem>
<listitem> <varlistentry>
<programlisting> <term><function>
ssl_client_dn() RETURNS text ssl_client_serial() returns numeric
</programlisting> </function></term>
<listitem>
<para> <para>
Returns the full subject of current client certificate, converting Returns serial number of current client certificate. The combination of
certificate serial number and certificate issuer is guaranteed to
uniquely identify a certificate (but not its owner &mdash; the owner
ought to regularly change his keys, and get new certificates from the
issuer).
</para>
<para>
So, if you run your own CA and allow only certificates from this CA to
be accepted by the server, the serial number is the most reliable (albeit
not very mnemonic) means to identify a user.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><function>
ssl_client_dn() returns text
</function></term>
<listitem>
<para>
Returns the full subject of the current client certificate, converting
character data into the current database encoding. It is assumed that character data into the current database encoding. It is assumed that
if you use non-Latin characters in the certificate names, your if you use non-ASCII characters in the certificate names, your
database is able to represent these characters, too. If your database database is able to represent these characters, too. If your database
uses the SQL_ASCII encoding, non-Latin characters in the name will be uses the SQL_ASCII encoding, non-ASCII characters in the name will be
represented as UTF-8 sequences. represented as UTF-8 sequences.
</para> </para>
<para> <para>
The result looks like '/CN=Somebody /C=Some country/O=Some organization'. The result looks like <literal>/CN=Somebody /C=Some country/O=Some organization</>.
</para> </para>
</listitem> </listitem>
</varlistentry>
<listitem>
<programlisting> <varlistentry>
ssl_issuer_dn() <term><function>
</programlisting> ssl_issuer_dn() returns text
</function></term>
<listitem>
<para> <para>
Returns the full issuer name of the client certificate, converting Returns the full issuer name of the current client certificate, converting
character data into current database encoding. character data into the current database encoding. Encoding conversions
are handled the same as for <function>ssl_client_dn</>.
</para> </para>
<para> <para>
The combination of the return value of this function with the The combination of the return value of this function with the
certificate serial number uniquely identifies the certificate. certificate serial number uniquely identifies the certificate.
</para> </para>
<para> <para>
The result of this function is really useful only if you have more This function is really useful only if you have more than one trusted CA
than one trusted CA certificate in your server's root.crt file, or if certificate in your server's <filename>root.crt</> file, or if this CA
this CA has issued some intermediate certificate authority has issued some intermediate certificate authority certificates.
certificates.
</para> </para>
</listitem> </listitem>
</varlistentry>
<listitem>
<programlisting> <varlistentry>
ssl_client_dn_field(fieldName text) RETURNS text <term><function>
</programlisting> ssl_client_dn_field(fieldname text) returns text
</function></term>
<listitem>
<para> <para>
This function returns the value of the specified field in the This function returns the value of the specified field in the
certificate subject. Field names are string constants that are certificate subject, or NULL if the field is not present.
converted into ASN1 object identificators using the OpenSSL object Field names are string constants that are
converted into ASN1 object identifiers using the OpenSSL object
database. The following values are acceptable: database. The following values are acceptable:
</para> </para>
<programlisting> <programlisting>
@ -113,7 +128,7 @@ commonName (alias CN)
surname (alias SN) surname (alias SN)
name name
givenName (alias GN) givenName (alias GN)
countryName (alias C) countryName (alias C)
localityName (alias L) localityName (alias L)
stateOrProvinceName (alias ST) stateOrProvinceName (alias ST)
organizationName (alias O) organizationName (alias O)
@ -127,38 +142,46 @@ generationQualifier
description description
dnQualifier dnQualifier
x500UniqueIdentifier x500UniqueIdentifier
pseudonim pseudonym
role role
emailAddress emailAddress
</programlisting> </programlisting>
<para> <para>
All of these fields are optional, except commonName. It depends All of these fields are optional, except <structfield>commonName</>.
entirely on your CA policy which of them would be included and which It depends
wouldn't. The meaning of these fields, howeer, is strictly defined by entirely on your CA's policy which of them would be included and which
wouldn't. The meaning of these fields, however, is strictly defined by
the X.500 and X.509 standards, so you cannot just assign arbitrary the X.500 and X.509 standards, so you cannot just assign arbitrary
meaning to them. meaning to them.
</para> </para>
</listitem> </listitem>
</varlistentry>
<listitem>
<programlisting> <varlistentry>
ssl_issuer_field(fieldName text) RETURNS text; <term><function>
</programlisting> ssl_issuer_field(fieldname text) returns text
</function></term>
<listitem>
<para> <para>
Does same as ssl_client_dn_field, but for the certificate issuer Same as <function>ssl_client_dn_field</>, but for the certificate issuer
rather than the certificate subject. rather than the certificate subject.
</para> </para>
</listitem> </listitem>
</itemizedlist> </varlistentry>
</variablelist>
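As a brief editorial illustration (not part of the committed SGML), here is how
several of these functions might be combined in one query, assuming the
sslinfo functions are installed and the session was opened over SSL with a
client certificate:
<programlisting>
-- Editorial sketch; these return FALSE or NULL in a non-SSL session.
SELECT ssl_is_used()                        AS ssl_in_use,
       ssl_client_cert_present()            AS cert_presented,
       ssl_client_serial()                  AS cert_serial,
       ssl_client_dn()                      AS subject_dn,
       ssl_client_dn_field('commonName')    AS subject_cn,
       ssl_issuer_field('organizationName') AS issuer_org;
</programlisting>
The serial-number/issuer pair could likewise be joined against a site-specific
mapping table to resolve the connecting application user, as the
ssl_client_serial() description above suggests.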
</sect2> </sect2>
<sect2> <sect2>
<title>Author</title> <title>Author</title>
<para> <para>
Victor Wagner <email>vitus@cryptocom.ru</email>, Cryptocom LTD Victor Wagner <email>vitus@cryptocom.ru</email>, Cryptocom LTD
E-Mail of Cryptocom OpenSSL development group: </para>
<para>
E-Mail of Cryptocom OpenSSL development group:
<email>openssl@cryptocom.ru</email> <email>openssl@cryptocom.ru</email>
</para> </para>
</sect2> </sect2>
</sect1>
</sect1>

File diff suppressed because it is too large


@ -1,4 +1,4 @@
<!-- $PostgreSQL: pgsql/doc/src/sgml/test-parser.sgml,v 1.1 2007/12/03 04:18:47 tgl Exp $ --> <!-- $PostgreSQL: pgsql/doc/src/sgml/test-parser.sgml,v 1.2 2007/12/06 04:12:10 tgl Exp $ -->
<sect1 id="test-parser"> <sect1 id="test-parser">
<title>test_parser</title> <title>test_parser</title>
@ -8,15 +8,18 @@
</indexterm> </indexterm>
<para> <para>
This is an example of a custom parser for full text search. <filename>test_parser</> is an example of a custom parser for full-text
search. It doesn't do anything especially useful, but can serve as
a starting point for developing your own parser.
</para> </para>
<para> <para>
It recognizes space-delimited words and returns just two token types: <filename>test_parser</> recognizes words separated by white space,
and returns just two token types:
<programlisting> <programlisting>
mydb=# SELECT * FROM ts_token_type('testparser'); mydb=# SELECT * FROM ts_token_type('testparser');
tokid | alias | description tokid | alias | description
-------+-------+--------------- -------+-------+---------------
3 | word | Word 3 | word | Word
12 | blank | Space symbols 12 | blank | Space symbols
@ -41,16 +44,16 @@ mydb=# SELECT * FROM ts_token_type('testparser');
<programlisting> <programlisting>
mydb=# SELECT * FROM ts_parse('testparser', 'That''s my first own parser'); mydb=# SELECT * FROM ts_parse('testparser', 'That''s my first own parser');
tokid | token tokid | token
-------+-------- -------+--------
3 | That's 3 | That's
12 | 12 |
3 | my 3 | my
12 | 12 |
3 | first 3 | first
12 | 12 |
3 | own 3 | own
12 | 12 |
3 | parser 3 | parser
</programlisting> </programlisting>
</para> </para>
@ -68,14 +71,14 @@ mydb-# ADD MAPPING FOR word WITH english_stem;
ALTER TEXT SEARCH CONFIGURATION ALTER TEXT SEARCH CONFIGURATION
mydb=# SELECT to_tsvector('testcfg', 'That''s my first own parser'); mydb=# SELECT to_tsvector('testcfg', 'That''s my first own parser');
to_tsvector to_tsvector
------------------------------- -------------------------------
'that':1 'first':3 'parser':5 'that':1 'first':3 'parser':5
(1 row) (1 row)
mydb=# SELECT ts_headline('testcfg', 'Supernovae stars are the brightest phenomena in galaxies', mydb=# SELECT ts_headline('testcfg', 'Supernovae stars are the brightest phenomena in galaxies',
mydb(# to_tsquery('testcfg', 'star')); mydb(# to_tsquery('testcfg', 'star'));
ts_headline ts_headline
----------------------------------------------------------------- -----------------------------------------------------------------
Supernovae &lt;b&gt;stars&lt;/b&gt; are the brightest phenomena in galaxies Supernovae &lt;b&gt;stars&lt;/b&gt; are the brightest phenomena in galaxies
(1 row) (1 row)


@ -1,6 +1,8 @@
<!-- $PostgreSQL: pgsql/doc/src/sgml/tsearch2.sgml,v 1.2 2007/12/06 04:12:10 tgl Exp $ -->
<sect1 id="tsearch2"> <sect1 id="tsearch2">
<title>tsearch2</title> <title>tsearch2</title>
<indexterm zone="tsearch2"> <indexterm zone="tsearch2">
<primary>tsearch2</primary> <primary>tsearch2</primary>
</indexterm> </indexterm>


@ -1,19 +1,26 @@
<!-- $PostgreSQL: pgsql/doc/src/sgml/uuid-ossp.sgml,v 1.2 2007/12/06 04:12:10 tgl Exp $ -->
<sect1 id="uuid-ossp"> <sect1 id="uuid-ossp">
<title>uuid-ossp</title> <title>uuid-ossp</title>
<indexterm zone="uuid-ossp"> <indexterm zone="uuid-ossp">
<primary>uuid-ossp</primary> <primary>uuid-ossp</primary>
</indexterm> </indexterm>
<para> <para>
This module provides functions to generate universally unique The <filename>uuid-ossp</> module provides functions to generate universally
identifiers (UUIDs) using one of the several standard algorithms, as unique identifiers (UUIDs) using one of several standard algorithms. There
well as functions to produce certain special UUID constants. are also functions to produce certain special UUID constants.
</para>
<para>
This module depends on the OSSP UUID library, which can be found at
<ulink url="http://www.ossp.org/pkg/lib/uuid/"></ulink>.
</para> </para>
<sect2> <sect2>
<title>UUID Generation</title> <title><literal>uuid-ossp</literal> Functions</title>
<para> <para>
The relevant standards ITU-T Rec. X.667, ISO/IEC 9834-8:2005, and RFC The relevant standards ITU-T Rec. X.667, ISO/IEC 9834-8:2005, and RFC
4122 specify four algorithms for generating UUIDs, identified by the 4122 specify four algorithms for generating UUIDs, identified by the
@ -23,7 +30,7 @@
</para> </para>
<table> <table>
<title><literal>uuid-ossp</literal> functions</title> <title>Functions for UUID Generation</title>
<tgroup cols="2"> <tgroup cols="2">
<thead> <thead>
<row> <row>
@ -59,22 +66,9 @@
<para> <para>
This function generates a version 3 UUID in the given namespace using This function generates a version 3 UUID in the given namespace using
the specified input name. The namespace should be one of the special the specified input name. The namespace should be one of the special
constants produced by the uuid_ns_*() functions shown below. (It constants produced by the <function>uuid_ns_*()</> functions shown
could be any UUID in theory.) The name is an identifier in the below. (It could be any UUID in theory.) The name is an identifier
selected namespace. For example: in the selected namespace.
</para>
</entry>
</row>
<row>
<entry><literal>uuid_generate_v3(uuid_ns_url(), 'http://www.postgresql.org')</literal></entry>
<entry>
<para>
The name parameter will be MD5-hashed, so the cleartext cannot be
derived from the generated UUID.
</para>
<para>
The generation of UUIDs by this method has no random or
environment-dependent element and is therefore reproducible.
</para> </para>
</entry> </entry>
</row> </row>
@ -102,15 +96,28 @@
</tgroup> </tgroup>
</table> </table>
<para>
For example:
<programlisting>
SELECT uuid_generate_v3(uuid_ns_url(), 'http://www.postgresql.org');
</programlisting>
The name parameter will be MD5-hashed, so the cleartext cannot be
derived from the generated UUID.
The generation of UUIDs by this method has no random or
environment-dependent element and is therefore reproducible.
</para>
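As a further editorial example, the module's other generator functions can be
exercised the same way (version 1 is time/MAC-address based, version 4 is
purely random, and version 5 is the SHA-1 counterpart of version 3):
<programlisting>
SELECT uuid_generate_v1() AS v1_time_based,
       uuid_generate_v4() AS v4_random,
       uuid_generate_v5(uuid_ns_dns(), 'postgresql.org') AS v5_name_based;
</programlisting>
As with the version-3 call, the version-5 result is reproducible for a given
namespace and name, while versions 1 and 4 yield a new value on every call.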
<table> <table>
<title>UUID Constants</title> <title>Functions Returning UUID Constants</title>
<tgroup cols="2"> <tgroup cols="2">
<tbody> <tbody>
<row> <row>
<entry><literal>uuid_nil()</literal></entry> <entry><literal>uuid_nil()</literal></entry>
<entry> <entry>
<para> <para>
A "nil" UUID constant, which does not occur as a real UUID. A <quote>nil</> UUID constant, which does not occur as a real UUID.
</para> </para>
</entry> </entry>
</row> </row>
@ -135,8 +142,8 @@
<entry> <entry>
<para> <para>
Constant designating the ISO object identifier (OID) namespace for Constant designating the ISO object identifier (OID) namespace for
UUIDs. (This pertains to ASN.1 OIDs, unrelated to the OIDs used in UUIDs. (This pertains to ASN.1 OIDs, which are unrelated to the OIDs
PostgreSQL.) used in <productname>PostgreSQL</>.)
</para> </para>
</entry> </entry>
</row> </row>
@ -153,11 +160,14 @@
</tgroup> </tgroup>
</table> </table>
</sect2> </sect2>
<sect2> <sect2>
<title>Author</title> <title>Author</title>
<para> <para>
Peter Eisentraut <email>peter_e@gmx.net</email> Peter Eisentraut <email>peter_e@gmx.net</email>
</para> </para>
</sect2>
</sect1>
</sect2>
</sect1>


@ -1,74 +1,110 @@
<!-- $PostgreSQL: pgsql/doc/src/sgml/vacuumlo.sgml,v 1.2 2007/12/06 04:12:10 tgl Exp $ -->
<sect1 id="vacuumlo"> <sect1 id="vacuumlo">
<title>vacuumlo</title> <title>vacuumlo</title>
<indexterm zone="vacuumlo"> <indexterm zone="vacuumlo">
<primary>vacuumlo</primary> <primary>vacuumlo</primary>
</indexterm> </indexterm>
<para> <para>
This is a simple utility that will remove any orphaned large objects out of a <application>vacuumlo</> is a simple utility program that will remove any
PostgreSQL database. An orphaned LO is considered to be any LO whose OID <quote>orphaned</> large objects from a
does not appear in any OID data column of the database. <productname>PostgreSQL</> database. An orphaned large object (LO) is
considered to be any LO whose OID does not appear in any <type>oid</> or
<type>lo</> data column of the database.
</para> </para>
<para> <para>
If you use this, you may also be interested in the lo_manage trigger in If you use this, you may also be interested in the <function>lo_manage</>
contrib/lo. lo_manage is useful to try to avoid creating orphaned LOs trigger in <filename>contrib/lo</> (see <xref linkend="lo">).
in the first place. <function>lo_manage</> is useful to try
</para> to avoid creating orphaned LOs in the first place.
<para>
<note>
<para>
It was decided to place this in contrib as it needs further testing, but hopefully,
this (or a variant of it) would make it into the backend as a "vacuum lo"
command in a later release.
</para>
</note>
</para> </para>
<sect2> <sect2>
<title>Usage</title> <title>Usage</title>
<programlisting>
vacuumlo [options] database [database2 ... databasen] <synopsis>
</programlisting> vacuumlo [options] database [database2 ... databaseN]
</synopsis>
<para> <para>
All databases named on the command line are processed. Available options All databases named on the command line are processed. Available options
include: include:
</para> </para>
<programlisting>
-v Write a lot of progress messages <variablelist>
-n Don't remove large objects, just show what would be done <varlistentry>
-U username Username to connect as <term><option>-v</option></term>
-W Prompt for password <listitem>
-h hostname Database server host <para>Write a lot of progress messages</para>
-p port Database server port </listitem>
</programlisting> </varlistentry>
<varlistentry>
<term><option>-n</option></term>
<listitem>
<para>Don't remove anything, just show what would be done</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>-U</option> <replaceable>username</></term>
<listitem>
<para>Username to connect as</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>-W</option></term>
<listitem>
<para>Force prompt for password (generally useless)</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>-h</option> <replaceable>hostname</></term>
<listitem>
<para>Database server's host</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>-p</option> <replaceable>port</></term>
<listitem>
<para>Database server's port</para>
</listitem>
</varlistentry>
</variablelist>
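For instance (an editorial example; the connection settings and the database
name mydb are placeholders), a dry run that only reports what would be removed
could look like:
<programlisting>
vacuumlo -v -n -U postgres -h localhost -p 5432 mydb
</programlisting>
Dropping -n then performs the actual deletions.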
</sect2> </sect2>
<sect2> <sect2>
<title>Method</title> <title>Method</title>
<para> <para>
First, it builds a temporary table which contains all of the OIDs of the First, it builds a temporary table which contains all of the OIDs of the
large objects in that database. large objects in that database.
</para> </para>
<para> <para>
It then scans through all columns in the database that are of type "oid" It then scans through all columns in the database that are of type
or "lo", and removes matching entries from the temporary table. <type>oid</> or <type>lo</>, and removes matching entries from the
temporary table.
</para> </para>
<para> <para>
The remaining entries in the temp table identify orphaned LOs. These are The remaining entries in the temp table identify orphaned LOs.
removed. These are removed.
</para> </para>
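The same idea can be sketched in plain SQL for a single column (an editorial
illustration only; my_table and its lo_col column are hypothetical, and
vacuumlo repeats the equivalent of this for every oid/lo column it finds):
<programlisting>
-- Large objects not referenced by this particular column.
SELECT DISTINCT loid
FROM pg_largeobject
EXCEPT
SELECT lo_col FROM my_table WHERE lo_col IS NOT NULL;
</programlisting>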
</sect2> </sect2>
<sect2> <sect2>
<title>Author</title> <title>Author</title>
<para> <para>
Peter Mount <email>peter@retep.org.uk</email> Peter Mount <email>peter@retep.org.uk</email>
</para> </para>
<para>
<ulink url="http://www.retep.org.uk"></ulink>
</para>
</sect2> </sect2>
</sect1> </sect1>


@ -1,31 +1,41 @@
<!-- $PostgreSQL: pgsql/doc/src/sgml/xml2.sgml,v 1.4 2007/12/06 04:12:10 tgl Exp $ -->
<sect1 id="xml2"> <sect1 id="xml2">
<title>xml2: XML-handling functions</title> <title>xml2</title>
<indexterm zone="xml2"> <indexterm zone="xml2">
<primary>xml2</primary> <primary>xml2</primary>
</indexterm> </indexterm>
<para>
The <filename>xml2</> module provides XPath querying and
XSLT functionality.
</para>
<sect2> <sect2>
<title>Deprecation notice</title> <title>Deprecation notice</title>
<para> <para>
From PostgreSQL 8.3 on, there is XML-related From <productname>PostgreSQL</> 8.3 on, there is XML-related
functionality based on the SQL/XML standard in the core server. functionality based on the SQL/XML standard in the core server.
That functionality covers XML syntax checking and XPath queries, That functionality covers XML syntax checking and XPath queries,
which is what this module does as well, and more, but the API is which is what this module does, and more, but the API is
not at all compatible. It is planned that this module will be not at all compatible. It is planned that this module will be
removed in PostgreSQL 8.4 in favor of the newer standard API, so removed in PostgreSQL 8.4 in favor of the newer standard API, so
you are encouraged to try converting your applications. If you you are encouraged to try converting your applications. If you
find that some of the functionality of this module is not find that some of the functionality of this module is not
available in an adequate form with the newer API, please explain available in an adequate form with the newer API, please explain
your issue to pgsql-hackers@postgresql.org so that the deficiency your issue to pgsql-hackers@postgresql.org so that the deficiency
can be addressed. can be addressed.
</para> </para>
</sect2> </sect2>
<sect2> <sect2>
<title>Description of functions</title> <title>Description of functions</title>
<para> <para>
The first set of functions are straightforward XML parsing and XPath queries: These functions provide straightforward XML parsing and XPath queries.
All arguments are of type <type>text</>, so for brevity that is not shown.
</para> </para>
<table> <table>
@ -34,27 +44,27 @@
<tbody> <tbody>
<row> <row>
<entry> <entry>
<programlisting> <synopsis>
xml_is_well_formed(document) RETURNS bool xml_is_well_formed(document) returns bool
</programlisting> </synopsis>
</entry> </entry>
<entry> <entry>
<para> <para>
This parses the document text in its parameter and returns true if the This parses the document text in its parameter and returns true if the
document is well-formed XML. (Note: before PostgreSQL 8.2, this function document is well-formed XML. (Note: before PostgreSQL 8.2, this
was called xml_valid(). That is the wrong name since validity and function was called <function>xml_valid()</>. That is the wrong name
well-formedness have different meanings in XML. The old name is still since validity and well-formedness have different meanings in XML.
available, but is deprecated and will be removed in 8.3.) The old name is still available, but is deprecated.)
</para> </para>
</entry> </entry>
</row> </row>
<row> <row>
<entry> <entry>
<programlisting> <synopsis>
xpath_string(document,query) RETURNS text xpath_string(document,query) returns text
xpath_number(document,query) RETURNS float4 xpath_number(document,query) returns float4
xpath_bool(document,query) RETURNS bool xpath_bool(document,query) returns bool
</programlisting> </synopsis>
</entry> </entry>
<entry> <entry>
<para> <para>
@ -65,9 +75,9 @@
</row> </row>
<row> <row>
<entry> <entry>
<programlisting> <synopsis>
xpath_nodeset(document,query,toptag,itemtag) RETURNS text xpath_nodeset(document,query,toptag,itemtag) returns text
</programlisting> </synopsis>
</entry> </entry>
<entry> <entry>
<para> <para>
@ -75,10 +85,10 @@
the result is multivalued, the output will look like: the result is multivalued, the output will look like:
</para> </para>
<literal> <literal>
&lt;toptag> &lt;toptag&gt;
&lt;itemtag>Value 1 which could be an XML fragment&lt;/itemtag> &lt;itemtag&gt;Value 1 which could be an XML fragment&lt;/itemtag&gt;
&lt;itemtag>Value 2....&lt;/itemtag> &lt;itemtag&gt;Value 2....&lt;/itemtag&gt;
&lt;/toptag> &lt;/toptag&gt;
</literal> </literal>
<para> <para>
If either toptag or itemtag is an empty string, the relevant tag is omitted. If either toptag or itemtag is an empty string, the relevant tag is omitted.
@ -87,49 +97,51 @@
</row> </row>
<row> <row>
<entry> <entry>
<programlisting> <synopsis>
xpath_nodeset(document,query) RETURNS xpath_nodeset(document,query) returns text
</programlisting> </synopsis>
</entry> </entry>
<entry> <entry>
<para> <para>
Like xpath_nodeset(document,query,toptag,itemtag) but text omits both tags. Like xpath_nodeset(document,query,toptag,itemtag) but result omits both tags.
</para> </para>
</entry> </entry>
</row> </row>
<row> <row>
<entry> <entry>
<programlisting> <synopsis>
xpath_nodeset(document,query,itemtag) RETURNS xpath_nodeset(document,query,itemtag) returns text
</programlisting> </synopsis>
</entry> </entry>
<entry> <entry>
<para> <para>
Like xpath_nodeset(document,query,toptag,itemtag) but text omits toptag. Like xpath_nodeset(document,query,toptag,itemtag) but result omits toptag.
</para> </para>
</entry> </entry>
</row> </row>
<row> <row>
<entry> <entry>
<programlisting> <synopsis>
xpath_list(document,query,seperator) RETURNS text xpath_list(document,query,separator) returns text
</programlisting> </synopsis>
</entry> </entry>
<entry> <entry>
<para> <para>
This function returns multiple values seperated by the specified This function returns multiple values separated by the specified
seperator, e.g. Value 1,Value 2,Value 3 if seperator=','. separator, for example <literal>Value 1,Value 2,Value 3</> if
separator is <literal>,</>.
</para> </para>
</entry> </entry>
</row> </row>
<row> <row>
<entry> <entry>
<programlisting> <synopsis>
xpath_list(document,query) RETURNS text xpath_list(document,query) returns text
</programlisting> </synopsis>
</entry> </entry>
<entry> <entry>
This is a wrapper for the above function that uses ',' as the seperator. This is a wrapper for the above function that uses <literal>,</>
as the separator.
</entry> </entry>
</row> </row>
</tbody> </tbody>
@ -137,38 +149,37 @@
</table> </table>
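As an editorial illustration of the functions above, using small inline
documents (in practice the document argument is usually a table column):
<programlisting>
SELECT xml_is_well_formed('&lt;doc&gt;&lt;n&gt;42&lt;/n&gt;&lt;/doc&gt;') AS well_formed,
       xpath_number('&lt;doc&gt;&lt;n&gt;42&lt;/n&gt;&lt;/doc&gt;', '/doc/n') AS n,
       xpath_list('&lt;l&gt;&lt;i&gt;a&lt;/i&gt;&lt;i&gt;b&lt;/i&gt;&lt;/l&gt;', '/l/i', ';') AS items;
</programlisting>
The first column should come back true, the second as the number 42, and the
third as the two item values joined by the chosen separator.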
</sect2> </sect2>
<sect2> <sect2>
<title><literal>xpath_table</literal></title> <title><literal>xpath_table</literal></title>
<synopsis>
xpath_table(text key, text document, text relation, text xpaths, text criteria) returns setof record
</synopsis>
<para> <para>
This is a table function which evaluates a set of XPath queries on <function>xpath_table</> is a table function that evaluates a set of XPath
each of a set of documents and returns the results as a table. The queries on each of a set of documents and returns the results as a
primary key field from the original document table is returned as the table. The primary key field from the original document table is returned
first column of the result so that the resultset from xpath_table can as the first column of the result so that the result set
be readily used in joins. can readily be used in joins.
</para> </para>
<para>
The function itself takes 5 arguments, all text.
</para>
<programlisting>
xpath_table(key,document,relation,xpaths,criteria)
</programlisting>
<table> <table>
<title>Parameters</title> <title>Parameters</title>
<tgroup cols="2"> <tgroup cols="2">
<tbody> <tbody>
<row> <row>
<entry><literal>key</literal></entry> <entry><parameter>key</parameter></entry>
<entry> <entry>
<para> <para>
the name of the "key" field - this is just a field to be used as the name of the <quote>key</> field &mdash; this is just a field to be used as
the first column of the output table i.e. it identifies the record from the first column of the output table, i.e. it identifies the record from
which each output row came (see note below about multiple values). which each output row came (see note below about multiple values)
</para> </para>
</entry> </entry>
</row> </row>
<row> <row>
<entry><literal>document</literal></entry> <entry><parameter>document</parameter></entry>
<entry> <entry>
<para> <para>
the name of the field containing the XML document the name of the field containing the XML document
@ -176,7 +187,7 @@
</entry> </entry>
</row> </row>
<row> <row>
<entry><literal>relation</literal></entry> <entry><parameter>relation</parameter></entry>
<entry> <entry>
<para> <para>
the name of the table or view containing the documents the name of the table or view containing the documents
@ -184,20 +195,20 @@
</entry> </entry>
</row> </row>
<row> <row>
<entry><literal>xpaths</literal></entry> <entry><parameter>xpaths</parameter></entry>
<entry> <entry>
<para> <para>
multiple xpath expressions separated by <literal>|</literal> one or more XPath expressions, separated by <literal>|</literal>
</para> </para>
</entry> </entry>
</row> </row>
<row> <row>
<entry><literal>criteria</literal></entry> <entry><parameter>criteria</parameter></entry>
<entry> <entry>
<para> <para>
The contents of the where clause. This needs to be specified, the contents of the WHERE clause. This cannot be omitted, so use
so use "true" or "1=1" here if you want to process all the rows in the <literal>true</literal> or <literal>1=1</literal> if you want to
relation. process all the rows in the relation
</para> </para>
</entry> </entry>
</row> </row>
@ -206,75 +217,75 @@
</table> </table>
<para> <para>
NB These parameters (except the XPath strings) are just substituted These parameters (except the XPath strings) are just substituted
into a plain SQL SELECT statement, so you have some flexibility - the into a plain SQL SELECT statement, so you have some flexibility &mdash; the
statement is statement is
</para> </para>
<para> <para>
<literal> <literal>
SELECT &lt;key>,&lt;document> FROM &lt;relation> WHERE &lt;criteria> SELECT &lt;key&gt;, &lt;document&gt; FROM &lt;relation&gt; WHERE &lt;criteria&gt;
</literal> </literal>
</para> </para>
<para> <para>
so those parameters can be *anything* valid in those particular so those parameters can be <emphasis>anything</> valid in those particular
locations. The result from this SELECT needs to return exactly two locations. The result from this SELECT needs to return exactly two
columns (which it will unless you try to list multiple fields for key columns (which it will unless you try to list multiple fields for key
or document). Beware that this simplistic approach requires that you or document). Beware that this simplistic approach requires that you
validate any user-supplied values to avoid SQL injection attacks. validate any user-supplied values to avoid SQL injection attacks.
</para> </para>
<para> <para>
Using the function The function has to be used in a <literal>FROM</> expression, with an
<literal>AS</> clause to specify the output columns; for example
</para> </para>
<para>
The function has to be used in a FROM expression. This gives the following
form:
</para>
<programlisting> <programlisting>
SELECT * FROM SELECT * FROM
xpath_table('article_id', xpath_table('article_id',
'article_xml', 'article_xml',
'articles', 'articles',
'/article/author|/article/pages|/article/title', '/article/author|/article/pages|/article/title',
'date_entered > ''2003-01-01'' ') 'date_entered > ''2003-01-01'' ')
AS t(article_id integer, author text, page_count integer, title text); AS t(article_id integer, author text, page_count integer, title text);
</programlisting> </programlisting>
<para> <para>
The AS clause defines the names and types of the columns in the The <literal>AS</> clause defines the names and types of the columns in the
virtual table. If there are more XPath queries than result columns, output table. The first is the <quote>key</> field and the rest correspond
to the XPath queries.
If there are more XPath queries than result columns,
the extra queries will be ignored. If there are more result columns the extra queries will be ignored. If there are more result columns
than XPath queries, the extra columns will be NULL. than XPath queries, the extra columns will be NULL.
</para> </para>
<para> <para>
Note that I've said in this example that pages is an integer. The Notice that this example defines the <structname>page_count</> result
function deals internally with string representations, so when you say column as an integer. The function deals internally with string
you want an integer in the output, it will take the string representations, so when you say you want an integer in the output, it will
representation of the XPath result and use PostgreSQL input functions take the string representation of the XPath result and use PostgreSQL input
to transform it into an integer (or whatever type the AS clause functions to transform it into an integer (or whatever type the <type>AS</>
requests). An error will result if it can't do this - for example if clause requests). An error will result if it can't do this &mdash; for
the result is empty - so you may wish to just stick to 'text' as the example if the result is empty &mdash; so you may wish to just stick to
column type if you think your data has any problems. <type>text</> as the column type if you think your data has any problems.
</para> </para>
<para> <para>
The select statement doesn't need to use * alone - it can reference the The calling <command>SELECT</> statement doesn't necessarily have be
to be just <literal>SELECT *</> &mdash; it can reference the output
columns by name or join them to other tables. The function produces a columns by name or join them to other tables. The function produces a
virtual table with which you can perform any operation you wish (e.g. virtual table with which you can perform any operation you wish (e.g.
aggregation, joining, sorting etc). So we could also have: aggregation, joining, sorting etc). So we could also have:
</para> </para>
<programlisting> <programlisting>
SELECT t.title, p.fullname, p.email SELECT t.title, p.fullname, p.email
FROM xpath_table('article_id','article_xml','articles', FROM xpath_table('article_id', 'article_xml', 'articles',
'/article/title|/article/author/@id', '/article/title|/article/author/@id',
'xpath_string(article_xml,''/article/@date'') > ''2003-03-20'' ') 'xpath_string(article_xml,''/article/@date'') > ''2003-03-20'' ')
AS t(article_id integer, title text, author_id integer), AS t(article_id integer, title text, author_id integer),
tblPeopleInfo AS p tblPeopleInfo AS p
WHERE t.author_id = p.person_id; WHERE t.author_id = p.person_id;
</programlisting> </programlisting>
@ -282,91 +293,74 @@ WHERE t.author_id = p.person_id;
as a more complicated example. Of course, you could wrap all as a more complicated example. Of course, you could wrap all
of this in a view for convenience. of this in a view for convenience.
</para> </para>
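Taking up that suggestion, an editorial sketch of such a view (reusing the
hypothetical articles table and column names from the example above):
<programlisting>
CREATE VIEW article_metadata AS
  SELECT * FROM
    xpath_table('article_id', 'article_xml', 'articles',
                '/article/author|/article/pages|/article/title',
                'true')
    AS t(article_id integer, author text, page_count integer, title text);
</programlisting>
Queries can then filter and join article_metadata like any ordinary relation.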
<sect3> <sect3>
<title>Multivalued results</title> <title>Multivalued results</title>
<para> <para>
The xpath_table function assumes that the results of each XPath query The <function>xpath_table</> function assumes that the results of each XPath query
might be multi-valued, so the number of rows returned by the function might be multi-valued, so the number of rows returned by the function
may not be the same as the number of input documents. The first row may not be the same as the number of input documents. The first row
returned contains the first result from each query, the second row the returned contains the first result from each query, the second row the
second result from each query. If one of the queries has fewer values second result from each query. If one of the queries has fewer values
than the others, NULLs will be returned instead. than the others, NULLs will be returned instead.
</para> </para>
<para> <para>
In some cases, a user will know that a given XPath query will return In some cases, a user will know that a given XPath query will return
only a single result (perhaps a unique document identifier) - if used only a single result (perhaps a unique document identifier) &mdash; if used
alongside an XPath query returning multiple results, the single-valued alongside an XPath query returning multiple results, the single-valued
result will appear only on the first row of the result. The solution result will appear only on the first row of the result. The solution
to this is to use the key field as part of a join against a simpler to this is to use the key field as part of a join against a simpler
XPath query. As an example: XPath query. As an example:
</para> </para>
<para>
<literal>
CREATE TABLE test
(
id int4 NOT NULL,
xml text,
CONSTRAINT pk PRIMARY KEY (id)
)
WITHOUT OIDS;
INSERT INTO test VALUES (1, '&lt;doc num="C1">
&lt;line num="L1">&lt;a>1&lt;/a>&lt;b>2&lt;/b>&lt;c>3&lt;/c>&lt;/line>
&lt;line num="L2">&lt;a>11&lt;/a>&lt;b>22&lt;/b>&lt;c>33&lt;/c>&lt;/line>
&lt;/doc>');
INSERT INTO test VALUES (2, '&lt;doc num="C2">
&lt;line num="L1">&lt;a>111&lt;/a>&lt;b>222&lt;/b>&lt;c>333&lt;/c>&lt;/line>
&lt;line num="L2">&lt;a>111&lt;/a>&lt;b>222&lt;/b>&lt;c>333&lt;/c>&lt;/line>
&lt;/doc>');
</literal>
</para>
</sect3>
<sect3>
<title>The query</title>
<programlisting> <programlisting>
SELECT * FROM xpath_table('id','xml','test', CREATE TABLE test (
'/doc/@num|/doc/line/@num|/doc/line/a|/doc/line/b|/doc/line/c','1=1') id int4 NOT NULL,
AS t(id int4, doc_num varchar(10), line_num varchar(10), val1 int4, xml text,
val2 int4, val3 int4) CONSTRAINT pk PRIMARY KEY (id)
WHERE id = 1 ORDER BY doc_num, line_num );
INSERT INTO test VALUES (1, '&lt;doc num="C1"&gt;
&lt;line num="L1"&gt;&lt;a&gt;1&lt;/a&gt;&lt;b&gt;2&lt;/b&gt;&lt;c&gt;3&lt;/c&gt;&lt;/line&gt;
&lt;line num="L2"&gt;&lt;a&gt;11&lt;/a&gt;&lt;b&gt;22&lt;/b&gt;&lt;c&gt;33&lt;/c&gt;&lt;/line&gt;
&lt;/doc&gt;');
INSERT INTO test VALUES (2, '&lt;doc num="C2"&gt;
&lt;line num="L1"&gt;&lt;a&gt;111&lt;/a&gt;&lt;b&gt;222&lt;/b&gt;&lt;c&gt;333&lt;/c&gt;&lt;/line&gt;
&lt;line num="L2"&gt;&lt;a&gt;111&lt;/a&gt;&lt;b&gt;222&lt;/b&gt;&lt;c&gt;333&lt;/c&gt;&lt;/line&gt;
&lt;/doc&gt;');
SELECT * FROM
xpath_table('id','xml','test',
'/doc/@num|/doc/line/@num|/doc/line/a|/doc/line/b|/doc/line/c',
'true')
AS t(id int4, doc_num varchar(10), line_num varchar(10), val1 int4, val2 int4, val3 int4)
WHERE id = 1 ORDER BY doc_num, line_num
id | doc_num | line_num | val1 | val2 | val3
----+---------+----------+------+------+------
1 | C1 | L1 | 1 | 2 | 3
1 | | L2 | 11 | 22 | 33
</programlisting> </programlisting>
<para> <para>
Gives the result: To get doc_num on every line, the solution is to use two invocations
of xpath_table and join the results:
</para> </para>
<programlisting> <programlisting>
id | doc_num | line_num | val1 | val2 | val3 SELECT t.*,i.doc_num FROM
----+---------+----------+------+------+------ xpath_table('id', 'xml', 'test',
1 | C1 | L1 | 1 | 2 | 3 '/doc/line/@num|/doc/line/a|/doc/line/b|/doc/line/c',
1 | | L2 | 11 | 22 | 33 'true')
</programlisting> AS t(id int4, line_num varchar(10), val1 int4, val2 int4, val3 int4),
xpath_table('id', 'xml', 'test', '/doc/@num', 'true')
<para> AS i(id int4, doc_num varchar(10))
To get doc_num on every line, the solution is to use two invocations
of xpath_table and join the results:
</para>
<programlisting>
SELECT t.*,i.doc_num FROM
xpath_table('id','xml','test',
'/doc/line/@num|/doc/line/a|/doc/line/b|/doc/line/c','1=1')
AS t(id int4, line_num varchar(10), val1 int4, val2 int4, val3 int4),
xpath_table('id','xml','test','/doc/@num','1=1')
AS i(id int4, doc_num varchar(10))
WHERE i.id=t.id AND i.id=1 WHERE i.id=t.id AND i.id=1
ORDER BY doc_num, line_num; ORDER BY doc_num, line_num;
</programlisting>
<para>
which gives the desired result:
</para>
<programlisting>
id | line_num | val1 | val2 | val3 | doc_num id | line_num | val1 | val2 | val3 | doc_num
----+----------+------+------+------+--------- ----+----------+------+------+------+---------
1 | L1 | 1 | 2 | 3 | C1 1 | L1 | 1 | 2 | 3 | C1
@ -375,62 +369,58 @@ WHERE t.author_id = p.person_id;
</programlisting> </programlisting>
</sect3> </sect3>
</sect2> </sect2>
<sect2> <sect2>
<title>XSLT functions</title> <title>XSLT functions</title>
<para> <para>
The following functions are available if libxslt is installed (this is The following functions are available if libxslt is installed (this is
not currently detected automatically, so you will have to amend the not currently detected automatically, so you will have to amend the
Makefile) Makefile):
</para> </para>
<sect3> <sect3>
<title><literal>xslt_process</literal></title> <title><literal>xslt_process</literal></title>
<programlisting>
xslt_process(document,stylesheet,paramlist) RETURNS text <synopsis>
</programlisting> xslt_process(text document, text stylesheet, text paramlist) returns text
</synopsis>
<para> <para>
This function applies the XSL stylesheet to the document and returns This function applies the XSL stylesheet to the document and returns
the transformed result. The paramlist is a list of parameter the transformed result. The paramlist is a list of parameter
assignments to be used in the transformation, specified in the form assignments to be used in the transformation, specified in the form
'a=1,b=2'. Note that this is also proof-of-concept code and the <literal>a=1,b=2</>. Note that the
parameter parsing is very simple-minded (e.g. parameter values cannot parameter parsing is very simple-minded: parameter values cannot
contain commas!) contain commas!
</para> </para>
<para> <para>
Also note that if either the document or stylesheet values do not Also note that if either the document or stylesheet values do not
begin with a &lt; then they will be treated as URLs and libxslt will begin with a &lt; then they will be treated as URLs and libxslt will
fetch them. It thus follows that you can use xslt_process as a means fetch them. It follows that you can use <function>xslt_process</> as a
to fetch the contents of URLs - you should be aware of the security means to fetch the contents of URLs &mdash; you should be aware of the
implications of this. security implications of this.
</para> </para>
<para> <para>
There is also a two-parameter version of xslt_process which does not There is also a two-parameter version of <function>xslt_process</> which
pass any parameters to the transformation. does not pass any parameters to the transformation.
</para> </para>
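A small editorial example (the document and stylesheet are inline literals of
my own; the parameter is numeric, which sidesteps any quoting questions for
string-valued parameters):
<programlisting>
-- Pick the Nth item of a list; N is supplied through the parameter list.
SELECT xslt_process(
  '&lt;list&gt;&lt;item&gt;a&lt;/item&gt;&lt;item&gt;b&lt;/item&gt;&lt;/list&gt;',
  '&lt;xsl:stylesheet version="1.0"
       xmlns:xsl="http://www.w3.org/1999/XSL/Transform"&gt;
     &lt;xsl:param name="n" select="1"/&gt;
     &lt;xsl:template match="/list"&gt;
       &lt;picked&gt;&lt;xsl:copy-of select="item[$n]"/&gt;&lt;/picked&gt;
     &lt;/xsl:template&gt;
   &lt;/xsl:stylesheet&gt;',
  'n=2');
</programlisting>
With n=2 the result should be the second item element wrapped in a picked
element; the two-parameter form would instead use the stylesheet's default of 1.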
</sect3> </sect3>
</sect2> </sect2>
<sect2> <sect2>
<title>Credits</title> <title>Author</title>
<para> <para>
Development of this module was sponsored by Torchbox Ltd. (www.torchbox.com) John Gray <email>jgray@azuli.co.uk</email>
</para>
<para>
Development of this module was sponsored by Torchbox Ltd. (www.torchbox.com).
It has the same BSD licence as PostgreSQL. It has the same BSD licence as PostgreSQL.
</para> </para>
<para>
This version of the XML functions provides both XPath querying and
XSLT functionality. There is also a new table function which allows
the straightforward return of multiple XML results. Note that the current code
doesn't take any particular care over character sets - this is
something that should be fixed at some point!
</para>
<para>
If you have any comments or suggestions, please do contact me at
<email>jgray@azuli.co.uk.</email> Unfortunately, this isn't my main job, so
I can't guarantee a rapid response to your query!
</para>
</sect2> </sect2>
</sect1>
</sect1>