mirror of
https://github.com/postgres/postgres.git
synced 2025-04-22 23:02:54 +03:00
Make an editorial pass over the newly SGML-ified contrib documentation.
Fix lots of bad markup, bad English, bad explanations. This commit covers only about half the contrib modules, but I grow weary...
parent
a37a0a4180
commit
53e99f57fc
@ -1,3 +1,5 @@
|
||||
<!-- $PostgreSQL: pgsql/doc/src/sgml/adminpack.sgml,v 1.3 2007/12/06 04:12:09 tgl Exp $ -->
|
||||
|
||||
<sect1 id="adminpack">
|
||||
<title>adminpack</title>
|
||||
|
||||
@ -6,31 +8,33 @@
|
||||
</indexterm>
|
||||
|
||||
<para>
|
||||
adminpack is a PostgreSQL standard module that implements a number of
|
||||
support functions which pgAdmin and other administration and management tools
|
||||
can use to provide additional functionality if installed on a server.
|
||||
<filename>adminpack</> provides a number of support functions which
|
||||
<application>pgAdmin</> and other administration and management tools can
|
||||
use to provide additional functionality, such as remote management
|
||||
of server log files.
|
||||
</para>
|
||||
|
||||
<sect2>
|
||||
<title>Functions implemented</title>
|
||||
<para>
|
||||
Functions implemented by adminpack can only be run by a superuser. Here's a
|
||||
list of these functions:
|
||||
</para>
|
||||
<para>
|
||||
<programlisting>
|
||||
int8 pg_catalog.pg_file_write(fname text, data text, append bool)
|
||||
bool pg_catalog.pg_file_rename(oldname text, newname text, archivname text)
|
||||
bool pg_catalog.pg_file_rename(oldname text, newname text)
|
||||
bool pg_catalog.pg_file_unlink(fname text)
|
||||
setof record pg_catalog.pg_logdir_ls()
|
||||
|
||||
/* Renaming of existing backend functions for pgAdmin compatibility */
|
||||
int8 pg_catalog.pg_file_read(fname text, data text, append bool)
|
||||
bigint pg_catalog.pg_file_length(text)
|
||||
int4 pg_catalog.pg_logfile_rotate()
|
||||
</programlisting>
|
||||
<para>
|
||||
The functions implemented by <filename>adminpack</> can only be run by a
|
||||
superuser. Here's a list of these functions:
|
||||
|
||||
<programlisting>
|
||||
int8 pg_catalog.pg_file_write(fname text, data text, append bool)
|
||||
bool pg_catalog.pg_file_rename(oldname text, newname text, archivename text)
|
||||
bool pg_catalog.pg_file_rename(oldname text, newname text)
|
||||
bool pg_catalog.pg_file_unlink(fname text)
|
||||
setof record pg_catalog.pg_logdir_ls()
|
||||
|
||||
/* Renaming of existing backend functions for pgAdmin compatibility */
|
||||
int8 pg_catalog.pg_file_read(fname text, data text, append bool)
|
||||
bigint pg_catalog.pg_file_length(text)
|
||||
int4 pg_catalog.pg_logfile_rotate()
|
||||
</programlisting>
|
||||
</para>
|
||||
|
||||
</sect2>
|
||||
|
||||
</sect1>
|
||||
|
@ -1,3 +1,5 @@
|
||||
<!-- $PostgreSQL: pgsql/doc/src/sgml/btree-gist.sgml,v 1.4 2007/12/06 04:12:09 tgl Exp $ -->
|
||||
|
||||
<sect1 id="btree-gist">
|
||||
<title>btree_gist</title>
|
||||
|
||||
@ -6,32 +8,49 @@
|
||||
</indexterm>
|
||||
|
||||
<para>
|
||||
btree_gist is a B-Tree implementation using GiST that supports the int2, int4,
|
||||
int8, float4, float8 timestamp with/without time zone, time
|
||||
with/without time zone, date, interval, oid, money, macaddr, char,
|
||||
varchar/text, bytea, numeric, bit, varbit and inet/cidr types.
|
||||
<filename>btree_gist</> provides sample GiST operator classes that
|
||||
implement B-Tree equivalent behavior for the data types
|
||||
<type>int2</>, <type>int4</>, <type>int8</>, <type>float4</>,
|
||||
<type>float8</>, <type>numeric</>, <type>timestamp with time zone</>,
|
||||
<type>timestamp without time zone</>, <type>time with time zone</>,
|
||||
<type>time without time zone</>, <type>date</>, <type>interval</>,
|
||||
<type>oid</>, <type>money</>, <type>char</>,
|
||||
<type>varchar</>, <type>text</>, <type>bytea</>, <type>bit</>,
|
||||
<type>varbit</>, <type>macaddr</>, <type>inet</>, and <type>cidr</>.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
In general, these operator classes will not outperform the equivalent
|
||||
standard btree index methods, and they lack one major feature of the
|
||||
standard btree code: the ability to enforce uniqueness. However,
|
||||
they are useful for GiST testing and as a base for developing other
|
||||
GiST operator classes.
|
||||
</para>
|
||||
|
||||
<sect2>
|
||||
<title>Example usage</title>
|
||||
<programlisting>
|
||||
CREATE TABLE test (a int4);
|
||||
-- create index
|
||||
CREATE INDEX testidx ON test USING gist (a);
|
||||
-- query
|
||||
SELECT * FROM test WHERE a < 10;
|
||||
</programlisting>
|
||||
|
||||
<programlisting>
|
||||
CREATE TABLE test (a int4);
|
||||
-- create index
|
||||
CREATE INDEX testidx ON test USING gist (a);
|
||||
-- query
|
||||
SELECT * FROM test WHERE a < 10;
|
||||
</programlisting>
|
||||
|
||||
</sect2>
|
||||
|
||||
<sect2>
|
||||
<title>Authors</title>
|
||||
|
||||
<para>
|
||||
All work was done by Teodor Sigaev (<email>teodor@stack.net</email>) ,
|
||||
Oleg Bartunov (<email>oleg@sai.msu.su</email>), Janko Richter
|
||||
(<email>jankorichter@yahoo.de</email>). See
|
||||
<ulink url="http://www.sai.msu.su/~megera/postgres/gist"></ulink> for additional
|
||||
information.
|
||||
Teodor Sigaev (<email>teodor@stack.net</email>),
|
||||
Oleg Bartunov (<email>oleg@sai.msu.su</email>), and
|
||||
Janko Richter (<email>jankorichter@yahoo.de</email>). See
|
||||
<ulink url="http://www.sai.msu.su/~megera/postgres/gist"></ulink>
|
||||
for additional information.
|
||||
</para>
|
||||
|
||||
</sect2>
|
||||
|
||||
</sect1>
|
||||
|
@ -1,17 +1,45 @@
|
||||
<!-- $PostgreSQL: pgsql/doc/src/sgml/chkpass.sgml,v 1.2 2007/12/06 04:12:09 tgl Exp $ -->
|
||||
|
||||
<sect1 id="chkpass">
|
||||
<title>chkpass</title>
|
||||
|
||||
<!--
|
||||
<indexterm zone="chkpass">
|
||||
<primary>chkpass</primary>
|
||||
</indexterm>
|
||||
-->
|
||||
|
||||
<para>
|
||||
chkpass is a password type that is automatically checked and converted upon
|
||||
entry. It is stored encrypted. To compare, simply compare against a clear
|
||||
This module implements a data type <type>chkpass</> that is
|
||||
designed for storing encrypted passwords.
|
||||
Each password is automatically converted to encrypted form upon entry,
|
||||
and is always stored encrypted. To compare, simply compare against a clear
|
||||
text password and the comparison function will encrypt it before comparing.
|
||||
It also returns an error if the code determines that the password is easily
|
||||
crackable. This is currently a stub that does nothing.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
There are provisions in the code to report an error if the password is
|
||||
determined to be easily crackable. However, this is currently just
|
||||
a stub that does nothing.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
If you precede an input string with a colon, it is assumed to be an
|
||||
already-encrypted password, and is stored without further encryption.
|
||||
This allows entry of previously-encrypted passwords.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
On output, a colon is prepended. This makes it possible to dump and reload
|
||||
passwords without re-encrypting them. If you want the encrypted password
|
||||
without the colon then use the <function>raw()</> function.
|
||||
This allows you to use the
|
||||
type with things like Apache's Auth_PostgreSQL module.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The encryption uses the standard Unix function <function>crypt()</>,
|
||||
and so it suffers
|
||||
from all the usual limitations of that function; notably that only the
|
||||
first eight characters of a password are considered.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
@ -23,28 +51,10 @@
|
||||
</para>
|
||||
|
||||
<para>
|
||||
If you precede the string with a colon, the encryption and checking are
|
||||
skipped so that you can enter existing passwords into the field.
|
||||
Sample usage:
|
||||
</para>
|
||||
|
||||
<para>
|
||||
On output, a colon is prepended. This makes it possible to dump and reload
|
||||
passwords without re-encrypting them. If you want the password (encrypted)
|
||||
without the colon then use the raw() function. This allows you to use the
|
||||
type with things like Apache's Auth_PostgreSQL module.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The encryption uses the standard Unix function crypt(), and so it suffers
|
||||
from all the usual limitations of that function; notably that only the
|
||||
first eight characters of a password are considered.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Here is some sample usage:
|
||||
</para>
|
||||
|
||||
<programlisting>
|
||||
<programlisting>
|
||||
test=# create table test (p chkpass);
|
||||
CREATE TABLE
|
||||
test=# insert into test values ('hello');
|
||||
@ -72,13 +82,14 @@ test=# select p = 'goodbye' from test;
|
||||
----------
|
||||
f
|
||||
(1 row)
|
||||
</programlisting>
|
||||
</programlisting>
|
||||
|
||||
<sect2>
|
||||
<title>Author</title>
|
||||
|
||||
<para>
|
||||
D'Arcy J.M. Cain <email>darcy@druid.net</email>
|
||||
D'Arcy J.M. Cain (<email>darcy@druid.net</email>)
|
||||
</para>
|
||||
</sect2>
|
||||
</sect1>
|
||||
|
||||
</sect1>
|
||||
|
@ -1,4 +1,4 @@
|
||||
<!-- $PostgreSQL: pgsql/doc/src/sgml/contrib-spi.sgml,v 1.1 2007/12/03 04:18:47 tgl Exp $ -->
|
||||
<!-- $PostgreSQL: pgsql/doc/src/sgml/contrib-spi.sgml,v 1.2 2007/12/06 04:12:09 tgl Exp $ -->
|
||||
|
||||
<sect1 id="contrib-spi">
|
||||
<title>spi</title>
|
||||
@ -29,27 +29,28 @@
|
||||
|
||||
<para>
|
||||
<function>check_primary_key()</> checks the referencing table.
|
||||
To use, create a BEFORE INSERT OR UPDATE trigger using this
|
||||
function on a table referencing another table. You are to specify
|
||||
as trigger arguments: triggered table column names which correspond
|
||||
to foreign key, referenced table name and column names in referenced
|
||||
table which correspond to primary/unique key. To handle multiple
|
||||
foreign keys, create a trigger for each reference.
|
||||
To use, create a <literal>BEFORE INSERT OR UPDATE</> trigger using this
|
||||
function on a table referencing another table. Specify as the trigger
|
||||
arguments: the referencing table's column name(s) which form the foreign
|
||||
key, the referenced table name, and the column names in the referenced table
|
||||
which form the primary/unique key. To handle multiple foreign
|
||||
keys, create a trigger for each reference.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
<function>check_foreign_key()</> checks the referenced table.
|
||||
To use, create a BEFORE DELETE OR UPDATE trigger using this
|
||||
function on a table referenced by other table(s). You are to specify
|
||||
as trigger arguments: number of references for which the function has to
|
||||
perform checking, action if referencing key found ('cascade' — to delete
|
||||
corresponding foreign key, 'restrict' — to abort transaction if foreign keys
|
||||
exist, 'setnull' — to set foreign key referencing primary/unique key
|
||||
being deleted to null), triggered table column names which correspond
|
||||
to primary/unique key, then referencing table name and column names
|
||||
corresponding to foreign key (repeated for as many referencing tables/keys
|
||||
as were specified by first argument). Note that the primary/unique key
|
||||
columns should be marked NOT NULL and should have a unique index.
|
||||
To use, create a <literal>BEFORE DELETE OR UPDATE</> trigger using this
|
||||
function on a table referenced by other table(s). Specify as the trigger
|
||||
arguments: the number of referencing tables for which the function has to
|
||||
perform checking, the action if a referencing key is found
|
||||
(<literal>cascade</> — to delete the referencing row,
|
||||
<literal>restrict</> — to abort transaction if referencing keys
|
||||
exist, <literal>setnull</> — to set referencing key fields to null),
|
||||
the triggered table's column names which form the primary/unique key, then
|
||||
the referencing table name and column names (repeated for as many
|
||||
referencing tables as were specified by first argument). Note that the
|
||||
primary/unique key columns should be marked NOT NULL and should have a
|
||||
unique index.
|
||||
</para>
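The three actions described for <function>check_foreign_key()</> can be modeled outside the database. The following is only a minimal Python sketch of the described behavior, not the module's C code: the function name `on_referenced_delete`, the list-of-dicts table representation, and the single-column key are all assumptions made for illustration.

```python
def on_referenced_delete(action, deleted_key, referencing_rows, fk_column):
    """Model what happens to referencing rows when a referenced key goes away.

    Hypothetical helper: rows are plain dicts, the key is a single column.
    """
    matches = [row for row in referencing_rows if row[fk_column] == deleted_key]
    if action == "restrict":
        # Abort if any referencing key exists.
        if matches:
            raise RuntimeError("referencing keys exist; delete aborted")
    elif action == "cascade":
        # Delete the referencing rows (in place, so the caller sees the change).
        referencing_rows[:] = [
            row for row in referencing_rows if row[fk_column] != deleted_key
        ]
    elif action == "setnull":
        # Set the referencing key fields to null.
        for row in matches:
            row[fk_column] = None
    else:
        raise ValueError("unknown action: %r" % (action,))
    return referencing_rows


orders = [{"id": 1, "customer": 7}, {"id": 2, "customer": 8}]
on_referenced_delete("setnull", 7, orders, "customer")
```

In the real trigger the action is a trigger argument and the check runs once per referencing table listed in the arguments; the sketch handles one table per call.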
|
||||
|
||||
<para>
|
||||
@ -64,60 +65,65 @@
|
||||
Long ago, <productname>PostgreSQL</> had a built-in time travel feature
|
||||
that kept the insert and delete times for each tuple. This can be
|
||||
emulated using these functions. To use these functions,
|
||||
you are to add to a table two columns of <type>abstime</> type to store
|
||||
you must add to a table two columns of <type>abstime</> type to store
|
||||
the date when a tuple was inserted (start_date) and changed/deleted
|
||||
(stop_date):
|
||||
|
||||
<programlisting>
|
||||
CREATE TABLE mytab (
|
||||
... ...
|
||||
start_date abstime default now(),
|
||||
stop_date abstime default 'infinity'
|
||||
start_date abstime,
|
||||
stop_date abstime
|
||||
... ...
|
||||
);
|
||||
</programlisting>
|
||||
|
||||
So, tuples being inserted with unspecified start_date/stop_date will get
|
||||
the current time in start_date and <literal>infinity</> in
|
||||
stop_date.
|
||||
The columns can be named whatever you like, but in this discussion
|
||||
we'll call them start_date and stop_date.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
When a new row is inserted, start_date should normally be set to
|
||||
current time, and stop_date to <literal>infinity</>. The trigger
|
||||
will automatically substitute these values if the inserted data
|
||||
contains nulls in these columns. Generally, inserting explicit
|
||||
non-null data in these columns should only be done when re-loading
|
||||
dumped data.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Tuples with stop_date equal to <literal>infinity</> are <quote>valid
|
||||
now</quote>: when trigger will be fired for UPDATE/DELETE of a tuple with
|
||||
stop_date NOT equal to <literal>infinity</> then
|
||||
this tuple will not be changed/deleted!
|
||||
now</quote>, and can be modified. Tuples with a finite stop_date cannot
|
||||
be modified anymore — the trigger will prevent it. (If you need
|
||||
to do that, you can turn off time travel as shown below.)
|
||||
</para>
|
||||
|
||||
<para>
|
||||
If stop_date is equal to <literal>infinity</> then on
|
||||
update only the stop_date in the tuple being updated will be changed (to
|
||||
current time) and a new tuple with new data (coming from SET ... in UPDATE)
|
||||
will be inserted. Start_date in this new tuple will be set to current time
|
||||
and stop_date to <literal>infinity</>.
|
||||
For a modifiable row, on update only the stop_date in the tuple being
|
||||
updated will be changed (to current time) and a new tuple with the modified
|
||||
data will be inserted. Start_date in this new tuple will be set to current
|
||||
time and stop_date to <literal>infinity</>.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
A delete does not actually remove the tuple but only set its stop_date
|
||||
A delete does not actually remove the tuple but only sets its stop_date
|
||||
to current time.
|
||||
</para>
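The insert, update, and delete rules above can be summarized in a small model. This is a hedged Python sketch, not the trigger itself: the helper names (`tt_insert`, `tt_update`, `tt_delete`, `valid_now`), the list-of-dicts table, and the use of plain numbers for timestamps are assumptions for illustration. Whether the real trigger reports an error or silently skips a change to a closed row is not modeled; the sketch simply returns False.

```python
INFINITY = float("inf")  # stand-in for the abstime value 'infinity'


def tt_insert(table, data, now):
    """New rows start at the current time and are open-ended."""
    table.append({"data": data, "start_date": now, "stop_date": INFINITY})


def tt_update(table, row, new_data, now):
    """Close the old version and append a new, currently valid version."""
    if row["stop_date"] != INFINITY:
        return False  # rows with a finite stop_date can no longer be modified
    row["stop_date"] = now
    table.append({"data": new_data, "start_date": now, "stop_date": INFINITY})
    return True


def tt_delete(table, row, now):
    """A delete only stamps stop_date; the row stays in the table."""
    if row["stop_date"] != INFINITY:
        return False
    row["stop_date"] = now
    return True


def valid_now(table):
    """Equivalent of filtering on stop_date = 'infinity'."""
    return [r for r in table if r["stop_date"] == INFINITY]
```

After one insert followed by one update, the table holds two versions of the row, and only the newer one is returned by `valid_now`.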
|
||||
|
||||
<para>
|
||||
To query for tuples <quote>valid now</quote>, include
|
||||
<literal>stop_date = 'infinity'</> in the query's WHERE condition.
|
||||
(You might wish to incorporate that in a view.)
|
||||
</para>
|
||||
|
||||
<para>
|
||||
You can't change start/stop date columns with UPDATE!
|
||||
Use set_timetravel (below) if you need this.
|
||||
(You might wish to incorporate that in a view.) Similarly, you can
|
||||
query for tuples valid at any past time with suitable conditions on
|
||||
start_date and stop_date.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
<function>timetravel()</> is the general trigger function that supports
|
||||
this behavior. Create a BEFORE INSERT OR UPDATE OR DELETE trigger using this
|
||||
function on each time-traveled table. You are to specify two trigger arguments:
|
||||
name of start_date column and name of stop_date column in triggered table.
|
||||
this behavior. Create a <literal>BEFORE INSERT OR UPDATE OR DELETE</>
|
||||
trigger using this function on each time-traveled table. Specify two
|
||||
trigger arguments: the actual
|
||||
names of the start_date and stop_date columns.
|
||||
Optionally, you can specify one to three more arguments, which must refer
|
||||
to columns of type <type>text</>. The trigger will store the name of
|
||||
the current user into the first of these columns during INSERT, the
|
||||
@ -130,7 +136,9 @@ CREATE TABLE mytab (
|
||||
<literal>set_timetravel('mytab', 1)</> will turn TT ON for table mytab.
|
||||
<literal>set_timetravel('mytab', 0)</> will turn TT OFF for table mytab.
|
||||
In both cases the old status is reported. While TT is off, you can modify
|
||||
the start_date and stop_date columns freely.
|
||||
the start_date and stop_date columns freely. Note that the on/off status
|
||||
is local to the current database session — fresh sessions will
|
||||
always start out with TT ON for all tables.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
@ -156,9 +164,9 @@ CREATE TABLE mytab (
|
||||
</para>
|
||||
|
||||
<para>
|
||||
To use, create a BEFORE INSERT (or optionally BEFORE INSERT OR UPDATE)
|
||||
trigger using this function. You are to specify
|
||||
as trigger arguments: the name of the integer column to be modified,
|
||||
To use, create a <literal>BEFORE INSERT</> (or optionally <literal>BEFORE
|
||||
INSERT OR UPDATE</>) trigger using this function. Specify two
|
||||
trigger arguments: the name of the integer column to be modified,
|
||||
and the name of the sequence object that will supply values.
|
||||
(Actually, you can specify any number of pairs of such names, if
|
||||
you'd like to update more than one autoincrementing column.)
|
||||
@ -180,8 +188,8 @@ CREATE TABLE mytab (
|
||||
</para>
|
||||
|
||||
<para>
|
||||
To use, create a BEFORE INSERT and/or UPDATE
|
||||
trigger using this function. You are to specify a single trigger
|
||||
To use, create a <literal>BEFORE INSERT</> and/or <literal>UPDATE</>
|
||||
trigger using this function. Specify a single trigger
|
||||
argument: the name of the text column to be modified.
|
||||
</para>
|
||||
|
||||
@ -201,8 +209,8 @@ CREATE TABLE mytab (
|
||||
</para>
|
||||
|
||||
<para>
|
||||
To use, create a BEFORE UPDATE
|
||||
trigger using this function. You are to specify a single trigger
|
||||
To use, create a <literal>BEFORE UPDATE</>
|
||||
trigger using this function. Specify a single trigger
|
||||
argument: the name of the <type>timestamp</> column to be modified.
|
||||
</para>
|
||||
|
||||
|
@ -1,4 +1,4 @@
|
||||
<!-- $PostgreSQL: pgsql/doc/src/sgml/contrib.sgml,v 1.7 2007/12/03 04:18:47 tgl Exp $ -->
|
||||
<!-- $PostgreSQL: pgsql/doc/src/sgml/contrib.sgml,v 1.8 2007/12/06 04:12:09 tgl Exp $ -->
|
||||
|
||||
<appendix id="contrib">
|
||||
<title>Additional Supplied Modules</title>
|
||||
@ -54,6 +54,7 @@ psql -d dbname -f <replaceable>SHAREDIR</>/contrib/<replaceable>module</>.sql
|
||||
|
||||
Here, <replaceable>SHAREDIR</> means the installation's <quote>share</>
|
||||
directory (<literal>pg_config --sharedir</> will tell you what this is).
|
||||
In most cases the script must be run by a database superuser.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
|
@ -1,3 +1,4 @@
|
||||
<!-- $PostgreSQL: pgsql/doc/src/sgml/cube.sgml,v 1.5 2007/12/06 04:12:09 tgl Exp $ -->
|
||||
|
||||
<sect1 id="cube">
|
||||
<title>cube</title>
|
||||
@ -7,15 +8,17 @@
|
||||
</indexterm>
|
||||
|
||||
<para>
|
||||
This module contains the user-defined type, CUBE, representing
|
||||
multidimensional cubes.
|
||||
This module implements a data type <type>cube</> for
|
||||
representing multi-dimensional cubes.
|
||||
</para>
|
||||
|
||||
<sect2>
|
||||
<title>Syntax</title>
|
||||
|
||||
<para>
|
||||
The following are valid external representations for the CUBE type:
|
||||
The following are valid external representations for the <type>cube</>
|
||||
type. <replaceable>x</>, <replaceable>y</>, etc. denote floating-point
|
||||
numbers:
|
||||
</para>
|
||||
|
||||
<table>
|
||||
@ -23,192 +26,281 @@
|
||||
<tgroup cols="2">
|
||||
<tbody>
|
||||
<row>
|
||||
<entry>'x'</entry>
|
||||
<entry>A floating point value representing a one-dimensional point or
|
||||
one-dimensional zero length cubement
|
||||
</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry>'(x)'</entry>
|
||||
<entry>Same as above</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry>'x1,x2,x3,...,xn'</entry>
|
||||
<entry>A point in n-dimensional space, represented internally as a zero
|
||||
volume box
|
||||
</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry>'(x1,x2,x3,...,xn)'</entry>
|
||||
<entry>Same as above</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry>'(x),(y)'</entry>
|
||||
<entry>1-D cubement starting at x and ending at y or vice versa; the
|
||||
order does not matter
|
||||
</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry>'(x1,...,xn),(y1,...,yn)'</entry>
|
||||
<entry>n-dimensional box represented by a pair of its opposite corners, no
|
||||
matter which. Functions take care of swapping to achieve "lower left --
|
||||
upper right" representation before computing any values
|
||||
</entry>
|
||||
</row>
|
||||
</tbody>
|
||||
</tgroup>
|
||||
</table>
|
||||
</sect2>
|
||||
|
||||
<sect2>
|
||||
<title>Grammar</title>
|
||||
<table>
|
||||
<title>Cube Grammar Rules</title>
|
||||
<tgroup cols="2">
|
||||
<tbody>
|
||||
<row>
|
||||
<entry>rule 1</entry>
|
||||
<entry>box -> O_BRACKET paren_list COMMA paren_list C_BRACKET</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry>rule 2</entry>
|
||||
<entry>box -> paren_list COMMA paren_list</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry>rule 3</entry>
|
||||
<entry>box -> paren_list</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry>rule 4</entry>
|
||||
<entry>box -> list</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry>rule 5</entry>
|
||||
<entry>paren_list -> O_PAREN list C_PAREN</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry>rule 6</entry>
|
||||
<entry>list -> FLOAT</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry>rule 7</entry>
|
||||
<entry>list -> list COMMA FLOAT</entry>
|
||||
</row>
|
||||
</tbody>
|
||||
</tgroup>
|
||||
</table>
|
||||
</sect2>
|
||||
|
||||
<sect2>
|
||||
<title>Tokens</title>
|
||||
<table>
|
||||
<title>Cube Grammar Rules</title>
|
||||
<tgroup cols="2">
|
||||
<tbody>
|
||||
<row>
|
||||
<entry>n</entry>
|
||||
<entry>[0-9]+</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry>i</entry>
|
||||
<entry>nteger [+-]?{n}</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry>real</entry>
|
||||
<entry>[+-]?({n}\.{n}?|\.{n})</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry>FLOAT</entry>
|
||||
<entry>({integer}|{real})([eE]{integer})?</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry>O_BRACKET</entry>
|
||||
<entry>\[</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry>C_BRACKET</entry>
|
||||
<entry>\]</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry>O_PAREN</entry>
|
||||
<entry>\(</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry>C_PAREN</entry>
|
||||
<entry>\)</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry>COMMA</entry>
|
||||
<entry>\,</entry>
|
||||
</row>
|
||||
</tbody>
|
||||
</tgroup>
|
||||
</table>
|
||||
</sect2>
|
||||
|
||||
<sect2>
|
||||
<title>Examples</title>
|
||||
<table>
|
||||
<title>Examples</title>
|
||||
<tgroup cols="2">
|
||||
<tbody>
|
||||
<row>
|
||||
<entry>'x'</entry>
|
||||
<entry>A floating point value representing a one-dimensional point
|
||||
<entry><literal><replaceable>x</></literal></entry>
|
||||
<entry>A one-dimensional point
|
||||
(or, zero-length one-dimensional interval)
|
||||
</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry>'(x)'</entry>
|
||||
<entry><literal>(<replaceable>x</>)</literal></entry>
|
||||
<entry>Same as above</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry>'x1,x2,x3,...,xn'</entry>
|
||||
<entry>A point in n-dimensional space,represented internally as a zero
|
||||
volume cube
|
||||
<entry><literal><replaceable>x1</>,<replaceable>x2</>,...,<replaceable>xn</></literal></entry>
|
||||
<entry>A point in n-dimensional space, represented internally as a
|
||||
zero-volume cube
|
||||
</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry>'(x1,x2,x3,...,xn)'</entry>
|
||||
<entry><literal>(<replaceable>x1</>,<replaceable>x2</>,...,<replaceable>xn</>)</literal></entry>
|
||||
<entry>Same as above</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry>'(x),(y)'</entry>
|
||||
<entry>A 1-D interval starting at x and ending at y or vice versa; the
|
||||
<entry><literal>(<replaceable>x</>),(<replaceable>y</>)</literal></entry>
|
||||
<entry>A one-dimensional interval starting at <replaceable>x</> and ending at <replaceable>y</> or vice versa; the
|
||||
order does not matter
|
||||
</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry>'[(x),(y)]'</entry>
|
||||
<entry><literal>[(<replaceable>x</>),(<replaceable>y</>)]</literal></entry>
|
||||
<entry>Same as above</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry>'(x1,...,xn),(y1,...,yn)'</entry>
|
||||
<entry>An n-dimensional box represented by a pair of its diagonally
|
||||
opposite corners, regardless of order. Swapping is provided
|
||||
by all comarison routines to ensure the
|
||||
"lower left -- upper right" representation
|
||||
before actaul comparison takes place.
|
||||
<entry><literal>(<replaceable>x1</>,...,<replaceable>xn</>),(<replaceable>y1</>,...,<replaceable>yn</>)</literal></entry>
|
||||
<entry>An n-dimensional cube represented by a pair of its diagonally
|
||||
opposite corners
|
||||
</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry>'[(x1,...,xn),(y1,...,yn)]'</entry>
|
||||
<entry><literal>[(<replaceable>x1</>,...,<replaceable>xn</>),(<replaceable>y1</>,...,<replaceable>yn</>)]</literal></entry>
|
||||
<entry>Same as above</entry>
|
||||
</row>
|
||||
</tbody>
|
||||
</tgroup>
|
||||
</table>
|
||||
|
||||
<para>
|
||||
White space is ignored, so '[(x),(y)]' can be: '[ ( x ), ( y ) ]'
|
||||
It does not matter which order the opposite corners of a cube are
|
||||
entered in. The <type>cube</> functions
|
||||
automatically swap values if needed to create a uniform
|
||||
<quote>lower left — upper right</> internal representation.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
White space is ignored, so <literal>[(<replaceable>x</>),(<replaceable>y</>)]</literal> is the same as
|
||||
<literal>[ ( <replaceable>x</> ), ( <replaceable>y</> ) ]</literal>.
|
||||
</para>
|
||||
</sect2>
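To make the syntax rules concrete, here is a small Python sketch that reads the external forms listed above, ignores white space around punctuation, and swaps coordinates into the "lower left, upper right" form. It is an illustration only: `parse_cube` is a hypothetical name, the module's real parser is a grammar in C, and this sketch is slightly more lenient (for example, it tolerates brackets around a single point).

```python
import re


def parse_cube(text):
    """Return (lower_left, upper_right) tuples for a cube literal."""
    # Drop white space around brackets, parentheses, and commas.
    s = re.sub(r"\s*([\[\](),])\s*", r"\1", text.strip())
    if s.startswith("[") and s.endswith("]"):
        s = s[1:-1]  # enclosing brackets are optional
    two_corners = re.fullmatch(r"\(([^()]*)\),\(([^()]*)\)", s)
    if two_corners:
        first, second = two_corners.group(1), two_corners.group(2)
    else:
        # A single point; the parentheses are optional.
        if s.startswith("(") and s.endswith(")"):
            s = s[1:-1]
        first = second = s
    a = [float(v) for v in first.split(",")]
    b = [float(v) for v in second.split(",")]
    if len(a) != len(b):
        raise ValueError("corners must have the same number of dimensions")
    # Corner order does not matter: normalize per dimension.
    lower_left = tuple(min(p, q) for p, q in zip(a, b))
    upper_right = tuple(max(p, q) for p, q in zip(a, b))
    return lower_left, upper_right
```

With this sketch, `parse_cube("[ ( 3 ), ( 1 ) ]")` and `parse_cube("(1),(3)")` produce the same normalized pair.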
|
||||
|
||||
<sect2>
|
||||
<title>Precision</title>
|
||||
|
||||
<para>
|
||||
Values are stored internally as 64-bit floating point numbers. This means
|
||||
that numbers with more than about 16 significant digits will be truncated.
|
||||
</para>
|
||||
</sect2>
|
||||
|
||||
<sect2>
|
||||
<title>Usage</title>
|
||||
|
||||
<para>
|
||||
The <filename>cube</> module includes a GiST index operator class for
|
||||
<type>cube</> values.
|
||||
The operators supported by the GiST opclass include:
|
||||
</para>
|
||||
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
<programlisting>
|
||||
a = b Same as
|
||||
</programlisting>
|
||||
<para>
|
||||
The cubes a and b are identical.
|
||||
</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<programlisting>
|
||||
a && b Overlaps
|
||||
</programlisting>
|
||||
<para>
|
||||
The cubes a and b overlap.
|
||||
</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<programlisting>
|
||||
a @> b Contains
|
||||
</programlisting>
|
||||
<para>
|
||||
The cube a contains the cube b.
|
||||
</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<programlisting>
|
||||
a <@ b Contained in
|
||||
</programlisting>
|
||||
<para>
|
||||
The cube a is contained in the cube b.
|
||||
</para>
|
||||
</listitem>
|
||||
</itemizedlist>
|
||||
|
||||
<para>
|
||||
(Before PostgreSQL 8.2, the containment operators @> and <@ were
|
||||
respectively called @ and ~. These names are still available, but are
|
||||
deprecated and will eventually be retired. Notice that the old names
|
||||
are reversed from the convention formerly followed by the core geometric
|
||||
datatypes!)
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The standard B-tree operators are also provided, for example
|
||||
|
||||
<programlisting>
|
||||
[a, b] < [c, d] Less than
|
||||
[a, b] > [c, d] Greater than
|
||||
</programlisting>
|
||||
|
||||
These operators do not make a lot of sense for any practical
|
||||
purpose but sorting. These operators first compare (a) to (c),
|
||||
and if these are equal, compare (b) to (d). That results in
|
||||
reasonably good sorting in most cases, which is useful if
|
||||
you want to use ORDER BY with this type.
|
||||
</para>
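The ordering rule just described (compare the first corners, and only if they are equal compare the second corners) is a lexicographic comparison. A minimal Python sketch of that rule, assuming cubes are already normalized and represented as `(lower_left, upper_right)` tuples; `cube_sort_key` is a hypothetical name, and details of the real comparison function such as cubes of differing dimension are not modeled:

```python
def cube_sort_key(cube):
    """Sort by the first corner, then by the second corner."""
    lower_left, upper_right = cube
    return (tuple(lower_left), tuple(upper_right))


cubes = [((1, 1), (2, 2)), ((0, 0), (5, 5)), ((1, 1), (1, 3))]
ordered = sorted(cubes, key=cube_sort_key)
```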
|
||||
|
||||
<para>
|
||||
The following functions are available:
|
||||
</para>
|
||||
|
||||
<table>
|
||||
<title>Cube functions</title>
|
||||
<tgroup cols="2">
|
||||
<tbody>
|
||||
<row>
|
||||
<entry><literal>cube(float8) returns cube</literal></entry>
|
||||
<entry>Makes a one-dimensional cube with both coordinates the same.
|
||||
<literal>cube(1) == '(1)'</literal>
|
||||
</entry>
|
||||
</row>
|
||||
|
||||
<row>
|
||||
<entry><literal>cube(float8, float8) returns cube</literal></entry>
|
||||
<entry>Makes a one-dimensional cube.
|
||||
<literal>cube(1,2) == '(1),(2)'</literal>
|
||||
</entry>
|
||||
</row>
|
||||
|
||||
<row>
|
||||
<entry><literal>cube(float8[]) returns cube</literal></entry>
|
||||
<entry>Makes a zero-volume cube using the coordinates
|
||||
defined by the array.
|
||||
<literal>cube(ARRAY[1,2]) == '(1,2)'</literal>
|
||||
</entry>
|
||||
</row>
|
||||
|
||||
<row>
|
||||
<entry><literal>cube(float8[], float8[]) returns cube</literal></entry>
|
||||
<entry>Makes a cube with upper right and lower left
|
||||
coordinates as defined by the two arrays, which must be of the
|
||||
same length.
|
||||
<literal>cube('{1,2}'::float[], '{3,4}'::float[]) == '(1,2),(3,4)'
|
||||
</literal>
|
||||
</entry>
|
||||
</row>
|
||||
|
||||
<row>
|
||||
<entry><literal>cube(cube, float8) returns cube</literal></entry>
|
||||
<entry>Makes a new cube by adding a dimension on to an
|
||||
existing cube with the same values for both parts of the new coordinate.
|
||||
This is useful for building cubes piece by piece from calculated values.
|
||||
<literal>cube('(1)',2) == '(1,2),(1,2)'</literal>
|
||||
</entry>
|
||||
</row>
|
||||
|
||||
<row>
|
||||
<entry><literal>cube(cube, float8, float8) returns cube</literal></entry>
|
||||
<entry>Makes a new cube by adding a dimension on to an
|
||||
existing cube. This is useful for building cubes piece by piece from
|
||||
calculated values. <literal>cube('(1,2)',3,4) == '(1,3),(2,4)'</literal>
|
||||
</entry>
|
||||
</row>
|
||||
|
||||
<row>
|
||||
<entry><literal>cube_dim(cube) returns int</literal></entry>
|
||||
<entry>Returns the number of dimensions of the cube
|
||||
</entry>
|
||||
</row>
|
||||
|
||||
<row>
|
||||
<entry><literal>cube_ll_coord(cube, int) returns double </literal></entry>
|
||||
<entry>Returns the n'th coordinate value for the lower left
|
||||
corner of a cube
|
||||
</entry>
|
||||
</row>
|
||||
|
||||
<row>
|
||||
<entry><literal>cube_ur_coord(cube, int) returns double
|
||||
</literal></entry>
|
||||
<entry>Returns the n'th coordinate value for the
|
||||
upper right corner of a cube
|
||||
</entry>
|
||||
</row>
|
||||
|
||||
<row>
|
||||
<entry><literal>cube_is_point(cube) returns bool</literal></entry>
|
||||
<entry>Returns true if a cube is a point, that is,
|
||||
the two defining corners are the same.</entry>
|
||||
</row>
|
||||
|
||||
<row>
|
||||
<entry><literal>cube_distance(cube, cube) returns double</literal></entry>
|
||||
<entry>Returns the distance between two cubes. If both
|
||||
cubes are points, this is the normal distance function.
|
||||
</entry>
|
||||
</row>
|
||||
|
||||
<row>
|
||||
<entry><literal>cube_subset(cube, int[]) returns cube
|
||||
</literal></entry>
|
||||
<entry>Makes a new cube from an existing cube, using a list of
|
||||
dimension indexes from an array. Can be used to find both the LL and UR
|
||||
coordinates of a single dimension, e.g.
|
||||
<literal>cube_subset(cube('(1,3,5),(6,7,8)'), ARRAY[2]) = '(3),(7)'</>.
|
||||
Or can be used to drop dimensions, or reorder them as desired, e.g.
|
||||
<literal>cube_subset(cube('(1,3,5),(6,7,8)'), ARRAY[3,2,1,1]) = '(5, 3,
|
||||
1, 1),(8, 7, 6, 6)'</>.
|
||||
</entry>
|
||||
</row>
|
||||
|
||||
<row>
|
||||
<entry><literal>cube_union(cube, cube) returns cube</literal></entry>
|
||||
<entry>Produces the union of two cubes
|
||||
</entry>
|
||||
</row>
|
||||
|
||||
<row>
|
||||
<entry><literal>cube_inter(cube, cube) returns cube</literal></entry>
|
||||
<entry>Produces the intersection of two cubes
|
||||
</entry>
|
||||
</row>
|
||||
|
||||
<row>
|
||||
<entry><literal>cube_enlarge(cube c, double r, int n) returns cube</literal></entry>
|
||||
<entry>Increases the size of a cube by a specified radius in at least
|
||||
n dimensions. If the radius is negative the cube is shrunk instead. This
|
||||
is useful for creating bounding boxes around a point for searching for
|
||||
nearby points. All defined dimensions are changed by the radius r.
|
||||
LL coordinates are decreased by r and UR coordinates are increased by r.
|
||||
If a LL coordinate is increased to larger than the corresponding UR
|
||||
coordinate (this can only happen when r < 0) then both coordinates
|
||||
are set to their average. If n is greater than the number of defined
|
||||
dimensions and the cube is being increased (r >= 0) then 0 is used
|
||||
as the base for the extra coordinates.
|
||||
</entry>
|
||||
</row>
|
||||
</tbody>
|
||||
</tgroup>
|
||||
</table>
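
<para>
As a rough sketch of the bounding-box use of <function>cube_enlarge</>
described in the table, a search for points near (1,2) might look like the
following. The table <literal>samples</> and its <type>cube</> column
<literal>pos</> are hypothetical names used only for illustration.
</para>

<programlisting>
-- samples(pos cube) is an assumed table holding 2-D points.
-- Build a box extending 0.5 in each direction around the point (1,2),
-- then keep only the rows whose cube lies inside that box.
SELECT *
FROM samples
WHERE pos <@ cube_enlarge('(1,2)'::cube, 0.5, 2);
</programlisting>

<para>
Because the search region is a box rather than a sphere, rows near its
corners can be farther than 0.5 from the center; adding a
<function>cube_distance</> condition to the query is one way to filter
those out.
</para>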
|
||||
</sect2>
|
||||
|
||||
<sect2>
|
||||
<title>Defaults</title>
|
||||
|
||||
<para>
|
||||
I believe this union:
|
||||
</para>
|
||||
<programlisting>
|
||||
select cube_union('(0,5,2),(2,3,1)','0');
|
||||
select cube_union('(0,5,2),(2,3,1)', '0');
|
||||
cube_union
|
||||
-------------------
|
||||
(0, 0, 0),(2, 5, 2)
|
||||
@ -216,11 +308,11 @@ cube_union
|
||||
</programlisting>
|
||||
|
||||
<para>
|
||||
does not contradict to the common sense, neither does the intersection
|
||||
does not contradict common sense, neither does the intersection
|
||||
</para>
|
||||
|
||||
<programlisting>
|
||||
select cube_inter('(0,-1),(1,1)','(-2),(2)');
|
||||
select cube_inter('(0,-1),(1,1)', '(-2),(2)');
|
||||
cube_inter
|
||||
-------------
|
||||
(0, 0),(1, 0)
|
||||
@ -228,9 +320,10 @@ cube_inter
|
||||
</programlisting>
|
||||
|
||||
<para>
|
||||
In all binary operations on differently sized boxes, I assume the smaller
|
||||
one to be a cartesian projection, i. e., having zeroes in place of coordinates
|
||||
omitted in the string representation. The above examples are equivalent to:
|
||||
In all binary operations on differently-dimensioned cubes, I assume the
|
||||
lower-dimensional one to be a Cartesian projection, i.e., having zeroes
|
||||
in place of coordinates omitted in the string representation. The above
|
||||
examples are equivalent to:
|
||||
</para>
|
||||
|
||||
<programlisting>
|
||||
@ -241,7 +334,7 @@ cube_inter('(0,-1),(1,1)','(-2,0),(2,0)');
|
||||
<para>
|
||||
The following containment predicate uses the point syntax,
|
||||
while in fact the second argument is internally represented by a box.
|
||||
This syntax makes it unnecessary to define the special Point type
|
||||
This syntax makes it unnecessary to define a separate point type
|
||||
and functions for (box,point) predicates.
|
||||
</para>
|
||||
|
||||
@ -253,268 +346,42 @@ t
|
||||
(1 row)
|
||||
</programlisting>
|
||||
</sect2>
|
||||
<sect2>
|
||||
<title>Precision</title>
|
||||
<para>
|
||||
Values are stored internally as 64-bit floating point numbers. This means that
|
||||
numbers with more than about 16 significant digits will be truncated.
|
||||
</para>
|
||||
</sect2>
|
||||
|
||||
<sect2>
|
||||
<title>Usage</title>
|
||||
<title>Notes</title>
|
||||
|
||||
<para>
|
||||
The access method for CUBE is a GiST index (gist_cube_ops), which is a
|
||||
generalization of R-tree. GiSTs allow the postgres implementation of
|
||||
R-tree, originally encoded to support 2-D geometric types such as
|
||||
boxes and polygons, to be used with any data type whose data domain
|
||||
can be partitioned using the concepts of containment, intersection and
|
||||
equality. In other words, everything that can intersect or contain
|
||||
its own kind can be indexed with a GiST. That includes, among other
|
||||
things, all geometric data types, regardless of their dimensionality
|
||||
(see also contrib/seg).
|
||||
For examples of usage, see the regression test <filename>sql/cube.sql</>.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The operators supported by the GiST access method include:
|
||||
</para>
|
||||
|
||||
<programlisting>
|
||||
a = b Same as
|
||||
</programlisting>
|
||||
<para>
|
||||
The cubements a and b are identical.
|
||||
</para>
|
||||
|
||||
<programlisting>
|
||||
a && b Overlaps
|
||||
</programlisting>
|
||||
<para>
|
||||
The cubements a and b overlap.
|
||||
</para>
|
||||
|
||||
<programlisting>
|
||||
a @> b Contains
|
||||
</programlisting>
|
||||
<para>
|
||||
The cubement a contains the cubement b.
|
||||
</para>
|
||||
|
||||
<programlisting>
|
||||
a <@ b Contained in
|
||||
</programlisting>
|
||||
<para>
|
||||
The cubement a is contained in b.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
(Before PostgreSQL 8.2, the containment operators @> and <@ were
|
||||
respectively called @ and ~. These names are still available, but are
|
||||
deprecated and will eventually be retired. Notice that the old names
|
||||
are reversed from the convention formerly followed by the core geometric
|
||||
datatypes!)
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Although the mnemonics of the following operators is questionable, I
|
||||
preserved them to maintain visual consistency with other geometric
|
||||
data types defined in Postgres.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Other operators:
|
||||
</para>
|
||||
|
||||
<programlisting>
|
||||
[a, b] < [c, d] Less than
|
||||
[a, b] > [c, d] Greater than
|
||||
</programlisting>
|
||||
|
||||
<para>
|
||||
These operators do not make a lot of sense for any practical
|
||||
purpose but sorting. These operators first compare (a) to (c),
|
||||
and if these are equal, compare (b) to (d). That accounts for
|
||||
reasonably good sorting in most cases, which is useful if
|
||||
you want to use ORDER BY with this type
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The following functions are available:
|
||||
</para>
|
||||
|
||||
<table>
|
||||
<title>Functions available</title>
|
||||
<tgroup cols="2">
|
||||
<tbody>
|
||||
<row>
|
||||
<entry><literal>cube_distance(cube, cube) returns double</literal></entry>
|
||||
<entry>cube_distance returns the distance between two cubes. If both
|
||||
cubes are points, this is the normal distance function.
|
||||
</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry><literal>cube(text)</literal></entry>
|
||||
<entry>Takes text input and returns a cube. This is useful for making
|
||||
cubes from computed strings.
|
||||
</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry><literal>cube(float8) returns cube</literal></entry>
|
||||
<entry>This makes a one dimensional cube with both coordinates the same.
|
||||
If the type of the argument is a numeric type other than float8 an
|
||||
explicit cast to float8 may be needed.
|
||||
<literal>cube(1) == '(1)'</literal>
|
||||
</entry>
|
||||
</row>
|
||||
|
||||
<row>
|
||||
<entry><literal>cube(float8, float8) returns cube</literal></entry>
|
||||
<entry>
|
||||
This makes a one dimensional cube.
|
||||
<literal>cube(1,2) == '(1),(2)'</literal>
|
||||
</entry>
|
||||
</row>
|
||||
|
||||
<row>
|
||||
<entry><literal>cube(float8[]) returns cube</literal></entry>
|
||||
<entry>This makes a zero-volume cube using the coordinates
|
||||
defined by the array. <literal>cube(ARRAY[1,2]) == '(1,2)'</literal>
|
||||
</entry>
|
||||
</row>
|
||||
|
||||
<row>
|
||||
<entry><literal>cube(float8[], float8[]) returns cube</literal></entry>
|
||||
<entry>This makes a cube, with upper right and lower left
|
||||
coordinates as defined by the 2 float arrays. Arrays must be of the
|
||||
same length.
|
||||
<literal>cube('{1,2}'::float[], '{3,4}'::float[]) == '(1,2),(3,4)'
|
||||
</literal>
|
||||
</entry>
|
||||
</row>
|
||||
|
||||
<row>
|
||||
<entry><literal>cube(cube, float8) returns cube</literal></entry>
|
||||
<entry>This builds a new cube by adding a dimension on to an
|
||||
existing cube with the same values for both parts of the new coordinate.
|
||||
This is useful for building cubes piece by piece from calculated values.
|
||||
<literal>cube('(1)',2) == '(1,2),(1,2)'</literal>
|
||||
</entry>
|
||||
</row>
|
||||
|
||||
<row>
|
||||
<entry><literal>cube(cube, float8, float8) returns cube</literal></entry>
|
||||
<entry>This builds a new cube by adding a dimension on to an
|
||||
existing cube. This is useful for building cubes piece by piece from
|
||||
calculated values. <literal>cube('(1,2)',3,4) == '(1,3),(2,4)'</literal>
|
||||
</entry>
|
||||
</row>
|
||||
|
||||
<row>
|
||||
<entry><literal>cube_dim(cube) returns int</literal></entry>
|
||||
<entry>cube_dim returns the number of dimensions stored in the
|
||||
the data structure
|
||||
for a cube. This is useful for constraints on the dimensions of a cube.
|
||||
</entry>
|
||||
</row>
|
||||
|
||||
<row>
|
||||
<entry><literal>cube_ll_coord(cube, int) returns double </literal></entry>
|
||||
<entry>
|
||||
cube_ll_coord returns the nth coordinate value for the lower left
|
||||
corner of a cube. This is useful for doing coordinate transformations.
|
||||
</entry>
|
||||
</row>
|
||||
|
||||
<row>
|
||||
<entry><literal>cube_ur_coord(cube, int) returns double
|
||||
</literal></entry>
|
||||
<entry>cube_ur_coord returns the nth coordinate value for the
|
||||
upper right corner of a cube. This is useful for doing coordinate
|
||||
transformations.
|
||||
</entry>
|
||||
</row>
|
||||
|
||||
<row>
|
||||
<entry><literal>cube_subset(cube, int[]) returns cube
|
||||
</literal></entry>
|
||||
<entry>Builds a new cube from an existing cube, using a list of
|
||||
dimension indexes
|
||||
from an array. Can be used to find both the ll and ur coordinate of single
|
||||
dimension, e.g.: cube_subset(cube('(1,3,5),(6,7,8)'), ARRAY[2]) = '(3),(7)'
|
||||
Or can be used to drop dimensions, or reorder them as desired, e.g.:
|
||||
cube_subset(cube('(1,3,5),(6,7,8)'), ARRAY[3,2,1,1]) =
|
||||
'(5, 3, 1, 1),(8, 7, 6, 6)'
|
||||
</entry>
|
||||
</row>
|
||||
|
||||
<row>
|
||||
<entry><literal>cube_is_point(cube) returns bool</literal></entry>
|
||||
<entry>cube_is_point returns true if a cube is also a point.
|
||||
This is true when the two defining corners are the same.</entry>
|
||||
</row>
|
||||
|
||||
<row>
|
||||
<entry><literal>cube_enlarge(cube, double, int) returns cube</literal></entry>
|
||||
<entry>
|
||||
cube_enlarge increases the size of a cube by a specified
|
||||
radius in at least
|
||||
n dimensions. If the radius is negative the box is shrunk instead. This
|
||||
is useful for creating bounding boxes around a point for searching for
|
||||
nearby points. All defined dimensions are changed by the radius. If n
|
||||
is greater than the number of defined dimensions and the cube is being
|
||||
increased (r >= 0) then 0 is used as the base for the extra coordinates.
|
||||
LL coordinates are decreased by r and UR coordinates are increased by r.
|
||||
If a LL coordinate is increased to larger than the corresponding UR
|
||||
coordinate (this can only happen when r < 0) then both coordinates are
|
||||
set to their average. To make it harder for people to break things there
|
||||
is an effective maximum on the dimension of cubes of 100. This is set
|
||||
in cubedata.h if you need something bigger.
|
||||
</entry>
|
||||
</row>
|
||||
</tbody>
|
||||
</tgroup>
|
||||
</table>
|
||||
|
||||
<para>
|
||||
There are a few other potentially useful functions defined in cube.c
|
||||
that vanished from the schema because I stopped using them. Some of
|
||||
these were meant to support type casting. Let me know if I was wrong:
|
||||
I will then add them back to the schema. I would also appreciate
|
||||
other ideas that would enhance the type and make it more useful.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
For examples of usage, see sql/cube.sql
|
||||
To make it harder for people to break things, there
|
||||
is a limit of 100 on the number of dimensions of cubes. This is set
|
||||
in <filename>cubedata.h</> if you need something bigger.
|
||||
</para>
|
||||
</sect2>
|
||||
|
||||
<sect2>
|
||||
<title>Credits</title>
|
||||
|
||||
<para>
|
||||
This code is essentially based on the example written for
|
||||
Illustra, <ulink url="http://garcia.me.berkeley.edu/~adong/rtree"></ulink>
|
||||
Original author: Gene Selkov, Jr. <email>selkovjr@mcs.anl.gov</email>,
|
||||
Mathematics and Computer Science Division, Argonne National Laboratory.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
My thanks are primarily to Prof. Joe Hellerstein
|
||||
(<ulink url="http://db.cs.berkeley.edu/~jmh/"></ulink>) for elucidating the
|
||||
gist of the GiST (<ulink url="http://gist.cs.berkeley.edu/"></ulink>), and
|
||||
to his former student, Andy Dong
|
||||
(<ulink url="http://best.me.berkeley.edu/~adong/"></ulink>), for his exemplar.
|
||||
I am also grateful to all postgres developers, present and past, for enabling
|
||||
myself to create my own world and live undisturbed in it. And I would like to
|
||||
acknowledge my gratitude to Argonne Lab and to the U.S. Department of Energy
|
||||
for the years of faithful support of my database research.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Gene Selkov, Jr.
|
||||
Computational Scientist
|
||||
Mathematics and Computer Science Division
|
||||
Argonne National Laboratory
|
||||
9700 S Cass Ave.
|
||||
Building 221
|
||||
Argonne, IL 60439-4844
|
||||
<email>selkovjr@mcs.anl.gov</email>
|
||||
to his former student, Andy Dong (<ulink
|
||||
url="http://best.me.berkeley.edu/~adong/"></ulink>), for his example
|
||||
written for Illustra,
|
||||
<ulink url="http://garcia.me.berkeley.edu/~adong/rtree"></ulink>.
|
||||
I am also grateful to all Postgres developers, present and past, for
|
||||
enabling myself to create my own world and live undisturbed in it. And I
|
||||
would like to acknowledge my gratitude to Argonne Lab and to the
|
||||
U.S. Department of Energy for the years of faithful support of my database
|
||||
research.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
@ -527,9 +394,9 @@ a <@ b Contained in
|
||||
<para>
|
||||
Additional updates were made by Joshua Reich <email>josh@root.net</email> in
|
||||
July 2006. These include <literal>cube(float8[], float8[])</literal> and
|
||||
cleaning up the code to use the V1 call protocol instead of the deprecated V0
|
||||
form.
|
||||
cleaning up the code to use the V1 call protocol instead of the deprecated
|
||||
V0 protocol.
|
||||
</para>
|
||||
</sect2>
|
||||
</sect1>
|
||||
|
||||
</sect1>
|
||||
|
File diff suppressed because it is too large
@ -1,3 +1,5 @@
|
||||
<!-- $PostgreSQL: pgsql/doc/src/sgml/dict-int.sgml,v 1.2 2007/12/06 04:12:10 tgl Exp $ -->
|
||||
|
||||
<sect1 id="dict-int">
|
||||
<title>dict_int</title>
|
||||
|
||||
@ -6,13 +8,16 @@
|
||||
</indexterm>
|
||||
|
||||
<para>
|
||||
The motivation for this example dictionary is to control the indexing of
|
||||
integers (signed and unsigned), and, consequently, to minimize the number of
|
||||
unique words which greatly affect the performance of searching.
|
||||
<filename>dict_int</> is an example of an add-on dictionary template
|
||||
for full-text search. The motivation for this example dictionary is to
|
||||
control the indexing of integers (signed and unsigned), allowing such
|
||||
numbers to be indexed while preventing excessive growth in the number of
|
||||
unique words, which greatly affects the performance of searching.
|
||||
</para>
|
||||
|
||||
<sect2>
|
||||
<title>Configuration</title>
|
||||
|
||||
<para>
|
||||
The dictionary accepts two options:
|
||||
</para>
|
||||
@ -20,17 +25,19 @@
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
<para>
|
||||
The MAXLEN parameter specifies the maximum length (number of digits)
|
||||
allowed in an integer word. The default value is 6.
|
||||
The <literal>maxlen</> parameter specifies the maximum number of
|
||||
digits allowed in an integer word. The default value is 6.
|
||||
</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>
|
||||
The REJECTLONG parameter specifies if an overlength integer should be
|
||||
truncated or ignored. If REJECTLONG=FALSE (default), the dictionary returns
|
||||
the first MAXLEN digits of the integer. If REJECTLONG=TRUE, the
|
||||
dictionary treats an overlength integer as a stop word, so that it will
|
||||
not be indexed.
|
||||
The <literal>rejectlong</> parameter specifies whether an overlength
|
||||
integer should be truncated or ignored. If <literal>rejectlong</> is
|
||||
<literal>false</> (the default), the dictionary returns the first
|
||||
<literal>maxlen</> digits of the integer. If <literal>rejectlong</> is
|
||||
<literal>true</>, the dictionary treats an overlength integer as a stop
|
||||
word, so that it will not be indexed. Note that this also means that
|
||||
such an integer cannot be searched for.
|
||||
</para>
|
||||
</listitem>
|
||||
</itemizedlist>
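
<para>
A minimal sketch of setting these two options, assuming a dictionary named
<literal>intdict</> has already been created from this template (the name is
an assumption here, not something stated in the text above):
</para>

<programlisting>
-- Keep at most 4 digits, and reject longer integers instead of truncating.
ALTER TEXT SEARCH DICTIONARY intdict (MAXLEN = 4, REJECTLONG = true);

-- With REJECTLONG = true, an 8-digit integer should be treated as a
-- stop word; with REJECTLONG = false it would be cut to its first
-- MAXLEN digits.
SELECT ts_lexize('intdict', '12345678');
</programlisting>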
|
||||
|
@ -1,3 +1,5 @@
|
||||
<!-- $PostgreSQL: pgsql/doc/src/sgml/dict-xsyn.sgml,v 1.2 2007/12/06 04:12:10 tgl Exp $ -->
|
||||
|
||||
<sect1 id="dict-xsyn">
|
||||
<title>dict_xsyn</title>
|
||||
|
||||
@ -6,28 +8,34 @@
|
||||
</indexterm>
|
||||
|
||||
<para>
|
||||
The Extended Synonym Dictionary module replaces words with groups of their
|
||||
synonyms, and so makes it possible to search for a word using any of its
|
||||
synonyms.
|
||||
<filename>dict_xsyn</> (Extended Synonym Dictionary) is an example of an
|
||||
add-on dictionary template for full-text search. This dictionary type
|
||||
replaces words with groups of their synonyms, and so makes it possible to
|
||||
search for a word using any of its synonyms.
|
||||
</para>
|
||||
|
||||
<sect2>
|
||||
<title>Configuration</title>
|
||||
|
||||
<para>
|
||||
A <literal>dict_xsyn</> dictionary accepts the following options:
|
||||
</para>
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
<para>
|
||||
KEEPORIG controls whether the original word is included, or only its
|
||||
synonyms. Default is 'true'.
|
||||
<literal>keeporig</> controls whether the original word is included (if
|
||||
<literal>true</>), or only its synonyms (if <literal>false</>). Default
|
||||
is <literal>true</>.
|
||||
</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>
|
||||
RULES is the base name of the file containing the list of synonyms.
|
||||
This file must be in $(prefix)/share/tsearch_data/, and its name must
|
||||
end in ".rules" (which is not included in the RULES parameter).
|
||||
<literal>rules</> is the base name of the file containing the list of
|
||||
synonyms. This file must be stored in
|
||||
<filename>$SHAREDIR/tsearch_data/</> (where <literal>$SHAREDIR</> means
|
||||
the <productname>PostgreSQL</> installation's shared-data directory).
|
||||
Its name must end in <literal>.rules</> (which is not to be included in
|
||||
the <literal>rules</> parameter).
|
||||
</para>
|
||||
</listitem>
|
||||
</itemizedlist>
|
||||
@ -38,41 +46,63 @@
|
||||
<listitem>
|
||||
<para>
|
||||
Each line represents a group of synonyms for a single word, which is
|
||||
given first on the line. Synonyms are separated by whitespace:
|
||||
</para>
|
||||
given first on the line. Synonyms are separated by whitespace, thus:
|
||||
<programlisting>
|
||||
word syn1 syn2 syn3
|
||||
</programlisting>
|
||||
</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>
|
||||
Sharp ('#') sign is a comment delimiter. It may appear at any position
|
||||
inside the line. The rest of the line will be skipped.
|
||||
The sharp (<literal>#</>) sign is a comment delimiter. It may appear at
|
||||
any position in a line. The rest of the line will be skipped.
|
||||
</para>
|
||||
</listitem>
|
||||
</itemizedlist>
|
||||
|
||||
<para>
|
||||
Look at xsyn_sample.rules, which is installed in $(prefix)/share/tsearch_data/,
|
||||
for an example.
|
||||
Look at <filename>xsyn_sample.rules</>, which is installed in
|
||||
<filename>$SHAREDIR/tsearch_data/</>, for an example.
|
||||
</para>
|
||||
</sect2>
|
||||
|
||||
<sect2>
|
||||
<title>Usage</title>
|
||||
<programlisting>
|
||||
mydb=# SELECT ts_lexize('xsyn','word');
|
||||
ts_lexize
|
||||
----------------
|
||||
{word,syn1,syn2,syn3}
|
||||
</programlisting>
|
||||
|
||||
<para>
|
||||
Change dictionary options:
|
||||
</para>
|
||||
<programlisting>
|
||||
mydb# ALTER TEXT SEARCH DICTIONARY xsyn (KEEPORIG=false);
|
||||
Running the installation script creates a text search template
|
||||
<literal>xsyn_template</> and a dictionary <literal>xsyn</>
|
||||
based on it, with default parameters. You can alter the
|
||||
parameters, for example
|
||||
|
||||
<programlisting>
|
||||
mydb=# ALTER TEXT SEARCH DICTIONARY xsyn (RULES='my_rules', KEEPORIG=false);
|
||||
ALTER TEXT SEARCH DICTIONARY
|
||||
</programlisting>
|
||||
</programlisting>
|
||||
|
||||
or create new dictionaries based on the template.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
To test the dictionary, you can try
|
||||
|
||||
<programlisting>
|
||||
mydb=# SELECT ts_lexize('xsyn', 'word');
|
||||
ts_lexize
|
||||
-----------------------
|
||||
{word,syn1,syn2,syn3}
|
||||
</programlisting>
|
||||
|
||||
but real-world usage will involve including it in a text search
|
||||
configuration as described in <xref linkend="textsearch">.
|
||||
That might look like this:
|
||||
|
||||
<programlisting>
|
||||
ALTER TEXT SEARCH CONFIGURATION english
|
||||
ALTER MAPPING FOR word, asciiword WITH xsyn, english_stem;
|
||||
</programlisting>
|
||||
|
||||
</para>
|
||||
</sect2>
|
||||
|
||||
</sect1>
|
||||
|
@ -1,3 +1,5 @@
|
||||
<!-- $PostgreSQL: pgsql/doc/src/sgml/earthdistance.sgml,v 1.3 2007/12/06 04:12:10 tgl Exp $ -->
|
||||
|
||||
<sect1 id="earthdistance">
|
||||
<title>earthdistance</title>
|
||||
|
||||
@ -6,128 +8,184 @@
|
||||
</indexterm>
|
||||
|
||||
<para>
|
||||
This module contains two different approaches to calculating
|
||||
great circle distances on the surface of the Earth. The one described
|
||||
first depends on the contrib/cube package (which MUST be installed before
|
||||
earthdistance is installed). The second one is based on the point
|
||||
datatype using latitude and longitude for the coordinates. The install
|
||||
script makes the defined functions executable by anyone.
|
||||
</para>
|
||||
<para>
|
||||
A spherical model of the Earth is used.
|
||||
</para>
|
||||
<para>
|
||||
Data is stored in cubes that are points (both corners are the same) using 3
|
||||
coordinates representing the distance from the center of the Earth.
|
||||
</para>
|
||||
<para>
|
||||
The radius of the Earth is obtained from the earth() function. It is
|
||||
given in meters. But by changing this one function you can change it
|
||||
to use some other units or to use a different value of the radius
|
||||
that you feel is more appropriate.
|
||||
</para>
|
||||
<para>
|
||||
This package also has applications to astronomical databases as well.
|
||||
Astronomers will probably want to change earth() to return a radius of
|
||||
180/pi() so that distances are in degrees.
|
||||
</para>
|
||||
<para>
|
||||
Functions are provided to allow for input in latitude and longitude (in
|
||||
degrees), to allow for output of latitude and longitude, to calculate
|
||||
the great circle distance between two points and to easily specify a
|
||||
bounding box usable for index searches.
|
||||
</para>
|
||||
<para>
|
||||
The functions are all 'sql' functions. If you want to make these functions
|
||||
executable by other people you will also have to make the referenced
|
||||
cube functions executable. cube(text), cube(float8), cube(cube,float8),
|
||||
cube_distance(cube,cube), cube_ll_coord(cube,int) and
|
||||
cube_enlarge(cube,float8,int) are used indirectly by the earth distance
|
||||
functions. is_point(cube) and cube_dim(cube) are used in constraints for data
|
||||
in domain earth. cube_ur_coord(cube,int) is used in the regression tests and
|
||||
might be useful for looking at bounding box coordinates in user applications.
|
||||
</para>
|
||||
<para>
|
||||
A domain of type cube named earth is defined.
|
||||
There are constraints on it defined to make sure the cube is a point,
|
||||
that it does not have more than 3 dimensions and that it is very near
|
||||
the surface of a sphere centered about the origin with the radius of
|
||||
the Earth.
|
||||
</para>
|
||||
<para>
|
||||
The following functions are provided:
|
||||
The <filename>earthdistance</> module provides two different approaches to
|
||||
calculating great circle distances on the surface of the Earth. The one
|
||||
described first depends on the <filename>cube</> package (which
|
||||
<emphasis>must</> be installed before <filename>earthdistance</> can be
|
||||
installed). The second one is based on the built-in <type>point</> datatype,
|
||||
using longitude and latitude for the coordinates.
|
||||
</para>
|
||||
|
||||
<table id="earthdistance-functions">
|
||||
<title>EarthDistance functions</title>
|
||||
<tgroup cols="2">
|
||||
<tbody>
|
||||
<row>
|
||||
<entry><literal>earth()</literal></entry>
|
||||
<entry>returns the radius of the Earth in meters.</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry><literal>sec_to_gc(float8)</literal></entry>
|
||||
<entry>converts the normal straight line
|
||||
(secant) distance between two points on the surface of the Earth
|
||||
to the great circle distance between them.
|
||||
</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry><literal>gc_to_sec(float8)</literal></entry>
|
||||
<entry>Converts the great circle distance
|
||||
between two points on the surface of the Earth to the normal straight line
|
||||
(secant) distance between them.
|
||||
</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry><literal>ll_to_earth(float8, float8)</literal></entry>
|
||||
<entry>Returns the location of a point on the surface of the Earth given
|
||||
its latitude (argument 1) and longitude (argument 2) in degrees.
|
||||
</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry><literal>latitude(earth)</literal></entry>
|
||||
<entry>Returns the latitude in degrees of a point on the surface of the
|
||||
Earth.
|
||||
</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry><literal>longitude(earth)</literal></entry>
|
||||
<entry>Returns the longitude in degrees of a point on the surface of the
|
||||
Earth.
|
||||
</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry><literal>earth_distance(earth, earth)</literal></entry>
|
||||
<entry>Returns the great circle distance between two points on the
|
||||
surface of the Earth.
|
||||
</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry><literal>earth_box(earth, float8)</literal></entry>
|
||||
<entry>Returns a box suitable for an indexed search using the cube @>
|
||||
operator for points within a given great circle distance of a location.
|
||||
Some points in this box are further than the specified great circle
|
||||
distance from the location so a second check using earth_distance
|
||||
should be made at the same time.
|
||||
</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry><literal><@></literal> operator</entry>
|
||||
<entry>gives the distance in statute miles between
|
||||
two points on the Earth's surface. Coordinates are in degrees. Points are
|
||||
taken as (longitude, latitude) and not vice versa as longitude is closer
|
||||
to the intuitive idea of x-axis and latitude to y-axis.
|
||||
</entry>
|
||||
</row>
|
||||
</tbody>
|
||||
</tgroup>
|
||||
</table>
|
||||
<para>
|
||||
One advantage of using cube representation over a point using latitude and
|
||||
longitude for coordinates, is that you don't have to worry about special
|
||||
conditions at +/- 180 degrees of longitude or near the poles.
|
||||
In this module, the Earth is assumed to be perfectly spherical.
|
||||
(If that's too inaccurate for you, you might want to look at the
|
||||
<application><ulink url="http://www.postgis.org/">PostGIS</ulink></>
|
||||
project.)
|
||||
</para>
|
||||
|
||||
<sect2>
|
||||
<title>Cube-based earth distances</title>
|
||||
|
||||
<para>
|
||||
Data is stored in cubes that are points (both corners are the same) using 3
|
||||
coordinates representing the x, y, and z distance from the center of the
|
||||
Earth. A domain <type>earth</> over <type>cube</> is provided, which
|
||||
includes constraint checks that the value meets these restrictions and
|
||||
is reasonably close to the actual surface of the Earth.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The radius of the Earth is obtained from the <function>earth()</>
|
||||
function. It is given in meters. But by changing this one function you can
|
||||
change the module to use some other units, or to use a different value of
|
||||
the radius that you feel is more appropriate.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
This package has applications to astronomical databases as well.
|
||||
Astronomers will probably want to change <function>earth()</> to return a
|
||||
radius of <literal>180/pi()</> so that distances are in degrees.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Functions are provided to support input in latitude and longitude (in
|
||||
degrees), to support output of latitude and longitude, to calculate
|
||||
the great circle distance between two points and to easily specify a
|
||||
bounding box usable for index searches.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The following functions are provided:
|
||||
</para>
|
||||
|
||||
<table id="earthdistance-cube-functions">
|
||||
<title>Cube-based earthdistance functions</title>
|
||||
<tgroup cols="3">
|
||||
<thead>
|
||||
<row>
|
||||
<entry>Function</entry>
|
||||
<entry>Returns</entry>
|
||||
<entry>Description</entry>
|
||||
</row>
|
||||
</thead>
|
||||
<tbody>
|
||||
<row>
|
||||
<entry><function>earth()</function></entry>
|
||||
<entry><type>float8</type></entry>
|
||||
<entry>Returns the assumed radius of the Earth.</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry><function>sec_to_gc(float8)</function></entry>
|
||||
<entry><type>float8</type></entry>
|
||||
<entry>Converts the normal straight line
|
||||
(secant) distance between two points on the surface of the Earth
|
||||
to the great circle distance between them.
|
||||
</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry><function>gc_to_sec(float8)</function></entry>
|
||||
<entry><type>float8</type></entry>
|
||||
<entry>Converts the great circle distance between two points on the
|
||||
surface of the Earth to the normal straight line (secant) distance
|
||||
between them.
|
||||
</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry><function>ll_to_earth(float8, float8)</function></entry>
|
||||
<entry><type>earth</type></entry>
|
||||
<entry>Returns the location of a point on the surface of the Earth given
|
||||
its latitude (argument 1) and longitude (argument 2) in degrees.
|
||||
</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry><function>latitude(earth)</function></entry>
|
||||
<entry><type>float8</type></entry>
|
||||
<entry>Returns the latitude in degrees of a point on the surface of the
|
||||
Earth.
|
||||
</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry><function>longitude(earth)</function></entry>
|
||||
<entry><type>float8</type></entry>
|
||||
<entry>Returns the longitude in degrees of a point on the surface of the
|
||||
Earth.
|
||||
</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry><function>earth_distance(earth, earth)</function></entry>
|
||||
<entry><type>float8</type></entry>
|
||||
<entry>Returns the great circle distance between two points on the
|
||||
surface of the Earth.
|
||||
</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry><function>earth_box(earth, float8)</function></entry>
|
||||
<entry><type>cube</type></entry>
|
||||
<entry>Returns a box suitable for an indexed search using the cube
|
||||
<literal>@></>
|
||||
operator for points within a given great circle distance of a location.
|
||||
Some points in this box are further than the specified great circle
|
||||
distance from the location, so a second check using
|
||||
<function>earth_distance</> should be included in the query.
|
||||
</entry>
|
||||
</row>
|
||||
</tbody>
|
||||
</tgroup>
|
||||
</table>
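<para>
As a minimal sketch of the two-step search described for
<function>earth_box</>, the query below assumes a hypothetical table
<literal>places(name, lat, lon)</literal> and the default
<function>earth()</> radius, which is in meters. The index name is also
illustrative only.
</para>

<programlisting>
-- Hypothetical table: places(name text, lat float8, lon float8)
-- An expression index lets the box test below use an index scan.
CREATE INDEX places_earth_idx ON places USING gist (ll_to_earth(lat, lon));

-- Find places within 10000 meters of a given location.
SELECT name
  FROM places
 WHERE earth_box(ll_to_earth(40.7128, -74.0060), 10000) @> ll_to_earth(lat, lon)
   -- second check: discard points inside the box but outside the radius
   AND earth_distance(ll_to_earth(40.7128, -74.0060),
                      ll_to_earth(lat, lon)) <= 10000;
</programlisting>

<para>
The box test is cheap and indexable but over-inclusive, so the
<function>earth_distance</> condition is what makes the result exact.
</para>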
|
||||
|
||||
</sect2>
|
||||
|
||||
<sect2>
|
||||
<title>Point-based earth distances</title>
|
||||
|
||||
<para>
|
||||
The second part of the module relies on representing Earth locations as
|
||||
values of type <type>point</>, in which the first component is taken to
|
||||
represent longitude in degrees, and the second component is taken to
|
||||
represent latitude in degrees. Points are taken as (longitude, latitude)
|
||||
and not vice versa because longitude is closer to the intuitive idea of
|
||||
x-axis and latitude to y-axis.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
A single operator is provided:
|
||||
</para>
|
||||
|
||||
<table id="earthdistance-point-operators">
|
||||
<title>Point-based earthdistance operators</title>
|
||||
<tgroup cols="3">
|
||||
<thead>
|
||||
<row>
|
||||
<entry>Operator</entry>
|
||||
<entry>Returns</entry>
|
||||
<entry>Description</entry>
|
||||
</row>
|
||||
</thead>
|
||||
<tbody>
|
||||
<row>
|
||||
<entry><type>point</> <literal><@></literal> <type>point</></entry>
|
||||
<entry><type>float8</type></entry>
|
||||
<entry>Gives the distance in statute miles between
|
||||
two points on the Earth's surface.
|
||||
</entry>
|
||||
</row>
|
||||
</tbody>
|
||||
</tgroup>
|
||||
</table>
|
||||
|
||||
<para>
|
||||
Note that unlike the <type>cube</>-based part of the module, units
|
||||
are hardwired here: changing the <function>earth()</> function will
|
||||
not affect the results of this operator.
|
||||
</para>
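<para>
For illustration, the operator can be applied to two
<type>point</> values written as (longitude, latitude). The coordinates
below are arbitrary sample values, and the kilometer conversion is an
addition that relies only on the definition of the statute mile
(1.609344 km).
</para>

<programlisting>
-- Points are (longitude, latitude) in degrees.
SELECT '(-87.65,41.85)'::point <@> '(-73.94,40.67)'::point AS miles,
       ('(-87.65,41.85)'::point <@> '(-73.94,40.67)'::point) * 1.609344 AS km;
</programlisting>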
|
||||
|
||||
<para>
|
||||
One disadvantage of the longitude/latitude representation is that
|
||||
you need to be careful about the edge conditions near the poles
|
||||
and near +/- 180 degrees of longitude. The <type>cube</>-based
|
||||
representation avoids these discontinuities.
|
||||
</para>
|
||||
|
||||
</sect2>
|
||||
|
||||
</sect1>
|
||||
|
||||
|
@ -1,30 +1,51 @@
|
||||
<!-- $PostgreSQL: pgsql/doc/src/sgml/fuzzystrmatch.sgml,v 1.3 2007/12/06 04:12:10 tgl Exp $ -->
|
||||
|
||||
<sect1 id="fuzzystrmatch">
|
||||
<title>fuzzystrmatch</title>
|
||||
|
||||
<indexterm zone="fuzzystrmatch">
|
||||
<primary>fuzzystrmatch</primary>
|
||||
</indexterm>
|
||||
|
||||
<para>
|
||||
This section describes the fuzzystrmatch module which provides different
|
||||
The <filename>fuzzystrmatch</> module provides several
|
||||
functions to determine similarities and distance between strings.
|
||||
</para>
|
||||
|
||||
<sect2>
|
||||
<title>Soundex</title>
|
||||
|
||||
<para>
|
||||
The Soundex system is a method of matching similar sounding names
|
||||
(or any words) to the same code. It was initially used by the
|
||||
United States Census in 1880, 1900, and 1910, but it has little use
|
||||
beyond English names (or the English pronunciation of names), and
|
||||
it is not a linguistic tool.
|
||||
The Soundex system is a method of matching similar-sounding names
|
||||
by converting them to the same code. It was initially used by the
|
||||
United States Census in 1880, 1900, and 1910. Note that Soundex
|
||||
is not very useful for non-English names.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
When comparing two soundex values to determine similarity, the
|
||||
difference function reports how close the match is on a scale
|
||||
from zero to four, with zero being no match and four being an
|
||||
exact match.
|
||||
The <filename>fuzzystrmatch</> module provides two functions
|
||||
for working with Soundex codes:
|
||||
</para>
|
||||
|
||||
<programlisting>
|
||||
soundex(text) returns text
|
||||
difference(text, text) returns int
|
||||
</programlisting>
|
||||
|
||||
<para>
|
||||
The following are some usage examples:
|
||||
The <function>soundex</> function converts a string to its Soundex code.
|
||||
The <function>difference</> function converts two strings to their Soundex
|
||||
codes and then reports the number of matching code positions. Since
|
||||
Soundex codes have four characters, the result ranges from zero to four,
|
||||
with zero being no match and four being an exact match. (Thus, the
|
||||
function is misnamed — <function>similarity</> would have been
|
||||
a better name.)
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Here are some usage examples:
|
||||
</para>
|
||||
|
||||
<programlisting>
|
||||
SELECT soundex('hello world!');
|
||||
|
||||
@ -41,81 +62,106 @@ INSERT INTO s VALUES ('jack');
|
||||
|
||||
SELECT * FROM s WHERE soundex(nm) = soundex('john');
|
||||
|
||||
SELECT a.nm, b.nm FROM s a, s b WHERE soundex(a.nm) = soundex(b.nm) AND a.oid <> b.oid;
|
||||
|
||||
CREATE FUNCTION text_sx_eq(text, text) RETURNS boolean AS
|
||||
'select soundex($1) = soundex($2)'
|
||||
LANGUAGE SQL;
|
||||
|
||||
CREATE FUNCTION text_sx_lt(text, text) RETURNS boolean AS
|
||||
'select soundex($1) < soundex($2)'
|
||||
LANGUAGE SQL;
|
||||
|
||||
CREATE FUNCTION text_sx_gt(text, text) RETURNS boolean AS
|
||||
'select soundex($1) > soundex($2)'
|
||||
LANGUAGE SQL;
|
||||
|
||||
CREATE FUNCTION text_sx_le(text, text) RETURNS boolean AS
|
||||
'select soundex($1) <= soundex($2)'
|
||||
LANGUAGE SQL;
|
||||
|
||||
CREATE FUNCTION text_sx_ge(text, text) RETURNS boolean AS
|
||||
'select soundex($1) >= soundex($2)'
|
||||
LANGUAGE SQL;
|
||||
|
||||
CREATE FUNCTION text_sx_ne(text, text) RETURNS boolean AS
|
||||
'select soundex($1) <> soundex($2)'
|
||||
LANGUAGE SQL;
|
||||
|
||||
DROP OPERATOR #= (text, text);
|
||||
|
||||
CREATE OPERATOR #= (leftarg=text, rightarg=text, procedure=text_sx_eq, commutator = #=);
|
||||
|
||||
SELECT * FROM s WHERE text_sx_eq(nm, 'john');
|
||||
|
||||
SELECT * FROM s WHERE s.nm #= 'john';
|
||||
|
||||
SELECT * FROM s WHERE difference(s.nm, 'john') > 2;
|
||||
</programlisting>
|
||||
</sect2>
|
||||
|
||||
<sect2>
|
||||
<title>levenshtein</title>
|
||||
<title>Levenshtein</title>
|
||||
|
||||
<para>
|
||||
This function calculates the levenshtein distance between two strings:
|
||||
This function calculates the Levenshtein distance between two strings:
|
||||
</para>
|
||||
|
||||
<programlisting>
|
||||
int levenshtein(text source, text target)
|
||||
levenshtein(text source, text target) returns int
|
||||
</programlisting>
|
||||
|
||||
<para>
|
||||
Both <literal>source</literal> and <literal>target</literal> can be any
|
||||
NOT NULL string with a maximum of 255 characters.
|
||||
non-null string, with a maximum of 255 characters.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Example:
|
||||
</para>
|
||||
|
||||
<programlisting>
|
||||
SELECT levenshtein('GUMBO','GAMBOL');
|
||||
test=# SELECT levenshtein('GUMBO', 'GAMBOL');
|
||||
levenshtein
|
||||
-------------
|
||||
2
|
||||
(1 row)
|
||||
</programlisting>
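<para>
One possible use, sketched here against the table <literal>s</literal>
with column <literal>nm</literal> from the Soundex examples, is to select
names within a small edit distance of a search term. The threshold of 2
is an arbitrary choice for illustration.
</para>

<programlisting>
-- List names that differ from 'john' by at most two single-character edits.
SELECT nm, levenshtein(nm, 'john') AS dist
  FROM s
 WHERE levenshtein(nm, 'john') <= 2
 ORDER BY dist;
</programlisting>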
|
||||
</sect2>
|
||||
|
||||
<sect2>
|
||||
<title>metaphone</title>
|
||||
<title>Metaphone</title>
|
||||
|
||||
<para>
|
||||
This function calculates and returns the metaphone code of an input string:
|
||||
Metaphone, like Soundex, is based on the idea of constructing a
|
||||
representative code for an input string. Two strings are then
|
||||
deemed similar if they have the same codes.
|
||||
</para>
|
||||
<programlisting>
|
||||
text metahpone(text source, int max_output_length)
|
||||
</programlisting>
|
||||
|
||||
<para>
|
||||
<literal>source</literal> has to be a NOT NULL string with a maximum of
|
||||
255 characters. <literal>max_output_length</literal> fixes the maximum
|
||||
This function calculates the metaphone code of an input string:
|
||||
</para>
|
||||
|
||||
<programlisting>
|
||||
metaphone(text source, int max_output_length) returns text
|
||||
</programlisting>
|
||||
|
||||
<para>
|
||||
<literal>source</literal> has to be a non-null string with a maximum of
|
||||
255 characters. <literal>max_output_length</literal> sets the maximum
|
||||
length of the output metaphone code; if longer, the output is truncated
|
||||
to this length.
|
||||
</para>
|
||||
<para>Example</para>
|
||||
|
||||
<para>
|
||||
Example:
|
||||
</para>
|
||||
|
||||
<programlisting>
|
||||
SELECT metaphone('GUMBO',4);
|
||||
test=# SELECT metaphone('GUMBO', 4);
|
||||
metaphone
|
||||
-----------
|
||||
KM
|
||||
(1 row)
|
||||
</programlisting>
|
||||
</sect2>
|
||||
|
||||
<sect2>
|
||||
<title>Double Metaphone</title>
|
||||
|
||||
<para>
|
||||
The Double Metaphone system computes two <quote>sounds like</> strings
|
||||
for a given input string — a <quote>primary</> and an
|
||||
<quote>alternate</>. In most cases they are the same, but for non-English
|
||||
names especially they can be a bit different, depending on pronunciation.
|
||||
These functions compute the primary and alternate codes:
|
||||
</para>
|
||||
|
||||
<programlisting>
|
||||
dmetaphone(text source) returns text
|
||||
dmetaphone_alt(text source) returns text
|
||||
</programlisting>
|
||||
|
||||
<para>
|
||||
There is no length limit on the input strings.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Example:
|
||||
</para>
|
||||
|
||||
<programlisting>
|
||||
test=# select dmetaphone('gumbo');
|
||||
dmetaphone
|
||||
------------
|
||||
KMP
|
||||
(1 row)
|
||||
</programlisting>
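<para>
Because a string has both a primary and an alternate code, a match can be
tested on either one. The following sketch reuses the table
<literal>s</literal> with column <literal>nm</literal> from the Soundex
examples; treating a match on either code as a hit is an assumption about
how an application might want to compare names, not a rule imposed by the
module.
</para>

<programlisting>
-- Accept a row if either the primary or the alternate code agrees.
SELECT nm
  FROM s
 WHERE dmetaphone(nm) = dmetaphone('john')
    OR dmetaphone_alt(nm) = dmetaphone_alt('john');
</programlisting>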
|
||||
</sect2>
|
||||
|
||||
|
@ -1,3 +1,5 @@
|
||||
<!-- $PostgreSQL: pgsql/doc/src/sgml/hstore.sgml,v 1.2 2007/12/06 04:12:10 tgl Exp $ -->
|
||||
|
||||
<sect1 id="hstore">
|
||||
<title>hstore</title>
|
||||
|
||||
@ -6,224 +8,234 @@
|
||||
</indexterm>
|
||||
|
||||
<para>
|
||||
The <literal>hstore</literal> module is usefull for storing (key,value) pairs.
|
||||
This module can be useful in different scenarios: case with many attributes
|
||||
rarely searched, semistructural data or a lazy DBA.
|
||||
This module implements a data type <type>hstore</> for storing sets of
|
||||
(key,value) pairs within a single <productname>PostgreSQL</> data field.
|
||||
This can be useful in various scenarios, such as rows with many attributes
|
||||
that are rarely examined, or semi-structured data.
|
||||
</para>
|
||||
|
||||
<sect2>
|
||||
<title>Operations</title>
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
<para>
|
||||
<literal>hstore -> text</literal> - get value , perl analogy $h{key}
|
||||
</para>
|
||||
<programlisting>
|
||||
select 'a=>q, b=>g'->'a';
|
||||
?
|
||||
------
|
||||
q
|
||||
</programlisting>
|
||||
<para>
|
||||
Note the use of parenthesis in the select below, because priority of 'is' is
|
||||
higher than that of '->':
|
||||
</para>
|
||||
<programlisting>
|
||||
SELECT id FROM entrants WHERE (info->'education_period') IS NOT NULL;
|
||||
</programlisting>
|
||||
</listitem>
|
||||
<title><type>hstore</> External Representation</title>
|
||||
|
||||
<listitem>
|
||||
<para>
|
||||
<literal>hstore || hstore</literal> - concatenation, perl analogy %a=( %b, %c );
|
||||
</para>
|
||||
<programlisting>
|
||||
regression=# select 'a=>b'::hstore || 'c=>d'::hstore;
|
||||
?column?
|
||||
--------------------
|
||||
"a"=>"b", "c"=>"d"
|
||||
(1 row)
|
||||
</programlisting>
|
||||
<para>
|
||||
The text representation of an <type>hstore</> value includes zero
|
||||
or more <replaceable>key</> <literal>=></> <replaceable>value</>
|
||||
items, separated by commas. For example:
|
||||
|
||||
<para>
|
||||
but, notice
|
||||
</para>
|
||||
<programlisting>
|
||||
k => v
|
||||
foo => bar, baz => whatever
|
||||
"1-a" => "anything at all"
|
||||
</programlisting>
|
||||
|
||||
<programlisting>
|
||||
regression=# select 'a=>b'::hstore || 'a=>d'::hstore;
|
||||
?column?
|
||||
----------
|
||||
"a"=>"d"
|
||||
(1 row)
|
||||
</programlisting>
|
||||
</listitem>
|
||||
The order of the items is not considered significant (and may not be
|
||||
reproduced on output). Whitespace between items or around the
|
||||
<literal>=></> sign is ignored. Use double quotes if a key or
|
||||
value includes whitespace, comma, <literal>=</> or <literal>></>.
|
||||
To include a double quote or a backslash in a key or value, precede
|
||||
it with another backslash. (Keep in mind that depending on the
|
||||
setting of <varname>standard_conforming_strings</>, you may need to
|
||||
double backslashes in SQL literal strings.)
|
||||
</para>
|
||||
|
||||
<listitem>
|
||||
<para>
|
||||
<literal>text => text</literal> - creates hstore type from two text strings
|
||||
</para>
|
||||
<programlisting>
|
||||
select 'a'=>'b';
|
||||
?column?
|
||||
----------
|
||||
"a"=>"b"
|
||||
</programlisting>
|
||||
</listitem>
|
||||
<para>
|
||||
A value (but not a key) can be a SQL NULL. This is represented as
|
||||
|
||||
<listitem>
|
||||
<para>
|
||||
<literal>hstore @> hstore</literal> - contains operation, check if left operand contains right.
|
||||
</para>
|
||||
<programlisting>
|
||||
regression=# select 'a=>b, b=>1, c=>NULL'::hstore @> 'a=>c';
|
||||
?column?
|
||||
----------
|
||||
f
|
||||
(1 row)
|
||||
<programlisting>
|
||||
key => NULL
|
||||
</programlisting>
|
||||
|
||||
regression=# select 'a=>b, b=>1, c=>NULL'::hstore @> 'b=>1';
|
||||
?column?
|
||||
----------
|
||||
t
|
||||
(1 row)
|
||||
</programlisting>
|
||||
</listitem>
|
||||
The <literal>NULL</> keyword is not case-sensitive. Again, use
|
||||
double quotes if you want the string <literal>null</> to be treated
|
||||
as an ordinary data value.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Currently, double quotes are always used to surround key and value
|
||||
strings on output, even when this is not strictly necessary.
|
||||
</para>
|
||||
|
||||
<listitem>
|
||||
<para>
|
||||
<literal>hstore <@ hstore</literal> - contained operation, check if
|
||||
left operand is contained in right
|
||||
</para>
|
||||
<para>
|
||||
(Before PostgreSQL 8.2, the containment operators @> and <@ were
|
||||
respectively called @ and ~. These names are still available, but are
|
||||
deprecated and will eventually be retired. Notice that the old names
|
||||
are reversed from the convention formerly followed by the core geometric
|
||||
datatypes!)
|
||||
</para>
|
||||
</listitem>
|
||||
</itemizedlist>
|
||||
</sect2>
|
||||
|
||||
<sect2>
|
||||
<title>Functions</title>
|
||||
<title><type>hstore</> Operators and Functions</title>
|
||||
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
<para>
|
||||
<literal>akeys(hstore)</literal> - returns all keys from hstore as array
|
||||
</para>
|
||||
<programlisting>
|
||||
regression=# select akeys('a=>1,b=>2');
|
||||
akeys
|
||||
-------
|
||||
{a,b}
|
||||
</programlisting>
|
||||
</listitem>
|
||||
<table id="hstore-op-table">
|
||||
<title><type>hstore</> Operators</title>
|
||||
|
||||
<listitem>
|
||||
<para>
|
||||
<literal>skeys(hstore)</literal> - returns all keys from hstore as strings
|
||||
</para>
|
||||
<programlisting>
|
||||
regression=# select skeys('a=>1,b=>2');
|
||||
skeys
|
||||
-------
|
||||
a
|
||||
b
|
||||
</programlisting>
|
||||
</listitem>
|
||||
<tgroup cols="4">
|
||||
<thead>
|
||||
<row>
|
||||
<entry>Operator</entry>
|
||||
<entry>Description</entry>
|
||||
<entry>Example</entry>
|
||||
<entry>Result</entry>
|
||||
</row>
|
||||
</thead>
|
||||
|
||||
<listitem>
|
||||
<para>
|
||||
<literal>avals(hstore)</literal> - returns all values from hstore as array
|
||||
</para>
|
||||
<programlisting>
|
||||
regression=# select avals('a=>1,b=>2');
|
||||
avals
|
||||
-------
|
||||
{1,2}
|
||||
</programlisting>
|
||||
</listitem>
|
||||
<tbody>
|
||||
<row>
|
||||
<entry><type>hstore</> <literal>-></> <type>text</></entry>
|
||||
<entry>get value for key (null if not present)</entry>
|
||||
<entry><literal>'a=>x, b=>y'::hstore -> 'a'</literal></entry>
|
||||
<entry><literal>x</literal></entry>
|
||||
</row>
|
||||
|
||||
<listitem>
|
||||
<para>
|
||||
<literal>svals(hstore)</literal> - returns all values from hstore as
|
||||
strings
|
||||
</para>
|
||||
<programlisting>
|
||||
regression=# select svals('a=>1,b=>2');
|
||||
svals
|
||||
-------
|
||||
1
|
||||
2
|
||||
</programlisting>
|
||||
</listitem>
|
||||
<row>
|
||||
<entry><type>text</> <literal>=></> <type>text</></entry>
|
||||
<entry>make single-item <type>hstore</></entry>
|
||||
<entry><literal>'a' => 'b'</literal></entry>
|
||||
<entry><literal>"a"=>"b"</literal></entry>
|
||||
</row>
|
||||
|
||||
<listitem>
|
||||
<para>
|
||||
<literal>delete (hstore,text)</literal> - delete (key,value) from hstore if
|
||||
key matches argument.
|
||||
</para>
|
||||
<programlisting>
|
||||
regression=# select delete('a=>1,b=>2','b');
|
||||
delete
|
||||
----------
|
||||
"a"=>"1"
|
||||
</programlisting>
|
||||
</listitem>
|
||||
<row>
|
||||
<entry><type>hstore</> <literal>||</> <type>hstore</></entry>
|
||||
<entry>concatenation</entry>
|
||||
<entry><literal>'a=>b, c=>d'::hstore || 'c=>x, d=>q'::hstore</literal></entry>
|
||||
<entry><literal>"a"=>"b", "c"=>"x", "d"=>"q"</literal></entry>
|
||||
</row>
|
||||
|
||||
<listitem>
|
||||
<para>
|
||||
<literal>each(hstore)</literal> - return (key, value) pairs
|
||||
</para>
|
||||
<programlisting>
|
||||
regression=# select * from each('a=>1,b=>2');
|
||||
<row>
|
||||
<entry><type>hstore</> <literal>?</> <type>text</></entry>
|
||||
<entry>does <type>hstore</> contain key?</entry>
|
||||
<entry><literal>'a=>1'::hstore ? 'a'</literal></entry>
|
||||
<entry><literal>t</literal></entry>
|
||||
</row>
|
||||
|
||||
<row>
|
||||
<entry><type>hstore</> <literal>@></> <type>hstore</></entry>
|
||||
<entry>does left operand contain right?</entry>
|
||||
<entry><literal>'a=>b, b=>1, c=>NULL'::hstore @> 'b=>1'</literal></entry>
|
||||
<entry><literal>t</literal></entry>
|
||||
</row>
|
||||
|
||||
<row>
|
||||
<entry><type>hstore</> <literal><@</> <type>hstore</></entry>
|
||||
<entry>is left operand contained in right?</entry>
|
||||
<entry><literal>'a=>c'::hstore <@ 'a=>b, b=>1, c=>NULL'</literal></entry>
|
||||
<entry><literal>f</literal></entry>
|
||||
</row>
|
||||
|
||||
</tbody>
|
||||
</tgroup>
|
||||
</table>
|
||||
|
||||
<para>
|
||||
(Before PostgreSQL 8.2, the containment operators @> and <@ were
|
||||
respectively called @ and ~. These names are still available, but are
|
||||
deprecated and will eventually be retired. Notice that the old names
|
||||
are reversed from the convention formerly followed by the core geometric
|
||||
datatypes!)
|
||||
</para>
|
||||
|
||||
<table id="hstore-func-table">
|
||||
<title><type>hstore</> Functions</title>
|
||||
|
||||
<tgroup cols="5">
|
||||
<thead>
|
||||
<row>
|
||||
<entry>Function</entry>
|
||||
<entry>Return Type</entry>
|
||||
<entry>Description</entry>
|
||||
<entry>Example</entry>
|
||||
<entry>Result</entry>
|
||||
</row>
|
||||
</thead>
|
||||
|
||||
<tbody>
|
||||
<row>
|
||||
<entry><function>akeys(hstore)</function></entry>
|
||||
<entry><type>text[]</type></entry>
|
||||
<entry>get <type>hstore</>'s keys as array</entry>
|
||||
<entry><literal>akeys('a=>1,b=>2')</literal></entry>
|
||||
<entry><literal>{a,b}</literal></entry>
|
||||
</row>
|
||||
|
||||
<row>
|
||||
<entry><function>skeys(hstore)</function></entry>
|
||||
<entry><type>setof text</type></entry>
|
||||
<entry>get <type>hstore</>'s keys as set</entry>
|
||||
<entry><literal>skeys('a=>1,b=>2')</literal></entry>
|
||||
<entry>
|
||||
<programlisting>
|
||||
a
|
||||
b
|
||||
</programlisting></entry>
|
||||
</row>
|
||||
|
||||
<row>
|
||||
<entry><function>avals(hstore)</function></entry>
|
||||
<entry><type>text[]</type></entry>
|
||||
<entry>get <type>hstore</>'s values as array</entry>
|
||||
<entry><literal>avals('a=>1,b=>2')</literal></entry>
|
||||
<entry><literal>{1,2}</literal></entry>
|
||||
</row>
|
||||
|
||||
<row>
|
||||
<entry><function>svals(hstore)</function></entry>
|
||||
<entry><type>setof text</type></entry>
|
||||
<entry>get <type>hstore</>'s values as set</entry>
|
||||
<entry><literal>svals('a=>1,b=>2')</literal></entry>
|
||||
<entry>
|
||||
<programlisting>
|
||||
1
|
||||
2
|
||||
</programlisting></entry>
|
||||
</row>
|
||||
|
||||
<row>
|
||||
<entry><function>each(hstore)</function></entry>
|
||||
<entry><type>setof (key text, value text)</type></entry>
|
||||
<entry>get <type>hstore</>'s keys and values as set</entry>
|
||||
<entry><literal>select * from each('a=>1,b=>2')</literal></entry>
|
||||
<entry>
|
||||
<programlisting>
|
||||
key | value
|
||||
-----+-------
|
||||
a | 1
|
||||
b | 2
|
||||
</programlisting>
|
||||
</listitem>
|
||||
</programlisting></entry>
|
||||
</row>
|
||||
|
||||
<listitem>
|
||||
<para>
|
||||
<literal>exist (hstore,text)</literal>
|
||||
</para>
|
||||
<para>
|
||||
<literal>hstore ? text</literal> - returns 'true if key is exists in hstore
|
||||
and false otherwise.
|
||||
</para>
|
||||
<programlisting>
|
||||
regression=# select exist('a=>1','a'), 'a=>1' ? 'a';
|
||||
exist | ?column?
|
||||
-------+----------
|
||||
t | t
|
||||
</programlisting>
|
||||
</listitem>
|
||||
<row>
|
||||
<entry><function>exist(hstore,text)</function></entry>
|
||||
<entry><type>boolean</type></entry>
|
||||
<entry>does <type>hstore</> contain key?</entry>
|
||||
<entry><literal>exist('a=>1','a')</literal></entry>
|
||||
<entry><literal>t</literal></entry>
|
||||
</row>
|
||||
|
||||
<listitem>
|
||||
<para>
|
||||
<literal>defined (hstore,text)</literal> - returns true if key is exists in
|
||||
hstore and its value is not NULL.
|
||||
</para>
|
||||
<programlisting>
|
||||
regression=# select defined('a=>NULL','a');
|
||||
defined
|
||||
---------
|
||||
f
|
||||
</programlisting>
|
||||
</listitem>
|
||||
</itemizedlist>
|
||||
<row>
|
||||
<entry><function>defined(hstore,text)</function></entry>
|
||||
<entry><type>boolean</type></entry>
|
||||
<entry>does <type>hstore</> contain non-null value for key?</entry>
|
||||
<entry><literal>defined('a=>NULL','a')</literal></entry>
|
||||
<entry><literal>f</literal></entry>
|
||||
</row>
|
||||
|
||||
<row>
|
||||
<entry><function>delete(hstore,text)</function></entry>
|
||||
<entry><type>hstore</type></entry>
|
||||
<entry>delete any item matching key</entry>
|
||||
<entry><literal>delete('a=>1,b=>2','b')</literal></entry>
|
||||
<entry><literal>"a"=>"1"</literal></entry>
|
||||
</row>
|
||||
|
||||
</tbody>
|
||||
</tgroup>
|
||||
</table>
|
||||
</sect2>
|
||||
|
||||
<sect2>
|
||||
<title>Indices</title>
|
||||
<title>Indexes</title>
|
||||
|
||||
<para>
|
||||
Module provides index support for '@>' and '?' operations.
|
||||
<type>hstore</> has index support for <literal>@></> and <literal>?</>
|
||||
operators. You can use either GiST or GIN index types. For example:
|
||||
</para>
|
||||
<programlisting>
|
||||
CREATE INDEX hidx ON testhstore USING GIST(h);
|
||||
|
||||
CREATE INDEX hidx ON testhstore USING GIN(h);
|
||||
</programlisting>
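<para>
Queries of the following shape are the ones such an index can help with.
The key <literal>line</literal> and the value <literal>1</literal> are
placeholders chosen for illustration; whether the planner actually uses
the index depends on the data and can be checked with
<command>EXPLAIN</>.
</para>

<programlisting>
-- Rows whose hstore column contains the key 'line'
SELECT * FROM testhstore WHERE h ? 'line';

-- Rows whose hstore column contains the pair line => 1
SELECT * FROM testhstore WHERE h @> 'line=>1';
</programlisting>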
|
||||
</sect2>
|
||||
@ -232,44 +244,52 @@ CREATE INDEX hidx ON testhstore USING GIN(h);
|
||||
<title>Examples</title>
|
||||
|
||||
<para>
|
||||
Add a key:
|
||||
Add a key, or update an existing key with a new value:
|
||||
</para>
|
||||
<programlisting>
|
||||
UPDATE tt SET h=h||'c=>3';
|
||||
UPDATE tab SET h = h || ('c' => '3');
|
||||
</programlisting>
|
||||
|
||||
<para>
|
||||
Delete a key:
|
||||
</para>
|
||||
<programlisting>
|
||||
UPDATE tt SET h=delete(h,'k1');
|
||||
UPDATE tab SET h = delete(h, 'k1');
|
||||
</programlisting>
|
||||
</sect2>
|
||||
|
||||
<sect2>
|
||||
<title>Statistics</title>
|
||||
|
||||
<para>
|
||||
hstore type, because of its intrinsic liberality, could contain a lot of
|
||||
different keys. Checking for valid keys is the task of application.
|
||||
Examples below demonstrate several techniques how to check keys statistics.
|
||||
The <type>hstore</> type places no restrictions on key names, so a
column can contain many different keys. Checking for valid keys is the
task of the application. The examples below demonstrate several
techniques for checking keys and obtaining statistics.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Simple example
|
||||
Simple example:
|
||||
</para>
|
||||
<programlisting>
|
||||
SELECT * FROM each('aaa=>bq, b=>NULL, ""=>1 ');
|
||||
SELECT * FROM each('aaa=>bq, b=>NULL, ""=>1');
|
||||
</programlisting>
|
||||
|
||||
<para>
|
||||
Using table
|
||||
Using a table:
|
||||
</para>
|
||||
<programlisting>
|
||||
SELECT (each(h)).key, (each(h)).value INTO stat FROM testhstore ;
|
||||
SELECT (each(h)).key, (each(h)).value INTO stat FROM testhstore;
|
||||
</programlisting>
|
||||
|
||||
<para>Online stat</para>
|
||||
<para>
|
||||
Online statistics:
|
||||
</para>
|
||||
<programlisting>
|
||||
SELECT key, count(*) FROM (SELECT (each(h)).key FROM testhstore) AS stat GROUP BY key ORDER BY count DESC, key;
|
||||
SELECT key, count(*) FROM
|
||||
(SELECT (each(h)).key FROM testhstore) AS stat
|
||||
GROUP BY key
|
||||
ORDER BY count DESC, key;
|
||||
key | count
|
||||
-----------+-------
|
||||
line | 883
|
||||
@ -287,12 +307,14 @@ SELECT key, count(*) FROM (SELECT (each(h)).key FROM testhstore) AS stat GROUP B
|
||||
|
||||
<sect2>
|
||||
<title>Authors</title>
|
||||
|
||||
<para>
|
||||
Oleg Bartunov <email>oleg@sai.msu.su</email>, Moscow, Moscow University, Russia
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Teodor Sigaev <email>teodor@sigaev.ru</email>, Moscow, Delta-Soft Ltd.,Russia
|
||||
Teodor Sigaev <email>teodor@sigaev.ru</email>, Moscow, Delta-Soft Ltd., Russia
|
||||
</para>
|
||||
</sect2>
|
||||
</sect1>
|
||||
|
||||
</sect1>
|
||||
|
@ -1,3 +1,4 @@
|
||||
<!-- $PostgreSQL: pgsql/doc/src/sgml/lo.sgml,v 1.3 2007/12/06 04:12:10 tgl Exp $ -->
|
||||
|
||||
<sect1 id="lo">
|
||||
<title>lo</title>
|
||||
@ -7,112 +8,119 @@
|
||||
</indexterm>
|
||||
|
||||
<para>
|
||||
PostgreSQL type extension for managing Large Objects
|
||||
The <filename>lo</> module provides support for managing Large Objects
|
||||
(also called LOs or BLOBs). This includes a data type <type>lo</>
|
||||
and a trigger <function>lo_manage</>.
|
||||
</para>
|
||||
|
||||
<sect2>
|
||||
<title>Overview</title>
|
||||
<title>Rationale</title>
|
||||
|
||||
<para>
|
||||
One of the problems with the JDBC driver (and this affects the ODBC driver
|
||||
also), is that the specification assumes that references to BLOBS (Binary
|
||||
Large OBjectS) are stored within a table, and if that entry is changed, the
|
||||
also), is that the specification assumes that references to BLOBs (Binary
|
||||
Large OBjects) are stored within a table, and if that entry is changed, the
|
||||
associated BLOB is deleted from the database.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
As PostgreSQL stands, this doesn't occur. Large objects are treated as
|
||||
objects in their own right; a table entry can reference a large object by
|
||||
OID, but there can be multiple table entries referencing the same large
|
||||
object OID, so the system doesn't delete the large object just because you
|
||||
change or remove one such entry.
|
||||
As <productname>PostgreSQL</> stands, this doesn't occur. Large objects
|
||||
are treated as objects in their own right; a table entry can reference a
|
||||
large object by OID, but there can be multiple table entries referencing
|
||||
the same large object OID, so the system doesn't delete the large object
|
||||
just because you change or remove one such entry.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Now this is fine for new PostgreSQL-specific applications, but existing ones
|
||||
using JDBC or ODBC won't delete the objects, resulting in orphaning - objects
|
||||
that are not referenced by anything, and simply occupy disk space.
|
||||
Now this is fine for <productname>PostgreSQL</>-specific applications, but
|
||||
standard code using JDBC or ODBC won't delete the objects, resulting in
|
||||
orphan objects — objects that are not referenced by anything, and
|
||||
simply occupy disk space.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The <filename>lo</> module allows fixing this by attaching a trigger
|
||||
to tables that contain LO reference columns. The trigger essentially just
|
||||
does a <function>lo_unlink</> whenever you delete or modify a value
|
||||
referencing a large object. When you use this trigger, you are assuming
|
||||
that there is only one database reference to any large object that is
|
||||
referenced in a trigger-controlled column!
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The module also provides a data type <type>lo</>, which is really just
|
||||
a domain of the <type>oid</> type. This is useful for differentiating
|
||||
database columns that hold large object references from those that are
|
||||
OIDs of other things. You don't have to use the <type>lo</> type to
|
||||
use the trigger, but it may be convenient to use it to keep track of which
|
||||
columns in your database represent large objects that you are managing with
|
||||
the trigger. It is also rumored that the ODBC driver gets confused if you
|
||||
don't use <type>lo</> for BLOB columns.
|
||||
</para>
|
||||
</sect2>
|
||||
|
||||
<sect2>
|
||||
<title>The Fix</title>
|
||||
<para>
|
||||
I've fixed this by creating a new data type 'lo', some support functions, and
|
||||
a Trigger which handles the orphaning problem. The trigger essentially just
|
||||
does a 'lo_unlink' whenever you delete or modify a value referencing a large
|
||||
object. When you use this trigger, you are assuming that there is only one
|
||||
database reference to any large object that is referenced in a
|
||||
trigger-controlled column!
|
||||
</para>
|
||||
<para>
|
||||
The 'lo' type was created because we needed to differentiate between plain
|
||||
OIDs and Large Objects. Currently the JDBC driver handles this dilemma easily,
|
||||
but (after talking to Byron), the ODBC driver needed a unique type. They had
|
||||
created an 'lo' type, but not the solution to orphaning.
|
||||
</para>
|
||||
<para>
|
||||
You don't actually have to use the 'lo' type to use the trigger, but it may be
|
||||
convenient to use it to keep track of which columns in your database represent
|
||||
large objects that you are managing with the trigger.
|
||||
</para>
|
||||
</sect2>
|
||||
<title>How to Use It</title>
|
||||
|
||||
<sect2>
|
||||
<title>How to Use</title>
|
||||
<para>
|
||||
The easiest way is by an example:
|
||||
Here's a simple example of usage:
|
||||
</para>
|
||||
|
||||
<programlisting>
|
||||
CREATE TABLE image (title TEXT, raster lo);
|
||||
|
||||
CREATE TRIGGER t_raster BEFORE UPDATE OR DELETE ON image
|
||||
FOR EACH ROW EXECUTE PROCEDURE lo_manage(raster);
|
||||
</programlisting>
|
||||
|
||||
<para>
|
||||
Create a trigger for each column that contains a lo type, and give the column
|
||||
name as the trigger procedure argument. You can have more than one trigger on
|
||||
a table if you need multiple lo columns in the same table, but don't forget to
|
||||
give a different name to each trigger.
|
||||
For each column that will contain unique references to large objects,
|
||||
create a <literal>BEFORE UPDATE OR DELETE</> trigger, and give the column
|
||||
name as the sole trigger argument. If you need multiple <type>lo</>
|
||||
columns in the same table, create a separate trigger for each one,
|
||||
remembering to give a different name to each trigger on the same table.
|
||||
</para>
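<para>
A table with two managed columns might be set up as in the following
sketch. The table, column and trigger names are hypothetical.
</para>

<programlisting>
-- Hypothetical table with two large object reference columns
CREATE TABLE document (title text, body lo, thumbnail lo);

-- One trigger per lo column, each with its own name
CREATE TRIGGER t_body BEFORE UPDATE OR DELETE ON document
    FOR EACH ROW EXECUTE PROCEDURE lo_manage(body);

CREATE TRIGGER t_thumbnail BEFORE UPDATE OR DELETE ON document
    FOR EACH ROW EXECUTE PROCEDURE lo_manage(thumbnail);
</programlisting>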
|
||||
</sect2>
|
||||
|
||||
<sect2>
|
||||
<title>Issues</title>
|
||||
<title>Limitations</title>
|
||||
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
<para>
|
||||
Dropping a table will still orphan any objects it contains, as the trigger
|
||||
is not executed.
|
||||
is not executed. You can avoid this by preceding the <command>DROP
|
||||
TABLE</> with <command>DELETE FROM <replaceable>table</></command>.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Avoid this by preceding the 'drop table' with 'delete from {table}'.
|
||||
<command>TRUNCATE</> has the same hazard.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
If you already have, or suspect you have, orphaned large objects, see
|
||||
the contrib/vacuumlo module to help you clean them up. It's a good idea
|
||||
to run contrib/vacuumlo occasionally as a back-stop to the lo_manage
|
||||
trigger.
|
||||
If you already have, or suspect you have, orphaned large objects, see the
|
||||
<filename>contrib/vacuumlo</> module (<xref linkend="vacuumlo">) to help
|
||||
you clean them up. It's a good idea to run <application>vacuumlo</>
|
||||
occasionally as a back-stop to the <function>lo_manage</> trigger.
|
||||
</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>
|
||||
Some frontends may create their own tables, and will not create the
|
||||
associated trigger(s). Also, users may not remember (or know) to create
|
||||
associated trigger(s). Also, users may not remember (or know) to create
|
||||
the triggers.
|
||||
</para>
|
||||
</listitem>
|
||||
</itemizedlist>
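<para>
A minimal sketch of the workaround for the first limitation, assuming the
<literal>image</literal> table and its <function>lo_manage</function>
trigger from the usage example earlier in this section:
</para>

<programlisting>
BEGIN;
-- the row-level trigger runs for every deleted row, before the table goes away
DELETE FROM image;
DROP TABLE image;
COMMIT;
</programlisting>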
|
||||
|
||||
<para>
|
||||
As the ODBC driver needs a permanent lo type (& JDBC could be optimised to
|
||||
use it if it's Oid is fixed), and as the above issues can only be fixed by
|
||||
some internal changes, I feel it should become a permanent built-in type.
|
||||
</para>
|
||||
</sect2>
|
||||
|
||||
<sect2>
|
||||
<title>Author</title>
|
||||
|
||||
<para>
|
||||
Peter Mount <email>peter@retep.org.uk</email> June 13 1998
|
||||
Peter Mount <email>peter@retep.org.uk</email>
|
||||
</para>
|
||||
</sect2>
|
||||
</sect1>
|
||||
|
||||
</sect1>
|
||||
|
@ -1,3 +1,4 @@
|
||||
<!-- $PostgreSQL: pgsql/doc/src/sgml/seg.sgml,v 1.4 2007/12/06 04:12:10 tgl Exp $ -->
|
||||
|
||||
<sect1 id="seg">
|
||||
<title>seg</title>
|
||||
@ -7,13 +8,15 @@
|
||||
</indexterm>
|
||||
|
||||
<para>
|
||||
The <literal>seg</literal> module contains the code for the user-defined
|
||||
type, <literal>SEG</literal>, representing laboratory measurements as
|
||||
floating point intervals.
|
||||
This module implements a data type <type>seg</> for
|
||||
representing line segments, or floating point intervals.
|
||||
<type>seg</> can represent uncertainty in the interval endpoints,
|
||||
making it especially useful for representing laboratory measurements.
|
||||
</para>
|
||||
|
||||
<sect2>
|
||||
<title>Rationale</title>
|
||||
|
||||
<para>
|
||||
The geometry of measurements is usually more complex than that of a
|
||||
point in a numeric continuum. A measurement is usually a segment of
|
||||
@ -22,26 +25,28 @@
|
||||
the value being measured may naturally be an interval indicating some
|
||||
condition, such as the temperature range of stability of a protein.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Using just common sense, it appears more convenient to store such data
|
||||
as intervals, rather than pairs of numbers. In practice, it even turns
|
||||
out more efficient in most applications.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Further along the line of common sense, the fuzziness of the limits
|
||||
suggests that the use of traditional numeric data types leads to a
|
||||
certain loss of information. Consider this: your instrument reads
|
||||
6.50, and you input this reading into the database. What do you get
|
||||
when you fetch it? Watch:
|
||||
</para>
|
||||
|
||||
<programlisting>
|
||||
test=> select 6.50 as "pH";
|
||||
test=> select 6.50 :: float8 as "pH";
|
||||
pH
|
||||
---
|
||||
6.5
|
||||
(1 row)
|
||||
</programlisting>
|
||||
<para>
|
||||
|
||||
In the world of measurements, 6.50 is not the same as 6.5. It may
|
||||
sometimes be critically different. The experimenters usually write
|
||||
down (and publish) the digits they trust. 6.50 is actually a fuzzy
|
||||
@ -50,234 +55,171 @@ test=> select 6.50 as "pH";
|
||||
share. We definitely do not want such different data items to appear the
|
||||
same.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Conclusion? It is nice to have a special data type that can record the
|
||||
limits of an interval with arbitrarily variable precision. Variable in
|
||||
a sense that each data element records its own precision.
|
||||
the sense that each data element records its own precision.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Check this out:
|
||||
</para>
|
||||
<programlisting>
|
||||
|
||||
<programlisting>
|
||||
test=> select '6.25 .. 6.50'::seg as "pH";
|
||||
pH
|
||||
------------
|
||||
6.25 .. 6.50
|
||||
(1 row)
|
||||
</programlisting>
|
||||
</programlisting>
|
||||
</para>
|
||||
</sect2>
|
||||
|
||||
<sect2>
|
||||
<title>Syntax</title>
|
||||
|
||||
<para>
|
||||
The external representation of an interval is formed using one or two
|
||||
floating point numbers joined by the range operator ('..' or '...').
|
||||
Optional certainty indicators (<, > and ~) are ignored by the internal
|
||||
logics, but are retained in the data.
|
||||
floating point numbers joined by the range operator (<literal>..</literal>
|
||||
or <literal>...</literal>). Alternatively, it can be specified as a
|
||||
center point plus or minus a deviation.
|
||||
Optional certainty indicators (<literal><</literal>,
|
||||
<literal>></literal> and <literal>~</literal>) can be stored as well.
|
||||
(Certainty indicators are ignored by all the built-in operators, however.)
|
||||
</para>
|
||||
|
||||
<table>
|
||||
<title>Rules</title>
|
||||
<tgroup cols="2">
|
||||
<tbody>
|
||||
<row>
|
||||
<entry>rule 1</entry>
|
||||
<entry>seg -> boundary PLUMIN deviation</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry>rule 2</entry>
|
||||
<entry>seg -> boundary RANGE boundary</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry>rule 3</entry>
|
||||
<entry>seg -> boundary RANGE</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry>rule 4</entry>
|
||||
<entry>seg -> RANGE boundary</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry>rule 5</entry>
|
||||
<entry>seg -> boundary</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry>rule 6</entry>
|
||||
<entry>boundary -> FLOAT</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry>rule 7</entry>
|
||||
<entry>boundary -> EXTENSION FLOAT</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry>rule 8</entry>
|
||||
<entry>deviation -> FLOAT</entry>
|
||||
</row>
|
||||
</tbody>
|
||||
</tgroup>
|
||||
</table>
|
||||
|
||||
<table>
|
||||
<title>Tokens</title>
|
||||
<tgroup cols="2">
|
||||
<tbody>
|
||||
<row>
|
||||
<entry>RANGE</entry>
|
||||
<entry>(\.\.)(\.)?</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry>PLUMIN</entry>
|
||||
<entry>\'\+\-\'</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry>integer</entry>
|
||||
<entry>[+-]?[0-9]+</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry>real</entry>
|
||||
<entry>[+-]?[0-9]+\.[0-9]+</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry>FLOAT</entry>
|
||||
<entry>({integer}|{real})([eE]{integer})?</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry>EXTENSION</entry>
|
||||
<entry>[<>~]</entry>
|
||||
</row>
|
||||
</tbody>
|
||||
</tgroup>
|
||||
</table>
|
||||
|
||||
<table>
|
||||
<title>Examples of valid <literal>SEG</literal> representations</title>
|
||||
<tgroup cols="2">
|
||||
<tbody>
|
||||
<row>
|
||||
<entry>Any number</entry>
|
||||
<entry>
|
||||
(rules 5,6) -- creates a zero-length segment (a point,
|
||||
if you will)
|
||||
</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry>~5.0</entry>
|
||||
<entry>
|
||||
(rules 5,7) -- creates a zero-length segment AND records
|
||||
'~' in the data. This notation reads 'approximately 5.0',
|
||||
but its meaning is not recognized by the code. It is ignored
|
||||
until you get the value back. View it is a short-hand comment.
|
||||
</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry><5.0</entry>
|
||||
<entry>
|
||||
(rules 5,7) -- creates a point at 5.0; '<' is ignored but
|
||||
is preserved as a comment
|
||||
</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry>>5.0</entry>
|
||||
<entry>
|
||||
(rules 5,7) -- creates a point at 5.0; '>' is ignored but
|
||||
is preserved as a comment
|
||||
</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry><para>5(+-)0.3</para><para>5'+-'0.3</para></entry>
|
||||
<entry>
|
||||
<para>
|
||||
(rules 1,8) -- creates an interval '4.7..5.3'. As of this
|
||||
writing (02/09/2000), this mechanism isn't completely accurate
|
||||
in determining the number of significant digits for the
|
||||
boundaries. For example, it adds an extra digit to the lower
|
||||
boundary if the resulting interval includes a power of ten:
|
||||
</para>
|
||||
<programlisting>
|
||||
postgres=> select '10(+-)1'::seg as seg;
|
||||
seg
|
||||
---------
|
||||
9.0 .. 11 -- should be: 9 .. 11
|
||||
</programlisting>
|
||||
<para>
|
||||
Also, the (+-) notation is not preserved: 'a(+-)b' will
|
||||
always be returned as '(a-b) .. (a+b)'. The purpose of this
|
||||
notation is to allow input from certain data sources without
|
||||
conversion.
|
||||
</para>
|
||||
</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry>50 .. </entry>
|
||||
<entry>(rule 3) -- everything that is greater than or equal to 50</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry>.. 0</entry>
|
||||
<entry>(rule 4) -- everything that is less than or equal to 0</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry>1.5e-2 .. 2E-2 </entry>
|
||||
<entry>(rule 2) -- creates an interval (0.015 .. 0.02)</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry>1 ... 2</entry>
|
||||
<entry>
|
||||
The same as 1...2, or 1 .. 2, or 1..2 (space is ignored).
|
||||
Because of the widespread use of '...' in the data sources,
|
||||
I decided to stick to is as a range operator. This, and
|
||||
also the fact that the white space around the range operator
|
||||
is ignored, creates a parsing conflict with numeric constants
|
||||
starting with a decimal point.
|
||||
</entry>
|
||||
</row>
|
||||
</tbody>
|
||||
</tgroup>
|
||||
</table>
|
||||
|
||||
<table>
|
||||
<title>Examples</title>
|
||||
<tgroup cols="2">
|
||||
<tbody>
|
||||
<row>
|
||||
<entry>.1e7</entry>
|
||||
<entry>should be: 0.1e7</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry>.1 .. .2</entry>
|
||||
<entry>should be: 0.1 .. 0.2</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry>2.4 E4</entry>
|
||||
<entry>should be: 2.4E4</entry>
|
||||
</row>
|
||||
</tbody>
|
||||
</tgroup>
|
||||
</table>
|
||||
<para>
|
||||
The following, although it is not a syntax error, is disallowed to improve
|
||||
the sanity of the data:
|
||||
In the following table, <replaceable>x</>, <replaceable>y</>, and
|
||||
<replaceable>delta</> denote
|
||||
floating-point numbers. <replaceable>x</> and <replaceable>y</>, but
|
||||
not <replaceable>delta</>, can be preceded by a certainty indicator:
|
||||
</para>
|
||||
|
||||
<table>
|
||||
<title></title>
|
||||
<title><type>seg</> external representations</title>
|
||||
<tgroup cols="2">
|
||||
<tbody>
|
||||
<row>
|
||||
<entry>5 .. 2</entry>
|
||||
<entry>should be: 2 .. 5</entry>
|
||||
<entry><literal><replaceable>x</></literal></entry>
|
||||
<entry>Single value (zero-length interval)
|
||||
</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry><literal><replaceable>x</> .. <replaceable>y</></literal></entry>
|
||||
<entry>Interval from <replaceable>x</> to <replaceable>y</>
|
||||
</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry><literal><replaceable>x</> (+-) <replaceable>delta</></literal></entry>
|
||||
<entry>Interval from <replaceable>x</> - <replaceable>delta</> to
|
||||
<replaceable>x</> + <replaceable>delta</>
|
||||
</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry><literal><replaceable>x</> ..</literal></entry>
|
||||
<entry>Open interval with lower bound <replaceable>x</>
|
||||
</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry><literal>.. <replaceable>x</></literal></entry>
|
||||
<entry>Open interval with upper bound <replaceable>x</>
|
||||
</entry>
|
||||
</row>
|
||||
</tbody>
|
||||
</tgroup>
|
||||
</table>
|
||||
|
||||
<table>
|
||||
<title>Examples of valid <type>seg</> input</title>
|
||||
<tgroup cols="2">
|
||||
<tbody>
|
||||
<row>
|
||||
<entry><literal>5.0</literal></entry>
|
||||
<entry>
|
||||
Creates a zero-length segment (a point, if you will)
|
||||
</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry><literal>~5.0</literal></entry>
|
||||
<entry>
|
||||
Creates a zero-length segment and records
|
||||
<literal>~</> in the data. <literal>~</literal> is ignored
|
||||
by <type>seg</> operations, but
|
||||
is preserved as a comment.
|
||||
</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry><literal><5.0</literal></entry>
|
||||
<entry>
|
||||
Creates a point at 5.0. <literal><</literal> is ignored but
|
||||
is preserved as a comment.
|
||||
</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry><literal>>5.0</literal></entry>
|
||||
<entry>
|
||||
Creates a point at 5.0. <literal>></literal> is ignored but
|
||||
is preserved as a comment.
|
||||
</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry><literal>5(+-)0.3</literal></entry>
|
||||
<entry>
|
||||
Creates an interval <literal>4.7 .. 5.3</literal>.
|
||||
Note that the <literal>(+-)</> notation isn't preserved.
|
||||
</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry><literal>50 .. </literal></entry>
|
||||
<entry>Everything that is greater than or equal to 50</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry><literal>.. 0</literal></entry>
|
||||
<entry>Everything that is less than or equal to 0</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry><literal>1.5e-2 .. 2E-2 </literal></entry>
|
||||
<entry>Creates an interval <literal>0.015 .. 0.02</literal></entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry><literal>1 ... 2</literal></entry>
|
||||
<entry>
|
||||
The same as <literal>1...2</literal>, or <literal>1 .. 2</literal>,
|
||||
or <literal>1..2</literal>
|
||||
(spaces around the range operator are ignored)
|
||||
</entry>
|
||||
</row>
|
||||
</tbody>
|
||||
</tgroup>
|
||||
</table>
|
||||
|
||||
<para>
|
||||
Because <literal>...</> is widely used in data sources, it is allowed
|
||||
as an alternative spelling of <literal>..</>. Unfortunately, this
|
||||
creates a parsing ambiguity: it is not clear whether the upper bound
|
||||
in <literal>0...23</> is meant to be <literal>23</> or <literal>0.23</>.
|
||||
This is resolved by requiring at least one digit before the decimal
|
||||
point in all numbers in <type>seg</> input.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
As a sanity check, <type>seg</> rejects intervals with the lower bound
|
||||
greater than the upper, for example <literal>5 .. 2</>.
|
||||
</para>
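<para>
The input forms described above can be tried directly with casts. This is
only a sketch using values from the tables above; the output is
deliberately not shown:
</para>

<programlisting>
SELECT '5.0'::seg;              -- single value (zero-length interval)
SELECT '~5.0'::seg;             -- certainty indicator is kept with the value
SELECT '5(+-)0.3'::seg;         -- center point plus or minus a deviation
SELECT '50 ..'::seg;            -- open interval with lower bound 50
SELECT '1.5e-2 .. 2E-2'::seg;   -- exponent notation in both bounds

-- according to the rules above, these two should be rejected:
SELECT '5 .. 2'::seg;           -- lower bound greater than upper bound
SELECT '.1 .. .2'::seg;         -- no digit before the decimal point
</programlisting>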
|
||||
|
||||
</sect2>
|
||||
|
||||
<sect2>
|
||||
<title>Precision</title>
|
||||
|
||||
<para>
|
||||
The segments are stored internally as pairs of 32-bit floating point
|
||||
numbers. It means that the numbers with more than 7 significant digits
|
||||
<type>seg</> values are stored internally as pairs of 32-bit floating point
|
||||
numbers. This means that numbers with more than 7 significant digits
|
||||
will be truncated.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The numbers with less than or exactly 7 significant digits retain their
|
||||
Numbers with 7 or fewer significant digits retain their
|
||||
original precision. That is, if your query returns 0.00, you will be
|
||||
sure that the trailing zeroes are not the artifacts of formatting: they
|
||||
reflect the precision of the original data. The number of leading
|
||||
@ -288,28 +230,20 @@ postgres=> select '10(+-)1'::seg as seg;
|
||||
|
||||
<sect2>
|
||||
<title>Usage</title>
|
||||
|
||||
<para>
|
||||
The access method for SEG is a GiST index (gist_seg_ops), which is a
|
||||
generalization of R-tree. GiSTs allow the postgres implementation of
|
||||
R-tree, originally encoded to support 2-D geometric types such as
|
||||
boxes and polygons, to be used with any data type whose data domain
|
||||
can be partitioned using the concepts of containment, intersection and
|
||||
equality. In other words, everything that can intersect or contain
|
||||
its own kind can be indexed with a GiST. That includes, among other
|
||||
things, all geometric data types, regardless of their dimensionality
|
||||
(see also contrib/cube).
|
||||
</para>
|
||||
<para>
|
||||
The operators supported by the GiST access method include:
|
||||
The <filename>seg</> module includes a GiST index operator class for
|
||||
<type>seg</> values.
|
||||
The operators supported by the GiST opclass include:
|
||||
</para>
|
||||
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
<programlisting>
|
||||
[a, b] << [c, d] Is left of
|
||||
</programlisting>
|
||||
<para>
|
||||
The left operand, [a, b], occurs entirely to the left of the
|
||||
right operand, [c, d], on the axis (-inf, inf). It means,
|
||||
[a, b] is entirely to the left of [c, d]. That is,
|
||||
[a, b] << [c, d] is true if b < c and false otherwise
|
||||
</para>
|
||||
</listitem>
|
||||
@ -318,8 +252,8 @@ postgres=> select '10(+-)1'::seg as seg;
|
||||
[a, b] >> [c, d] Is right of
|
||||
</programlisting>
|
||||
<para>
|
||||
[a, b] is occurs entirely to the right of [c, d].
|
||||
[a, b] >> [c, d] is true if a > d and false otherwise
|
||||
[a, b] is entirely to the right of [c, d]. That is,
|
||||
[a, b] >> [c, d] is true if a > d and false otherwise
|
||||
</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
@ -327,8 +261,8 @@ postgres=> select '10(+-)1'::seg as seg;
|
||||
[a, b] &< [c, d] Overlaps or is left of
|
||||
</programlisting>
|
||||
<para>
|
||||
This might be better read as "does not extend to right of".
|
||||
It is true when b <= d.
|
||||
This might be better read as <quote>does not extend to right of</quote>.
|
||||
It is true when b <= d.
|
||||
</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
@ -336,17 +270,16 @@ postgres=> select '10(+-)1'::seg as seg;
|
||||
[a, b] &> [c, d] Overlaps or is right of
|
||||
</programlisting>
|
||||
<para>
|
||||
This might be better read as "does not extend to left of".
|
||||
It is true when a >= c.
|
||||
This might be better read as <quote>does not extend to left of</quote>.
|
||||
It is true when a >= c.
|
||||
</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<programlisting>
|
||||
[a, b] = [c, d] Same as
|
||||
[a, b] = [c, d] Same as
|
||||
</programlisting>
|
||||
<para>
|
||||
The segments [a, b] and [c, d] are identical, that is, a == b
|
||||
and c == d
|
||||
The segments [a, b] and [c, d] are identical, that is, a = c and b = d
|
||||
</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
@ -354,28 +287,29 @@ postgres=> select '10(+-)1'::seg as seg;
|
||||
[a, b] && [c, d] Overlaps
|
||||
</programlisting>
|
||||
<para>
|
||||
The segments [a, b] and [c, d] overlap.
|
||||
The segments [a, b] and [c, d] overlap.
|
||||
</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<programlisting>
|
||||
[a, b] @> [c, d] Contains
|
||||
[a, b] @> [c, d] Contains
|
||||
</programlisting>
|
||||
<para>
|
||||
The segment [a, b] contains the segment [c, d], that is,
|
||||
a <= c and b >= d
|
||||
The segment [a, b] contains the segment [c, d], that is,
|
||||
a <= c and b >= d
|
||||
</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<programlisting>
|
||||
[a, b] <@ [c, d] Contained in
|
||||
[a, b] <@ [c, d] Contained in
|
||||
</programlisting>
|
||||
<para>
|
||||
The segment [a, b] is contained in [c, d], that is,
|
||||
a >= c and b <= d
|
||||
The segment [a, b] is contained in [c, d], that is,
|
||||
a >= c and b <= d
|
||||
</para>
|
||||
</listitem>
|
||||
</itemizedlist>
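<para>
As a sketch of how these operators might be used (the table and index
names are hypothetical, and the index definition assumes that the module's
operator class is the default GiST class for <type>seg</type>), the
comments give the results implied by the definitions above:
</para>

<programlisting>
SELECT '1 .. 3'::seg << '4 .. 5'::seg;   -- true, since 3 < 4
SELECT '1 .. 3'::seg && '2 .. 5'::seg;   -- true, the segments overlap
SELECT '1 .. 5'::seg @> '2 .. 3'::seg;   -- true, since 1 <= 2 and 5 >= 3
SELECT '2 .. 3'::seg <@ '1 .. 5'::seg;   -- true, since 2 >= 1 and 3 <= 5

CREATE TABLE reading (id int, ph seg);
CREATE INDEX reading_ph_idx ON reading USING gist (ph);

-- a search that the GiST index can support
SELECT id FROM reading WHERE ph && '6.25 .. 6.50'::seg;
</programlisting>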
|
||||
|
||||
<para>
|
||||
(Before PostgreSQL 8.2, the containment operators @> and <@ were
|
||||
respectively called @ and ~. These names are still available, but are
|
||||
@ -383,68 +317,70 @@ postgres=> select '10(+-)1'::seg as seg;
|
||||
are reversed from the convention formerly followed by the core geometric
|
||||
datatypes!)
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Although the mnemonics of the following operators is questionable, I
|
||||
preserved them to maintain visual consistency with other geometric
|
||||
data types defined in Postgres.
|
||||
</para>
|
||||
<para>
|
||||
Other operators:
|
||||
</para>
|
||||
The standard B-tree operators are also provided, for example
|
||||
|
||||
<programlisting>
|
||||
[a, b] < [c, d] Less than
|
||||
[a, b] > [c, d] Greater than
|
||||
</programlisting>
|
||||
<para>
|
||||
|
||||
These operators do not make a lot of sense for any practical
|
||||
purpose but sorting. These operators first compare (a) to (c),
|
||||
and if these are equal, compare (b) to (d). That accounts for
|
||||
and if these are equal, compare (b) to (d). That results in
|
||||
reasonably good sorting in most cases, which is useful if
|
||||
you want to use ORDER BY with this type
|
||||
you want to use ORDER BY with this type.
|
||||
</para>
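<para>
A small sketch of the sorting behavior described above, using literal
values rather than a real table:
</para>

<programlisting>
-- lower bounds are compared first, then upper bounds, so by the rule
-- above the rows should come back as 0 .. 5, 1 .. 2, 1 .. 3
SELECT s
FROM (VALUES ('1 .. 3'::seg), ('1 .. 2'::seg), ('0 .. 5'::seg)) AS t(s)
ORDER BY s;
</programlisting>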
|
||||
</sect2>
|
||||
|
||||
<sect2>
|
||||
<title>Notes</title>
|
||||
|
||||
<para>
|
||||
For examples of usage, see the regression test <filename>sql/seg.sql</>.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
There are a few other potentially useful functions defined in seg.c
|
||||
that vanished from the schema because I stopped using them. Some of
|
||||
these were meant to support type casting. Let me know if I was wrong:
|
||||
I will then add them back to the schema. I would also appreciate
|
||||
other ideas that would enhance the type and make it more useful.
|
||||
The mechanism that converts <literal>(+-)</> to regular ranges
|
||||
isn't completely accurate in determining the number of significant digits
|
||||
for the boundaries. For example, it adds an extra digit to the lower
|
||||
boundary if the resulting interval includes a power of ten:
|
||||
|
||||
<programlisting>
|
||||
postgres=> select '10(+-)1'::seg as seg;
|
||||
seg
|
||||
---------
|
||||
9.0 .. 11 -- should be: 9 .. 11
|
||||
</programlisting>
|
||||
</para>
|
||||
|
||||
<para>
|
||||
For examples of usage, see sql/seg.sql
|
||||
</para>
|
||||
<para>
|
||||
NOTE: The performance of an R-tree index can largely depend on the
|
||||
The performance of an R-tree index can largely depend on the initial
|
||||
order of input values. It may be very helpful to sort the input table
|
||||
on the SEG column (see the script sort-segments.pl for an example)
|
||||
on the <type>seg</> column; see the script <filename>sort-segments.pl</>
|
||||
for an example.
|
||||
</para>
|
||||
</sect2>
|
||||
|
||||
<sect2>
|
||||
<title>Credits</title>
|
||||
|
||||
<para>
|
||||
Original author: Gene Selkov, Jr. <email>selkovjr@mcs.anl.gov</email>,
|
||||
Mathematics and Computer Science Division, Argonne National Laboratory.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
My thanks are primarily to Prof. Joe Hellerstein
|
||||
(<ulink url="http://db.cs.berkeley.edu/~jmh/"></ulink>) for elucidating the
|
||||
gist of the GiST (<ulink url="http://gist.cs.berkeley.edu/"></ulink>). I am
|
||||
also grateful to all postgres developers, present and past, for enabling
|
||||
also grateful to all Postgres developers, present and past, for enabling
|
||||
me to create my own world and live undisturbed in it. And I would like
|
||||
to acknowledge my gratitude to Argonne Lab and to the U.S. Department of
|
||||
Energy for the years of faithful support of my database research.
|
||||
</para>
|
||||
<programlisting>
|
||||
Gene Selkov, Jr.
|
||||
Computational Scientist
|
||||
Mathematics and Computer Science Division
|
||||
Argonne National Laboratory
|
||||
9700 S Cass Ave.
|
||||
Building 221
|
||||
Argonne, IL 60439-4844
|
||||
</programlisting>
|
||||
<para>
|
||||
<email>selkovjr@mcs.anl.gov</email>
|
||||
</para>
|
||||
|
||||
</sect2>
|
||||
|
||||
</sect1>
|
||||
|
||||
|
@ -1,3 +1,4 @@
|
||||
<!-- $PostgreSQL: pgsql/doc/src/sgml/sslinfo.sgml,v 1.3 2007/12/06 04:12:10 tgl Exp $ -->
|
||||
|
||||
<sect1 id="sslinfo">
|
||||
<title>sslinfo</title>
|
||||
@ -7,105 +8,119 @@
|
||||
</indexterm>
|
||||
|
||||
<para>
|
||||
This modules provides information about current SSL certificate for PostgreSQL.
|
||||
The <filename>sslinfo</> module provides information about the SSL
|
||||
certificate that the current client provided when connecting to
|
||||
<productname>PostgreSQL</>. The module is useless (most functions
|
||||
will return NULL) if the current connection does not use SSL.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
This extension won't build at all unless the installation was
|
||||
configured with <literal>--with-openssl</>.
|
||||
</para>
|
||||
|
||||
<sect2>
|
||||
<title>Notes</title>
|
||||
<para>
|
||||
This extension won't build unless your PostgreSQL server is configured
|
||||
with --with-openssl. Information provided with these functions would
|
||||
be completely useless if you don't use SSL to connect to database.
|
||||
</para>
|
||||
</sect2>
|
||||
<title>Functions Provided</title>
|
||||
|
||||
<sect2>
|
||||
<title>Functions Description</title>
|
||||
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
<programlisting>
|
||||
ssl_is_used() RETURNS boolean;
|
||||
</programlisting>
|
||||
<variablelist>
|
||||
<varlistentry>
|
||||
<term><function>
|
||||
ssl_is_used() returns boolean
|
||||
</function></term>
|
||||
<listitem>
|
||||
<para>
|
||||
Returns TRUE, if current connection to server uses SSL and FALSE
|
||||
Returns TRUE if the current connection to the server uses SSL, and FALSE
|
||||
otherwise.
|
||||
</para>
|
||||
</listitem>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
|
||||
<listitem>
|
||||
<programlisting>
|
||||
ssl_client_cert_present() RETURNS boolean
|
||||
</programlisting>
|
||||
<varlistentry>
|
||||
<term><function>
|
||||
ssl_client_cert_present() returns boolean
|
||||
</function></term>
|
||||
<listitem>
|
||||
<para>
|
||||
Returns TRUE if current client have presented valid SSL client
|
||||
certificate to the server and FALSE otherwise (e.g., no SSL,
|
||||
certificate hadn't be requested by server).
|
||||
Returns TRUE if the current client has presented a valid SSL client
|
||||
certificate to the server, and FALSE otherwise. (The server
|
||||
might or might not be configured to require a client certificate.)
|
||||
</para>
|
||||
</listitem>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
|
||||
<listitem>
|
||||
<programlisting>
|
||||
ssl_client_serial() RETURNS numeric
|
||||
</programlisting>
|
||||
<varlistentry>
|
||||
<term><function>
|
||||
ssl_client_serial() returns numeric
|
||||
</function></term>
|
||||
<listitem>
|
||||
<para>
|
||||
Returns serial number of current client certificate. The combination
|
||||
of certificate serial number and certificate issuer is guaranteed to
|
||||
uniquely identify certificate (but not its owner -- the owner ought to
|
||||
regularily change his keys, and get new certificates from the issuer).
|
||||
Returns the serial number of the current client certificate. The combination of
|
||||
certificate serial number and certificate issuer is guaranteed to
|
||||
uniquely identify a certificate (but not its owner — the owner
|
||||
ought to regularly change his keys, and get new certificates from the
|
||||
issuer).
|
||||
</para>
|
||||
<para>
|
||||
So, if you run you own CA and allow only certificates from this CA to
|
||||
be accepted by server, the serial number is the most reliable (albeit
|
||||
not very mnemonic) means to indentify user.
|
||||
</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<programlisting>
|
||||
ssl_client_dn() RETURNS text
|
||||
</programlisting>
|
||||
<para>
|
||||
Returns the full subject of current client certificate, converting
|
||||
So, if you run your own CA and allow only certificates from this CA to
|
||||
be accepted by the server, the serial number is the most reliable (albeit
|
||||
not very mnemonic) means to identify a user.
|
||||
</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
|
||||
<varlistentry>
|
||||
<term><function>
|
||||
ssl_client_dn() returns text
|
||||
</function></term>
|
||||
<listitem>
|
||||
<para>
|
||||
Returns the full subject of the current client certificate, converting
|
||||
character data into the current database encoding. It is assumed that
|
||||
if you use non-Latin characters in the certificate names, your
|
||||
if you use non-ASCII characters in the certificate names, your
|
||||
database is able to represent these characters, too. If your database
|
||||
uses the SQL_ASCII encoding, non-Latin characters in the name will be
|
||||
uses the SQL_ASCII encoding, non-ASCII characters in the name will be
|
||||
represented as UTF-8 sequences.
|
||||
</para>
|
||||
<para>
|
||||
The result looks like '/CN=Somebody /C=Some country/O=Some organization'.
|
||||
</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<programlisting>
|
||||
ssl_issuer_dn()
|
||||
</programlisting>
|
||||
<para>
|
||||
Returns the full issuer name of the client certificate, converting
|
||||
character data into current database encoding.
|
||||
The result looks like <literal>/CN=Somebody /C=Some country/O=Some organization</>.
|
||||
</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
|
||||
<varlistentry>
|
||||
<term><function>
|
||||
ssl_issuer_dn() returns text
|
||||
</function></term>
|
||||
<listitem>
|
||||
<para>
|
||||
Returns the full issuer name of the current client certificate, converting
|
||||
character data into the current database encoding. Encoding conversions
|
||||
are handled the same as for <function>ssl_client_dn</>.
|
||||
</para>
|
||||
<para>
|
||||
The combination of the return value of this function with the
|
||||
certificate serial number uniquely identifies the certificate.
|
||||
</para>
|
||||
<para>
|
||||
The result of this function is really useful only if you have more
|
||||
than one trusted CA certificate in your server's root.crt file, or if
|
||||
this CA has issued some intermediate certificate authority
|
||||
certificates.
|
||||
This function is really useful only if you have more than one trusted CA
|
||||
certificate in your server's <filename>root.crt</> file, or if this CA
|
||||
has issued some intermediate certificate authority certificates.
|
||||
</para>
|
||||
</listitem>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
|
||||
<listitem>
|
||||
<programlisting>
|
||||
ssl_client_dn_field(fieldName text) RETURNS text
|
||||
</programlisting>
|
||||
<varlistentry>
|
||||
<term><function>
|
||||
ssl_client_dn_field(fieldname text) returns text
|
||||
</function></term>
|
||||
<listitem>
|
||||
<para>
|
||||
This function returns the value of the specified field in the
|
||||
certificate subject. Field names are string constants that are
|
||||
converted into ASN1 object identificators using the OpenSSL object
|
||||
certificate subject, or NULL if the field is not present.
|
||||
Field names are string constants that are
|
||||
converted into ASN1 object identifiers using the OpenSSL object
|
||||
database. The following values are acceptable:
|
||||
</para>
|
||||
<programlisting>
|
||||
@ -127,38 +142,46 @@ generationQualifier
|
||||
description
|
||||
dnQualifier
|
||||
x500UniqueIdentifier
|
||||
pseudonim
|
||||
pseudonym
|
||||
role
|
||||
emailAddress
|
||||
</programlisting>
|
||||
<para>
|
||||
All of these fields are optional, except commonName. It depends
|
||||
entirely on your CA policy which of them would be included and which
|
||||
wouldn't. The meaning of these fields, howeer, is strictly defined by
|
||||
All of these fields are optional, except <structfield>commonName</>.
|
||||
It depends
|
||||
entirely on your CA's policy which of them would be included and which
|
||||
wouldn't. The meaning of these fields, however, is strictly defined by
|
||||
the X.500 and X.509 standards, so you cannot just assign arbitrary
|
||||
meaning to them.
|
||||
</para>
|
||||
</listitem>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
|
||||
<listitem>
|
||||
<programlisting>
|
||||
ssl_issuer_field(fieldName text) RETURNS text;
|
||||
</programlisting>
|
||||
<varlistentry>
|
||||
<term><function>
|
||||
ssl_issuer_field(fieldname text) returns text
|
||||
</function></term>
|
||||
<listitem>
|
||||
<para>
|
||||
Does same as ssl_client_dn_field, but for the certificate issuer
|
||||
Same as <function>ssl_client_dn_field</>, but for the certificate issuer
|
||||
rather than the certificate subject.
|
||||
</para>
|
||||
</listitem>
|
||||
</itemizedlist>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
</variablelist>
|
||||
</sect2>
|
||||
|
||||
<sect2>
|
||||
<title>Author</title>
|
||||
|
||||
<para>
|
||||
Victor Wagner <email>vitus@cryptocom.ru</email>, Cryptocom LTD
|
||||
</para>
|
||||
|
||||
<para>
|
||||
E-Mail of Cryptocom OpenSSL development group:
|
||||
<email>openssl@cryptocom.ru</email>
|
||||
</para>
|
||||
</sect2>
|
||||
</sect1>
|
||||
|
||||
</sect1>
|
||||
|
File diff suppressed because it is too large
@ -1,4 +1,4 @@
|
||||
<!-- $PostgreSQL: pgsql/doc/src/sgml/test-parser.sgml,v 1.1 2007/12/03 04:18:47 tgl Exp $ -->
|
||||
<!-- $PostgreSQL: pgsql/doc/src/sgml/test-parser.sgml,v 1.2 2007/12/06 04:12:10 tgl Exp $ -->
|
||||
|
||||
<sect1 id="test-parser">
|
||||
<title>test_parser</title>
|
||||
@ -8,11 +8,14 @@
|
||||
</indexterm>
|
||||
|
||||
<para>
|
||||
This is an example of a custom parser for full text search.
|
||||
<filename>test_parser</> is an example of a custom parser for full-text
|
||||
search. It doesn't do anything especially useful, but can serve as
|
||||
a starting point for developing your own parser.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
It recognizes space-delimited words and returns just two token types:
|
||||
<filename>test_parser</> recognizes words separated by white space,
|
||||
and returns just two token types:
|
||||
|
||||
<programlisting>
|
||||
mydb=# SELECT * FROM ts_token_type('testparser');
|
||||
|
@ -1,3 +1,5 @@
|
||||
<!-- $PostgreSQL: pgsql/doc/src/sgml/tsearch2.sgml,v 1.2 2007/12/06 04:12:10 tgl Exp $ -->
|
||||
|
||||
<sect1 id="tsearch2">
|
||||
<title>tsearch2</title>
|
||||
|
||||
|
@ -1,3 +1,4 @@
|
||||
<!-- $PostgreSQL: pgsql/doc/src/sgml/uuid-ossp.sgml,v 1.2 2007/12/06 04:12:10 tgl Exp $ -->
|
||||
|
||||
<sect1 id="uuid-ossp">
|
||||
<title>uuid-ossp</title>
|
||||
@ -7,13 +8,19 @@
|
||||
</indexterm>
|
||||
|
||||
<para>
|
||||
This module provides functions to generate universally unique
|
||||
identifiers (UUIDs) using one of the several standard algorithms, as
|
||||
well as functions to produce certain special UUID constants.
|
||||
The <filename>uuid-ossp</> module provides functions to generate universally
|
||||
unique identifiers (UUIDs) using one of several standard algorithms. There
|
||||
are also functions to produce certain special UUID constants.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
This module depends on the OSSP UUID library, which can be found at
|
||||
<ulink url="http://www.ossp.org/pkg/lib/uuid/"></ulink>.
|
||||
</para>
|
||||
|
||||
<sect2>
|
||||
<title>UUID Generation</title>
|
||||
<title><literal>uuid-ossp</literal> Functions</title>
|
||||
|
||||
<para>
|
||||
The relevant standards ITU-T Rec. X.667, ISO/IEC 9834-8:2005, and RFC
|
||||
4122 specify four algorithms for generating UUIDs, identified by the
|
||||
@ -23,7 +30,7 @@
|
||||
</para>
|
||||
|
||||
<table>
|
||||
<title><literal>uuid-ossp</literal> functions</title>
|
||||
<title>Functions for UUID Generation</title>
|
||||
<tgroup cols="2">
|
||||
<thead>
|
||||
<row>
|
||||
@ -59,22 +66,9 @@
|
||||
<para>
|
||||
This function generates a version 3 UUID in the given namespace using
|
||||
the specified input name. The namespace should be one of the special
|
||||
constants produced by the uuid_ns_*() functions shown below. (It
|
||||
could be any UUID in theory.) The name is an identifier in the
|
||||
selected namespace. For example:
|
||||
</para>
|
||||
</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry><literal>uuid_generate_v3(uuid_ns_url(), 'http://www.postgresql.org')</literal></entry>
|
||||
<entry>
|
||||
<para>
|
||||
The name parameter will be MD5-hashed, so the cleartext cannot be
|
||||
derived from the generated UUID.
|
||||
</para>
|
||||
<para>
|
||||
The generation of UUIDs by this method has no random or
|
||||
environment-dependent element and is therefore reproducible.
|
||||
constants produced by the <function>uuid_ns_*()</> functions shown
|
||||
below. (It could be any UUID in theory.) The name is an identifier
|
||||
in the selected namespace.
|
||||
</para>
|
||||
</entry>
|
||||
</row>
|
||||
@ -102,15 +96,28 @@
|
||||
</tgroup>
|
||||
</table>
|
||||
|
||||
<para>
|
||||
For example:
|
||||
|
||||
<programlisting>
|
||||
SELECT uuid_generate_v3(uuid_ns_url(), 'http://www.postgresql.org');
|
||||
</programlisting>
|
||||
|
||||
The name parameter will be MD5-hashed, so the cleartext cannot be
|
||||
derived from the generated UUID.
|
||||
The generation of UUIDs by this method has no random or
|
||||
environment-dependent element and is therefore reproducible.
|
||||
</para>
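The reproducibility described above can be checked outside the database. The following is a minimal sketch in Python, assuming that the standard library's `uuid.uuid3` and `uuid.NAMESPACE_URL` correspond to `uuid_generate_v3` and `uuid_ns_url()`; it illustrates the RFC 4122 version 3 scheme and is not the module's own code.

```python
import uuid

# RFC 4122 version 3: the UUID is derived from an MD5 hash of the
# namespace UUID plus the name, so no random input is involved.
NAME = "http://www.postgresql.org"

first = uuid.uuid3(uuid.NAMESPACE_URL, NAME)
second = uuid.uuid3(uuid.NAMESPACE_URL, NAME)

print(first.version)    # version field of the generated UUID
print(first == second)  # same namespace and name give the same UUID
```

Changing either the namespace or the name yields a different UUID, which is why the namespace constants matter.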
|
||||
|
||||
<table>
|
||||
<title>UUID Constants</title>
|
||||
<title>Functions Returning UUID Constants</title>
|
||||
<tgroup cols="2">
|
||||
<tbody>
|
||||
<row>
|
||||
<entry><literal>uuid_nil()</literal></entry>
|
||||
<entry>
|
||||
<para>
|
||||
A "nil" UUID constant, which does not occur as a real UUID.
|
||||
A <quote>nil</> UUID constant, which does not occur as a real UUID.
|
||||
</para>
|
||||
</entry>
|
||||
</row>
|
||||
@ -135,8 +142,8 @@
|
||||
<entry>
|
||||
<para>
|
||||
Constant designating the ISO object identifier (OID) namespace for
|
||||
UUIDs. (This pertains to ASN.1 OIDs, unrelated to the OIDs used in
|
||||
PostgreSQL.)
|
||||
UUIDs. (This pertains to ASN.1 OIDs, which are unrelated to the OIDs
|
||||
used in <productname>PostgreSQL</>.)
|
||||
</para>
|
||||
</entry>
|
||||
</row>
|
||||
@ -153,11 +160,14 @@
|
||||
</tgroup>
|
||||
</table>
|
||||
</sect2>
|
||||
|
||||
<sect2>
|
||||
<title>Author</title>
|
||||
|
||||
<para>
|
||||
Peter Eisentraut <email>peter_e@gmx.net</email>
|
||||
</para>
|
||||
</sect2>
|
||||
</sect1>
|
||||
|
||||
</sect2>
|
||||
|
||||
</sect1>
|
||||
|
@ -1,3 +1,5 @@
|
||||
<!-- $PostgreSQL: pgsql/doc/src/sgml/vacuumlo.sgml,v 1.2 2007/12/06 04:12:10 tgl Exp $ -->
|
||||
|
||||
<sect1 id="vacuumlo">
|
||||
<title>vacuumlo</title>
|
||||
|
||||
@ -6,69 +8,103 @@
|
||||
</indexterm>
|
||||
|
||||
<para>
|
||||
This is a simple utility that will remove any orphaned large objects out of a
|
||||
PostgreSQL database. An orphaned LO is considered to be any LO whose OID
|
||||
does not appear in any OID data column of the database.
|
||||
<application>vacuumlo</> is a simple utility program that will remove any
|
||||
<quote>orphaned</> large objects from a
|
||||
<productname>PostgreSQL</> database. An orphaned large object (LO) is
|
||||
considered to be any LO whose OID does not appear in any <type>oid</> or
|
||||
<type>lo</> data column of the database.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
If you use this, you may also be interested in the lo_manage trigger in
|
||||
contrib/lo. lo_manage is useful to try to avoid creating orphaned LOs
|
||||
in the first place.
|
||||
</para>
|
||||
<para>
|
||||
<note>
|
||||
<para>
|
||||
It was decided to place this in contrib as it needs further testing, but hopefully,
|
||||
this (or a variant of it) would make it into the backend as a "vacuum lo"
|
||||
command in a later release.
|
||||
</para>
|
||||
</note>
|
||||
If you use this, you may also be interested in the <function>lo_manage</>
|
||||
trigger in <filename>contrib/lo</> (see <xref linkend="lo">).
|
||||
<function>lo_manage</> is useful to try
|
||||
to avoid creating orphaned LOs in the first place.
|
||||
</para>
|
||||
|
||||
<sect2>
|
||||
<title>Usage</title>
|
||||
<programlisting>
|
||||
vacuumlo [options] database [database2 ... databasen]
|
||||
</programlisting>
|
||||
|
||||
<synopsis>
|
||||
vacuumlo [options] database [database2 ... databaseN]
|
||||
</synopsis>
|
||||
|
||||
<para>
|
||||
All databases named on the command line are processed. Available options
|
||||
include:
|
||||
</para>
|
||||
<programlisting>
|
||||
-v Write a lot of progress messages
|
||||
-n Don't remove large objects, just show what would be done
|
||||
-U username Username to connect as
|
||||
-W Prompt for password
|
||||
-h hostname Database server host
|
||||
-p port Database server port
|
||||
</programlisting>
|
||||
|
||||
<variablelist>
|
||||
<varlistentry>
|
||||
<term><option>-v</option></term>
|
||||
<listitem>
|
||||
<para>Write a lot of progress messages</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
|
||||
<varlistentry>
|
||||
<term><option>-n</option></term>
|
||||
<listitem>
|
||||
<para>Don't remove anything, just show what would be done</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
|
||||
<varlistentry>
|
||||
<term><option>-U</option> <replaceable>username</></term>
|
||||
<listitem>
|
||||
<para>Username to connect as</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
|
||||
<varlistentry>
|
||||
<term><option>-W</option></term>
|
||||
<listitem>
|
||||
<para>Force prompt for password (generally useless)</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
|
||||
<varlistentry>
|
||||
<term><option>-h</option> <replaceable>hostname</></term>
|
||||
<listitem>
|
||||
<para>Database server's host</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
|
||||
<varlistentry>
|
||||
<term><option>-p</option> <replaceable>port</></term>
|
||||
<listitem>
|
||||
<para>Database server's port</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
</variablelist>
|
||||
</sect2>
|
||||
|
||||
<sect2>
|
||||
<title>Method</title>
|
||||
|
||||
<para>
|
||||
First, it builds a temporary table which contains all of the OIDs of the
|
||||
large objects in that database.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
It then scans through all columns in the database that are of type "oid"
|
||||
or "lo", and removes matching entries from the temporary table.
|
||||
It then scans through all columns in the database that are of type
|
||||
<type>oid</> or <type>lo</>, and removes matching entries from the
|
||||
temporary table.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The remaining entries in the temp table identify orphaned LOs. These are
|
||||
removed.
|
||||
The remaining entries in the temp table identify orphaned LOs.
|
||||
These are removed.
|
||||
</para>
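The three steps above amount to a set difference. A minimal sketch in Python follows, with made-up OIDs and an illustrative helper named `find_orphans`; the real utility does this work with SQL against the server, so this only models the logic.

```python
def find_orphans(all_lo_oids, referencing_columns):
    """Return the OIDs of large objects that no oid/lo column refers to."""
    # Step 1: start with every large object OID (the "temporary table").
    candidates = set(all_lo_oids)
    # Step 2: remove each OID that appears in a column of type oid or lo.
    for column_values in referencing_columns:
        candidates.difference_update(column_values)
    # Step 3: whatever remains is orphaned and would be removed.
    return candidates


# Hypothetical data: four large objects and two referencing columns.
large_objects = [16401, 16402, 16403, 16404]
columns = [[16401, None], [16403]]
orphans = find_orphans(large_objects, columns)
print(sorted(orphans))
```

Note that any OID value in any such column protects the large object, whether or not the column was meant to reference large objects.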
|
||||
</sect2>
|
||||
|
||||
<sect2>
|
||||
<title>Author</title>
|
||||
|
||||
<para>
|
||||
Peter Mount <email>peter@retep.org.uk</email>
|
||||
</para>
|
||||
<para>
|
||||
<ulink url="http://www.retep.org.uk"></ulink>
|
||||
</para>
|
||||
</sect2>
|
||||
|
||||
</sect1>
|
||||
|
||||
|
@ -1,31 +1,41 @@
|
||||
<!-- $PostgreSQL: pgsql/doc/src/sgml/xml2.sgml,v 1.4 2007/12/06 04:12:10 tgl Exp $ -->
|
||||
|
||||
<sect1 id="xml2">
|
||||
<title>xml2: XML-handling functions</title>
|
||||
<title>xml2</title>
|
||||
|
||||
<indexterm zone="xml2">
|
||||
<primary>xml2</primary>
|
||||
</indexterm>
|
||||
|
||||
<para>
|
||||
The <filename>xml2</> module provides XPath querying and
|
||||
XSLT functionality.
|
||||
</para>
|
||||
|
||||
<sect2>
|
||||
<title>Deprecation notice</title>
|
||||
|
||||
<para>
|
||||
From PostgreSQL 8.3 on, there is XML-related
|
||||
functionality based on the SQL/XML standard in the core server.
|
||||
That functionality covers XML syntax checking and XPath queries,
|
||||
which is what this module does as well, and more, but the API is
|
||||
not at all compatible. It is planned that this module will be
|
||||
removed in PostgreSQL 8.4 in favor of the newer standard API, so
|
||||
you are encouraged to try converting your applications. If you
|
||||
find that some of the functionality of this module is not
|
||||
available in an adequate form with the newer API, please explain
|
||||
your issue to pgsql-hackers@postgresql.org so that the deficiency
|
||||
can be addressed.
|
||||
From <productname>PostgreSQL</> 8.3 on, there is XML-related
|
||||
functionality based on the SQL/XML standard in the core server.
|
||||
That functionality covers XML syntax checking and XPath queries,
|
||||
which is what this module does, and more, but the API is
|
||||
not at all compatible. It is planned that this module will be
|
||||
removed in PostgreSQL 8.4 in favor of the newer standard API, so
|
||||
you are encouraged to try converting your applications. If you
|
||||
find that some of the functionality of this module is not
|
||||
available in an adequate form with the newer API, please explain
|
||||
your issue to pgsql-hackers@postgresql.org so that the deficiency
|
||||
can be addressed.
|
||||
</para>
|
||||
</sect2>
|
||||
|
||||
<sect2>
|
||||
<title>Description of functions</title>
|
||||
|
||||
<para>
|
||||
The first set of functions are straightforward XML parsing and XPath queries:
|
||||
These functions provide straightforward XML parsing and XPath queries.
|
||||
All arguments are of type <type>text</>, so for brevity that is not shown.
|
||||
</para>
|
||||
|
||||
<table>
|
||||
@ -34,27 +44,27 @@
|
||||
<tbody>
|
||||
<row>
|
||||
<entry>
|
||||
<programlisting>
|
||||
xml_is_well_formed(document) RETURNS bool
|
||||
</programlisting>
|
||||
<synopsis>
|
||||
xml_is_well_formed(document) returns bool
|
||||
</synopsis>
|
||||
</entry>
|
||||
<entry>
|
||||
<para>
|
||||
This parses the document text in its parameter and returns true if the
|
||||
document is well-formed XML. (Note: before PostgreSQL 8.2, this function
|
||||
was called xml_valid(). That is the wrong name since validity and
|
||||
well-formedness have different meanings in XML. The old name is still
|
||||
available, but is deprecated and will be removed in 8.3.)
|
||||
document is well-formed XML. (Note: before PostgreSQL 8.2, this
|
||||
function was called <function>xml_valid()</>. That is the wrong name
|
||||
since validity and well-formedness have different meanings in XML.
|
||||
The old name is still available, but is deprecated.)
|
||||
</para>
|
||||
</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry>
|
||||
<programlisting>
|
||||
xpath_string(document,query) RETURNS text
|
||||
xpath_number(document,query) RETURNS float4
|
||||
xpath_bool(document,query) RETURNS bool
|
||||
</programlisting>
|
||||
<synopsis>
|
||||
xpath_string(document,query) returns text
|
||||
xpath_number(document,query) returns float4
|
||||
xpath_bool(document,query) returns bool
|
||||
</synopsis>
|
||||
</entry>
|
||||
<entry>
|
||||
<para>
|
||||
@ -65,9 +75,9 @@
|
||||
</row>
|
||||
<row>
|
||||
<entry>
|
||||
<programlisting>
|
||||
xpath_nodeset(document,query,toptag,itemtag) RETURNS text
|
||||
</programlisting>
|
||||
<synopsis>
|
||||
xpath_nodeset(document,query,toptag,itemtag) returns text
|
||||
</synopsis>
|
||||
</entry>
|
||||
<entry>
|
||||
<para>
|
||||
@ -75,10 +85,10 @@
|
||||
the result is multivalued, the output will look like:
|
||||
</para>
|
||||
<literal>
|
||||
<toptag>
|
||||
<itemtag>Value 1 which could be an XML fragment</itemtag>
|
||||
<itemtag>Value 2....</itemtag>
|
||||
</toptag>
|
||||
<toptag>
|
||||
<itemtag>Value 1 which could be an XML fragment</itemtag>
|
||||
<itemtag>Value 2....</itemtag>
|
||||
</toptag>
|
||||
</literal>
|
||||
<para>
|
||||
If either toptag or itemtag is an empty string, the relevant tag is omitted.
|
||||
@ -87,49 +97,51 @@
|
||||
</row>
|
||||
<row>
|
||||
<entry>
|
||||
<programlisting>
|
||||
xpath_nodeset(document,query) RETURNS
|
||||
</programlisting>
|
||||
<synopsis>
|
||||
xpath_nodeset(document,query) returns text
|
||||
</synopsis>
|
||||
</entry>
|
||||
<entry>
|
||||
<para>
|
||||
Like xpath_nodeset(document,query,toptag,itemtag) but text omits both tags.
|
||||
Like xpath_nodeset(document,query,toptag,itemtag) but result omits both tags.
|
||||
</para>
|
||||
</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry>
|
||||
<programlisting>
|
||||
xpath_nodeset(document,query,itemtag) RETURNS
|
||||
</programlisting>
|
||||
<synopsis>
|
||||
xpath_nodeset(document,query,itemtag) returns text
|
||||
</synopsis>
|
||||
</entry>
|
||||
<entry>
|
||||
<para>
|
||||
Like xpath_nodeset(document,query,toptag,itemtag) but text omits toptag.
|
||||
Like xpath_nodeset(document,query,toptag,itemtag) but result omits toptag.
|
||||
</para>
|
||||
</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry>
|
||||
<programlisting>
|
||||
xpath_list(document,query,seperator) RETURNS text
|
||||
</programlisting>
|
||||
<synopsis>
|
||||
xpath_list(document,query,separator) returns text
|
||||
</synopsis>
|
||||
</entry>
|
||||
<entry>
|
||||
<para>
|
||||
This function returns multiple values seperated by the specified
|
||||
seperator, e.g. Value 1,Value 2,Value 3 if seperator=','.
|
||||
This function returns multiple values separated by the specified
|
||||
separator, for example <literal>Value 1,Value 2,Value 3</> if
|
||||
separator is <literal>,</>.
|
||||
</para>
|
||||
</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry>
|
||||
<programlisting>
|
||||
xpath_list(document,query) RETURNS text
|
||||
</programlisting>
|
||||
<synopsis>
|
||||
xpath_list(document,query) returns text
|
||||
</synopsis>
|
||||
</entry>
|
||||
<entry>
|
||||
This is a wrapper for the above function that uses ',' as the seperator.
|
||||
This is a wrapper for the above function that uses <literal>,</>
|
||||
as the separator.
|
||||
</entry>
|
||||
</row>
|
||||
</tbody>
|
||||
@ -137,38 +149,37 @@
|
||||
</table>
|
||||
</sect2>
|
||||
|
||||
|
||||
<sect2>
|
||||
<title><literal>xpath_table</literal></title>
|
||||
|
||||
<synopsis>
|
||||
xpath_table(text key, text document, text relation, text xpaths, text criteria) returns setof record
|
||||
</synopsis>
|
||||
|
||||
<para>
|
||||
This is a table function which evaluates a set of XPath queries on
|
||||
each of a set of documents and returns the results as a table. The
|
||||
primary key field from the original document table is returned as the
|
||||
first column of the result so that the resultset from xpath_table can
|
||||
be readily used in joins.
|
||||
<function>xpath_table</> is a table function that evaluates a set of XPath
|
||||
queries on each of a set of documents and returns the results as a
|
||||
table. The primary key field from the original document table is returned
|
||||
as the first column of the result so that the result set
|
||||
can readily be used in joins.
|
||||
</para>
|
||||
<para>
|
||||
The function itself takes 5 arguments, all text.
|
||||
</para>
|
||||
<programlisting>
|
||||
xpath_table(key,document,relation,xpaths,criteria)
|
||||
</programlisting>
|
||||
|
||||
<table>
|
||||
<title>Parameters</title>
|
||||
<tgroup cols="2">
|
||||
<tbody>
|
||||
<row>
|
||||
<entry><literal>key</literal></entry>
|
||||
<entry><parameter>key</parameter></entry>
|
||||
<entry>
|
||||
<para>
|
||||
the name of the "key" field - this is just a field to be used as
|
||||
the first column of the output table i.e. it identifies the record from
|
||||
which each output row came (see note below about multiple values).
|
||||
the name of the <quote>key</> field — this is just a field to be used as
|
||||
the first column of the output table, i.e. it identifies the record from
|
||||
which each output row came (see note below about multiple values)
|
||||
</para>
|
||||
</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry><literal>document</literal></entry>
|
||||
<entry><parameter>document</parameter></entry>
|
||||
<entry>
|
||||
<para>
|
||||
the name of the field containing the XML document
|
||||
@ -176,7 +187,7 @@
|
||||
</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry><literal>relation</literal></entry>
|
||||
<entry><parameter>relation</parameter></entry>
|
||||
<entry>
|
||||
<para>
|
||||
the name of the table or view containing the documents
|
||||
@ -184,20 +195,20 @@
|
||||
</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry><literal>xpaths</literal></entry>
|
||||
<entry><parameter>xpaths</parameter></entry>
|
||||
<entry>
|
||||
<para>
|
||||
multiple xpath expressions separated by <literal>|</literal>
|
||||
one or more XPath expressions, separated by <literal>|</literal>
|
||||
</para>
|
||||
</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry><literal>criteria</literal></entry>
|
||||
<entry><parameter>criteria</parameter></entry>
|
||||
<entry>
|
||||
<para>
|
||||
The contents of the where clause. This needs to be specified,
|
||||
so use "true" or "1=1" here if you want to process all the rows in the
|
||||
relation.
|
||||
the contents of the WHERE clause. This cannot be omitted, so use
|
||||
<literal>true</literal> or <literal>1=1</literal> if you want to
|
||||
process all the rows in the relation
|
||||
</para>
|
||||
</entry>
|
||||
</row>
|
||||
@ -206,19 +217,19 @@
|
||||
</table>
|
||||
|
||||
<para>
|
||||
NB These parameters (except the XPath strings) are just substituted
|
||||
into a plain SQL SELECT statement, so you have some flexibility - the
|
||||
These parameters (except the XPath strings) are just substituted
|
||||
into a plain SQL SELECT statement, so you have some flexibility — the
|
||||
statement is
|
||||
</para>
|
||||
|
||||
<para>
|
||||
<literal>
|
||||
SELECT <key>,<document> FROM <relation> WHERE <criteria>
|
||||
SELECT <key>, <document> FROM <relation> WHERE <criteria>
|
||||
</literal>
|
||||
</para>
|
||||
|
||||
<para>
|
||||
so those parameters can be *anything* valid in those particular
|
||||
so those parameters can be <emphasis>anything</> valid in those particular
|
||||
locations. The result from this SELECT needs to return exactly two
|
||||
columns (which it will unless you try to list multiple fields for key
|
||||
or document). Beware that this simplistic approach requires that you
|
||||
@ -226,43 +237,43 @@
|
||||
</para>
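The substitution can be pictured as plain string formatting. This is a sketch in Python with a hypothetical helper, `build_source_query`, which is not part of the module; it only shows how the four parameters land in the statement quoted above.

```python
def build_source_query(key, document, relation, criteria):
    """Assemble the SELECT that supplies (key, document) pairs."""
    # Plain text substitution, as the text describes: each argument is
    # placed verbatim into its position in the statement.
    return f"SELECT {key}, {document} FROM {relation} WHERE {criteria}"


query = build_source_query("article_id", "article_xml", "articles", "true")
print(query)
```

Because the arguments are inserted verbatim, an expression or a view name works just as well as a simple column or table name.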
|
||||
|
||||
<para>
|
||||
Using the function
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The function has to be used in a FROM expression. This gives the following
|
||||
form:
|
||||
The function has to be used in a <literal>FROM</> expression, with an
|
||||
<literal>AS</> clause to specify the output columns; for example
|
||||
</para>
|
||||
|
||||
<programlisting>
|
||||
SELECT * FROM
|
||||
xpath_table('article_id',
|
||||
'article_xml',
|
||||
'articles',
|
||||
'/article/author|/article/pages|/article/title',
|
||||
'date_entered > ''2003-01-01'' ')
|
||||
'article_xml',
|
||||
'articles',
|
||||
'/article/author|/article/pages|/article/title',
|
||||
'date_entered > ''2003-01-01'' ')
|
||||
AS t(article_id integer, author text, page_count integer, title text);
|
||||
</programlisting>
|
||||
|
||||
<para>
|
||||
The AS clause defines the names and types of the columns in the
|
||||
virtual table. If there are more XPath queries than result columns,
|
||||
The <literal>AS</> clause defines the names and types of the columns in the
|
||||
output table. The first is the <quote>key</> field and the rest correspond
|
||||
to the XPath queries.
|
||||
If there are more XPath queries than result columns,
|
||||
the extra queries will be ignored. If there are more result columns
|
||||
than XPath queries, the extra columns will be NULL.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Note that I've said in this example that pages is an integer. The
|
||||
function deals internally with string representations, so when you say
|
||||
you want an integer in the output, it will take the string
|
||||
representation of the XPath result and use PostgreSQL input functions
|
||||
to transform it into an integer (or whatever type the AS clause
|
||||
requests). An error will result if it can't do this - for example if
|
||||
the result is empty - so you may wish to just stick to 'text' as the
|
||||
column type if you think your data has any problems.
|
||||
Notice that this example defines the <structname>page_count</> result
|
||||
column as an integer. The function deals internally with string
|
||||
representations, so when you say you want an integer in the output, it will
|
||||
take the string representation of the XPath result and use PostgreSQL input
|
||||
functions to transform it into an integer (or whatever type the <type>AS</>
|
||||
clause requests). An error will result if it can't do this — for
|
||||
example if the result is empty — so you may wish to just stick to
|
||||
<type>text</> as the column type if you think your data has any problems.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The select statement doesn't need to use * alone - it can reference the
|
||||
The calling <command>SELECT</> statement doesn't necessarily have to
|
||||
be just <literal>SELECT *</> — it can reference the output
|
||||
columns by name or join them to other tables. The function produces a
|
||||
virtual table with which you can perform any operation you wish (e.g.
|
||||
aggregation, joining, sorting etc). So we could also have:
|
||||
@ -270,10 +281,10 @@ AS t(article_id integer, author text, page_count integer, title text);
|
||||
|
||||
<programlisting>
|
||||
SELECT t.title, p.fullname, p.email
|
||||
FROM xpath_table('article_id','article_xml','articles',
|
||||
'/article/title|/article/author/@id',
|
||||
'xpath_string(article_xml,''/article/@date'') > ''2003-03-20'' ')
|
||||
AS t(article_id integer, title text, author_id integer),
|
||||
FROM xpath_table('article_id', 'article_xml', 'articles',
|
||||
'/article/title|/article/author/@id',
|
||||
'xpath_string(article_xml,''/article/@date'') > ''2003-03-20'' ')
|
||||
AS t(article_id integer, title text, author_id integer),
|
||||
tblPeopleInfo AS p
|
||||
WHERE t.author_id = p.person_id;
|
||||
</programlisting>
|
||||
@ -282,91 +293,74 @@ WHERE t.author_id = p.person_id;
|
||||
as a more complicated example. Of course, you could wrap all
|
||||
of this in a view for convenience.
|
||||
</para>
|
||||
|
||||
<sect3>
|
||||
<title>Multivalued results</title>
|
||||
|
||||
<para>
|
||||
The xpath_table function assumes that the results of each XPath query
|
||||
The <function>xpath_table</> function assumes that the results of each XPath query
|
||||
might be multi-valued, so the number of rows returned by the function
|
||||
may not be the same as the number of input documents. The first row
|
||||
returned contains the first result from each query, the second row the
|
||||
second result from each query. If one of the queries has fewer values
|
||||
than the others, NULLs will be returned instead.
|
||||
</para>
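The row-by-row pairing described above behaves like a "zip" that pads the shorter lists. A minimal sketch in Python, using illustrative values and a hypothetical helper `rows_for_document`; it models the described behaviour, not the module's implementation.

```python
from itertools import zip_longest


def rows_for_document(key, results_per_query):
    """Pair up the n-th result of every query into the n-th output row."""
    # Queries with fewer results are padded with None (NULL in SQL).
    # The key value is repeated on every row.
    return [(key, *values) for values in zip_longest(*results_per_query)]


# Results of five XPath queries against one document: the first query
# is single-valued, the other four return two values each.
results = [["C1"], ["L1", "L2"], ["1", "11"], ["2", "22"], ["3", "33"]]
rows = rows_for_document(1, results)
for row in rows:
    print(row)
```

The single-valued result appears only in the first row, which is the situation the next paragraph addresses.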
|
||||
|
||||
<para>
|
||||
In some cases, a user will know that a given XPath query will return
|
||||
only a single result (perhaps a unique document identifier) - if used
|
||||
only a single result (perhaps a unique document identifier) — if used
|
||||
alongside an XPath query returning multiple results, the single-valued
|
||||
result will appear only on the first row of the result. The solution
|
||||
to this is to use the key field as part of a join against a simpler
|
||||
XPath query. As an example:
|
||||
</para>
|
||||
|
||||
<para>
|
||||
<literal>
|
||||
CREATE TABLE test
|
||||
(
|
||||
id int4 NOT NULL,
|
||||
xml text,
|
||||
CONSTRAINT pk PRIMARY KEY (id)
|
||||
)
|
||||
WITHOUT OIDS;
|
||||
|
||||
INSERT INTO test VALUES (1, '<doc num="C1">
|
||||
<line num="L1"><a>1</a><b>2</b><c>3</c></line>
|
||||
<line num="L2"><a>11</a><b>22</b><c>33</c></line>
|
||||
</doc>');
|
||||
|
||||
INSERT INTO test VALUES (2, '<doc num="C2">
|
||||
<line num="L1"><a>111</a><b>222</b><c>333</c></line>
|
||||
<line num="L2"><a>111</a><b>222</b><c>333</c></line>
|
||||
</doc>');
|
||||
</literal>
|
||||
</para>
|
||||
</sect3>
|
||||
|
||||
<sect3>
|
||||
<title>The query</title>
|
||||
|
||||
<programlisting>
|
||||
SELECT * FROM xpath_table('id','xml','test',
|
||||
'/doc/@num|/doc/line/@num|/doc/line/a|/doc/line/b|/doc/line/c','1=1')
|
||||
AS t(id int4, doc_num varchar(10), line_num varchar(10), val1 int4,
|
||||
val2 int4, val3 int4)
|
||||
WHERE id = 1 ORDER BY doc_num, line_num
|
||||
CREATE TABLE test (
|
||||
id int4 NOT NULL,
|
||||
xml text,
|
||||
CONSTRAINT pk PRIMARY KEY (id)
|
||||
);
|
||||
|
||||
INSERT INTO test VALUES (1, '<doc num="C1">
|
||||
<line num="L1"><a>1</a><b>2</b><c>3</c></line>
|
||||
<line num="L2"><a>11</a><b>22</b><c>33</c></line>
|
||||
</doc>');
|
||||
|
||||
INSERT INTO test VALUES (2, '<doc num="C2">
|
||||
<line num="L1"><a>111</a><b>222</b><c>333</c></line>
|
||||
<line num="L2"><a>111</a><b>222</b><c>333</c></line>
|
||||
</doc>');
|
||||
|
||||
SELECT * FROM
|
||||
xpath_table('id','xml','test',
|
||||
'/doc/@num|/doc/line/@num|/doc/line/a|/doc/line/b|/doc/line/c',
|
||||
'true')
|
||||
AS t(id int4, doc_num varchar(10), line_num varchar(10), val1 int4, val2 int4, val3 int4)
|
||||
WHERE id = 1 ORDER BY doc_num, line_num;
|
||||
|
||||
id | doc_num | line_num | val1 | val2 | val3
|
||||
----+---------+----------+------+------+------
|
||||
1 | C1 | L1 | 1 | 2 | 3
|
||||
1 | | L2 | 11 | 22 | 33
|
||||
</programlisting>
|
||||
|
||||
<para>
|
||||
Gives the result:
|
||||
</para>
|
||||
|
||||
<programlisting>
|
||||
id | doc_num | line_num | val1 | val2 | val3
|
||||
----+---------+----------+------+------+------
|
||||
1 | C1 | L1 | 1 | 2 | 3
|
||||
1 | | L2 | 11 | 22 | 33
|
||||
</programlisting>
|
||||
|
||||
<para>
|
||||
To get doc_num on every line, the solution is to use two invocations
|
||||
of xpath_table and join the results:
|
||||
To get doc_num on every line, the solution is to use two invocations
|
||||
of xpath_table and join the results:
|
||||
</para>
|
||||
|
||||
<programlisting>
|
||||
SELECT t.*,i.doc_num FROM
|
||||
xpath_table('id','xml','test',
|
||||
'/doc/line/@num|/doc/line/a|/doc/line/b|/doc/line/c','1=1')
|
||||
AS t(id int4, line_num varchar(10), val1 int4, val2 int4, val3 int4),
|
||||
xpath_table('id','xml','test','/doc/@num','1=1')
|
||||
AS i(id int4, doc_num varchar(10))
|
||||
xpath_table('id', 'xml', 'test',
|
||||
'/doc/line/@num|/doc/line/a|/doc/line/b|/doc/line/c',
|
||||
'true')
|
||||
AS t(id int4, line_num varchar(10), val1 int4, val2 int4, val3 int4),
|
||||
xpath_table('id', 'xml', 'test', '/doc/@num', 'true')
|
||||
AS i(id int4, doc_num varchar(10))
|
||||
WHERE i.id=t.id AND i.id=1
|
||||
ORDER BY doc_num, line_num;
|
||||
</programlisting>
|
||||
|
||||
<para>
|
||||
which gives the desired result:
|
||||
</para>
|
||||
|
||||
<programlisting>
|
||||
id | line_num | val1 | val2 | val3 | doc_num
|
||||
----+----------+------+------+------+---------
|
||||
1 | L1 | 1 | 2 | 3 | C1
|
||||
@ -376,61 +370,57 @@ WHERE t.author_id = p.person_id;
|
||||
</sect3>
|
||||
</sect2>
|
||||
|
||||
|
||||
<sect2>
|
||||
<title>XSLT functions</title>
|
||||
|
||||
<para>
|
||||
The following functions are available if libxslt is installed (this is
|
||||
not currently detected automatically, so you will have to amend the
|
||||
Makefile)
|
||||
Makefile):
|
||||
</para>
|
||||
|
||||
<sect3>
|
||||
<title><literal>xslt_process</literal></title>
|
||||
<programlisting>
|
||||
xslt_process(document,stylesheet,paramlist) RETURNS text
|
||||
</programlisting>
|
||||
|
||||
<synopsis>
|
||||
xslt_process(text document, text stylesheet, text paramlist) returns text
|
||||
</synopsis>
|
||||
|
||||
<para>
|
||||
This function applies the XSL stylesheet to the document and returns
|
||||
the transformed result. The paramlist is a list of parameter
|
||||
assignments to be used in the transformation, specified in the form
|
||||
'a=1,b=2'. Note that this is also proof-of-concept code and the
|
||||
parameter parsing is very simple-minded (e.g. parameter values cannot
|
||||
contain commas!)
|
||||
<literal>a=1,b=2</>. Note that the
|
||||
parameter parsing is very simple-minded: parameter values cannot
|
||||
contain commas!
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Also note that if either the document or stylesheet values do not
|
||||
begin with a < then they will be treated as URLs and libxslt will
|
||||
fetch them. It thus follows that you can use xslt_process as a means
|
||||
to fetch the contents of URLs - you should be aware of the security
|
||||
implications of this.
|
||||
fetch them. It follows that you can use <function>xslt_process</> as a
|
||||
means to fetch the contents of URLs — you should be aware of the
|
||||
security implications of this.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
There is also a two-parameter version of xslt_process which does not
|
||||
pass any parameters to the transformation.
|
||||
There is also a two-parameter version of <function>xslt_process</> which
|
||||
does not pass any parameters to the transformation.
|
||||
</para>
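<para>
As a rough sketch of how the three-parameter form might be called, assuming
the module was built with libxslt support: the document, the stylesheet, and
the parameter name <literal>n</> below are invented for illustration and do
not come from the module itself. Both text arguments begin with a <, so they
are treated as inline content rather than as URLs.
</para>

<programlisting>
-- Hypothetical inline document and stylesheet; n is an illustrative parameter.
SELECT xslt_process(
  '<doc><item>hello</item></doc>',
  '<xsl:stylesheet version="1.0"
       xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
     <xsl:output method="text"/>
     <xsl:param name="n"/>
     <xsl:template match="/">
       <xsl:value-of select="/doc/item"/>
       <xsl:value-of select="$n"/>
     </xsl:template>
   </xsl:stylesheet>',
  'n=5');  -- paramlist: comma-separated name=value pairs, no commas in values
</programlisting>

<para>
Omitting the third argument selects the two-parameter version, in which case
the stylesheet receives no parameter values from the caller.
</para>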
|
||||
</sect3>
|
||||
</sect2>
|
||||
|
||||
<sect2>
|
||||
<title>Credits</title>
|
||||
<title>Author</title>
|
||||
|
||||
<para>
|
||||
Development of this module was sponsored by Torchbox Ltd. (www.torchbox.com)
|
||||
John Gray <email>jgray@azuli.co.uk</email>
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Development of this module was sponsored by Torchbox Ltd. (www.torchbox.com).
|
||||
It has the same BSD licence as PostgreSQL.
|
||||
</para>
|
||||
<para>
|
||||
This version of the XML functions provides both XPath querying and
|
||||
XSLT functionality. There is also a new table function which allows
|
||||
the straightforward return of multiple XML results. Note that the current code
|
||||
doesn't take any particular care over character sets - this is
|
||||
something that should be fixed at some point!
|
||||
</para>
|
||||
<para>
|
||||
If you have any comments or suggestions, please do contact me at
|
||||
<email>jgray@azuli.co.uk.</email> Unfortunately, this isn't my main job, so
|
||||
I can't guarantee a rapid response to your query!
|
||||
</para>
|
||||
</sect2>
|
||||
</sect1>
|
||||
|
||||
</sect1>
|
||||
|