diff --git a/doc/src/sgml/intagg.sgml b/doc/src/sgml/intagg.sgml
index f92af5485a3..b6f37f752cf 100644
--- a/doc/src/sgml/intagg.sgml
+++ b/doc/src/sgml/intagg.sgml
@@ -1,82 +1,145 @@
+
intagg
-
+
intagg
- This section describes the intagg module which provides an integer aggregator and an enumerator.
+ The intagg module provides an integer aggregator and an
+ enumerator.
-
- Many database systems have the notion of a one to many table. Such a table usually sits between two indexed tables, as:
-
-
-CREATE TABLE one_to_many(left INT, right INT) ;
-
+
+
+ Functions
- And it is used like this:
+ The aggregator is an aggregate function
+ int_array_aggregate(integer)>
+ that produces an integer array
+ containing exactly the integers it is fed.
+ Here is a not-tremendously-useful example:
- SELECT right.* from right JOIN one_to_many ON (right.id = one_to_many.right)
- WHERE one_to_many.left = item;
+test=# select int_array_aggregate(i) from
+test-# generate_series(1,10,2) i;
+ int_array_aggregate
+---------------------
+ {1,3,5,7,9}
+(1 row)
- This will return all the items in the right hand table for an entry
+ The enumerator is a function
+ int_array_enum(integer[])>
+ that returns setof integer>. It is essentially the reverse
+ operation of the aggregator: given an array of integers, expand it
+ into a set of rows. For example,
+
+
+
+test=# select * from int_array_enum(array[1,3,5,7,9]);
+ int_array_enum
+----------------
+ 1
+ 3
+ 5
+ 7
+ 9
+(5 rows)
+
+
+
+
+
+ Sample Uses
+
+
+ Many database systems have the notion of a one to many table. Such a table
+ usually sits between two indexed tables, for example:
+
+
+
+CREATE TABLE left (id INT PRIMARY KEY, ...);
+CREATE TABLE right (id INT PRIMARY KEY, ...);
+CREATE TABLE one_to_many(left INT REFERENCES left, right INT REFERENCES right);
+
+
+
+ It is typically used like this:
+
+
+
+ SELECT right.* from right JOIN one_to_many ON (right.id = one_to_many.right)
+ WHERE one_to_many.left = item>;
+
+
+
+ This will return all the items in the right hand table for an entry
in the left hand table. This is a very common construct in SQL.
Now, this methodology can be cumbersome with a very large number of
- entries in the one_to_many table. Depending on the order in which
- data was entered, a join like this could result in an index scan
+ entries in the one_to_many> table. Often,
+ a join like this would result in an index scan
and a fetch for each right hand entry in the table for a particular
- left hand entry. If you have a very dynamic system, there is not much you
+ left hand entry. If you have a very dynamic system, there is not much you
can do. However, if you have some data which is fairly static, you can
create a summary table with the aggregator.
-CREATE TABLE summary as SELECT left, int_array_aggregate(right)
- AS right FROM one_to_many GROUP BY left;
+CREATE TABLE summary as
+ SELECT left, int_array_aggregate(right) AS right
+ FROM one_to_many
+ GROUP BY left;
This will create a table with one row per left item, and an array
of right items. Now this is pretty useless without some way of using
- the array, thats why there is an array enumerator.
+ the array; that's why there is an array enumerator. You can do
+
-SELECT left, int_array_enum(right) FROM summary WHERE left = item;
+SELECT left, int_array_enum(right) FROM summary WHERE left = item>;
- The above query using int_array_enum, produces the same results as:
+ The above query using int_array_enum> produces the same results
+ as
+
-SELECT left, right FROM one_to_many WHERE left = item;
+SELECT left, right FROM one_to_many WHERE left = item>;
-
+
The difference is that the query against the summary table has to get
- only one row from the table, where as the query against "one_to_many"
- must index scan and fetch a row for each entry.
+ only one row from the table, whereas the direct query against
+ one_to_many> must index scan and fetch a row for each entry.
-
- On our system, an EXPLAIN shows a query with a cost of 8488 gets reduced
- to a cost of 329. The query is a join between the one_to_many table,
-
-
-SELECT right, count(right) FROM
-(
- SELECT left, int_array_enum(right) AS right FROM summary JOIN
- (SELECT left FROM left_table WHERE left = item) AS lefts
- ON (summary.left = lefts.left )
-) AS list GROUP BY right ORDER BY count DESC ;
-
-
+
+ On one system, an EXPLAIN> showed that a query with a cost of 8488
+ was reduced to a cost of 329. The original query was a join involving the
+ one_to_many> table, which was replaced by:
+
+
+
+SELECT right, count(right) FROM
+ ( SELECT left, int_array_enum(right) AS right
+ FROM summary JOIN (SELECT left FROM left_table WHERE left = item>) AS lefts
+ ON (summary.left = lefts.left)
+ ) AS list
+ GROUP BY right
+ ORDER BY count DESC;
+
+
+
+
+
diff --git a/doc/src/sgml/intarray.sgml b/doc/src/sgml/intarray.sgml
index a7f29980e59..95121c1e9ba 100644
--- a/doc/src/sgml/intarray.sgml
+++ b/doc/src/sgml/intarray.sgml
@@ -1,196 +1,207 @@
+
+
intarray
-
+
intarray
- This is an implementation of RD-tree data structure using GiST interface
- of PostgreSQL. It has built-in lossy compression.
-
-
-
- Current implementation provides index support for one-dimensional array of
- integers: gist__int_ops, suitable for small and medium size of arrays (used by
- default), and gist__intbig_ops for indexing large arrays (we use superimposed
- signature with length of 4096 bits to represent sets). There is also a
- non-default gin__int_ops for GIN indexes on integer arrays.
+ The intarray> module provides a number of useful functions
+ and operators for manipulating one-dimensional arrays of integers.
+ There is also support for indexed searches using some of the operators.
- Functions
+ intarray> Functions and Operators
-
+
+ intarray> Functions
-
-
- int icount(int[]) - the number of elements in intarray
-
-
-test=# select icount('{1,2,3}'::int[]);
- icount
---------
- 3
-(1 row)
-
-
+
+
+
+ Function
+ Return Type
+ Description
+ Example
+ Result
+
+
-
-
- int[] sort(int[], 'asc' | 'desc') - sort intarray
-
-
-test=# select sort('{1,2,3}'::int[],'desc');
- sort
----------
- {3,2,1}
-(1 row)
-
-
+
+
+ icount(int[])
+ int
+ number of elements in array
+ icount('{1,2,3}'::int[])
+ 3
+
-
-
- int[] sort(int[]) - sort in ascending order
-
-
+
+ sort(int[], text dir)
+ int[]
+ sort array — dir> must be asc> or desc>
+ sort('{1,2,3}'::int[], 'desc')
+ {3,2,1}
+
-
-
- int[] sort_asc(int[]),sort_desc(int[]) - shortcuts for sort
-
-
+
+ sort(int[])
+ int[]
+ sort in ascending order
+ sort(array[11,77,44])
+ {11,44,77}
+
-
-
- int[] uniq(int[]) - returns unique elements
-
-
-test=# select uniq(sort('{1,2,3,2,1}'::int[]));
- uniq
----------
- {1,2,3}
-(1 row)
-
-
+
+ sort_asc(int[])
+ int[]
+ sort in ascending order
+
+
+
-
-
- int idx(int[], int item) - returns index of first
- intarray matching element to item, or '0' if matching failed.
-
-
-test=# select idx('{1,2,3,2,1}'::int[],2);
- idx
------
- 2
-(1 row)
-
-
+
+ sort_desc(int[])
+ int[]
+ sort in descending order
+
+
+
-
-
- int[] subarray(int[],int START [, int LEN]) - returns
- part of intarray starting from element number START (from 1) and length LEN.
-
-
-test=# select subarray('{1,2,3,2,1}'::int[],2,3);
- subarray
-----------
- {2,3,2}
-(1 row)
-
-
+
+ uniq(int[])
+ int[]
+ remove adjacent duplicates
+ uniq(sort('{1,2,3,2,1}'::int[]))
+ {1,2,3}
+
-
-
- int[] intset(int4) - casting int4 to int[]
-
-
-test=# select intset(1);
- intset
---------
- {1}
-(1 row)
-
-
+
+ idx(int[], int item)
+ int
+ index of first element matching item> (0 if none)
+ idx(array[11,22,33,22,11], 22)
+ 2
+
-
-
+
+ subarray(int[], int start, int len)
+ int[]
+ portion of array starting at position start>, len> elements
+ subarray('{1,2,3,2,1}'::int[], 2, 3)
+ {2,3,2}
+
-
- Operations
-
- Operations
-
+
+ subarray(int[], int start)
+ int[]
+ portion of array starting at position start>
+ subarray('{1,2,3,2,1}'::int[], 2)
+ {2,3,2,1}
+
+
+
+ intset(int)
+ int[]
+ make single-element array
+ intset(42)
+ {42}
+
+
+
+
+
+
+
+ intarray> Operators
+
+
Operator
+ Returns
Description
+
int[] && int[]
- overlap - returns TRUE if arrays have at least one common element
+ boolean
+ overlap — true> if arrays have at least one common element
int[] @> int[]
- contains - returns TRUE if left array contains right array
+ boolean
+ contains — true> if left array contains right array
int[] <@ int[]
- contained - returns TRUE if left array is contained in right array
+ boolean
+ contained — true> if left array is contained in right array
# int[]
- returns the number of elements in array
+ int
+ number of elements in array
+
+
+ int[] # int
+ int
+ index (same as idx> function)
int[] + int
- push element to array ( add to end of array)
+ int[]
+ push element onto array (add it to end of array)
int[] + int[]
- merge of arrays (right array added to the end of left one)
+ int[]
+ array concatenation (right array added to the end of left one)
int[] - int
- remove entries matched by right argument from array
+ int[]
+ remove entries matching right argument from array
int[] - int[]
- remove right array from left
+ int[]
+ remove elements of right array from left
int[] | int
- returns intarray - union of arguments
+ int[]
+ union of arguments
int[] | int[]
- returns intarray as a union of two arrays
+ int[]
+ union of arrays
-
int[] & int[]
- returns intersection of arrays
+ int[]
+ intersection of arrays
-
int[] @@ query_int
-
- returns TRUE if array satisfies query (like
- '1&(2|3)')
-
+ boolean
+ true> if array satisfies query (see below)
-
query_int ~~ int[]
- returns TRUE if array satisfies query (commutator of @@)
+ boolean
+ true> if array satisfies query (commutator of @@>)
+
(Before PostgreSQL 8.2, the containment operators @> and <@ were
respectively called @ and ~. These names are still available, but are
@@ -198,85 +209,102 @@ test=# select intset(1);
are reversed from the convention formerly followed by the core geometric
datatypes!)
+
+
+ The @@> and ~~> operators test whether an array
+ satisfies a query>, which is expressed as a value of a
+ specialized data type query_int>. A query>
+ consists of integer values that are checked against the elements of
+ the array, possibly combined using the operators &>
+ (AND), |> (OR), and !> (NOT). Parentheses
+ can be used as needed. For example,
+ the query 1&(2|3)> matches arrays that contain 1
+ and also contain either 2 or 3.
+
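+
+ For instance, the matching behavior of a query_int> value
+ can be checked directly with a pair of queries (an illustrative sketch
+ only, assuming the module is installed):
+
+
+test=# select '{1,2}'::int[] @@ '1&(2|3)'::query_int;
+ ?column?
+----------
+ t
+(1 row)
+
+test=# select '{1,4}'::int[] @@ '1&(2|3)'::query_int;
+ ?column?
+----------
+ f
+(1 row)
+
+
+ The first array contains 1 and 2, satisfying the query; the second
+ contains 1 but neither 2 nor 3, so it does not match.
+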
+
+
+
+ Index Support
+
+
+ intarray> provides index support for the
+ &&>, @>>, <@>,
+ and @@> operators, as well as regular array equality.
+ The implementation uses an RD-tree data structure with
+ built-in lossy compression.
+
+
+
+ Two GiST index operator classes are provided:
+ gist__int_ops> (used by default) is suitable for
+ small and medium-size arrays, while
+ gist__intbig_ops> uses a larger signature and is more
+ suitable for indexing large arrays.
+
+
+
+ There is also a non-default GIN operator class
+ gin__int_ops>.
+
+
+
+ The choice between GiST and GIN indexing depends on the relative
+ performance characteristics of GiST and GIN, which are discussed elsewhere.
+ As a rule of thumb, a GIN index is faster to search than a GiST index, but
+ slower to build or update; so GIN is better suited for static data and GiST
+ for often-updated data.
+
Example
-CREATE TABLE message (mid INT NOT NULL,sections INT[]);
-CREATE TABLE message_section_map (mid INT NOT NULL,sid INT NOT NULL);
+-- a message can be in one or more sections
+CREATE TABLE message (mid INT PRIMARY KEY, sections INT[], ...);
--- create indices
-CREATE unique index message_key ON message ( mid );
-CREATE unique index message_section_map_key2 ON message_section_map (sid, mid );
-CREATE INDEX message_rdtree_idx ON message USING GIST ( sections gist__int_ops);
+-- create specialized index
+CREATE INDEX message_rdtree_idx ON message USING GIST (sections gist__int_ops);
--- select some messages with section in 1 OR 2 - OVERLAP operator
-SELECT message.mid FROM message WHERE message.sections && '{1,2}';
+-- select messages in section 1 OR 2 - OVERLAP operator
+SELECT message.mid FROM message WHERE message.sections && '{1,2}';
--- select messages contains in sections 1 AND 2 - CONTAINS operator
+-- select messages in sections 1 AND 2 - CONTAINS operator
SELECT message.mid FROM message WHERE message.sections @> '{1,2}';
--- the same, CONTAINED operator
-SELECT message.mid FROM message WHERE '{1,2}' <@ message.sections;
+
+-- the same, using QUERY operator
+SELECT message.mid FROM message WHERE message.sections @@ '1&2'::query_int;
Benchmark
+
- subdirectory bench contains benchmark suite.
+ The source directory contrib/intarray/bench> contains a
+ benchmark test suite. To run:
+
- cd ./bench
- 1. createdb TEST
- 2. psql TEST < ../_int.sql
- 3. ./create_test.pl | psql TEST
- 4. ./bench.pl - perl script to benchmark queries, supports OR, AND queries
- with/without RD-Tree. Run script without arguments to
- see availbale options.
-
- a)test without RD-Tree (OR)
- ./bench.pl -d TEST -c -s 1,2 -v
- b)test with RD-Tree
- ./bench.pl -d TEST -c -s 1,2 -v -r
-
- BENCHMARKS:
-
- Size of table <message>: 200000
- Size of table <message_section_map>: 269133
-
- Distribution of messages by sections:
-
- section 0: 74377 messages
- section 1: 16284 messages
- section 50: 1229 messages
- section 99: 683 messages
-
- old - without RD-Tree support,
- new - with RD-Tree
-
- +----------+---------------+----------------+
- |Search set|OR, time in sec|AND, time in sec|
- | +-------+-------+--------+-------+
- | | old | new | old | new |
- +----------+-------+-------+--------+-------+
- | 1| 0.625| 0.101| -| -|
- +----------+-------+-------+--------+-------+
- | 99| 0.018| 0.017| -| -|
- +----------+-------+-------+--------+-------+
- | 1,2| 0.766| 0.133| 0.628| 0.045|
- +----------+-------+-------+--------+-------+
- | 1,2,50,65| 0.794| 0.141| 0.030| 0.006|
- +----------+-------+-------+--------+-------+
+ cd .../bench
+ createdb TEST
+ psql TEST < ../_int.sql
+ ./create_test.pl | psql TEST
+ ./bench.pl
+
+
+ The bench.pl> script has numerous options, which
+ are displayed when it is run without any arguments.
+
Authors
+
- All work was done by Teodor Sigaev (teodor@stack.net) and Oleg
- Bartunov (oleg@sai.msu.su). See
+ All work was done by Teodor Sigaev (teodor@stack.net) and
+ Oleg Bartunov (oleg@sai.msu.su). See
for
additional information. Andrey Oktyabrski did a great work on adding new
functions and operations.
@@ -284,4 +312,3 @@ SELECT message.mid FROM message WHERE '{1,2}' <@ message.sections;
-
diff --git a/doc/src/sgml/isn.sgml b/doc/src/sgml/isn.sgml
index 81b32968188..8d4c8fd3d70 100644
--- a/doc/src/sgml/isn.sgml
+++ b/doc/src/sgml/isn.sgml
@@ -1,429 +1,299 @@
+
+
isn
-
+
isn
- The isn module adds data types for the following
- international-standard namespaces: EAN13, UPC, ISBN (books), ISMN (music),
- and ISSN (serials). This module is inspired by Garrett A. Wollman's
- isbn_issn code.
+ The isn module provides data types for the following
+ international product numbering standards: EAN13, UPC, ISBN (books), ISMN
+ (music), and ISSN (serials). Numbers are validated on input, and correctly
+ hyphenated on output.
-
- This module validates, and automatically adds the correct
- hyphenations to the numbers. Also, it supports the new ISBN-13
- numbers to be used starting in January 2007.
-
-
-
- Premises:
-
-
-
-
- ISBN13, ISMN13, ISSN13 numbers are all EAN13 numbers
-
-
- EAN13 numbers aren't always ISBN13, ISMN13 or ISSN13 (some are)
-
-
- some ISBN13 numbers can be displayed as ISBN
-
-
- some ISMN13 numbers can be displayed as ISMN
-
-
- some ISSN13 numbers can be displayed as ISSN
-
-
- all UPC, ISBN, ISMN and ISSN can be represented as EAN13 numbers
-
-
-
-
-
- All types are internally represented as 64 bit integers,
- and internally all are consistently interchangeable.
-
-
-
-
- We have two operator classes (for btree and for hash) so each data type
- can be indexed for faster access.
-
-
Data types
-
-
- We have the following data types:
-
- Data types
+ isn data types
- Data type
- Description
-
-
-
-
- EAN13
-
-
- European Article Numbers. This type will always show the EAN13-display
- format. Te output function for this is ean13_out()
-
-
-
-
-
- ISBN13
-
-
- For International Standard Book Numbers to be displayed in
- the new EAN13-display format.
-
-
-
-
-
- ISMN13
-
-
- For International Standard Music Numbers to be displayed in
- the new EAN13-display format.
-
-
-
-
- ISSN13
-
-
- For International Standard Serial Numbers to be displayed in the new
- EAN13-display format.
-
-
-
-
- ISBN
-
-
- For International Standard Book Numbers to be displayed in the current
- short-display format.
-
-
-
-
- ISMN
-
-
- For International Standard Music Numbers to be displayed in the
- current short-display format.
-
-
-
-
- ISSN
-
-
- For International Standard Serial Numbers to be displayed in the
- current short-display format. These types will display the short
- version of the ISxN (ISxN 10) whenever it's possible, and it will
- show ISxN 13 when it's impossible to show the short version. The
- output function to do this is isn_out()
-
-
-
-
- UPC
-
-
- For Universal Product Codes. UPC numbers are a subset of the EAN13
- numbers (they are basically EAN13 without the first '0' digit.)
- The output function to do this is also isn_out()
-
-
-
-
-
-
-
-
-
- EAN13, ISBN13,
- ISMN13 and ISSN13 types will always
- display the long version of the ISxN (EAN13). The output function to do
- this is ean13_out().
-
-
- The need for these types is just for displaying in different ways the same
- data: ISBN13 is actually the same as
- ISBN, ISMN13=ISMN and
- ISSN13=ISSN.
-
-
-
-
-
- Input functions
-
-
- We have the following input functions:
-
-
-
- Input functions
-
-
-
- Function
+ Data type
Description
+
- ean13_in()
+ EAN13
-
- To take a string and return an EAN13.
-
+ European Article Numbers, always displayed in the EAN13 display format
- isbn_in()
+ ISBN13
-
- To take a string and return valid ISBN or ISBN13 numbers.
-
+ International Standard Book Numbers to be displayed in
+ the new EAN13 display format
- ismn_in()
+ ISMN13
-
- To take a string and return valid ISMN or ISMN13 numbers.
-
-
-
-
-
- issn_in()
-
-
- To take a string and return valid ISSN or ISSN13 numbers.
-
+ International Standard Music Numbers to be displayed in
+ the new EAN13 display format
- upc_in()
+ ISSN13
-
- To take a string and return an UPC codes.
-
+ International Standard Serial Numbers to be displayed in the new
+ EAN13 display format
+
+
+
+ ISBN
+
+ International Standard Book Numbers to be displayed in the old
+ short display format
+
+
+
+ ISMN
+
+ International Standard Music Numbers to be displayed in the
+ old short display format
+
+
+
+ ISSN
+
+ International Standard Serial Numbers to be displayed in the
+ old short display format
+
+
+
+ UPC
+
+ Universal Product Codes
+
+
+ Some notes:
+
+
+
+
+ ISBN13, ISMN13, ISSN13 numbers are all EAN13 numbers.
+
+
+ EAN13 numbers aren't always ISBN13, ISMN13 or ISSN13 (some
+ are).
+
+
+ Some ISBN13 numbers can be displayed as ISBN.
+
+
+ Some ISMN13 numbers can be displayed as ISMN.
+
+
+ Some ISSN13 numbers can be displayed as ISSN.
+
+
+ UPC numbers are a subset of the EAN13 numbers (they are basically
+ EAN13 without the first 0> digit).
+
+
+ All UPC, ISBN, ISMN and ISSN numbers can be represented as EAN13
+ numbers.
+
+
+
+
+ Internally, all these types use the same representation (a 64-bit
+ integer), and all are interchangeable. Multiple types are provided
+ to control display formatting and to permit tighter validity checking
+ of input that is supposed to denote one particular type of number.
+
+
+
+ The ISBN>, ISMN>, and ISSN> types will display the
+ short version of the number (ISxN 10) whenever it's possible, and will show
+ ISxN 13 format for numbers that do not fit in the short version.
+ The EAN13, ISBN13, ISMN13 and
+ ISSN13 types will always display the long version of the ISxN
+ (EAN13).
+
Casts
- We are able to cast from:
+ The isn> module provides the following pairs of type casts:
+
- ISBN13 -> EAN13
+ ISBN13 <=> EAN13
- ISMN13 -> EAN13
+ ISMN13 <=> EAN13
- ISSN13 -> EAN13
+ ISSN13 <=> EAN13
- ISBN -> EAN13
+ ISBN <=> EAN13
- ISMN -> EAN13
+ ISMN <=> EAN13
- ISSN -> EAN13
+ ISSN <=> EAN13
- UPC -> EAN13
+ UPC <=> EAN13
- ISBN <-> ISBN13
+ ISBN <=> ISBN13
- ISMN <-> ISMN13
+ ISMN <=> ISMN13
- ISSN <-> ISSN13
-
-
-
-
-
-
- C API
-
- The C API is implemented as:
-
-
- extern Datum isn_out(PG_FUNCTION_ARGS);
- extern Datum ean13_out(PG_FUNCTION_ARGS);
- extern Datum ean13_in(PG_FUNCTION_ARGS);
- extern Datum isbn_in(PG_FUNCTION_ARGS);
- extern Datum ismn_in(PG_FUNCTION_ARGS);
- extern Datum issn_in(PG_FUNCTION_ARGS);
- extern Datum upc_in(PG_FUNCTION_ARGS);
-
-
-
- On success:
-
-
-
-
- isn_out() takes any of our types and returns a string containing
- the shortes possible representation of the number.
-
-
-
-
- ean13_out() takes any of our types and returns the
- EAN13 (long) representation of the number.
-
-
-
-
- ean13_in() takes a string and return a EAN13. Which, as stated in (2)
- could or could not be any of our types, but it certainly is an EAN13
- number. Only if the string is a valid EAN13 number, otherwise it fails.
-
-
-
-
- isbn_in() takes a string and return an ISBN/ISBN13. Only if the string
- is really a ISBN/ISBN13, otherwise it fails.
-
-
-
-
- ismn_in() takes a string and return an ISMN/ISMN13. Only if the string
- is really a ISMN/ISMN13, otherwise it fails.
-
-
-
-
- issn_in() takes a string and return an ISSN/ISSN13. Only if the string
- is really a ISSN/ISSN13, otherwise it fails.
-
-
-
-
- upc_in() takes a string and return an UPC. Only if the string is
- really a UPC, otherwise it fails.
+ ISSN <=> ISSN13
- (on failure, the functions 'ereport' the error)
+ When casting from EAN13> to another type, there is a run-time
+ check that the value is within the domain of the other type, and an error
+ is thrown if not. The other casts are simply relabelings that will
+ always succeed.
- Testing functions
+ Functions and Operators
+
+
+ The isn> module provides the standard comparison operators,
+ plus btree and hash indexing support for all these datatypes. In
+ addition there are several specialized functions. In this table,
+ isn> means any one of the module's data types:
+
+
- Testing functions
-
+ isn> functions
+
- Function
- Description
+ Function
+ Returns
+ Description
+
- isn_weak(boolean)
- Sets the weak input mode.
+ isn_weak(boolean)
+ boolean
+ Sets the weak input mode (returns new setting)
- isn_weak()
- Gets the current status of the weak mode.
+ isn_weak()
+ boolean
+ Gets the current status of the weak mode
- make_valid()
- Validates an invalid number (deleting the invalid flag).
+ make_valid(isn)
+ isn
+ Validates an invalid number (clears the invalid flag)
- is_valid()
- Checks for the invalid flag prsence.
+ is_valid(isn)
+ boolean
+ Checks for the presence of the invalid flag
- Weak mode is used to be able to insert invalid data to
- a table. Invalid as in the check digit being wrong, not missing numbers.
+ Weak mode is used to be able to insert invalid data
+ into a table. Invalid means the check digit is wrong, not that there are
+ missing numbers.
+
Why would you want to use the weak mode? Well, it could be that
you have a huge collection of ISBN numbers, and that there are so many of
them that for weird reasons some have the wrong check digit (perhaps the
- numbers where scanned from a printed list and the OCR got the numbers wrong,
- perhaps the numbers were manually captured... who knows.) Anyway, the thing
- is you might want to clean the mess up, but you still want to be able to have
- all the numbers in your database and maybe use an external tool to access
- the invalid numbers in the database so you can verify the information and
- validate it more easily; as selecting all the invalid numbers in the table.
+ numbers were scanned from a printed list and the OCR got the numbers wrong,
+ perhaps the numbers were manually captured... who knows). Anyway, the point
+ is you might want to clean the mess up, but you still want to be able to
+ have all the numbers in your database and maybe use an external tool to
+ locate the invalid numbers in the database so you can verify the
+ information and validate it more easily; so for example you'd want to
+ select all the invalid numbers in the table.
+
When you insert invalid numbers in a table using the weak mode, the number
- will be inserted with the corrected check digit, but it will be flagged
- with an exclamation mark ('!') at the end (i.e. 0-11-000322-5!)
+ will be inserted with the corrected check digit, but it will be displayed
+ with an exclamation mark (!>) at the end, for example
+ 0-11-000322-5!>. This invalid marker can be checked with
+ the is_valid> function and cleared with the
+ make_valid> function.
+
- You can also force the insertion of invalid numbers even not in the weak mode,
- appending the '!' character at the end of the number.
+ You can also force the insertion of invalid numbers even when not in the
+ weak mode, by appending the !> character at the end of the
+ number.
+
+
+
+ Another special feature is that during input, you can write
+ ?> in place of the check digit, and the correct check digit
+ will be inserted automatically.
Examples
+
--Using the types directly:
SELECT isbn('978-0-393-04002-9');
@@ -431,18 +301,18 @@ SELECT isbn13('0901690546');
SELECT issn('1436-4522');
--Casting types:
--- note that you can only cast from ean13 to other type when the casted
--- number would be valid in the realm of the casted type;
--- thus, the following will NOT work: select isbn(ean13('0220356483481'));
--- but these will:
-SELECT upc(ean13('0220356483481'));
-SELECT ean13(upc('220356483481'));
+-- note that you can only cast from ean13 to another type when the
+-- number would be valid in the realm of the target type;
+-- thus, the following will NOT work: select isbn(ean13('0220356483481'));
+-- but these will:
+SELECT upc(ean13('0220356483481'));
+SELECT ean13(upc('220356483481'));
--Create a table with a single column to hold ISBN numbers:
-CREATE TABLE test ( id isbn );
+CREATE TABLE test (id isbn);
INSERT INTO test VALUES('9780393040029');
---Automatically calculating check digits (observe the '?'):
+--Automatically calculate check digits (observe the '?'):
INSERT INTO test VALUES('220500896?');
INSERT INTO test VALUES('978055215372?');
@@ -455,10 +325,10 @@ INSERT INTO test VALUES('978-0-11-000533-4');
INSERT INTO test VALUES('9780141219307');
INSERT INTO test VALUES('2-205-00876-X');
SELECT isn_weak(false);
-
+
SELECT id FROM test WHERE NOT is_valid(id);
-UPDATE test SET id=make_valid(id) WHERE id = '2-205-00876-X!';
-
+UPDATE test SET id = make_valid(id) WHERE id = '2-205-00876-X!';
+
SELECT * FROM test;
SELECT isbn13(id) FROM test;
@@ -467,8 +337,9 @@ SELECT isbn13(id) FROM test;
Bibliography
+
- The information to implement this module was collected through
+ The information to implement this module was collected from
several sites, including:
@@ -477,26 +348,33 @@ SELECT isbn13(id) FROM test;
http://www.ismn-international.org/
http://www.wikipedia.org/
+
- the prefixes used for hyphenation where also compiled from:
+ The prefixes used for hyphenation were also compiled from:
http://www.gs1.org/productssolutions/idkeys/support/prefix_list.html
http://www.isbn-international.org/en/identifiers.html
http://www.ismn-international.org/ranges.html
+
Care was taken during the creation of the algorithms and they
were meticulously verified against the suggested algorithms
in the official ISBN, ISMN, ISSN User Manuals.
-
+
Author
Germán Méndez Bravo (Kronuz), 2004 - 2006
-
-
+
+ This module was inspired by Garrett A. Wollman's
+ isbn_issn code.
+
+
+
+
diff --git a/doc/src/sgml/ltree.sgml b/doc/src/sgml/ltree.sgml
index f98555c31ac..22bd019c7db 100644
--- a/doc/src/sgml/ltree.sgml
+++ b/doc/src/sgml/ltree.sgml
@@ -1,438 +1,568 @@
+
ltree
-
+
ltree
- ltree is a PostgreSQL module that contains implementation
- of data types, indexed access methods and queries for data organized as a
- tree-like structures.
+ This module implements a data type ltree> for representing
+ labels of data stored in a hierarchical tree-like structure.
+ Extensive facilities for searching through label trees are provided.
Definitions
+
- A label of a node is a sequence of one or more words
- separated by blank character '_' and containing letters and digits ( for
- example, [a-zA-Z0-9] for C locale). The length of a label is limited by 256
- bytes.
+ A label is a sequence of alphanumeric characters
+ and underscores (for example, in C locale the characters
+ A-Za-z0-9_> are allowed). Labels must be less than 256 bytes
+ long.
+
- Example: 'Countries', 'Personal_Services'
+ Examples: 42>, Personal_Services>
+
- A label path of a node is a sequence of one or more
- dot-separated labels l1.l2...ln, represents path from root to the node. The
- length of a label path is limited by 65Kb, but size <= 2Kb is preferrable.
- We consider it's not a strict limitation (maximal size of label path for
- DMOZ catalogue - , is about 240
- bytes!)
+ A label path is a sequence of zero or more
+ labels separated by dots, for example L1.L2.L3>, representing
+ a path from the root of a hierarchical tree to a particular node. The
+ length of a label path must be less than 65Kb, but keeping it under 2Kb is
+ preferable. In practice this is not a major limitation; for example,
+ the longest label path in the DMOZ catalogue () is about 240 bytes.
+
- Example: 'Top.Countries.Europe.Russia'
+ Example: Top.Countries.Europe.Russia
+
- We introduce several datatypes:
+ The ltree> module provides several datatypes:
+
- ltree - is a datatype for label path.
-
-
-
-
- ltree[] - is a datatype for arrays of ltree.
+ ltree stores a label path.
+
- lquery
- - is a path expression that has regular expression in the label path and
- used for ltree matching. Star symbol (*) is used to specify any number of
- labels (levels) and could be used at the beginning and the end of lquery,
- for example, '*.Europe.*'.
+ lquery represents a regular-expression-like pattern
+ for matching ltree> values. A simple word matches that
+ label within a path. A star symbol (*>) matches zero
+ or more labels. For example:
+
+foo Match the exact label path foo>
+*.foo.* Match any label path containing the label foo>
+*.foo Match any label path whose last label is foo>
+
+
- The following quantifiers are recognized for '*' (like in Perl):
+ Star symbols can also be quantified to restrict how many labels
+ they can match:
+
+*{n>} Match exactly n> labels
+*{n>,} Match at least n> labels
+*{n>,m>} Match at least n> but not more than m> labels
+*{,m>} Match at most m> labels — same as *{0,m>}
+
-
-
- {n} Match exactly n levels
-
-
- {n,} Match at least n levels
-
-
- {n,m} Match at least n but not more than m levels
-
-
- {,m} Match at maximum m levels (eq. to {0,m})
-
-
+
- It is possible to use several modifiers at the end of a label:
+ There are several modifiers that can be put at the end of a non-star
+ label in lquery> to make it match more than just the exact match:
+
+@ Match case-insensitively, for example a@> matches A>
+* Match any label with this prefix, for example foo*> matches foobar>
+% Match initial underscore-separated words
+
+ The behavior of %> is a bit complicated. It tries to match
+ words rather than the entire label. For example
+ foo_bar%> matches foo_bar_baz> but not
+ foo_barbaz>. If combined with *>, prefix
+ matching applies to each word separately, for example
+ foo_bar%*> matches foo1_bar2_baz> but
+ not foo1_br2_baz>.
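+
+ The word-matching behavior described above can be demonstrated with the
+ ~> match operator (a sketch for illustration, using the
+ paths from the preceding paragraph):
+
+
+test=# select 'foo_bar_baz'::ltree ~ 'foo_bar%'::lquery;
+ ?column?
+----------
+ t
+(1 row)
+
+test=# select 'foo_barbaz'::ltree ~ 'foo_bar%'::lquery;
+ ?column?
+----------
+ f
+(1 row)
+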
-
-
- @ Do case-insensitive label matching
-
-
- * Do prefix matching for a label
-
-
- % Don't account word separator '_' in label matching, that is
- 'Russian%' would match 'Russian_nations', but not 'Russian'
-
-
-
-
+
- lquery can contain logical '!' (NOT) at the beginning
- of the label and '|' (OR) to specify possible alternatives for label
- matching.
+ Also, you can write several possibly-modified labels separated with
+ |> (OR) to match any of those labels, and you can put
+ !> (NOT) at the start to match any label that doesn't
+ match any of the alternatives.
-
- Example of lquery:
-
-
+
+
+ Here's an annotated example of lquery:
+
Top.*{0,2}.sport*@.!football|tennis.Russ*|Spain
- a) b) c) d) e)
+ a. b. c. d. e.
-
- A label path should
+ This query will match any label path that:
- begin from a node with label 'Top'
+ begins with the label Top
- and following zero or 2 labels until
+ and next has zero to two labels before
- a node with label beginning from case-insensitive prefix 'sport'
+ a label beginning with the case-insensitive prefix sport
- following node with label not matched 'football' or 'tennis' and
+ then a label not matching football nor
+ tennis
- end on node with label beginning from 'Russ' or strictly matched
- 'Spain'.
+ and then ends with a label beginning with Russ or
+ exactly matching Spain.
-
- ltxtquery
- - is a datatype for label searching (like type 'query' for full text
- searching, see contrib/tsearch). It's possible to use modifiers @,%,* at
- the end of word. The meaning of modifiers are the same as for lquery.
+ ltxtquery represents a full-text-search-like
+ pattern for matching ltree> values. An
+ ltxtquery value contains words, possibly with the
+ modifiers @>, *>, %> at the end;
+ the modifiers have the same meanings as in lquery>.
+ Words can be combined with &> (AND),
+ |> (OR), !> (NOT), and parentheses.
+ The key difference from
+ lquery> is that ltxtquery matches words without
+ regard to their position in the label path.
+
- Example: 'Europe & Russia*@ & !Transportation'
-
-
- Search paths contain words 'Europe' and 'Russia*' (case-insensitive) and
- not 'Transportation'. Notice, the order of words as they appear in label
- path is not important !
+ Here's an example ltxtquery:
+
+ Europe & Russia*@ & !Transportation
+
+ This will match paths that contain the label Europe and
+ any label beginning with Russia (case-insensitive),
+ but not paths containing the label Transportation.
+ The location of these words within the path is not important.
+ Also, when %> is used, the word can be matched to any
+ underscore-separated word within a label, regardless of position.
+
+
+ Note: ltxtquery> allows whitespace between symbols, but
+ ltree> and lquery> do not.
+
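The position-independent word matching that distinguishes ltxtquery from lquery can be sketched as follows (a Python model of the documented behavior, not the implementation; the path and function name are hypothetical):

```python
def has_word(path, word, ci=False, prefix=False):
    """Model of one ltxtquery word: does any label anywhere in the
    path match it?  ci models the '@' modifier, prefix models '*'.
    Position within the path is deliberately ignored."""
    labels = path.split(".")
    if ci:
        word, labels = word.lower(), [l.lower() for l in labels]
    return any(l.startswith(word) if prefix else l == word
               for l in labels)

path = "Top.Europe.RUSSIA_Transport"   # hypothetical label path
# Evaluating  Europe & Russia*@ & !Transportation  by hand:
ok = (has_word(path, "Europe")
      and has_word(path, "Russia", ci=True, prefix=True)
      and not has_word(path, "Transportation"))
assert ok
```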
- Operations
+ Operators and Functions
+
- The following operations are defined for type ltree:
+ Type ltree> has the usual comparison operators
+ =>, <>,
+ <>, >>, <=>, >=>.
+ Comparison sorts in the order of a tree traversal, with the children
+ of a node sorted by label text. In addition, there are the following
+ specialized operators:
+
+
+
+ ltree> Operators
+
+
+
+
+ Operator
+ Returns
+ Description
+
+
+
+
+
+ ltree> @>> ltree>
+ boolean
+ is left argument an ancestor of right (or equal)?
+
+
+
+ ltree> <@> ltree>
+ boolean
+ is left argument a descendant of right (or equal)?
+
+
+
+ ltree> ~> lquery>
+ boolean
+ does ltree> match lquery>?
+
+
+
+ lquery> ~> ltree>
+ boolean
+ does ltree> match lquery>?
+
+
+
+ ltree> ?> lquery[]>
+ boolean
+ does ltree> match any lquery> in array?
+
+
+
+ lquery[]> ?> ltree>
+ boolean
+ does ltree> match any lquery> in array?
+
+
+
+ ltree> @> ltxtquery>
+ boolean
+ does ltree> match ltxtquery>?
+
+
+
+ ltxtquery> @> ltree>
+ boolean
+ does ltree> match ltxtquery>?
+
+
+
+ ltree> ||> ltree>
+ ltree
+ concatenate ltree> paths
+
+
+
+ ltree> ||> text>
+ ltree
+ convert text to ltree> and concatenate
+
+
+
+ text> ||> ltree>
+ ltree
+ convert text to ltree> and concatenate
+
+
+
+ ltree[]> @>> ltree>
+ boolean
+ does array contain an ancestor of ltree>?
+
+
+
+ ltree> <@> ltree[]>
+ boolean
+ does array contain an ancestor of ltree>?
+
+
+
+ ltree[]> <@> ltree>
+ boolean
+ does array contain a descendant of ltree>?
+
+
+
+ ltree> @>> ltree[]>
+ boolean
+ does array contain a descendant of ltree>?
+
+
+
+ ltree[]> ~> lquery>
+ boolean
+ does array contain any path matching lquery>?
+
+
+
+ lquery> ~> ltree[]>
+ boolean
+ does array contain any path matching lquery>?
+
+
+
+ ltree[]> ?> lquery[]>
+ boolean
+ does ltree> array contain any path matching any lquery>?
+
+
+
+ lquery[]> ?> ltree[]>
+ boolean
+ does ltree> array contain any path matching any lquery>?
+
+
+
+ ltree[]> @> ltxtquery>
+ boolean
+ does array contain any path matching ltxtquery>?
+
+
+
+ ltxtquery> @> ltree[]>
+ boolean
+ does array contain any path matching ltxtquery>?
+
+
+
+ ltree[]> ?@>> ltree>
+ ltree
+ first array entry that is an ancestor of ltree>; NULL if none
+
+
+
+ ltree[]> ?<@> ltree>
+ ltree
+ first array entry that is a descendant of ltree>; NULL if none
+
+
+
+ ltree[]> ?~> lquery>
+ ltree
+ first array entry that matches lquery>; NULL if none
+
+
+
+ ltree[]> ?@> ltxtquery>
+ ltree
+ first array entry that matches ltxtquery>; NULL if none
+
+
+
+
+
+
+
+ The operators <@, @>,
+ @ and ~ have analogues
+ ^<@>, ^@>>, ^@>,
+ ^~, which are the same except they do not use
+ indexes. These are useful only for testing purposes.
+
+
+
+ The following functions are available:
+
+
+
+ ltree> Functions
+
+
+
+
+ Function
+ Return Type
+ Description
+ Example
+ Result
+
+
+
+
+
+ subltree(ltree, int start, int end)
+ ltree
+ subpath of ltree> from position start> to
+ position end>-1 (counting from 0)
+ subltree('Top.Child1.Child2',1,2)
+ Child1
+
+
+
+ subpath(ltree, int offset, int len)
+ ltree
+ subpath of ltree> starting at position
+ offset>, length len>.
+ If offset> is negative, subpath starts that far from the
+ end of the path. If len> is negative, leaves that many
+ labels off the end of the path.
+ subpath('Top.Child1.Child2',0,2)
+ Top.Child1
+
+
+
+ subpath(ltree, int offset)
+ ltree
+ subpath of ltree> starting at position
+ offset>, extending to end of path.
+ If offset> is negative, subpath starts that far from the
+ end of the path.
+ subpath('Top.Child1.Child2',1)
+ Child1.Child2
+
+
+
+ nlevel(ltree)
+ integer
+ number of labels in path
+ nlevel('Top.Child1.Child2')
+ 3
+
+
+
+ index(ltree a, ltree b)
+ integer
+ position of first occurrence of b> in
+ a>; -1 if not found
+ index('0.1.2.3.5.4.5.6.8.5.6.8','5.6')
+ 6
+
+
+
+ index(ltree a, ltree b, int offset)
+ integer
+ position of first occurrence of b> in
+ a>, searching starting at offset>;
+ negative offset> means start -offset>
+ labels from the end of the path
+ index('0.1.2.3.5.4.5.6.8.5.6.8','5.6',-4)
+ 9
+
+
+
+ text2ltree(text)
+ ltree
+ cast text> to ltree>
+
+
+
+
+
+ ltree2text(ltree)
+ text
+ cast ltree> to text>
+
+
+
+
+
+ lca(ltree, ltree, ...)
+ ltree
+ lowest common ancestor, i.e., longest common prefix of paths
+ (up to 8 arguments supported)
+ lca('1.2.2.3','1.2.3.4.5.6')
+ 1.2
+
+
+
+ lca(ltree[])
+ ltree
+ lowest common ancestor, i.e., longest common prefix of paths
+ lca(array['1.2.2.3'::ltree,'1.2.3'])
+ 1.2
+
+
+
+
+
+
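The lca functions compute the longest common prefix of their argument paths. A minimal Python sketch of that semantics for distinct paths (illustrative only; note that for identical inputs the real ltree lca returns the paths' parent, a case this sketch does not model):

```python
def lca(*paths):
    """Longest common prefix of dotted label paths -- a sketch of the
    documented lca() behavior for distinct paths, not the ltree
    implementation.  Returns None when there is no common ancestor."""
    split = [p.split(".") for p in paths]
    prefix = []
    for labels in zip(*split):
        if len(set(labels)) != 1:
            break
        prefix.append(labels[0])
    return ".".join(prefix) if prefix else None

assert lca("1.2.2.3", "1.2.3.4.5.6") == "1.2"
assert lca("a.b", "c.d") is None
```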
+
+
+ Indexes
+
+ ltree> supports several types of indexes that can speed
+ up the indicated operators:
- <,>,<=,>=,=, <>
- - Have their usual meanings. Comparison is doing in the order of direct
- tree traversing, children of a node are sorted lexicographic.
+ B-tree index over ltree>:
+ <>, <=>, =>,
+ >=>, >
- ltree @> ltree
- - returns TRUE if left argument is an ancestor of right argument (or
- equal).
-
-
-
-
- ltree <@ ltree
- - returns TRUE if left argument is a descendant of right argument (or
- equal).
-
-
-
-
- ltree ~ lquery, lquery ~ ltree
- - return TRUE if node represented by ltree satisfies lquery.
-
-
-
-
- ltree ? lquery[], lquery ? ltree[]
- - return TRUE if node represented by ltree satisfies at least one lquery
- from array.
-
-
-
-
- ltree @ ltxtquery, ltxtquery @ ltree
- - return TRUE if node represented by ltree satisfies ltxtquery.
-
-
-
-
- ltree || ltree, ltree || text, text || ltree
- - return concatenated ltree.
-
-
-
-
-
- Operations for arrays of ltree (ltree[]):
-
-
-
-
- ltree[] @> ltree, ltree <@ ltree[]
- - returns TRUE if array ltree[] contains an ancestor of ltree.
-
-
-
-
- ltree @> ltree[], ltree[] <@ ltree
- - returns TRUE if array ltree[] contains a descendant of ltree.
-
-
-
-
- ltree[] ~ lquery, lquery ~ ltree[]
- - returns TRUE if array ltree[] contains label paths matched lquery.
-
-
-
-
- ltree[] ? lquery[], lquery[] ? ltree[]
- - returns TRUE if array ltree[] contains label paths matched atleaset one
- lquery from array.
-
-
-
-
- ltree[] @ ltxtquery, ltxtquery @ ltree[]
- - returns TRUE if array ltree[] contains label paths matched ltxtquery
- (full text search).
-
-
-
-
- ltree[] ?@> ltree, ltree ?<@ ltree[], ltree[] ?~ lquery, ltree[] ?@ ltxtquery
-
- - returns first element of array ltree[] satisfies corresponding condition
- and NULL in vice versa.
-
-
-
-
-
-
- Remark
-
-
- Operations <@, @>, @ and
- ~ have analogues - ^<@, ^@>, ^@, ^~, which don't use
- indices!
-
-
-
-
- Indices
-
- Various indices could be created to speed up execution of operations:
-
-
-
-
-
- B-tree index over ltree: <, <=, =, >=, >
-
-
-
-
- GiST index over ltree: <, <=, =, >=, >, @>, <@, @, ~, ?
+ GiST index over ltree>:
+ <>, <=>, =>,
+ >=>, >>,
+ @>>, <@>,
+ @>, ~>, ?
- Example:
+ Example of creating such an index:
CREATE INDEX path_gist_idx ON test USING GIST (path);
- GiST index over ltree[]:
- ltree[]<@ ltree, ltree @> ltree[], @, ~, ?.
+
+ GiST index over ltree[]>:
+ ltree[] <@ ltree>, ltree @> ltree[]>,
+ @>, ~>, ?
- Example:
+ Example of creating such an index:
- CREATE INDEX path_gist_idx ON test USING GIST (array_path);
+ CREATE INDEX path_gist_idx ON test USING GIST (array_path);
- Notices: This index is lossy.
+ Note: This index type is lossy.
-
- Functions
-
-
-
-
- ltree subltree(ltree, start, end)
- returns subpath of ltree from start (inclusive) until the end.
-
-
- # select subltree('Top.Child1.Child2',1,2);
- subltree
- --------
- Child1
-
-
-
-
- ltree subpath(ltree, OFFSET,LEN) and
- ltree subpath(ltree, OFFSET)
- returns subpath of ltree from OFFSET (inclusive) with length LEN.
- If OFFSET is negative returns subpath starts that far from the end
- of the path. If LENGTH is omitted, returns everything to the end
- of the path. If LENGTH is negative, leaves that many labels off
- the end of the path.
-
-
- # select subpath('Top.Child1.Child2',1,2);
- subpath
- -------
- Child1.Child2
-
- # select subpath('Top.Child1.Child2',-2,1);
- subpath
- ---------
- Child1
-
-
-
-
- int4 nlevel(ltree) - returns level of the node.
-
-
- # select nlevel('Top.Child1.Child2');
- nlevel
- --------
- 3
-
-
- Note, that arguments start, end, OFFSET, LEN have meaning of level of the
- node !
-
-
-
-
- int4 index(ltree,ltree) and
- int4 index(ltree,ltree,OFFSET)
- returns number of level of the first occurence of second argument in first
- one beginning from OFFSET. if OFFSET is negative, than search begins from |
- OFFSET| levels from the end of the path.
-
-
- SELECT index('0.1.2.3.5.4.5.6.8.5.6.8','5.6',3);
- index
- -------
- 6
- SELECT index('0.1.2.3.5.4.5.6.8.5.6.8','5.6',-4);
- index
- -------
- 9
-
-
-
-
- ltree text2ltree(text) and
- text ltree2text(text) cast functions for ltree and text.
-
-
-
-
- ltree lca(ltree,ltree,...) (up to 8 arguments) and
- ltree lca(ltree[]) Returns Lowest Common Ancestor (lca).
-
-
- # select lca('1.2.2.3','1.2.3.4.5.6');
- lca
- -----
- 1.2
- # select lca('{la.2.3,1.2.3.4.5.6}') is null;
- ?column?
- ----------
- f
-
-
-
-
-
-
- Installation
-
- cd contrib/ltree
- make
- make install
- make installcheck
-
-
-
Example
-
- createdb ltreetest
- psql ltreetest < /usr/local/pgsql/share/contrib/ltree.sql
- psql ltreetest < ltreetest.sql
-
-Now, we have a database ltreetest populated with a data describing hierarchy
-shown below:
+ This example uses the following data (also available in file
+ contrib/ltree/ltreetest.sql> in the source distribution):
+CREATE TABLE test (path ltree);
+INSERT INTO test VALUES ('Top');
+INSERT INTO test VALUES ('Top.Science');
+INSERT INTO test VALUES ('Top.Science.Astronomy');
+INSERT INTO test VALUES ('Top.Science.Astronomy.Astrophysics');
+INSERT INTO test VALUES ('Top.Science.Astronomy.Cosmology');
+INSERT INTO test VALUES ('Top.Hobbies');
+INSERT INTO test VALUES ('Top.Hobbies.Amateurs_Astronomy');
+INSERT INTO test VALUES ('Top.Collections');
+INSERT INTO test VALUES ('Top.Collections.Pictures');
+INSERT INTO test VALUES ('Top.Collections.Pictures.Astronomy');
+INSERT INTO test VALUES ('Top.Collections.Pictures.Astronomy.Stars');
+INSERT INTO test VALUES ('Top.Collections.Pictures.Astronomy.Galaxies');
+INSERT INTO test VALUES ('Top.Collections.Pictures.Astronomy.Astronauts');
+CREATE INDEX path_gist_idx ON test USING gist(path);
+CREATE INDEX path_idx ON test USING btree(path);
+
-
- TOP
- / | \
- Science Hobbies Collections
+
+ Now, we have a table test> populated with data describing
+ the hierarchy shown below:
+
+
+
+ Top
+ / | \
+ Science Hobbies Collections
/ | \
Astronomy Amateurs_Astronomy Pictures
/ \ |
@@ -440,13 +570,14 @@ shown below:
/ | \
Galaxies Stars Astronauts
+
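The ancestor/descendant tests used in the queries below reduce to a label-prefix check: one path is a descendant of another exactly when the ancestor's label sequence is a prefix of its own. A small Python sketch of that rule (an illustration of the operator's meaning, not the indexed implementation):

```python
def is_descendant(path, ancestor):
    """Model of  path <@ ancestor : true when ancestor's labels are a
    prefix of path's labels.  A path counts as its own descendant."""
    p = path.split(".")
    a = ancestor.split(".")
    return p[:len(a)] == a

assert is_descendant("Top.Science.Astronomy", "Top.Science")
assert is_descendant("Top.Science", "Top.Science")
assert not is_descendant("Top.Hobbies", "Top.Science")
```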
- Inheritance:
+ We can do inheritance:
ltreetest=# select path from test where path <@ 'Top.Science';
- path
+ path
------------------------------------
Top.Science
Top.Science.Astronomy
@@ -454,12 +585,14 @@ ltreetest=# select path from test where path <@ 'Top.Science';
Top.Science.Astronomy.Cosmology
(4 rows)
+
- Matching:
+ Here are some examples of path matching:
+
ltreetest=# select path from test where path ~ '*.Astronomy.*';
- path
+ path
-----------------------------------------------
Top.Science.Astronomy
Top.Science.Astronomy.Astrophysics
@@ -469,20 +602,22 @@ ltreetest=# select path from test where path ~ '*.Astronomy.*';
Top.Collections.Pictures.Astronomy.Galaxies
Top.Collections.Pictures.Astronomy.Astronauts
(7 rows)
+
ltreetest=# select path from test where path ~ '*.!pictures@.*.Astronomy.*';
- path
+ path
------------------------------------
Top.Science.Astronomy
Top.Science.Astronomy.Astrophysics
Top.Science.Astronomy.Cosmology
(3 rows)
+
- Full text search:
+ Here are some examples of full text search:
ltreetest=# select path from test where path @ 'Astro*% & !pictures@';
- path
+ path
------------------------------------
Top.Science.Astronomy
Top.Science.Astronomy.Astrophysics
@@ -491,54 +626,38 @@ ltreetest=# select path from test where path @ 'Astro*% & !pictures@';
(4 rows)
ltreetest=# select path from test where path @ 'Astro* & !pictures@';
- path
+ path
------------------------------------
Top.Science.Astronomy
Top.Science.Astronomy.Astrophysics
Top.Science.Astronomy.Cosmology
(3 rows)
+
- Using Functions:
+ Path construction using functions:
ltreetest=# select subpath(path,0,2)||'Space'||subpath(path,2) from test where path <@ 'Top.Science.Astronomy';
- ?column?
+ ?column?
------------------------------------------
Top.Science.Space.Astronomy
Top.Science.Space.Astronomy.Astrophysics
Top.Science.Space.Astronomy.Cosmology
(3 rows)
-We could create SQL-function:
-CREATE FUNCTION ins_label(ltree, int4, text) RETURNS ltree
+
+
+
+ We could simplify this by creating a SQL function that inserts a label
+ at a specified position in a path:
+
+
+CREATE FUNCTION ins_label(ltree, int, text) RETURNS ltree
AS 'select subpath($1,0,$2) || $3 || subpath($1,$2);'
LANGUAGE SQL IMMUTABLE;
-
-
- and previous select could be rewritten as:
-
-
ltreetest=# select ins_label(path,2,'Space') from test where path <@ 'Top.Science.Astronomy';
- ins_label
-------------------------------------------
- Top.Science.Space.Astronomy
- Top.Science.Space.Astronomy.Astrophysics
- Top.Science.Space.Astronomy.Cosmology
-(3 rows)
-
-
-
- Or with another arguments:
-
-
-
-CREATE FUNCTION ins_label(ltree, ltree, text) RETURNS ltree
-AS 'select subpath($1,0,nlevel($2)) || $3 || subpath($1,nlevel($2));'
-LANGUAGE SQL IMMUTABLE;
-
-ltreetest=# select ins_label(path,'Top.Science'::ltree,'Space') from test where path <@ 'Top.Science.Astronomy';
- ins_label
+ ins_label
------------------------------------------
Top.Science.Space.Astronomy
Top.Science.Space.Astronomy.Astrophysics
@@ -547,224 +666,16 @@ ltreetest=# select ins_label(path,'Top.Science'::ltree,'Space') from test where
-
- Additional data
-
- To get more feeling from our ltree module you could download
- dmozltree-eng.sql.gz (about 3Mb tar.gz archive containing 300,274 nodes),
- available from
-
- dmozltree-eng.sql.gz, which is DMOZ catalogue, prepared for use with ltree.
- Setup your test database (dmoz), load ltree module and issue command:
-
-
- zcat dmozltree-eng.sql.gz| psql dmoz
-
-
- Data will be loaded into database dmoz and all indices will be created.
-
-
-
-
- Benchmarks
-
- All runs were performed on my IBM ThinkPad T21 (256 MB RAM, 750Mhz) using DMOZ
- data, containing 300,274 nodes (see above for download link). We used some
- basic queries typical for walking through catalog.
-
-
-
- Queries
-
-
-
- Q0: Count all rows (sort of base time for comparison)
-
-
- select count(*) from dmoz;
- count
- --------
- 300274
- (1 row)
-
-
-
-
- Q1: Get direct children (without inheritance)
-
-
- select path from dmoz where path ~ 'Top.Adult.Arts.Animation.*{1}';
- path
- -----------------------------------
- Top.Adult.Arts.Animation.Cartoons
- Top.Adult.Arts.Animation.Anime
- (2 rows)
-
-
-
-
- Q2: The same as Q1 but with counting of successors
-
-
- select path as parentpath , (select count(*)-1 from dmoz where path <@
- p.path) as count from dmoz p where path ~ 'Top.Adult.Arts.Animation.*{1}';
- parentpath | count
- -----------------------------------+-------
- Top.Adult.Arts.Animation.Cartoons | 2
- Top.Adult.Arts.Animation.Anime | 61
- (2 rows)
-
-
-
-
- Q3: Get all parents
-
-
- select path from dmoz where path @> 'Top.Adult.Arts.Animation' order by
- path asc;
- path
- --------------------------
- Top
- Top.Adult
- Top.Adult.Arts
- Top.Adult.Arts.Animation
- (4 rows)
-
-
-
-
- Q4: Get all parents with counting of children
-
-
- select path, (select count(*)-1 from dmoz where path <@ p.path) as count
- from dmoz p where path @> 'Top.Adult.Arts.Animation' order by path asc;
- path | count
- --------------------------+--------
- Top | 300273
- Top.Adult | 4913
- Top.Adult.Arts | 339
- Top.Adult.Arts.Animation | 65
- (4 rows)
-
-
-
-
- Q5: Get all children with levels
-
-
- select path, nlevel(path) - nlevel('Top.Adult.Arts.Animation') as level
- from dmoz where path ~ 'Top.Adult.Arts.Animation.*{1,2}' order by path asc;
- path | level
- ------------------------------------------------+-------
- Top.Adult.Arts.Animation.Anime | 1
- Top.Adult.Arts.Animation.Anime.Fan_Works | 2
- Top.Adult.Arts.Animation.Anime.Games | 2
- Top.Adult.Arts.Animation.Anime.Genres | 2
- Top.Adult.Arts.Animation.Anime.Image_Galleries | 2
- Top.Adult.Arts.Animation.Anime.Multimedia | 2
- Top.Adult.Arts.Animation.Anime.Resources | 2
- Top.Adult.Arts.Animation.Anime.Titles | 2
- Top.Adult.Arts.Animation.Cartoons | 1
- Top.Adult.Arts.Animation.Cartoons.AVS | 2
- Top.Adult.Arts.Animation.Cartoons.Members | 2
- (11 rows)
-
-
-
-
-
-
- Timings
-
-+---------------------------------------------+
-|Query|Rows|Time (ms) index|Time (ms) no index|
-|-----+----+---------------+------------------|
-| Q0| 1| NA| 1453.44|
-|-----+----+---------------+------------------|
-| Q1| 2| 0.49| 1001.54|
-|-----+----+---------------+------------------|
-| Q2| 2| 1.48| 3009.39|
-|-----+----+---------------+------------------|
-| Q3| 4| 0.55| 906.98|
-|-----+----+---------------+------------------|
-| Q4| 4| 24385.07| 4951.91|
-|-----+----+---------------+------------------|
-| Q5| 11| 0.85| 1003.23|
-+---------------------------------------------+
-
-
- Timings without indices were obtained using operations which doesn't use
- indices (see above)
-
-
-
-
- Remarks
-
- We didn't run full-scale tests, also we didn't present (yet) data for
- operations with arrays of ltree (ltree[]) and full text searching. We'll
- appreciate your input. So far, below some (rather obvious) results:
-
-
-
-
- Indices does help execution of queries
-
-
-
-
- Q4 performs bad because one needs to read almost all data from the HDD
-
-
-
-
-
-
- Some Backgrounds
-
- The approach we use for ltree is much like one we used in our other GiST based
- contrib modules (intarray, tsearch, tree, btree_gist, rtree_gist). Theoretical
- background is available in papers referenced from our GiST development page
- ().
-
-
- A hierarchical data structure (tree) is a set of nodes. Each node has a
- signature (LPS) of a fixed size, which is a hashed label path of that node.
- Traversing a tree we could *certainly* prune branches if
-
-
- LQS (bitwise AND) LPS != LQS
-
-
- where LQS is a signature of lquery or ltxtquery, obtained in the same way as
- LPS.
-
-
- ltree[]:
-
-
- For array of ltree LPS is a bitwise OR-ed signatures of *ALL* children
- reachable from that node. Signatures are stored in RD-tree, implemented using
- GiST, which provides indexed access.
-
-
- ltree:
-
-
- For ltree we store LPS in a B-tree, implemented using GiST. Each node entry is
- represented by (left_bound, signature, right_bound), so that we could speedup
- operations <, <=, =, >=, > using left_bound, right_bound and prune branches of
- a tree using signature.
-
-
Authors
+
- All work was done by Teodor Sigaev (teodor@stack.net) and
- Oleg Bartunov (oleg@sai.msu.su). See
+ All work was done by Teodor Sigaev (teodor@stack.net) and
+ Oleg Bartunov (oleg@sai.msu.su). See
for
- additional information. Authors would like to thank Eugeny Rodichev for
+ additional information. The authors would like to thank Eugeny Rodichev for
helpful discussions. Comments and bug reports are welcome.
+
diff --git a/doc/src/sgml/oid2name.sgml b/doc/src/sgml/oid2name.sgml
index 6fb4c181c44..eb91827c7bf 100644
--- a/doc/src/sgml/oid2name.sgml
+++ b/doc/src/sgml/oid2name.sgml
@@ -1,70 +1,157 @@
+
+
oid2name
-
+
oid2name
- This utility allows administrators to examine the file structure used by
- PostgreSQL. To make use of it, you need to be familiar with the file
- structure, which is described in .
+ oid2name> is a utility program that helps administrators to
+ examine the file structure used by PostgreSQL. To make use of it, you need
+ to be familiar with the database file structure, which is described in
+ .
-
+
+
+
+ The name oid2name> is historical, and is actually rather
+ misleading, since most of the time when you use it, you will really
+ be concerned with tables' filenode numbers (which are the file names
+ visible in the database directories). Be sure you understand the
+ difference between table OIDs and table filenodes!
+
+
+
Overview
+
- oid2name connects to the database and extracts OID,
- filenode, and table name information. You can also have it show database
- OIDs and tablespace OIDs.
+ oid2name connects to a target database and
+ extracts OID, filenode, and/or table name information. You can also have
+ it show database OIDs or tablespace OIDs. The program is controlled by
+ a large number of command-line switches, as shown in
+ .
-
- When displaying specific tables, you can select which tables to show by
- using -o, -f and -t. The first switch takes an OID, the second takes
- a filenode, and the third takes a tablename (actually, it's a LIKE
- pattern, so you can use things like "foo%"). Note that you can use as many
- of these switches as you like, and the listing will include all objects
- matched by any of the switches. Also note that these switches can only
- show objects in the database given in -d.
-
-
- If you don't give any of -o, -f or -t it will dump all the tables in the
- database given in -d. If you don't give -d, it will show a database
- listing. Alternatively you can give -s to get a tablespace listing.
-
-
- Additional switches
+
+
+ oid2name> switches
+
+
+ Switch
+ Description
+
+
+
- -i
- include indexes and sequences in the database listing.
+ -o oid>
+ show info for table with OID oid>
+
- -x
- display more information about each object shown: tablespace name,
- schema name, OID.
-
+ -f filenode>
+ show info for table with filenode filenode>
+
+
+ -t tablename_pattern>
+ show info for table(s) matching tablename_pattern>
+
+
+
+ -s
+ show tablespace OIDs
+
+
-S
- also show system objects (those in information_schema, pg_toast
- and pg_catalog schemas)
+ include system objects (those in
+ information_schema, pg_toast
+ and pg_catalog schemas)
+
+
+ -i
+ include indexes and sequences in the listing
+
+
+
+ -x
+ display more information about each object shown: tablespace name,
+ schema name, and OID
+
+
+
-q
- don't display headers(useful for scripting)
+ omit headers (useful for scripting)
+
+
+
+ -d database>
+ database to connect to
+
+
+
+ -H host>
+ database server's host
+
+
+
+ -p port>
+ database server's port
+
+
+
+ -U username>
+ username to connect as
+
+
+
+ -P password>
+ password (deprecated — putting this on the command line
+ is a security hazard)
+
+
+ To display specific tables, select which tables to show by
+ using -o>, -f> and/or -t>.
+ -o> takes an OID,
+ -f> takes a filenode,
+ and -t> takes a tablename (actually, it's a LIKE
+ pattern, so you can use things like foo%>).
+ You can use as many
+ of these switches as you like, and the listing will include all objects
+ matched by any of the switches. But note that these switches can only
+ show objects in the database given by -d>.
+
+
+
+ If you don't give any of -o>, -f> or -t>,
+ but do give -d>, it will list all tables in the database
+ named by -d>. In this mode, the -S> and
+ -i> switches control what gets listed.
+
+
+
+ If you don't give -d> either, it will show a listing of database
+ OIDs. Alternatively you can give -s> to get a tablespace
+ listing.
+
-
+
Examples
-
+
+$ # what's in this database server, anyway?
$ oid2name
All databases:
Oid Database Name Tablespace
@@ -83,7 +170,8 @@ All tablespaces:
155151 fastdisk
155152 bigdisk
-$ cd $PGDATA/17228
+$ # OK, let's look into database alvherre
+$ cd $PGDATA/base/17228
$ # get top 10 db objects in the default tablespace, ordered by size
$ ls -lS * | head -10
@@ -98,6 +186,7 @@ $ ls -lS * | head -10
-rw------- 1 alvherre alvherre 163840 sep 14 09:50 16699
-rw------- 1 alvherre alvherre 122880 sep 6 17:51 16751
+$ # I wonder what file 155173 is ...
$ oid2name -d alvherre -f 155173
From database "alvherre":
Filenode Table Name
@@ -112,7 +201,7 @@ From database "alvherre":
155173 accounts
1155291 accounts_pkey
-$ # you can also mix the options, and have more details
+$ # you can mix the options, and get more details with -x
$ oid2name -d alvherre -t accounts -f 1155291 -x
From database "alvherre":
Filenode Table Name Oid Schema Tablespace
@@ -157,7 +246,7 @@ $ ls -d 155151/*
155151/17228/ 155151/PG_VERSION
$ # Oh, what was database 17228 again?
-$ oid2name
+$ oid2name
All databases:
Oid Database Name Tablespace
----------------------------------
@@ -178,28 +267,25 @@ From database "alvherre":
Filenode Table Name
----------------------
155156 foo
-
-$ # end of sample session.
+
+
+
+ Limitations
- You can also get approximate size data for each object using psql. For
- example,
-
-
- SELECT relpages, relfilenode, relname FROM pg_class ORDER BY relpages DESC;
-
-
- Each page is typically 8k. Relpages is updated by VACUUM.
+ oid2name> requires a running database server with
+ non-corrupt system catalogs. It is therefore of only limited use
+ for recovering from catastrophic database corruption situations.
-
+
Author
+
- b. palmer, bpalmer@crimelabs.net
+ B. Palmer bpalmer@crimelabs.net
-
diff --git a/doc/src/sgml/pageinspect.sgml b/doc/src/sgml/pageinspect.sgml
index c7ebe1f730e..e398733d013 100644
--- a/doc/src/sgml/pageinspect.sgml
+++ b/doc/src/sgml/pageinspect.sgml
@@ -1,124 +1,170 @@
+
pageinspect
-
+
pageinspect
- The functions in this module allow you to inspect the contents of data pages
- at a low level, for debugging purposes.
+ The pageinspect> module provides functions that allow you to
+ inspect the contents of database pages at a low level, which is useful for
+ debugging purposes. All of these functions may be used only by superusers.
- Functions included
+ Functions
-
-
-
- get_raw_page reads one block of the named table and returns a copy as a
- bytea field. This allows a single time-consistent copy of the block to be
- made. Use of this functions is restricted to superusers.
-
-
+
+
+
+ get_raw_page(text, int) returns bytea
+
-
-
- page_header shows fields which are common to all PostgreSQL heap and index
- pages. Use of this function is restricted to superusers.
-
-
- A page image obtained with get_raw_page should be passed as argument:
-
-
-regression=# SELECT * FROM page_header(get_raw_page('pg_class',0));
- lsn | tli | flags | lower | upper | special | pagesize | version | prune_xid
- -----------+-----+-------+-------+-------+---------+----------+---------+-----------
- 0/24A1B50 | 1 | 1 | 232 | 368 | 8192 | 8192 | 4 | 0
-
-
- The returned columns correspond to the fields in the PageHeaderData struct.
- See src/include/storage/bufpage.h for more details.
-
-
+
+
+ get_raw_page reads the specified block of the named
+ table and returns a copy as a bytea> value. This allows a
+ single time-consistent copy of the block to be obtained.
+
+
+
-
-
- heap_page_items shows all line pointers on a heap page. For those line
- pointers that are in use, tuple headers are also shown. All tuples are
- shown, whether or not the tuples were visible to an MVCC snapshot at the
- time the raw page was copied. Use of this function is restricted to
- superusers.
-
-
- A heap page image obtained with get_raw_page should be passed as argument:
-
-
- test=# SELECT * FROM heap_page_items(get_raw_page('pg_class',0));
-
-
- See src/include/storage/itemid.h and src/include/access/htup.h for
- explanations of the fields returned.
-
-
+
+
+ page_header(bytea) returns record
+
-
-
- bt_metap() returns information about the btree index metapage:
-
-
- test=> SELECT * FROM bt_metap('pg_cast_oid_index');
- -[ RECORD 1 ]-----
- magic | 340322
- version | 2
- root | 1
- level | 0
- fastroot | 1
- fastlevel | 0
-
-
+
+
+ page_header shows fields that are common to all
+ PostgreSQL> heap and index pages.
+
-
-
- bt_page_stats() shows information about single btree pages:
-
-
- test=> SELECT * FROM bt_page_stats('pg_cast_oid_index', 1);
- -[ RECORD 1 ]-+-----
- blkno | 1
- type | l
- live_items | 256
- dead_items | 0
- avg_item_size | 12
- page_size | 8192
- free_size | 4056
- btpo_prev | 0
- btpo_next | 0
- btpo | 0
- btpo_flags | 3
-
-
+
+ A page image obtained with get_raw_page should be
+ passed as argument. For example:
+
+
+test=# SELECT * FROM page_header(get_raw_page('pg_class', 0));
+ lsn | tli | flags | lower | upper | special | pagesize | version | prune_xid
+-----------+-----+-------+-------+-------+---------+----------+---------+-----------
+ 0/24A1B50 | 1 | 1 | 232 | 368 | 8192 | 8192 | 4 | 0
+
-
-
- bt_page_items() returns information about specific items on btree pages:
+
+ The returned columns correspond to the fields in the
+ PageHeaderData> struct.
+ See src/include/storage/bufpage.h> for details.
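The columns shown by page_header map onto the fixed-size fields at the front of every page. As a rough illustration of that mapping (the field layout below follows the PageHeaderData struct referenced above, but the exact layout and the little-endian byte order are assumptions for this sketch, not a documented on-disk ABI):

```python
import struct

# Sketch of decoding a PageHeaderData prefix from a raw page image.
# Assumed layout: pd_lsn (two uint32), then pd_tli, pd_flags, pd_lower,
# pd_upper, pd_special, pd_pagesize_version (uint16 each), then
# pd_prune_xid (uint32) -- 24 bytes, little-endian.
def page_header(page):
    (xlogid, xrecoff, tli, flags, lower, upper,
     special, size_ver, prune_xid) = struct.unpack_from("<IIHHHHHHI", page)
    return {
        "lsn": "%X/%X" % (xlogid, xrecoff),
        "tli": tli, "flags": flags,
        "lower": lower, "upper": upper, "special": special,
        "pagesize": size_ver & 0xFF00,   # size and version share a field
        "version": size_ver & 0x00FF,
        "prune_xid": prune_xid,
    }

# Synthetic header bytes matching the example output above:
raw = struct.pack("<IIHHHHHHI", 0, 0x24A1B50, 1, 1, 232, 368,
                  8192, 8192 | 4, 0)
h = page_header(raw)
assert h["lsn"] == "0/24A1B50"
assert (h["lower"], h["upper"], h["pagesize"], h["version"]) == (232, 368, 8192, 4)
```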
-
- test=> SELECT * FROM bt_page_items('pg_cast_oid_index', 1);
- itemoffset | ctid | itemlen | nulls | vars | data
- ------------+---------+---------+-------+------+-------------
- 1 | (0,1) | 12 | f | f | 23 27 00 00
- 2 | (0,2) | 12 | f | f | 24 27 00 00
- 3 | (0,3) | 12 | f | f | 25 27 00 00
- 4 | (0,4) | 12 | f | f | 26 27 00 00
- 5 | (0,5) | 12 | f | f | 27 27 00 00
- 6 | (0,6) | 12 | f | f | 28 27 00 00
- 7 | (0,7) | 12 | f | f | 29 27 00 00
- 8 | (0,8) | 12 | f | f | 2a 27 00 00
-
-
-
+
+
+
+
+
+ heap_page_items(bytea) returns setof record
+
+
+
+
+ heap_page_items shows all line pointers on a heap
+ page. For those line pointers that are in use, tuple headers are also
+ shown. All tuples are shown, whether or not the tuples were visible to
+ an MVCC snapshot at the time the raw page was copied.
+
+
+ A heap page image obtained with get_raw_page should
+ be passed as argument. For example:
+
+
+test=# SELECT * FROM heap_page_items(get_raw_page('pg_class', 0));
+
+
+ See src/include/storage/itemid.h> and
+ src/include/access/htup.h> for explanations of the fields
+ returned.
+
+
+
+
+
+
+ bt_metap(text) returns record
+
+
+
+
+ bt_metap returns information about a btree
+ index's metapage. For example:
+
+
+test=# SELECT * FROM bt_metap('pg_cast_oid_index');
+-[ RECORD 1 ]-----
+magic | 340322
+version | 2
+root | 1
+level | 0
+fastroot | 1
+fastlevel | 0
+
+
+
+
+
+
+ bt_page_stats(text, int) returns record
+
+
+
+
+ bt_page_stats returns summary information about
+ single pages of btree indexes. For example:
+
+
+test=# SELECT * FROM bt_page_stats('pg_cast_oid_index', 1);
+-[ RECORD 1 ]-+-----
+blkno | 1
+type | l
+live_items | 256
+dead_items | 0
+avg_item_size | 12
+page_size | 8192
+free_size | 4056
+btpo_prev | 0
+btpo_next | 0
+btpo | 0
+btpo_flags | 3
+
+
+
+
+
+
+ bt_page_items(text, int) returns setof record
+
+
+
+
+ bt_page_items returns detailed information about
+ all of the items on a btree index page. For example:
+
+
+test=# SELECT * FROM bt_page_items('pg_cast_oid_index', 1);
+ itemoffset | ctid | itemlen | nulls | vars | data
+------------+---------+---------+-------+------+-------------
+ 1 | (0,1) | 12 | f | f | 23 27 00 00
+ 2 | (0,2) | 12 | f | f | 24 27 00 00
+ 3 | (0,3) | 12 | f | f | 25 27 00 00
+ 4 | (0,4) | 12 | f | f | 26 27 00 00
+ 5 | (0,5) | 12 | f | f | 27 27 00 00
+ 6 | (0,6) | 12 | f | f | 28 27 00 00
+ 7 | (0,7) | 12 | f | f | 29 27 00 00
+ 8 | (0,8) | 12 | f | f | 2a 27 00 00
+
+
+
+
-
+
diff --git a/doc/src/sgml/pgbench.sgml b/doc/src/sgml/pgbench.sgml
index 0e9dcfab442..be089f8836d 100644
--- a/doc/src/sgml/pgbench.sgml
+++ b/doc/src/sgml/pgbench.sgml
@@ -1,436 +1,544 @@
+
pgbench
-
+
pgbench
- pgbench is a simple program to run a benchmark test.
- pgbench is a client application of PostgreSQL and runs
- with PostgreSQL only. It performs lots of small and simple transactions
- including SELECT/UPDATE/INSERT operations then calculates number of
- transactions successfully completed within a second (transactions
- per second, tps). Targeting data includes a table with at least 100k
- tuples.
-
-
- Example outputs from pgbench look like:
-
-
-number of clients: 4
-number of transactions per client: 100
-number of processed transactions: 400/400
-tps = 19.875015(including connections establishing)
-tps = 20.098827(excluding connections establishing)
-
- Similar program called "JDBCBench" already exists, but it requires
- Java that may not be available on every platform. Moreover some
- people concerned about the overhead of Java that might lead
- inaccurate results. So I decided to write in pure C, and named
- it "pgbench."
+ pgbench is a simple program for running benchmark
+ tests on PostgreSQL>. It runs the same sequence of SQL
+ commands over and over, possibly in multiple concurrent database sessions,
+ and then calculates the average transaction rate (transactions per second).
+ By default, pgbench tests a scenario that is
+ loosely based on TPC-B, involving five SELECT>,
+ UPDATE>, and INSERT> commands per transaction.
+ However, it is easy to test other cases by writing your own transaction
+ script files.
- Features of pgbench:
+ Typical output from pgbench looks like:
+
+
+transaction type: TPC-B (sort of)
+scaling factor: 10
+number of clients: 10
+number of transactions per client: 1000
+number of transactions actually processed: 10000/10000
+tps = 85.184871 (including connections establishing)
+tps = 85.296346 (excluding connections establishing)
+
+
+ The first four lines just report some of the most important parameter
+ settings. The next line reports the number of transactions completed
+ and intended (the latter being just the product of number of clients
+ and number of transactions); these will be equal unless the run
+ failed before completion. The last two lines report the TPS rate,
+ figured with and without counting the time to start database sessions.
-
-
-
- pgbench is written in C using libpq only. So it is very portable
- and easy to install.
-
-
-
-
- pgbench can simulate concurrent connections using asynchronous
- capability of libpq. No threading is required.
-
-
-
Overview
-
-
- (optional)Initialize database by:
-
-pgbench -i <dbname>
-
-
- where <dbname> is the name of database. pgbench uses four tables
- accounts, branches, history and tellers. These tables will be
- destroyed. Be very careful if you have tables having same
- names. Default test data contains:
-
-
-table # of tuples
+
+
+ The default TPC-B-like transaction test requires specific tables to be
+ set up beforehand. pgbench> should be invoked with
+ the -i> (initialize) option to create and populate these
+ tables. (When you are testing a custom script, you don't need this
+ step, but will instead need to do whatever setup your test needs.)
+ Initialization looks like:
+
+
+pgbench -i other-options> dbname>
+
+
+ where dbname> is the name of the already-created
+ database to test in. (You may also need -h>,
+ -p>, and/or -U> options to specify how to
+ connect to the database server.)
+
+
+
+
+ pgbench -i> creates four tables accounts>,
+ branches>, history>, and
+ tellers>, destroying any existing tables of these names.
+ Be very careful to use another database if you have existing tables
+ with these names!
+
+
+
+
+ At the default scale factor> of 1, the tables initially
+ contain this many rows:
+
+
+table # of rows
-------------------------
branches 1
tellers 10
accounts 100000
history 0
-
-
- You can increase the number of tuples by using -s option. branches,
- tellers and accounts tables are created with a fillfactor which is
- set using -F option. See below.
-
-
-
- Run the benchmark test
-
-pgbench <dbname>
-
-
- The default configuration is:
-
-
- number of clients: 1
- number of transactions per client: 10
-
-
-
+
+
+ You can (and, for most purposes, probably should) increase the number
+ of rows by using the -s> (scale factor) option. The
+ -F> (fillfactor) option might also be used at this point.
+
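The scaling rule can be sketched as follows (a hypothetical helper, not part of pgbench; it simply encodes the row counts from the table above, each multiplied by the scale factor):

```python
# Initial row counts created by "pgbench -i" at a given scale factor,
# per the table above (branches=1, tellers=10, accounts=100000 at scale 1).
def pgbench_rows(scale):
    return {
        "branches": scale,
        "tellers": 10 * scale,
        "accounts": 100000 * scale,
        "history": 0,  # history starts empty
    }

# At -s 100, the accounts table holds 10,000,000 rows.
print(pgbench_rows(100)["accounts"])  # prints: 10000000
```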
-
- pgbench options
+
+ Once you have done the necessary setup, you can run your benchmark
+ with a command that doesn't include -i>, that is
+
+
+pgbench options> dbname>
+
+
+ In nearly all cases, you'll need some options to make a useful test.
+ The most important options are -c> (number of clients),
+ -t> (number of transactions), and -f> (specify
+ a custom script file). See below for a full list.
+
+
+
+ shows options that are used
+ during database initialization, while
+ shows options that are used
+ while running benchmarks, and
+ shows options that are useful
+ in both cases.
+
+
+
+ pgbench initialization options
- Parameter
+ Option
Description
+
- -h hostname
+ -i
-
- hostname where the backend is running. If this option
- is omitted, pgbench will connect to the localhost via
- Unix domain socket.
-
+ Required to invoke initialization mode.
- -p port
+ -s scale_factor>
-
- the port number that the backend is accepting. default is
- libpq's default, usually 5432.
-
+ Multiply the number of rows generated by the scale factor.
+ For example, -s 100> will imply 10,000,000 rows
+ in the accounts> table. Default is 1.
- -c number_of_clients
+ -F fillfactor>
-
- Number of clients simulated. default is 1.
-
+ Create the accounts>, tellers> and
+ branches> tables with the given fillfactor.
+ Default is 100.
+
+
+
+
+
+
+
+ pgbench benchmarking options
+
+
+
+ Option
+ Description
+
+
+
+
+
+ -c clients>
+
+ Number of clients simulated, that is, number of concurrent database
+ sessions. Default is 1.
- -t number_of_transactions
+ -t transactions>
-
- Number of transactions each client runs. default is 10.
-
-
-
-
- -s scaling_factor
-
-
- this should be used with -i (initialize) option.
- number of tuples generated will be multiple of the
- scaling factor. For example, -s 100 will imply 10M
- (10,000,000) tuples in the accounts table.
- default is 1.
-
-
- NOTE: scaling factor should be at least
- as large as the largest number of clients you intend
- to test; else you'll mostly be measuring update contention.
- Regular (not initializing) runs using one of the
- built-in tests will detect scale based on the number of
- branches in the database. For custom (-f) runs it can
- be manually specified with this parameter.
-
-
-
-
- -D varname=value
-
-
- Define a variable. It can be refered to by a script
- provided by using -f option. Multiple -D options are allowed.
-
-
-
-
- -U login
-
-
- Specify db user's login name if it is different from
- the Unix login name.
-
-
-
-
- -P password
-
-
- Specify the db password. CAUTION: using this option
- might be a security hole since ps command will
- show the password. Use this for TESTING PURPOSE ONLY.
-
-
-
-
- -n
-
-
- No vacuuming and cleaning the history table prior to the
- test is performed.
-
-
-
-
- -v
-
-
- Do vacuuming before testing. This will take some time.
- With neither -n nor -v, pgbench will vacuum tellers and
- branches tables only.
-
-
-
-
- -S
-
-
- Perform select only transactions instead of TPC-B.
-
+ Number of transactions each client runs. Default is 10.
-N
-
- Do not update "branches" and "tellers". This will
- avoid heavy update contention on branches and tellers,
- while it will not make pgbench supporting TPC-B like
- transactions.
-
+ Do not update tellers> and branches>.
+ This will avoid update contention on these tables, but
+ it makes the test case even less like TPC-B.
- -f filename
+ -S
-
- Read transaction script from file. Detailed
- explanation will appear later.
-
+ Perform select-only transactions instead of TPC-B-like test.
+
+
+
+ -f filename>
+
+ Read transaction script from filename>.
+ See below for details.
+ -N, -S, and -f
+ are mutually exclusive.
+
+
+
+ -n
+
+ No vacuuming is performed before running the test.
+ This option is necessary>
+ if you are running a custom test scenario that does not include
+ the standard tables accounts>,
+ branches>, history>, and
+ tellers>.
+
+
+
+ -v
+
+ Vacuum all four standard tables before running the test.
+ With neither -n> nor -v>, pgbench will vacuum
+ tellers> and branches> tables, and
+ will remove all entries in history>.
+
+
+
+ -D varname>=>value>
+
+ Define a variable for use by a custom script (see below).
+ Multiple -D> options are allowed.
-C
-
- Establish connection for each transaction, rather than
- doing it just once at beginning of pgbench in the normal
- mode. This is useful to measure the connection overhead.
-
+ Establish a new connection for each transaction, rather than
+ doing it just once per client session.
+ This is useful to measure the connection overhead.
-l
-
- Write the time taken by each transaction to a logfile,
- with the name "pgbench_log.xxx", where xxx is the PID
- of the pgbench process. The format of the log is:
-
-
- client_id transaction_no time file_no time-epoch time-us
-
-
- where time is measured in microseconds, , the file_no is
- which test file was used (useful when multiple were
- specified with -f), and time-epoch/time-us are a
- UNIX epoch format timestamp followed by an offset
- in microseconds (suitable for creating a ISO 8601
- timestamp with a fraction of a second) of when
- the transaction completed.
-
-
- Here are example outputs:
-
-
- 0 199 2241 0 1175850568 995598
- 0 200 2465 0 1175850568 998079
- 0 201 2513 0 1175850569 608
- 0 202 2038 0 1175850569 2663
-
+ Write the time taken by each transaction to a logfile.
+ See below for details.
- -F fillfactor
+ -s scale_factor>
-
- Create tables(accounts, tellers and branches) with the given
- fillfactor. Default is 100. This should be used with -i
- (initialize) option.
-
+ Report the specified scale factor in pgbench>'s
+ output. With the built-in tests, this is not necessary; the
+ correct scale factor will be detected by counting the number of
+ rows in the branches> table. However, when testing
+ custom benchmarks (-f> option), the scale factor
+ will be reported as 1 unless this option is used.
-d
-
- debug option.
-
+ Print debugging output.
+
+
+ pgbench common options
+
+
+
+ Option
+ Description
+
+
+
+
+
+ -h hostname>
+ database server's host
+
+
+ -p port>
+ database server's port
+
+
+ -U login>
+ username to connect as
+
+
+ -P password>
+ password (deprecated — putting this on the command line
+ is a security hazard)
+
+
+
+
- What is the "transaction" actually performed in pgbench?
+ What is the transaction> actually performed in pgbench?
+
+
+ The default transaction script issues seven commands per transaction:
+
+
- begin;
-
- update accounts set abalance = abalance + :delta where aid = :aid;
-
- select abalance from accounts where aid = :aid;
-
- update tellers set tbalance = tbalance + :delta where tid = :tid;
-
- update branches set bbalance = bbalance + :delta where bid = :bid;
-
- insert into history(tid,bid,aid,delta) values(:tid,:bid,:aid,:delta);
-
- end;
+ BEGIN;
+ UPDATE accounts SET abalance = abalance + :delta WHERE aid = :aid;
+ SELECT abalance FROM accounts WHERE aid = :aid;
+ UPDATE tellers SET tbalance = tbalance + :delta WHERE tid = :tid;
+ UPDATE branches SET bbalance = bbalance + :delta WHERE bid = :bid;
+ INSERT INTO history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);
+ END;
+
- If you specify -N, (4) and (5) aren't included in the transaction.
+ If you specify -N>, steps 4 and 5 aren't included in the
+ transaction. If you specify -S>, only the SELECT> is
+ issued.
- Script file
+ Custom Scripts
+
- pgbench has support for reading a transaction script
- from a specified file (-f option). This file should
- include SQL commands in each line. SQL command consists of multiple lines
- are not supported. Empty lines and lines begging with "--" will be ignored.
-
-
- Multiple -f options are allowed. In this case each
- transaction is assigned randomly chosen script.
-
-
- SQL commands can include "meta command" which begins with "\" (back
- slash). A meta command takes some arguments separted by white
- spaces. Currently following meta command is supported:
+ pgbench has support for running custom
+ benchmark scenarios by replacing the default transaction script
+ (described above) with a transaction script read from a file
+ (-f option). In this case a transaction>
+ counts as one execution of a script file. You can even specify
+ multiple scripts (multiple -f options), in which
+ case a random one of the scripts is chosen each time a client session
+ starts a new transaction.
-
-
-
- \set name operand1 [ operator operand2 ]
- - Sets the calculated value using "operand1" "operator"
- "operand2" to variable "name". If "operator" and "operand2"
- are omitted, the value of operand1 is set to variable "name".
-
-
- Example:
-
-
+
+ The format of a script file is one SQL command per line; multi-line
+ SQL commands are not supported. Empty lines and lines beginning with
+ --> are ignored. Script file lines can also be
+ meta commands>, which are interpreted by pgbench>
+ itself, as described below.
+
+
+
+ There is a simple variable-substitution facility for script files.
+ Variables can be set by the command-line -D> option,
+ explained above, or by the meta commands explained below.
+ In addition to any variables preset by -D> command-line options,
+ the variable scale> is preset to the current scale factor.
+ Once set, a variable's
+ value can be inserted into a SQL command by writing
+ :>variablename>. When running more than
+ one client session, each session has its own set of variables.
+
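For instance, this line from the built-in TPC-B-like script (shown in full later in this section) substitutes the session's current value of the variable aid into the command:

```sql
-- :aid is replaced by the current value of the variable "aid"
SELECT abalance FROM accounts WHERE aid = :aid;
```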
+
+
+ Script file meta commands begin with a backslash (\>).
+ Arguments to a meta command are separated by white space.
+ These meta commands are supported:
+
+
+
+
+
+ \set varname> operand1> [ operator> operand2> ]
+
+
+
+
+ Sets variable varname> to a calculated integer value.
+ Each operand> is either an integer constant or a
+ :>variablename> reference to a variable
+ having an integer value. The operator> can be
+ +>, ->, *>, or />.
+
+
+
+ Example:
+
\set ntellers 10 * :scale
-
-
-
-
- \setrandom name min max
- - Assigns random integer to name between min and max
-
-
- Example:
-
-
-\setrandom aid 1 100000
-
-
-
-
- Variables can be referred to in SQL comands by adding ":" in front
- of the varible name.
-
-
- Example:
-
-
-SELECT abalance FROM accounts WHERE aid = :aid
-
-
- Variables can also be defined by using -D option.
-
-
-
-
- \sleep num [us|ms|s]> - Causes script execution to sleep for the
- specified duration of microseconds (us), milliseconds (ms) or the default
- seconds (s).
-
-
- Example:
-
-
-\setrandom millisec 1000 2500
-\sleep : millisec ms
-
-
-
-
+
+
+
+
+
+
+
+ \setrandom varname> min> max>
+
+
+
+
+ Sets variable varname> to a random integer value
+ between the limits min> and max>.
+ Each limit can be either an integer constant or a
+ :>variablename> reference to a variable
+ having an integer value.
+
+
+
+ Example:
+
+\setrandom aid 1 :naccounts
+
+
+
+
+
+
+
+ \sleep number> [ us | ms | s ]
+
+
+
+
+ Causes script execution to sleep for the specified duration in
+ microseconds (us>), milliseconds (ms>) or seconds
+ (s>). If the unit is omitted then seconds are the default.
+ number> can be either an integer constant or a
+ :>variablename> reference to a variable
+ having an integer value.
+
+
+
+ Example:
+
+\sleep 10 ms
+
+
+
+
+
-
- Examples
- Example, TPC-B like benchmark can be defined as follows(scaling
- factor = 1):
-
-
+ As an example, the full definition of the built-in TPC-B-like
+ transaction is:
+
+
\set nbranches :scale
\set ntellers 10 * :scale
\set naccounts 100000 * :scale
\setrandom aid 1 :naccounts
\setrandom bid 1 :nbranches
\setrandom tid 1 :ntellers
-\setrandom delta 1 10000
-BEGIN
-UPDATE accounts SET abalance = abalance + :delta WHERE aid = :aid
-SELECT abalance FROM accounts WHERE aid = :aid
-UPDATE tellers SET tbalance = tbalance + :delta WHERE tid = :tid
-UPDATE branches SET bbalance = bbalance + :delta WHERE bid = :bid
-INSERT INTO history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, 'now')
-END
-
-
- If you want to automatically set the scaling factor from the number of
- tuples in branches table, use -s option and shell command like this:
+\setrandom delta -5000 5000
+BEGIN;
+UPDATE accounts SET abalance = abalance + :delta WHERE aid = :aid;
+SELECT abalance FROM accounts WHERE aid = :aid;
+UPDATE tellers SET tbalance = tbalance + :delta WHERE tid = :tid;
+UPDATE branches SET bbalance = bbalance + :delta WHERE bid = :bid;
+INSERT INTO history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);
+END;
+
+
+ This script allows each iteration of the transaction to reference
+ different, randomly-chosen rows. (This example also shows why it's
+ important for each client session to have its own variables —
+ otherwise they'd not be independently touching different rows.)
-
-pgbench -s $(psql -At -c "SELECT count(*) FROM branches") -f tpc_b.sql
-
+
+
+
+
+ Per-transaction logging
+
- Notice that -f option does not execute vacuum and clearing history
- table before starting benchmark.
+ With the -l> option, pgbench> writes the time
+ taken by each transaction to a logfile. The logfile will be named
+ pgbench_log.nnn>, where
+ nnn> is the PID of the pgbench process.
+ The format of the log is:
+
+
+ client_id> transaction_no> time> file_no> time_epoch> time_us>
+
+
+ where time> is the elapsed transaction time in microseconds,
+ file_no> identifies which script file was used
+ (useful when multiple scripts were specified with -f>),
+ and time_epoch>/time_us> are a
+ UNIX epoch format timestamp and an offset
+ in microseconds (suitable for creating an ISO 8601
+ timestamp with fractional seconds) showing when
+ the transaction completed.
+
+
+
+ Here are example outputs:
+
+ 0 199 2241 0 1175850568 995598
+ 0 200 2465 0 1175850568 998079
+ 0 201 2513 0 1175850569 608
+ 0 202 2038 0 1175850569 2663
+
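Because each log line is just six space-separated integers, post-processing is straightforward. Here is a minimal sketch (the parse_pgbench_log helper and field names are illustrative, not part of pgbench):

```python
# Parse "pgbench -l" log lines of the form:
#   client_id transaction_no time file_no time_epoch time_us
# where "time" is the elapsed transaction time in microseconds.
def parse_pgbench_log(lines):
    fields = ("client_id", "transaction_no", "time",
              "file_no", "time_epoch", "time_us")
    return [dict(zip(fields, map(int, line.split())))
            for line in lines if line.strip()]

log = """\
0 199 2241 0 1175850568 995598
0 200 2465 0 1175850568 998079
0 201 2513 0 1175850569 608
0 202 2038 0 1175850569 2663
"""

records = parse_pgbench_log(log.splitlines())
avg_us = sum(r["time"] for r in records) / len(records)
print(len(records), avg_us)  # prints: 4 2314.25
```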
+
+
+
+
+ Good Practices
+
+
+ It is very easy to use pgbench> to produce completely
+ meaningless numbers. Here are some guidelines to help you get useful
+ results.
+
+
+
+ In the first place, never> believe any test that runs
+ for only a few seconds. Increase the -t> setting enough
+ to make the run last at least a few minutes, so as to average out noise.
+ In some cases hours of run time might be needed to get reproducible numbers.
+ It's a good idea to repeat the test run a few times, to find out whether
+ your numbers are reproducible.
+
+
+
+ For the default TPC-B-like test scenario, the initialization scale factor
+ (-s>) should be at least as large as the largest number of
+ clients you intend to test (-c>); else you'll mostly be
+ measuring update contention. There are only -s> rows in
+ the branches> table, and every transaction wants to
+ update one of them, so -c> values in excess of -s>
+ will undoubtedly result in lots of transactions blocked waiting for
+ other transactions.
+
+
+
+ The default test scenario is also quite sensitive to how long it's been
+ since the tables were initialized: accumulation of dead rows and dead space
+ in the tables changes the results. To understand the results you must keep
+ track of the total number of updates and when vacuuming happens. If
+ autovacuum is enabled it can result in unpredictable changes in measured
+ performance.
+
+
+
+ A limitation of pgbench> is that it can itself become
+ the bottleneck when trying to test a large number of client sessions.
+ This can be alleviated by running pgbench> on a different
+ machine from the database server, although low network latency will be
+ essential. It might even be useful to run several pgbench>
+ instances concurrently, on several client machines, against the same
+ database server.
-
diff --git a/doc/src/sgml/pgbuffercache.sgml b/doc/src/sgml/pgbuffercache.sgml
index e4ecc8f78ce..f2507df124a 100644
--- a/doc/src/sgml/pgbuffercache.sgml
+++ b/doc/src/sgml/pgbuffercache.sgml
@@ -1,85 +1,138 @@
+
+
pg_buffercache
-
+
pg_buffercache
- The pg_buffercache module provides a means for examining
- what's happening to the buffercache at any given time without having to
- restart or rebuild the server with debugging code added. The intent is to
- do for the buffercache what pg_locks does for locks.
+ The pg_buffercache module provides a means for
+ examining what's happening in the shared buffer cache in real time.
+
- This module consists of a C function pg_buffercache_pages()
- that returns a set of records, plus a view pg_buffercache
- to wrapper the function.
+ The module provides a C function pg_buffercache_pages
+ that returns a set of records, plus a view
+ pg_buffercache that wraps the function for
+ convenient use.
+
- By default public access is REVOKED from both of these, just in case there
+ By default public access is revoked from both of these, just in case there
are security issues lurking.
- Notes
+ The pg_buffercache view
+
- The definition of the columns exposed in the view is:
+ The definitions of the columns exposed by the view are:
-
- Column | references | Description
- ----------------+----------------------+------------------------------------
- bufferid | | Id, 1..shared_buffers.
- relfilenode | pg_class.relfilenode | Refilenode of the relation.
- reltablespace | pg_tablespace.oid | Tablespace oid of the relation.
- reldatabase | pg_database.oid | Database for the relation.
- relblocknumber | | Offset of the page in the relation.
- isdirty | | Is the page dirty?
- usagecount | | Page LRU count
-
+
+
+ pg_buffercache> Columns
+
+
+
+
+ Name
+ Type
+ References
+ Description
+
+
+
+
+
+ bufferid
+ integer
+
+ ID, in the range 1..shared_buffers>
+
+
+
+ relfilenode
+ oid
+ pg_class.relfilenode
+ Relfilenode of the relation
+
+
+
+ reltablespace
+ oid
+ pg_tablespace.oid
+ Tablespace OID of the relation
+
+
+
+ reldatabase
+ oid
+ pg_database.oid
+ Database OID of the relation
+
+
+
+ relblocknumber
+ bigint
+
+ Page number within the relation
+
+
+
+ isdirty
+ boolean
+
+ Is the page dirty?
+
+
+
+ usagecount
+ smallint
+
+ Page LRU count
+
+
+
+
+
+
There is one row for each buffer in the shared cache. Unused buffers are
- shown with all fields null except bufferid.
+ shown with all fields null except bufferid>. Shared system
+ catalogs are shown as belonging to database zero.
+
- Because the cache is shared by all the databases, there are pages from
- relations not belonging to the current database.
+ Because the cache is shared by all the databases, there will normally be
+ pages from relations not belonging to the current database. This means
+ that there may not be matching join rows in pg_class> for
+ some rows, or that there could even be incorrect joins. If you are
+ trying to join against pg_class>, it's a good idea to
+ restrict the join to rows having reldatabase> equal to
+ the current database's OID or zero.
+
- When the pg_buffercache view is accessed, internal buffer manager locks are
- taken, and a copy of the buffer cache data is made for the view to display.
- This ensures that the view produces a consistent set of results, while not
- blocking normal buffer activity longer than necessary. Nonetheless there
+ When the pg_buffercache> view is accessed, internal buffer
+ manager locks are taken for long enough to copy all the buffer state
+ data that the view will display.
+ This ensures that the view produces a consistent set of results, while not
+ blocking normal buffer activity longer than necessary. Nonetheless there
could be some impact on database performance if this view is read often.
Sample output
+
- regression=# \d pg_buffercache;
- View "public.pg_buffercache"
- Column | Type | Modifiers
- ----------------+----------+-----------
- bufferid | integer |
- relfilenode | oid |
- reltablespace | oid |
- reldatabase | oid |
- relblocknumber | bigint |
- isdirty | boolean |
- usagecount | smallint |
-
- View definition:
- SELECT p.bufferid, p.relfilenode, p.reltablespace, p.reldatabase,
- p.relblocknumber, p.isdirty, p.usagecount
- FROM pg_buffercache_pages() p(bufferid integer, relfilenode oid,
- reltablespace oid, reldatabase oid, relblocknumber bigint,
- isdirty boolean, usagecount smallint);
-
regression=# SELECT c.relname, count(*) AS buffers
- FROM pg_class c INNER JOIN pg_buffercache b
- ON b.relfilenode = c.relfilenode INNER JOIN pg_database d
- ON (b.reldatabase = d.oid AND d.datname = current_database())
+ FROM pg_buffercache b INNER JOIN pg_class c
+ ON b.relfilenode = c.relfilenode AND
+ b.reldatabase IN (0, (SELECT oid FROM pg_database
+ WHERE datname = current_database()))
GROUP BY c.relname
ORDER BY 2 DESC LIMIT 10;
relname | buffers
@@ -95,26 +148,23 @@
pg_depend | 22
pg_depend_reference_index | 20
(10 rows)
-
- regression=#
Authors
-
-
-
- Mark Kirkwood markir@paradise.net.nz
-
-
-
- Design suggestions: Neil Conway neilc@samurai.com
-
-
- Debugging advice: Tom Lane tgl@sss.pgh.pa.us
-
-
+
+
+ Mark Kirkwood markir@paradise.net.nz
+
+
+
+ Design suggestions: Neil Conway neilc@samurai.com
+
+
+
+ Debugging advice: Tom Lane tgl@sss.pgh.pa.us
+
diff --git a/doc/src/sgml/pgfreespacemap.sgml b/doc/src/sgml/pgfreespacemap.sgml
index 66748c3ef4b..bc821ead6ba 100644
--- a/doc/src/sgml/pgfreespacemap.sgml
+++ b/doc/src/sgml/pgfreespacemap.sgml
@@ -1,184 +1,203 @@
+
+
pg_freespacemap
-
+
pg_freespacemap
- This module provides a means for examining the free space map (FSM). It
- consists of two C functions: pg_freespacemap_relations()
- and pg_freespacemap_pages() that return a set
- of records, plus two views pg_freespacemap_relations and
- pg_freespacemap_pages for more user-friendly access to
- the functions.
+ The pg_freespacemap> module provides a means for examining the
+ free space map (FSM). It provides two C functions:
+ pg_freespacemap_relations and
+ pg_freespacemap_pages that each return a set of
+ records, plus two views pg_freespacemap_relations
+ and pg_freespacemap_pages that wrap the functions
+ for convenient use.
+
- The module provides the ability to examine the contents of the free space
- map, without having to restart or rebuild the server with additional
- debugging code.
-
-
- By default public access is REVOKED from the functions and views, just in
- case there are security issues present in the code.
+ By default public access is revoked from the functions and views, just in
+ case there are security issues lurking.
- Notes
+ The pg_freespacemap> views
+
- The definitions for the columns exposed in the views are:
+ The definitions of the columns exposed by the views are:
- pg_freespacemap_relations
-
+ pg_freespacemap_relations> Columns
+
+
- Column
- references
+ Name
+ Type
+ References
Description
+
- reltablespace
- pg_tablespace.oid
- Tablespace oid of the relation.
+ reltablespace
+ oid
+ pg_tablespace.oid
+ Tablespace OID of the relation
- reldatabase
- pg_database.oid
- Database oid of the relation.
+ reldatabase
+ oid
+ pg_database.oid
+ Database OID of the relation
- relfilenode
- pg_class.relfilenode
- Relfilenode of the relation.
+ relfilenode
+ oid
+ pg_class.relfilenode
+ Relfilenode of the relation
- avgrequest
+ avgrequest
+ integer
Moving average of free space requests (NULL for indexes)
- interestingpages
+ interestingpages
+ integer
- Count of pages last reported as containing useful free space.
+ Count of pages last reported as containing useful free space
- storedpages
+ storedpages
+ integer
- Count of pages actually stored in free space map.
+ Count of pages actually stored in free space map
- nextpage
+ nextpage
+ integer
- Page index (from 0) to start next search at.
+ Page index (from 0) to start next search at
+
- pg_freespacemap_pages
-
+ pg_freespacemap_pages> Columns
+
+
- Column
- references
+ Name
+ Type
+ References
Description
+
- reltablespace
- pg_tablespace.oid
- Tablespace oid of the relation.
+ reltablespace
+ oid
+ pg_tablespace.oid
+ Tablespace OID of the relation
- reldatabase
- pg_database.oid
- Database oid of the relation.
+ reldatabase
+ oid
+ pg_database.oid
+ Database OID of the relation
- relfilenode
- pg_class.relfilenode
- Relfilenode of the relation.
+ relfilenode
+ oid
+ pg_class.relfilenode
+ Relfilenode of the relation
- relblocknumber
+ relblocknumber
+ bigint
- Page number in the relation.
+ Page number within the relation
- bytes
+ bytes
+ integer
- Free bytes in the page, or NULL for an index page (see below).
+ Free bytes in the page, or NULL for an index page (see below)
+
- For pg_freespacemap_relations, there is one row for each
- relation in the free space map. storedpages is the
- number of pages actually stored in the map, while
- interestingpages is the number of pages the last VACUUM
- thought had useful amounts of free space.
+ For pg_freespacemap_relations, there is one row
+ for each relation in the free space map.
+ storedpages is the number of pages actually
+ stored in the map, while interestingpages is the
+ number of pages the last VACUUM> thought had useful amounts of
+ free space.
+
- If storedpages is consistently less than interestingpages
- then it'd be a good idea to increase max_fsm_pages. Also,
- if the number of rows in pg_freespacemap_relations is
- close to max_fsm_relations, then you should consider
- increasing max_fsm_relations.
+ If storedpages is consistently less than
+ interestingpages> then it'd be a good idea to increase
+ max_fsm_pages. Also, if the number of rows in
+ pg_freespacemap_relations is close to
+ max_fsm_relations, then you should consider increasing
+ max_fsm_relations.
+
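As a hedged illustration of this advice (a hypothetical monitoring query, not part of the module), the relations whose free-space pages are being truncated out of the map can be listed like this:

```sql
-- Hypothetical check: relations with more "interesting" pages than stored
-- pages, which suggests max_fsm_pages is too small.
SELECT relfilenode, interestingpages, storedpages
FROM pg_freespacemap_relations
WHERE storedpages < interestingpages;
```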
- For pg_freespacemap_pages, there is one row for each page
- in the free space map. The number of rows for a relation will match the
- storedpages column in
- pg_freespacemap_relations.
+ For pg_freespacemap_pages, there is one row for
+ each page in the free space map. The number of rows for a relation will
+ match the storedpages column in
+ pg_freespacemap_relations.
+
For indexes, what is tracked is entirely-unused pages, rather than free
space within pages. Therefore, the average request size and free bytes
within a page are not meaningful, and are shown as NULL.
+
- Because the map is shared by all the databases, it will include relations
- not belonging to the current database.
+ Because the map is shared by all the databases, there will normally be
+ entries for relations not belonging to the current database. This means
+ that there may not be matching join rows in pg_class> for
+ some rows, or that there could even be incorrect joins. If you are
+ trying to join against pg_class>, it's a good idea to
+ restrict the join to rows having reldatabase> equal to
+ the current database's OID or zero.
+
- When either of the views are accessed, internal free space map locks are
- taken, and a copy of the map data is made for them to display.
- This ensures that the views produce a consistent set of results, while not
- blocking normal activity longer than necessary. Nonetheless there
+ When either of the views is accessed, internal free space map locks are
+ taken for long enough to copy all the state data that the view will display.
+ This ensures that the views produce a consistent set of results, while not
+ blocking normal activity longer than necessary. Nonetheless there
could be some impact on database performance if they are read often.
- Sample output - pg_freespacemap_relations
-
-regression=# \d pg_freespacemap_relations
-View "public.pg_freespacemap_relations"
- Column | Type | Modifiers
-------------------+---------+-----------
- reltablespace | oid |
- reldatabase | oid |
- relfilenode | oid |
- avgrequest | integer |
- interestingpages | integer |
- storedpages | integer |
- nextpage | integer |
-View definition:
- SELECT p.reltablespace, p.reldatabase, p.relfilenode, p.avgrequest, p.interestingpages, p.storedpages, p.nextpage
- FROM pg_freespacemap_relations() p(reltablespace oid, reldatabase oid, relfilenode oid, avgrequest integer, interestingpages integer, storedpages integer, nextpage integer);
+ Sample output
+
regression=# SELECT c.relname, r.avgrequest, r.interestingpages, r.storedpages
FROM pg_freespacemap_relations r INNER JOIN pg_class c
- ON c.relfilenode = r.relfilenode INNER JOIN pg_database d
- ON r.reldatabase = d.oid AND (d.datname = current_database())
+ ON r.relfilenode = c.relfilenode AND
+ r.reldatabase IN (0, (SELECT oid FROM pg_database
+ WHERE datname = current_database()))
ORDER BY r.storedpages DESC LIMIT 10;
relname | avgrequest | interestingpages | storedpages
---------------------------------+------------+------------------+-------------
@@ -193,31 +212,14 @@ regression=# SELECT c.relname, r.avgrequest, r.interestingpages, r.storedpages
pg_class_relname_nsp_index | | 10 | 10
pg_proc | 302 | 8 | 8
(10 rows)
-
-
-
-
- Sample output - pg_freespacemap_pages
-
-regression=# \d pg_freespacemap_pages
- View "public.pg_freespacemap_pages"
- Column | Type | Modifiers
-----------------+---------+-----------
- reltablespace | oid |
- reldatabase | oid |
- relfilenode | oid |
- relblocknumber | bigint |
- bytes | integer |
-View definition:
- SELECT p.reltablespace, p.reldatabase, p.relfilenode, p.relblocknumber, p.bytes
- FROM pg_freespacemap_pages() p(reltablespace oid, reldatabase oid, relfilenode oid, relblocknumber bigint, bytes integer);
regression=# SELECT c.relname, p.relblocknumber, p.bytes
FROM pg_freespacemap_pages p INNER JOIN pg_class c
- ON c.relfilenode = p.relfilenode INNER JOIN pg_database d
- ON (p.reldatabase = d.oid AND d.datname = current_database())
+ ON p.relfilenode = c.relfilenode AND
+ p.reldatabase IN (0, (SELECT oid FROM pg_database
+ WHERE datname = current_database()))
ORDER BY c.relname LIMIT 10;
- relname | relblocknumber | bytes
+ relname | relblocknumber | bytes
--------------+----------------+-------
a_star | 0 | 8040
abstime_tbl | 0 | 7908
@@ -235,8 +237,10 @@ regression=# SELECT c.relname, p.relblocknumber, p.bytes
Author
+
Mark Kirkwood markir@paradise.net.nz
+
diff --git a/doc/src/sgml/pgrowlocks.sgml b/doc/src/sgml/pgrowlocks.sgml
index 140b33387f9..6fe56aa4df5 100644
--- a/doc/src/sgml/pgrowlocks.sgml
+++ b/doc/src/sgml/pgrowlocks.sgml
@@ -1,115 +1,122 @@
+
pgrowlocks
-
+
pgrowlocks
- The pgrowlocks module provides a function to show row
+ The pgrowlocks module provides a function to show row
locking information for a specified table.
Overview
-
-pgrowlocks(text) RETURNS pgrowlocks_type
-
+
+
+pgrowlocks(text) returns setof record
+
+
- The parameter is a name of table. And pgrowlocks_type is
- defined as:
+ The parameter is the name of a table. The result is a set of records,
+ with one row for each locked row within the table. The output columns
+ are:
-
-CREATE TYPE pgrowlocks_type AS (
- locked_row TID, -- row TID
- lock_type TEXT, -- lock type
- locker XID, -- locking XID
- multi bool, -- multi XID?
- xids xid[], -- multi XIDs
- pids INTEGER[] -- locker's process id
-);
-
- pgrowlocks_type
-
+ pgrowlocks> output columns
+
+
+
+
+ Name
+ Type
+ Description
+
+
+
- locked_row
- tuple ID(TID) of each locked rows
+ locked_row
+ tid
+ Tuple ID (TID) of locked row
- lock_type
- "Shared" for shared lock, "Exclusive" for exclusive lock
+ lock_type
+ text
+ Shared> for shared lock, or
+ Exclusive> for exclusive lock
- locker
- transaction ID of locker (Note 1)
+ locker
+ xid
+ Transaction ID of locker, or multixact ID if multi-transaction
- multi
- "t" if locker is a multi transaction, otherwise "f"
+ multi
+ boolean
+ True if locker is a multi-transaction
- xids
- XIDs of lockers (Note 2)
+ xids
+ xid[]
+ Transaction IDs of lockers (more than one if multi-transaction)
- pids
- process ids of locking backends
+ pids
+ integer[]
+ Process IDs of locking backends (more than one if multi-transaction)
+
-
- Note1: If the locker is multi transaction, it represents the multi ID.
-
-
- Note2: If the locker is multi, multiple data are shown.
-
- The calling sequence for pgrowlocks is as follows:
- pgrowlocks grabs AccessShareLock for the target table and
- reads each row one by one to get the row locking information. You should
- notice that:
+ pgrowlocks takes AccessShareLock> for the
+ target table and reads each row one by one to collect the row locking
+ information. This is not very speedy for a large table. Note that:
+
- if the table is exclusive locked by someone else,
- pgrowlocks will be blocked.
+ If the table as a whole is exclusive-locked by someone else,
+ pgrowlocks will be blocked.
- pgrowlocks may show incorrect information if there's a
- new lock or a lock is freeed while its execution.
+ pgrowlocks is not guaranteed to produce a
+ self-consistent snapshot. It is possible that a new row lock is taken,
+ or an old lock is freed, during its execution.
+
- pgrowlocks does not show the contents of locked rows. If
- you want to take a look at the row contents at the same time, you could do
- something like this:
-
+ pgrowlocks does not show the contents of locked
+ rows. If you want to take a look at the row contents at the same time, you
+ could do something like this:
+
-SELECT * FROM accounts AS a, pgrowlocks('accounts') AS p WHERE p.locked_ row = a.ctid;
+SELECT * FROM accounts AS a, pgrowlocks('accounts') AS p
+ WHERE p.locked_row = a.ctid;
+
+ Be aware however that (as of PostgreSQL> 8.3) such a
+ query will be very inefficient.
+
- Example
-
- pgrowlocks returns the following columns:
-
-
- Here is a sample execution of pgrowlocks:
-
+ Sample output
+
test=# SELECT * FROM pgrowlocks('t1');
- locked_row | lock_type | locker | multi | xids | pids
+ locked_row | lock_type | locker | multi | xids | pids
------------+-----------+--------+-------+-----------+---------------
(0,1) | Shared | 19 | t | {804,805} | {29066,29068}
(0,2) | Shared | 19 | t | {804,805} | {29066,29068}
@@ -117,7 +124,14 @@ test=# SELECT * FROM pgrowlocks('t1');
(0,4) | Exclusive | 804 | f | {804} | {29066}
(4 rows)
-
-
+
+ Author
+
+
+ Tatsuo Ishii
+
+
+
+
diff --git a/doc/src/sgml/pgstattuple.sgml b/doc/src/sgml/pgstattuple.sgml
index eaa3a547035..ad52dcd4420 100644
--- a/doc/src/sgml/pgstattuple.sgml
+++ b/doc/src/sgml/pgstattuple.sgml
@@ -1,29 +1,35 @@
+
pgstattuple
-
+
pgstattuple
- pgstattuple modules provides various functions to obtain
- tuple statistics.
+ The pgstattuple module provides various functions to
+ obtain tuple-level statistics.
Functions
-
-
-
- pgstattuple() returns the relation length, percentage
- of the "dead" tuples of a relation and other info. This may help users to
- determine whether vacuum is necessary or not. Here is an example session:
-
-
-test=> \x
-Expanded display is on.
+
+
+
+ pgstattuple(text) returns record>
+
+
+
+
+ pgstattuple returns a relation's physical length,
+ percentage of dead> tuples, and other info. This may help users
+ to determine whether vacuum is necessary or not. The argument is the
+ target relation's name (optionally schema-qualified).
+ For example:
+
+
test=> SELECT * FROM pgstattuple('pg_catalog.pg_proc');
-[ RECORD 1 ]------+-------
table_len | 458752
@@ -35,86 +41,111 @@ dead_tuple_len | 3157
dead_tuple_percent | 0.69
free_space | 8932
free_percent | 1.95
-
+
+
- Here are explanations for each column:
+ The output columns are:
-
+
- pgstattuple() column descriptions
-
+ pgstattuple output columns
+
Column
+ Type
Description
+
- table_len
- physical relation length in bytes
+ table_len
+ bigint
+ Physical relation length in bytes
- tuple_count
- number of live tuples
+ tuple_count
+ bigint
+ Number of live tuples
- tuple_len
- total tuples length in bytes
+ tuple_len
+ bigint
+ Total length of live tuples in bytes
- tuple_percent
- live tuples in %
+ tuple_percent
+ float8
+ Percentage of live tuples
- dead_tuple_len
- total dead tuples length in bytes
+ dead_tuple_count
+ bigint
+ Number of dead tuples
- dead_tuple_percent
- dead tuples in %
+ dead_tuple_len
+ bigint
+ Total length of dead tuples in bytes
- free_space
- free space in bytes
+ dead_tuple_percent
+ float8
+ Percentage of dead tuples
- free_percent
- free space in %
+ free_space
+ bigint
+ Total free space in bytes
+
+ free_percent
+ float8
+ Percentage of free space
+
+
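 The percentage columns are simple ratios against table_len.
 As a quick sanity check, the dead-tuple and free-space percentages in the
 sample output above can be recomputed from the byte counts (an illustrative
 Python sketch, not part of the module):

```python
# Recompute pgstattuple's percentage columns from the byte counts
# shown in the sample output above.
table_len = 458752       # physical relation length in bytes
dead_tuple_len = 3157    # total length of dead tuples in bytes
free_space = 8932        # total free space in bytes

dead_tuple_percent = round(100.0 * dead_tuple_len / table_len, 2)
free_percent = round(100.0 * free_space / table_len, 2)

print(dead_tuple_percent)  # 0.69, matching the sample output
print(free_percent)        # 1.95, matching the sample output
```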
-
-
-
- pgstattuple acquires only a read lock on the relation. So
- concurrent update may affect the result.
-
-
-
-
- pgstattuple judges a tuple is "dead" if HeapTupleSatisfiesNow()
- returns false.
-
-
-
-
-
-
- pg_relpages() returns the number of pages in the relation.
+ pgstattuple acquires only a read lock on the
+ relation. So the results do not reflect an instantaneous snapshot;
+ concurrent updates will affect them.
-
-
- pgstatindex() returns an array showing the information about an index:
+ pgstattuple judges a tuple is dead> if
+ HeapTupleSatisfiesNow> returns false.
-
-test=> \x
-Expanded display is on.
+
+
+
+
+
+ pgstattuple(oid) returns record>
+
+
+
+
+ This is the same as pgstattuple(text), except
+ that the target relation is specified by OID.
+
+
+
+
+
+
+ pgstatindex(text) returns record>
+
+
+
+
+ pgstatindex returns a record showing information
+ about a btree index. For example:
+
+
test=> SELECT * FROM pgstatindex('pg_cast_oid_index');
-[ RECORD 1 ]------+------
version | 2
@@ -128,31 +159,116 @@ deleted_pages | 0
avg_leaf_density | 50.27
leaf_fragmentation | 0
-
-
+
+
+ The output columns are:
+
+
+
+ pgstatindex output columns
+
+
+
+ Column
+ Type
+ Description
+
+
+
+
+
+ version
+ integer
+ Btree version number
+
+
+
+ tree_level
+ integer
+ Tree level of the root page
+
+
+
+ index_size
+ integer
+ Total number of pages in index
+
+
+
+ root_block_no
+ integer
+ Location of root block
+
+
+
+ internal_pages
+ integer
+ Number of internal> (upper-level) pages
+
+
+
+ leaf_pages
+ integer
+ Number of leaf pages
+
+
+
+ empty_pages
+ integer
+ Number of empty pages
+
+
+
+ deleted_pages
+ integer
+ Number of deleted pages
+
+
+
+ avg_leaf_density
+ float8
+ Average density of leaf pages
+
+
+
+ leaf_fragmentation
+ float8
+ Leaf page fragmentation
+
+
+
+
+
+
+
+ As with pgstattuple>, the results are accumulated
+ page-by-page, and should not be expected to represent an
+ instantaneous snapshot of the whole index.
+
+
+
+
+
+
+ pg_relpages(text) returns integer>
+
+
+
+
+ pg_relpages returns the number of pages in the
+ relation.
+
+
+
+
- Usage
-
- pgstattuple may be called as a relation function and is
- defined as follows:
-
-
- CREATE OR REPLACE FUNCTION pgstattuple(text) RETURNS pgstattuple_type
- AS 'MODULE_PATHNAME', 'pgstattuple'
- LANGUAGE C STRICT;
+ Author
- CREATE OR REPLACE FUNCTION pgstattuple(oid) RETURNS pgstattuple_type
- AS 'MODULE_PATHNAME', 'pgstattuplebyid'
- LANGUAGE C STRICT;
-
- The argument is the relation name (optionally it may be qualified)
- or the OID of the relation. Note that pgstattuple only returns
- one row.
+ Tatsuo Ishii
-
diff --git a/doc/src/sgml/pgtrgm.sgml b/doc/src/sgml/pgtrgm.sgml
index b7061481a91..5b630f203dd 100644
--- a/doc/src/sgml/pgtrgm.sgml
+++ b/doc/src/sgml/pgtrgm.sgml
@@ -1,90 +1,120 @@
+
+
pg_trgm
-
+
pg_trgm
- The pg_trgm module provides functions and index classes
- for determining the similarity of text based on trigram matching.
+ The pg_trgm module provides functions and operators
+ for determining the similarity of text based on trigram matching, as
+ well as index operator classes that support fast searching for similar
+ strings.
- Trigram (or Trigraph)
+ Trigram (or Trigraph) Concepts
+
- A trigram is a set of three consecutive characters taken
- from a string. A string is considered to have two spaces
- prefixed and one space suffixed when determining the set
- of trigrams that comprise the string.
-
-
- eg. The set of trigrams in the word "cat" is " c", " ca",
- "at " and "cat".
+ A trigram is a group of three consecutive characters taken
+ from a string. We can measure the similarity of two strings by
+ counting the number of trigrams they share. This simple idea
+ turns out to be very effective for measuring the similarity of
+ words in many natural languages.
+
+
+
+ A string is considered to have two spaces
+ prefixed and one space suffixed when determining the set
+ of trigrams contained in the string.
+      For example, the set of trigrams in the string
+      cat is "  c", " ca", "cat", and "at ".
+
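 The padding rule can be stated compactly in code. The following Python
 sketch illustrates the concept only (it is not the module's C
 implementation): it extracts trigrams using the two-space prefix and
 one-space suffix, and measures similarity as the fraction of shared
 trigrams, which is the general idea behind similarity(text, text):

```python
def trigrams(s):
    """Set of trigrams of s, padded with two leading spaces
    and one trailing space, as described above."""
    padded = "  " + s + " "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}

def similarity(a, b):
    """Shared trigrams divided by total distinct trigrams
    (a Jaccard-style ratio, for illustration; pg_trgm's exact
    handling of multi-word strings differs)."""
    ta, tb = trigrams(a), trigrams(b)
    return len(ta & tb) / len(ta | tb)

print(sorted(trigrams("cat")))     # ['  c', ' ca', 'at ', 'cat']
print(similarity("word", "word"))  # 1.0 for identical strings
```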
+
- Public Functions
-
- pg_trgm functions
-
+ Functions and Operators
+
+
+ pg_trgm functions
+
Function
+ Returns
Description
+
- real similarity(text, text)
+ similarity(text, text)
+ real
-
- Returns a number that indicates how closely matches the two
- arguments are. A zero result indicates that the two words
- are completely dissimilar, and a result of one indicates that
- the two words are identical.
-
+ Returns a number that indicates how similar the two arguments are.
+ The range of the result is zero (indicating that the two strings are
+ completely dissimilar) to one (indicating that the two strings are
+ identical).
- real show_limit()
+ show_trgm(text)
+ text[]
-
- Returns the current similarity threshold used by the '%'
- operator. This in effect sets the minimum similarity between
- two words in order that they be considered similar enough to
- be misspellings of each other, for example.
-
+ Returns an array of all the trigrams in the given string.
+ (In practice this is seldom useful except for debugging.)
- real set_limit(real)
+ show_limit()
+ real
-
- Sets the current similarity threshold that is used by the '%'
- operator, and is returned by the show_limit() function.
-
+ Returns the current similarity threshold used by the %>
+ operator. This sets the minimum similarity between
+ two words for them to be considered similar enough to
+ be misspellings of each other, for example.
- text[] show_trgm(text)
+ set_limit(real)
+ real
-
- Returns an array of all the trigrams of the supplied text
- parameter.
-
+ Sets the current similarity threshold that is used by the %>
+ operator. The threshold must be between 0 and 1 (default is 0.3).
+ Returns the same value passed in.
+
+
+
+
+
+ pg_trgm operators
+
+
- Operator: text % text (returns boolean)
+ Operator
+ Returns
+ Description
+
+
+
+
+
+ text> % text>
+ boolean
-
- The '%' operator returns TRUE if its two arguments have a similarity
- that is greater than the similarity threshold set by set_limit(). It
- will return FALSE if the similarity is less than the current
- threshold.
-
+ Returns true> if its arguments have a similarity that is
+ greater than the current similarity threshold set by
+ set_limit>.
@@ -93,97 +123,111 @@
- Public Index Operator Class
+ Index Support
+
- The pg_trgm module comes with the
- gist_trgm_ops index operator class that allows a
- developer to create an index over a text column for the purpose
- of very fast similarity searches.
+ The pg_trgm module provides GiST and GIN index
+ operator classes that allow you to create an index over a text column for
+ the purpose of very fast similarity searches. These index types support
+ the %> similarity operator (and no other operators, so you may
+ want a regular btree index too).
+
- To use this index, the '%' operator must be used and an appropriate
- similarity threshold for the application must be set. Example:
-
+ Example:
+
CREATE TABLE test_trgm (t text);
CREATE INDEX trgm_idx ON test_trgm USING gist (t gist_trgm_ops);
+or
+
+CREATE INDEX trgm_idx ON test_trgm USING gin (t gin_trgm_ops);
+
+
+
- At this point, you will have an index on the t text column that you
- can use for similarity searching. Example:
+ At this point, you will have an index on the t> column that
+ you can use for similarity searching. A typical query is
-SELECT
- t,
- similarity(t, 'word') AS sml
-FROM
- test_trgm
-WHERE
- t % 'word'
-ORDER BY
- sml DESC, t;
+SELECT t, similarity(t, 'word>') AS sml
+ FROM test_trgm
+ WHERE t % 'word>'
+ ORDER BY sml DESC, t;
This will return all values in the text column that are sufficiently
- similar to 'word', sorted from best match to worst. The index will
- be used to make this a fast operation over very large data sets.
+ similar to word>, sorted from best match to worst. The
+ index will be used to make this a fast operation even over very large data
+ sets.
+
+
+
+ The choice between GiST and GIN indexing depends on the relative
+ performance characteristics of GiST and GIN, which are discussed elsewhere.
+ As a rule of thumb, a GIN index is faster to search than a GiST index, but
+ slower to build or update; so GIN is better suited for static data and GiST
+ for often-updated data.
Text Search Integration
+
Trigram matching is a very useful tool when used in conjunction
- with a full text index.
+ with a full text index. In particular it can help to recognize
+ misspelled input words that will not be matched directly by the
+ full text search mechanism.
+
The first step is to generate an auxiliary table containing all
the unique words in the documents:
+
-CREATE TABLE words AS SELECT word FROM
- stat('SELECT to_tsvector(''simple'', bodytext) FROM documents');
+CREATE TABLE words AS SELECT word FROM
+ ts_stat('SELECT to_tsvector(''simple'', bodytext) FROM documents');
+
where documents> is a table that has a text field
- bodytext> that we wish to search. The use of the
- simple> configuration with the to_tsvector>
- function, instead of just using the already
- existing vector is to avoid creating a list of already stemmed
- words. This way, only the original, unstemmed words are added
- to the word list.
+ bodytext> that we wish to search. The reason for using
+ the simple> configuration with the to_tsvector>
+ function, instead of using a language-specific configuration,
+ is that we want a list of the original (unstemmed) words.
+
Next, create a trigram index on the word column:
+
-CREATE INDEX words_idx ON words USING gist(word gist_trgm_ops);
+CREATE INDEX words_idx ON words USING gin(word gin_trgm_ops);
+
- or
-
-
-CREATE INDEX words_idx ON words USING gin(word gist_trgm_ops);
-
-
- Now, a SELECT query similar to the example above can be
- used to suggest spellings for misspelled words in user search terms. A
- useful extra clause is to ensure that the similar words are also
- of similar length to the misspelled word.
-
-
-
-
- Since the words> table has been generated as a separate,
- static table, it will need to be periodically regenerated so that
- it remains up to date with the document collection.
-
-
+ Now, a SELECT query similar to the previous example can
+ be used to suggest spellings for misspelled words in user search terms.
+ A useful extra test is to require that the selected words are also of
+ similar length to the misspelled word.
+
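 The filtering logic of such a suggestion query can be mimicked outside
 the database. This hypothetical Python sketch (the word list and the
 suggest helper are made up for illustration; 0.3 mirrors the default
 similarity threshold, and the similarity ratio only approximates
 pg_trgm's measure) combines the threshold with the extra length test:

```python
def trigrams(s):
    # Two-space prefix and one-space suffix, per the trigram definition.
    padded = "  " + s + " "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}

def similarity(a, b):
    # Jaccard-style ratio of shared trigrams (an approximation of
    # pg_trgm's similarity, for illustration only).
    ta, tb = trigrams(a.lower()), trigrams(b.lower())
    return len(ta & tb) / len(ta | tb)

def suggest(misspelled, words, threshold=0.3, max_len_diff=2):
    # Keep words that clear the similarity threshold and are of
    # similar length, best match first -- like the SELECT above
    # with an added length restriction.
    candidates = [
        (similarity(misspelled, w), w)
        for w in words
        if abs(len(w) - len(misspelled)) <= max_len_diff
    ]
    return [w for sml, w in sorted(candidates, reverse=True) if sml > threshold]

# Hypothetical word list standing in for the "words" table.
print(suggest("wordl", ["word", "words", "world", "sword", "road", "x"]))
# -> ['word', 'words', 'world']
```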
+
+
+ Since the words> table has been generated as a separate,
+ static table, it will need to be periodically regenerated so that
+ it remains reasonably up-to-date with the document collection.
+ Keeping it exactly current is usually unnecessary.
+
+
References
+
GiST Development Site
@@ -196,6 +240,7 @@ CREATE INDEX words_idx ON words USING gin(word gist_trgm_ops);
Authors
+
Oleg Bartunov oleg@sai.msu.su, Moscow, Moscow University, Russia
@@ -203,7 +248,7 @@ CREATE INDEX words_idx ON words USING gin(word gist_trgm_ops);
Teodor Sigaev teodor@sigaev.ru, Moscow, Delta-Soft Ltd.,Russia
- Documentation: Christopher Kings-Lynne
+ Documentation: Christopher Kings-Lynne
This module is sponsored by Delta-Soft Ltd., Moscow, Russia.