Functions and Operators

PostgreSQL provides a large number of
functions and operators for the built-in data types. Users can also
define their own functions and operators, as described in
. The
psql commands \df and
\do can be used to list all
available functions and operators, respectively.
If you are concerned about portability then note that most of
the functions and operators described in this chapter, with the
exception of the most trivial arithmetic and comparison operators
and some explicitly marked functions, are not specified by the
SQL standard. Some of this extended functionality
is present in other SQL database management
systems, and in many cases this functionality is compatible and
consistent between the various implementations. This chapter is also
not exhaustive; additional functions appear in relevant sections of
the manual.
Logical Operators

The usual logical operators are available: AND, OR, NOT
SQL uses a three-valued logic system with true,
false, and null, which represents unknown.
Observe the following truth tables:
a     | b     | a AND b | a OR b
------+-------+---------+-------
TRUE  | TRUE  | TRUE    | TRUE
TRUE  | FALSE | FALSE   | TRUE
TRUE  | NULL  | NULL    | TRUE
FALSE | FALSE | FALSE   | FALSE
FALSE | NULL  | FALSE   | NULL
NULL  | NULL  | NULL    | NULL

a     | NOT a
------+------
TRUE  | FALSE
FALSE | TRUE
NULL  | NULL
The operators AND and OR are
commutative, that is, you can switch the left and right operand
without affecting the result. But see for more information about the
order of evaluation of subexpressions.
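The truth tables above can be verified directly; a couple of quick queries (results shown as psql prints boolean values, t and f):

```sql
SELECT FALSE AND NULL;   -- f: FALSE AND anything is false
SELECT TRUE OR NULL;     -- t: TRUE OR anything is true
SELECT TRUE AND NULL;    -- null: the result is unknown, displayed as blank
```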
Comparison Functions and Operators
The usual comparison operators are available, as shown in .
Comparison Operators

Operator | Description
---------+--------------------------
<        | less than
>        | greater than
<=       | less than or equal to
>=       | greater than or equal to
=        | equal
<> or != | not equal
The != operator is converted to
<> in the parser stage. It is not
possible to implement != and
<> operators that do different things.
Comparison operators are available for all relevant data types.
All comparison operators are binary operators that
return values of type boolean; expressions like
1 < 2 < 3 are not valid (because there is
no < operator to compare a Boolean value with
3).
There are also some comparison predicates, as shown in . These behave much like
operators, but have special syntax mandated by the SQL standard.
Comparison Predicates

Predicate                         | Description
----------------------------------+--------------------------------------------------
a BETWEEN x AND y                 | between
a NOT BETWEEN x AND y             | not between
a BETWEEN SYMMETRIC x AND y       | between, after sorting the comparison values
a NOT BETWEEN SYMMETRIC x AND y   | not between, after sorting the comparison values
a IS DISTINCT FROM b              | not equal, treating null like an ordinary value
a IS NOT DISTINCT FROM b          | equal, treating null like an ordinary value
expression IS NULL                | is null
expression IS NOT NULL            | is not null
expression ISNULL                 | is null (nonstandard syntax)
expression NOTNULL                | is not null (nonstandard syntax)
boolean_expression IS TRUE        | is true
boolean_expression IS NOT TRUE    | is false or unknown
boolean_expression IS FALSE       | is false
boolean_expression IS NOT FALSE   | is true or unknown
boolean_expression IS UNKNOWN     | is unknown
boolean_expression IS NOT UNKNOWN | is true or false
The BETWEEN predicate simplifies range tests:
a BETWEEN x AND y
is equivalent to
a >= x AND a <= y
Notice that BETWEEN treats the endpoint values as included
in the range.
NOT BETWEEN does the opposite comparison:
a NOT BETWEEN x AND y
is equivalent to
a < x OR a > y

BETWEEN SYMMETRIC is like BETWEEN
except there is no requirement that the argument to the left of
AND be less than or equal to the argument on the right.
If it is not, those two arguments are automatically swapped, so that
a nonempty range is always implied.
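The swapped-endpoint behavior can be seen directly:

```sql
SELECT 2 BETWEEN 3 AND 1;            -- f: the range 3..1 is empty
SELECT 2 BETWEEN SYMMETRIC 3 AND 1;  -- t: the endpoints are swapped first
```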
Ordinary comparison operators yield null (signifying unknown),
not true or false, when either input is null. For example,
7 = NULL yields null, as does 7 <> NULL. When
this behavior is not suitable, use the
IS [NOT] DISTINCT FROM predicates:
a IS DISTINCT FROM b
a IS NOT DISTINCT FROM b
For non-null inputs, IS DISTINCT FROM is
the same as the <> operator. However, if both
inputs are null it returns false, and if only one input is
null it returns true. Similarly, IS NOT DISTINCT
FROM is identical to = for non-null
inputs, but it returns true when both inputs are null, and false when only
one input is null. Thus, these predicates effectively act as though null
were a normal data value, rather than unknown.
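For example:

```sql
SELECT 7 = NULL;                        -- null (unknown)
SELECT 7 IS DISTINCT FROM NULL;         -- t
SELECT NULL IS NOT DISTINCT FROM NULL;  -- t: nulls compare as equal here
```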
To check whether a value is or is not null, use the predicates:
expression IS NULL
expression IS NOT NULL
or the equivalent, but nonstandard, predicates:
expression ISNULL
expression NOTNULL
Do not write
expression = NULL
because NULL is not equal to NULL. (The null value represents an unknown value,
and it is not known whether two unknown values are equal.)
Some applications might expect that
expression = NULL
returns true if expression evaluates to
the null value. It is highly recommended that these applications
be modified to comply with the SQL standard. However, if that
cannot be done, the
configuration variable is available. If it is enabled,
PostgreSQL will convert x =
NULL clauses to x IS NULL.
If the expression is row-valued, then
IS NULL is true when the row expression itself is null
or when all the row's fields are null, while
IS NOT NULL is true when the row expression itself is non-null
and all the row's fields are non-null. Because of this behavior,
IS NULL and IS NOT NULL do not always return
inverse results for row-valued expressions; in particular, a row-valued
expression that contains both null and non-null fields will return false
for both tests. In some cases, it may be preferable to
write row IS DISTINCT FROM NULL
or row IS NOT DISTINCT FROM NULL,
which will simply check whether the overall row value is null without any
additional tests on the row fields.
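The difference is easy to demonstrate with a row containing one null and one non-null field:

```sql
SELECT ROW(1, NULL) IS NULL;                -- f: not all fields are null
SELECT ROW(1, NULL) IS NOT NULL;            -- f: not all fields are non-null
SELECT ROW(1, NULL) IS DISTINCT FROM NULL;  -- t: the row value itself is not null
```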
Boolean values can also be tested using the predicates
boolean_expression IS TRUE
boolean_expression IS NOT TRUE
boolean_expression IS FALSE
boolean_expression IS NOT FALSE
boolean_expression IS UNKNOWN
boolean_expression IS NOT UNKNOWN
These will always return true or false, never a null value, even when the
operand is null.
A null input is treated as the logical value unknown.
Notice that IS UNKNOWN and IS NOT UNKNOWN are
effectively the same as IS NULL and
IS NOT NULL, respectively, except that the input
expression must be of Boolean type.
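For example:

```sql
SELECT (NULL::boolean) IS UNKNOWN;  -- t
SELECT (1 = NULL) IS UNKNOWN;       -- t: the comparison yields unknown
```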
Some comparison-related functions are also available, as shown in .
Comparison Functions

num_nonnulls(VARIADIC "any") → int
    returns the number of non-null arguments.
    Example: num_nonnulls(1, NULL, 2) → 2
num_nulls(VARIADIC "any") → int
    returns the number of null arguments.
    Example: num_nulls(1, NULL, 2) → 1
Mathematical Functions and Operators
Mathematical operators are provided for many
PostgreSQL types. For types without
standard mathematical conventions
(e.g., date/time types) we
describe the actual behavior in subsequent sections.
shows the available mathematical operators.
The bitwise operators work only on integral data types, and are also
available for the bit
string types bit and bit varying, as
shown in .
shows the available
mathematical functions. In the table, dp
indicates double precision. Many of these functions
are provided in multiple forms with different argument types.
Except where noted, any given form of a function returns the same
data type as its argument.
The functions working with double precision data are mostly
implemented on top of the host system's C library; accuracy and behavior in
boundary cases can therefore vary depending on the host system.
Mathematical Functions

abs(x) → (same as input)
    absolute value. Example: abs(-17.4) → 17.4
cbrt(dp) → dp
    cube root. Example: cbrt(27.0) → 3
ceil(dp or numeric) → (same as input)
    nearest integer greater than or equal to argument. Example: ceil(-42.8) → -42
ceiling(dp or numeric) → (same as input)
    nearest integer greater than or equal to argument (same as ceil).
    Example: ceiling(-95.3) → -95
degrees(dp) → dp
    radians to degrees. Example: degrees(0.5) → 28.6478897565412
div(y numeric, x numeric) → numeric
    integer quotient of y/x. Example: div(9,4) → 2
exp(dp or numeric) → (same as input)
    exponential. Example: exp(1.0) → 2.71828182845905
floor(dp or numeric) → (same as input)
    nearest integer less than or equal to argument. Example: floor(-42.8) → -43
ln(dp or numeric) → (same as input)
    natural logarithm. Example: ln(2.0) → 0.693147180559945
log(dp or numeric) → (same as input)
    base 10 logarithm. Example: log(100.0) → 2
log10(dp or numeric) → (same as input)
    base 10 logarithm. Example: log10(100.0) → 2
log(b numeric, x numeric) → numeric
    logarithm to base b. Example: log(2.0, 64.0) → 6.0000000000
mod(y, x) → (same as argument types)
    remainder of y/x. Example: mod(9,4) → 1
pi() → dp
    π constant. Example: pi() → 3.14159265358979
power(a dp, b dp) → dp
    a raised to the power of b. Example: power(9.0, 3.0) → 729
power(a numeric, b numeric) → numeric
    a raised to the power of b. Example: power(9.0, 3.0) → 729
radians(dp) → dp
    degrees to radians. Example: radians(45.0) → 0.785398163397448
round(dp or numeric) → (same as input)
    round to nearest integer. Example: round(42.4) → 42
round(v numeric, s int) → numeric
    round to s decimal places. Example: round(42.4382, 2) → 42.44
scale(numeric) → integer
    scale of the argument (the number of decimal digits in the fractional
    part). Example: scale(8.41) → 2
sign(dp or numeric) → (same as input)
    sign of the argument (-1, 0, +1). Example: sign(-8.4) → -1
sqrt(dp or numeric) → (same as input)
    square root. Example: sqrt(2.0) → 1.4142135623731
trunc(dp or numeric) → (same as input)
    truncate toward zero. Example: trunc(42.8) → 42
trunc(v numeric, s int) → numeric
    truncate to s decimal places. Example: trunc(42.4382, 2) → 42.43
width_bucket(operand dp, b1 dp, b2 dp, count int) → int
    return the bucket number to which operand would be assigned in a histogram
    having count equal-width buckets spanning the range b1 to b2; returns
    0 or count+1 for an input outside the range.
    Example: width_bucket(5.35, 0.024, 10.06, 5) → 3
width_bucket(operand numeric, b1 numeric, b2 numeric, count int) → int
    same as above, for numeric arguments.
    Example: width_bucket(5.35, 0.024, 10.06, 5) → 3
width_bucket(operand anyelement, thresholds anyarray) → int
    return the bucket number to which operand would be assigned given an
    array listing the lower bounds of the buckets; returns 0 for an input
    less than the first lower bound; the thresholds array must be sorted,
    smallest first, or unexpected results will be obtained.
    Example: width_bucket(now(), array['yesterday', 'today', 'tomorrow']::timestamptz[]) → 2
shows functions for
generating random numbers.
Random Functions

random() → dp
    random value in the range 0.0 <= x < 1.0
setseed(dp) → void
    set seed for subsequent random() calls (value between -1.0 and 1.0,
    inclusive)
The random() function uses a simple linear
congruential algorithm. It is fast but not suitable for cryptographic
applications; see the module for a more
secure alternative.
If setseed() is called, the results of
subsequent random() calls in the current session are
repeatable by re-issuing setseed() with the same
argument.
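For example, re-seeding restarts the same sequence:

```sql
SELECT setseed(0.5);
SELECT random();     -- some value x
SELECT setseed(0.5);
SELECT random();     -- the same value x again
```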
shows the
available trigonometric functions. All these functions
take arguments and return values of type double
precision. Each of the trigonometric functions comes in
two variants, one that measures angles in radians and one that
measures angles in degrees.
Another way to work with angles measured in degrees is to use the unit
transformation functions radians()
and degrees() shown earlier.
However, using the degree-based trigonometric functions is preferred,
as that approach avoids round-off error for special cases such
as sind(30).
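The degree-based variant returns an exact result for such cases:

```sql
SELECT sind(30);
-- Result: 0.5 (exact, with no round-off error)
```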
shows the
available hyperbolic functions. All these functions
take arguments and return values of type double
precision.
String Functions and Operators
This section describes functions and operators for examining and
manipulating string values. Strings in this context include values
of the types character, character varying,
and text. Unless otherwise noted, all
of the functions listed below work on all of these types, but be
wary of potential effects of automatic space-padding when using the
character type. Some functions also exist
natively for the bit-string types.
SQL defines some string functions that use
key words, rather than commas, to separate
arguments. Details are in
.
PostgreSQL also provides versions of these functions
that use the regular function invocation syntax
(see ).
Before PostgreSQL 8.3, these functions would
silently accept values of several non-string data types as well, due to
the presence of implicit coercions from those data types to
text. Those coercions have been removed because they frequently
caused surprising behaviors. However, the string concatenation operator
(||) still accepts non-string input, so long as at least one
input is of a string type, as shown in . For other cases, insert an explicit
coercion to text if you need to duplicate the previous behavior.
SQL String Functions and Operators

string || string → text
    String concatenation.
    Example: 'Post' || 'greSQL' → PostgreSQL
string || non-string or non-string || string → text
    String concatenation with one non-string input.
    Example: 'Value: ' || 42 → Value: 42
bit_length(string) → int
    Number of bits in string. Example: bit_length('jose') → 32
char_length(string) or character_length(string) → int
    Number of characters in string. Example: char_length('jose') → 4
lower(string) → text
    Convert string to lower case. Example: lower('TOM') → tom
octet_length(string) → int
    Number of bytes in string. Example: octet_length('jose') → 4
overlay(string placing string from int [for int]) → text
    Replace substring.
    Example: overlay('Txxxxas' placing 'hom' from 2 for 4) → Thomas
position(substring in string) → int
    Location of specified substring. Example: position('om' in 'Thomas') → 3
substring(string [from int] [for int]) → text
    Extract substring. Example: substring('Thomas' from 2 for 3) → hom
substring(string from pattern) → text
    Extract substring matching POSIX regular expression. See for more
    information on pattern matching.
    Example: substring('Thomas' from '...$') → mas
substring(string from pattern for escape) → text
    Extract substring matching SQL regular expression. See for more
    information on pattern matching.
    Example: substring('Thomas' from '%#"o_a#"_' for '#') → oma
trim([leading | trailing | both] [characters] from string) → text
    Remove the longest string containing only characters from characters
    (a space by default) from the start, end, or both ends (both is the
    default) of string.
    Example: trim(both 'xyz' from 'yxTomxx') → Tom
trim([leading | trailing | both] [from] string [, characters]) → text
    Non-standard syntax for trim().
    Example: trim(both from 'yxTomxx', 'xyz') → Tom
upper(string) → text
    Convert string to upper case. Example: upper('tom') → TOM
Additional string manipulation functions are available and are
listed in . Some of them are used internally to implement the
SQL-standard string functions listed in .
Other String Functions

ascii(string) → int
    ASCII code of the first character of the argument. For UTF8 returns the
    Unicode code point of the character. For other multibyte encodings, the
    argument must be an ASCII character.
    Example: ascii('x') → 120
btrim(string text [, characters text]) → text
    Remove the longest string consisting only of characters in
    characters (a space by default) from the start and end of string.
    Example: btrim('xyxtrimyyx', 'xyz') → trim
chr(int) → text
    Character with the given code. For UTF8 the argument is treated as a
    Unicode code point. For other multibyte encodings the argument must
    designate an ASCII character. The NULL (0) character is not allowed
    because text data types cannot store such bytes.
    Example: chr(65) → A
concat(str "any" [, str "any" [, ...] ]) → text
    Concatenate the text representations of all the arguments. NULL
    arguments are ignored.
    Example: concat('abcde', 2, NULL, 22) → abcde222
concat_ws(sep text, str "any" [, str "any" [, ...] ]) → text
    Concatenate all but the first argument with separators. The first
    argument is used as the separator string. NULL arguments are ignored.
    Example: concat_ws(',', 'abcde', 2, NULL, 22) → abcde,2,22
convert(string bytea, src_encoding name, dest_encoding name) → bytea
    Convert string to dest_encoding. The original encoding is specified by
    src_encoding. The string must be valid in this encoding. Conversions
    can be defined by CREATE CONVERSION. Also there are some predefined
    conversions. See for available conversions.
    Example: convert('text_in_utf8', 'UTF8', 'LATIN1') → text_in_utf8
    represented in Latin-1 encoding (ISO 8859-1)
convert_from(string bytea, src_encoding name) → text
    Convert string to the database encoding. The original encoding is
    specified by src_encoding. The string must be valid in this encoding.
    Example: convert_from('text_in_utf8', 'UTF8') → text_in_utf8
    represented in the current database encoding
convert_to(string text, dest_encoding name) → bytea
    Convert string to dest_encoding.
    Example: convert_to('some text', 'UTF8') → some text represented in
    the UTF8 encoding
decode(string text, format text) → bytea
    Decode binary data from textual representation in string. Options for
    format are the same as in encode.
    Example: decode('MTIzAAE=', 'base64') → \x3132330001
encode(data bytea, format text) → text
    Encode binary data into a textual representation. Supported formats
    are: base64, hex, escape. escape converts zero bytes and high-bit-set
    bytes to octal sequences (\nnn) and doubles backslashes.
    Example: encode('123\000\001', 'base64') → MTIzAAE=
format(formatstr text [, formatarg "any" [, ...] ]) → text
    Format arguments according to a format string. This function is similar
    to the C function sprintf, and is described in more detail below.
    Example: format('Hello %s, %1$s', 'World') → Hello World, World
initcap(string) → text
    Convert the first letter of each word to upper case and the rest to
    lower case. Words are sequences of alphanumeric characters separated
    by non-alphanumeric characters.
    Example: initcap('hi THOMAS') → Hi Thomas
left(str text, n int) → text
    Return first n characters in the string. When n is negative, return
    all but last |n| characters.
    Example: left('abcde', 2) → ab
length(string) → int
    Number of characters in string. Example: length('jose') → 4
length(string bytea, encoding name) → int
    Number of characters in string in the given encoding. The string must
    be valid in this encoding.
    Example: length('jose', 'UTF8') → 4
lpad(string text, length int [, fill text]) → text
    Fill up the string to length length by prepending the characters fill
    (a space by default). If the string is already longer than length then
    it is truncated (on the right).
    Example: lpad('hi', 5, 'xy') → xyxhi
ltrim(string text [, characters text]) → text
    Remove the longest string containing only characters from characters
    (a space by default) from the start of string.
    Example: ltrim('zzzytest', 'xyz') → test
md5(string) → text
    Calculates the MD5 hash of string, returning the result in hexadecimal.
    Example: md5('abc') → 900150983cd24fb0d6963f7d28e17f72
parse_ident(qualified_identifier text [, strictmode boolean DEFAULT true]) → text[]
    Split qualified_identifier into an array of identifiers, removing any
    quoting of individual identifiers. By default, extra characters after
    the last identifier are considered an error; but if the second parameter
    is false, then such extra characters are ignored. (This behavior is
    useful for parsing names for objects like functions.) Note that this
    function does not truncate over-length identifiers. If you want
    truncation you can cast the result to name[].
    Example: parse_ident('"SomeSchema".someTable') → {SomeSchema,sometable}
pg_client_encoding() → name
    Current client encoding name.
    Example: pg_client_encoding() → SQL_ASCII
quote_ident(string text) → text
    Return the given string suitably quoted to be used as an identifier in
    an SQL statement string. Quotes are added only if necessary (i.e., if
    the string contains non-identifier characters or would be case-folded).
    Embedded quotes are properly doubled. See also .
    Example: quote_ident('Foo bar') → "Foo bar"
quote_literal(string text) → text
    Return the given string suitably quoted to be used as a string literal
    in an SQL statement string. Embedded single-quotes and backslashes are
    properly doubled. Note that quote_literal returns null on null input;
    if the argument might be null, quote_nullable is often more suitable.
    See also .
    Example: quote_literal(E'O\'Reilly') → 'O''Reilly'
quote_literal(value anyelement) → text
    Coerce the given value to text and then quote it as a literal.
    Embedded single-quotes and backslashes are properly doubled.
    Example: quote_literal(42.5) → '42.5'
quote_nullable(string text) → text
    Return the given string suitably quoted to be used as a string literal
    in an SQL statement string; or, if the argument is null, return NULL.
    Embedded single-quotes and backslashes are properly doubled.
    See also .
    Example: quote_nullable(NULL) → NULL
quote_nullable(value anyelement) → text
    Coerce the given value to text and then quote it as a literal; or, if
    the argument is null, return NULL. Embedded single-quotes and
    backslashes are properly doubled.
    Example: quote_nullable(42.5) → '42.5'
regexp_match(string text, pattern text [, flags text]) → text[]
    Return captured substring(s) resulting from the first match of a POSIX
    regular expression to the string. See for more information.
    Example: regexp_match('foobarbequebaz', '(bar)(beque)') → {bar,beque}
regexp_matches(string text, pattern text [, flags text]) → setof text[]
    Return captured substring(s) resulting from matching a POSIX regular
    expression to the string. See for more information.
    Example: regexp_matches('foobarbequebaz', 'ba.', 'g') → {bar}, {baz}
    (2 rows)
regexp_replace(string text, pattern text, replacement text [, flags text]) → text
    Replace substring(s) matching a POSIX regular expression. See for more
    information.
    Example: regexp_replace('Thomas', '.[mN]a.', 'M') → ThM
regexp_split_to_array(string text, pattern text [, flags text]) → text[]
    Split string using a POSIX regular expression as the delimiter. See
    for more information.
    Example: regexp_split_to_array('hello world', '\s+') → {hello,world}
regexp_split_to_table(string text, pattern text [, flags text]) → setof text
    Split string using a POSIX regular expression as the delimiter. See
    for more information.
    Example: regexp_split_to_table('hello world', '\s+') → hello, world
    (2 rows)
repeat(string text, number int) → text
    Repeat string the specified number of times.
    Example: repeat('Pg', 4) → PgPgPgPg
replace(string text, from text, to text) → text
    Replace all occurrences in string of substring from with substring to.
    Example: replace('abcdefabcdef', 'cd', 'XX') → abXXefabXXef
reverse(str) → text
    Return reversed string.
    Example: reverse('abcde') → edcba
right(str text, n int) → text
    Return last n characters in the string. When n is negative, return all
    but first |n| characters.
    Example: right('abcde', 2) → de
rpad(string text, length int [, fill text]) → text
    Fill up the string to length length by appending the characters fill
    (a space by default). If the string is already longer than length then
    it is truncated.
    Example: rpad('hi', 5, 'xy') → hixyx
rtrim(string text [, characters text]) → text
    Remove the longest string containing only characters from characters
    (a space by default) from the end of string.
    Example: rtrim('testxxzx', 'xyz') → test
split_part(string text, delimiter text, field int) → text
    Split string on delimiter and return the given field (counting from one).
    Example: split_part('abc~@~def~@~ghi', '~@~', 2) → def
strpos(string, substring) → int
    Location of specified substring (same as position(substring in string),
    but note the reversed argument order).
    Example: strpos('high', 'ig') → 2
substr(string, from, count) → text
    Extract substring (same as substring(string from from for count)).
    Example: substr('alphabet', 3, 2) → ph
starts_with(string, prefix) → bool
    Returns true if string starts with prefix.
    Example: starts_with('alphabet', 'alph') → t
to_ascii(string text [, encoding text]) → text
    Convert string to ASCII from another encoding (only supports conversion
    from LATIN1, LATIN2, LATIN9, and WIN1250 encodings).
    Example: to_ascii('Karel') → Karel
to_hex(number int or bigint) → text
    Convert number to its equivalent hexadecimal representation.
    Example: to_hex(2147483647) → 7fffffff
translate(string text, from text, to text) → text
    Any character in string that matches a character in the from set is
    replaced by the corresponding character in the to set. If from is
    longer than to, occurrences of the extra characters in from are
    removed.
    Example: translate('12345', '143', 'ax') → a2x5
The concat, concat_ws and
format functions are variadic, so it is possible to
pass the values to be concatenated or formatted as an array marked with
the VARIADIC keyword (see ). The array's elements are
treated as if they were separate ordinary arguments to the function.
If the variadic array argument is NULL, concat
and concat_ws return NULL, but
format treats a NULL as a zero-element array.
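For example, an array's elements can be passed as individual arguments:

```sql
SELECT concat_ws(',', VARIADIC ARRAY['a', 'b', 'c']);  -- a,b,c
SELECT concat(VARIADIC ARRAY[1, 2, 3]);                -- 123
```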
See also the aggregate function string_agg in
.
Built-in Conversions

The conversion names follow a standard naming scheme: the official name of
the source encoding with all non-alphanumeric characters replaced by
underscores, followed by _to_, followed by the similarly processed
destination encoding name. Therefore, the names might deviate from the
customary encoding names. Each entry below gives the conversion name,
source encoding, and destination encoding.
Source EncodingDestination Encodingascii_to_micSQL_ASCIIMULE_INTERNALascii_to_utf8SQL_ASCIIUTF8big5_to_euc_twBIG5EUC_TWbig5_to_micBIG5MULE_INTERNALbig5_to_utf8BIG5UTF8euc_cn_to_micEUC_CNMULE_INTERNALeuc_cn_to_utf8EUC_CNUTF8euc_jp_to_micEUC_JPMULE_INTERNALeuc_jp_to_sjisEUC_JPSJISeuc_jp_to_utf8EUC_JPUTF8euc_kr_to_micEUC_KRMULE_INTERNALeuc_kr_to_utf8EUC_KRUTF8euc_tw_to_big5EUC_TWBIG5euc_tw_to_micEUC_TWMULE_INTERNALeuc_tw_to_utf8EUC_TWUTF8gb18030_to_utf8GB18030UTF8gbk_to_utf8GBKUTF8iso_8859_10_to_utf8LATIN6UTF8iso_8859_13_to_utf8LATIN7UTF8iso_8859_14_to_utf8LATIN8UTF8iso_8859_15_to_utf8LATIN9UTF8iso_8859_16_to_utf8LATIN10UTF8iso_8859_1_to_micLATIN1MULE_INTERNALiso_8859_1_to_utf8LATIN1UTF8iso_8859_2_to_micLATIN2MULE_INTERNALiso_8859_2_to_utf8LATIN2UTF8iso_8859_2_to_windows_1250LATIN2WIN1250iso_8859_3_to_micLATIN3MULE_INTERNALiso_8859_3_to_utf8LATIN3UTF8iso_8859_4_to_micLATIN4MULE_INTERNALiso_8859_4_to_utf8LATIN4UTF8iso_8859_5_to_koi8_rISO_8859_5KOI8Riso_8859_5_to_micISO_8859_5MULE_INTERNALiso_8859_5_to_utf8ISO_8859_5UTF8iso_8859_5_to_windows_1251ISO_8859_5WIN1251iso_8859_5_to_windows_866ISO_8859_5WIN866iso_8859_6_to_utf8ISO_8859_6UTF8iso_8859_7_to_utf8ISO_8859_7UTF8iso_8859_8_to_utf8ISO_8859_8UTF8iso_8859_9_to_utf8LATIN5UTF8johab_to_utf8JOHABUTF8koi8_r_to_iso_8859_5KOI8RISO_8859_5koi8_r_to_micKOI8RMULE_INTERNALkoi8_r_to_utf8KOI8RUTF8koi8_r_to_windows_1251KOI8RWIN1251koi8_r_to_windows_866KOI8RWIN866koi8_u_to_utf8KOI8UUTF8mic_to_asciiMULE_INTERNALSQL_ASCIImic_to_big5MULE_INTERNALBIG5mic_to_euc_cnMULE_INTERNALEUC_CNmic_to_euc_jpMULE_INTERNALEUC_JPmic_to_euc_krMULE_INTERNALEUC_KRmic_to_euc_twMULE_INTERNALEUC_TWmic_to_iso_8859_1MULE_INTERNALLATIN1mic_to_iso_8859_2MULE_INTERNALLATIN2mic_to_iso_8859_3MULE_INTERNALLATIN3mic_to_iso_8859_4MULE_INTERNALLATIN4mic_to_iso_8859_5MULE_INTERNALISO_8859_5mic_to_koi8_rMULE_INTERNALKOI8Rmic_to_sjisMULE_INTERNALSJISmic_to_windows_1250MULE_INTERNALWIN1250mic_to_windows_1251MULE_INTERNALWIN1251mic_to_windows_866MULE_INTERNALWIN866sjis_to_euc_j
pSJISEUC_JPsjis_to_micSJISMULE_INTERNALsjis_to_utf8SJISUTF8windows_1258_to_utf8WIN1258UTF8uhc_to_utf8UHCUTF8utf8_to_asciiUTF8SQL_ASCIIutf8_to_big5UTF8BIG5utf8_to_euc_cnUTF8EUC_CNutf8_to_euc_jpUTF8EUC_JPutf8_to_euc_krUTF8EUC_KRutf8_to_euc_twUTF8EUC_TWutf8_to_gb18030UTF8GB18030utf8_to_gbkUTF8GBKutf8_to_iso_8859_1UTF8LATIN1utf8_to_iso_8859_10UTF8LATIN6utf8_to_iso_8859_13UTF8LATIN7utf8_to_iso_8859_14UTF8LATIN8utf8_to_iso_8859_15UTF8LATIN9utf8_to_iso_8859_16UTF8LATIN10utf8_to_iso_8859_2UTF8LATIN2utf8_to_iso_8859_3UTF8LATIN3utf8_to_iso_8859_4UTF8LATIN4utf8_to_iso_8859_5UTF8ISO_8859_5utf8_to_iso_8859_6UTF8ISO_8859_6utf8_to_iso_8859_7UTF8ISO_8859_7utf8_to_iso_8859_8UTF8ISO_8859_8utf8_to_iso_8859_9UTF8LATIN5utf8_to_johabUTF8JOHAButf8_to_koi8_rUTF8KOI8Rutf8_to_koi8_uUTF8KOI8Uutf8_to_sjisUTF8SJISutf8_to_windows_1258UTF8WIN1258utf8_to_uhcUTF8UHCutf8_to_windows_1250UTF8WIN1250utf8_to_windows_1251UTF8WIN1251utf8_to_windows_1252UTF8WIN1252utf8_to_windows_1253UTF8WIN1253utf8_to_windows_1254UTF8WIN1254utf8_to_windows_1255UTF8WIN1255utf8_to_windows_1256UTF8WIN1256utf8_to_windows_1257UTF8WIN1257utf8_to_windows_866UTF8WIN866utf8_to_windows_874UTF8WIN874windows_1250_to_iso_8859_2WIN1250LATIN2windows_1250_to_micWIN1250MULE_INTERNALwindows_1250_to_utf8WIN1250UTF8windows_1251_to_iso_8859_5WIN1251ISO_8859_5windows_1251_to_koi8_rWIN1251KOI8Rwindows_1251_to_micWIN1251MULE_INTERNALwindows_1251_to_utf8WIN1251UTF8windows_1251_to_windows_866WIN1251WIN866windows_1252_to_utf8WIN1252UTF8windows_1256_to_utf8WIN1256UTF8windows_866_to_iso_8859_5WIN866ISO_8859_5windows_866_to_koi8_rWIN866KOI8Rwindows_866_to_micWIN866MULE_INTERNALwindows_866_to_utf8WIN866UTF8windows_866_to_windows_1251WIN866WINwindows_874_to_utf8WIN874UTF8euc_jis_2004_to_utf8EUC_JIS_2004UTF8utf8_to_euc_jis_2004UTF8EUC_JIS_2004shift_jis_2004_to_utf8SHIFT_JIS_2004UTF8utf8_to_shift_jis_2004UTF8SHIFT_JIS_2004euc_jis_2004_to_shift_jis_2004EUC_JIS_2004SHIFT_JIS_2004shift_jis_2004_to_euc_jis_2004SHIFT_JIS_2004EUC_JIS_2004
format
The function format produces output formatted according to
a format string, in a style similar to the C function
sprintf.
format(formatstr text [, formatarg "any" [, ...] ])
formatstr is a format string that specifies how the
result should be formatted. Text in the format string is copied
directly to the result, except where format specifiers are
used. Format specifiers act as placeholders in the string, defining how
subsequent function arguments should be formatted and inserted into the
result. Each formatarg argument is converted to text
according to the usual output rules for its data type, and then formatted
and inserted into the result string according to the format specifier(s).
Format specifiers are introduced by a % character and have
the form
%[position][flags][width]type
where the component fields are:
position (optional)
A string of the form n$ where
n is the index of the argument to print.
Index 1 means the first argument after
formatstr. If the position is
omitted, the default is to use the next argument in sequence.
flags (optional)
Additional options controlling how the format specifier's output is
formatted. Currently the only supported flag is a minus sign
(-) which will cause the format specifier's output to be
left-justified. This has no effect unless the width
field is also specified.
width (optional)
Specifies the minimum number of characters to use to
display the format specifier's output. The output is padded on the
left or right (depending on the - flag) with spaces as
needed to fill the width. A too-small width does not cause
truncation of the output, but is simply ignored. The width may be
specified using any of the following: a positive integer; an
asterisk (*) to use the next function argument as the
width; or a string of the form *n$ to
use the nth function argument as the width.
If the width comes from a function argument, that argument is
consumed before the argument that is used for the format specifier's
value. If the width argument is negative, the result is left
aligned (as if the - flag had been specified) within a
field of length abs(width).
type (required)
The type of format conversion to use to produce the format
specifier's output. The following types are supported:
s formats the argument value as a simple
string. A null value is treated as an empty string.
I treats the argument value as an SQL
identifier, double-quoting it if necessary.
It is an error for the value to be null (equivalent to
quote_ident).
L quotes the argument value as an SQL literal.
A null value is displayed as the string NULL, without
quotes (equivalent to quote_nullable).
In addition to the format specifiers described above, the special sequence
%% may be used to output a literal % character.
Here are some examples of the basic format conversions:
SELECT format('Hello %s', 'World');
Result: Hello World
SELECT format('Testing %s, %s, %s, %%', 'one', 'two', 'three');
Result: Testing one, two, three, %
SELECT format('INSERT INTO %I VALUES(%L)', 'Foo bar', E'O\'Reilly');
Result: INSERT INTO "Foo bar" VALUES('O''Reilly')
SELECT format('INSERT INTO %I VALUES(%L)', 'locations', 'C:\Program Files');
Result: INSERT INTO locations VALUES('C:\Program Files')
Here are examples using width fields
and the - flag:
SELECT format('|%10s|', 'foo');
Result: | foo|
SELECT format('|%-10s|', 'foo');
Result: |foo |
SELECT format('|%*s|', 10, 'foo');
Result: | foo|
SELECT format('|%*s|', -10, 'foo');
Result: |foo |
SELECT format('|%-*s|', 10, 'foo');
Result: |foo |
SELECT format('|%-*s|', -10, 'foo');
Result: |foo |
These examples show use of position fields:
SELECT format('Testing %3$s, %2$s, %1$s', 'one', 'two', 'three');
Result: Testing three, two, one
SELECT format('|%*2$s|', 'foo', 10, 'bar');
Result: | bar|
SELECT format('|%1$*2$s|', 'foo', 10, 'bar');
Result: | foo|
Unlike the standard C function sprintf,
PostgreSQL's format function allows format
specifiers with and without position fields to be mixed
in the same format string. A format specifier without a
position field always uses the next argument after the
last argument consumed.
In addition, the format function does not require all
function arguments to be used in the format string.
For example:
SELECT format('Testing %3$s, %2$s, %s', 'one', 'two', 'three');
Result: Testing three, two, three
The %I and %L format specifiers are particularly
useful for safely constructing dynamic SQL statements. See
.
Binary String Functions and Operators
This section describes functions and operators for examining and
manipulating values of type bytea.
SQL defines some string functions that use
key words, rather than commas, to separate
arguments. Details are in
.
PostgreSQL also provides versions of these functions
that use the regular function invocation syntax
(see ).
The sample results shown on this page assume that the server parameter
bytea_output is set
to escape (the traditional PostgreSQL format).
SQL Binary String Functions and Operators
string || string → bytea: String concatenation. Example: '\\Post'::bytea || '\047gres\000'::bytea → \\Post'gres\000
octet_length(string) → int: Number of bytes in binary string. octet_length('jo\000se'::bytea) → 5
overlay(string placing string from int [for int]) → bytea: Replace substring. overlay('Th\000omas'::bytea placing '\002\003'::bytea from 2 for 3) → T\\002\\003mas
position(substring in string) → int: Location of specified substring. position('\000om'::bytea in 'Th\000omas'::bytea) → 3
substring(string [from int] [for int]) → bytea: Extract substring. substring('Th\000omas'::bytea from 2 for 3) → h\000o
trim([both] bytes from string) → bytea: Remove the longest string containing only bytes appearing in bytes from the start and end of string. trim('\000\001'::bytea from '\000Tom\001'::bytea) → Tom
Additional binary string manipulation functions are available and
are listed in . Some
of them are used internally to implement the
SQL-standard string functions listed in .
Other Binary String Functions
btrim(string bytea, bytes bytea) → bytea: Remove the longest string containing only bytes appearing in bytes from the start and end of string. btrim('\000trim\001'::bytea, '\000\001'::bytea) → trim
decode(string text, format text) → bytea: Decode binary data from textual representation in string. Options for format are the same as in encode. decode('123\000456', 'escape') → 123\000456
encode(data bytea, format text) → text: Encode binary data into a textual representation. Supported formats are: base64, hex, escape. escape converts zero bytes and high-bit-set bytes to octal sequences (\nnn) and doubles backslashes. encode('123\000456'::bytea, 'escape') → 123\000456
get_bit(string, offset) → int: Extract bit from string. get_bit('Th\000omas'::bytea, 45) → 1
get_byte(string, offset) → int: Extract byte from string. get_byte('Th\000omas'::bytea, 4) → 109
length(string) → int: Length of binary string. length('jo\000se'::bytea) → 5
md5(string) → text: Calculates the MD5 hash of string, returning the result in hexadecimal. md5('Th\000omas'::bytea) → 8ab2d3c9689aaf18b4958c334c82d8b1
set_bit(string, offset, newvalue) → bytea: Set bit in string. set_bit('Th\000omas'::bytea, 45, 0) → Th\000omAs
set_byte(string, offset, newvalue) → bytea: Set byte in string. set_byte('Th\000omas'::bytea, 4, 64) → Th\000o@as
sha224(bytea) → bytea: SHA-224 hash. sha224('abc') → \x23097d223405d8228642a477bda255b32aadbce4bda0b3f7e36c9da7
sha256(bytea) → bytea: SHA-256 hash. sha256('abc') → \xba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad
sha384(bytea) → bytea: SHA-384 hash. sha384('abc') → \xcb00753f45a35e8bb5a03d699ac65007272c32ab0eded1631a8b605a43ff5bed8086072ba1e7cc2358baeca134c825a7
sha512(bytea) → bytea: SHA-512 hash. sha512('abc') → \xddaf35a193617abacc417349ae20413112e6fa4e89a97ea20a9eeee64b55d39a2192992a274fc1a836ba3c23a3feebbd454d4423643ce80e2a9ac94fa54ca49f
get_byte and set_byte number the first byte
of a binary string as byte 0.
get_bit and set_bit number bits from the
right within each byte; for example bit 0 is the least significant bit of
the first byte, and bit 15 is the most significant bit of the second byte.
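As a quick illustration of these numbering conventions (the byte value 97 is the ASCII code of a, and bit 0 of the byte \x01 is its least significant bit):
get_byte('abc'::bytea, 0) 97
get_bit('\x01'::bytea, 0) 1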
Note that for historic reasons, the function md5
returns a hex-encoded value of type text whereas the SHA-2
functions return type bytea. Use the functions
encode and decode to convert
between the two, for example encode(sha256('abc'),
'hex') to get a hex-encoded text representation.
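For example, converting a SHA-2 digest to the same hex-text form that md5 produces:
encode(sha256('abc'), 'hex') ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad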
See also the aggregate function string_agg in
and the large object functions
in .
Bit String Functions and Operators
This section describes functions and operators for examining and
manipulating bit strings, that is values of the types
bit and bit varying. Aside from the
usual comparison operators, the operators
shown in can be used.
Bit string operands of &, |,
and # must be of equal length. When bit
shifting, the original length of the string is preserved, as shown
in the examples.
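For example (note that the operands of &, |, and # have equal length, and that shifting preserves the original five-bit length):
B'10001' & B'01101' 00001
B'10001' | B'01101' 11101
B'10001' # B'01101' 11100
~ B'10001' 01110
B'10001' << 3 01000
B'10001' >> 2 00100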
The following SQL-standard functions work on bit
strings as well as character strings:
length,
bit_length,
octet_length,
position,
substring,
overlay.
The following functions work on bit strings as well as binary
strings:
get_bit,
set_bit.
When working with a bit string, these functions number the first
(leftmost) bit of the string as bit 0.
In addition, it is possible to cast integral values to and from type
bit.
Some examples:
44::bit(10) 0000101100
44::bit(3) 100
cast(-44 as bit(12)) 111111010100
'1110'::bit(4)::integer 14
Note that casting to just bit means casting to
bit(1), and so will deliver only the least significant
bit of the integer.
Casting an integer to bit(n) copies the rightmost
n bits. Casting an integer to a bit string width wider
than the integer itself will sign-extend on the left.
Pattern Matching
There are three separate approaches to pattern matching provided
by PostgreSQL: the traditional
SQL LIKE operator, the
more recent SIMILAR TO operator (added in
SQL:1999), and POSIX-style regular
expressions. Aside from the basic does this string match
this pattern? operators, functions are available to extract
or replace matching substrings and to split a string at matching
locations.
If you have pattern matching needs that go beyond this,
consider writing a user-defined function in Perl or Tcl.
While most regular-expression searches can be executed very quickly,
regular expressions can be contrived that take arbitrary amounts of
time and memory to process. Be wary of accepting regular-expression
search patterns from hostile sources. If you must do so, it is
advisable to impose a statement timeout.
Searches using SIMILAR TO patterns have the same
security hazards, since SIMILAR TO provides many
of the same capabilities as POSIX-style regular
expressions.
LIKE searches, being much simpler than the other
two options, are safer to use with possibly-hostile pattern sources.
The pattern matching operators of all three kinds do not support
nondeterministic collations. If required, apply a different collation to
the expression to work around this limitation.
LIKE
string LIKE pattern [ESCAPE escape-character]
string NOT LIKE pattern [ESCAPE escape-character]
The LIKE expression returns true if the
string matches the supplied
pattern. (As
expected, the NOT LIKE expression returns
false if LIKE returns true, and vice versa.
An equivalent expression is
NOT (string LIKE
pattern).)
If pattern does not contain percent
signs or underscores, then the pattern only represents the string
itself; in that case LIKE acts like the
equals operator. An underscore (_) in
pattern stands for (matches) any single
character; a percent sign (%) matches any sequence
of zero or more characters.
Some examples:
'abc' LIKE 'abc' true
'abc' LIKE 'a%' true
'abc' LIKE '_b_' true
'abc' LIKE 'c' false

LIKE pattern matching always covers the entire
string. Therefore, if it's desired to match a sequence anywhere within
a string, the pattern must start and end with a percent sign.
To match a literal underscore or percent sign without matching
other characters, the respective character in
pattern must be
preceded by the escape character. The default escape
character is the backslash but a different one can be selected by
using the ESCAPE clause. To match the escape
character itself, write two escape characters.
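For instance, assuming the default backslash escape character (and standard_conforming_strings on, so that single backslashes can be written as-is in string literals):
'10%' LIKE '10\%' true
'100' LIKE '10\%' false
'a_c' LIKE 'a#_c' ESCAPE '#' true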
If you have turned standard_conforming_strings off,
any backslashes you write in literal string constants will need to be
doubled. See for more information.
It's also possible to select no escape character by writing
ESCAPE ''. This effectively disables the
escape mechanism, which makes it impossible to turn off the
special meaning of underscore and percent signs in the pattern.
The key word ILIKE can be used instead of
LIKE to make the match case-insensitive according
to the active locale. This is not in the SQL standard but is a
PostgreSQL extension.
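For example:
'Banana' ILIKE 'banana' true
'Banana' LIKE 'banana' false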
The operator ~~ is equivalent to
LIKE, and ~~* corresponds to
ILIKE. There are also
!~~ and !~~* operators that
represent NOT LIKE and NOT
ILIKE, respectively. All of these operators are
PostgreSQL-specific. You may see these
operator names in EXPLAIN output and similar
places, since the parser actually translates LIKE
et al. to these operators.
The phrases LIKE, ILIKE,
NOT LIKE, and NOT ILIKE are
generally treated as operators
in PostgreSQL syntax; for example they can
be used in expressionoperator ANY
(subquery) constructs, although
an ESCAPE clause cannot be included there. In some
obscure cases it may be necessary to use the underlying operator names
instead.
There is also the prefix operator ^@ and corresponding
starts_with function which covers cases when only
searching by beginning of the string is needed.
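For example:
'alphabet' ^@ 'alph' true
starts_with('alphabet', 'bet') false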
SIMILAR TO Regular Expressions
string SIMILAR TO pattern [ESCAPE escape-character]
string NOT SIMILAR TO pattern [ESCAPE escape-character]
The SIMILAR TO operator returns true or
false depending on whether its pattern matches the given string.
It is similar to LIKE, except that it
interprets the pattern using the SQL standard's definition of a
regular expression. SQL regular expressions are a curious cross
between LIKE notation and common regular
expression notation.
Like LIKE, the SIMILAR TO
operator succeeds only if its pattern matches the entire string;
this is unlike common regular expression behavior where the pattern
can match any part of the string.
Also like
LIKE, SIMILAR TO uses
_ and % as wildcard characters denoting
any single character and any string, respectively (these are
comparable to . and .* in POSIX regular
expressions).
In addition to these facilities borrowed from LIKE,
SIMILAR TO supports these pattern-matching
metacharacters borrowed from POSIX regular expressions:
| denotes alternation (either of two alternatives).
* denotes repetition of the previous item zero
or more times.
+ denotes repetition of the previous item one
or more times.
? denotes repetition of the previous item zero
or one time.
{m} denotes repetition
of the previous item exactly m times.
{m,} denotes repetition
of the previous item m or more times.
{m,n}
denotes repetition of the previous item at least m and
not more than n times.
Parentheses () can be used to group items into
a single logical item.
A bracket expression [...] specifies a character
class, just as in POSIX regular expressions.
Notice that the period (.) is not a metacharacter
for SIMILAR TO.
As with LIKE, a backslash disables the special meaning
of any of these metacharacters; or a different escape character can
be specified with ESCAPE.
Some examples:
'abc' SIMILAR TO 'abc' true
'abc' SIMILAR TO 'a' false
'abc' SIMILAR TO '%(b|d)%' true
'abc' SIMILAR TO '(b|c)%' false
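In particular, because the period is not a metacharacter for SIMILAR TO, it matches only a literal period:
'abc' SIMILAR TO 'a.c' false
'a.c' SIMILAR TO 'a.c' true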
The substring function with three parameters
provides extraction of a substring that matches an SQL
regular expression pattern. The function can be written according
to SQL99 syntax:
substring(string from pattern for escape-character)
or as a plain three-argument function:
substring(string, pattern, escape-character)
As with SIMILAR TO, the
specified pattern must match the entire data string, or else the
function fails and returns null. To indicate the part of the
pattern for which the matching data sub-string is of interest,
the pattern should contain
two occurrences of the escape character followed by a double quote
(").
The text matching the portion of the pattern
between these separators is returned when the match is successful.
The escape-double-quote separators actually
divide substring's pattern into three independent
regular expressions; for example, a vertical bar (|)
in any of the three sections affects only that section. Also, the first
and third of these regular expressions are defined to match the smallest
possible amount of text, not the largest, when there is any ambiguity
about how much of the data string matches which pattern. (In POSIX
parlance, the first and third regular expressions are forced to be
non-greedy.)
As an extension to the SQL standard, PostgreSQL
allows there to be just one escape-double-quote separator, in which case
the third regular expression is taken as empty; or no separators, in which
case the first and third regular expressions are taken as empty.
Some examples, with #" delimiting the return string:
substring('foobar' from '%#"o_b#"%' for '#') oob
substring('foobar' from '#"o_b#"%' for '#') NULL

POSIX Regular Expressions
The operators available for pattern matching using POSIX regular
expressions are listed below.

Regular Expression Match Operators
~ Matches regular expression, case sensitive: 'thomas' ~ '.*thomas.*'
~* Matches regular expression, case insensitive: 'thomas' ~* '.*Thomas.*'
!~ Does not match regular expression, case sensitive: 'thomas' !~ '.*Thomas.*'
!~* Does not match regular expression, case insensitive: 'thomas' !~* '.*vadim.*'
POSIX regular expressions provide a more
powerful means for pattern matching than the LIKE and
SIMILAR TO operators.
Many Unix tools such as egrep,
sed, or awk use a pattern
matching language that is similar to the one described here.
A regular expression is a character sequence that is an
abbreviated definition of a set of strings (a regular
set). A string is said to match a regular expression
if it is a member of the regular set described by the regular
expression. As with LIKE, pattern characters
match string characters exactly unless they are special characters
in the regular expression language — but regular expressions use
different special characters than LIKE does.
Unlike LIKE patterns, a
regular expression is allowed to match anywhere within a string, unless
the regular expression is explicitly anchored to the beginning or
end of the string.
Some examples:
'abc' ~ 'abc' true
'abc' ~ '^a' true
'abc' ~ '(b|d)' true
'abc' ~ '^(b|c)' false
The POSIX pattern language is described in much
greater detail below.
The substring function with two parameters,
substring(string from
pattern), provides extraction of a
substring
that matches a POSIX regular expression pattern. It returns null if
there is no match, otherwise the portion of the text that matched the
pattern. But if the pattern contains any parentheses, the portion
of the text that matched the first parenthesized subexpression (the
one whose left parenthesis comes first) is
returned. You can put parentheses around the whole expression
if you want to use parentheses within it without triggering this
exception. If you need parentheses in the pattern before the
subexpression you want to extract, see the non-capturing parentheses
described below.
Some examples:
substring('foobar' from 'o.b') oob
substring('foobar' from 'o(.)b') o
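A non-capturing group can supply structure without becoming the reported subexpression; here the first (and only) capturing group is returned:
substring('foobar' from '(?:fo+)(b..)') bar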
The regexp_replace function provides substitution of
new text for substrings that match POSIX regular expression patterns.
It has the syntax
regexp_replace(source,
pattern, replacement, flags).
The source string is returned unchanged if
there is no match to the pattern. If there is a
match, the source string is returned with the
replacement string substituted for the matching
substring. The replacement string can contain
\n, where n is 1
through 9, to indicate that the source substring matching the
n'th parenthesized subexpression of the pattern should be
inserted, and it can contain \& to indicate that the
substring matching the entire pattern should be inserted. Write
\\ if you need to put a literal backslash in the replacement
text.
The flags parameter is an optional text
string containing zero or more single-letter flags that change the
function's behavior. Flag i specifies case-insensitive
matching, while flag g specifies replacement of each matching
substring rather than only the first one. Supported flags (though
not g) are
described in .
Some examples:
regexp_replace('foobarbaz', 'b..', 'X')
fooXbaz
regexp_replace('foobarbaz', 'b..', 'X', 'g')
fooXX
regexp_replace('foobarbaz', 'b(..)', 'X\1Y', 'g')
fooXarYXazY
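The \& escape inserts the text matching the whole pattern; for example:
regexp_replace('foobarbaz', 'b..', '<\&>', 'g') foo<bar><baz>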
The regexp_match function returns a text array of
captured substring(s) resulting from the first match of a POSIX
regular expression pattern to a string. It has the syntax
regexp_match(string,
pattern, flags).
If there is no match, the result is NULL.
If a match is found, and the pattern contains no
parenthesized subexpressions, then the result is a single-element text
array containing the substring matching the whole pattern.
If a match is found, and the pattern contains
parenthesized subexpressions, then the result is a text array
whose n'th element is the substring matching
the n'th parenthesized subexpression of
the pattern (not counting non-capturing
parentheses; see below for details).
The flags parameter is an optional text string
containing zero or more single-letter flags that change the function's
behavior. Supported flags are described
in .
Some examples:
SELECT regexp_match('foobarbequebaz', 'bar.*que');
regexp_match
--------------
{barbeque}
(1 row)
SELECT regexp_match('foobarbequebaz', '(bar)(beque)');
regexp_match
--------------
{bar,beque}
(1 row)
In the common case where you just want the whole matching substring
or NULL for no match, write something like
SELECT (regexp_match('foobarbequebaz', 'bar.*que'))[1];
regexp_match
--------------
barbeque
(1 row)
The regexp_matches function returns a set of text arrays
of captured substring(s) resulting from matching a POSIX regular
expression pattern to a string. It has the same syntax as
regexp_match.
This function returns no rows if there is no match, one row if there is
a match and the g flag is not given, or N
rows if there are N matches and the g flag
is given. Each returned row is a text array containing the whole
matched substring or the substrings matching parenthesized
subexpressions of the pattern, just as described above
for regexp_match.
regexp_matches accepts all the flags shown
in , plus
the g flag which commands it to return all matches, not
just the first one.
Some examples:
SELECT regexp_matches('foo', 'not there');
regexp_matches
----------------
(0 rows)
SELECT regexp_matches('foobarbequebazilbarfbonk', '(b[^b]+)(b[^b]+)', 'g');
regexp_matches
----------------
{bar,beque}
{bazil,barf}
(2 rows)
In most cases regexp_matches() should be used with
the g flag, since if you only want the first match, it's
easier and more efficient to use regexp_match().
However, regexp_match() only exists
in PostgreSQL version 10 and up. When working in older
versions, a common trick is to place a regexp_matches()
call in a sub-select, for example:
SELECT col1, (SELECT regexp_matches(col2, '(bar)(beque)')) FROM tab;
This produces a text array if there's a match, or NULL if
not, the same as regexp_match() would do. Without the
sub-select, this query would produce no output at all for table rows
without a match, which is typically not the desired behavior.
The regexp_split_to_table function splits a string using a POSIX
regular expression pattern as a delimiter. It has the syntax
regexp_split_to_table(string, pattern, flags).
If there is no match to the pattern, the function returns the
string. If there is at least one match, for each match it returns
the text from the end of the last match (or the beginning of the string)
to the beginning of the match. When there are no more matches, it
returns the text from the end of the last match to the end of the string.
The flags parameter is an optional text string containing
zero or more single-letter flags that change the function's behavior.
regexp_split_to_table supports the flags described in
.
The regexp_split_to_array function behaves the same as
regexp_split_to_table, except that regexp_split_to_array
returns its result as an array of text. It has the syntax
regexp_split_to_array(string, pattern, flags).
The parameters are the same as for regexp_split_to_table.
Some examples:
SELECT foo FROM regexp_split_to_table('the quick brown fox jumps over the lazy dog', '\s+') AS foo;
foo
-------
the
quick
brown
fox
jumps
over
the
lazy
dog
(9 rows)
SELECT regexp_split_to_array('the quick brown fox jumps over the lazy dog', '\s+');
regexp_split_to_array
-----------------------------------------------
{the,quick,brown,fox,jumps,over,the,lazy,dog}
(1 row)
SELECT foo FROM regexp_split_to_table('the quick brown fox', '\s*') AS foo;
foo
-----
t
h
e
q
u
i
c
k
b
r
o
w
n
f
o
x
(16 rows)
As the last example demonstrates, the regexp split functions ignore
zero-length matches that occur at the start or end of the string
or immediately after a previous match. This is contrary to the strict
definition of regexp matching that is implemented by
regexp_match and
regexp_matches, but is usually the most convenient behavior
in practice. Other software systems such as Perl use similar definitions.
Regular Expression Details
PostgreSQL's regular expressions are implemented
using a software package written by Henry Spencer. Much of
the description of regular expressions below is copied verbatim from his
manual.
Regular expressions (REs), as defined in
POSIX 1003.2, come in two forms:
extended REs or EREs
(roughly those of egrep), and
basic REs or BREs
(roughly those of ed).
PostgreSQL supports both forms, and
also implements some extensions
that are not in the POSIX standard, but have become widely used
due to their availability in programming languages such as Perl and Tcl.
REs using these non-POSIX extensions are called
advanced REs or AREs
in this documentation. AREs are almost an exact superset of EREs,
but BREs have several notational incompatibilities (as well as being
much more limited).
We first describe the ARE and ERE forms, noting features that apply
only to AREs, and then describe how BREs differ.
PostgreSQL always initially presumes that a regular
expression follows the ARE rules. However, the more limited ERE or
BRE rules can be chosen by prepending an embedded option
to the RE pattern, as described in .
This can be useful for compatibility with applications that expect
exactly the POSIX 1003.2 rules.
A regular expression is defined as one or more
branches, separated by
|. It matches anything that matches one of the
branches.
A branch is zero or more quantified atoms or
constraints, concatenated.
It matches a match for the first, followed by a match for the second, etc;
an empty branch matches the empty string.
A quantified atom is an atom possibly followed
by a single quantifier.
Without a quantifier, it matches a match for the atom.
With a quantifier, it can match some number of matches of the atom.
An atom can be any of the possibilities
shown in .
The possible quantifiers and their meanings are shown in
.
A constraint matches an empty string, but matches only when
specific conditions are met. A constraint can be used where an atom
could be used, except it cannot be followed by a quantifier.
The simple constraints are shown in
;
some more constraints are described later.
Regular Expression Atoms
(re) (where re is any regular expression) matches a match for re, with the match noted for possible reporting
(?:re) as above, but the match is not noted for reporting (a non-capturing set of parentheses) (AREs only)
. matches any single character
[chars] a bracket expression, matching any one of the chars (see for more detail)
\k (where k is a non-alphanumeric character) matches that character taken as an ordinary character, e.g., \\ matches a backslash character
\c (where c is alphanumeric, possibly followed by other characters) is an escape, see (AREs only; in EREs and BREs, this matches c)
{ when followed by a character other than a digit, matches the left-brace character {; when followed by a digit, it is the beginning of a bound (see below)
x (where x is a single character with no other significance) matches that character
An RE cannot end with a backslash (\).
If you have turned standard_conforming_strings off,
any backslashes you write in literal string constants will need to be
doubled. See for more information.
Regular Expression Quantifiers
* a sequence of 0 or more matches of the atom
+ a sequence of 1 or more matches of the atom
? a sequence of 0 or 1 matches of the atom
{m} a sequence of exactly m matches of the atom
{m,} a sequence of m or more matches of the atom
{m,n} a sequence of m through n (inclusive) matches of the atom; m cannot exceed n
*? non-greedy version of *
+? non-greedy version of +
?? non-greedy version of ?
{m}? non-greedy version of {m}
{m,}? non-greedy version of {m,}
{m,n}? non-greedy version of {m,n}
The forms using {...}
are known as bounds.
The numbers m and n within a bound are
unsigned decimal integers with permissible values from 0 to 255 inclusive.
Non-greedy quantifiers (available in AREs only) match the
same possibilities as their corresponding normal (greedy)
counterparts, but prefer the smallest number rather than the largest
number of matches.
See for more detail.
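For example, the greedy and non-greedy versions of the same quantifier can report different matches:
regexp_match('abcabc', 'a.*c') {abcabc}
regexp_match('abcabc', 'a.*?c') {abc}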
A quantifier cannot immediately follow another quantifier, e.g.,
** is invalid.
A quantifier cannot
begin an expression or subexpression or follow
^ or |.
Regular Expression Constraints
^ matches at the beginning of the string
$ matches at the end of the string
(?=re) positive lookahead: matches at any point where a substring matching re begins (AREs only)
(?!re) negative lookahead: matches at any point where no substring matching re begins (AREs only)
(?<=re) positive lookbehind: matches at any point where a substring matching re ends (AREs only)
(?<!re) negative lookbehind: matches at any point where no substring matching re ends (AREs only)
Lookahead and lookbehind constraints cannot contain back
references (see ),
and all parentheses within them are considered non-capturing.
Bracket Expressions
A bracket expression is a list of
characters enclosed in []. It normally matches
any single character from the list (but see below). If the list
begins with ^, it matches any single character
not from the rest of the list.
If two characters
in the list are separated by -, this is
shorthand for the full range of characters between those two
(inclusive) in the collating sequence,
e.g., [0-9] in ASCII matches
any decimal digit. It is illegal for two ranges to share an
endpoint, e.g., a-c-e. Ranges are very
collating-sequence-dependent, so portable programs should avoid
relying on them.
To include a literal ] in the list, make it the
first character (after ^, if that is used). To
include a literal -, make it the first or last
character, or the second endpoint of a range. To use a literal
- as the first endpoint of a range, enclose it
in [. and .] to make it a
collating element (see below). With the exception of these characters,
some combinations using [
(see next paragraphs), and escapes (AREs only), all other special
characters lose their special significance within a bracket expression.
In particular, \ is not special when following
ERE or BRE rules, though it is special (as introducing an escape)
in AREs.
Within a bracket expression, a collating element (a character, a
multiple-character sequence that collates as if it were a single
character, or a collating-sequence name for either) enclosed in
[. and .] stands for the
sequence of characters of that collating element. The sequence is
treated as a single element of the bracket expression's list. This
allows a bracket
expression containing a multiple-character collating element to
match more than one character, e.g., if the collating sequence
includes a ch collating element, then the RE
[[.ch.]]*c matches the first five characters of
chchcc.
PostgreSQL currently does not support multi-character collating
elements. This information describes possible future behavior.
Within a bracket expression, a collating element enclosed in
[= and =] is an equivalence
class, standing for the sequences of characters of all collating
elements equivalent to that one, including itself. (If there are
no other equivalent collating elements, the treatment is as if the
enclosing delimiters were [. and
.].) For example, if o and
^ are the members of an equivalence class, then
[[=o=]], [[=^=]], and
[o^] are all synonymous. An equivalence class
cannot be an endpoint of a range.
Within a bracket expression, the name of a character class
enclosed in [: and :] stands
for the list of all characters belonging to that class. A character
class cannot be used as an endpoint of a range.
The POSIX standard defines these character class
names:
alnum (letters and numeric digits),
alpha (letters),
blank (space and tab),
cntrl (control characters),
digit (numeric digits),
graph (printable characters except space),
lower (lower-case letters),
print (printable characters including space),
punct (punctuation),
space (any white space),
upper (upper-case letters),
and xdigit (hexadecimal digits).
The behavior of these standard character classes is generally
consistent across platforms for characters in the 7-bit ASCII set.
Whether a given non-ASCII character is considered to belong to one
of these classes depends on the collation
that is used for the regular-expression function or operator
(see ), or by default on the
database's LC_CTYPE locale setting (see
). The classification of non-ASCII
characters can vary across platforms even in similarly-named
locales. (But the C locale never considers any
non-ASCII characters to belong to any of these classes.)
In addition to these standard character
classes, PostgreSQL defines
the ascii character class, which contains exactly
the 7-bit ASCII set.
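For example:
'abc123' ~ '^[[:alpha:]]+[[:digit:]]+$' true
'abc 123' ~ '^[[:alnum:]]+$' false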
There are two special cases of bracket expressions: the bracket
expressions [[:<:]] and
[[:>:]] are constraints,
matching empty strings at the beginning
and end of a word respectively. A word is defined as a sequence
of word characters that is neither preceded nor followed by word
characters. A word character is an alnum character (as
defined by the POSIX character class described above)
or an underscore. This is an extension, compatible with but not
specified by POSIX 1003.2, and should be used with
caution in software intended to be portable to other systems.
The constraint escapes described below are usually preferable; they
are no more standard, but are easier to type.
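For example, `[[:<:]]` and `[[:>:]]` match `cat` only when it appears as a separate word:

```sql
-- [[:<:]] and [[:>:]] are constraints matching at word boundaries
SELECT substring('the cat in concatenation' from '[[:<:]]cat[[:>:]]');
Result: cat
```

The same pattern applied to the string `concatenation` alone finds no match (the function returns NULL), because `cat` there is neither preceded nor followed by a word boundary on both sides.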
Regular Expression Escapes

Escapes are special sequences beginning with \

followed by an alphanumeric character. Escapes come in several varieties:
character entry, class shorthands, constraint escapes, and back references.
A \ followed by an alphanumeric character but not constituting
a valid escape is illegal in AREs.
In EREs, there are no escapes: outside a bracket expression,
a \ followed by an alphanumeric character merely stands for
that character as an ordinary character, and inside a bracket expression,
\ is an ordinary character.
(The latter is the one actual incompatibility between EREs and AREs.)
Character-entry escapes exist to make it easier to specify
non-printing and other inconvenient characters in REs. They are
shown in .
Class-shorthand escapes provide shorthands for certain
commonly-used character classes. They are
shown in .
A constraint escape is a constraint,
matching the empty string if specific conditions are met,
written as an escape. They are
shown in .
A back reference (\n) matches the
same string matched by the previous parenthesized subexpression specified
by the number n
(see ). For example,
([bc])\1 matches bb or cc
but not bc or cb.
The subexpression must entirely precede the back reference in the RE.
Subexpressions are numbered in the order of their leading parentheses.
Non-capturing parentheses do not define subexpressions.
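For example, a back reference must match the same text the subexpression captured, not merely the same pattern:

```sql
-- \1 must repeat the exact text captured by the first subexpression
SELECT regexp_match('abcc', '([bc])\1');
Result: {c}
```

Here `([bc])\1` fails at the `b` (the next character is `c`, not another `b`) and succeeds at `cc`, so the captured group is `c`.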
Regular Expression Character-Entry Escapes

\a          alert (bell) character, as in C
\b          backspace, as in C
\B          synonym for backslash (\) to help reduce the need for backslash doubling
\cX         (where X is any character) the character whose low-order 5 bits are the same as those of X, and whose other bits are all zero
\e          the character whose collating-sequence name is ESC, or failing that, the character with octal value 033
\f          form feed, as in C
\n          newline, as in C
\r          carriage return, as in C
\t          horizontal tab, as in C
\uwxyz      (where wxyz is exactly four hexadecimal digits) the character whose hexadecimal value is 0xwxyz
\Ustuvwxyz  (where stuvwxyz is exactly eight hexadecimal digits) the character whose hexadecimal value is 0xstuvwxyz
\v          vertical tab, as in C
\xhhh       (where hhh is any sequence of hexadecimal digits) the character whose hexadecimal value is 0xhhh (a single character no matter how many hexadecimal digits are used)
\0          the character whose value is 0 (the null byte)
\xy         (where xy is exactly two octal digits, and is not a back reference) the character whose octal value is 0xy
\xyz        (where xyz is exactly three octal digits, and is not a back reference) the character whose octal value is 0xyz
Hexadecimal digits are 0-9,
a-f, and A-F.
Octal digits are 0-7.
Numeric character-entry escapes specifying values outside the ASCII range
(0-127) have meanings dependent on the database encoding. When the
encoding is UTF-8, escape values are equivalent to Unicode code points,
for example \u1234 means the character U+1234.
For other multibyte encodings, character-entry escapes usually just
specify the concatenation of the byte values for the character. If the
escape value does not correspond to any legal character in the database
encoding, no error will be raised, but it will never match any data.
The character-entry escapes are always taken as ordinary characters.
For example, \135 is ] in ASCII, but
\135 does not terminate a bracket expression.
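For example, assuming standard_conforming_strings is on (the default), so that backslashes in the string literal reach the regex engine unchanged:

```sql
-- \135 is ] (octal 135 = decimal 93), taken as an ordinary character
SELECT substring('a]b' from 'a\135b');
Result: a]b
```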
Regular Expression Class-Shorthand Escapes

\d    [[:digit:]]
\s    [[:space:]]
\w    [[:alnum:]_] (note underscore is included)
\D    [^[:digit:]]
\S    [^[:space:]]
\W    [^[:alnum:]_] (note underscore is included)
Within bracket expressions, \d, \s,
and \w lose their outer brackets,
and \D, \S, and \W are illegal.
(So, for example, [a-c\d] is equivalent to
[a-c[:digit:]].
Also, [a-c\D], which is equivalent to
[a-c^[:digit:]], is illegal.)
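For example, a class shorthand inside a bracket expression behaves like the corresponding character class:

```sql
-- [a-c\d] is equivalent to [a-c[:digit:]]
SELECT substring('xy5z' from '[a-c\d]');
Result: 5
```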
Regular Expression Constraint Escapes

\A    matches only at the beginning of the string
      (see for how this differs from ^)
\m    matches only at the beginning of a word
\M    matches only at the end of a word
\y    matches only at the beginning or end of a word
\Y    matches only at a point that is not the beginning or end of a word
\Z    matches only at the end of the string
      (see for how this differs from $)
A word is defined as in the specification of
[[:<:]] and [[:>:]] above.
Constraint escapes are illegal within bracket expressions.
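For example, `\m` and `\M` restrict a match to a whole word:

```sql
-- \mcat\M matches cat as a word, but not inside catalog
SELECT regexp_replace('the cat in the catalog', '\mcat\M', 'dog');
Result: the dog in the catalog
```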
Regular Expression Back References

\m     (where m is a nonzero digit)
       a back reference to the m'th subexpression
\mnn   (where m is a nonzero digit, and nn is some more digits,
       and the decimal value mnn is not greater than the number of
       closing capturing parentheses seen so far)
       a back reference to the mnn'th subexpression
There is an inherent ambiguity between octal character-entry
escapes and back references, which is resolved by the following heuristics,
as hinted at above.
A leading zero always indicates an octal escape.
A single non-zero digit, not followed by another digit,
is always taken as a back reference.
A multi-digit sequence not starting with a zero is taken as a back
reference if it comes after a suitable subexpression
(i.e., the number is in the legal range for a back reference),
and otherwise is taken as octal.
Regular Expression Metasyntax
In addition to the main syntax described above, there are some special
forms and miscellaneous syntactic facilities available.
An RE can begin with one of two special director prefixes.
If an RE begins with ***:,
the rest of the RE is taken as an ARE. (This normally has no effect in
PostgreSQL, since REs are assumed to be AREs;
but it does have an effect if ERE or BRE mode had been specified by
the flags parameter to a regex function.)
If an RE begins with ***=,
the rest of the RE is taken to be a literal string,
with all characters considered ordinary characters.
An ARE can begin with embedded options:
a sequence (?xyz)
(where xyz is one or more alphabetic characters)
specifies options affecting the rest of the RE.
These options override any previously determined options —
in particular, they can override the case-sensitivity behavior implied by
a regex operator, or the flags parameter to a regex
function.
The available option letters are
shown in .
Note that these same option letters are used in the flags
parameters of regex functions.
ARE Embedded-Option Letters

b    rest of RE is a BRE
c    case-sensitive matching (overrides operator type)
e    rest of RE is an ERE
i    case-insensitive matching (see ) (overrides operator type)
m    historical synonym for n
n    newline-sensitive matching (see )
p    partial newline-sensitive matching (see )
q    rest of RE is a literal (quoted) string, all ordinary characters
s    non-newline-sensitive matching (default)
t    tight syntax (default; see below)
w    inverse partial newline-sensitive (weird) matching (see )
x    expanded syntax (see below)
Embedded options take effect at the ) terminating the sequence.
They can appear only at the start of an ARE (after the
***: director if any).
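For example, an embedded option can override the case sensitivity implied by the operator:

```sql
-- (?i) at the start of an ARE turns on case-insensitive matching
SELECT substring('The QUICK fox' from '(?i)quick');
Result: QUICK
```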
In addition to the usual (tight) RE syntax, in which all
characters are significant, there is an expanded syntax,
available by specifying the embedded x option.
In the expanded syntax,
white-space characters in the RE are ignored, as are
all characters between a #
and the following newline (or the end of the RE). This
permits paragraphing and commenting a complex RE.
There are three exceptions to that basic rule:
a white-space character or # preceded by \ is
retained
white space or # within a bracket expression is retained
white space and comments cannot appear within multi-character symbols,
such as (?:
For this purpose, white-space characters are blank, tab, newline, and
any character that belongs to the space character class.
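For example, with the x option the whitespace and the comment below are not part of the pattern:

```sql
-- the x option ignores whitespace and # comments in the RE
SELECT substring('abc123' from '(?x) [0-9]+  # match a run of digits');
Result: 123
```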
Finally, in an ARE, outside bracket expressions, the sequence
(?#ttt)
(where ttt is any text not containing a ))
is a comment, completely ignored.
Again, this is not allowed between the characters of
multi-character symbols, like (?:.
Such comments are more a historical artifact than a useful facility,
and their use is deprecated; use the expanded syntax instead.
None of these metasyntax extensions is available if
an initial ***= director
has specified that the user's input be treated as a literal string
rather than as an RE.
Regular Expression Matching Rules
In the event that an RE could match more than one substring of a given
string, the RE matches the one starting earliest in the string.
If the RE could match more than one substring starting at that point,
either the longest possible match or the shortest possible match will
be taken, depending on whether the RE is greedy or
non-greedy.
Whether an RE is greedy or not is determined by the following rules:
Most atoms, and all constraints, have no greediness attribute (because
they cannot match variable amounts of text anyway).
Adding parentheses around an RE does not change its greediness.
A quantified atom with a fixed-repetition quantifier
({m}
or
{m}?)
has the same greediness (possibly none) as the atom itself.
A quantified atom with other normal quantifiers (including
{m,n}
with m equal to n)
is greedy (prefers longest match).
A quantified atom with a non-greedy quantifier (including
{m,n}?
with m equal to n)
is non-greedy (prefers shortest match).
A branch — that is, an RE that has no top-level
| operator — has the same greediness as the first
quantified atom in it that has a greediness attribute.
An RE consisting of two or more branches connected by the
| operator is always greedy.
The above rules associate greediness attributes not only with individual
quantified atoms, but with branches and entire REs that contain quantified
atoms. What that means is that the matching is done in such a way that
the branch, or whole RE, matches the longest or shortest possible
substring as a whole. Once the length of the entire match
is determined, the part of it that matches any particular subexpression
is determined on the basis of the greediness attribute of that
subexpression, with subexpressions starting earlier in the RE taking
priority over ones starting later.
An example of what this means:
SELECT SUBSTRING('XY1234Z', 'Y*([0-9]{1,3})');
Result: 123
SELECT SUBSTRING('XY1234Z', 'Y*?([0-9]{1,3})');
Result: 1
In the first case, the RE as a whole is greedy because Y*
is greedy. It can match beginning at the Y, and it matches
the longest possible string starting there, i.e., Y123.
The output is the parenthesized part of that, or 123.
In the second case, the RE as a whole is non-greedy because Y*?
is non-greedy. It can match beginning at the Y, and it matches
the shortest possible string starting there, i.e., Y1.
The subexpression [0-9]{1,3} is greedy but it cannot change
the decision as to the overall match length; so it is forced to match
just 1.
In short, when an RE contains both greedy and non-greedy subexpressions,
the total match length is either as long as possible or as short as
possible, according to the attribute assigned to the whole RE. The
attributes assigned to the subexpressions only affect how much of that
match they are allowed to eat relative to each other.
The quantifiers {1,1} and {1,1}?
can be used to force greediness or non-greediness, respectively,
on a subexpression or a whole RE.
This is useful when you need the whole RE to have a greediness attribute
different from what's deduced from its elements. As an example,
suppose that we are trying to separate a string containing some digits
into the digits and the parts before and after them. We might try to
do that like this:
SELECT regexp_match('abc01234xyz', '(.*)(\d+)(.*)');
Result: {abc0123,4,xyz}
That didn't work: the first .* is greedy so
it eats as much as it can, leaving the \d+ to
match at the last possible place, the last digit. We might try to fix
that by making it non-greedy:
SELECT regexp_match('abc01234xyz', '(.*?)(\d+)(.*)');
Result: {abc,0,""}
That didn't work either, because now the RE as a whole is non-greedy
and so it ends the overall match as soon as possible. We can get what
we want by forcing the RE as a whole to be greedy:
SELECT regexp_match('abc01234xyz', '(?:(.*?)(\d+)(.*)){1,1}');
Result: {abc,01234,xyz}
Controlling the RE's overall greediness separately from its components'
greediness allows great flexibility in handling variable-length patterns.
When deciding what is a longer or shorter match,
match lengths are measured in characters, not collating elements.
An empty string is considered longer than no match at all.
For example:
bb*
matches the three middle characters of abbbc;
(week|wee)(night|knights)
matches all ten characters of weeknights;
when (.*).*
is matched against abc the parenthesized subexpression
matches all three characters; and when
(a*)* is matched against bc
both the whole RE and the parenthesized
subexpression match an empty string.
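The first of these examples can be verified directly:

```sql
-- bb* is greedy: it starts at the earliest possible position
-- (the first b) and takes the longest match from there
SELECT substring('abbbc' from 'bb*');
Result: bbb
```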
If case-independent matching is specified,
the effect is much as if all case distinctions had vanished from the
alphabet.
When an alphabetic that exists in multiple cases appears as an
ordinary character outside a bracket expression, it is effectively
transformed into a bracket expression containing both cases,
e.g., x becomes [xX].
When it appears inside a bracket expression, all case counterparts
of it are added to the bracket expression, e.g.,
[x] becomes [xX]
and [^x] becomes [^xX].
If newline-sensitive matching is specified, .
and bracket expressions using ^
will never match the newline character
(so that matches will never cross newlines unless the RE
explicitly arranges it)
and ^ and $
will match the empty string after and before a newline
respectively, in addition to matching at beginning and end of string
respectively.
But the ARE escapes \A and \Z
continue to match beginning or end of string only.
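For example, with newline-sensitive matching enabled via the embedded n option, `^` and `$` anchor at line boundaries within the string:

```sql
-- with the n option, ^ and $ also match just after and before a newline
SELECT regexp_match(E'foo\nbar', '(?n)^bar$');
Result: {bar}
```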
If partial newline-sensitive matching is specified,
this affects . and bracket expressions
as with newline-sensitive matching, but not ^
and $.
If inverse partial newline-sensitive matching is specified,
this affects ^ and $
as with newline-sensitive matching, but not .
and bracket expressions.
This isn't very useful but is provided for symmetry.
Limits and Compatibility
No particular limit is imposed on the length of REs in this
implementation. However,
programs intended to be highly portable should not employ REs longer
than 256 bytes,
as a POSIX-compliant implementation can refuse to accept such REs.
The only feature of AREs that is actually incompatible with
POSIX EREs is that \ does not lose its special
significance inside bracket expressions.
All other ARE features use syntax which is illegal or has
undefined or unspecified effects in POSIX EREs;
the *** syntax of directors likewise is outside the POSIX
syntax for both BREs and EREs.
Many of the ARE extensions are borrowed from Perl, but some have
been changed to clean them up, and a few Perl extensions are not present.
Incompatibilities of note include \b, \B,
the lack of special treatment for a trailing newline,
the addition of complemented bracket expressions to the things
affected by newline-sensitive matching,
the restrictions on parentheses and back references in lookahead/lookbehind
constraints, and the longest/shortest-match (rather than first-match)
matching semantics.
Two significant incompatibilities exist between AREs and the ERE syntax
recognized by pre-7.4 releases of PostgreSQL:
In AREs, \ followed by an alphanumeric character is either
an escape or an error, while in previous releases, it was just another
way of writing the alphanumeric.
This should not be much of a problem because there was no reason to
write such a sequence in earlier releases.
In AREs, \ remains a special character within
[], so a literal \ within a bracket
expression must be written \\.
Basic Regular Expressions
BREs differ from EREs in several respects.
In BREs, |, +, and ?
are ordinary characters and there is no equivalent
for their functionality.
The delimiters for bounds are
\{ and \},
with { and }
by themselves ordinary characters.
The parentheses for nested subexpressions are
\( and \),
with ( and ) by themselves ordinary characters.
^ is an ordinary character except at the beginning of the
RE or the beginning of a parenthesized subexpression,
$ is an ordinary character except at the end of the
RE or the end of a parenthesized subexpression,
and * is an ordinary character if it appears at the beginning
of the RE or the beginning of a parenthesized subexpression
(after a possible leading ^).
Finally, single-digit back references are available, and
\< and \>
are synonyms for
[[:<:]] and [[:>:]]
respectively; no other escapes are available in BREs.
Differences From XQuery (LIKE_REGEX)
Since SQL:2008, the SQL standard includes
a LIKE_REGEX operator that performs pattern
matching according to the XQuery regular expression
standard. PostgreSQL does not yet
implement this operator, but you can get very similar behavior using
the regexp_match() function, since XQuery
regular expressions are quite close to the ARE syntax described above.
Notable differences between the existing POSIX-based
regular-expression feature and XQuery regular expressions include:
XQuery character class subtraction is not supported. An example of
this feature is using the following to match only English
consonants: [a-z-[aeiou]].
XQuery character class shorthands \c,
\C, \i,
and \I are not supported.
XQuery character class elements
using \p{UnicodeProperty} or the
inverse \P{UnicodeProperty} are not supported.
POSIX interprets character classes such as \w
(see )
according to the prevailing locale (which you can control by
attaching a COLLATE clause to the operator or
function). XQuery specifies these classes by reference to Unicode
character properties, so equivalent behavior is obtained only with
a locale that follows the Unicode rules.
The SQL standard (not XQuery itself) attempts to cater for more
variants of newline than POSIX does. The
newline-sensitive matching options described above consider only
ASCII NL (\n) to be a newline, but SQL would have
us treat CR (\r), CRLF (\r\n)
(a Windows-style newline), and some Unicode-only characters like
LINE SEPARATOR (U+2028) as newlines as well.
Notably, . and \s should
count \r\n as one character not two according to
SQL.
Of the character-entry escapes described in
,
XQuery supports only \n, \r,
and \t.
XQuery does not support
the [:name:] syntax
for character classes within bracket expressions.
XQuery does not have lookahead or lookbehind constraints,
nor any of the constraint escapes described in
.
The metasyntax forms described in
do not exist in XQuery.
The regular expression flag letters defined by XQuery are
related to but not the same as the option letters for POSIX
(). While the
i and q options behave the
same, others do not:
XQuery's s (allow dot to match newline)
and m (allow ^
and $ to match at newlines) flags provide
access to the same behaviors as
POSIX's n, p
and w flags, but they
do not match the behavior of
POSIX's s and m flags.
Note in particular that dot-matches-newline is the default
behavior in POSIX but not XQuery.
XQuery's x (ignore whitespace in pattern) flag
is noticeably different from POSIX's expanded-mode flag.
POSIX's x flag also
allows # to begin a comment in the pattern,
and POSIX will not ignore a whitespace character after a
backslash.
Data Type Formatting Functions
The PostgreSQL formatting functions
provide a powerful set of tools for converting various data types
(date/time, integer, floating point, numeric) to formatted strings
and for converting from formatted strings to specific data types.
lists them.
These functions all follow a common calling convention: the first
argument is the value to be formatted and the second argument is a
template that defines the output or input format.
Formatting Functions

to_char(timestamp, text) returns text: convert time stamp to string. Example: to_char(current_timestamp, 'HH12:MI:SS')
to_char(interval, text) returns text: convert interval to string. Example: to_char(interval '15h 2m 12s', 'HH24:MI:SS')
to_char(int, text) returns text: convert integer to string. Example: to_char(125, '999')
to_char(double precision, text) returns text: convert real/double precision to string. Example: to_char(125.8::real, '999D9')
to_char(numeric, text) returns text: convert numeric to string. Example: to_char(-125.8, '999D99S')
to_date(text, text) returns date: convert string to date. Example: to_date('05 Dec 2000', 'DD Mon YYYY')
to_number(text, text) returns numeric: convert string to numeric. Example: to_number('12,454.8-', '99G999D9S')
to_timestamp(text, text) returns timestamp with time zone: convert string to time stamp. Example: to_timestamp('05 Dec 2000', 'DD Mon YYYY')
There is also a single-argument to_timestamp
function; see .
to_timestamp and to_date
exist to handle input formats that cannot be converted by
simple casting. For most standard date/time formats, simply casting the
source string to the required data type works, and is much easier.
Similarly, to_number is unnecessary for standard numeric
representations.
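For example, a standard date format needs only a cast, while a nonstandard one needs a template:

```sql
-- standard format: a simple cast suffices
SELECT '2000-12-05'::date;
Result: 2000-12-05
-- nonstandard format: use to_date with a template
SELECT to_date('05 Dec 2000', 'DD Mon YYYY');
Result: 2000-12-05
```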
In a to_char output template string, there are certain
patterns that are recognized and replaced with appropriately-formatted
data based on the given value. Any text that is not a template pattern is
simply copied verbatim. Similarly, in an input template string (for the
other functions), template patterns identify the values to be supplied by
the input data string. If there are characters in the template string
that are not template patterns, the corresponding characters in the input
data string are simply skipped over (whether or not they are equal to the
template string characters).
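For example, in an output template the patterns are replaced and the literal colons are copied through:

```sql
-- HH24, MI, and SS are template patterns; the colons are copied verbatim
SELECT to_char(timestamp '2002-04-20 17:31:12', 'HH24:MI:SS');
Result: 17:31:12
```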
shows the
template patterns available for formatting date and time values.
Template Patterns for Date/Time Formatting

HH      hour of day (01-12)
HH12    hour of day (01-12)
HH24    hour of day (00-23)
MI      minute (00-59)
SS      second (00-59)
MS      millisecond (000-999)
US      microsecond (000000-999999)
SSSS    seconds past midnight (0-86399)
AM, am, PM or pm    meridiem indicator (without periods)
A.M., a.m., P.M. or p.m.    meridiem indicator (with periods)
Y,YYY   year (4 or more digits) with comma
YYYY    year (4 or more digits)
YYY     last 3 digits of year
YY      last 2 digits of year
Y       last digit of year
IYYY    ISO 8601 week-numbering year (4 or more digits)
IYY     last 3 digits of ISO 8601 week-numbering year
IY      last 2 digits of ISO 8601 week-numbering year
I       last digit of ISO 8601 week-numbering year
BC, bc, AD or ad    era indicator (without periods)
B.C., b.c., A.D. or a.d.    era indicator (with periods)
MONTH   full upper case month name (blank-padded to 9 chars)
Month   full capitalized month name (blank-padded to 9 chars)
month   full lower case month name (blank-padded to 9 chars)
MON     abbreviated upper case month name (3 chars in English, localized lengths vary)
Mon     abbreviated capitalized month name (3 chars in English, localized lengths vary)
mon     abbreviated lower case month name (3 chars in English, localized lengths vary)
MM      month number (01-12)
DAY     full upper case day name (blank-padded to 9 chars)
Day     full capitalized day name (blank-padded to 9 chars)
day     full lower case day name (blank-padded to 9 chars)
DY      abbreviated upper case day name (3 chars in English, localized lengths vary)
Dy      abbreviated capitalized day name (3 chars in English, localized lengths vary)
dy      abbreviated lower case day name (3 chars in English, localized lengths vary)
DDD     day of year (001-366)
IDDD    day of ISO 8601 week-numbering year (001-371; day 1 of the year is Monday of the first ISO week)
DD      day of month (01-31)
D       day of the week, Sunday (1) to Saturday (7)
ID      ISO 8601 day of the week, Monday (1) to Sunday (7)
W       week of month (1-5) (the first week starts on the first day of the month)
WW      week number of year (1-53) (the first week starts on the first day of the year)
IW      week number of ISO 8601 week-numbering year (01-53; the first Thursday of the year is in week 1)
CC      century (2 digits) (the twenty-first century starts on 2001-01-01)
J       Julian Day (integer days since November 24, 4714 BC at midnight UTC)
Q       quarter
RM      month in upper case Roman numerals (I-XII; I=January)
rm      month in lower case Roman numerals (i-xii; i=January)
TZ      upper case time-zone abbreviation (only supported in to_char)
tz      lower case time-zone abbreviation (only supported in to_char)
TZH     time-zone hours
TZM     time-zone minutes
OF      time-zone offset from UTC (only supported in to_char)
Modifiers can be applied to any template pattern to alter its
behavior. For example, FMMonth
is the Month pattern with the
FM modifier.
shows the
modifier patterns for date/time formatting.
Template Pattern Modifiers for Date/Time Formatting

FM prefix    fill mode (suppress leading zeroes and padding blanks); example: FMMonth
TH suffix    upper case ordinal number suffix; example: DDTH, e.g., 12TH
th suffix    lower case ordinal number suffix; example: DDth, e.g., 12th
FX prefix    fixed format global option (see usage notes); example: FX Month DD Day
TM prefix    translation mode (print localized day and month names based on ); example: TMMonth
SP suffix    spell mode (not implemented); example: DDSP
Usage notes for date/time formatting:
FM suppresses leading zeroes and trailing blanks
that would otherwise be added to make the output of a pattern be
fixed-width. In PostgreSQL,
FM modifies only the next specification, while in
Oracle FM affects all subsequent
specifications, and repeated FM modifiers
toggle fill mode on and off.
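For example, in an English locale:

```sql
-- each FM suppresses the padding of the immediately following pattern
SELECT to_char(date '2024-03-07', 'FMDD FMMonth YYYY');
Result: 7 March 2024
```

Without FM, the same template would produce `07 March     2024`, since Month is blank-padded to 9 characters and DD keeps its leading zero.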
TM does not include trailing blanks.
to_timestamp and to_date ignore
the TM modifier.
to_timestamp and to_date
skip multiple blank spaces at the beginning of the input string and
around date and time values unless the FX option is used. For example,
to_timestamp(' 2000    JUN', 'YYYY MON') and
to_timestamp('2000 - JUN', 'YYYY-MON') work, but
to_timestamp('2000    JUN', 'FXYYYY MON') returns an error
because to_timestamp expects only a single space.
FX must be specified as the first item in
the template.
A separator (a space or non-letter/non-digit character) in the template string of
to_timestamp and to_date
matches any single separator in the input string or is skipped,
unless the FX option is used.
For example, to_timestamp('2000JUN', 'YYYY///MON') and
to_timestamp('2000/JUN', 'YYYY MON') work, but
to_timestamp('2000//JUN', 'YYYY/MON')
returns an error because the number of separators in the input string
exceeds the number of separators in the template.
If FX is specified, a separator in the template string
matches exactly one character in the input string. But note that the
input string character is not required to be the same as the separator from the template string.
For example, to_timestamp('2000/JUN', 'FXYYYY MON')
works, but to_timestamp('2000/JUN', 'FXYYYY  MON')
(with two spaces in the template) returns an error because the second
space in the template string consumes
the letter J from the input string.
A TZH template pattern can match a signed number.
Without the FX option, minus signs may be ambiguous,
and could be interpreted as a separator.
This ambiguity is resolved as follows: If the number of separators before
TZH in the template string is less than the number of
separators before the minus sign in the input string, the minus sign
is interpreted as part of TZH.
Otherwise, the minus sign is considered to be a separator between values.
For example, to_timestamp('2000 -10', 'YYYY TZH') matches
-10 to TZH, but
to_timestamp('2000 -10', 'YYYY  TZH')
(with two spaces in the template) matches 10 to TZH.
Ordinary text is allowed in to_char
templates and will be output literally. You can put a substring
in double quotes to force it to be interpreted as literal text
even if it contains template patterns. For example, in
'"Hello Year "YYYY', the YYYY
will be replaced by the year data, but the single Y in Year
will not be.
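For example:

```sql
-- text in double quotes is output literally, even the Y characters
SELECT to_char(date '2024-05-01', '"Hello Year "YYYY');
Result: Hello Year 2024
```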
In to_date, to_number,
and to_timestamp, literal text and double-quoted
strings result in skipping the number of characters contained in the
string; for example "XX" skips two input characters
(whether or not they are XX).
Prior to PostgreSQL 12, it was possible to
skip arbitrary text in the input string using non-letter or non-digit
characters. For example,
to_timestamp('2000y6m1d', 'yyyy-MM-DD') used to
work. Now you can only use letter characters for this purpose. For example,
to_timestamp('2000y6m1d', 'yyyytMMtDDt') and
to_timestamp('2000y6m1d', 'yyyy"y"MM"m"DD"d"')
skip y, m, and
d.
If you want to have a double quote in the output you must
precede it with a backslash, for example '\"YYYY
Month\"'.
Backslashes are not otherwise special outside of double-quoted
strings. Within a double-quoted string, a backslash causes the
next character to be taken literally, whatever it is (but this
has no special effect unless the next character is a double quote
or another backslash).
In to_timestamp and to_date,
if the year format specification is less than four digits, e.g.
YYY, and the supplied year is less than four digits,
the year will be adjusted to be nearest to the year 2020, e.g.
95 becomes 1995.
In to_timestamp and to_date,
the YYYY conversion has a restriction when
processing years with more than 4 digits. You must
use some non-digit character or template after YYYY,
otherwise the year is always interpreted as 4 digits. For example
(with the year 20000):
to_date('200001131', 'YYYYMMDD') will be
interpreted as a 4-digit year; instead use a non-digit
separator after the year, like
to_date('20000-1131', 'YYYY-MMDD') or
to_date('20000Nov31', 'YYYYMonDD').
In to_timestamp and to_date,
the CC (century) field is accepted but ignored
if there is a YYY, YYYY or
Y,YYY field. If CC is used with
YY or Y then the result is
computed as that year in the specified century. If the century is
specified but the year is not, the first year of the century
is assumed.
In to_timestamp and to_date,
weekday names or numbers (DAY, D,
and related field types) are accepted but are ignored for purposes of
computing the result. The same is true for quarter
(Q) fields.
In to_timestamp and to_date,
an ISO 8601 week-numbering date (as distinct from a Gregorian date)
can be specified in one of two ways:
Year, week number, and weekday: for
example to_date('2006-42-4', 'IYYY-IW-ID')
returns the date 2006-10-19.
If you omit the weekday it is assumed to be 1 (Monday).
Year and day of year: for example to_date('2006-291',
'IYYY-IDDD') also returns 2006-10-19.
Attempting to enter a date using a mixture of ISO 8601 week-numbering
fields and Gregorian date fields is nonsensical, and will cause an
error. In the context of an ISO 8601 week-numbering year, the
concept of a month or day of month has no
meaning. In the context of a Gregorian year, the ISO week has no
meaning.
While to_date will reject a mixture of
Gregorian and ISO week-numbering date
fields, to_char will not, since output format
specifications like YYYY-MM-DD (IYYY-IDDD) can be
useful. But avoid writing something like IYYY-MM-DD;
that would yield surprising results near the start of the year.
(See for more
information.)
In to_timestamp, millisecond
(MS) or microsecond (US)
fields are used as the
seconds digits after the decimal point. For example
to_timestamp('12.3', 'SS.MS') is not 3 milliseconds,
but 300, because the conversion treats it as 12 + 0.3 seconds.
So, for the format SS.MS, the input values
12.3, 12.30,
and 12.300 specify the
same number of milliseconds. To get three milliseconds, one must write
12.003, which the conversion treats as
12 + 0.003 = 12.003 seconds.
Here is a more
complex example:
to_timestamp('15:12:02.020.001230', 'HH24:MI:SS.MS.US')
is 15 hours, 12 minutes, and 2 seconds + 20 milliseconds +
1230 microseconds = 2.021230 seconds.
to_char(..., 'ID')'s day of the week numbering
matches the extract(isodow from ...) function, but
to_char(..., 'D')'s does not match
extract(dow from ...)'s day numbering.
to_char(interval) formats HH and
HH12 as shown on a 12-hour clock, for example zero hours
and 36 hours both output as 12, while HH24
outputs the full hour value, which can exceed 23 in
an interval value.
shows the
template patterns available for formatting numeric values.
Template Patterns for Numeric Formatting

9           digit position (can be dropped if insignificant)
0           digit position (will not be dropped, even if insignificant)
. (period)  decimal point
, (comma)   group (thousands) separator
PR          negative value in angle brackets
S           sign anchored to number (uses locale)
L           currency symbol (uses locale)
D           decimal point (uses locale)
G           group separator (uses locale)
MI          minus sign in specified position (if number < 0)
PL          plus sign in specified position (if number > 0)
SG          plus/minus sign in specified position
RN          Roman numeral (input between 1 and 3999)
TH or th    ordinal number suffix
V           shift specified number of digits (see notes)
EEEE        exponent for scientific notation
Usage notes for numeric formatting:
0 specifies a digit position that will always be printed,
even if it contains a leading/trailing zero. 9 also
specifies a digit position, but if it is a leading zero then it will
be replaced by a space, while if it is a trailing zero and fill mode
is specified then it will be deleted. (For to_number(),
these two pattern characters are equivalent.)
The pattern characters S, L, D,
and G represent the sign, currency symbol, decimal point,
and thousands separator characters defined by the current locale
(see
and ). The pattern characters period
and comma represent those exact characters, with the meanings of
decimal point and thousands separator, regardless of locale.
If no explicit provision is made for a sign
in to_char()'s pattern, one column will be reserved for
the sign, and it will be anchored to (appear just left of) the
number. If S appears just left of some 9's,
it will likewise be anchored to the number.
A sign formatted using SG, PL, or
MI is not anchored to
the number; for example,
to_char(-12, 'MI9999') produces '- 12'
but to_char(-12, 'S9999') produces ' -12'.
(The Oracle implementation does not allow the use of
MI before 9, but rather
requires that 9 precede
MI.)
TH does not convert values less than zero
and does not convert fractional numbers.
PL, SG, and
TH are PostgreSQL
extensions.
In to_number, if non-data template patterns such
as L or TH are used, the
corresponding number of input characters are skipped, whether or not
they match the template pattern, unless they are data characters
(that is, digits, sign, decimal point, or comma). For
example, TH would skip two non-data characters.
V with to_char
multiplies the input values by
10^n, where
n is the number of digits following
V. V with
to_number divides in a similar manner.
to_char and to_number
do not support the use of
V combined with a decimal point
(e.g., 99.9V99 is not allowed).
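A short sketch of the V pattern, with three digits following V so the shift is 10^3 (the to_number result assumes the symmetric division described above):

```sql
SELECT to_char(12.4, '99V999');       -- ' 12400' (value multiplied by 10^3)
SELECT to_number('12400', '99V999');  -- 12.4     (value divided by 10^3)
```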
EEEE (scientific notation) cannot be used in
combination with any of the other formatting patterns or
modifiers other than digit and decimal point patterns, and must be at the end of the format string
(e.g., 9.99EEEE is a valid pattern).
Certain modifiers can be applied to any template pattern to alter its
behavior. For example, FM99.99
is the 99.99 pattern with the
FM modifier.
shows the
modifier patterns for numeric formatting.
Template Pattern Modifiers for Numeric Formatting

  FM prefix   fill mode (suppress trailing zeroes and padding blanks)   Example: FM99.99
  TH suffix   upper case ordinal number suffix                          Example: 999TH
  th suffix   lower case ordinal number suffix                          Example: 999th
shows some
examples of the use of the to_char function.
Date/Time Functions and Operators

 shows the available
functions for date/time value processing, with details appearing in
the following subsections. illustrates the behaviors of
the basic arithmetic operators (+,
*, etc.). For formatting functions, refer to
. You should be familiar with
the background information on date/time data types from .
All the functions and operators described below that take time or timestamp
inputs actually come in two variants: one that takes time with time zone or timestamp
with time zone, and one that takes time without time zone or timestamp without time zone.
For brevity, these variants are not shown separately. Also, the
+ and * operators come in commutative pairs (for
example both date + integer and integer + date); we show only one of each
such pair.
Date/Time Functions

age(timestamp, timestamp) → interval
  Subtract arguments, producing a symbolic result that uses years and months, rather than just days
  Example: age(timestamp '2001-04-10', timestamp '1957-06-13') → 43 years 9 mons 27 days

age(timestamp) → interval
  Subtract from current_date (at midnight)
  Example: age(timestamp '1957-06-13') → 43 years 8 mons 3 days

clock_timestamp() → timestamp with time zone
  Current date and time (changes during statement execution)

current_date → date
  Current date

current_time → time with time zone
  Current time of day

current_timestamp → timestamp with time zone
  Current date and time (start of current transaction)

date_part(text, timestamp) → double precision
  Get subfield (equivalent to extract)
  Example: date_part('hour', timestamp '2001-02-16 20:38:40') → 20

date_part(text, interval) → double precision
  Get subfield (equivalent to extract)
  Example: date_part('month', interval '2 years 3 months') → 3

date_trunc(text, timestamp) → timestamp
  Truncate to specified precision
  Example: date_trunc('hour', timestamp '2001-02-16 20:38:40') → 2001-02-16 20:00:00

date_trunc(text, timestamp with time zone, text) → timestamp with time zone
  Truncate to specified precision in the specified time zone
  Example: date_trunc('day', timestamptz '2001-02-16 20:38:40+00', 'Australia/Sydney') → 2001-02-16 13:00:00+00

date_trunc(text, interval) → interval
  Truncate to specified precision
  Example: date_trunc('hour', interval '2 days 3 hours 40 minutes') → 2 days 03:00:00

extract(field from timestamp) → double precision
  Get subfield
  Example: extract(hour from timestamp '2001-02-16 20:38:40') → 20

extract(field from interval) → double precision
  Get subfield
  Example: extract(month from interval '2 years 3 months') → 3

isfinite(date) → boolean
  Test for finite date (not +/-infinity)
  Example: isfinite(date '2001-02-16') → true

isfinite(timestamp) → boolean
  Test for finite time stamp (not +/-infinity)
  Example: isfinite(timestamp '2001-02-16 21:28:30') → true

isfinite(interval) → boolean
  Test for finite interval
  Example: isfinite(interval '4 hours') → true

justify_days(interval) → interval
  Adjust interval so 30-day time periods are represented as months
  Example: justify_days(interval '35 days') → 1 mon 5 days

justify_hours(interval) → interval
  Adjust interval so 24-hour time periods are represented as days
  Example: justify_hours(interval '27 hours') → 1 day 03:00:00

justify_interval(interval) → interval
  Adjust interval using justify_days and justify_hours, with additional sign adjustments
  Example: justify_interval(interval '1 mon -1 hour') → 29 days 23:00:00

localtime → time
  Current time of day

localtimestamp → timestamp
  Current date and time (start of current transaction)

make_date(year int, month int, day int) → date
  Create date from year, month and day fields
  Example: make_date(2013, 7, 15) → 2013-07-15

make_interval(years int DEFAULT 0, months int DEFAULT 0, weeks int DEFAULT 0, days int DEFAULT 0, hours int DEFAULT 0, mins int DEFAULT 0, secs double precision DEFAULT 0.0) → interval
  Create interval from years, months, weeks, days, hours, minutes and seconds fields
  Example: make_interval(days => 10) → 10 days

make_time(hour int, min int, sec double precision) → time
  Create time from hour, minute and seconds fields
  Example: make_time(8, 15, 23.5) → 08:15:23.5

make_timestamp(year int, month int, day int, hour int, min int, sec double precision) → timestamp
  Create timestamp from year, month, day, hour, minute and seconds fields
  Example: make_timestamp(2013, 7, 15, 8, 15, 23.5) → 2013-07-15 08:15:23.5

make_timestamptz(year int, month int, day int, hour int, min int, sec double precision, timezone text) → timestamp with time zone
  Create timestamp with time zone from year, month, day, hour, minute and seconds fields; if timezone is not specified, the current time zone is used
  Example: make_timestamptz(2013, 7, 15, 8, 15, 23.5) → 2013-07-15 08:15:23.5+01

now() → timestamp with time zone
  Current date and time (start of current transaction)

statement_timestamp() → timestamp with time zone
  Current date and time (start of current statement)

timeofday() → text
  Current date and time (like clock_timestamp, but as a text string)

transaction_timestamp() → timestamp with time zone
  Current date and time (start of current transaction)

to_timestamp(double precision) → timestamp with time zone
  Convert Unix epoch (seconds since 1970-01-01 00:00:00+00) to timestamp
  Example: to_timestamp(1284352323) → 2010-09-13 04:32:03+00
OVERLAPS
In addition to these functions, the SQL OVERLAPS operator is
supported:
(start1, end1) OVERLAPS (start2, end2)
(start1, length1) OVERLAPS (start2, length2)
This expression yields true when two time periods (defined by their
endpoints) overlap, false when they do not overlap. The endpoints
can be specified as pairs of dates, times, or time stamps; or as
a date, time, or time stamp followed by an interval. When a pair
of values is provided, either the start or the end can be written
first; OVERLAPS automatically takes the earlier value
of the pair as the start. Each time period is considered to
represent the half-open interval start<=time<end, unless
start and end are equal in which case it
represents that single time instant. This means for instance that two
time periods with only an endpoint in common do not overlap.
SELECT (DATE '2001-02-16', DATE '2001-12-21') OVERLAPS
(DATE '2001-10-30', DATE '2002-10-30');
Result: true
SELECT (DATE '2001-02-16', INTERVAL '100 days') OVERLAPS
(DATE '2001-10-30', DATE '2002-10-30');
Result: false
SELECT (DATE '2001-10-29', DATE '2001-10-30') OVERLAPS
(DATE '2001-10-30', DATE '2001-10-31');
Result: false
SELECT (DATE '2001-10-30', DATE '2001-10-30') OVERLAPS
(DATE '2001-10-30', DATE '2001-10-31');
Result: true
When adding an interval value to (or subtracting an
interval value from) a timestamp with time zone
value, the days component advances or decrements the date of the
timestamp with time zone by the indicated number of days.
Across daylight saving time changes (when the session time zone is set to a
time zone that recognizes DST), this means interval '1 day'
does not necessarily equal interval '24 hours'.
For example, with the session time zone set to CST7CDT,
timestamp with time zone '2005-04-02 12:00-07' + interval '1 day'
will produce timestamp with time zone '2005-04-03 12:00-06',
while adding interval '24 hours' to the same initial
timestamp with time zone produces
timestamp with time zone '2005-04-03 13:00-06', as there is
a change in daylight saving time at 2005-04-03 02:00 in time zone
CST7CDT.
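The scenario above can be reproduced directly; the results shown are those stated in the text (the display format of the output depends on client settings):

```sql
SET TIME ZONE 'CST7CDT';

-- Adding '1 day' preserves the wall-clock time across the DST change:
SELECT timestamptz '2005-04-02 12:00-07' + interval '1 day';
-- 2005-04-03 12:00:00-06

-- Adding '24 hours' adds exactly 24 elapsed hours:
SELECT timestamptz '2005-04-02 12:00-07' + interval '24 hours';
-- 2005-04-03 13:00:00-06
```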
Note there can be ambiguity in the months field returned by
age because different months have different numbers of
days. PostgreSQL's approach uses the month from the
earlier of the two dates when calculating partial months. For example,
age('2004-06-01', '2004-04-30') uses April to yield
1 mon 1 day, while using May would yield 1 mon 2
days because May has 31 days, while April has only 30.
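Written as a runnable query, the example above is:

```sql
SELECT age(timestamp '2004-06-01', timestamp '2004-04-30');
-- 1 mon 1 day  (the partial month is measured against April, the earlier month)
```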
Subtraction of dates and timestamps can also be complex. One conceptually
simple way to perform subtraction is to convert each value to a number
of seconds using EXTRACT(EPOCH FROM ...), then subtract the
results; this produces the
number of seconds between the two values. This will adjust
for the number of days in each month, timezone changes, and daylight
saving time adjustments. Subtraction of date or timestamp
values with the - operator
returns the number of days (24-hours) and hours/minutes/seconds
between the values, making the same adjustments. The age
function returns years, months, days, and hours/minutes/seconds,
performing field-by-field subtraction and then adjusting for negative
field values. The following queries illustrate the differences in these
approaches. The sample results were produced with timezone
= 'US/Eastern'; there is a daylight saving time change between the
two dates used:
SELECT EXTRACT(EPOCH FROM timestamptz '2013-07-01 12:00:00') -
EXTRACT(EPOCH FROM timestamptz '2013-03-01 12:00:00');
Result: 10537200
SELECT (EXTRACT(EPOCH FROM timestamptz '2013-07-01 12:00:00') -
EXTRACT(EPOCH FROM timestamptz '2013-03-01 12:00:00'))
/ 60 / 60 / 24;
Result: 121.958333333333
SELECT timestamptz '2013-07-01 12:00:00' - timestamptz '2013-03-01 12:00:00';
Result: 121 days 23:00:00
SELECT age(timestamptz '2013-07-01 12:00:00', timestamptz '2013-03-01 12:00:00');
Result: 4 mons

EXTRACT, date_part
EXTRACT(field FROM source)
The extract function retrieves subfields
such as year or hour from date/time values.
source must be a value expression of
type timestamp, time, or interval.
(Expressions of type date are
cast to timestamp and can therefore be used as
well.) field is an identifier or
string that selects what field to extract from the source value.
The extract function returns values of type
double precision.
The following are valid field names:
century
The century
SELECT EXTRACT(CENTURY FROM TIMESTAMP '2000-12-16 12:21:13');
Result: 20
SELECT EXTRACT(CENTURY FROM TIMESTAMP '2001-02-16 20:38:40');
Result: 21
The first century starts at 0001-01-01 00:00:00 AD, although
they did not know it at the time. This definition applies to all
Gregorian calendar countries. There is no century number 0,
you go from -1 century to 1 century.
If you disagree with this, please write your complaint to:
Pope, Cathedral Saint-Peter of Roma, Vatican.
day
For timestamp values, the day (of the month) field
(1 - 31); for interval values, the number of days
SELECT EXTRACT(DAY FROM TIMESTAMP '2001-02-16 20:38:40');
Result: 16
SELECT EXTRACT(DAY FROM INTERVAL '40 days 1 minute');
Result: 40

decade
The year field divided by 10
SELECT EXTRACT(DECADE FROM TIMESTAMP '2001-02-16 20:38:40');
Result: 200

dow
The day of the week as Sunday (0) to
Saturday (6)
SELECT EXTRACT(DOW FROM TIMESTAMP '2001-02-16 20:38:40');
Result: 5
Note that extract's day of the week numbering
differs from that of the to_char(...,
'D') function.
doy
The day of the year (1 - 365/366)
SELECT EXTRACT(DOY FROM TIMESTAMP '2001-02-16 20:38:40');
Result: 47

epoch
For timestamp with time zone values, the
number of seconds since 1970-01-01 00:00:00 UTC (can be negative);
for date and timestamp values, the
number of seconds since 1970-01-01 00:00:00 local time;
for interval values, the total number
of seconds in the interval
SELECT EXTRACT(EPOCH FROM TIMESTAMP WITH TIME ZONE '2001-02-16 20:38:40.12-08');
Result: 982384720.12
SELECT EXTRACT(EPOCH FROM INTERVAL '5 days 3 hours');
Result: 442800
You can convert an epoch value back to a time stamp
with to_timestamp:
SELECT to_timestamp(982384720.12);
Result: 2001-02-17 04:38:40.12+00

hour
The hour field (0 - 23)
SELECT EXTRACT(HOUR FROM TIMESTAMP '2001-02-16 20:38:40');
Result: 20

isodow
The day of the week as Monday (1) to
Sunday (7)
SELECT EXTRACT(ISODOW FROM TIMESTAMP '2001-02-18 20:38:40');
Result: 7
This is identical to dow except for Sunday. This
matches the ISO 8601 day of the week numbering.
isoyear
The ISO 8601 week-numbering year that the date
falls in (not applicable to intervals)
SELECT EXTRACT(ISOYEAR FROM DATE '2006-01-01');
Result: 2005
SELECT EXTRACT(ISOYEAR FROM DATE '2006-01-02');
Result: 2006
Each ISO 8601 week-numbering year begins with the
Monday of the week containing the 4th of January, so in early
January or late December the ISO year may be
different from the Gregorian year. See the week
field for more information.
This field is not available in PostgreSQL releases prior to 8.3.
microseconds
The seconds field, including fractional parts, multiplied by
1,000,000; note that this includes full seconds
SELECT EXTRACT(MICROSECONDS FROM TIME '17:12:28.5');
Result: 28500000

millennium
The millennium
SELECT EXTRACT(MILLENNIUM FROM TIMESTAMP '2001-02-16 20:38:40');
Result: 3
Years in the 1900s are in the second millennium.
The third millennium started January 1, 2001.
milliseconds
The seconds field, including fractional parts, multiplied by
1000. Note that this includes full seconds.
SELECT EXTRACT(MILLISECONDS FROM TIME '17:12:28.5');
Result: 28500

minute
The minutes field (0 - 59)
SELECT EXTRACT(MINUTE FROM TIMESTAMP '2001-02-16 20:38:40');
Result: 38

month
For timestamp values, the number of the month
within the year (1 - 12); for interval values,
the number of months, modulo 12 (0 - 11)
SELECT EXTRACT(MONTH FROM TIMESTAMP '2001-02-16 20:38:40');
Result: 2
SELECT EXTRACT(MONTH FROM INTERVAL '2 years 3 months');
Result: 3
SELECT EXTRACT(MONTH FROM INTERVAL '2 years 13 months');
Result: 1

quarter
The quarter of the year (1 - 4) that the date is in
SELECT EXTRACT(QUARTER FROM TIMESTAMP '2001-02-16 20:38:40');
Result: 1

second
The seconds field, including fractional parts (0 - 59, or 60 if
leap seconds are implemented by the operating system)
SELECT EXTRACT(SECOND FROM TIMESTAMP '2001-02-16 20:38:40');
Result: 40
SELECT EXTRACT(SECOND FROM TIME '17:12:28.5');
Result: 28.5

timezone
The time zone offset from UTC, measured in seconds. Positive values
correspond to time zones east of UTC, negative values to
zones west of UTC. (Technically,
PostgreSQL does not use UTC because
leap seconds are not handled.)
timezone_hour
The hour component of the time zone offset
timezone_minute
The minute component of the time zone offset
week
The number of the ISO 8601 week-numbering week of
the year. By definition, ISO weeks start on Mondays and the first
week of a year contains January 4 of that year. In other words, the
first Thursday of a year is in week 1 of that year.
In the ISO week-numbering system, it is possible for early-January
dates to be part of the 52nd or 53rd week of the previous year, and for
late-December dates to be part of the first week of the next year.
For example, 2005-01-01 is part of the 53rd week of year
2004, and 2006-01-01 is part of the 52nd week of year
2005, while 2012-12-31 is part of the first week of 2013.
It's recommended to use the isoyear field together with
week to get consistent results.
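For instance, for the first example date mentioned above, isoyear and week together identify the ISO week unambiguously:

```sql
SELECT extract(isoyear from date '2005-01-01');  -- 2004
SELECT extract(week    from date '2005-01-01');  -- 53
```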
SELECT EXTRACT(WEEK FROM TIMESTAMP '2001-02-16 20:38:40');
Result: 7

year
The year field. Keep in mind there is no 0 AD, so subtracting
BC years from AD years should be done with care.
SELECT EXTRACT(YEAR FROM TIMESTAMP '2001-02-16 20:38:40');
Result: 2001
When the input value is +/-Infinity, extract returns
+/-Infinity for monotonically-increasing fields (epoch,
julian, year, isoyear,
decade, century, and millennium).
For other fields, NULL is returned. PostgreSQL
versions before 9.6 returned zero for all cases of infinite input.
The extract function is primarily intended
for computational processing. For formatting date/time values for
display, see .
The date_part function is modeled on the traditional
Ingres equivalent to the
SQL-standard function extract:
date_part('field', source)
Note that here the field parameter needs to
be a string value, not a name. The valid field names for
date_part are the same as for
extract.
SELECT date_part('day', TIMESTAMP '2001-02-16 20:38:40');
Result: 16
SELECT date_part('hour', INTERVAL '4 hours 3 minutes');
Result: 4

date_trunc
The function date_trunc is conceptually
similar to the trunc function for numbers.
date_trunc(field, source [, time_zone ])
source is a value expression of type
timestamp, timestamp with time zone,
or interval.
(Values of type date and
time are cast automatically to timestamp or
interval, respectively.)
field selects to which precision to
truncate the input value. The return value is likewise of type
timestamp, timestamp with time zone,
or interval,
and it has all fields that are less significant than the
selected one set to zero (or one, for day and month).
Valid values for field are:
microsecondsmillisecondssecondminutehourdayweekmonthquarteryeardecadecenturymillennium
When the input value is of type timestamp with time zone,
the truncation is performed with respect to a particular time zone;
for example, truncation to day produces a value that
is midnight in that zone. By default, truncation is done with respect
to the current setting, but the
optional time_zone argument can be provided
to specify a different time zone. The time zone name can be specified
in any of the ways described in .
A time zone cannot be specified when processing timestamp without
time zone or interval inputs. These are always
taken at face value.
Examples (assuming the local time zone is America/New_York):
SELECT date_trunc('hour', TIMESTAMP '2001-02-16 20:38:40');
Result: 2001-02-16 20:00:00
SELECT date_trunc('year', TIMESTAMP '2001-02-16 20:38:40');
Result: 2001-01-01 00:00:00
SELECT date_trunc('day', TIMESTAMP WITH TIME ZONE '2001-02-16 20:38:40+00');
Result: 2001-02-16 00:00:00-05
SELECT date_trunc('day', TIMESTAMP WITH TIME ZONE '2001-02-16 20:38:40+00', 'Australia/Sydney');
Result: 2001-02-16 08:00:00-05
SELECT date_trunc('hour', INTERVAL '3 days 02:47:33');
Result: 3 days 02:00:00

AT TIME ZONE
The AT TIME ZONE construct converts time
stamp without time zone to/from
time stamp with time zone, and converts
time values to different time zones. shows its variants.
AT TIME ZONE Variants

timestamp without time zone AT TIME ZONE zone → timestamp with time zone
  Treat given time stamp without time zone as located in the specified time zone

timestamp with time zone AT TIME ZONE zone → timestamp without time zone
  Convert given time stamp with time zone to the new time zone, with no time zone designation

time with time zone AT TIME ZONE zone → time with time zone
  Convert given time with time zone to the new time zone
In these expressions, the desired time zone zone can be
specified either as a text string (e.g., 'America/Los_Angeles')
or as an interval (e.g., INTERVAL '-08:00').
In the text case, a time zone name can be specified in any of the ways
described in .
Examples (assuming the local time zone is America/Los_Angeles):
SELECT TIMESTAMP '2001-02-16 20:38:40' AT TIME ZONE 'America/Denver';
Result: 2001-02-16 19:38:40-08
SELECT TIMESTAMP WITH TIME ZONE '2001-02-16 20:38:40-05' AT TIME ZONE 'America/Denver';
Result: 2001-02-16 18:38:40
SELECT TIMESTAMP '2001-02-16 20:38:40-05' AT TIME ZONE 'Asia/Tokyo' AT TIME ZONE 'America/Chicago';
Result: 2001-02-16 05:38:40
The first example adds a time zone to a value that lacks it, and
displays the value using the current TimeZone
setting. The second example shifts the time stamp with time zone value
to the specified time zone, and returns the value without a time zone.
This allows storage and display of values different from the current
TimeZone setting. The third example converts
Tokyo time to Chicago time. Converting time
values to other time zones uses the currently active time zone rules
since no date is supplied.
The function timezone(zone,
timestamp) is equivalent to the SQL-conforming construct
timestamp AT TIME ZONE
zone.
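So, per this equivalence, the first AT TIME ZONE example above can also be written as a function call:

```sql
-- These two expressions produce the same result:
SELECT timestamp '2001-02-16 20:38:40' AT TIME ZONE 'America/Denver';
SELECT timezone('America/Denver', timestamp '2001-02-16 20:38:40');
```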
Current Date/Time

PostgreSQL provides a number of functions
that return values related to the current date and time. These
SQL-standard functions all return values based on the start time of
the current transaction:
CURRENT_DATE
CURRENT_TIME
CURRENT_TIMESTAMP
CURRENT_TIME(precision)
CURRENT_TIMESTAMP(precision)
LOCALTIME
LOCALTIMESTAMP
LOCALTIME(precision)
LOCALTIMESTAMP(precision)
CURRENT_TIME and
CURRENT_TIMESTAMP deliver values with time zone;
LOCALTIME and
LOCALTIMESTAMP deliver values without time zone.
CURRENT_TIME,
CURRENT_TIMESTAMP,
LOCALTIME, and
LOCALTIMESTAMP
can optionally take
a precision parameter, which causes the result to be rounded
to that many fractional digits in the seconds field. Without a precision parameter,
the result is given to the full available precision.
Some examples:
SELECT CURRENT_TIME;
Result: 14:39:53.662522-05
SELECT CURRENT_DATE;
Result: 2001-12-23
SELECT CURRENT_TIMESTAMP;
Result: 2001-12-23 14:39:53.662522-05
SELECT CURRENT_TIMESTAMP(2);
Result: 2001-12-23 14:39:53.66-05
SELECT LOCALTIMESTAMP;
Result: 2001-12-23 14:39:53.662522
Since these functions return
the start time of the current transaction, their values do not
change during the transaction. This is considered a feature:
the intent is to allow a single transaction to have a consistent
notion of the current time, so that multiple
modifications within the same transaction bear the same
time stamp.
Other database systems might advance these values more
frequently.
PostgreSQL also provides functions that
return the start time of the current statement, as well as the actual
current time at the instant the function is called. The complete list
of non-SQL-standard time functions is:
transaction_timestamp()
statement_timestamp()
clock_timestamp()
timeofday()
now()
transaction_timestamp() is equivalent to
CURRENT_TIMESTAMP, but is named to clearly reflect
what it returns.
statement_timestamp() returns the start time of the current
statement (more specifically, the time of receipt of the latest command
message from the client).
statement_timestamp() and transaction_timestamp()
return the same value during the first command of a transaction, but might
differ during subsequent commands.
clock_timestamp() returns the actual current time, and
therefore its value changes even within a single SQL command.
timeofday() is a historical
PostgreSQL function. Like
clock_timestamp(), it returns the actual current time,
but as a formatted text string rather than a timestamp
with time zone value.
now() is a traditional PostgreSQL
equivalent to transaction_timestamp().
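A minimal sketch of these relationships inside a transaction (the comparisons follow from the definitions above):

```sql
BEGIN;
SELECT now() = transaction_timestamp();              -- true: same function
SELECT pg_sleep(0.1);
SELECT statement_timestamp() >= transaction_timestamp();  -- true
SELECT clock_timestamp() > transaction_timestamp();  -- true: real time has advanced
COMMIT;
```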
All the date/time data types also accept the special literal value
now to specify the current date and time (again,
interpreted as the transaction start time). Thus,
the following three all return the same result:
SELECT CURRENT_TIMESTAMP;
SELECT now();
SELECT TIMESTAMP 'now'; -- incorrect for use with DEFAULT
You do not want to use the third form when specifying a DEFAULT
clause while creating a table. The system will convert now
to a timestamp as soon as the constant is parsed, so that when
the default value is needed,
the time of the table creation would be used! The first two
forms will not be evaluated until the default value is used,
because they are function calls. Thus they will give the desired
behavior of defaulting to the time of row insertion.
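A sketch of the pitfall, using a hypothetical table named events (the name and columns are illustrative, not from the text):

```sql
CREATE TABLE events (
    id      serial PRIMARY KEY,
    created timestamptz DEFAULT now()  -- evaluated at each row insertion (desired)
    -- created timestamptz DEFAULT timestamp 'now'
    --   would be frozen at the moment CREATE TABLE was parsed (undesired)
);
```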
Delaying Execution
The following functions are available to delay execution of the server
process:
pg_sleep(seconds)
pg_sleep_for(interval)
pg_sleep_until(timestamp with time zone)
pg_sleep makes the current session's process
sleep until seconds seconds have
elapsed. seconds is a value of type
double precision, so fractional-second delays can be specified.
pg_sleep_for is a convenience function for larger
sleep times specified as an interval.
pg_sleep_until is a convenience function for when
a specific wake-up time is desired.
For example:
SELECT pg_sleep(1.5);
SELECT pg_sleep_for('5 minutes');
SELECT pg_sleep_until('tomorrow 03:00');
The effective resolution of the sleep interval is platform-specific;
0.01 seconds is a common value. The sleep delay will be at least as long
as specified. It might be longer depending on factors such as server load.
In particular, pg_sleep_until is not guaranteed to
wake up exactly at the specified time, but it will not wake up any earlier.
Make sure that your session does not hold more locks than necessary
when calling pg_sleep or its variants. Otherwise
other sessions might have to wait for your sleeping process, slowing down
the entire system.
Enum Support Functions
For enum types (described in ),
there are several functions that allow cleaner programming without
hard-coding particular values of an enum type.
These are listed in . The examples
assume an enum type created as:
CREATE TYPE rainbow AS ENUM ('red', 'orange', 'yellow', 'green', 'blue', 'purple');
Enum Support Functions

enum_first(anyenum)
  Returns the first value of the input enum type
  Example: enum_first(null::rainbow) → red

enum_last(anyenum)
  Returns the last value of the input enum type
  Example: enum_last(null::rainbow) → purple

enum_range(anyenum)
  Returns all values of the input enum type in an ordered array
  Example: enum_range(null::rainbow) → {red,orange,yellow,green,blue,purple}

enum_range(anyenum, anyenum)
  Returns the range between the two given enum values, as an ordered
  array. The values must be from the same enum type. If the first
  parameter is null, the result will start with the first value of
  the enum type. If the second parameter is null, the result will end
  with the last value of the enum type.
  Examples:
    enum_range('orange'::rainbow, 'green'::rainbow) → {orange,yellow,green}
    enum_range(NULL, 'green'::rainbow) → {red,orange,yellow,green}
    enum_range('orange'::rainbow, NULL) → {orange,yellow,green,blue,purple}
Notice that except for the two-argument form of enum_range,
these functions disregard the specific value passed to them; they care
only about its declared data type. Either null or a specific value of
the type can be passed, with the same result. It is more common to
apply these functions to a table column or function argument than to
a hardwired type name as suggested by the examples.
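A sketch of that more common usage, with a hypothetical paints table (the table and data are illustrative, not from the text; only the column's declared type matters):

```sql
CREATE TABLE paints (name text, color rainbow);
INSERT INTO paints VALUES ('sky', 'blue');

-- All enum values up to and including each row's color:
SELECT name, enum_range(NULL, color) FROM paints;
-- sky | {red,orange,yellow,green,blue}
```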
Geometric Functions and Operators
The geometric types point, box,
lseg, line, path,
polygon, and circle have a large set of
native support functions and operators, shown in , , and .
Note that the same as operator, ~=, represents
the usual notion of equality for the point,
box, polygon, and circle types.
Some of these types also have an = operator, but
= compares
for equal areas only. The other scalar comparison operators
(<= and so on) likewise compare areas for these types.
Geometric Operators

  +     Translation                                        box '((0,0),(1,1))' + point '(2.0,0)'
  -     Translation                                        box '((0,0),(1,1))' - point '(2.0,0)'
  *     Scaling/rotation                                   box '((0,0),(1,1))' * point '(2.0,0)'
  /     Scaling/rotation                                   box '((0,0),(2,2))' / point '(2.0,0)'
  #     Point or box of intersection                       box '((1,-1),(-1,1))' # box '((1,1),(-2,-2))'
  #     Number of points in path or polygon                # path '((1,0),(0,1),(-1,0))'
  @-@   Length or circumference                            @-@ path '((0,0),(1,0))'
  @@    Center                                             @@ circle '((0,0),10)'
  ##    Closest point to first operand on second operand   point '(0,0)' ## lseg '((2,0),(0,2))'
  <->   Distance between                                   circle '((0,0),1)' <-> circle '((5,0),1)'
  &&    Overlaps? (One point in common makes this true.)   box '((0,0),(1,1))' && box '((0,0),(2,2))'
  <<    Is strictly left of?                               circle '((0,0),1)' << circle '((5,0),1)'
  >>    Is strictly right of?                              circle '((5,0),1)' >> circle '((0,0),1)'
  &<    Does not extend to the right of?                   box '((0,0),(1,1))' &< box '((0,0),(2,2))'
  &>    Does not extend to the left of?                    box '((0,0),(3,3))' &> box '((0,0),(2,2))'
  <<|   Is strictly below?                                 box '((0,0),(3,3))' <<| box '((3,4),(5,5))'
  |>>   Is strictly above?                                 box '((3,4),(5,5))' |>> box '((0,0),(3,3))'
  &<|   Does not extend above?                             box '((0,0),(1,1))' &<| box '((0,0),(2,2))'
  |&>   Does not extend below?                             box '((0,0),(3,3))' |&> box '((0,0),(2,2))'
  <^    Is below (allows touching)?                        circle '((0,0),1)' <^ circle '((0,5),1)'
  >^    Is above (allows touching)?                        circle '((0,5),1)' >^ circle '((0,0),1)'
  ?#    Intersects?                                        lseg '((-1,0),(1,0))' ?# box '((-2,-2),(2,2))'
  ?-    Is horizontal?                                     ?- lseg '((-1,0),(1,0))'
  ?-    Are horizontally aligned?                          point '(1,0)' ?- point '(0,0)'
  ?|    Is vertical?                                       ?| lseg '((-1,0),(1,0))'
  ?|    Are vertically aligned?                            point '(0,1)' ?| point '(0,0)'
  ?-|   Is perpendicular?                                  lseg '((0,0),(0,1))' ?-| lseg '((0,0),(1,0))'
  ?||   Are parallel?                                      lseg '((-1,0),(1,0))' ?|| lseg '((-1,2),(1,2))'
  @>    Contains?                                          circle '((0,0),2)' @> point '(1,1)'
  <@    Contained in or on?                                point '(1,1)' <@ circle '((0,0),2)'
  ~=    Same as?                                           polygon '((0,0),(1,1))' ~= polygon '((1,1),(0,0))'
Before PostgreSQL 8.2, the containment
operators @> and <@ were respectively
called ~ and @. These names are still
available, but are deprecated and will eventually be removed.
Geometric Functions

  area(object) → double precision       area                          area(box '((0,0),(1,1))')
  center(object) → point                center                        center(box '((0,0),(1,2))')
  diameter(circle) → double precision   diameter of circle            diameter(circle '((0,0),2.0)')
  height(box) → double precision        vertical size of box          height(box '((0,0),(1,1))')
  isclosed(path) → boolean              a closed path?                isclosed(path '((0,0),(1,1),(2,0))')
  isopen(path) → boolean                an open path?                 isopen(path '[(0,0),(1,1),(2,0)]')
  length(object) → double precision     length                        length(path '((-1,0),(1,0))')
  npoints(path) → int                   number of points              npoints(path '[(0,0),(1,1),(2,0)]')
  npoints(polygon) → int                number of points              npoints(polygon '((1,1),(0,0))')
  pclose(path) → path                   convert path to closed        pclose(path '[(0,0),(1,1),(2,0)]')
  popen(path) → path                    convert path to open          popen(path '((0,0),(1,1),(2,0))')
  radius(circle) → double precision     radius of circle              radius(circle '((0,0),2.0)')
  width(box) → double precision         horizontal size of box        width(box '((0,0),(1,1))')
Geometric Type Conversion Functions

 Function                                  | Return Type | Description                  | Example
 box(circle)                               | box         | circle to box                | box(circle '((0,0),2.0)')
 box(point)                                | box         | point to empty box           | box(point '(0,0)')
 box(point, point)                         | box         | points to box                | box(point '(0,0)', point '(1,1)')
 box(polygon)                              | box         | polygon to box               | box(polygon '((0,0),(1,1),(2,0))')
 bound_box(box, box)                       | box         | boxes to bounding box        | bound_box(box '((0,0),(1,1))', box '((3,3),(4,4))')
 circle(box)                               | circle      | box to circle                | circle(box '((0,0),(1,1))')
 circle(point, double precision)           | circle      | center and radius to circle  | circle(point '(0,0)', 2.0)
 circle(polygon)                           | circle      | polygon to circle            | circle(polygon '((0,0),(1,1),(2,0))')
 line(point, point)                        | line        | points to line               | line(point '(-1,0)', point '(1,0)')
 lseg(box)                                 | lseg        | box diagonal to line segment | lseg(box '((-1,0),(1,0))')
 lseg(point, point)                        | lseg        | points to line segment       | lseg(point '(-1,0)', point '(1,0)')
 path(polygon)                             | path        | polygon to path              | path(polygon '((0,0),(1,1),(2,0))')
 point(double precision, double precision) | point       | construct point              | point(23.4, -44.5)
 point(box)                                | point       | center of box                | point(box '((-1,0),(1,0))')
 point(circle)                             | point       | center of circle             | point(circle '((0,0),2.0)')
 point(lseg)                               | point       | center of line segment       | point(lseg '((-1,0),(1,0))')
 point(polygon)                            | point       | center of polygon            | point(polygon '((0,0),(1,1),(2,0))')
 polygon(box)                              | polygon     | box to 4-point polygon       | polygon(box '((0,0),(1,1))')
 polygon(circle)                           | polygon     | circle to 12-point polygon   | polygon(circle '((0,0),2.0)')
 polygon(npts, circle)                     | polygon     | circle to npts-point polygon | polygon(12, circle '((0,0),2.0)')
 polygon(path)                             | polygon     | path to polygon              | polygon(path '((0,0),(1,1),(2,0))')
It is possible to access the two component numbers of a point
as though the point were an array with indexes 0 and 1. For example, if
t.p is a point column then
SELECT p[0] FROM t retrieves the X coordinate and
UPDATE t SET p[1] = ... changes the Y coordinate.
In the same way, a value of type box or lseg can be treated
as an array of two point values.
The area function works for the types
box, circle, and path.
The area function only works on the
path data type if the points in the
path are non-intersecting. For example, the path
'((0,0),(0,1),(2,1),(2,2),(1,2),(1,0),(0,0))'::PATH
will not work; however, the following visually identical path
'((0,0),(0,1),(1,1),(1,2),(2,2),(2,1),(1,1),(1,0),(0,0))'::PATH
will work. If the concept of an intersecting versus
non-intersecting path is confusing, draw both of the
above paths side by side on a piece of graph paper.
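To see why self-intersection matters, consider the shoelace formula, the standard way to compute the area of a closed path from its vertices (a Python sketch for illustration, not PostgreSQL's implementation). On a self-intersecting path the signed partial areas cancel, which is why such paths give surprising results:

```python
def shoelace_area(points):
    """Absolute area of a closed path given as a list of (x, y) vertices."""
    n = len(points)
    s = 0.0
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]  # wrap around to close the path
        s += x1 * y2 - x2 * y1
    return abs(s) / 2.0

# The non-intersecting path from the text (two unit squares touching at (1,1)):
good = [(0, 0), (0, 1), (1, 1), (1, 2), (2, 2), (2, 1), (1, 1), (1, 0)]
# The visually identical but self-intersecting path:
bad = [(0, 0), (0, 1), (2, 1), (2, 2), (1, 2), (1, 0)]

print(shoelace_area(good))  # 2.0
print(shoelace_area(bad))   # 0.0 -- the signed areas cancel
```

The second path traces the same region on paper, yet its crossing segments make the signed terms cancel to zero.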
Network Address Functions and Operators
The following table shows the operators
available for the cidr and inet types.
The operators <<,
<<=, >>,
>>=, and &&
test for subnet inclusion. They
consider only the network parts of the two addresses (ignoring any
host part) and determine whether one network is identical to
or a subnet of the other.
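Python's standard ipaddress module implements the same network-part comparison, so it can serve as a sketch of these operator semantics (an analogue for illustration, not PostgreSQL code):

```python
import ipaddress

a = ipaddress.ip_network('192.168.1.0/24')
b = ipaddress.ip_network('192.168.1.80/28')

print(b.subnet_of(a))             # like <<= : b is contained by or equals a
print(b.subnet_of(a) and a != b)  # like <<  : b is strictly contained in a
print(a.supernet_of(b))           # like >>= : a contains or equals b
print(a.overlaps(b))              # like &&  : contains or is contained by
```

All four print True here, since /28 is a proper subnet of /24; only the network prefixes take part in the comparison, just as described above.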
cidr and inet Operators

 Operator | Description                 | Example
 <        | is less than                | inet '192.168.1.5' < inet '192.168.1.6'
 <=       | is less than or equal       | inet '192.168.1.5' <= inet '192.168.1.5'
 =        | equals                      | inet '192.168.1.5' = inet '192.168.1.5'
 >=       | is greater or equal         | inet '192.168.1.5' >= inet '192.168.1.5'
 >        | is greater than             | inet '192.168.1.5' > inet '192.168.1.4'
 <>       | is not equal                | inet '192.168.1.5' <> inet '192.168.1.4'
 <<       | is contained by             | inet '192.168.1.5' << inet '192.168.1/24'
 <<=      | is contained by or equals   | inet '192.168.1/24' <<= inet '192.168.1/24'
 >>       | contains                    | inet '192.168.1/24' >> inet '192.168.1.5'
 >>=      | contains or equals          | inet '192.168.1/24' >>= inet '192.168.1/24'
 &&       | contains or is contained by | inet '192.168.1/24' && inet '192.168.1.80/28'
 ~        | bitwise NOT                 | ~ inet '192.168.1.6'
 &        | bitwise AND                 | inet '192.168.1.6' & inet '0.0.0.255'
 |        | bitwise OR                  | inet '192.168.1.6' | inet '0.0.0.255'
 +        | addition                    | inet '192.168.1.6' + 25
 -        | subtraction                 | inet '192.168.1.43' - 36
 -        | subtraction                 | inet '192.168.1.43' - inet '192.168.1.19'
The following table shows the functions
available for use with the cidr and inet
types. The abbrev, host,
and text
functions are primarily intended to offer alternative display
formats.
cidr and inet Functions

 Function                     | Return Type | Description                                                   | Example                                       | Result
 abbrev(inet)                 | text        | abbreviated display format as text                            | abbrev(inet '10.1.0.0/16')                    | 10.1.0.0/16
 abbrev(cidr)                 | text        | abbreviated display format as text                            | abbrev(cidr '10.1.0.0/16')                    | 10.1/16
 broadcast(inet)              | inet        | broadcast address for network                                 | broadcast('192.168.1.5/24')                   | 192.168.1.255/24
 family(inet)                 | int         | extract family of address; 4 for IPv4, 6 for IPv6             | family('::1')                                 | 6
 host(inet)                   | text        | extract IP address as text                                    | host('192.168.1.5/24')                        | 192.168.1.5
 hostmask(inet)               | inet        | construct host mask for network                               | hostmask('192.168.23.20/30')                  | 0.0.0.3
 masklen(inet)                | int         | extract netmask length                                        | masklen('192.168.1.5/24')                     | 24
 netmask(inet)                | inet        | construct netmask for network                                 | netmask('192.168.1.5/24')                     | 255.255.255.0
 network(inet)                | cidr        | extract network part of address                               | network('192.168.1.5/24')                     | 192.168.1.0/24
 set_masklen(inet, int)       | inet        | set netmask length for inet value                             | set_masklen('192.168.1.5/24', 16)             | 192.168.1.5/16
 set_masklen(cidr, int)       | cidr        | set netmask length for cidr value                             | set_masklen('192.168.1.0/24'::cidr, 16)       | 192.168.0.0/16
 text(inet)                   | text        | extract IP address and netmask length as text                 | text(inet '192.168.1.5')                      | 192.168.1.5/32
 inet_same_family(inet, inet) | boolean     | are the addresses from the same family?                       | inet_same_family('192.168.1.5/24', '::1')     | false
 inet_merge(inet, inet)       | cidr        | the smallest network which includes both of the given networks | inet_merge('192.168.1.5/24', '192.168.2.5/24') | 192.168.0.0/22
Any cidr value can be cast to inet implicitly
or explicitly; therefore, the functions shown above as operating on
inet also work on cidr values. (Where there are
separate functions for inet and cidr, it is because
the behavior should be different for the two cases.)
Also, it is permitted to cast an inet value to cidr.
When this is done, any bits to the right of the netmask are silently zeroed
to create a valid cidr value.
In addition,
you can cast a text value to inet or cidr
using normal casting syntax: for example,
inet(expression) or
colname::cidr.
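The effect of the inet-to-cidr cast, zeroing any bits to the right of the netmask, can be sketched with Python's ipaddress module (an analogue for illustration, not what PostgreSQL runs internally):

```python
import ipaddress

# like inet '192.168.1.5/24': an address plus a netmask length
iface = ipaddress.ip_interface('192.168.1.5/24')

# like a cast to cidr: host bits beyond the netmask are silently zeroed
net = iface.network
print(net)  # 192.168.1.0/24
```

The host part (.5) disappears because a cidr value may only carry network bits, exactly as described above.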
The following table shows the functions
available for use with the macaddr type. The function
trunc(macaddr) returns a MAC
address with the last 3 bytes set to zero. This can be used to
associate the remaining prefix with a manufacturer.
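What trunc does is simple enough to sketch in a few lines of Python: keep the 3-byte OUI (the manufacturer prefix) and zero the remaining 3 bytes. This is an illustration of the documented behavior, not PostgreSQL's own code:

```python
def trunc_macaddr(mac: str) -> str:
    """Zero the last 3 bytes of a colon-separated MAC address."""
    parts = mac.split(':')
    return ':'.join(parts[:3] + ['00'] * 3)

print(trunc_macaddr('12:34:56:78:90:ab'))  # 12:34:56:00:00:00
```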
macaddr Functions

 Function       | Return Type | Description              | Example                            | Result
 trunc(macaddr) | macaddr     | set last 3 bytes to zero | trunc(macaddr '12:34:56:78:90:ab') | 12:34:56:00:00:00
The macaddr type also supports the standard relational
operators (>, <=, etc.) for
lexicographical ordering, and the bitwise arithmetic operators
(~, & and |)
for NOT, AND and OR.
The following table shows the functions
available for use with the macaddr8 type. The function
trunc(macaddr8) returns a MAC
address with the last 5 bytes set to zero. This can be used to
associate the remaining prefix with a manufacturer.
macaddr8 Functions

 Function                   | Return Type | Description                                                                         | Example                                        | Result
 trunc(macaddr8)            | macaddr8    | set last 5 bytes to zero                                                            | trunc(macaddr8 '12:34:56:78:90:ab:cd:ef')      | 12:34:56:00:00:00:00:00
 macaddr8_set7bit(macaddr8) | macaddr8    | set 7th bit to one, also known as modified EUI-64, for inclusion in an IPv6 address | macaddr8_set7bit(macaddr8 '00:34:56:ab:cd:ef') | 02:34:56:ff:fe:ab:cd:ef
The macaddr8 type also supports the standard relational
operators (>, <=, etc.) for
ordering, and the bitwise arithmetic operators (~,
& and |) for NOT, AND and OR.
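The modified EUI-64 construction behind the macaddr8_set7bit example can be sketched in Python (an illustration of the standard conversion, not PostgreSQL code): casting a 48-bit MAC to macaddr8 inserts ff:fe in the middle, and macaddr8_set7bit then sets the universal/local bit (the 7th bit of the first byte):

```python
def mac48_to_modified_eui64(mac: str) -> str:
    """Build a modified EUI-64 identifier from a 48-bit MAC address."""
    b = [int(p, 16) for p in mac.split(':')]
    b = b[:3] + [0xFF, 0xFE] + b[3:]  # MAC-48 -> EUI-64: insert ff:fe
    b[0] |= 0x02                      # set7bit: set the universal/local bit
    return ':'.join(f'{x:02x}' for x in b)

print(mac48_to_modified_eui64('00:34:56:ab:cd:ef'))  # 02:34:56:ff:fe:ab:cd:ef
```

This matches the table's example result and is the form used in the low 64 bits of an IPv6 address derived from a MAC.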
Text Search Functions and Operatorsfull text searchfunctions and operatorstext searchfunctions and operators
The tables below
summarize the functions and operators that are provided
for full text searching. See the chapter on full text search for a detailed
explanation of PostgreSQL's text search
facility.
Text Search OperatorsOperatorReturn TypeDescriptionExampleResult@@booleantsvector matches tsquery ?to_tsvector('fat cats ate rats') @@ to_tsquery('cat & rat')t@@@booleandeprecated synonym for @@to_tsvector('fat cats ate rats') @@@ to_tsquery('cat & rat')t||tsvectorconcatenate tsvectors'a:1 b:2'::tsvector || 'c:1 d:2 b:3'::tsvector'a':1 'b':2,5 'c':3 'd':4&&tsqueryAND tsquerys together'fat | rat'::tsquery && 'cat'::tsquery( 'fat' | 'rat' ) & 'cat'||tsqueryOR tsquerys together'fat | rat'::tsquery || 'cat'::tsquery( 'fat' | 'rat' ) | 'cat'!!tsquerynegate a tsquery!! 'cat'::tsquery!'cat'<->tsquerytsquery followed by tsqueryto_tsquery('fat') <-> to_tsquery('rat')'fat' <-> 'rat'@>booleantsquery contains another ?'cat'::tsquery @> 'cat & rat'::tsqueryf<@booleantsquery is contained in ?'cat'::tsquery <@ 'cat & rat'::tsqueryt
The tsquery containment operators consider only the lexemes
listed in the two queries, ignoring the combining operators.
In addition to the operators shown in the table, the ordinary B-tree
comparison operators (=, <, etc) are defined
for types tsvector and tsquery. These are not very
useful for text searching but allow, for example, unique indexes to be
built on columns of these types.
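As a mental model of what a tsvector and the @@ operator represent, consider this toy sketch (it ignores stemming, stop words, weights, and everything else PostgreSQL's real text search does): a vector maps lexemes to their positions, and a simple AND query matches when every lexeme is present.

```python
def to_tsvector_toy(words):
    """Toy tsvector: map each word to its list of positions (1-based)."""
    vec = {}
    for pos, w in enumerate(words, start=1):
        vec.setdefault(w, []).append(pos)
    return vec

def matches_and(vec, lexemes):
    """Toy @@ for an AND-only query: all lexemes must appear."""
    return all(lex in vec for lex in lexemes)

vec = to_tsvector_toy(['fat', 'cat', 'ate', 'rat'])
print(matches_and(vec, ['cat', 'rat']))  # True, like ... @@ 'cat & rat'
print(matches_and(vec, ['cat', 'dog']))  # False
```

The real to_tsvector would also normalize words ('cats' becomes 'cat'), which is why the documented examples match despite plural forms in the input.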
Text Search FunctionsFunctionReturn TypeDescriptionExampleResultarray_to_tsvectorarray_to_tsvector(text[])tsvectorconvert array of lexemes to tsvectorarray_to_tsvector('{fat,cat,rat}'::text[])'cat' 'fat' 'rat'get_current_ts_configget_current_ts_config()regconfigget default text search configurationget_current_ts_config()englishlengthlength(tsvector)integernumber of lexemes in tsvectorlength('fat:2,4 cat:3 rat:5A'::tsvector)3numnodenumnode(tsquery)integernumber of lexemes plus operators in tsquery numnode('(fat & rat) | cat'::tsquery)5plainto_tsqueryplainto_tsquery(configregconfig , querytext)tsqueryproduce tsquery ignoring punctuationplainto_tsquery('english', 'The Fat Rats')'fat' & 'rat'phraseto_tsqueryphraseto_tsquery(configregconfig , querytext)tsqueryproduce tsquery that searches for a phrase,
ignoring punctuationphraseto_tsquery('english', 'The Fat Rats')'fat' <-> 'rat'websearch_to_tsquerywebsearch_to_tsquery(configregconfig , querytext)tsqueryproduce tsquery from a web search style querywebsearch_to_tsquery('english', '"fat rat" or rat')'fat' <-> 'rat' | 'rat'querytreequerytree(querytsquery)textget indexable part of a tsqueryquerytree('foo & ! bar'::tsquery)'foo'setweightsetweight(vectortsvector, weight"char")tsvectorassign weight to each element of vectorsetweight('fat:2,4 cat:3 rat:5B'::tsvector, 'A')'cat':3A 'fat':2A,4A 'rat':5Asetweightsetweight for specific lexeme(s)setweight(vectortsvector, weight"char", lexemestext[])tsvectorassign weight to elements of vector that are listed in lexemessetweight('fat:2,4 cat:3 rat:5B'::tsvector, 'A', '{cat,rat}')'cat':3A 'fat':2,4 'rat':5Astripstrip(tsvector)tsvectorremove positions and weights from tsvectorstrip('fat:2,4 cat:3 rat:5A'::tsvector)'cat' 'fat' 'rat'to_tsqueryto_tsquery(configregconfig , querytext)tsquerynormalize words and convert to tsqueryto_tsquery('english', 'The & Fat & Rats')'fat' & 'rat'to_tsvectorto_tsvector(configregconfig , documenttext)tsvectorreduce document text to tsvectorto_tsvector('english', 'The Fat Rats')'fat':2 'rat':3to_tsvector(configregconfig , documentjson(b))tsvector
reduce each string value in the document to a tsvector, and then
concatenate those in document order to produce a single tsvectorto_tsvector('english', '{"a": "The Fat Rats"}'::json)'fat':2 'rat':3json(b)_to_tsvector(configregconfig,
documentjson(b),
filterjson(b))tsvector
reduce each value in the document, specified by filter to a tsvector,
and then concatenate those in document order to produce a single tsvector.
filter is a jsonb array, that enumerates what kind of elements need to be included
into the resulting tsvector. Possible values for filter are
"string" (to include all string values), "numeric" (to include all numeric values in the string format),
"boolean" (to include all Boolean values in the string format "true"/"false"),
"key" (to include all keys) or "all" (to include all above). These values
can be combined together to include, e.g. all string and numeric values.
json_to_tsvector('english', '{"a": "The Fat Rats", "b": 123}'::json, '["string", "numeric"]')'123':5 'fat':2 'rat':3ts_deletets_delete(vectortsvector, lexemetext)tsvectorremove given lexeme from vectorts_delete('fat:2,4 cat:3 rat:5A'::tsvector, 'fat')'cat':3 'rat':5Ats_delete(vectortsvector, lexemestext[])tsvectorremove any occurrence of lexemes in lexemes from vectorts_delete('fat:2,4 cat:3 rat:5A'::tsvector, ARRAY['fat','rat'])'cat':3ts_filterts_filter(vectortsvector, weights"char"[])tsvectorselect only elements with given weights from vectorts_filter('fat:2,4 cat:3b rat:5A'::tsvector, '{a,b}')'cat':3B 'rat':5Ats_headlinets_headline(configregconfig, documenttext, querytsquery, optionstext)textdisplay a query matchts_headline('x y z', 'z'::tsquery)x y <b>z</b>ts_headline(configregconfig, documentjson(b), querytsquery, optionstext)textdisplay a query matchts_headline('{"a":"x y z"}'::json, 'z'::tsquery){"a":"x y <b>z</b>"}ts_rankts_rank(weightsfloat4[], vectortsvector, querytsquery, normalizationinteger)float4rank document for queryts_rank(textsearch, query)0.818ts_rank_cdts_rank_cd(weightsfloat4[], vectortsvector, querytsquery, normalizationinteger)float4rank document for query using cover densityts_rank_cd('{0.1, 0.2, 0.4, 1.0}', textsearch, query)2.01317ts_rewritets_rewrite(querytsquery, targettsquery, substitutetsquery)tsqueryreplace target with substitute
within queryts_rewrite('a & b'::tsquery, 'a'::tsquery, 'foo|bar'::tsquery)'b' & ( 'foo' | 'bar' )ts_rewrite(querytsquery, selecttext)tsqueryreplace using targets and substitutes from a SELECT commandSELECT ts_rewrite('a & b'::tsquery, 'SELECT t,s FROM aliases')'b' & ( 'foo' | 'bar' )tsquery_phrasetsquery_phrase(query1tsquery, query2tsquery)tsquerymake query that searches for query1 followed
by query2 (same as <->
operator)tsquery_phrase(to_tsquery('fat'), to_tsquery('cat'))'fat' <-> 'cat'tsquery_phrase(query1tsquery, query2tsquery, distanceinteger)tsquerymake query that searches for query1 followed by
query2 at distance distancetsquery_phrase(to_tsquery('fat'), to_tsquery('cat'), 10)'fat' <10> 'cat'tsvector_to_arraytsvector_to_array(tsvector)text[]convert tsvector to array of lexemestsvector_to_array('fat:2,4 cat:3 rat:5A'::tsvector){cat,fat,rat}tsvector_update_triggertsvector_update_trigger()triggertrigger function for automatic tsvector column updateCREATE TRIGGER ... tsvector_update_trigger(tsvcol, 'pg_catalog.swedish', title, body)tsvector_update_trigger_columntsvector_update_trigger_column()triggertrigger function for automatic tsvector column updateCREATE TRIGGER ... tsvector_update_trigger_column(tsvcol, configcol, title, body)unnestfor tsvectorunnest(tsvector, OUT lexemetext, OUT positionssmallint[], OUT weightstext)setof recordexpand a tsvector to a set of rowsunnest('fat:2,4 cat:3 rat:5A'::tsvector)(cat,{3},{D}) ...
All the text search functions that accept an optional regconfig
argument will use the configuration specified by
default_text_search_config
when that argument is omitted.
The functions in
are listed separately because they are not usually used in everyday text
searching operations. They are helpful for development and debugging
of new text search configurations.
Text Search Debugging FunctionsFunctionReturn TypeDescriptionExampleResultts_debugts_debug(configregconfig, documenttext, OUT aliastext, OUT descriptiontext, OUT tokentext, OUT dictionariesregdictionary[], OUT dictionaryregdictionary, OUT lexemestext[])setof recordtest a configurationts_debug('english', 'The Brightest supernovaes')(asciiword,"Word, all ASCII",The,{english_stem},english_stem,{}) ...ts_lexizets_lexize(dictregdictionary, tokentext)text[]test a dictionaryts_lexize('english_stem', 'stars'){star}ts_parsets_parse(parser_nametext, documenttext, OUT tokidinteger, OUT tokentext)setof recordtest a parserts_parse('default', 'foo - bar')(1,foo) ...ts_parse(parser_oidoid, documenttext, OUT tokidinteger, OUT tokentext)setof recordtest a parserts_parse(3722, 'foo - bar')(1,foo) ...ts_token_typets_token_type(parser_nametext, OUT tokidinteger, OUT aliastext, OUT descriptiontext)setof recordget token types defined by parserts_token_type('default')(1,asciiword,"Word, all ASCII") ...ts_token_type(parser_oidoid, OUT tokidinteger, OUT aliastext, OUT descriptiontext)setof recordget token types defined by parserts_token_type(3722)(1,asciiword,"Word, all ASCII") ...ts_statts_stat(sqlquerytext, weightstext, OUT wordtext, OUT ndocinteger, OUT nentryinteger)setof recordget statistics of a tsvector columnts_stat('SELECT vector from apod')(foo,10,15) ...
XML Functions
The functions and function-like expressions described in this
section operate on values of type xml. See for information about the xml
type. The function-like expressions xmlparse
and xmlserialize for converting to and from
type xml are documented there, not in this section.
Use of most of these functions
requires PostgreSQL to have been built
with configure --with-libxml.
Producing XML Content
A set of functions and function-like expressions are available for
producing XML content from SQL data. As such, they are
particularly suitable for formatting query results into XML
documents for processing in client applications.
xmlcommentxmlcommentxmlcomment(text)
The function xmlcomment creates an XML value
containing an XML comment with the specified text as content.
The text cannot contain -- or end with a
- so that the resulting construct is a valid
XML comment. If the argument is null, the result is null.
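The validity rule just stated is easy to check directly; here is a small Python sketch of it (an illustration, not how PostgreSQL validates the argument):

```python
def valid_xml_comment_text(text: str) -> bool:
    """An XML comment's content may not contain '--' and may not end with '-'."""
    return '--' not in text and not text.endswith('-')

print(valid_xml_comment_text('hello'))      # True
print(valid_xml_comment_text('a--b'))       # False
print(valid_xml_comment_text('trailing-'))  # False
```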
Example:
SELECT xmlcomment('hello');

  xmlcomment
--------------
 <!--hello-->
xmlconcatxmlconcatxmlconcat(xml, ...)
The function xmlconcat concatenates a list
of individual XML values to create a single value containing an
XML content fragment. Null values are omitted; the result is
only null if there are no nonnull arguments.
Example:
SELECT xmlconcat('<abc/>', '<bar>foo</bar>');

      xmlconcat
----------------------
 <abc/><bar>foo</bar>
XML declarations, if present, are combined as follows. If all
argument values have the same XML version declaration, that
version is used in the result, else no version is used. If all
argument values have the standalone declaration value
yes, then that value is used in the result. If
all argument values have a standalone declaration value and at
least one is no, then that is used in the result.
Else the result will have no standalone declaration. If the
result is determined to require a standalone declaration but no
version declaration, a version declaration with version 1.0 will
be used because XML requires an XML declaration to contain a
version declaration. Encoding declarations are ignored and
removed in all cases.
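The declaration-combining rules above amount to a small algorithm, sketched here in Python. Each input declaration is modeled as a (version, standalone) pair, with None standing for an absent declaration; this is an illustration of the stated rules, not PostgreSQL's implementation:

```python
def combine_declarations(decls):
    """Combine XML declarations per the xmlconcat rules described above."""
    versions = {v for v, _ in decls}
    if len(versions) == 1 and None not in versions:
        version = versions.pop()     # all arguments agree on a version
    else:
        version = None               # else no version is used
    standalones = [s for _, s in decls]
    if standalones and all(s == 'yes' for s in standalones):
        standalone = 'yes'
    elif all(s is not None for s in standalones) and 'no' in standalones:
        standalone = 'no'
    else:
        standalone = None
    if standalone is not None and version is None:
        version = '1.0'              # XML requires a version in any declaration
    return version, standalone

# Mirrors the example below: both declare 1.1, only one declares standalone.
print(combine_declarations([('1.1', None), ('1.1', 'no')]))  # ('1.1', None)
```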
Example:
SELECT xmlconcat('<?xml version="1.1"?><foo/>',
                 '<?xml version="1.1" standalone="no"?><bar/>');

             xmlconcat
-----------------------------------
 <?xml version="1.1"?><foo/><bar/>
xmlelementxmlelementxmlelement(name name, xmlattributes(value AS attname, ... ), content, ...)
The xmlelement expression produces an XML
element with the given name, attributes, and content.
Examples:
SELECT xmlelement(name foo, xmlattributes('xyz' as bar));

    xmlelement
------------------
 <foo bar="xyz"/>

SELECT xmlelement(name foo, xmlattributes(current_date as bar), 'cont', 'ent');

             xmlelement
-------------------------------------
 <foo bar="2007-01-26">content</foo>
Element and attribute names that are not valid XML names are
escaped by replacing the offending characters by the sequence
_xHHHH_, where
HHHH is the character's Unicode
codepoint in hexadecimal notation. For example:
SELECT xmlelement(name "foo$bar", xmlattributes('xyz' as "a&b"));

            xmlelement
----------------------------------
 <foo_x0024_bar a_x0026_b="xyz"/>
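The escaping rule can be sketched in Python. This simplified version treats only ASCII letters, digits, and a few punctuation characters as valid name characters (the real XML name rules are broader), so it is an illustration of the _xHHHH_ scheme rather than an exact reimplementation:

```python
def escape_xml_name(name: str) -> str:
    """Replace characters not allowed in an XML name with _xHHHH_ escapes."""
    out = []
    for ch in name:
        if ch.isalnum() or ch in '_-.':
            out.append(ch)
        else:
            # HHHH is the Unicode code point in hexadecimal
            out.append(f'_x{ord(ch):04X}_')
    return ''.join(out)

print(escape_xml_name('foo$bar'))  # foo_x0024_bar
```

'$' is U+0024, which is how the foo_x0024_bar in the example above arises.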
An explicit attribute name need not be specified if the attribute
value is a column reference, in which case the column's name will
be used as the attribute name by default. In other cases, the
attribute must be given an explicit name. So this example is
valid:
CREATE TABLE test (a xml, b xml);
SELECT xmlelement(name test, xmlattributes(a, b)) FROM test;
But these are not:
SELECT xmlelement(name test, xmlattributes('constant'), a, b) FROM test;
SELECT xmlelement(name test, xmlattributes(func(a, b))) FROM test;
Element content, if specified, will be formatted according to
its data type. If the content is itself of type xml,
complex XML documents can be constructed. For example:
SELECT xmlelement(name foo, xmlattributes('xyz' as bar),
                            xmlelement(name abc),
                            xmlcomment('test'),
                            xmlelement(name xyz));

                  xmlelement
----------------------------------------------
 <foo bar="xyz"><abc/><!--test--><xyz/></foo>
Content of other types will be formatted into valid XML character
data. This means in particular that the characters <, >,
and & will be converted to entities. Binary data (data type
bytea) will be represented in base64 or hex
encoding, depending on the setting of the configuration parameter
. The particular behavior for
individual data types is expected to evolve in order to align the
PostgreSQL mappings with those specified in SQL:2006 and later,
as discussed in .
xmlforestxmlforestxmlforest(contentAS name, ...)
The xmlforest expression produces an XML
forest (sequence) of elements using the given names and content.
Examples:
SELECT xmlforest('abc' AS foo, 123 AS bar);

          xmlforest
------------------------------
 <foo>abc</foo><bar>123</bar>


SELECT xmlforest(table_name, column_name)
FROM information_schema.columns
WHERE table_schema = 'pg_catalog';

                                         xmlforest
-------------------------------------------------------------------------------------------
 <table_name>pg_authid</table_name><column_name>rolname</column_name>
 <table_name>pg_authid</table_name><column_name>rolsuper</column_name>
 ...
As seen in the second example, the element name can be omitted if
the content value is a column reference, in which case the column
name is used by default. Otherwise, a name must be specified.
Element names that are not valid XML names are escaped as shown
for xmlelement above. Similarly, content
data is escaped to make valid XML content, unless it is already
of type xml.
Note that XML forests are not valid XML documents if they consist
of more than one element, so it might be useful to wrap
xmlforest expressions in
xmlelement.
xmlpixmlpixmlpi(name target, content)
The xmlpi expression creates an XML
processing instruction. The content, if present, must not
contain the character sequence ?>.
Example:
SELECT xmlpi(name php, 'echo "hello world";');

            xmlpi
-----------------------------
 <?php echo "hello world";?>
xmlrootxmlrootxmlroot(xml, version text | no value , standalone yes|no|no value)
The xmlroot expression alters the properties
of the root node of an XML value. If a version is specified,
it replaces the value in the root node's version declaration; if a
standalone setting is specified, it replaces the value in the
root node's standalone declaration.
SELECT xmlroot(xmlparse(document '<?xml version="1.1"?><content>abc</content>'),
               version '1.0', standalone yes);

                xmlroot
----------------------------------------
 <?xml version="1.0" standalone="yes"?><content>abc</content>
xmlaggxmlaggxmlagg(xml)
The function xmlagg is, unlike the other
functions described here, an aggregate function. It concatenates the
input values to the aggregate function call,
much like xmlconcat does, except that concatenation
occurs across rows rather than across expressions in a single row.
See for additional information
about aggregate functions.
Example:
CREATE TABLE test (y int, x xml);
INSERT INTO test VALUES (1, '<foo>abc</foo>');
INSERT INTO test VALUES (2, '<bar/>');
SELECT xmlagg(x) FROM test;
        xmlagg
----------------------
 <foo>abc</foo><bar/>
To determine the order of the concatenation, an ORDER BY
clause may be added to the aggregate call as described in
. For example:
SELECT xmlagg(x ORDER BY y DESC) FROM test;
        xmlagg
----------------------
 <bar/><foo>abc</foo>
The following non-standard approach used to be recommended
in previous versions, and may still be useful in specific
cases:
SELECT xmlagg(x) FROM (SELECT * FROM test ORDER BY y DESC) AS tab;
        xmlagg
----------------------
 <bar/><foo>abc</foo>
XML Predicates
The expressions described in this section check properties
of xml values.
IS DOCUMENTIS DOCUMENTxml IS DOCUMENT
The expression IS DOCUMENT returns true if the
argument XML value is a proper XML document, false if it is not
(that is, it is a content fragment), or null if the argument is
null. See about the difference
between documents and content fragments.
IS NOT DOCUMENTIS NOT DOCUMENTxml IS NOT DOCUMENT
The expression IS NOT DOCUMENT returns false if the
argument XML value is a proper XML document, true if it is not (that is,
it is a content fragment), or null if the argument is null.
XMLEXISTSXMLEXISTSXMLEXISTS(text PASSING BY { REF | VALUE } xml BY { REF | VALUE })
The function xmlexists evaluates an XPath 1.0
expression (the first argument), with the passed XML value as its context
item. The function returns false if the result of that evaluation
yields an empty node-set, true if it yields any other value. The
function returns null if any argument is null. A nonnull value
passed as the context item must be an XML document, not a content
fragment or any non-XML value.
Example:
SELECT xmlexists('//town[text() = ''Toronto'']'
                 PASSING BY VALUE '<towns><town>Toronto</town><town>Ottawa</town></towns>');

 xmlexists
------------
 t
(1 row)
]]>
The BY REF and BY VALUE clauses
are accepted in PostgreSQL, but are ignored,
as discussed in .
In the SQL standard, the xmlexists function
evaluates an expression in the XML Query language,
but PostgreSQL allows only an XPath 1.0
expression, as discussed in
.
xml_is_well_formedxml_is_well_formedxml_is_well_formed_documentxml_is_well_formed_contentxml_is_well_formed(text)
xml_is_well_formed_document(text)
xml_is_well_formed_content(text)
These functions check whether a text string is well-formed XML,
returning a Boolean result.
xml_is_well_formed_document checks for a well-formed
document, while xml_is_well_formed_content checks
for well-formed content. xml_is_well_formed does
the former if the configuration
parameter is set to DOCUMENT, or the latter if it is set to
CONTENT. This means that
xml_is_well_formed is useful for seeing whether
a simple cast to type xml will succeed, whereas the other two
functions are useful for seeing whether the corresponding variants of
XMLPARSE will succeed.
Examples:
SET xmloption TO DOCUMENT;
SELECT xml_is_well_formed('<>');
xml_is_well_formed
--------------------
f
(1 row)
SELECT xml_is_well_formed('<abc/>');
xml_is_well_formed
--------------------
t
(1 row)
SET xmloption TO CONTENT;
SELECT xml_is_well_formed('abc');
xml_is_well_formed
--------------------
t
(1 row)
SELECT xml_is_well_formed_document('<pg:foo xmlns:pg="http://postgresql.org/stuff">bar</pg:foo>');
xml_is_well_formed_document
-----------------------------
t
(1 row)
SELECT xml_is_well_formed_document('<pg:foo xmlns:pg="http://postgresql.org/stuff">bar</my:foo>');
xml_is_well_formed_document
-----------------------------
f
(1 row)
]]>
The last example shows that the checks include whether
namespaces are correctly matched.
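The DOCUMENT-versus-CONTENT distinction can be sketched with Python's standard XML parser (an analogue for illustration, not how PostgreSQL checks well-formedness): a document must have exactly one root element, while content may be a fragment, which becomes a document once wrapped in a dummy root.

```python
import xml.etree.ElementTree as ET

def is_well_formed_document(text: str) -> bool:
    """True if text parses as a single-rooted XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

def is_well_formed_content(text: str) -> bool:
    """Content may be a fragment; wrapping in a dummy root makes it a document."""
    return is_well_formed_document(f'<wrap>{text}</wrap>')

print(is_well_formed_document('<abc/>'))  # True
print(is_well_formed_document('abc'))     # False: bare text is not a document
print(is_well_formed_content('abc'))      # True: bare text is valid content
```

(The wrapping trick breaks down for pathological inputs containing things like ']]>', but it conveys the distinction.)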
Processing XML
To process values of data type xml, PostgreSQL offers
the functions xpath and
xpath_exists, which evaluate XPath 1.0
expressions, and the XMLTABLE
table function.
xpathXPathxpath(xpath, xml, nsarray)
The function xpath evaluates the XPath 1.0
expression xpath (a text value)
against the XML value
xml. It returns an array of XML values
corresponding to the node-set produced by the XPath expression.
If the XPath expression returns a scalar value rather than a node-set,
a single-element array is returned.
The second argument must be a well-formed XML document. In particular,
it must have a single root node element.
The optional third argument of the function is an array of namespace
mappings. This array should be a two-dimensional text array with
the length of the second axis being equal to 2 (i.e., it should be an
array of arrays, each of which consists of exactly 2 elements).
The first element of each array entry is the namespace name (alias), the
second the namespace URI. It is not required that aliases provided in
this array be the same as those being used in the XML document itself (in
other words, both in the XML document and in the xpath
function context, aliases are local).
Example:
SELECT xpath('/my:a/text()', '<my:a xmlns:my="http://example.com">test</my:a>',
ARRAY[ARRAY['my', 'http://example.com']]);
xpath
--------
{test}
(1 row)
]]>
To deal with default (anonymous) namespaces, do something like this:
SELECT xpath('//mydefns:b/text()', '<a xmlns="http://example.com"><b>test</b></a>',
ARRAY[ARRAY['mydefns', 'http://example.com']]);
xpath
--------
{test}
(1 row)
]]>xpath_existsxpath_existsxpath_exists(xpath, xml, nsarray)
The function xpath_exists is a specialized form
of the xpath function. Instead of returning the
individual XML values that satisfy the XPath 1.0 expression, this function
returns a Boolean indicating whether the query was satisfied or not
(specifically, whether it produced any value other than an empty node-set).
This function is equivalent to the XMLEXISTS predicate,
except that it also offers support for a namespace mapping argument.
Example:
SELECT xpath_exists('/my:a/text()', '<my:a xmlns:my="http://example.com">test</my:a>',
ARRAY[ARRAY['my', 'http://example.com']]);
xpath_exists
--------------
t
(1 row)
]]>xmltablexmltabletable functionXMLTABLE
xmltable( XMLNAMESPACES(namespace_uri AS namespace_name, ...),
          row_expression PASSING BY { REF | VALUE } document_expression BY { REF | VALUE }
          COLUMNS name { type PATH column_expression DEFAULT default_expression NOT NULL | NULL
                         | FOR ORDINALITY }
          , ...
)
The xmltable function produces a table based
on the given XML value, an XPath filter to extract rows, and a
set of column definitions.
The optional XMLNAMESPACES clause is a comma-separated
list of namespaces. It specifies the XML namespaces used in
the document and their aliases. A default namespace specification
is not currently supported.
The required row_expression argument is
an XPath 1.0 expression that is evaluated, passing the
document_expression as its context item, to
obtain a set of XML nodes. These nodes are what
xmltable transforms into output rows. No rows
will be produced if the document_expression
is null, nor if the row_expression produces
an empty node-set or any value other than a node-set.
document_expression provides the context
item for the row_expression. It must be a
well-formed XML document; fragments/forests are not accepted.
The BY REF and BY VALUE clauses
are accepted but ignored, as discussed in
.
In the SQL standard, the xmltable function
evaluates expressions in the XML Query language,
but PostgreSQL allows only XPath 1.0
expressions, as discussed in
.
The mandatory COLUMNS clause specifies the list
of columns in the output table.
Each entry describes a single column.
See the syntax summary above for the format.
The column name and type are required; the path, default and
nullability clauses are optional.
A column marked FOR ORDINALITY will be populated
with row numbers, starting with 1, in the order of nodes retrieved from
the row_expression's result node-set.
At most one column may be marked FOR ORDINALITY.
XPath 1.0 does not specify an order for nodes in a node-set, so code
that relies on a particular order of the results will be
implementation-dependent. Details can be found in
.
The column_expression for a column is an
XPath 1.0 expression that is evaluated for each row, with the current
node from the row_expression result as its
context item, to find the value of the column. If
no column_expression is given, then the
column name is used as an implicit path.
If a column's XPath expression returns a non-XML value (limited to
string, boolean, or double in XPath 1.0) and the column has a
PostgreSQL type other than xml, the column will be set
as if by assigning the value's string representation to the PostgreSQL
type. (If the value is a boolean, its string representation is taken
to be 1 or 0 if the output
column's type category is numeric, otherwise true or
false.)
If a column's XPath expression returns a non-empty set of XML nodes
and the column's PostgreSQL type is xml, the column will
be assigned the expression result exactly, if it is of document or
content form.
A result containing more than one element node at the top level, or
non-whitespace text outside of an element, is an example of content form.
An XPath result can be of neither form, for example if it returns an
attribute node selected from the element that contains it. Such a result
will be put into content form with each such disallowed node replaced by
its string value, as defined for the XPath 1.0
string function.
A non-XML result assigned to an xml output column produces
content, a single text node with the string value of the result.
An XML result assigned to a column of any other type may not have more than
one node, or an error is raised. If there is exactly one node, the column
will be set as if by assigning the node's string
value (as defined for the XPath 1.0 string function)
to the PostgreSQL type.
The string value of an XML element is the concatenation, in document order,
of all text nodes contained in that element and its descendants. The string
value of an element with no descendant text nodes is an
empty string (not NULL).
Any xsi:nil attributes are ignored.
Note that the whitespace-only text() node between two non-text
elements is preserved, and that leading whitespace on a text()
node is not flattened.
The XPath 1.0 string function may be consulted for the
rules defining the string value of other XML node types and non-XML values.
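The string-value rule, concatenating all text nodes under an element in document order, has a direct analogue in Python's ElementTree (a sketch for illustration, not PostgreSQL's implementation):

```python
import xml.etree.ElementTree as ET

# Text is interleaved with child elements; the string value flattens it all
# in document order, exactly as the XPath 1.0 string function specifies.
elem = ET.fromstring('<a>Hello <b>wor<c>l</c>d</b>!</a>')
print(''.join(elem.itertext()))  # Hello world!
```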
The conversion rules presented here are not exactly those of the SQL
standard, as discussed in .
If the path expression returns an empty node-set
(typically, when it does not match)
for a given row, the column will be set to NULL, unless
a default_expression is specified; then the
value resulting from evaluating that expression is used.
Columns may be marked NOT NULL. If the
column_expression for a NOT NULL column
does not match anything and there is no DEFAULT or the
default_expression also evaluates to null, an error
is reported.
A default_expression, rather than being
evaluated immediately when xmltable is called,
is evaluated each time a default is needed for the column.
If the expression qualifies as stable or immutable, the repeat
evaluation may be skipped.
This means that you can usefully use volatile functions like
nextval in
default_expression.
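For instance, a volatile default is re-evaluated for each row that needs it. In this sketch (the sequence name is illustrative), the PATH matches nothing, so each of the two rows draws a fresh value from the sequence:

CREATE SEQUENCE rownum_seq;

SELECT xmltable.*
  FROM (SELECT '<rows><row/><row/></rows>'::xml AS data) t,
       XMLTABLE('/rows/row' PASSING data
                COLUMNS num int PATH 'missing'
                        DEFAULT nextval('rownum_seq'));

On a freshly created sequence the two rows receive the consecutive values 1 and 2.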
Examples:
CREATE TABLE xmldata AS SELECT
xml $$
<ROWS>
  <ROW id="1">
    <COUNTRY_ID>AU</COUNTRY_ID>
    <COUNTRY_NAME>Australia</COUNTRY_NAME>
  </ROW>
  <ROW id="5">
    <COUNTRY_ID>JP</COUNTRY_ID>
    <COUNTRY_NAME>Japan</COUNTRY_NAME>
    <PREMIER_NAME>Shinzo Abe</PREMIER_NAME>
    <SIZE unit="sq_mi">145935</SIZE>
  </ROW>
  <ROW id="6">
    <COUNTRY_ID>SG</COUNTRY_ID>
    <COUNTRY_NAME>Singapore</COUNTRY_NAME>
    <SIZE unit="sq_km">697</SIZE>
  </ROW>
</ROWS>
$$ AS data;
SELECT xmltable.*
FROM xmldata,
XMLTABLE('//ROWS/ROW'
PASSING data
COLUMNS id int PATH '@id',
ordinality FOR ORDINALITY,
"COUNTRY_NAME" text,
country_id text PATH 'COUNTRY_ID',
size_sq_km float PATH 'SIZE[@unit = "sq_km"]',
size_other text PATH
'concat(SIZE[@unit!="sq_km"], " ", SIZE[@unit!="sq_km"]/@unit)',
premier_name text PATH 'PREMIER_NAME' DEFAULT 'not specified') ;
id | ordinality | COUNTRY_NAME | country_id | size_sq_km | size_other | premier_name
----+------------+--------------+------------+------------+--------------+---------------
1 | 1 | Australia | AU | | | not specified
5 | 2 | Japan | JP | | 145935 sq_mi | Shinzo Abe
6 | 3 | Singapore | SG | 697 | | not specified
The following example shows concatenation of multiple text() nodes,
usage of the column name as XPath filter, and the treatment of whitespace,
XML comments and processing instructions:
CREATE TABLE xmlelements AS SELECT
xml $$
  <root>
   <element>  Hello<!-- xyxxz -->2a2<?aaaaa?> bbb<x>xxx</x>CC  </element>
  </root>
$$ AS data;
SELECT xmltable.*
FROM xmlelements, XMLTABLE('/root' PASSING data COLUMNS element text);
element
-------------------------
Hello2a2 bbbxxxCC
The following example illustrates how
the XMLNAMESPACES clause can be used to specify
a list of namespaces
used in the XML document as well as in the XPath expressions:
WITH xmldata(data) AS (VALUES ('
<example xmlns="http://example.com/myns" xmlns:B="http://example.com/b">
 <item foo="1" B:bar="2"/>
 <item foo="3" B:bar="4"/>
 <item foo="4" B:bar="5"/>
</example>'::xml)
)
SELECT xmltable.*
FROM XMLTABLE(XMLNAMESPACES('http://example.com/myns' AS x,
'http://example.com/b' AS "B"),
'/x:example/x:item'
PASSING (SELECT data FROM xmldata)
COLUMNS foo int PATH '@foo',
bar int PATH '@B:bar');
foo | bar
-----+-----
1 | 2
3 | 4
4 | 5
(3 rows)
Mapping Tables to XMLXML export
The following functions map the contents of relational tables to
XML values. They can be thought of as XML export functionality:
table_to_xml(tbl regclass, nulls boolean, tableforest boolean, targetns text)
query_to_xml(query text, nulls boolean, tableforest boolean, targetns text)
cursor_to_xml(cursor refcursor, count int, nulls boolean,
tableforest boolean, targetns text)
The return type of each function is xml.
table_to_xml maps the content of the named
table, passed as parameter tbl. The
regclass type accepts strings identifying tables using the
usual notation, including optional schema qualifications and
double quotes. query_to_xml executes the
query whose text is passed as parameter
query and maps the result set.
cursor_to_xml fetches the indicated number of
rows from the cursor specified by the parameter
cursor. This variant is recommended if
large tables have to be mapped, because the result value is built
up in memory by each function.
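A sketch of the cursor-based approach (table and cursor names are illustrative); each call fetches and converts only the next batch of rows, so a single batch is held in memory at a time:

BEGIN;
DECLARE big_cur CURSOR FOR SELECT * FROM big_table;
SELECT cursor_to_xml('big_cur', 1000, true, true, '');
-- repeat the call above until it returns an empty result
CLOSE big_cur;
COMMIT;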
If tableforest is false, then the resulting
XML document looks like this:
<tablename>
  <row>
    <columnname1>data</columnname1>
    <columnname2>data</columnname2>
  </row>

  <row>
    ...
  </row>

  ...
</tablename>
If tableforest is true, the result is an
XML content fragment that looks like this:
<tablename>
  <columnname1>data</columnname1>
  <columnname2>data</columnname2>
</tablename>

<tablename>
  ...
</tablename>

...
If no table name is available, that is, when mapping a query or a
cursor, the string table is used in the first
format, row in the second format.
The choice between these formats is up to the user. The first
format is a proper XML document, which will be important in many
applications. The second format tends to be more useful in the
cursor_to_xml function if the result values are to be
reassembled into one document later on. The functions for
producing XML content discussed above, in particular
xmlelement, can be used to alter the results
to taste.
The data values are mapped in the same way as described for the
function xmlelement above.
The parameter nulls determines whether null
values should be included in the output. If true, null values in
columns are represented as:
<columnname xsi:nil="true"/>
where xsi is the XML namespace prefix for XML
Schema Instance. An appropriate namespace declaration will be
added to the result value. If false, columns containing null
values are simply omitted from the output.
The parameter targetns specifies the
desired XML namespace of the result. If no particular namespace
is wanted, an empty string should be passed.
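A minimal sketch (the table is illustrative; the empty string requests no namespace):

CREATE TABLE countries (code text, name text);
INSERT INTO countries VALUES ('AU', 'Australia'), ('SG', NULL);
SELECT table_to_xml('countries', true, false, '');

With nulls set to true, the NULL name in the second row appears as an element marked xsi:nil="true"; with false, that element would be omitted.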
The following functions return XML Schema documents describing the
mappings performed by the corresponding functions above:
table_to_xmlschema(tbl regclass, nulls boolean, tableforest boolean, targetns text)
query_to_xmlschema(query text, nulls boolean, tableforest boolean, targetns text)
cursor_to_xmlschema(cursor refcursor, nulls boolean, tableforest boolean, targetns text)
It is essential that the same parameters are passed in order to
obtain matching XML data mappings and XML Schema documents.
The following functions produce XML data mappings and the
corresponding XML Schema in one document (or forest), linked
together. They can be useful where self-contained and
self-describing results are wanted:
table_to_xml_and_xmlschema(tbl regclass, nulls boolean, tableforest boolean, targetns text)
query_to_xml_and_xmlschema(query text, nulls boolean, tableforest boolean, targetns text)
In addition, the following functions are available to produce
analogous mappings of entire schemas or the entire current
database:
schema_to_xml(schema name, nulls boolean, tableforest boolean, targetns text)
schema_to_xmlschema(schema name, nulls boolean, tableforest boolean, targetns text)
schema_to_xml_and_xmlschema(schema name, nulls boolean, tableforest boolean, targetns text)
database_to_xml(nulls boolean, tableforest boolean, targetns text)
database_to_xmlschema(nulls boolean, tableforest boolean, targetns text)
database_to_xml_and_xmlschema(nulls boolean, tableforest boolean, targetns text)
Note that these potentially produce a lot of data, which needs to
be built up in memory. When requesting content mappings of large
schemas or databases, it might be worthwhile to consider mapping the
tables separately instead, possibly even through a cursor.
The result of a schema content mapping looks like this:
<schemaname>

table1-mapping

table2-mapping

...

</schemaname>
where the format of a table mapping depends on the
tableforest parameter as explained above.
The result of a database content mapping looks like this:
<dbname>

<schema1name>
  ...
</schema1name>

<schema2name>
  ...
</schema2name>

...

</dbname>
where the schema mapping is as above.
As an example of using the output produced by these functions,
shows an XSLT stylesheet that
converts the output of
table_to_xml_and_xmlschema to an HTML
document containing a tabular rendition of the table data. In a
similar manner, the results from these functions can be
converted into other XML-based formats.
XSLT Stylesheet for Converting SQL/XML Output to HTML
JSON Functions and OperatorsJSONfunctions and operators
This section describes:
functions and operators for processing and creating JSON data
the SQL/JSON path language
To learn more about the SQL/JSON standard, see
. For details on JSON types
supported in PostgreSQL,
see .
Processing and Creating JSON Data shows the operators that
are available for use with JSON data types (see ).
json and jsonb Operators

->  (right operand: int; returns json or jsonb)
  Get JSON array element (indexed from zero, negative
  integers count from the end).
  Example: '[{"a":"foo"},{"b":"bar"},{"c":"baz"}]'::json->2
  Result:  {"c":"baz"}

->  (right operand: text; returns json or jsonb)
  Get JSON object field by key.
  Example: '{"a": {"b":"foo"}}'::json->'a'
  Result:  {"b":"foo"}

->> (right operand: int; returns text)
  Get JSON array element as text.
  Example: '[1,2,3]'::json->>2
  Result:  3

->> (right operand: text; returns text)
  Get JSON object field as text.
  Example: '{"a":1,"b":2}'::json->>'b'
  Result:  2

#>  (right operand: text[]; returns json or jsonb)
  Get JSON object at the specified path.
  Example: '{"a": {"b":{"c": "foo"}}}'::json#>'{a,b}'
  Result:  {"c": "foo"}

#>> (right operand: text[]; returns text)
  Get JSON object at the specified path as text.
  Example: '{"a":[1,2,3],"b":[4,5,6]}'::json#>>'{a,2}'
  Result:  3
There are parallel variants of these operators for both the
json and jsonb types.
The field/element/path extraction operators
return the same type as their left-hand input (either json
or jsonb), except for those specified as
returning text, which coerce the value to text.
The field/element/path extraction operators return NULL, rather than
failing, if the JSON input does not have the right structure to match
the request; for example if no such element exists. The
field/element/path extraction operators that accept integer JSON
array subscripts all support negative subscripting from the end of
arrays.
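For example, a missing key or an out-of-range subscript yields NULL rather than an error:

SELECT '{"a":1}'::json -> 'b';    -- NULL
SELECT '[1,2,3]'::json -> 5;      -- NULL
SELECT '[1,2,3]'::json ->> -1;    -- 3 (counting from the end)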
The standard comparison operators shown in are available for
jsonb, but not for json. They follow the
ordering rules for B-tree operations outlined at .
Some further operators also exist only for jsonb, as shown
in .
Many of these operators can be indexed by
jsonb operator classes. For a full description of
jsonb containment and existence semantics, see .
describes how these operators can be used to effectively index
jsonb data.
Additional jsonb Operators

@>  (right operand: jsonb)
  Does the left JSON value contain the right JSON path/value entries
  at the top level?
  Example: '{"a":1, "b":2}'::jsonb @> '{"b":2}'::jsonb

<@  (right operand: jsonb)
  Are the left JSON path/value entries contained at the top level
  within the right JSON value?
  Example: '{"b":2}'::jsonb <@ '{"a":1, "b":2}'::jsonb

?   (right operand: text)
  Does the string exist as a top-level key within the JSON value?
  Example: '{"a":1, "b":2}'::jsonb ? 'b'

?|  (right operand: text[])
  Do any of these array strings exist as top-level keys?
  Example: '{"a":1, "b":2, "c":3}'::jsonb ?| array['b', 'c']

?&  (right operand: text[])
  Do all of these array strings exist as top-level keys?
  Example: '["a", "b"]'::jsonb ?& array['a', 'b']

||  (right operand: jsonb)
  Concatenate two jsonb values into a new jsonb value.
  Example: '["a", "b"]'::jsonb || '["c", "d"]'::jsonb

-   (right operand: text)
  Delete key/value pair or string element from left operand.
  Key/value pairs are matched based on their key value.
  Example: '{"a": "b"}'::jsonb - 'a'

-   (right operand: text[])
  Delete multiple key/value pairs or string elements from left operand.
  Key/value pairs are matched based on their key value.
  Example: '{"a": "b", "c": "d"}'::jsonb - '{a,c}'::text[]

-   (right operand: integer)
  Delete the array element with specified index (negative integers
  count from the end). Throws an error if the top-level container
  is not an array.
  Example: '["a", "b"]'::jsonb - 1

#-  (right operand: text[])
  Delete the field or element with specified path (for JSON arrays,
  negative integers count from the end).
  Example: '["a", {"b":1}]'::jsonb #- '{1,b}'

@?  (right operand: jsonpath)
  Does JSON path return any item for the specified JSON value?
  Example: '{"a":[1,2,3,4,5]}'::jsonb @? '$.a[*] ? (@ > 2)'

@@  (right operand: jsonpath)
  Returns the result of JSON path predicate check for the specified
  JSON value. Only the first item of the result is taken into account.
  If the result is not Boolean, then null is returned.
  Example: '{"a":[1,2,3,4,5]}'::jsonb @@ '$.a[*] > 2'
The || operator concatenates the elements at the top level of
each of its operands. It does not operate recursively. For example, if
both operands are objects with a common key field name, the value of the
field in the result will just be the value from the right hand operand.
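For example, nested objects under a common key are not merged; the right-hand value simply replaces the left:

SELECT '{"a": {"x": 1}, "b": 1}'::jsonb || '{"a": {"y": 2}}'::jsonb;
-- {"a": {"y": 2}, "b": 1}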
The @? and @@ operators suppress
the following errors: lacking object field or array element, unexpected
JSON item type, and numeric errors.
This behavior might be helpful while searching over JSON document
collections of varying structure.
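For example, in the default lax mode a missing member simply produces an empty result, while in strict mode it is an error, which @? suppresses:

SELECT '{"a":1}'::jsonb @? '$.b.c';                   -- false
SELECT '{"a":1}'::jsonb @? 'strict $.b.c';            -- error suppressed
SELECT jsonb_path_exists('{"a":1}', 'strict $.b.c');  -- raises an error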
shows the functions that are
available for creating json and jsonb values.
(There are no jsonb equivalents of the row_to_json
and array_to_json functions; however, the to_jsonb
function supplies much the same functionality.)
to_jsonarray_to_jsonrow_to_jsonjson_build_arrayjson_build_objectjson_objectto_jsonbjsonb_build_arrayjsonb_build_objectjsonb_object
JSON Creation Functions

to_json(anyelement)
to_jsonb(anyelement)
  Returns the value as json or jsonb.
  Arrays and composites are converted (recursively) to arrays and
  objects; otherwise, if there is a cast from the type to json, the
  cast function will be used to perform the conversion; otherwise, a
  scalar value is produced. For any scalar type other than a number,
  a Boolean, or a null value, the text representation will be used,
  in such a fashion that it is a valid json or jsonb value.
  Example: to_json('Fred said "Hi."'::text)
  Result:  "Fred said \"Hi.\""

array_to_json(anyarray [, pretty_bool])
  Returns the array as a JSON array. A PostgreSQL multidimensional
  array becomes a JSON array of arrays. Line feeds will be added
  between dimension-1 elements if pretty_bool is true.
  Example: array_to_json('{{1,5},{99,100}}'::int[])
  Result:  [[1,5],[99,100]]

row_to_json(record [, pretty_bool])
  Returns the row as a JSON object. Line feeds will be added between
  level-1 elements if pretty_bool is true.
  Example: row_to_json(row(1,'foo'))
  Result:  {"f1":1,"f2":"foo"}

json_build_array(VARIADIC "any")
jsonb_build_array(VARIADIC "any")
  Builds a possibly-heterogeneously-typed JSON array out of a
  variadic argument list.
  Example: json_build_array(1,2,'3',4,5)
  Result:  [1, 2, "3", 4, 5]

json_build_object(VARIADIC "any")
jsonb_build_object(VARIADIC "any")
  Builds a JSON object out of a variadic argument list. By convention,
  the argument list consists of alternating keys and values.
  Example: json_build_object('foo',1,'bar',2)
  Result:  {"foo": 1, "bar": 2}

json_object(text[])
jsonb_object(text[])
  Builds a JSON object out of a text array. The array must have either
  exactly one dimension with an even number of members, in which case
  they are taken as alternating key/value pairs, or two dimensions
  such that each inner array has exactly two elements, which are
  taken as a key/value pair.
  Example: json_object('{a, 1, b, "def", c, 3.5}')
           json_object('{{a, 1},{b, "def"},{c, 3.5}}')
  Result:  {"a": "1", "b": "def", "c": "3.5"}

json_object(keys text[], values text[])
jsonb_object(keys text[], values text[])
  This form of json_object takes keys and values pairwise from two
  separate arrays. In all other respects it is identical to the
  one-argument form.
  Example: json_object('{a, b}', '{1,2}')
  Result:  {"a": "1", "b": "2"}
array_to_json and row_to_json have the same
behavior as to_json except for offering a pretty-printing
option. The behavior described for to_json likewise applies
to each individual value converted by the other JSON creation functions.
The hstore extension has a cast
from hstore to json, so that
hstore values converted via the JSON creation functions
will be represented as JSON objects, not as primitive string values.
shows the functions that
are available for processing json and jsonb values.
json_array_lengthjsonb_array_lengthjson_eachjsonb_eachjson_each_textjsonb_each_textjson_extract_pathjsonb_extract_pathjson_extract_path_textjsonb_extract_path_textjson_object_keysjsonb_object_keysjson_populate_recordjsonb_populate_recordjson_populate_recordsetjsonb_populate_recordsetjson_array_elementsjsonb_array_elementsjson_array_elements_textjsonb_array_elements_textjson_typeofjsonb_typeofjson_to_recordjsonb_to_recordjson_to_recordsetjsonb_to_recordsetjson_strip_nullsjsonb_strip_nullsjsonb_setjsonb_insertjsonb_prettyjsonb_path_existsjsonb_path_matchjsonb_path_queryjsonb_path_query_arrayjsonb_path_query_first
JSON Processing Functions

json_array_length(json), jsonb_array_length(jsonb) → int
  Returns the number of elements in the outermost JSON array.
  Example: json_array_length('[1,2,3,{"f1":1,"f2":[5,6]},4]')
  Result:  5

json_each(json), jsonb_each(jsonb) → setof key text, value json (or jsonb)
Expands the outermost JSON object into a set of key/value pairs.
select * from json_each('{"a":"foo", "b":"bar"}')
key | value
-----+-------
a | "foo"
b | "bar"
json_each_text(json), jsonb_each_text(jsonb) → setof key text, value text
Expands the outermost JSON object into a set of key/value pairs. The
returned values will be of type text.
select * from json_each_text('{"a":"foo", "b":"bar"}')
key | value
-----+-------
a | foo
b | bar
json_extract_path(from_json json, VARIADIC path_elems text[]), jsonb_extract_path(from_json jsonb, VARIADIC path_elems text[]) → json (or jsonb)
Returns JSON value pointed to by path_elems
(equivalent to #> operator).
  Example: json_extract_path('{"f2":{"f3":1},"f4":{"f5":99,"f6":"foo"}}','f4')
  Result:  {"f5":99,"f6":"foo"}

json_extract_path_text(from_json json, VARIADIC path_elems text[]), jsonb_extract_path_text(from_json jsonb, VARIADIC path_elems text[]) → text
Returns JSON value pointed to by path_elems
as text
(equivalent to #>> operator).
  Example: json_extract_path_text('{"f2":{"f3":1},"f4":{"f5":99,"f6":"foo"}}','f4', 'f6')
  Result:  foo

json_object_keys(json), jsonb_object_keys(jsonb) → setof text
Returns set of keys in the outermost JSON object.
json_object_keys('{"f1":"abc","f2":{"f3":"a", "f4":"b"}}')
json_object_keys
------------------
f1
f2
json_populate_record(base anyelement, from_json json), jsonb_populate_record(base anyelement, from_json jsonb) → anyelement
Expands the object in from_json to a row
whose columns match the record type defined by base
(see note below).
select * from json_populate_record(null::myrowtype, '{"a": 1, "b": ["2", "a b"], "c": {"d": 4, "e": "a b c"}}')
a | b | c
---+-----------+-------------
1 | {2,"a b"} | (4,"a b c")
json_populate_recordset(base anyelement, from_json json), jsonb_populate_recordset(base anyelement, from_json jsonb) → setof anyelement
Expands the outermost array of objects
in from_json to a set of rows whose
columns match the record type defined by base (see
note below).
select * from json_populate_recordset(null::myrowtype, '[{"a":1,"b":2},{"a":3,"b":4}]')
a | b
---+---
1 | 2
3 | 4
json_array_elements(json), jsonb_array_elements(jsonb) → setof json (or jsonb)
Expands a JSON array to a set of JSON values.
select * from json_array_elements('[1,true, [2,false]]')
value
-----------
1
true
[2,false]
json_array_elements_text(json), jsonb_array_elements_text(jsonb) → setof text
Expands a JSON array to a set of text values.
select * from json_array_elements_text('["foo", "bar"]')
value
-----------
foo
bar
json_typeof(json), jsonb_typeof(jsonb) → text
Returns the type of the outermost JSON value as a text string.
Possible types are
object, array, string, number,
boolean, and null.
  Example: json_typeof('-123.4')
  Result:  number

json_to_record(json), jsonb_to_record(jsonb) → record
Builds an arbitrary record from a JSON object (see note below). As
with all functions returning record, the caller must
explicitly define the structure of the record with an AS
clause.
select * from json_to_record('{"a":1,"b":[1,2,3],"c":[1,2,3],"e":"bar","r": {"a": 123, "b": "a b c"}}') as x(a int, b text, c int[], d text, r myrowtype)
a | b | c | d | r
---+---------+---------+---+---------------
1 | [1,2,3] | {1,2,3} | | (123,"a b c")
json_to_recordset(json), jsonb_to_recordset(jsonb) → setof record
Builds an arbitrary set of records from a JSON array of objects (see
note below). As with all functions returning record, the
caller must explicitly define the structure of the record with
an AS clause.
select * from json_to_recordset('[{"a":1,"b":"foo"},{"a":"2","c":"bar"}]') as x(a int, b text);
a | b
---+-----
1 | foo
2 |
json_strip_nulls(from_json json), jsonb_strip_nulls(from_json jsonb) → json (or jsonb)
Returns from_json
with all object fields that have null values omitted. Other null values
are untouched.
  Example: json_strip_nulls('[{"f1":1,"f2":null},2,null,3]')
  Result:  [{"f1":1},2,null,3]

jsonb_set(target jsonb, path text[], new_value jsonb [, create_missing boolean]) → jsonb
Returns target
with the section designated by path
replaced by new_value, or with
new_value added if
create_missing is true (default is
true) and the item
designated by path does not exist.
As with the path oriented operators, negative integers that
appear in path count from the end
of JSON arrays.
  Example: jsonb_set('[{"f1":1,"f2":null},2,null,3]', '{0,f1}', '[2,3,4]', false)
  Result:  [{"f1":[2,3,4],"f2":null},2,null,3]
  Example: jsonb_set('[{"f1":1,"f2":null},2]', '{0,f3}', '[2,3,4]')
  Result:  [{"f1": 1, "f2": null, "f3": [2, 3, 4]}, 2]
jsonb_insert(target jsonb, path text[], new_value jsonb [, insert_after boolean]) → jsonb
Returns target with
new_value inserted. If
target section designated by
path is in a JSONB array,
new_value will be inserted before target or
after if insert_after is true (default is
false). If target section
designated by path is in JSONB object,
new_value will be inserted only if
target does not exist. As with the path
oriented operators, negative integers that appear in
path count from the end of JSON arrays.
  Example: jsonb_insert('{"a": [0,1,2]}', '{a, 1}', '"new_value"')
  Result:  {"a": [0, "new_value", 1, 2]}
  Example: jsonb_insert('{"a": [0,1,2]}', '{a, 1}', '"new_value"', true)
  Result:  {"a": [0, 1, "new_value", 2]}

jsonb_pretty(from_json jsonb) → text
Returns from_json
as indented JSON text.
jsonb_pretty('[{"f1":1,"f2":null},2,null,3]')
[
{
"f1": 1,
"f2": null
},
2,
null,
3
]
jsonb_path_exists(target jsonb, path jsonpath [, vars jsonb [, silent bool]]) → boolean
Checks whether JSON path returns any item for the specified JSON
value.
jsonb_path_exists('{"a":[1,2,3,4,5]}', '$.a[*] ? (@ >= $min && @ <= $max)', '{"min":2,"max":4}')
true
jsonb_path_match(target jsonb, path jsonpath [, vars jsonb [, silent bool]]) → boolean
Returns the result of JSON path predicate check for the specified JSON value.
Only the first item of the result is taken into account. If the
result is not Boolean, then null is returned.
jsonb_path_match('{"a":[1,2,3,4,5]}', 'exists($.a[*] ? (@ >= $min && @ <= $max))', '{"min":2,"max":4}')
true
jsonb_path_query(target jsonb, path jsonpath [, vars jsonb [, silent bool]]) → setof jsonb
Gets all JSON items returned by JSON path for the specified JSON
value.
select * from jsonb_path_query('{"a":[1,2,3,4,5]}', '$.a[*] ? (@ >= $min && @ <= $max)', '{"min":2,"max":4}');
jsonb_path_query
------------------
2
3
4
jsonb_path_query_array(target jsonb, path jsonpath [, vars jsonb [, silent bool]]) → jsonb
Gets all JSON items returned by JSON path for the specified JSON
value and wraps result into an array.
jsonb_path_query_array('{"a":[1,2,3,4,5]}', '$.a[*] ? (@ >= $min && @ <= $max)', '{"min":2,"max":4}')
[2, 3, 4]
jsonb_path_query_first(target jsonb, path jsonpath [, vars jsonb [, silent bool]]) → jsonb
Gets the first JSON item returned by JSON path for the specified JSON
value. Returns NULL on no results.
jsonb_path_query_first('{"a":[1,2,3,4,5]}', '$.a[*] ? (@ >= $min && @ <= $max)', '{"min":2,"max":4}')
2
Many of these functions and operators will convert Unicode escapes in
JSON strings to the appropriate single character. This is a non-issue
if the input is type jsonb, because the conversion was already
done; but for json input, this may result in throwing an error,
as noted in .
The functions
json[b]_populate_record,
json[b]_populate_recordset,
json[b]_to_record and
json[b]_to_recordset
operate on a JSON object, or array of objects, and extract the values
associated with keys whose names match column names of the output row
type.
Object fields that do not correspond to any output column name are
ignored, and output columns that do not match any object field will be
filled with nulls.
To convert a JSON value to the SQL type of an output column, the
following rules are applied in sequence:
A JSON null value is converted to a SQL null in all cases.
If the output column is of type json
or jsonb, the JSON value is just reproduced exactly.
If the output column is a composite (row) type, and the JSON value is
a JSON object, the fields of the object are converted to columns of
the output row type by recursive application of these rules.
Likewise, if the output column is an array type and the JSON value is
a JSON array, the elements of the JSON array are converted to elements
of the output array by recursive application of these rules.
Otherwise, if the JSON value is a string literal, the contents of the
string are fed to the input conversion function for the column's data
type.
Otherwise, the ordinary text representation of the JSON value is fed
to the input conversion function for the column's data type.
While the examples for these functions use constants, the typical use
would be to reference a table in the FROM clause
and use one of its json or jsonb columns
as an argument to the function. Extracted key values can then be
referenced in other parts of the query, like WHERE
clauses and target lists. Extracting multiple values in this
way can improve performance over extracting them separately with
per-key operators.
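A sketch of that pattern (table and column names are hypothetical):

SELECT o.id, items.*
  FROM orders AS o
       CROSS JOIN LATERAL
         jsonb_to_recordset(o.payload -> 'items')
           AS items(sku text, qty int)
 WHERE items.qty > 10;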
All items of the path parameter of jsonb_set,
as well as of jsonb_insert, except the last item
must be present in the target. If
create_missing is false, all
items of the path parameter of jsonb_set must be
present. If these conditions are not met, the target is
returned unchanged.
If the last path item is an object key, it will be created if it
is absent and given the new value. If the last path item is an array
index and it is positive, the item to set is found by counting from
the left; if negative, by counting from the right: -1
designates the rightmost element, and so on.
If the item is out of the range -array_length .. array_length -1,
and create_missing is true, the new value is added at the beginning
of the array if the item is negative, and at the end of the array if
it is positive.
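For example:

SELECT jsonb_set('[0, 1, 2]'::jsonb, '{10}', '"x"');
-- [0, 1, 2, "x"]
SELECT jsonb_set('[0, 1, 2]'::jsonb, '{-10}', '"x"');
-- ["x", 0, 1, 2]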
The json_typeof function's null return value
should not be confused with a SQL NULL. While
calling json_typeof('null'::json) will
return null, calling json_typeof(NULL::json)
will return a SQL NULL.
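That is:

SELECT json_typeof('null'::json);   -- the string null
SELECT json_typeof(NULL::json);     -- SQL NULL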
If the argument to json_strip_nulls contains duplicate
field names in any object, the result could be semantically somewhat
different, depending on the order in which they occur. This is not an
issue for jsonb_strip_nulls since jsonb values never have
duplicate object field names.
The jsonb_path_exists, jsonb_path_match,
jsonb_path_query, jsonb_path_query_array, and
jsonb_path_query_first
functions have optional vars and silent
arguments.
If the vars argument is specified, it provides an
object containing named variables to be substituted into a
jsonpath expression.
If the silent argument is specified and is
true, these functions suppress the same errors
as the @? and @@ operators.
See also for the aggregate
function json_agg which aggregates record
values as JSON, and the aggregate function
json_object_agg which aggregates pairs of values
into a JSON object, and their jsonb equivalents,
jsonb_agg and jsonb_object_agg.
The SQL/JSON Path LanguageSQL/JSON path language
SQL/JSON path expressions specify the items to be retrieved
from the JSON data, similar to XPath expressions used
for SQL access to XML. In PostgreSQL,
path expressions are implemented as the jsonpath
data type and can use any elements described in
.
JSON query functions and operators
pass the provided path expression to the path engine
for evaluation. If the expression matches the queried JSON data,
the corresponding SQL/JSON item is returned.
Path expressions are written in the SQL/JSON path language
and can also include arithmetic expressions and functions.
Query functions treat the provided expression as a
text string, so it must be enclosed in single quotes.
A path expression consists of a sequence of elements allowed
by the jsonpath data type.
The path expression is evaluated from left to right, but
you can use parentheses to change the order of operations.
If the evaluation is successful, a sequence of SQL/JSON items
(SQL/JSON sequence) is produced,
and the evaluation result is returned to the JSON query function
that completes the specified computation.
To refer to the JSON data to be queried (the
context item), use the $ sign
in the path expression. It can be followed by one or more
accessor operators,
which go down the JSON structure level by level to retrieve the
content of context item. Each operator that follows deals with the
result of the previous evaluation step.
For example, suppose you have some JSON data from a GPS tracker that you
would like to parse, such as:
{
"track": {
"segments": [
{
"location": [ 47.763, 13.4034 ],
"start time": "2018-10-14 10:05:14",
"HR": 73
},
{
"location": [ 47.706, 13.2635 ],
"start time": "2018-10-14 10:39:21",
"HR": 135
}
]
}
}
To retrieve the available track segments, you need to use the
.key accessor
operator for all the preceding JSON objects:
'$.track.segments'
If the item to retrieve is an element of an array, you have
to unnest this array using the [*] operator. For example,
the following path will return location coordinates for all
the available track segments:
'$.track.segments[*].location'
To return the coordinates of the first segment only, you can
specify the corresponding subscript in the []
accessor operator. Note that the SQL/JSON arrays are 0-relative:
'$.track.segments[0].location'
The result of each path evaluation step can be processed
by one or more jsonpath operators and methods
listed in .
Each method name must be preceded by a dot. For example,
you can get an array size:
'$.track.segments.size()'
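Evaluated with jsonb_path_query against a trimmed-down version of the tracker document, this method call returns 2:

SELECT jsonb_path_query(
         '{"track": {"segments": [{"HR": 73}, {"HR": 135}]}}',
         '$.track.segments.size()');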
For more examples of using jsonpath operators
and methods within path expressions, see
.
When defining the path, you can also use one or more
filter expressions, which work similarly to the
WHERE clause in SQL. A filter expression begins with
a question mark and provides a condition in parentheses:
? (condition)
Filter expressions must be specified right after the path evaluation step
to which they are applied. The result of this step is filtered to include
only those items that satisfy the provided condition. SQL/JSON defines
three-valued logic, so the condition can be true, false,
or unknown. The unknown value
plays the same role as SQL NULL and can be tested
for with the is unknown predicate. Further path
evaluation steps use only those items for which filter expressions
return true.
Functions and operators that can be used in filter expressions are listed
in . The path
evaluation result to be filtered is denoted by the @
variable. To refer to a JSON element stored at a lower nesting level,
add one or more accessor operators after @.
Suppose you would like to retrieve all heart rate values higher
than 130. You can achieve this using the following expression:
'$.track.segments[*].HR ? (@ > 130)'
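Evaluated directly, again with a trimmed-down inline document, the expression yields the single value 135:

SELECT jsonb_path_query(
         '{"track": {"segments": [{"HR": 73}, {"HR": 135}]}}',
         '$.track.segments[*].HR ? (@ > 130)');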
To get the start time of segments with such values instead, you have to
filter out irrelevant segments before returning the start time, so the
filter expression is applied to the previous step, and the path used
in the condition is different:
'$.track.segments[*] ? (@.HR > 130)."start time"'
You can use several filter expressions on the same nesting level, if
required. For example, the following expression selects all segments
that contain locations with relevant coordinates and high heart rate values:
'$.track.segments[*] ? (@.location[1] < 13.4) ? (@.HR > 130)."start time"'
Using filter expressions at different nesting levels is also allowed.
The following example first filters all segments by location, and then
returns high heart rate values for these segments, if available:
'$.track.segments[*] ? (@.location[1] < 13.4).HR ? (@ > 130)'
You can also nest filter expressions within each other:
'$.track ? (exists(@.segments[*] ? (@.HR > 130))).segments.size()'
This expression returns the size of the track if it contains any
segments with high heart rate values, or an empty sequence otherwise.
PostgreSQL's implementation of SQL/JSON path
language has the following deviations from the SQL/JSON standard:
.datetime() item method is not implemented yet
mainly because immutable jsonpath functions and operators
cannot reference session timezone, which is used in some datetime
operations. Datetime support will be added to jsonpath
in future versions of PostgreSQL.
A path expression can be a Boolean predicate, although the SQL/JSON
standard allows predicates only in filters. This is necessary for
implementation of the @@ operator. For example,
the following jsonpath expression is valid in
PostgreSQL:
'$.track.segments[*].HR < 70'
There are minor differences in the interpretation of regular
expression patterns used in like_regex filters, as
described in .
Strict and Lax Modes
When you query JSON data, the path expression may not match the
actual JSON data structure. An attempt to access a non-existent
member of an object or element of an array results in a
structural error. SQL/JSON path expressions have two modes
of handling structural errors:
lax (default) — the path engine implicitly adapts
the queried data to the specified path.
Any remaining structural errors are suppressed and converted
to empty SQL/JSON sequences.
strict — if a structural error occurs, an error is raised.
The lax mode facilitates matching of a JSON document structure and path
expression if the JSON data does not conform to the expected schema.
If an operand does not match the requirements of a particular operation,
it can be automatically wrapped as an SQL/JSON array or unwrapped by
converting its elements into an SQL/JSON sequence before performing
this operation. In addition, comparison operators automatically unwrap their
operands in lax mode, so you can compare SQL/JSON arrays
out of the box. An array of size 1 is considered equal to its sole element.
Automatic unwrapping is not performed in the following cases only:
The path expression contains type() or
size() methods that return the type
and the number of elements in the array, respectively.
The queried JSON data contain nested arrays. In this case, only
the outermost array is unwrapped, while all the inner arrays
remain unchanged. Thus, implicit unwrapping can only go one
level down within each path evaluation step.
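The comparison-unwrapping behavior described above can be observed with the @@ operator, which checks whether a path predicate is true for the given JSON value (a minimal sketch, assuming PostgreSQL 12 or later):

```sql
-- Lax mode lets a comparison reach inside an array:
SELECT '[1, 2, 3]'::jsonb @@ '$[*] > 2'::jsonpath;  -- t
-- And an array of size 1 is considered equal to its sole element:
SELECT '[1]'::jsonb @@ '$ == 1'::jsonpath;
```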
For example, when querying the GPS data listed above, you can
abstract from the fact that it stores an array of segments
when using the lax mode:
'lax $.track.segments.location'
In the strict mode, the specified path must exactly match the structure of
the queried JSON document to return an SQL/JSON item, so using this
path expression will cause an error. To get the same result as in
the lax mode, you have to explicitly unwrap the
segments array:
'strict $.track.segments[*].location'
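The lax/strict difference can be tried out with jsonb_path_query, assuming PostgreSQL 12 or later; the document below is a cut-down, invented version of the GPS data used in this section:

```sql
-- lax mode implicitly unwraps "segments" and returns both location arrays:
SELECT jsonb_path_query(doc, 'lax $.track.segments.location')
FROM (VALUES (
  '{"track": {"segments": [
     {"location": [47.763, 13.4],    "HR": 73},
     {"location": [47.706, 13.2563], "HR": 135}
   ]}}'::jsonb
)) AS t(doc);

-- The same path in strict mode raises an error; the array must be
-- unwrapped explicitly:
-- SELECT jsonb_path_query(doc, 'strict $.track.segments[*].location') ...
```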
Regular ExpressionsLIKE_REGEXin SQL/JSON
SQL/JSON path expressions allow matching text to a regular expression
with the like_regex filter. For example, the
following SQL/JSON path query would case-insensitively match all
strings in an array that start with an English vowel:
'$[*] ? (@ like_regex "^[aeiou]" flag "i")'
The optional flag string may include one or more of
the characters
i for case-insensitive match,
m to allow ^
and $ to match at newlines,
s to allow . to match a newline,
and q to quote the whole pattern (reducing the
behavior to a simple substring match).
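As a small illustration of the i flag (array contents invented), jsonb_path_query_array collects all matches into a JSON array:

```sql
-- The "i" flag makes the match case-insensitive:
SELECT jsonb_path_query_array(
  '["apple", "Orange", "pear"]'::jsonb,
  '$[*] ? (@ like_regex "^[aeiou]" flag "i")');
-- ["apple", "Orange"]
```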
The SQL/JSON standard borrows its definition for regular expressions
from the LIKE_REGEX operator, which in turn uses the
XQuery standard. PostgreSQL does not currently support the
LIKE_REGEX operator. Therefore,
the like_regex filter is implemented using the
POSIX regular expression engine described in
. This leads to various minor
discrepancies from standard SQL/JSON behavior, which are cataloged in
.
Note, however, that the flag-letter incompatibilities described there
do not apply to SQL/JSON, as it translates the XQuery flag letters to
match what the POSIX engine expects.
Keep in mind that the pattern argument of like_regex
is a JSON path string literal, written according to the rules given in
. This means in particular that any
backslashes you want to use in the regular expression must be doubled.
For example, to match strings that contain only digits:
'$ ? (@ like_regex "^\\d+$")'
SQL/JSON Path Operators and Methods shows the operators and
methods available in jsonpath. shows the available filter
expression elements.
jsonpath Operators and MethodsOperator/MethodDescriptionExample JSONExample QueryResult+ (unary)Plus operator that iterates over the SQL/JSON sequence{"x": [2.85, -14.7, -9.4]}+ $.x.floor()2, -15, -10- (unary)Minus operator that iterates over the SQL/JSON sequence{"x": [2.85, -14.7, -9.4]}- $.x.floor()-2, 15, 10+ (binary)Addition[2]2 + $[0]4- (binary)Subtraction[2]4 - $[0]2*Multiplication[4]2 * $[0]8/Division[8]$[0] / 24%Modulus[32]$[0] % 102type()Type of the SQL/JSON item[1, "2", {}]$[*].type()"number", "string", "object"size()Size of the SQL/JSON item{"m": [11, 15]}$.m.size()2double()Approximate floating-point number converted from an SQL/JSON number or a string{"len": "1.9"}$.len.double() * 23.8ceiling()Nearest integer greater than or equal to the SQL/JSON number{"h": 1.3}$.h.ceiling()2floor()Nearest integer less than or equal to the SQL/JSON number{"h": 1.3}$.h.floor()1abs()Absolute value of the SQL/JSON number{"z": -0.3}$.z.abs()0.3keyvalue()
Sequence of object's key-value pairs represented as array of items
containing three fields ("key",
"value", and "id").
"id" is a unique identifier of the object that the
key-value pair belongs to.
{"x": "20", "y": 32}$.keyvalue(){"key": "x", "value": "20", "id": 0}, {"key": "y", "value": 32, "id": 0}
jsonpath Filter Expression ElementsValue/PredicateDescriptionExample JSONExample QueryResult==Equality operator[1, 2, 1, 3]$[*] ? (@ == 1)1, 1!=Non-equality operator[1, 2, 1, 3]$[*] ? (@ != 1)2, 3<>Non-equality operator (same as !=)[1, 2, 1, 3]$[*] ? (@ <> 1)2, 3<Less-than operator[1, 2, 3]$[*] ? (@ < 2)1<=Less-than-or-equal-to operator[1, 2, 3]$[*] ? (@ <= 2)1, 2>Greater-than operator[1, 2, 3]$[*] ? (@ > 2)3>=Greater-than-or-equal-to operator[1, 2, 3]$[*] ? (@ >= 2)2, 3trueValue used to perform comparison with JSON true literal[{"name": "John", "parent": false},
{"name": "Chris", "parent": true}]$[*] ? (@.parent == true){"name": "Chris", "parent": true}falseValue used to perform comparison with JSON false literal[{"name": "John", "parent": false},
{"name": "Chris", "parent": true}]$[*] ? (@.parent == false){"name": "John", "parent": false}nullValue used to perform comparison with JSON null value[{"name": "Mary", "job": null},
{"name": "Michael", "job": "driver"}]$[*] ? (@.job == null) .name"Mary"&&Boolean AND[1, 3, 7]$[*] ? (@ > 1 && @ < 5)3||Boolean OR[1, 3, 7]$[*] ? (@ < 1 || @ > 5)7!Boolean NOT[1, 3, 7]$[*] ? (!(@ < 5))7like_regex
Tests whether the first operand matches the regular expression
given by the second operand, optionally with modifications
described by a string of flag characters (see
)
["abc", "abd", "aBdC", "abdacb", "babc"]$[*] ? (@ like_regex "^ab.*c" flag "i")"abc", "aBdC", "abdacb"starts withTests whether the second operand is an initial substring of the first operand["John Smith", "Mary Stone", "Bob Johnson"]$[*] ? (@ starts with "John")"John Smith"existsTests whether a path expression matches at least one SQL/JSON item{"x": [1, 2], "y": [2, 4]}strict $.* ? (exists (@ ? (@[*] > 2)))2, 4is unknownTests whether a Boolean condition is unknown[-1, 2, 7, "infinity"]$[*] ? ((@ > 0) is unknown)"infinity"
Sequence Manipulation Functionssequencenextvalcurrvallastvalsetval
This section describes functions for operating on sequence
objects, also called sequence generators or just sequences.
Sequence objects are special single-row tables created with .
Sequence objects are commonly used to generate unique identifiers
for rows of a table. The sequence functions, listed in , provide simple, multiuser-safe
methods for obtaining successive sequence values from sequence
objects.
Sequence FunctionsFunctionReturn TypeDescriptioncurrval(regclass)bigintReturn value most recently obtained with
nextval for specified sequencelastval()bigintReturn value most recently obtained with
nextval for any sequencenextval(regclass)bigintAdvance sequence and return new valuesetval(regclass, bigint)bigintSet sequence's current valuesetval(regclass, bigint, boolean)bigintSet sequence's current value and is_called flag
The sequence to be operated on by a sequence function is specified by
a regclass argument, which is simply the OID of the sequence in the
pg_class system catalog. You do not have to look up the
OID by hand, however, since the regclass data type's input
converter will do the work for you. Just write the sequence name enclosed
in single quotes so that it looks like a literal constant. For
compatibility with the handling of ordinary
SQL names, the string will be converted to lower case
unless it contains double quotes around the sequence name. Thus:
nextval('foo') operates on sequence foo
nextval('FOO') operates on sequence foo
nextval('"Foo"') operates on sequence Foo
The sequence name can be schema-qualified if necessary:
nextval('myschema.foo') operates on myschema.foo
nextval('"myschema".foo') same as above
nextval('foo') searches the search path for foo
See for more information about
regclass.
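The case-folding rules can be seen in action with a small sketch (serial_demo is an invented sequence name):

```sql
-- Unquoted names are folded to lower case, as for ordinary SQL identifiers:
CREATE SEQUENCE serial_demo;
SELECT nextval('SERIAL_DEMO');   -- same sequence as serial_demo; returns 1
SELECT nextval('"serial_demo"'); -- double quotes preserve the exact name; returns 2
```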
Before PostgreSQL 8.1, the arguments of the
sequence functions were of type text, not regclass, and
the above-described conversion from a text string to an OID value would
happen at run time during each call. For backward compatibility, this
facility still exists, but internally it is now handled as an implicit
coercion from text to regclass before the function is
invoked.
When you write the argument of a sequence function as an unadorned
literal string, it becomes a constant of type regclass.
Since this is really just an OID, it will track the originally
identified sequence despite later renaming, schema reassignment,
etc. This early binding behavior is usually desirable for
sequence references in column defaults and views. But sometimes you might
want late binding where the sequence reference is resolved
at run time. To get late-binding behavior, force the constant to be
stored as a text constant instead of regclass:
nextval('foo'::text) foo is looked up at runtime
Note that late binding was the only behavior supported in
PostgreSQL releases before 8.1, so you
might need to do this to preserve the semantics of old applications.
Of course, the argument of a sequence function can be an expression
as well as a constant. If it is a text expression then the implicit
coercion will result in a run-time lookup.
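The two binding behaviors can be sketched side by side (seq_a, seq_b, and the table names are invented):

```sql
-- Early binding: 'seq_a' is resolved to an OID when the default is defined,
-- so the default keeps working even if the sequence is later renamed.
CREATE SEQUENCE seq_a;
CREATE TABLE t_early (id bigint DEFAULT nextval('seq_a'));

-- Late binding: the text cast defers the name lookup to run time,
-- so the default follows whatever 'seq_b' names at each INSERT.
CREATE SEQUENCE seq_b;
CREATE TABLE t_late (id bigint DEFAULT nextval('seq_b'::text));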
The available sequence functions are:
nextval
Advance the sequence object to its next value and return that
value. This is done atomically: even if multiple sessions
execute nextval concurrently, each will safely receive
a distinct sequence value.
If a sequence object has been created with default parameters,
successive nextval calls will return successive
values beginning with 1. Other behaviors can be obtained by using
special parameters in the command;
see its command reference page for more information.
To avoid blocking concurrent transactions that obtain numbers from
the same sequence, a nextval operation is never
rolled back; that is, once a value has been fetched it is considered
used and will not be returned again. This is true even if the
surrounding transaction later aborts, or if the calling query ends
up not using the value. For example, an INSERT with
an ON CONFLICT clause will compute the to-be-inserted
tuple, including doing any required nextval
calls, before detecting any conflict that would cause it to follow
the ON CONFLICT rule instead. Such cases will leave
unused holes in the sequence of assigned values.
Thus, PostgreSQL sequence objects cannot
be used to obtain gapless sequences.
This function requires USAGE
or UPDATE privilege on the sequence.
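The never-rolled-back behavior is easy to demonstrate (order_id_seq is an invented sequence name):

```sql
-- A sequence created with default parameters starts at 1, and each
-- nextval call returns a distinct value, even under concurrency:
CREATE SEQUENCE order_id_seq;
SELECT nextval('order_id_seq');  -- 1
SELECT nextval('order_id_seq');  -- 2
-- A rolled-back transaction does not return its value to the sequence:
BEGIN;
SELECT nextval('order_id_seq');  -- 3
ROLLBACK;
SELECT nextval('order_id_seq');  -- 4, not 3
```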
currval
Return the value most recently obtained by nextval
for this sequence in the current session. (An error is
reported if nextval has never been called for this
sequence in this session.) Because this is returning
a session-local value, it gives a predictable answer whether or not
other sessions have executed nextval since the
current session did.
This function requires USAGE
or SELECT privilege on the sequence.
lastval
Return the value most recently returned by
nextval in the current session. This function is
identical to currval, except that instead
of taking the sequence name as an argument it refers to whichever
sequence nextval was most recently applied to
in the current session. It is an error to call
lastval if nextval
has not yet been called in the current session.
This function requires USAGE
or SELECT privilege on the last used sequence.
setval
Reset the sequence object's counter value. The two-parameter
form sets the sequence's last_value field to the
specified value and sets its is_called field to
true, meaning that the next
nextval will advance the sequence before
returning a value. The value reported by currval is
also set to the specified value. In the three-parameter form,
is_called can be set to either true
or false. true has the same effect as
the two-parameter form. If it is set to false, the
next nextval will return exactly the specified
value, and sequence advancement commences with the following
nextval. Furthermore, the value reported by
currval is not changed in this case. For example,
SELECT setval('foo', 42); Next nextval will return 43
SELECT setval('foo', 42, true); Same as above
SELECT setval('foo', 42, false); Next nextval will return 42
The result returned by setval is just the value of its
second argument.
Because sequences are non-transactional, changes made by
setval are not undone if the transaction rolls
back.
This function requires UPDATE privilege on the
sequence.
Conditional ExpressionsCASEconditional expression
This section describes the SQL-compliant conditional expressions
available in PostgreSQL.
If your needs go beyond the capabilities of these conditional
expressions, you might want to consider writing a server-side function
in a more expressive programming language.
Although COALESCE, GREATEST, and
LEAST are syntactically similar to functions, they are
not ordinary functions, and thus cannot be used with explicit
VARIADIC array arguments.
CASE
The SQL CASE expression is a
generic conditional expression, similar to if/else statements in
other programming languages:
CASE WHEN condition THEN result
     [WHEN ...]
     [ELSE result]
END
CASE clauses can be used wherever
an expression is valid. Each condition is an
expression that returns a boolean result. If the condition's
result is true, the value of the CASE expression is the
result that follows the condition, and the
remainder of the CASE expression is not processed. If the
condition's result is not true, any subsequent WHEN clauses
are examined in the same manner. If no WHEN condition yields true, the value of the
CASE expression is the result of the
ELSE clause. If the ELSE clause is
omitted and no condition is true, the result is null.
An example:
SELECT * FROM test;
a
---
1
2
3
SELECT a,
CASE WHEN a=1 THEN 'one'
WHEN a=2 THEN 'two'
ELSE 'other'
END
FROM test;
a | case
---+-------
1 | one
2 | two
3 | other
The data types of all the result
expressions must be convertible to a single output type.
See for more details.
There is a simple form of CASE expression
that is a variant of the general form above:
CASE expression
    WHEN value THEN result
    [WHEN ...]
    [ELSE result]
END
The first
expression is computed, then compared to
each of the value expressions in the
WHEN clauses until one is found that is equal to it. If
no match is found, the result of the
ELSE clause (or a null value) is returned. This is similar
to the switch statement in C.
The example above can be written using the simple
CASE syntax:
SELECT a,
CASE a WHEN 1 THEN 'one'
WHEN 2 THEN 'two'
ELSE 'other'
END
FROM test;
a | case
---+-------
1 | one
2 | two
3 | other
A CASE expression does not evaluate any subexpressions
that are not needed to determine the result. For example, this is a
possible way of avoiding a division-by-zero failure:
SELECT ... WHERE CASE WHEN x <> 0 THEN y/x > 1.5 ELSE false END;
As described in , there are various
situations in which subexpressions of an expression are evaluated at
different times, so that the principle that CASE
evaluates only necessary subexpressions is not ironclad. For
example a constant 1/0 subexpression will usually result in
a division-by-zero failure at planning time, even if it's within
a CASE arm that would never be entered at run time.
COALESCECOALESCENVLIFNULLCOALESCE(value, ...)
The COALESCE function returns the first of its
arguments that is not null. Null is returned only if all arguments
are null. It is often used to substitute a default value for
null values when data is retrieved for display, for example:
SELECT COALESCE(description, short_description, '(none)') ...
This returns description if it is not null, otherwise
short_description if it is not null, otherwise (none).
Like a CASE expression, COALESCE only
evaluates the arguments that are needed to determine the result;
that is, arguments to the right of the first non-null argument are
not evaluated. This SQL-standard function provides capabilities similar
to NVL and IFNULL, which are used in some other
database systems.
NULLIFNULLIFNULLIF(value1, value2)
The NULLIF function returns a null value if
value1 equals value2;
otherwise it returns value1.
This can be used to perform the inverse operation of the
COALESCE example given above:
SELECT NULLIF(value, '(none)') ...
In this example, if value is (none),
null is returned, otherwise the value of value
is returned.
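NULLIF and COALESCE compose naturally into a round trip; a minimal sketch (placeholder values invented):

```sql
-- Map the placeholder back to NULL:
SELECT NULLIF('(none)', '(none)');            -- NULL
-- Treat an empty string as missing and substitute a default:
SELECT COALESCE(NULLIF('', ''), '(none)');    -- (none)
```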
GREATEST and LEASTGREATESTLEASTGREATEST(value, ...)
LEAST(value, ...)
The GREATEST and LEAST functions select the
largest or smallest value from a list of any number of expressions.
The expressions must all be convertible to a common data type, which
will be the type of the result
(see for details). NULL values
in the list are ignored. The result will be NULL only if all the
expressions evaluate to NULL.
Note that GREATEST and LEAST are not in
the SQL standard, but are a common extension. Some other databases
make them return NULL if any argument is NULL, rather than only when
all are NULL.
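A few literal calls make the NULL-handling rules concrete:

```sql
SELECT GREATEST(1, 5, 3);      -- 5
SELECT LEAST(2, NULL, 1);      -- 1: NULL arguments are ignored
SELECT GREATEST(NULL, NULL);   -- NULL only when every argument is NULL
```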
Array Functions and Operators shows the operators
available for array types.
Array OperatorsOperatorDescriptionExampleResult=equalARRAY[1.1,2.1,3.1]::int[] = ARRAY[1,2,3]t<>not equalARRAY[1,2,3] <> ARRAY[1,2,4]t<less thanARRAY[1,2,3] < ARRAY[1,2,4]t>greater thanARRAY[1,4,3] > ARRAY[1,2,4]t<=less than or equalARRAY[1,2,3] <= ARRAY[1,2,3]t>=greater than or equalARRAY[1,4,3] >= ARRAY[1,4,3]t@>containsARRAY[1,4,3] @> ARRAY[3,1,3]t<@is contained byARRAY[2,2,7] <@ ARRAY[1,7,4,2,6]t&&overlap (have elements in common)ARRAY[1,4,3] && ARRAY[2,1]t||array-to-array concatenationARRAY[1,2,3] || ARRAY[4,5,6]{1,2,3,4,5,6}||array-to-array concatenationARRAY[1,2,3] || ARRAY[[4,5,6],[7,8,9]]{{1,2,3},{4,5,6},{7,8,9}}||element-to-array concatenation3 || ARRAY[4,5,6]{3,4,5,6}||array-to-element concatenationARRAY[4,5,6] || 7{4,5,6,7}
The array ordering operators (<,
>=, etc) compare the array contents
element-by-element, using the default B-tree comparison function for
the element data type, and sort based on the first difference.
In multidimensional arrays the elements are visited in row-major order
(last subscript varies most rapidly).
If the contents of two arrays are equal but the dimensionality is
different, the first difference in the dimensionality information
determines the sort order. (This is a change from versions of
PostgreSQL prior to 8.2: older versions would claim
that two arrays with the same contents were equal, even if the
number of dimensions or subscript ranges were different.)
The array containment operators (<@
and @>) consider one array to be contained in
another one if each of its elements appears in the other one.
Duplicates are not treated specially, thus ARRAY[1]
and ARRAY[1,1] are each considered to contain the
other.
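The duplicate-insensitivity of containment can be checked directly:

```sql
-- Duplicates are not significant for containment:
SELECT ARRAY[1,1] <@ ARRAY[1];      -- t
SELECT ARRAY[1] @> ARRAY[1,1];      -- t
-- Overlap only asks for at least one common element:
SELECT ARRAY[1,4,3] && ARRAY[2,1];  -- t
```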
See for more details about array operator
behavior. See for more details about
which operators support indexed operations.
shows the functions
available for use with array types. See
for more information and examples of the use of these functions.
array_appendarray_catarray_ndimsarray_dimsarray_fillarray_lengtharray_lowerarray_positionarray_positionsarray_prependarray_removearray_replacearray_to_stringarray_uppercardinalitystring_to_arrayunnest
Array FunctionsFunctionReturn TypeDescriptionExampleResultarray_append(anyarray, anyelement)
anyarrayappend an element to the end of an arrayarray_append(ARRAY[1,2], 3){1,2,3}array_cat(anyarray, anyarray)
anyarrayconcatenate two arraysarray_cat(ARRAY[1,2,3], ARRAY[4,5]){1,2,3,4,5}array_ndims(anyarray)
intreturns the number of dimensions of the arrayarray_ndims(ARRAY[[1,2,3], [4,5,6]])2array_dims(anyarray)
textreturns a text representation of array's dimensionsarray_dims(ARRAY[[1,2,3], [4,5,6]])[1:2][1:3]array_fill(anyelement, int[], int[])
anyarrayreturns an array initialized with supplied value and
dimensions, optionally with lower bounds other than 1array_fill(7, ARRAY[3], ARRAY[2])[2:4]={7,7,7}array_length(anyarray, int)
intreturns the length of the requested array dimensionarray_length(array[1,2,3], 1)3array_lower(anyarray, int)
intreturns lower bound of the requested array dimensionarray_lower('[0:2]={1,2,3}'::int[], 1)0array_position(anyarray, anyelement, int)
intreturns the subscript of the first occurrence of the second
argument in the array, starting at the element indicated by the third
argument or at the first element (array must be one-dimensional)array_position(ARRAY['sun','mon','tue','wed','thu','fri','sat'], 'mon')2array_positions(anyarray, anyelement)
int[]returns an array of subscripts of all occurrences of the second
argument in the array given as first argument (array must be
one-dimensional)array_positions(ARRAY['A','A','B','A'], 'A'){1,2,4}array_prepend(anyelement, anyarray)
anyarrayprepend an element to the beginning of an arrayarray_prepend(1, ARRAY[2,3]){1,2,3}array_remove(anyarray, anyelement)
anyarrayremove all elements equal to the given value from the array
(array must be one-dimensional)array_remove(ARRAY[1,2,3,2], 2){1,3}array_replace(anyarray, anyelement, anyelement)
anyarrayreplace each array element equal to the given value with a new valuearray_replace(ARRAY[1,2,5,4], 5, 3){1,2,3,4}array_to_string(anyarray, text, text)
textconcatenates array elements using supplied delimiter and
optional null stringarray_to_string(ARRAY[1, 2, 3, NULL, 5], ',', '*')1,2,3,*,5array_upper(anyarray, int)
intreturns upper bound of the requested array dimensionarray_upper(ARRAY[1,8,3,7], 1)4cardinality(anyarray)
intreturns the total number of elements in the array, or 0 if the array is emptycardinality(ARRAY[[1,2],[3,4]])4string_to_array(text, text, text)
text[]splits string into array elements using supplied delimiter and
optional null stringstring_to_array('xx~^~yy~^~zz', '~^~', 'yy'){xx,NULL,zz}unnest(anyarray)
setof anyelementexpand an array to a set of rowsunnest(ARRAY[1,2])1
2(2 rows)unnest(anyarray, anyarray [, ...])
setof anyelement, anyelement [, ...]expand multiple arrays (possibly of different types) to a set
of rows. This is only allowed in the FROM clause; see
unnest(ARRAY[1,2],ARRAY['foo','bar','baz'])1 foo
2 bar
NULL baz(3 rows)
In array_position and array_positions,
each array element is compared to the searched value using
IS NOT DISTINCT FROM semantics.
In array_position, NULL is returned
if the value is not found.
In array_positions, NULL is returned
only if the array is NULL; if the value is not found in
the array, an empty array is returned instead.
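The IS NOT DISTINCT FROM semantics mean that, unlike ordinary equality, NULL elements can be located:

```sql
SELECT array_position(ARRAY['a', NULL, 'b'], NULL);         -- 2
SELECT array_positions(ARRAY[NULL, 1, NULL]::int[], NULL);  -- {1,3}
```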
In string_to_array, if the delimiter parameter is
NULL, each character in the input string will become a separate element in
the resulting array. If the delimiter is an empty string, then the entire
input string is returned as a one-element array. Otherwise the input
string is split at each occurrence of the delimiter string.
In string_to_array, if the null-string parameter
is omitted or NULL, none of the substrings of the input will be replaced
by NULL.
In array_to_string, if the null-string parameter
is omitted or NULL, any null elements in the array are simply skipped
and not represented in the output string.
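The effect of the optional null-string parameter in both directions:

```sql
SELECT string_to_array('a,b,,c', ',');       -- {a,b,"",c}: no null substitution
SELECT string_to_array('a,b,,c', ',', '');   -- {a,b,NULL,c}: '' becomes NULL
SELECT array_to_string(ARRAY['a', NULL, 'c'], ',');       -- a,c: NULLs skipped
SELECT array_to_string(ARRAY['a', NULL, 'c'], ',', '*');  -- a,*,c
```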
There are two differences in the behavior of string_to_array
from pre-9.1 versions of PostgreSQL.
First, it will return an empty (zero-element) array rather than NULL when
the input string is of zero length. Second, if the delimiter string is
NULL, the function splits the input into individual characters, rather
than returning NULL as before.
See also about the aggregate
function array_agg for use with arrays.
Range Functions and Operators
See for an overview of range types.
shows the operators
available for range types.
Range OperatorsOperatorDescriptionExampleResult=equalint4range(1,5) = '[1,4]'::int4ranget<>not equalnumrange(1.1,2.2) <> numrange(1.1,2.3)t<less thanint4range(1,10) < int4range(2,3)t>greater thanint4range(1,10) > int4range(1,5)t<=less than or equalnumrange(1.1,2.2) <= numrange(1.1,2.2)t>=greater than or equalnumrange(1.1,2.2) >= numrange(1.1,2.0)t@>contains rangeint4range(2,4) @> int4range(2,3)t@>contains element'[2011-01-01,2011-03-01)'::tsrange @> '2011-01-10'::timestampt<@range is contained byint4range(2,4) <@ int4range(1,7)t<@element is contained by42 <@ int4range(1,7)f&&overlap (have points in common)int8range(3,7) && int8range(4,12)t<<strictly left ofint8range(1,10) << int8range(100,110)t>>strictly right ofint8range(50,60) >> int8range(20,30)t&<does not extend to the right ofint8range(1,20) &< int8range(18,20)t&>does not extend to the left ofint8range(7,20) &> int8range(5,10)t-|-is adjacent tonumrange(1.1,2.2) -|- numrange(2.2,3.3)t+unionnumrange(5,15) + numrange(10,20)[5,20)*intersectionint8range(5,15) * int8range(10,20)[10,15)-differenceint8range(5,15) - int8range(10,20)[5,10)
The simple comparison operators <,
>, <=, and
>= compare the lower bounds first, and only if those
are equal, compare the upper bounds. These comparisons are not usually
very useful for ranges, but are provided to allow B-tree indexes to be
constructed on ranges.
The left-of/right-of/adjacent operators always return false when an empty
range is involved; that is, an empty range is not considered to be either
before or after any other range.
The union and difference operators will fail if the resulting range would
need to contain two disjoint sub-ranges, as such a range cannot be
represented.
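For example:

```sql
-- Overlapping ranges can be united:
SELECT numrange(5,15) + numrange(10,20);     -- [5,20)
-- Disjoint ranges cannot, because the result is not a single range:
-- SELECT int4range(1,3) + int4range(7,9);   -- raises an error
SELECT int4range(1,3) -|- int4range(3,7);    -- t: adjacency test
```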
shows the functions
available for use with range types.
lowerupperisemptylower_incupper_inclower_infupper_inf
Range FunctionsFunctionReturn TypeDescriptionExampleResultlower(anyrange)
range's element typelower bound of rangelower(numrange(1.1,2.2))1.1upper(anyrange)
range's element typeupper bound of rangeupper(numrange(1.1,2.2))2.2isempty(anyrange)
booleanis the range empty?isempty(numrange(1.1,2.2))falselower_inc(anyrange)
booleanis the lower bound inclusive?lower_inc(numrange(1.1,2.2))trueupper_inc(anyrange)
booleanis the upper bound inclusive?upper_inc(numrange(1.1,2.2))falselower_inf(anyrange)
booleanis the lower bound infinite?lower_inf('(,)'::daterange)trueupper_inf(anyrange)
booleanis the upper bound infinite?upper_inf('(,)'::daterange)truerange_merge(anyrange, anyrange)
anyrangethe smallest range which includes both of the given rangesrange_merge('[1,2)'::int4range, '[3,4)'::int4range)[1,4)
The lower and upper functions return null
if the range is empty or the requested bound is infinite.
The lower_inc, upper_inc,
lower_inf, and upper_inf
functions all return false for an empty range.
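These special cases in a few literal calls:

```sql
SELECT lower('empty'::int4range);      -- NULL: an empty range has no bounds
SELECT lower_inc('empty'::int4range);  -- f
SELECT upper('(,)'::int4range);        -- NULL: infinite bound
SELECT upper_inf('(,)'::int4range);    -- t
```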
Aggregate Functionsaggregate functionbuilt-inAggregate functions compute a single result
from a set of input values. The built-in general-purpose aggregate
functions are listed in
and statistical aggregates in .
The built-in within-group ordered-set aggregate functions
are listed in
while the built-in within-group hypothetical-set ones are in . Grouping operations,
which are closely related to aggregate functions, are listed in
.
The special syntax considerations for aggregate
functions are explained in .
Consult for additional introductory
information.
General-Purpose Aggregate FunctionsFunctionArgument Type(s)Return TypePartial ModeDescriptionarray_aggarray_agg(expression)
any non-array type
array of the argument type
Noinput values, including nulls, concatenated into an arrayarray_agg(expression)
any array type
same as argument data type
Noinput arrays concatenated into array of one higher dimension
(inputs must all have same dimensionality,
and cannot be empty or null)averageavgavg(expression)smallint, int,
bigint, real, double
precision, numeric, or intervalnumeric for any integer-type argument,
double precision for a floating-point argument,
otherwise the same as the argument data type
Yesthe average (arithmetic mean) of all non-null input valuesbit_andbit_and(expression)smallint, int, bigint, or
bit
same as argument data type
Yesthe bitwise AND of all non-null input values, or null if nonebit_orbit_or(expression)smallint, int, bigint, or
bit
same as argument data type
Yesthe bitwise OR of all non-null input values, or null if nonebool_andbool_and(expression)boolboolYestrue if all input values are true, otherwise falsebool_orbool_or(expression)boolboolYestrue if at least one input value is true, otherwise falsecountcount(*)bigintYesnumber of input rowscount(expression)anybigintYes
number of input rows for which the value of expression is not null
everyevery(expression)boolboolYesequivalent to bool_andjson_aggjson_agg(expression)anyjsonNoaggregates values, including nulls, as a JSON arrayjsonb_aggjsonb_agg(expression)anyjsonbNoaggregates values, including nulls, as a JSON arrayjson_object_aggjson_object_agg(name, value)(any, any)jsonNoaggregates name/value pairs as a JSON object; values can be
null, but not namesjsonb_object_aggjsonb_object_agg(name, value)(any, any)jsonbNoaggregates name/value pairs as a JSON object; values can be
null, but not namesmaxmax(expression)any numeric, string, date/time, network, or enum type,
or arrays of these typessame as argument typeYes
maximum value of expression across all non-null input
values
minmin(expression)any numeric, string, date/time, network, or enum type,
or arrays of these typessame as argument typeYes
minimum value of expression across all non-null input
values
string_agg
string_agg(expression,
delimiter)
(text, text) or (bytea, bytea)
same as argument types
Nonon-null input values concatenated into a string, separated by delimitersumsum(expression)smallint, int,
bigint, real, double
precision, numeric,
interval, or moneybigint for smallint or
int arguments, numeric for
bigint arguments, otherwise the same as the
argument data type
Yessum of expression
across all non-null input valuesxmlaggxmlagg(expression)xmlxmlNoconcatenation of non-null XML values
(see also )
It should be noted that except for count,
these functions return a null value when no rows are selected. In
particular, sum of no rows returns null, not
zero as one might expect, and array_agg
returns null rather than an empty array when there are no input
rows. The coalesce function can be used to
substitute zero or an empty array for null when necessary.
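A minimal sketch of the no-rows case (orders and amount are invented names):

```sql
-- With no input rows, sum returns NULL, not 0:
SELECT coalesce(sum(amount), 0) FROM orders WHERE false;  -- 0
SELECT count(*) FROM orders WHERE false;                  -- 0 (count is the exception)
```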
Aggregate functions which support Partial Mode
are eligible to participate in various optimizations, such as parallel
aggregation.
ANYSOME
Boolean aggregates bool_and and
bool_or correspond to standard SQL aggregates
every and any or
some.
As for any and some,
it seems that there is an ambiguity built into the standard syntax:
SELECT b1 = ANY((SELECT b2 FROM t2 ...)) FROM t1 ...;
Here ANY can be considered either as introducing
a subquery, or as being an aggregate function, if the subquery
returns one row with a Boolean value.
Thus the standard name cannot be given to these aggregates.
Users accustomed to working with other SQL database management
systems might be disappointed by the performance of the
count aggregate when it is applied to the
entire table. A query like:
SELECT count(*) FROM sometable;
will require effort proportional to the size of the table:
PostgreSQL will need to scan either the
entire table or the entirety of an index which includes all rows in
the table.
The aggregate functions array_agg,
json_agg, jsonb_agg,
json_object_agg, jsonb_object_agg,
string_agg,
and xmlagg, as well as similar user-defined
aggregate functions, produce meaningfully different result values
depending on the order of the input values. This ordering is
unspecified by default, but can be controlled by writing an
ORDER BY clause within the aggregate call, as shown in
.
Alternatively, supplying the input values from a sorted subquery
will usually work.
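Both forms in a small sketch (towns and name are invented names):

```sql
-- Ordering inside the aggregate call:
SELECT string_agg(name, ',' ORDER BY name) FROM towns;
-- Equivalent sorted-subquery form:
SELECT string_agg(name, ',')
FROM (SELECT name FROM towns ORDER BY name) AS sorted;
```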
Beware that this approach can fail if the outer query level contains
additional processing, such as a join, because that might cause the
subquery's output to be reordered before the aggregate is computed.
shows
aggregate functions typically used in statistical analysis.
(These are separated out merely to avoid cluttering the listing
of more-commonly-used aggregates.) Where the description mentions
N, it means the
number of input rows for which all the input expressions are non-null.
In all cases, null is returned if the computation is meaningless,
for example when N is zero.
statisticslinear regression
Aggregate Functions for StatisticsFunctionArgument TypeReturn TypePartial ModeDescriptioncorrelationcorrcorr(Y, X)double precisiondouble precisionYescorrelation coefficientcovariancepopulationcovar_popcovar_pop(Y, X)double precisiondouble precisionYespopulation covariancecovariancesamplecovar_sampcovar_samp(Y, X)double precisiondouble precisionYessample covarianceregr_avgxregr_avgx(Y, X)double precisiondouble precisionYesaverage of the independent variable
(sum(X)/N)regr_avgyregr_avgy(Y, X)double precisiondouble precisionYesaverage of the dependent variable
(sum(Y)/N)regr_countregr_count(Y, X)double precisionbigintYesnumber of input rows in which both expressions are nonnullregression interceptregr_interceptregr_intercept(Y, X)double precisiondouble precisionYesy-intercept of the least-squares-fit linear equation
determined by the (X, Y) pairsregr_r2regr_r2(Y, X)double precisiondouble precisionYessquare of the correlation coefficientregression sloperegr_sloperegr_slope(Y, X)double precisiondouble precisionYesslope of the least-squares-fit linear equation determined
by the (X,
Y) pairsregr_sxxregr_sxx(Y, X)double precisiondouble precisionYessum(X^2) - sum(X)^2/N (sum of
squares of the independent variable)regr_sxyregr_sxy(Y, X)double precisiondouble precisionYessum(X*Y) - sum(X) * sum(Y)/N (sum of
products of independent times dependent
variable)regr_syyregr_syy(Y, X)double precisiondouble precisionYessum(Y^2) - sum(Y)^2/N (sum of
squares of the dependent variable)standard deviationstddevstddev(expression)smallint, int,
bigint, real, double
precision, or numericdouble precision for floating-point arguments,
otherwise numericYeshistorical alias for stddev_sampstandard deviationpopulationstddev_popstddev_pop(expression)smallint, int,
bigint, real, double
precision, or numericdouble precision for floating-point arguments,
otherwise numericYespopulation standard deviation of the input valuesstandard deviationsamplestddev_sampstddev_samp(expression)smallint, int,
bigint, real, double
precision, or numericdouble precision for floating-point arguments,
otherwise numericYessample standard deviation of the input valuesvariancevariance(expression)
smallint, int,
bigint, real, double
precision, or numericdouble precision for floating-point arguments,
otherwise numericYeshistorical alias for var_sampvariancepopulationvar_popvar_pop(expression)
smallint, int,
bigint, real, double
precision, or numericdouble precision for floating-point arguments,
otherwise numericYespopulation variance of the input values (square of the population standard deviation)variancesamplevar_sampvar_samp(expression)
smallint, int,
bigint, real, double
precision, or numericdouble precision for floating-point arguments,
otherwise numericYessample variance of the input values (square of the sample standard deviation)
shows some
aggregate functions that use the ordered-set aggregate
syntax. These functions are sometimes referred to as inverse
distribution functions.
ordered-set aggregatebuilt-ininverse distribution
Ordered-Set Aggregate FunctionsFunctionDirect Argument Type(s)Aggregated Argument Type(s)Return TypePartial ModeDescriptionmodestatisticalmode() WITHIN GROUP (ORDER BY sort_expression)
any sortable type
same as sort expression
No
returns the most frequent input value (arbitrarily choosing the first
one if there are multiple equally-frequent results)
percentilecontinuouspercentile_cont(fraction) WITHIN GROUP (ORDER BY sort_expression)double precisiondouble precision or interval
same as sort expression
No
continuous percentile: returns a value corresponding to the specified
fraction in the ordering, interpolating between adjacent input items if
needed
percentile_cont(fractions) WITHIN GROUP (ORDER BY sort_expression)double precision[]double precision or interval
array of sort expression's type
No
multiple continuous percentile: returns an array of results matching
the shape of the fractions parameter, with each
non-null element replaced by the value corresponding to that percentile
percentilediscretepercentile_disc(fraction) WITHIN GROUP (ORDER BY sort_expression)double precision
any sortable type
same as sort expression
No
discrete percentile: returns the first input value whose position in
the ordering equals or exceeds the specified fraction
percentile_disc(fractions) WITHIN GROUP (ORDER BY sort_expression)double precision[]
any sortable type
array of sort expression's type
No
multiple discrete percentile: returns an array of results matching the
shape of the fractions parameter, with each non-null
element replaced by the input value corresponding to that percentile
All the aggregates listed in
ignore null values in their sorted input. For those that take
a fraction parameter, the fraction value must be
between 0 and 1; an error is thrown if not. However, a null fraction value
simply produces a null result.
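For illustration, these calls compute a median two ways, plus the mode, over a hypothetical measurements table whose reading column is of a numeric type:

```sql
-- percentile_cont interpolates between adjacent inputs;
-- percentile_disc always returns an actual input value.
SELECT
  percentile_cont(0.5) WITHIN GROUP (ORDER BY reading) AS median_interpolated,
  percentile_disc(0.5) WITHIN GROUP (ORDER BY reading) AS median_input_value,
  mode() WITHIN GROUP (ORDER BY reading)               AS most_common
FROM measurements;
```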
hypothetical-set aggregatebuilt-in
Each of the aggregates listed in
is associated with a
window function of the same name defined in
. In each case, the aggregate result
is the value that the associated window function would have
returned for the hypothetical row constructed from
args, if such a row had been added to the sorted
group of rows computed from the sorted_args.
Hypothetical-Set Aggregate FunctionsFunctionDirect Argument Type(s)Aggregated Argument Type(s)Return TypePartial ModeDescriptionrankhypotheticalrank(args) WITHIN GROUP (ORDER BY sorted_args)VARIADIC"any"VARIADIC"any"bigintNo
rank of the hypothetical row, with gaps for duplicate rows
dense_rankhypotheticaldense_rank(args) WITHIN GROUP (ORDER BY sorted_args)VARIADIC"any"VARIADIC"any"bigintNo
rank of the hypothetical row, without gaps
percent_rankhypotheticalpercent_rank(args) WITHIN GROUP (ORDER BY sorted_args)VARIADIC"any"VARIADIC"any"double precisionNo
relative rank of the hypothetical row, ranging from 0 to 1
cume_disthypotheticalcume_dist(args) WITHIN GROUP (ORDER BY sorted_args)VARIADIC"any"VARIADIC"any"double precisionNo
relative rank of the hypothetical row, ranging from
1/N to 1
For each of these hypothetical-set aggregates, the list of direct arguments
given in args must match the number and types of
the aggregated arguments given in sorted_args.
Unlike most built-in aggregates, these aggregates are not strict; that is,
they do not drop input rows containing nulls. Null values sort according
to the rule specified in the ORDER BY clause.
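For example, to ask what rank a hypothetical salary of 50000 would receive among existing salaries (the table and column names are illustrative):

```sql
-- Rank that the value 50000 would have if it were added
-- to the sorted group of salary values.
SELECT rank(50000) WITHIN GROUP (ORDER BY salary) AS hypothetical_rank
FROM employees;
```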
Grouping OperationsFunctionReturn TypeDescriptionGROUPINGGROUPING(args...)integer
Integer bit mask indicating which arguments are not being included in the current
grouping set
Grouping operations are used in conjunction with grouping sets (see
) to distinguish result rows. The
arguments to the GROUPING operation are not actually evaluated,
but they must exactly match the expressions given in the GROUP BY
clause of the associated query level. Bits are assigned with the rightmost
argument being the least-significant bit; each bit is 0 if the corresponding
expression is included in the grouping criteria of the grouping set generating
the result row, and 1 if it is not. For example:
=>SELECT * FROM items_sold;
make | model | sales
-------+-------+-------
Foo | GT | 10
Foo | Tour | 20
Bar | City | 15
Bar | Sport | 5
(4 rows)
=>SELECT make, model, GROUPING(make,model), sum(sales) FROM items_sold GROUP BY ROLLUP(make,model);
make | model | grouping | sum
-------+-------+----------+-----
Foo | GT | 0 | 10
Foo | Tour | 0 | 20
Bar | City | 0 | 15
Bar | Sport | 0 | 5
Foo | | 1 | 30
Bar | | 1 | 20
| | 3 | 50
(7 rows)
Window Functionswindow functionbuilt-inWindow functions provide the ability to perform
calculations across sets of rows that are related to the current query
row. See for an introduction to this
feature, and for syntax
details.
The built-in window functions are listed in
. Note that these functions
must be invoked using window function syntax, i.e., an
OVER clause is required.
In addition to these functions, any built-in or user-defined
general-purpose or statistical
aggregate (i.e., not ordered-set or hypothetical-set aggregates)
can be used as a window function; see
for a list of the built-in aggregates.
Aggregate functions act as window functions only when an OVER
clause follows the call; otherwise they act as non-window aggregates
and return a single row for the entire set.
General-Purpose Window FunctionsFunctionReturn TypeDescriptionrow_numberrow_number()bigintnumber of the current row within its partition, counting from 1rankrank()bigintrank of the current row with gaps; same as row_number of its first peerdense_rankdense_rank()bigintrank of the current row without gaps; this function counts peer groupspercent_rankpercent_rank()double precisionrelative rank of the current row: (rank - 1) / (total partition rows - 1)cume_distcume_dist()double precisioncumulative distribution: (number of partition rows preceding or peer with current row) / total partition rowsntilentile(num_bucketsinteger)integerinteger ranging from 1 to the argument value, dividing the
partition as equally as possiblelag
lag(valueanyelement
[, offsetinteger
[, defaultanyelement ]])
same type as value
returns value evaluated at
the row that is offset
rows before the current row within the partition; if there is no such
row, instead return default
(which must be of the same type as
value).
Both offset and
default are evaluated
with respect to the current row. If omitted,
offset defaults to 1 and
default to null
lead
lead(valueanyelement
[, offsetinteger
[, defaultanyelement ]])
same type as value
returns value evaluated at
the row that is offset
rows after the current row within the partition; if there is no such
row, instead return default
(which must be of the same type as
value).
Both offset and
default are evaluated
with respect to the current row. If omitted,
offset defaults to 1 and
default to null
first_valuefirst_value(valueany)same type as value
returns value evaluated
at the row that is the first row of the window frame
last_valuelast_value(valueany)same type as value
returns value evaluated
at the row that is the last row of the window frame
nth_value
nth_value(valueany, nthinteger)
same type as value
returns value evaluated
at the row that is the nth
row of the window frame (counting from 1); null if no such row
All of the functions listed in
depend on the sort ordering
specified by the ORDER BY clause of the associated window
definition. Rows that are not distinct when considering only the
ORDER BY columns are said to be peers.
The four ranking functions (including cume_dist) are
defined so that they give the same answer for all peer rows.
Note that first_value, last_value, and
nth_value consider only the rows within the window
frame, which by default contains the rows from the start of the
partition through the last peer of the current row. This is
likely to give unhelpful results for last_value and
sometimes also nth_value. You can redefine the frame by
adding a suitable frame specification (RANGE,
ROWS or GROUPS) to
the OVER clause.
See for more information
about frame specifications.
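A sketch of the last_value pitfall and its fix, using an illustrative table:

```sql
-- With the default frame, last_value sees only rows up through the
-- current row's peers; extending the frame to the whole partition
-- gives the usually intended result.
SELECT
  salary,
  last_value(salary) OVER (ORDER BY salary) AS default_frame,
  last_value(salary) OVER (
    ORDER BY salary
    ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
  ) AS whole_partition
FROM employees;
```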
When an aggregate function is used as a window function, it aggregates
over the rows within the current row's window frame.
An aggregate used with ORDER BY and the default window frame
definition produces a running sum type of behavior, which may or
may not be what's wanted. To obtain
aggregation over the whole partition, omit ORDER BY or use
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING.
Other frame specifications can be used to obtain other effects.
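The running-sum behavior can be sketched as follows, assuming a hypothetical sales table:

```sql
-- ORDER BY inside OVER yields a running total per row;
-- an empty OVER () aggregates over the whole partition.
SELECT
  month,
  amount,
  sum(amount) OVER (ORDER BY month) AS running_total,
  sum(amount) OVER ()               AS grand_total
FROM sales;
```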
The SQL standard defines a RESPECT NULLS or
IGNORE NULLS option for lead, lag,
first_value, last_value, and
nth_value. This is not implemented in
PostgreSQL: the behavior is always the
same as the standard's default, namely RESPECT NULLS.
Likewise, the standard's FROM FIRST or FROM LAST
option for nth_value is not implemented: only the
default FROM FIRST behavior is supported. (You can achieve
the result of FROM LAST by reversing the ORDER BY
ordering.)
cume_dist computes the fraction of partition rows that
are less than or equal to the current row and its peers, while
percent_rank computes the fraction of partition rows that
are less than the current row, assuming the current row does not exist
in the partition.
Subquery ExpressionsEXISTSINNOT INANYALLSOMEsubquery
This section describes the SQL-compliant subquery
expressions available in PostgreSQL.
All of the expression forms documented in this section return
Boolean (true/false) results.
EXISTS
EXISTS (subquery)
The argument of EXISTS is an arbitrary SELECT statement,
or subquery. The
subquery is evaluated to determine whether it returns any rows.
If it returns at least one row, the result of EXISTS is
true; if the subquery returns no rows, the result of EXISTS
is false.
The subquery can refer to variables from the surrounding query,
which will act as constants during any one evaluation of the subquery.
The subquery will generally only be executed long enough to determine
whether at least one row is returned, not all the way to completion.
It is unwise to write a subquery that has side effects (such as
calling sequence functions); whether the side effects occur
might be unpredictable.
Since the result depends only on whether any rows are returned,
and not on the contents of those rows, the output list of the
subquery is normally unimportant. A common coding convention is
to write all EXISTS tests in the form
EXISTS(SELECT 1 WHERE ...). There are exceptions to
this rule however, such as subqueries that use INTERSECT.
This simple example is like an inner join on col2, but
it produces at most one output row for each tab1 row,
even if there are several matching tab2 rows:
SELECT col1
FROM tab1
WHERE EXISTS (SELECT 1 FROM tab2 WHERE col2 = tab1.col2);
INexpression IN (subquery)
The right-hand side is a parenthesized
subquery, which must return exactly one column. The left-hand expression
is evaluated and compared to each row of the subquery result.
The result of IN is true if any equal subquery row is found.
The result is false if no equal row is found (including the
case where the subquery returns no rows).
Note that if the left-hand expression yields null, or if there are
no equal right-hand values and at least one right-hand row yields
null, the result of the IN construct will be null, not false.
This is in accordance with SQL's normal rules for Boolean combinations
of null values.
As with EXISTS, it's unwise to assume that the subquery will
be evaluated completely.
row_constructor IN (subquery)
The left-hand side of this form of IN is a row constructor,
as described in .
The right-hand side is a parenthesized
subquery, which must return exactly as many columns as there are
expressions in the left-hand row. The left-hand expressions are
evaluated and compared row-wise to each row of the subquery result.
The result of IN is true if any equal subquery row is found.
The result is false if no equal row is found (including the
case where the subquery returns no rows).
As usual, null values in the rows are combined per
the normal rules of SQL Boolean expressions. Two rows are considered
equal if all their corresponding members are non-null and equal; the rows
are unequal if any corresponding members are non-null and unequal;
otherwise the result of that row comparison is unknown (null).
If all the per-row results are either unequal or null, with at least one
null, then the result of IN is null.
NOT INexpression NOT IN (subquery)
The right-hand side is a parenthesized
subquery, which must return exactly one column. The left-hand expression
is evaluated and compared to each row of the subquery result.
The result of NOT IN is true if only unequal subquery rows
are found (including the case where the subquery returns no rows).
The result is false if any equal row is found.
Note that if the left-hand expression yields null, or if there are
no equal right-hand values and at least one right-hand row yields
null, the result of the NOT IN construct will be null, not true.
This is in accordance with SQL's normal rules for Boolean combinations
of null values.
As with EXISTS, it's unwise to assume that the subquery will
be evaluated completely.
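The null behavior described above is a common surprise in practice. With the same illustrative tab1 and tab2 tables used earlier:

```sql
-- If tab2.col2 contains any NULL and no row matches, this query
-- returns no rows at all: the NOT IN condition is null, not true.
SELECT col1 FROM tab1
WHERE col1 NOT IN (SELECT col2 FROM tab2);

-- A NOT EXISTS formulation avoids the NULL trap.
SELECT col1 FROM tab1
WHERE NOT EXISTS (SELECT 1 FROM tab2 WHERE tab2.col2 = tab1.col1);
```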
row_constructor NOT IN (subquery)
The left-hand side of this form of NOT IN is a row constructor,
as described in .
The right-hand side is a parenthesized
subquery, which must return exactly as many columns as there are
expressions in the left-hand row. The left-hand expressions are
evaluated and compared row-wise to each row of the subquery result.
The result of NOT IN is true if only unequal subquery rows
are found (including the case where the subquery returns no rows).
The result is false if any equal row is found.
As usual, null values in the rows are combined per
the normal rules of SQL Boolean expressions. Two rows are considered
equal if all their corresponding members are non-null and equal; the rows
are unequal if any corresponding members are non-null and unequal;
otherwise the result of that row comparison is unknown (null).
If all the per-row results are either unequal or null, with at least one
null, then the result of NOT IN is null.
ANY/SOMEexpressionoperator ANY (subquery)
expressionoperator SOME (subquery)
The right-hand side is a parenthesized
subquery, which must return exactly one column. The left-hand expression
is evaluated and compared to each row of the subquery result using the
given operator, which must yield a Boolean
result.
The result of ANY is true if any true result is obtained.
The result is false if no true result is found (including the
case where the subquery returns no rows).
SOME is a synonym for ANY.
IN is equivalent to = ANY.
Note that if there are no successes and at least one right-hand row yields
null for the operator's result, the result of the ANY construct
will be null, not false.
This is in accordance with SQL's normal rules for Boolean combinations
of null values.
As with EXISTS, it's unwise to assume that the subquery will
be evaluated completely.
row_constructoroperator ANY (subquery)
row_constructoroperator SOME (subquery)
The left-hand side of this form of ANY is a row constructor,
as described in .
The right-hand side is a parenthesized
subquery, which must return exactly as many columns as there are
expressions in the left-hand row. The left-hand expressions are
evaluated and compared row-wise to each row of the subquery result,
using the given operator.
The result of ANY is true if the comparison
returns true for any subquery row.
The result is false if the comparison returns false for every
subquery row (including the case where the subquery returns no
rows).
The result is NULL if no comparison with a subquery row returns true,
and at least one comparison returns NULL.
See for details about the meaning
of a row constructor comparison.
ALLexpressionoperator ALL (subquery)
The right-hand side is a parenthesized
subquery, which must return exactly one column. The left-hand expression
is evaluated and compared to each row of the subquery result using the
given operator, which must yield a Boolean
result.
The result of ALL is true if all rows yield true
(including the case where the subquery returns no rows).
The result is false if any false result is found.
The result is NULL if no comparison with a subquery row returns false,
and at least one comparison returns NULL.
NOT IN is equivalent to <> ALL.
As with EXISTS, it's unwise to assume that the subquery will
be evaluated completely.
row_constructoroperator ALL (subquery)
The left-hand side of this form of ALL is a row constructor,
as described in .
The right-hand side is a parenthesized
subquery, which must return exactly as many columns as there are
expressions in the left-hand row. The left-hand expressions are
evaluated and compared row-wise to each row of the subquery result,
using the given operator.
The result of ALL is true if the comparison
returns true for all subquery rows (including the
case where the subquery returns no rows).
The result is false if the comparison returns false for any
subquery row.
The result is NULL if no comparison with a subquery row returns false,
and at least one comparison returns NULL.
See for details about the meaning
of a row constructor comparison.
Single-Row Comparisoncomparisonsubquery result rowrow_constructoroperator (subquery)
The left-hand side is a row constructor,
as described in .
The right-hand side is a parenthesized subquery, which must return exactly
as many columns as there are expressions in the left-hand row. Furthermore,
the subquery cannot return more than one row. (If it returns zero rows,
the result is taken to be null.) The left-hand side is evaluated and
compared row-wise to the single subquery result row.
See for details about the meaning
of a row constructor comparison.
Row and Array ComparisonsINNOT INANYALLSOMEcomposite typecomparisonrow-wise comparisoncomparisoncomposite typecomparisonrow constructorIS DISTINCT FROMIS NOT DISTINCT FROM
This section describes several specialized constructs for making
multiple comparisons between groups of values. These forms are
syntactically related to the subquery forms of the previous section,
but do not involve subqueries.
The forms involving array subexpressions are
PostgreSQL extensions; the rest are
SQL-compliant.
All of the expression forms documented in this section return
Boolean (true/false) results.
INexpression IN (value, ...)
The right-hand side is a parenthesized list
of scalar expressions. The result is true if the left-hand expression's
result is equal to any of the right-hand expressions. This is a shorthand
notation for
expression = value1
OR
expression = value2
OR
...
Note that if the left-hand expression yields null, or if there are
no equal right-hand values and at least one right-hand expression yields
null, the result of the IN construct will be null, not false.
This is in accordance with SQL's normal rules for Boolean combinations
of null values.
NOT INexpression NOT IN (value, ...)
The right-hand side is a parenthesized list
of scalar expressions. The result is true if the left-hand expression's
result is unequal to all of the right-hand expressions. This is a shorthand
notation for
expression <> value1
AND
expression <> value2
AND
...
Note that if the left-hand expression yields null, or if there are
no equal right-hand values and at least one right-hand expression yields
null, the result of the NOT IN construct will be null, not true
as one might naively expect.
This is in accordance with SQL's normal rules for Boolean combinations
of null values.
x NOT IN y is equivalent to NOT (x IN y) in all
cases. However, null values are much more likely to trip up the novice when
working with NOT IN than when working with IN.
It is best to express your condition positively if possible.
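The null behavior can be seen directly with literal value lists:

```sql
SELECT 2 IN (1, 2, NULL);      -- true: an equal value is found
SELECT 3 IN (1, 2, NULL);      -- null: no match, and a NULL is present
SELECT 3 NOT IN (1, 2, NULL);  -- null, not true
```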
ANY/SOME (array)expressionoperator ANY (array expression)
expressionoperator SOME (array expression)
The right-hand side is a parenthesized expression, which must yield an
array value.
The left-hand expression
is evaluated and compared to each element of the array using the
given operator, which must yield a Boolean
result.
The result of ANY is true if any true result is obtained.
The result is false if no true result is found (including the
case where the array has zero elements).
If the array expression yields a null array, the result of
ANY will be null. If the left-hand expression yields null,
the result of ANY is ordinarily null (though a non-strict
comparison operator could possibly yield a different result).
Also, if the right-hand array contains any null elements and no true
comparison result is obtained, the result of ANY
will be null, not false (again, assuming a strict comparison operator).
This is in accordance with SQL's normal rules for Boolean combinations
of null values.
SOME is a synonym for ANY.
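A few self-contained examples of ANY over array expressions:

```sql
SELECT 5 = ANY (ARRAY[1, 5, 9]);     -- true: an element matches
SELECT 5 > ANY (ARRAY[6, 7, 8]);     -- false: no comparison is true
SELECT 5 = ANY (ARRAY[1, 2, NULL]);  -- null: no match, array has a NULL
```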
ALL (array)expressionoperator ALL (array expression)
The right-hand side is a parenthesized expression, which must yield an
array value.
The left-hand expression
is evaluated and compared to each element of the array using the
given operator, which must yield a Boolean
result.
The result of ALL is true if all comparisons yield true
(including the case where the array has zero elements).
The result is false if any false result is found.
If the array expression yields a null array, the result of
ALL will be null. If the left-hand expression yields null,
the result of ALL is ordinarily null (though a non-strict
comparison operator could possibly yield a different result).
Also, if the right-hand array contains any null elements and no false
comparison result is obtained, the result of ALL
will be null, not true (again, assuming a strict comparison operator).
This is in accordance with SQL's normal rules for Boolean combinations
of null values.
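The corresponding ALL cases:

```sql
SELECT 5 < ALL (ARRAY[6, 7, 8]);   -- true: every comparison is true
SELECT 5 < ALL (ARRAY[]::int[]);   -- true: zero elements
SELECT 5 < ALL (ARRAY[6, NULL]);   -- null: no false result, but a NULL one
```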
Row Constructor Comparisonrow_constructoroperatorrow_constructor
Each side is a row constructor,
as described in .
The two row values must have the same number of fields.
Each side is evaluated and they are compared row-wise. Row constructor
comparisons are allowed when the operator is
=,
<>,
<,
<=,
> or
>=.
Every row element must be of a type which has a default B-tree operator
class or the attempted comparison may generate an error.
Errors related to the number or types of elements might not occur if
the comparison is resolved using earlier columns.
The = and <> cases work slightly differently
from the others. Two rows are considered
equal if all their corresponding members are non-null and equal; the rows
are unequal if any corresponding members are non-null and unequal;
otherwise the result of the row comparison is unknown (null).
For the <, <=, > and
>= cases, the row elements are compared left-to-right,
stopping as soon as an unequal or null pair of elements is found.
If either of this pair of elements is null, the result of the
row comparison is unknown (null); otherwise comparison of this pair
of elements determines the result. For example,
ROW(1,2,NULL) < ROW(1,3,0)
yields true, not null, because the third pair of elements are not
considered.
Prior to PostgreSQL 8.2, the
<, <=, > and >=
cases were not handled per SQL specification. A comparison like
ROW(a,b) < ROW(c,d)
was implemented as
a < c AND b < d
whereas the correct behavior is equivalent to
a < c OR (a = c AND b < d).
row_constructor IS DISTINCT FROM row_constructor
This construct is similar to a <> row comparison,
but it does not yield null for null inputs. Instead, any null value is
considered unequal to (distinct from) any non-null value, and any two
nulls are considered equal (not distinct). Thus the result will
either be true or false, never null.
row_constructor IS NOT DISTINCT FROM row_constructor
This construct is similar to a = row comparison,
but it does not yield null for null inputs. Instead, any null value is
considered unequal to (distinct from) any non-null value, and any two
nulls are considered equal (not distinct). Thus the result will always
be either true or false, never null.
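The difference from ordinary row comparison shows up with null fields:

```sql
SELECT ROW(1, NULL) = ROW(1, NULL);                     -- null
SELECT ROW(1, NULL) IS DISTINCT FROM ROW(1, NULL);      -- false: NULLs match
SELECT ROW(1, NULL) IS NOT DISTINCT FROM ROW(1, NULL);  -- true
```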
Composite Type Comparisonrecordoperatorrecord
The SQL specification requires row-wise comparison to return NULL if the
result depends on comparing two NULL values or a NULL and a non-NULL.
PostgreSQL does this only when comparing the
results of two row constructors (as in
) or comparing a row constructor
to the output of a subquery (as in ).
In other contexts where two composite-type values are compared, two
NULL field values are considered equal, and a NULL is considered larger
than a non-NULL. This is necessary in order to have consistent sorting
and indexing behavior for composite types.
Each side is evaluated and they are compared row-wise. Composite type
comparisons are allowed when the operator is
=,
<>,
<,
<=,
> or
>=,
or has semantics similar to one of these. (To be specific, an operator
can be a row comparison operator if it is a member of a B-tree operator
class, or is the negator of the = member of a B-tree operator
class.) The default behavior of the above operators is the same as for
IS [ NOT ] DISTINCT FROM for row constructors (see
).
To support matching of rows which include elements without a default
B-tree operator class, the following operators are defined for composite
type comparison:
*=,
*<>,
*<,
*<=,
*>, and
*>=.
These operators compare the internal binary representation of the two
rows. Two rows might have different binary representations even
though comparison of the two rows with the equality operator returns true.
The ordering of rows under these comparison operators is deterministic
but not otherwise meaningful. These operators are used internally for
materialized views and might be useful for other specialized purposes
such as replication but are not intended to be generally useful for
writing queries.
Set Returning Functionsset returning functionsfunctionsgenerate_series
This section describes functions that possibly return more than one row.
The most widely used functions in this class are series generating
functions, as detailed in and
. Other, more specialized
set-returning functions are described elsewhere in this manual.
See for ways to combine multiple
set-returning functions.
Series Generating FunctionsFunctionArgument TypeReturn TypeDescriptiongenerate_series(start, stop)int, bigint or numericsetof int, setof bigint, or setof numeric (same as argument type)
Generate a series of values, from start to stop
with a step size of one
generate_series(start, stop, step)int, bigint or numericsetof int, setof bigint or setof numeric (same as argument type)
Generate a series of values, from start to stop
with a step size of stepgenerate_series(start, stop, stepinterval)timestamp or timestamp with time zonesetof timestamp or setof timestamp with time zone (same as argument type)
Generate a series of values, from start to stop
with a step size of step
When step is positive, zero rows are returned if
start is greater than stop.
Conversely, when step is negative, zero rows are
returned if start is less than stop.
Zero rows are also returned for NULL inputs. It is an error
for step to be zero. Some examples follow:
SELECT * FROM generate_series(2,4);
generate_series
-----------------
2
3
4
(3 rows)
SELECT * FROM generate_series(5,1,-2);
generate_series
-----------------
5
3
1
(3 rows)
SELECT * FROM generate_series(4,3);
generate_series
-----------------
(0 rows)
SELECT generate_series(1.1, 4, 1.3);
generate_series
-----------------
1.1
2.4
3.7
(3 rows)
-- this example relies on the date-plus-integer operator
SELECT current_date + s.a AS dates FROM generate_series(0,14,7) AS s(a);
dates
------------
2004-02-05
2004-02-12
2004-02-19
(3 rows)
SELECT * FROM generate_series('2008-03-01 00:00'::timestamp,
'2008-03-04 12:00', '10 hours');
generate_series
---------------------
2008-03-01 00:00:00
2008-03-01 10:00:00
2008-03-01 20:00:00
2008-03-02 06:00:00
2008-03-02 16:00:00
2008-03-03 02:00:00
2008-03-03 12:00:00
2008-03-03 22:00:00
2008-03-04 08:00:00
(9 rows)
Subscript Generating FunctionsFunctionReturn TypeDescriptiongenerate_subscripts(array anyarray, dim int)setof int
Generate a series comprising the given array's subscripts.
generate_subscripts(array anyarray, dim int, reverse boolean)setof int
Generate a series comprising the given array's subscripts. When
reverse is true, the series is returned in
reverse order.
generate_subscriptsgenerate_subscripts is a convenience function that generates
the set of valid subscripts for the specified dimension of the given
array.
Zero rows are returned for arrays that do not have the requested dimension,
or for NULL arrays (but valid subscripts are returned for NULL array
elements). Some examples follow:
-- basic usage
SELECT generate_subscripts('{NULL,1,NULL,2}'::int[], 1) AS s;
s
---
1
2
3
4
(4 rows)
-- presenting an array, the subscript and the subscripted
-- value requires a subquery
SELECT * FROM arrays;
a
--------------------
{-1,-2}
{100,200,300}
(2 rows)
SELECT a AS array, s AS subscript, a[s] AS value
FROM (SELECT generate_subscripts(a, 1) AS s, a FROM arrays) foo;
array | subscript | value
---------------+-----------+-------
{-1,-2} | 1 | -1
{-1,-2} | 2 | -2
{100,200,300} | 1 | 100
{100,200,300} | 2 | 200
{100,200,300} | 3 | 300
(5 rows)
-- unnest a 2D array
CREATE OR REPLACE FUNCTION unnest2(anyarray)
RETURNS SETOF anyelement AS $$
select $1[i][j]
from generate_subscripts($1,1) g1(i),
generate_subscripts($1,2) g2(j);
$$ LANGUAGE sql IMMUTABLE;
CREATE FUNCTION
SELECT * FROM unnest2(ARRAY[[1,2],[3,4]]);
unnest2
---------
1
2
3
4
(4 rows)
When a function in the FROM clause is suffixed
by WITH ORDINALITY, a bigint column is
appended to the output which starts from 1 and increments by 1 for each row
of the function's output. This is most useful in the case of set returning
functions such as unnest().
-- set returning function WITH ORDINALITY
SELECT * FROM pg_ls_dir('.') WITH ORDINALITY AS t(ls,n);
ls | n
-----------------+----
pg_serial | 1
pg_twophase | 2
postmaster.opts | 3
pg_notify | 4
postgresql.conf | 5
pg_tblspc | 6
logfile | 7
base | 8
postmaster.pid | 9
pg_ident.conf | 10
global | 11
pg_xact | 12
pg_snapshots | 13
pg_multixact | 14
PG_VERSION | 15
pg_wal | 16
pg_hba.conf | 17
pg_stat_tmp | 18
pg_subtrans | 19
(19 rows)
System Information Functions and Operators

The table below shows several
functions that extract session and system information.
In addition to the functions listed in this section, there are a number of
functions related to the statistics system that also provide system
information. See for more
information.
Session Information Functions

  current_catalog → name
      name of current database (called "catalog" in the SQL standard)
  current_database() → name
      name of current database
  current_query() → text
      text of the currently executing query, as submitted by the client
      (might contain more than one statement)
  current_role → name
      equivalent to current_user
  current_schema[()] → name
      name of current schema
  current_schemas(boolean) → name[]
      names of schemas in search path, optionally including implicit schemas
  current_user → name
      user name of current execution context
  inet_client_addr() → inet
      address of the remote connection
  inet_client_port() → int
      port of the remote connection
  inet_server_addr() → inet
      address of the local connection
  inet_server_port() → int
      port of the local connection
  pg_backend_pid() → int
      process ID of the server process attached to the current session
  pg_blocking_pids(int) → int[]
      process ID(s) that are blocking the specified server process ID
      from acquiring a lock
  pg_conf_load_time() → timestamp with time zone
      configuration load time
  pg_current_logfile([text]) → text
      primary log file name, or log in the requested format, currently
      in use by the logging collector
  pg_my_temp_schema() → oid
      OID of session's temporary schema, or 0 if none
  pg_is_other_temp_schema(oid) → boolean
      is schema another session's temporary schema?
  pg_jit_available() → boolean
      is JIT compilation available in this session? Returns false if jit
      is set to false.
  pg_listening_channels() → setof text
      channel names that the session is currently listening on
  pg_notification_queue_usage() → double
      fraction of the asynchronous notification queue currently occupied (0-1)
  pg_postmaster_start_time() → timestamp with time zone
      server start time
  pg_safe_snapshot_blocking_pids(int) → int[]
      process ID(s) that are blocking the specified server process ID
      from acquiring a safe snapshot
  pg_trigger_depth() → int
      current nesting level of PostgreSQL triggers (0 if not called,
      directly or indirectly, from inside a trigger)
  session_user → name
      session user name
  user → name
      equivalent to current_user
  version() → text
      PostgreSQL version information; see also server_version_num for a
      machine-readable version
current_catalog,
current_role,
current_schema,
current_user,
session_user,
and user have special syntactic status
in SQL: they must be called without trailing
parentheses. (In PostgreSQL, parentheses can optionally be used with
current_schema, but not with the others.)
The session_user is normally the user who initiated
the current database connection; but superusers can change this setting
with SET SESSION AUTHORIZATION.
The current_user is the user identifier
that is applicable for permission checking. Normally it is equal
to the session user, but it can be changed with
SET ROLE.
It also changes during the execution of
functions with the attribute SECURITY DEFINER.
In Unix parlance, the session user is the real user and
the current user is the effective user.
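The distinction is easy to see with SET ROLE. This is a sketch: it assumes a role named auditor exists and that the session user is allowed to assume it.

```sql
-- Assumes a role "auditor" exists and the session user is a member of it.
SELECT session_user, current_user;  -- both report the login role

SET ROLE auditor;
SELECT session_user, current_user;  -- session_user unchanged; current_user is now auditor

RESET ROLE;
```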
current_role and user are
synonyms for current_user. (The SQL standard draws
a distinction between current_role
and current_user, but PostgreSQL
does not, since it unifies users and roles into a single kind of entity.)
current_schema returns the name of the schema that is
first in the search path (or a null value if the search path is
empty). This is the schema that will be used for any tables or
other named objects that are created without specifying a target schema.
current_schemas(boolean) returns an array of the names of all
schemas presently in the search path. The Boolean option determines whether or not
implicitly included system schemas such as pg_catalog are included in the
returned search path.
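The two forms can be compared side by side. The exact result depends on the session's search_path; in a default configuration it typically looks like this:

```sql
SELECT current_schemas(false) AS explicit_schemas,
       current_schemas(true)  AS including_implicit;
-- typically: explicit_schemas   = {public}
--            including_implicit = {pg_catalog,public}
```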
The search path can be altered at run time. The command is:
SET search_path TO schema, schema, ...

inet_client_addr returns the IP address of the
current client, and inet_client_port returns the
port number.
inet_server_addr returns the IP address on which
the server accepted the current connection, and
inet_server_port returns the port number.
All these functions return NULL if the current connection is via a
Unix-domain socket.
pg_blocking_pids returns an array of the process IDs
of the sessions that are blocking the server process with the specified
process ID, or an empty array if there is no such server process or it is
not blocked. One server process blocks another if it either holds a lock
that conflicts with the blocked process's lock request (hard block), or is
waiting for a lock that would conflict with the blocked process's lock
request and is ahead of it in the wait queue (soft block). When using
parallel queries the result always lists client-visible process IDs (that
is, pg_backend_pid results) even if the actual lock is held
or awaited by a child worker process. As a result of that, there may be
duplicated PIDs in the result. Also note that when a prepared transaction
holds a conflicting lock, it will be represented by a zero process ID in
the result of this function.
Frequent calls to this function could have some impact on database
performance, because it needs exclusive access to the lock manager's
shared state for a short time.
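A common monitoring pattern (a sketch, not an official recipe) combines pg_blocking_pids with pg_stat_activity to list the sessions that are currently blocked and who blocks them:

```sql
SELECT pid,
       pg_blocking_pids(pid) AS blocked_by,
       state,
       query
FROM pg_stat_activity
WHERE cardinality(pg_blocking_pids(pid)) > 0;
```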
pg_conf_load_time returns the
timestamp with time zone when the
server configuration files were last loaded.
(If the current session was alive at the time, this will be the time
when the session itself re-read the configuration files, so the
reading will vary a little in different sessions. Otherwise it is
the time when the postmaster process re-read the configuration files.)
pg_current_logfile returns, as text,
the path of the log file(s) currently in use by the logging collector.
The path includes the directory
and the log file name. Log collection must be enabled or the return value
is NULL. When multiple log files exist, each in a
different format, pg_current_logfile called
without arguments returns the path of the file having the first format
found in the ordered list: stderr, csvlog.
NULL is returned when no log file has any of these
formats. To request a specific file format supply, as text,
either csvlog or stderr as the value of the
optional parameter. The return value is NULL when the
log format requested is not a configured log
destination. pg_current_logfile reflects the contents of the
current_logfiles file.
pg_my_temp_schema returns the OID of the current
session's temporary schema, or zero if it has none (because it has not
created any temporary tables).
pg_is_other_temp_schema returns true if the
given OID is the OID of another session's temporary schema.
(This can be useful, for example, to exclude other sessions' temporary
tables from a catalog display.)
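As a sketch of the catalog-display use just mentioned, the following lists ordinary tables while skipping other sessions' temporary tables:

```sql
SELECT n.nspname, c.relname
FROM pg_class c
     JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relkind = 'r'
  AND NOT pg_is_other_temp_schema(n.oid);
```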
pg_listening_channels returns a set of names of
asynchronous notification channels that the current session is listening
to. pg_notification_queue_usage returns the
fraction of the total available space for notifications currently
occupied by notifications that are waiting to be processed, as a
double in the range 0-1.
See LISTEN and NOTIFY
for more information.
pg_postmaster_start_time returns the
timestamp with time zone when the
server started.
pg_safe_snapshot_blocking_pids returns an array of
the process IDs of the sessions that are blocking the server process with
the specified process ID from acquiring a safe snapshot, or an empty array
if there is no such server process or it is not blocked. A session
running a SERIALIZABLE transaction blocks
a SERIALIZABLE READ ONLY DEFERRABLE transaction from
acquiring a snapshot until the latter determines that it is safe to avoid
taking any predicate locks. See for
more information about serializable and deferrable transactions. Frequent
calls to this function could have some impact on database performance,
because it needs access to the predicate lock manager's shared
state for a short time.
version returns a string describing the
PostgreSQL server's version. You can also
get this information from the server_version parameter, or
for a machine-readable version use server_version_num.
Software developers should use server_version_num
(available since 8.2) instead
of parsing the text version.
The table below lists functions that
allow the user to query object access privileges programmatically.
See for more information about
privileges.
Access Privilege Inquiry Functions

  has_any_column_privilege(user, table, privilege) → boolean
      does user have privilege for any column of table
  has_any_column_privilege(table, privilege) → boolean
      does current user have privilege for any column of table
  has_column_privilege(user, table, column, privilege) → boolean
      does user have privilege for column
  has_column_privilege(table, column, privilege) → boolean
      does current user have privilege for column
  has_database_privilege(user, database, privilege) → boolean
      does user have privilege for database
  has_database_privilege(database, privilege) → boolean
      does current user have privilege for database
  has_foreign_data_wrapper_privilege(user, fdw, privilege) → boolean
      does user have privilege for foreign-data wrapper
  has_foreign_data_wrapper_privilege(fdw, privilege) → boolean
      does current user have privilege for foreign-data wrapper
  has_function_privilege(user, function, privilege) → boolean
      does user have privilege for function
  has_function_privilege(function, privilege) → boolean
      does current user have privilege for function
  has_language_privilege(user, language, privilege) → boolean
      does user have privilege for language
  has_language_privilege(language, privilege) → boolean
      does current user have privilege for language
  has_schema_privilege(user, schema, privilege) → boolean
      does user have privilege for schema
  has_schema_privilege(schema, privilege) → boolean
      does current user have privilege for schema
  has_sequence_privilege(user, sequence, privilege) → boolean
      does user have privilege for sequence
  has_sequence_privilege(sequence, privilege) → boolean
      does current user have privilege for sequence
  has_server_privilege(user, server, privilege) → boolean
      does user have privilege for foreign server
  has_server_privilege(server, privilege) → boolean
      does current user have privilege for foreign server
  has_table_privilege(user, table, privilege) → boolean
      does user have privilege for table
  has_table_privilege(table, privilege) → boolean
      does current user have privilege for table
  has_tablespace_privilege(user, tablespace, privilege) → boolean
      does user have privilege for tablespace
  has_tablespace_privilege(tablespace, privilege) → boolean
      does current user have privilege for tablespace
  has_type_privilege(user, type, privilege) → boolean
      does user have privilege for type
  has_type_privilege(type, privilege) → boolean
      does current user have privilege for type
  pg_has_role(user, role, privilege) → boolean
      does user have privilege for role
  pg_has_role(role, privilege) → boolean
      does current user have privilege for role
  row_security_active(table) → boolean
      does current user have row level security active for table

has_table_privilege checks whether a user
can access a table in a particular way. The user can be
specified by name, by OID (pg_authid.oid),
public to indicate the PUBLIC pseudo-role, or, if the argument is
omitted,
current_user is assumed. The table can be specified
by name or by OID. (Thus, there are actually six variants of
has_table_privilege, which can be distinguished by
the number and types of their arguments.) When specifying by name,
the name can be schema-qualified if necessary.
The desired access privilege type
is specified by a text string, which must evaluate to one of the
values SELECT, INSERT,
UPDATE, DELETE, TRUNCATE,
REFERENCES, or TRIGGER. Optionally,
WITH GRANT OPTION can be added to a privilege type to test
whether the privilege is held with grant option. Also, multiple privilege
types can be listed separated by commas, in which case the result will
be true if any of the listed privileges is held.
(Case of the privilege string is not significant, and extra whitespace
is allowed between but not within privilege names.)
Some examples:
SELECT has_table_privilege('myschema.mytable', 'select');
SELECT has_table_privilege('joe', 'mytable', 'INSERT, SELECT WITH GRANT OPTION');
has_sequence_privilege checks whether a user
can access a sequence in a particular way. The possibilities for its
arguments are analogous to has_table_privilege.
The desired access privilege type must evaluate to one of
USAGE,
SELECT, or
UPDATE.
has_any_column_privilege checks whether a user can
access any column of a table in a particular way.
Its argument possibilities
are analogous to has_table_privilege,
except that the desired access privilege type must evaluate to some
combination of
SELECT,
INSERT,
UPDATE, or
REFERENCES. Note that having any of these privileges
at the table level implicitly grants it for each column of the table,
so has_any_column_privilege will always return
true if has_table_privilege does for the same
arguments. But has_any_column_privilege also succeeds if
there is a column-level grant of the privilege for at least one column.
has_column_privilege checks whether a user
can access a column in a particular way.
Its argument possibilities
are analogous to has_table_privilege,
with the addition that the column can be specified either by name
or attribute number.
The desired access privilege type must evaluate to some combination of
SELECT,
INSERT,
UPDATE, or
REFERENCES. Note that having any of these privileges
at the table level implicitly grants it for each column of the table.
has_database_privilege checks whether a user
can access a database in a particular way.
Its argument possibilities
are analogous to has_table_privilege.
The desired access privilege type must evaluate to some combination of
CREATE,
CONNECT,
TEMPORARY, or
TEMP (which is equivalent to
TEMPORARY).
has_function_privilege checks whether a user
can access a function in a particular way.
Its argument possibilities
are analogous to has_table_privilege.
When specifying a function by a text string rather than by OID,
the allowed input is the same as for the regprocedure data type
(see ).
The desired access privilege type must evaluate to
EXECUTE.
An example is:
SELECT has_function_privilege('joeuser', 'myfunc(int, text)', 'execute');
has_foreign_data_wrapper_privilege checks whether a user
can access a foreign-data wrapper in a particular way.
Its argument possibilities
are analogous to has_table_privilege.
The desired access privilege type must evaluate to
USAGE.
has_language_privilege checks whether a user
can access a procedural language in a particular way.
Its argument possibilities
are analogous to has_table_privilege.
The desired access privilege type must evaluate to
USAGE.
has_schema_privilege checks whether a user
can access a schema in a particular way.
Its argument possibilities
are analogous to has_table_privilege.
The desired access privilege type must evaluate to some combination of
CREATE or
USAGE.
has_server_privilege checks whether a user
can access a foreign server in a particular way.
Its argument possibilities
are analogous to has_table_privilege.
The desired access privilege type must evaluate to
USAGE.
has_tablespace_privilege checks whether a user
can access a tablespace in a particular way.
Its argument possibilities
are analogous to has_table_privilege.
The desired access privilege type must evaluate to
CREATE.
has_type_privilege checks whether a user
can access a type in a particular way.
Its argument possibilities
are analogous to has_table_privilege.
When specifying a type by a text string rather than by OID,
the allowed input is the same as for the regtype data type
(see ).
The desired access privilege type must evaluate to
USAGE.
pg_has_role checks whether a user
can access a role in a particular way.
Its argument possibilities
are analogous to has_table_privilege,
except that public is not allowed as a user name.
The desired access privilege type must evaluate to some combination of
MEMBER or
USAGE.
MEMBER denotes direct or indirect membership in
the role (that is, the right to do SET ROLE), while
USAGE denotes whether the privileges of the role
are immediately available without doing SET ROLE.
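For example, with hypothetical roles joe and admins, the two privilege types can be tested separately:

```sql
-- Hypothetical roles, for illustration only.
SELECT pg_has_role('joe', 'admins', 'MEMBER') AS can_set_role,
       pg_has_role('joe', 'admins', 'USAGE')  AS inherits_privileges;
```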
row_security_active checks whether row level
security is active for the specified table in the context of the
current_user and environment. The table can
be specified by name or by OID.
shows the operators
available for the aclitem type, which is the catalog
representation of access privileges. See
for information about how to read access privilege values.
The table below shows some additional
functions to manage the aclitem type.
aclitem Functions

  acldefault(type, ownerId) → aclitem[]
      get the default access privileges for an object belonging to ownerId
  aclexplode(aclitem[]) → setof record
      get aclitem array as tuples
  makeaclitem(grantee, grantor, privilege, grantable) → aclitem
      build an aclitem from input
acldefault returns the built-in default access
privileges for an object of type type belonging to
role ownerId. These represent the access
privileges that will be assumed when an object's ACL entry is null.
(The default access privileges are described in .)
The type parameter is a CHAR: write
'c' for COLUMN,
'r' for TABLE and table-like objects,
's' for SEQUENCE,
'd' for DATABASE,
'f' for FUNCTION or PROCEDURE,
'l' for LANGUAGE,
'L' for LARGE OBJECT,
'n' for SCHEMA,
't' for TABLESPACE,
'F' for FOREIGN DATA WRAPPER,
'S' for FOREIGN SERVER,
or
'T' for TYPE or DOMAIN.
aclexplode returns an aclitem array
as a set of rows. Output columns are grantor oid,
grantee oid (0 for PUBLIC),
granted privilege as text (SELECT, ...)
and whether the privilege is grantable as boolean.
makeaclitem performs the inverse operation.
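As an illustration, the built-in default ACL for a table can be expanded into rows; OID 10 is the bootstrap superuser in a standard installation:

```sql
-- One row per privilege granted by the default table ACL for role OID 10.
SELECT grantor, grantee, privilege_type, is_grantable
FROM aclexplode(acldefault('r', 10));
```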
The table below shows functions that
determine whether a certain object is visible in the
current schema search path.
For example, a table is said to be visible if its
containing schema is in the search path and no table of the same
name appears earlier in the search path. This is equivalent to the
statement that the table can be referenced by name without explicit
schema qualification. To list the names of all visible tables:
SELECT relname FROM pg_class WHERE pg_table_is_visible(oid);
Schema Visibility Inquiry Functions

  pg_collation_is_visible(collation_oid) → boolean
      is collation visible in search path
  pg_conversion_is_visible(conversion_oid) → boolean
      is conversion visible in search path
  pg_function_is_visible(function_oid) → boolean
      is function visible in search path
  pg_opclass_is_visible(opclass_oid) → boolean
      is operator class visible in search path
  pg_operator_is_visible(operator_oid) → boolean
      is operator visible in search path
  pg_opfamily_is_visible(opfamily_oid) → boolean
      is operator family visible in search path
  pg_statistics_obj_is_visible(stat_oid) → boolean
      is statistics object visible in search path
  pg_table_is_visible(table_oid) → boolean
      is table visible in search path
  pg_ts_config_is_visible(config_oid) → boolean
      is text search configuration visible in search path
  pg_ts_dict_is_visible(dict_oid) → boolean
      is text search dictionary visible in search path
  pg_ts_parser_is_visible(parser_oid) → boolean
      is text search parser visible in search path
  pg_ts_template_is_visible(template_oid) → boolean
      is text search template visible in search path
  pg_type_is_visible(type_oid) → boolean
      is type (or domain) visible in search path
Each function performs the visibility check for one type of database
object. Note that pg_table_is_visible can also be used
with views, materialized views, indexes, sequences and foreign tables;
pg_function_is_visible can also be used with
procedures and aggregates;
pg_type_is_visible can also be used with domains.
For functions and operators, an object in
the search path is visible if there is no object of the same name
and argument data type(s) earlier in the path. For operator
classes, both name and associated index access method are considered.
All these functions require object OIDs to identify the object to be
checked. If you want to test an object by name, it is convenient to use
the OID alias types (regclass, regtype,
regprocedure, regoperator, regconfig,
or regdictionary),
for example:
SELECT pg_type_is_visible('myschema.widget'::regtype);
Note that it would not make much sense to test a non-schema-qualified
type name in this way — if the name can be recognized at all, it must be visible.
The table below lists functions that
extract information from the system catalogs.
System Catalog Information Functions

  format_type(type_oid, typemod) → text
      get SQL name of a data type
  pg_get_constraintdef(constraint_oid[, pretty_bool]) → text
      get definition of a constraint
  pg_get_expr(pg_node_tree, relation_oid[, pretty_bool]) → text
      decompile internal form of an expression, assuming that any Vars
      in it refer to the relation indicated by the second parameter
  pg_get_functiondef(func_oid) → text
      get definition of a function or procedure
  pg_get_function_arguments(func_oid) → text
      get argument list of function's or procedure's definition
      (with default values)
  pg_get_function_identity_arguments(func_oid) → text
      get argument list to identify a function or procedure
      (without default values)
  pg_get_function_result(func_oid) → text
      get RETURNS clause for function (returns null for a procedure)
  pg_get_indexdef(index_oid) → text
      get CREATE INDEX command for index
  pg_get_indexdef(index_oid, column_no, pretty_bool) → text
      get CREATE INDEX command for index, or definition of just one
      index column when column_no is not zero
  pg_get_keywords() → setof record
      get list of SQL keywords and their categories
  pg_get_ruledef(rule_oid[, pretty_bool]) → text
      get CREATE RULE command for rule
  pg_get_serial_sequence(table_name, column_name) → text
      get name of the sequence that a serial or identity column uses
  pg_get_statisticsobjdef(statobj_oid) → text
      get CREATE STATISTICS command for extended statistics object
  pg_get_triggerdef(trigger_oid[, pretty_bool]) → text
      get CREATE [ CONSTRAINT ] TRIGGER command for trigger
  pg_get_userbyid(role_oid) → name
      get role name with given OID
  pg_get_viewdef(view_name[, pretty_bool]) → text
      get underlying SELECT command for view or materialized view (deprecated)
  pg_get_viewdef(view_oid[, pretty_bool]) → text
      get underlying SELECT command for view or materialized view
  pg_get_viewdef(view_oid, wrap_column_int) → text
      get underlying SELECT command for view or materialized view;
      lines with fields are wrapped to specified number of columns,
      pretty-printing is implied
  pg_index_column_has_property(index_oid, column_no, prop_name) → boolean
      test whether an index column has a specified property
  pg_index_has_property(index_oid, prop_name) → boolean
      test whether an index has a specified property
  pg_indexam_has_property(am_oid, prop_name) → boolean
      test whether an index access method has a specified property
  pg_options_to_table(reloptions) → setof record
      get the set of storage option name/value pairs
  pg_tablespace_databases(tablespace_oid) → setof oid
      get the set of database OIDs that have objects in the tablespace
  pg_tablespace_location(tablespace_oid) → text
      get the path in the file system that this tablespace is located in
  pg_typeof(any) → regtype
      get the data type of any value
  collation for (any) → text
      get the collation of the argument
  to_regclass(rel_name) → regclass
      get the OID of the named relation
  to_regproc(func_name) → regproc
      get the OID of the named function
  to_regprocedure(func_name) → regprocedure
      get the OID of the named function
  to_regoper(operator_name) → regoper
      get the OID of the named operator
  to_regoperator(operator_name) → regoperator
      get the OID of the named operator
  to_regtype(type_name) → regtype
      get the OID of the named type
  to_regnamespace(schema_name) → regnamespace
      get the OID of the named schema
  to_regrole(role_name) → regrole
      get the OID of the named role
format_type returns the SQL name of a data type that
is identified by its type OID and possibly a type modifier. Pass NULL
for the type modifier if no specific modifier is known.
pg_get_keywords returns a set of records describing
the SQL keywords recognized by the server. The word column
contains the keyword. The catcode column contains a
category code: U for unreserved, C for column name,
T for type or function name, or R for reserved.
The catdesc column contains a possibly-localized string
describing the category.
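For instance, to look up a few keywords and to count how many fall into each category:

```sql
SELECT word, catcode, catdesc
FROM pg_get_keywords()
WHERE word IN ('select', 'between', 'window');

SELECT catcode, count(*)
FROM pg_get_keywords()
GROUP BY catcode
ORDER BY catcode;
```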
pg_get_constraintdef,
pg_get_indexdef, pg_get_ruledef,
pg_get_statisticsobjdef, and
pg_get_triggerdef, respectively reconstruct the
creating command for a constraint, index, rule, extended statistics object,
or trigger. (Note that this is a decompiled reconstruction, not the
original text of the command.) pg_get_expr decompiles
the internal form of an individual expression, such as the default value
for a column. It can be useful when examining the contents of system
catalogs. If the expression might contain Vars, specify the OID of the
relation they refer to as the second parameter; if no Vars are expected,
zero is sufficient. pg_get_viewdef reconstructs the
SELECT query that defines a view. Most of these functions come
in two variants, one of which can optionally pretty-print the
result. The pretty-printed format is more readable, but the default format
is more likely to be interpreted the same way by future versions of
PostgreSQL; avoid using pretty-printed output for dump
purposes. Passing false for the pretty-print parameter yields
the same result as the variant that does not have the parameter at all.
pg_get_functiondef returns a complete
CREATE OR REPLACE FUNCTION statement for a function.
pg_get_function_arguments returns the argument list
of a function, in the form it would need to appear in within
CREATE FUNCTION.
pg_get_function_result similarly returns the
appropriate RETURNS clause for the function.
pg_get_function_identity_arguments returns the
argument list necessary to identify a function, in the form it
would need to appear in within ALTER FUNCTION, for
instance. This form omits default values.
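The difference between the two argument-list forms shows up with defaulted parameters. This sketch uses a throwaway function created only for illustration:

```sql
-- Hypothetical function, for illustration only.
CREATE FUNCTION add_with_default(a integer, b integer DEFAULT 0)
RETURNS integer AS 'SELECT a + b' LANGUAGE sql;

SELECT pg_get_function_arguments('add_with_default(int,int)'::regprocedure);
-- a integer, b integer DEFAULT 0
SELECT pg_get_function_identity_arguments('add_with_default(int,int)'::regprocedure);
-- a integer, b integer
```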
pg_get_serial_sequence returns the name of the
sequence associated with a column, or NULL if no sequence is associated
with the column. If the column is an identity column, the associated
sequence is the sequence internally created for the identity column. For
columns created using one of the serial types
(serial, smallserial, bigserial), it
is the sequence created for that serial column definition. In the latter
case, this association can be modified or removed with ALTER
SEQUENCE OWNED BY. (The function probably should have been called
pg_get_owned_sequence; its current name reflects the
fact that it has typically been used with serial
or bigserial columns.) The first input parameter is a table name
with optional schema, and the second parameter is a column name. Because
the first parameter is potentially a schema and table, it is not treated as
a double-quoted identifier, meaning it is lower cased by default, while the
second parameter, being just a column name, is treated as double-quoted and
has its case preserved. The function returns a value suitably formatted
for passing to sequence functions
(see ). A typical use is in reading the
current value of a sequence for an identity or serial column, for example:
SELECT currval(pg_get_serial_sequence('sometable', 'id'));
pg_get_userbyid extracts a role's name given
its OID.
pg_index_column_has_property,
pg_index_has_property, and
pg_indexam_has_property return whether the
specified index column, index, or index access method possesses the named
property. NULL is returned if the property name is not
known or does not apply to the particular object, or if the OID or column
number does not identify a valid object. Refer to
for column properties,
for index properties, and
for access method properties.
(Note that extension access methods can define additional property names
for their indexes.)
Index Column Properties

  asc: Does the column sort in ascending order on a forward scan?
  desc: Does the column sort in descending order on a forward scan?
  nulls_first: Does the column sort with nulls first on a forward scan?
  nulls_last: Does the column sort with nulls last on a forward scan?
  orderable: Does the column possess any defined sort ordering?
  distance_orderable: Can the column be scanned in order by a distance
      operator, for example ORDER BY col <-> constant?
  returnable: Can the column value be returned by an index-only scan?
  search_array: Does the column natively support col = ANY(array) searches?
  search_nulls: Does the column support IS NULL and IS NOT NULL searches?
Index PropertiesNameDescriptionclusterableCan the index be used in a CLUSTER command?
index_scanDoes the index support plain (non-bitmap) scans?
bitmap_scanDoes the index support bitmap scans?
backward_scanCan the scan direction be changed in mid-scan (to
support FETCH BACKWARD on a cursor without
needing materialization)?
Index Access Method PropertiesNameDescriptioncan_orderDoes the access method support ASC,
DESC and related keywords in
CREATE INDEX?
can_uniqueDoes the access method support unique indexes?
can_multi_colDoes the access method support indexes with multiple columns?
can_excludeDoes the access method support exclusion constraints?
can_includeDoes the access method support the INCLUDE
clause of CREATE INDEX?
pg_options_to_table returns the set of storage
option name/value pairs
(option_name/option_value) when passed
pg_class.reloptions or
pg_attribute.attoptions.
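For example, assuming a hypothetical table mytable that was created with storage options:
SELECT option_name, option_value
  FROM pg_class c, pg_options_to_table(c.reloptions)
  WHERE c.relname = 'mytable';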
pg_tablespace_databases allows a tablespace to be
examined. It returns the set of OIDs of databases that have objects stored
in the tablespace. If this function returns any rows, the tablespace is not
empty and cannot be dropped. To display the specific objects populating the
tablespace, you will need to connect to the databases identified by
pg_tablespace_databases and query their
pg_class catalogs.
pg_typeof returns the OID of the data type of the
value that is passed to it. This can be helpful for troubleshooting or
dynamically constructing SQL queries. The function is declared as
returning regtype, which is an OID alias type (see
); this means that it is the same as an
OID for comparison purposes but displays as a type name. For example:
SELECT pg_typeof(33);
pg_typeof
-----------
integer
(1 row)
SELECT typlen FROM pg_type WHERE oid = pg_typeof(33);
typlen
--------
4
(1 row)
The expression collation for returns the collation of the
value that is passed to it. Example:
SELECT collation for (description) FROM pg_description LIMIT 1;
pg_collation_for
------------------
"default"
(1 row)
SELECT collation for ('foo' COLLATE "de_DE");
pg_collation_for
------------------
"de_DE"
(1 row)
The value might be quoted and schema-qualified. If no collation is derived
for the argument expression, then a null value is returned. If the argument
is not of a collatable data type, then an error is raised.
The to_regclass, to_regproc,
to_regprocedure, to_regoper,
to_regoperator, to_regtype,
to_regnamespace, and to_regrole
functions translate relation, function, operator, type, schema, and role
names (given as text) to objects of
type regclass, regproc, regprocedure,
regoper, regoperator, regtype,
regnamespace, and regrole
respectively. These functions differ from a cast from
text in that they don't accept a numeric OID, and that they return null
rather than throwing an error if the name is not found (or, for
to_regproc and to_regoper, if
the given name matches multiple objects).
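For example, the following query returns a valid regclass value in the first column and a null (rather than an error) in the second, assuming no relation named no_such_table exists:
SELECT to_regclass('pg_class') AS found,
       to_regclass('no_such_table') AS missing;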
pg_describe_objectpg_identify_objectpg_identify_object_as_addresspg_get_object_address lists functions related to
database object identification and addressing.
Object Information and Addressing FunctionsNameReturn TypeDescriptionpg_describe_object(classidoid, objidoid, objsubidinteger)textget description of a database objectpg_identify_object(classidoid, objidoid, objsubidinteger)typetext, schematext, nametext, identitytextget identity of a database objectpg_identify_object_as_address(classidoid, objidoid, objsubidinteger)typetext, object_namestext[], object_argstext[]get external representation of a database object's addresspg_get_object_address(typetext, object_namestext[], object_argstext[])classidoid, objidoid, objsubidintegerget address of a database object from its external representation
pg_describe_object returns a textual description of a database
object specified by catalog OID, object OID, and sub-object ID (such as
a column number within a table; the sub-object ID is zero when referring
to a whole object).
This description is intended to be human-readable, and might be translated,
depending on server configuration.
This is useful to determine the identity of an object as stored in the
pg_depend catalog.
pg_identify_object returns a row containing enough information
to uniquely identify the database object specified by catalog OID, object OID and
sub-object ID. This information is intended to be machine-readable,
and is never translated.
type identifies the type of database object;
schema is the schema name that the object belongs in, or
NULL for object types that do not belong to schemas;
name is the name of the object, quoted if necessary,
if the name (along with schema name, if pertinent) is sufficient to
uniquely identify the object, otherwise NULL;
identity is the complete object identity, with the
precise format depending on object type, and each name within the format
being schema-qualified and quoted as necessary.
pg_identify_object_as_address returns a row containing
enough information to uniquely identify the database object specified by
catalog OID, object OID and sub-object ID. The returned
information is independent of the current server, that is, it could be used
to identify an identically named object in another server.
type identifies the type of database object;
object_names and object_args
are text arrays that together form a reference to the object.
These three values can be passed to
pg_get_object_address to obtain the internal address
of the object.
This function is the inverse of pg_get_object_address.
pg_get_object_address returns a row containing enough
information to uniquely identify the database object specified by its
type and object name and argument arrays. The returned values are the
ones that would be used in system catalogs such as pg_depend
and can be passed to other system functions such as
pg_identify_object or pg_describe_object.
classid is the OID of the system catalog containing the
object;
objid is the OID of the object itself, and
objsubid is the sub-object ID, or zero if none.
This function is the inverse of pg_identify_object_as_address.
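As a sketch of the round trip, the address of the pg_class catalog itself can be externalized and then resolved again:
SELECT * FROM pg_identify_object_as_address(
    'pg_class'::regclass, 'pg_class'::regclass, 0);
SELECT * FROM pg_get_object_address(
    'table', '{pg_catalog,pg_class}', '{}');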
col_descriptionobj_descriptionshobj_descriptioncommentabout database objects
The functions shown in
extract comments previously stored with the
command. A null value is returned if no
comment could be found for the specified parameters.
Comment Information FunctionsNameReturn TypeDescriptioncol_description(table_oid, column_number)textget comment for a table columnobj_description(object_oid, catalog_name)textget comment for a database objectobj_description(object_oid)textget comment for a database object (deprecated)shobj_description(object_oid, catalog_name)textget comment for a shared database object
col_description returns the comment for a table
column, which is specified by the OID of its table and its column number.
(obj_description cannot be used for table columns
since columns do not have OIDs of their own.)
The two-parameter form of obj_description returns the
comment for a database object specified by its OID and the name of the
containing system catalog. For example,
obj_description(123456,'pg_class')
would retrieve the comment for the table with OID 123456.
The one-parameter form of obj_description requires only
the object OID. It is deprecated since there is no guarantee that
OIDs are unique across different system catalogs; therefore, the wrong
comment might be returned.
shobj_description is used just like
obj_description except it is used for retrieving
comments on shared objects. Some system catalogs are global to all
databases within each cluster, and the descriptions for objects in them
are stored globally as well.
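For example, to read the comment on the current database (databases are shared objects catalogued in pg_database):
SELECT shobj_description(oid, 'pg_database')
  FROM pg_database WHERE datname = current_database();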
txid_currenttxid_current_if_assignedtxid_current_snapshottxid_snapshot_xiptxid_snapshot_xmaxtxid_snapshot_xmintxid_visible_in_snapshottxid_status
The functions shown in
provide server transaction information in an exportable form. The main
use of these functions is to determine which transactions were committed
between two snapshots.
Transaction IDs and SnapshotsNameReturn TypeDescriptiontxid_current()bigintget current transaction ID, assigning a new one if the current transaction does not have onetxid_current_if_assigned()bigintsame as txid_current() but returns null instead of assigning a new transaction ID if none is already assignedtxid_current_snapshot()txid_snapshotget current snapshottxid_snapshot_xip(txid_snapshot)setof bigintget in-progress transaction IDs in snapshottxid_snapshot_xmax(txid_snapshot)bigintget xmax of snapshottxid_snapshot_xmin(txid_snapshot)bigintget xmin of snapshottxid_visible_in_snapshot(bigint, txid_snapshot)booleanis transaction ID visible in snapshot? (do not use with subtransaction ids)txid_status(bigint)textreport the status of the given transaction: committed, aborted, in progress, or null if the transaction ID is too old
The internal transaction ID type (xid) is 32 bits wide and
wraps around every 4 billion transactions. However, these functions
export a 64-bit format that is extended with an epoch counter
so it will not wrap around during the life of an installation.
The data type used by these functions, txid_snapshot,
stores information about transaction ID
visibility at a particular moment in time. Its components are
described in .
Snapshot ComponentsNameDescriptionxmin
Earliest transaction ID (txid) that is still active. All earlier
transactions will either be committed and visible, or rolled
back and dead.
xmax
First as-yet-unassigned txid. All txids greater than or equal to this
are not yet started as of the time of the snapshot, and thus invisible.
xip_list
Active txids at the time of the snapshot. The list
includes only those active txids between xmin
and xmax; there might be active txids higher
than xmax. A txid that is xmin <= txid <
xmax and not in this list was already completed
at the time of the snapshot, and thus either visible or
dead according to its commit status. The list does not
include txids of subtransactions.
txid_snapshot's textual representation is
xmin:xmax:xip_list.
For example 10:20:10,14,15 means
xmin=10, xmax=20, xip_list=10, 14, 15.
txid_status(bigint) reports the commit status of a recent
transaction. Applications may use it to determine whether a transaction
committed or aborted when the application and database server become
disconnected while a COMMIT is in progress.
The status of a transaction will be reported as either
in progress,
committed, or aborted, provided that the
transaction is recent enough that the system retains the commit status
of that transaction. If the transaction is old enough that no references to that
transaction survive in the system and the commit status information has
been discarded, this function will return NULL. Note that prepared
transactions are reported as in progress; applications must
check pg_prepared_xacts if they
need to determine whether the txid is a prepared transaction.
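For example, the following reports in progress, since the transaction executing the query has not yet committed:
SELECT txid_status(txid_current());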
The functions shown in
provide information about transactions that have been already committed.
These functions mainly provide information about when the transactions
were committed. They only provide useful data when
configuration option is enabled
and only for transactions that were committed after it was enabled.
Committed Transaction InformationNameReturn TypeDescriptionpg_xact_commit_timestamppg_xact_commit_timestamp(xid)timestamp with time zoneget commit timestamp of a transactionpg_last_committed_xactpg_last_committed_xact()xidxid, timestamptimestamp with time zoneget transaction ID and commit timestamp of latest committed transaction
The functions shown in
print information initialized during initdb, such
as the catalog version. They also show information about write-ahead
logging and checkpoint processing. This information is cluster-wide,
and not specific to any one database. They provide most of the same
information, from the same source, as
, although in a form better suited
to SQL functions.
Control Data FunctionsNameReturn TypeDescriptionpg_control_checkpointpg_control_checkpoint()record
Returns information about current checkpoint state.
pg_control_systempg_control_system()record
Returns information about current control file state.
pg_control_initpg_control_init()record
Returns information about cluster initialization state.
pg_control_recoverypg_control_recovery()record
Returns information about recovery state.
pg_control_checkpoint returns a record, shown in
pg_control_checkpoint ColumnsColumn NameData Typecheckpoint_lsnpg_lsnredo_lsnpg_lsnredo_wal_filetexttimeline_idintegerprev_timeline_idintegerfull_page_writesbooleannext_xidtextnext_oidoidnext_multixact_idxidnext_multi_offsetxidoldest_xidxidoldest_xid_dbidoidoldest_active_xidxidoldest_multi_xidxidoldest_multi_dbidoidoldest_commit_ts_xidxidnewest_commit_ts_xidxidcheckpoint_timetimestamp with time zone
pg_control_system returns a record, shown in
pg_control_system ColumnsColumn NameData Typepg_control_versionintegercatalog_version_nointegersystem_identifierbigintpg_control_last_modifiedtimestamp with time zone
System Administration Functions
The functions described in this section are used to control and
monitor a PostgreSQL installation.
Configuration Settings Functions shows the functions
available to query and alter run-time configuration parameters.
Configuration Settings FunctionsNameReturn TypeDescriptioncurrent_settingcurrent_setting(setting_name [, missing_ok ])textget current value of settingset_configset_config(setting_name,
new_value,
is_local)textset parameter and return new value
SETSHOWconfigurationof the serverfunctions
The function current_setting yields the
current value of the setting setting_name.
It corresponds to the SQL command
SHOW. An example:
SELECT current_setting('datestyle');
current_setting
-----------------
ISO, MDY
(1 row)
If there is no setting named setting_name,
current_setting throws an error
unless missing_ok is supplied and is
true.
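For example, this returns a null value instead of raising an error, assuming no setting by that name exists:
SELECT current_setting('no_such_setting', true);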
set_config sets the parameter
setting_name to
new_value. If
is_local is true, the
new value will only apply to the current transaction. If you want
the new value to apply for the current session, use
false instead. The function corresponds to the
SQL command SET. An example:
SELECT set_config('log_statement_stats', 'off', false);
set_config
------------
off
(1 row)
Server Signaling Functionspg_cancel_backendpg_reload_confpg_rotate_logfilepg_terminate_backendsignalbackend processes
The functions shown in send control signals to
other server processes. Use of these functions is restricted to
superusers by default but access may be granted to others using
GRANT, with noted exceptions.
Server Signaling FunctionsNameReturn TypeDescriptionpg_cancel_backend(pidint)booleanCancel a backend's current query. This is also allowed if the
calling role is a member of the role whose backend is being canceled or
the calling role has been granted pg_signal_backend,
however only superusers can cancel superuser backends.
pg_reload_conf()booleanCause server processes to reload their configuration filespg_rotate_logfile()booleanRotate server's log filepg_terminate_backend(pidint)booleanTerminate a backend. This is also allowed if the calling role
is a member of the role whose backend is being terminated or the
calling role has been granted pg_signal_backend,
however only superusers can terminate superuser backends.
Each of these functions returns true if
successful and false otherwise.
pg_cancel_backend and pg_terminate_backend
send signals (SIGINT or SIGTERM
respectively) to backend processes identified by process ID.
The process ID of an active backend can be found from
the pid column of the
pg_stat_activity view, or by listing the
postgres processes on the server (using
ps on Unix or the Task
Manager on Windows).
The role of an active backend can be found from the
usename column of the
pg_stat_activity view.
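For example, this sketch cancels the active queries of a hypothetical role report_user (the caller must be a superuser, a member of that role, or hold pg_signal_backend):
SELECT pg_cancel_backend(pid)
  FROM pg_stat_activity
  WHERE usename = 'report_user' AND state = 'active'
    AND pid <> pg_backend_pid();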
pg_reload_conf sends a SIGHUP signal
to the server, causing configuration files
to be reloaded by all server processes.
pg_rotate_logfile signals the log-file manager to switch
to a new output file immediately. This works only when the built-in
log collector is running, since otherwise there is no log-file manager
subprocess.
Backup Control Functionsbackuppg_create_restore_pointpg_current_wal_flush_lsnpg_current_wal_insert_lsnpg_current_wal_lsnpg_start_backuppg_stop_backuppg_is_in_backuppg_backup_start_timepg_switch_walpg_walfile_namepg_walfile_name_offsetpg_wal_lsn_diff
The functions shown in assist in making on-line backups.
These functions cannot be executed during recovery (except
non-exclusive pg_start_backup,
non-exclusive pg_stop_backup,
pg_is_in_backup, pg_backup_start_time
and pg_wal_lsn_diff).
Backup Control FunctionsNameReturn TypeDescriptionpg_create_restore_point(nametext)pg_lsnCreate a named point for performing restore (restricted to superusers by default, but other users can be granted EXECUTE to run the function)pg_current_wal_flush_lsn()pg_lsnGet current write-ahead log flush locationpg_current_wal_insert_lsn()pg_lsnGet current write-ahead log insert locationpg_current_wal_lsn()pg_lsnGet current write-ahead log write locationpg_start_backup(labeltext, fastboolean, exclusiveboolean)pg_lsnPrepare for performing on-line backup (restricted to superusers by default, but other users can be granted EXECUTE to run the function)pg_stop_backup()pg_lsnFinish performing exclusive on-line backup (restricted to superusers by default, but other users can be granted EXECUTE to run the function)pg_stop_backup(exclusiveboolean, wait_for_archiveboolean)setof recordFinish performing exclusive or non-exclusive on-line backup (restricted to superusers by default, but other users can be granted EXECUTE to run the function)pg_is_in_backup()boolTrue if an on-line exclusive backup is still in progress.pg_backup_start_time()timestamp with time zoneGet start time of an on-line exclusive backup in progress.pg_switch_wal()pg_lsnForce switch to a new write-ahead log file (restricted to superusers by default, but other users can be granted EXECUTE to run the function)pg_walfile_name(lsnpg_lsn)textConvert write-ahead log location to file namepg_walfile_name_offset(lsnpg_lsn)text, integerConvert write-ahead log location to file name and decimal byte offset within filepg_wal_lsn_diff(lsnpg_lsn, lsnpg_lsn)numericCalculate the difference between two write-ahead log locations
pg_start_backup accepts an arbitrary user-defined label for
the backup. (Typically this would be the name under which the backup dump
file will be stored.) When used in exclusive mode, the function writes a
backup label file (backup_label) and, if there are any links
in the pg_tblspc/ directory, a tablespace map file
(tablespace_map) into the database cluster's data directory,
performs a checkpoint, and then returns the backup's starting write-ahead
log location as text. The user can ignore this result value, but it is
provided in case it is useful. When used in non-exclusive mode, the
contents of these files are instead returned by the
pg_stop_backup function, and should be written to the backup
by the caller.
postgres=# select pg_start_backup('label_goes_here');
pg_start_backup
-----------------
0/D4445B8
(1 row)
There is an optional second parameter of type boolean. If true,
it specifies executing pg_start_backup as quickly as
possible. This forces an immediate checkpoint which will cause a
spike in I/O operations, slowing any concurrently executing queries.
In an exclusive backup, pg_stop_backup removes the label file
and, if it exists, the tablespace_map file created by
pg_start_backup. In a non-exclusive backup, the contents of
the backup_label and tablespace_map are returned
in the result of the function, and should be written to files in the
backup (and not in the data directory). There is an optional second
parameter of type boolean. If false, pg_stop_backup
will return immediately after the backup is completed without waiting for
WAL to be archived. This behavior is only useful for backup
software which independently monitors WAL archiving. Otherwise, WAL
required to make the backup consistent might be missing and make the backup
useless. When this parameter is set to true, pg_stop_backup
will wait for WAL to be archived when archiving is enabled; on the standby,
this means that it will wait only when archive_mode = always.
If write activity on the primary is low, it may be useful to run
pg_switch_wal on the primary in order to trigger
an immediate segment switch.
When executed on a primary, the function also creates a backup history file
in the write-ahead log
archive area. The history file includes the label given to
pg_start_backup, the starting and ending write-ahead log locations for
the backup, and the starting and ending times of the backup. The return
value is the backup's ending write-ahead log location (which again
can be ignored). After recording the ending location, the current
write-ahead log insertion
point is automatically advanced to the next write-ahead log file, so that the
ending write-ahead log file can be archived immediately to complete the backup.
pg_switch_wal moves to the next write-ahead log file, allowing the
current file to be archived (assuming you are using continuous archiving).
The return value is the ending write-ahead log location + 1 within the just-completed write-ahead log file.
If there has been no write-ahead log activity since the last write-ahead log switch,
pg_switch_wal does nothing and returns the start location
of the write-ahead log file currently in use.
pg_create_restore_point creates a named write-ahead log
record that can be used as recovery target, and returns the corresponding
write-ahead log location. The given name can then be used with
to specify the point up to which
recovery will proceed. Avoid creating multiple restore points with the
same name, since recovery will stop at the first one whose name matches
the recovery target.
pg_current_wal_lsn displays the current write-ahead log write
location in the same format used by the above functions. Similarly,
pg_current_wal_insert_lsn displays the current write-ahead log
insertion location and pg_current_wal_flush_lsn displays the
current write-ahead log flush location. The insertion location is the logical
end of the write-ahead log at any instant, while the write location is the end of
what has actually been written out from the server's internal buffers and flush
location is the location guaranteed to be written to durable storage. The write
location is the end of what can be examined from outside the server, and is usually
what you want if you are interested in archiving partially-complete write-ahead log
files. The insertion and flush locations are made available primarily for server
debugging purposes. These are both read-only operations and do not
require superuser permissions.
You can use pg_walfile_name_offset to extract the
corresponding write-ahead log file name and byte offset from the results of any of the
above functions. For example:
postgres=# SELECT * FROM pg_walfile_name_offset(pg_stop_backup());
file_name | file_offset
--------------------------+-------------
00000001000000000000000D | 4039624
(1 row)
Similarly, pg_walfile_name extracts just the write-ahead log file name.
When the given write-ahead log location is exactly at a write-ahead log file boundary, both
these functions return the name of the preceding write-ahead log file.
This is usually the desired behavior for managing write-ahead log archiving
behavior, since the preceding file is the last one that currently
needs to be archived.
pg_wal_lsn_diff calculates the difference in bytes
between two write-ahead log locations. It can be used with
pg_stat_replication or some functions shown in
to get the replication lag.
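For example, when run on the primary, this sketch reports each standby's replay lag in bytes:
SELECT client_addr,
       pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS replay_lag_bytes
  FROM pg_stat_replication;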
For details about proper usage of these functions, see
.
Recovery Control Functionspg_is_in_recoverypg_last_wal_receive_lsnpg_last_wal_replay_lsnpg_last_xact_replay_timestamp
The functions shown in provide information
about the current status of the standby.
These functions may be executed both during recovery and in normal running.
Recovery Information FunctionsNameReturn TypeDescriptionpg_is_in_recovery()boolTrue if recovery is still in progress.
pg_last_wal_receive_lsn()pg_lsnGet last write-ahead log location received and synced to disk by
streaming replication. While streaming replication is in progress
this will increase monotonically. If recovery has completed this will
remain static at
the value of the last WAL record received and synced to disk during
recovery. If streaming replication is disabled, or if it has not yet
started, the function returns NULL.
pg_last_wal_replay_lsn()pg_lsnGet last write-ahead log location replayed during recovery.
If recovery is still in progress this will increase monotonically.
If recovery has completed then this value will remain static at
the value of the last WAL record applied during that recovery.
When the server has been started normally without recovery
the function returns NULL.
pg_last_xact_replay_timestamp()timestamp with time zoneGet time stamp of last transaction replayed during recovery.
This is the time at which the commit or abort WAL record for that
transaction was generated on the primary.
If no transactions have been replayed during recovery, this function
returns NULL. Otherwise, if recovery is still in progress this will
increase monotonically. If recovery has completed then this value will
remain static at the value of the last transaction applied during that
recovery. When the server has been started normally without recovery
the function returns NULL.
pg_is_wal_replay_pausedpg_promotepg_wal_replay_pausepg_wal_replay_resume
The functions shown in control the progress of recovery.
These functions may be executed only during recovery.
Recovery Control FunctionsNameReturn TypeDescriptionpg_is_wal_replay_paused()boolTrue if recovery is paused.
pg_promote(waitboolean DEFAULT true, wait_secondsinteger DEFAULT 60)boolean
Promotes a physical standby server. With wait
set to true (the default), the function waits until
promotion is completed or wait_seconds seconds
have passed, and returns true if promotion is
successful and false otherwise.
If wait is set to false, the
function returns true immediately after sending
SIGUSR1 to the postmaster to trigger the promotion.
This function is restricted to superusers by default, but other users
can be granted EXECUTE to run the function.
pg_wal_replay_pause()voidPauses recovery immediately (restricted to superusers by default, but other users can be granted EXECUTE to run the function).
pg_wal_replay_resume()voidRestarts recovery if it was paused (restricted to superusers by default, but other users can be granted EXECUTE to run the function).
While recovery is paused no further database changes are applied.
If in hot standby, all new queries will see the same consistent snapshot
of the database, and no further query conflicts will be generated until
recovery is resumed.
If streaming replication is disabled, the paused state may continue
indefinitely without problem. While streaming replication is in
progress, WAL records will continue to be received and will eventually
fill available disk space; how soon depends upon the duration of
the pause and the rate of WAL generation.
Snapshot Synchronization Functionspg_export_snapshotPostgreSQL allows database sessions to synchronize their
snapshots. A snapshot determines which data is visible to the
transaction that is using the snapshot. Synchronized snapshots are
necessary when two or more sessions need to see identical content in the
database. If two sessions just start their transactions independently,
there is always a possibility that some third transaction commits
between the executions of the two START TRANSACTION commands,
so that one session sees the effects of that transaction and the other
does not.
To solve this problem, PostgreSQL allows a transaction to
export the snapshot it is using. As long as the exporting
transaction remains open, other transactions can import its
snapshot, and thereby be guaranteed that they see exactly the same view
of the database that the first transaction sees. But note that any
database changes made by any one of these transactions remain invisible
to the other transactions, as is usual for changes made by uncommitted
transactions. So the transactions are synchronized with respect to
pre-existing data, but act normally for changes they make themselves.
Snapshots are exported with the pg_export_snapshot function,
shown in , and
imported with the command.
Snapshot Synchronization FunctionsNameReturn TypeDescriptionpg_export_snapshot()textSave the current snapshot and return its identifier
The function pg_export_snapshot saves the current snapshot
and returns a text string identifying the snapshot. This string
must be passed (outside the database) to clients that want to import the
snapshot. The snapshot is available for import only until the end of the
transaction that exported it. A transaction can export more than one
snapshot, if needed. Note that doing so is only useful in READ
COMMITTED transactions, since in REPEATABLE READ and
higher isolation levels, transactions use the same snapshot throughout
their lifetime. Once a transaction has exported any snapshots, it cannot
be prepared with .
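A minimal sketch of the workflow (the snapshot identifier shown is illustrative; use whatever string pg_export_snapshot actually returns):
-- In the exporting session, which must remain open:
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SELECT pg_export_snapshot();
-- In the importing session, before its first query:
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SET TRANSACTION SNAPSHOT '00000003-0000001B-1';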
See for details of how to use an
exported snapshot.
Replication Functions
The functions shown
in are for
controlling and interacting with replication features.
See ,
, and
for information about the underlying features.
Use of functions for replication origin is restricted to superusers.
Use of functions for replication slot is restricted to superusers
and users having REPLICATION privilege.
Many of these functions have equivalent commands in the replication
protocol; see .
The functions described in
,
, and
are also relevant for replication.
Replication SQL FunctionsFunctionReturn TypeDescriptionpg_create_physical_replication_slotpg_create_physical_replication_slot(slot_namename, immediately_reserveboolean, temporaryboolean)
(slot_namename, lsnpg_lsn)
Creates a new physical replication slot named
slot_name. The optional second parameter,
when true, specifies that the LSN for this
replication slot be reserved immediately; otherwise
the LSN is reserved on first connection from a streaming
replication client. Streaming changes from a physical slot is only
possible with the streaming-replication protocol —
see . The optional third
parameter, temporary, when set to true, specifies that
the slot should not be permanently stored to disk and is only meant
for use by the current session. Temporary slots are also
released upon any error. This function corresponds
to the replication protocol command CREATE_REPLICATION_SLOT
... PHYSICAL.
pg_drop_replication_slotpg_drop_replication_slot(slot_namename)void
Drops the physical or logical replication slot
named slot_name. Same as replication protocol
command DROP_REPLICATION_SLOT. For logical slots, this must
be called when connected to the same database the slot was created on.
pg_create_logical_replication_slotpg_create_logical_replication_slot(slot_namename, pluginname, temporaryboolean)
(slot_namename, lsnpg_lsn)
Creates a new logical (decoding) replication slot named
slot_name using the output plugin
plugin. The optional third
parameter, temporary, when set to true, specifies that
the slot should not be permanently stored to disk and is only meant
for use by the current session. Temporary slots are also
released upon any error. A call to this function has the same
effect as the replication protocol command
CREATE_REPLICATION_SLOT ... LOGICAL.
pg_copy_physical_replication_slotpg_copy_physical_replication_slot(src_slot_namename, dst_slot_namename, temporaryboolean)
(slot_namename, lsnpg_lsn)
Copies an existing physical replication slot named src_slot_name
to a physical replication slot named dst_slot_name.
The copied physical slot starts to reserve WAL from the same LSN as the
source slot.
temporary is optional. If temporary
is omitted, the same value as the source slot is used.
pg_copy_logical_replication_slotpg_copy_logical_replication_slot(src_slot_namename, dst_slot_namename, temporaryboolean, pluginname)
(slot_namename, lsnpg_lsn)
Copies an existing logical replication slot named src_slot_name
to a logical replication slot named dst_slot_name
while changing the output plugin and persistence. The copied logical slot starts
from the same LSN as the source logical slot. Both
temporary and plugin are optional.
If temporary or plugin are omitted,
the same values as the source logical slot are used.
pg_logical_slot_get_changespg_logical_slot_get_changes(slot_namename, upto_lsnpg_lsn, upto_nchangesint, VARIADIC optionstext[])
(lsnpg_lsn, xidxid, datatext)
Returns changes in the slot slot_name, starting
from the point from which changes were last consumed. If
upto_lsn and upto_nchanges are NULL,
logical decoding will continue until end of WAL. If
upto_lsn is non-NULL, decoding will include only
those transactions which commit prior to the specified LSN. If
upto_nchanges is non-NULL, decoding will
stop when the number of rows produced by decoding exceeds
the specified value. Note, however, that the actual number of
rows returned may be larger, since this limit is only checked after
adding the rows produced when decoding each new transaction commit.
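For example, to consume up to ten changes from a slot with no upper LSN bound (the slot name is illustrative):

```sql
SELECT * FROM pg_logical_slot_get_changes('decoding_slot', NULL, 10);
```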
pg_logical_slot_peek_changespg_logical_slot_peek_changes(slot_namename, upto_lsnpg_lsn, upto_nchangesint, VARIADIC optionstext[])
(lsnpg_lsn, xidxid, datatext)
Behaves just like
the pg_logical_slot_get_changes() function,
except that changes are not consumed; that is, they will be returned
again on future calls.
pg_logical_slot_get_binary_changespg_logical_slot_get_binary_changes(slot_namename, upto_lsnpg_lsn, upto_nchangesint, VARIADIC optionstext[])
(lsnpg_lsn, xidxid, databytea)
Behaves just like
the pg_logical_slot_get_changes() function,
except that changes are returned as bytea.
pg_logical_slot_peek_binary_changespg_logical_slot_peek_binary_changes(slot_namename, upto_lsnpg_lsn, upto_nchangesint, VARIADIC optionstext[])
(lsnpg_lsn, xidxid, databytea)
Behaves just like
the pg_logical_slot_get_changes() function,
except that changes are returned as bytea and that
changes are not consumed; that is, they will be returned again
on future calls.
pg_replication_slot_advancepg_replication_slot_advance(slot_namename, upto_lsnpg_lsn)
(slot_namename, end_lsnpg_lsn)
bool
Advances the current confirmed position of a replication slot named
slot_name. The slot will not be moved backwards,
and it will not be moved beyond the current insert location. Returns
the name of the slot and the actual position to which it was advanced.
The updated slot position information is written out at the next
checkpoint if any advancing is done. In the event of a crash, the
slot may return to an earlier position.
pg_replication_origin_createpg_replication_origin_create(node_nametext)oid
Create a replication origin with the given external
name, and return the internal id assigned to it.
pg_replication_origin_droppg_replication_origin_drop(node_nametext)void
Delete a previously created replication origin, including any
associated replay progress.
pg_replication_origin_oidpg_replication_origin_oid(node_nametext)oid
Lookup a replication origin by name and return the internal id. If no
corresponding replication origin is found an error is thrown.
pg_replication_origin_session_setuppg_replication_origin_session_setup(node_nametext)void
Mark the current session as replaying from the given
origin, allowing replay progress to be tracked. Use
pg_replication_origin_session_reset to revert.
Can only be used if no previous origin is configured.
pg_replication_origin_session_resetpg_replication_origin_session_reset()void
Cancel the effects
of pg_replication_origin_session_setup().
pg_replication_origin_session_is_setuppg_replication_origin_session_is_setup()bool
Has a replication origin been configured in the current session?
pg_replication_origin_session_progresspg_replication_origin_session_progress(flushbool)pg_lsn
Return the replay location for the replication origin configured in
the current session. The parameter flush
determines whether the corresponding local transaction will be
guaranteed to have been flushed to disk or not.
pg_replication_origin_xact_setuppg_replication_origin_xact_setup(origin_lsnpg_lsn, origin_timestamptimestamptz)void
Mark the current transaction as replaying a transaction that has
committed at the given LSN and timestamp. Can
only be called when a replication origin has previously been
configured using
pg_replication_origin_session_setup().
pg_replication_origin_xact_resetpg_replication_origin_xact_reset()void
Cancel the effects of
pg_replication_origin_xact_setup().
pg_replication_origin_advancepg_replication_origin_advance(node_nametext, lsnpg_lsn)void
Set replication progress for the given node to the given
location. This is primarily useful for setting up the initial location,
or a new location after configuration changes and similar. Be aware
that careless use of this function can lead to inconsistently
replicated data.
pg_replication_origin_progresspg_replication_origin_progress(node_nametext, flushbool)pg_lsn
Return the replay location for the given replication origin. The
parameter flush determines whether the
corresponding local transaction will be guaranteed to have been
flushed to disk or not.
pg_logical_emit_messagepg_logical_emit_message(transactionalbool, prefixtext, contenttext)pg_lsn
Emit a text logical decoding message. This can be used to pass generic
messages to logical decoding plugins through WAL. The parameter
transactional specifies whether the message should
be part of the current transaction, or whether it should be written
immediately and decoded as soon as logical decoding reads the record.
The prefix is a textual prefix that logical
decoding plugins can use to easily recognize messages of interest to them.
The content is the text of the message.
pg_logical_emit_message(transactionalbool, prefixtext, contentbytea)pg_lsn
Emit a binary logical decoding message. This can be used to pass generic
messages to logical decoding plugins through WAL. The parameter
transactional specifies whether the message should
be part of the current transaction, or whether it should be written
immediately and decoded as soon as logical decoding reads the record.
The prefix is a textual prefix that logical
decoding plugins can use to easily recognize messages of interest to them.
The content is the binary content of the
message.
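For example, assuming a decoding plugin that watches for a particular prefix (the prefix and content here are illustrative):

```sql
SELECT pg_logical_emit_message(true, 'my_prefix', 'important message');
```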
Database Object Management Functions
The functions shown in calculate
the disk space usage of database objects.
pg_column_sizepg_database_sizepg_indexes_sizepg_relation_sizepg_size_bytespg_size_prettypg_table_sizepg_tablespace_sizepg_total_relation_size
Database Object Size FunctionsNameReturn TypeDescriptionpg_column_size(any)intNumber of bytes used to store a particular value (possibly compressed)pg_database_size(oid)bigintDisk space used by the database with the specified OIDpg_database_size(name)bigintDisk space used by the database with the specified namepg_indexes_size(regclass)bigint
Total disk space used by indexes attached to the specified table
pg_relation_size(relationregclass, forktext)bigint
Disk space used by the specified fork ('main',
'fsm', 'vm', or 'init')
of the specified table or index
pg_relation_size(relationregclass)bigint
Shorthand for pg_relation_size(..., 'main')pg_size_bytes(text)bigint
Converts a size in human-readable format with size units into bytes
pg_size_pretty(bigint)text
Converts a size in bytes expressed as a 64-bit integer into a
human-readable format with size units
pg_size_pretty(numeric)text
Converts a size in bytes expressed as a numeric value into a
human-readable format with size units
pg_table_size(regclass)bigint
Disk space used by the specified table, excluding indexes
(but including TOAST, free space map, and visibility map)
pg_tablespace_size(oid)bigintDisk space used by the tablespace with the specified OIDpg_tablespace_size(name)bigintDisk space used by the tablespace with the specified namepg_total_relation_size(regclass)bigint
Total disk space used by the specified table,
including all indexes and TOAST data
pg_column_size shows the space used to store any individual
data value.
pg_total_relation_size accepts the OID or name of a
table or toast table, and returns the total on-disk space used for
that table, including all associated indexes. This function is
equivalent to pg_table_size+pg_indexes_size.
pg_table_size accepts the OID or name of a table and
returns the disk space needed for that table, exclusive of indexes.
(TOAST space, free space map, and visibility map are included.)
pg_indexes_size accepts the OID or name of a table and
returns the total disk space used by all the indexes attached to that
table.
pg_database_size and pg_tablespace_size
accept the OID or name of a database or tablespace, and return the total
disk space used therein. To use pg_database_size,
you must have CONNECT permission on the specified database
(which is granted by default), or be a member of the pg_read_all_stats
role. To use pg_tablespace_size, you must have
CREATE permission on the specified tablespace, or be a member
of the pg_read_all_stats role unless it is the default tablespace for
the current database.
pg_relation_size accepts the OID or name of a table, index
or toast table, and returns the on-disk size in bytes of one fork of
that relation. (Note that for most purposes it is more convenient to
use the higher-level functions pg_total_relation_size
or pg_table_size, which sum the sizes of all forks.)
With one argument, it returns the size of the main data fork of the
relation. The second argument can be provided to specify which fork
to examine:
'main' returns the size of the main
data fork of the relation.
'fsm' returns the size of the Free Space Map
(see ) associated with the relation.
'vm' returns the size of the Visibility Map
(see ) associated with the relation.
'init' returns the size of the initialization
fork, if any, associated with the relation.
pg_size_pretty can be used to format the result of one of
the other functions in a human-readable way, using bytes, kB, MB, GB or TB
as appropriate.
pg_size_bytes can be used to get the size in bytes from a
string in human-readable format. The input may have units of bytes, kB,
MB, GB or TB, and is parsed case-insensitively. If no units are specified,
bytes are assumed.
The units kB, MB, GB and TB used by the functions
pg_size_pretty and pg_size_bytes are defined
using powers of 2 rather than powers of 10, so 1kB is 1024 bytes, 1MB is
1024² = 1048576 bytes, and so on.
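For example, pg_size_bytes parses a megabyte according to the power-of-2 definition above:

```sql
SELECT pg_size_bytes('1 MB');
-- 1048576
```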
The functions above that operate on tables or indexes accept a
regclass argument, which is simply the OID of the table or index
in the pg_class system catalog. You do not have to look up
the OID by hand, however, since the regclass data type's input
converter will do the work for you. Just write the table name enclosed in
single quotes so that it looks like a literal constant. For compatibility
with the handling of ordinary SQL names, the string
will be converted to lower case unless it contains double quotes around
the table name.
If an OID that does not represent an existing object is passed as
argument to one of the above functions, NULL is returned.
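For example, to display the total on-disk size of a table, including its indexes and TOAST data, in human-readable form (the table name mytable is illustrative):

```sql
SELECT pg_size_pretty(pg_total_relation_size('mytable'));
```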
The functions shown in assist
in identifying the specific disk files associated with database objects.
pg_relation_filenodepg_relation_filepathpg_filenode_relation
Database Object Location FunctionsNameReturn TypeDescriptionpg_relation_filenode(relationregclass)oid
Filenode number of the specified relation
pg_relation_filepath(relationregclass)text
File path name of the specified relation
pg_filenode_relation(tablespaceoid, filenodeoid)regclass
Find the relation associated with a given tablespace and filenode
pg_relation_filenode accepts the OID or name of a table,
index, sequence, or toast table, and returns the filenode number
currently assigned to it. The filenode is the base component of the file
name(s) used for the relation (see
for more information). For most tables the result is the same as
pg_class.relfilenode, but for certain
system catalogs relfilenode is zero and this function must
be used to get the correct value. The function returns NULL if passed
a relation that does not have storage, such as a view.
pg_relation_filepath is similar to
pg_relation_filenode, but it returns the entire file path name
(relative to the database cluster's data directory PGDATA) of
the relation.
pg_filenode_relation is the reverse of
pg_relation_filenode. Given a tablespace OID and
a filenode, it returns the associated relation's OID. For a table
in the database's default tablespace, the tablespace can be specified as 0.
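For example, the following queries report a table's file path and then recover its OID from its tablespace and filenode (the table name mytable is illustrative; 0 stands for the database's default tablespace):

```sql
SELECT pg_relation_filepath('mytable');
SELECT pg_filenode_relation(0, pg_relation_filenode('mytable'));
```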
lists functions used to manage
collations.
Collation Management FunctionsNameReturn TypeDescriptionpg_collation_actual_versionpg_collation_actual_version(oid)textReturn actual version of collation from operating systempg_import_system_collationspg_import_system_collations(schemaregnamespace)integerImport operating system collations
pg_collation_actual_version returns the actual
version of the collation object as it is currently installed in the
operating system. If this is different from the value
in pg_collation.collversion, then objects depending on
the collation might need to be rebuilt. See also
.
pg_import_system_collations adds collations to the system
catalog pg_collation based on all the
locales it finds in the operating system. This is
what initdb uses;
see for more details. If additional
locales are installed into the operating system later on, this function
can be run again to add collations for the new locales. Locales that
match existing entries in pg_collation will be skipped.
(But collation objects based on locales that are no longer
present in the operating system are not removed by this function.)
The schema parameter would typically
be pg_catalog, but that is not a requirement;
the collations could be installed into some other schema as well.
The function returns the number of new collation objects it created.
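For example, collations whose recorded version differs from the version currently installed in the operating system can be listed with a query along these lines (a sketch; rows appear only where both versions are non-null):

```sql
SELECT collname, collversion, pg_collation_actual_version(oid) AS actual
FROM pg_collation
WHERE collversion <> pg_collation_actual_version(oid);
```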
Partitioning Information FunctionsNameReturn TypeDescriptionpg_partition_treepg_partition_tree(regclass)setof record
List information about tables or indexes in a partition tree for a
given partitioned table or partitioned index, with one row for each
partition. Information provided includes the name of the partition,
the name of its immediate parent, a boolean value telling if the
partition is a leaf, and an integer telling its level in the hierarchy.
The value of level begins at 0 for the input table
or index in its role as the root of the partition tree,
1 for its partitions, 2 for
their partitions, and so on.
pg_partition_ancestorspg_partition_ancestors(regclass)setof regclass
List the ancestor relations of the given partition,
including the partition itself.
pg_partition_rootpg_partition_root(regclass)regclass
Return the top-most parent of a partition tree to which the given
relation belongs.
To check the total size of the data contained in the
measurement table described in
, one could use the
following query:
=# SELECT pg_size_pretty(sum(pg_relation_size(relid))) AS total_size
FROM pg_partition_tree('measurement');
total_size
------------
24 kB
(1 row)
Index Maintenance Functionsbrin_summarize_new_valuesgin_clean_pending_listbrin_summarize_rangebrin_desummarize_range shows the functions
available for index maintenance tasks.
These functions cannot be executed during recovery.
Use of these functions is restricted to superusers and the owner
of the given index.
Index Maintenance FunctionsNameReturn TypeDescriptionbrin_summarize_new_values(indexregclass)integersummarize page ranges not already summarizedbrin_summarize_range(indexregclass, blockNumberbigint)integersummarize the page range covering the given block, if not already summarizedbrin_desummarize_range(indexregclass, blockNumberbigint)integerde-summarize the page range covering the given block, if summarizedgin_clean_pending_list(indexregclass)bigintmove GIN pending list entries into main index structure
brin_summarize_new_values accepts the OID or name of a
BRIN index and inspects the index to find page ranges in the base table
that are not currently summarized by the index; for any such range
it creates a new summary index tuple by scanning the table pages.
It returns the number of new page range summaries that were inserted
into the index. brin_summarize_range does the same, except
it only summarizes the range that covers the given block number.
gin_clean_pending_list accepts the OID or name of
a GIN index and cleans up the pending list of the specified index
by moving entries in it to the main GIN data structure in bulk.
It returns the number of pages removed from the pending list.
Note that if the argument is a GIN index built with
the fastupdate option disabled, no cleanup happens and the
return value is 0, because the index doesn't have a pending list.
Please see and
for details of the pending list and fastupdate option.
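For example (the index names are illustrative):

```sql
SELECT brin_summarize_new_values('my_brin_index');
SELECT gin_clean_pending_list('my_gin_index');
```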
Generic File Access Functions
The functions shown in provide native access to
files on the machine hosting the server. Only files within the
database cluster directory and the log_directory can be
accessed unless the user is granted the role
pg_read_server_files. Use a relative path for files in
the cluster directory, and a path matching the log_directory
configuration setting for log files.
Note that granting users the EXECUTE privilege on
pg_read_file(), or related functions, allows them the
ability to read any file on the server which the database can read and
that those reads bypass all in-database privilege checks. This means that,
among other things, a user with this access is able to read the contents of the
pg_authid table where authentication information is contained,
as well as read any file in the database. Therefore, granting access to these
functions should be carefully considered.
Generic File Access FunctionsNameReturn TypeDescriptionpg_ls_dir(dirnametext [, missing_okboolean, include_dot_dirsboolean])setof text
List the contents of a directory. Restricted to superusers by default, but other users can be granted EXECUTE to run the function.
pg_ls_logdir()setof record
List the name, size, and last modification time of files in the log
directory. Access is granted to members of the pg_monitor
role and may be granted to other non-superuser roles.
pg_ls_waldir()setof record
List the name, size, and last modification time of files in the WAL
directory. Access is granted to members of the pg_monitor
role and may be granted to other non-superuser roles.
pg_ls_archive_statusdir()setof record
List the name, size, and last modification time of files in the WAL
archive status directory. Access is granted to members of the
pg_monitor role and may be granted to other
non-superuser roles.
pg_ls_tmpdir(tablespaceoid)setof record
List the name, size, and last modification time of files in the
temporary directory for tablespace. If
tablespace is not provided, the
pg_default tablespace is used. Access is granted
to members of the pg_monitor role and may be
granted to other non-superuser roles.
pg_read_file(filenametext [, offsetbigint, lengthbigint [, missing_okboolean] ])text
Return the contents of a text file. Restricted to superusers by default, but other users can be granted EXECUTE to run the function.
pg_read_binary_file(filenametext [, offsetbigint, lengthbigint [, missing_okboolean] ])bytea
Return the contents of a file. Restricted to superusers by default, but other users can be granted EXECUTE to run the function.
pg_stat_file(filenametext[, missing_okboolean])record
Return information about a file. Restricted to superusers by default, but other users can be granted EXECUTE to run the function.
Some of these functions take an optional missing_ok parameter,
which specifies the behavior when the file or directory does not exist.
If true, the function returns NULL (except
pg_ls_dir, which returns an empty result set). If
false, an error is raised. The default is false.
pg_ls_dirpg_ls_dir returns the names of all files (and directories
and other special files) in the specified directory. The
include_dot_dirs parameter indicates whether . and .. are
included in the result set. The default is to exclude them
(false), but including them can be useful when
missing_ok is true, to distinguish an
empty directory from a non-existent directory.
pg_ls_logdirpg_ls_logdir returns the name, size, and last modified time
(mtime) of each file in the log directory. By default, only superusers
and members of the pg_monitor role can use this function.
Access may be granted to others using GRANT.
Filenames beginning with a dot, directories, and other special files are not shown.
pg_ls_waldirpg_ls_waldir returns the name, size, and last modified time
(mtime) of each file in the write ahead log (WAL) directory. By
default only superusers and members of the pg_monitor role
can use this function. Access may be granted to others using
GRANT.
Filenames beginning with a dot, directories, and other special files are not shown.
pg_ls_archive_statusdirpg_ls_archive_statusdir returns the name, size, and
last modified time (mtime) of each file in the WAL archive status
directory pg_wal/archive_status. By default only
superusers and members of the pg_monitor role can
use this function. Access may be granted to others using
GRANT.
Filenames beginning with a dot, directories, and other special files are not shown.
pg_ls_tmpdirpg_ls_tmpdir returns the name, size, and last modified
time (mtime) of each file in the temporary file directory for the specified
tablespace. If tablespace is
not provided, the pg_default tablespace is used. By
default only superusers and members of the pg_monitor
role can use this function. Access may be granted to others using
GRANT.
Filenames beginning with a dot, directories, and other special files are not shown.
pg_read_filepg_read_file returns part of a text file, starting
at the given offset, returning at most length
bytes (less if the end of file is reached first). If offset
is negative, it is relative to the end of the file.
If offset and length are omitted, the entire
file is returned. The bytes read from the file are interpreted as a string
in the server encoding; an error is thrown if they are not valid in that
encoding.
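For example, to read the first 64 bytes of the postmaster.pid file in the cluster directory:

```sql
SELECT pg_read_file('postmaster.pid', 0, 64);
```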
pg_read_binary_filepg_read_binary_file is similar to
pg_read_file, except that the result is a bytea value;
accordingly, no encoding checks are performed.
In combination with the convert_from function, this function
can be used to read a file in a specified encoding:
SELECT convert_from(pg_read_binary_file('file_in_utf8.txt'), 'UTF8');
pg_stat_filepg_stat_file returns a record containing the file
size, last accessed time stamp, last modified time stamp,
last file status change time stamp (Unix platforms only),
file creation time stamp (Windows only), and a boolean
indicating if it is a directory. Typical usages include:
SELECT * FROM pg_stat_file('filename');
SELECT (pg_stat_file('filename')).modification;
Advisory Lock Functions
The functions shown in
manage advisory locks. For details about proper use of these functions,
see .
Advisory Lock FunctionsNameReturn TypeDescriptionpg_advisory_lock(keybigint)voidObtain exclusive session level advisory lockpg_advisory_lock(key1int, key2int)voidObtain exclusive session level advisory lockpg_advisory_lock_shared(keybigint)voidObtain shared session level advisory lockpg_advisory_lock_shared(key1int, key2int)voidObtain shared session level advisory lockpg_advisory_unlock(keybigint)booleanRelease an exclusive session level advisory lockpg_advisory_unlock(key1int, key2int)booleanRelease an exclusive session level advisory lockpg_advisory_unlock_all()voidRelease all session level advisory locks held by the current sessionpg_advisory_unlock_shared(keybigint)booleanRelease a shared session level advisory lockpg_advisory_unlock_shared(key1int, key2int)booleanRelease a shared session level advisory lockpg_advisory_xact_lock(keybigint)voidObtain exclusive transaction level advisory lockpg_advisory_xact_lock(key1int, key2int)voidObtain exclusive transaction level advisory lockpg_advisory_xact_lock_shared(keybigint)voidObtain shared transaction level advisory lockpg_advisory_xact_lock_shared(key1int, key2int)voidObtain shared transaction level advisory lockpg_try_advisory_lock(keybigint)booleanObtain exclusive session level advisory lock if availablepg_try_advisory_lock(key1int, key2int)booleanObtain exclusive session level advisory lock if availablepg_try_advisory_lock_shared(keybigint)booleanObtain shared session level advisory lock if availablepg_try_advisory_lock_shared(key1int, key2int)booleanObtain shared session level advisory lock if availablepg_try_advisory_xact_lock(keybigint)booleanObtain exclusive transaction level advisory lock if availablepg_try_advisory_xact_lock(key1int, key2int)booleanObtain exclusive transaction level advisory lock if availablepg_try_advisory_xact_lock_shared(keybigint)booleanObtain shared transaction level advisory lock if availablepg_try_advisory_xact_lock_shared(key1int, key2int)booleanObtain shared transaction level 
advisory lock if available
pg_advisory_lockpg_advisory_lock locks an application-defined resource,
which can be identified either by a single 64-bit key value or two
32-bit key values (note that these two key spaces do not overlap).
If another session already holds a lock on the same resource identifier,
this function will wait until the resource becomes available. The lock
is exclusive. Multiple lock requests stack, so that if the same resource
is locked three times it must then be unlocked three times to be
released for other sessions' use.
pg_advisory_lock_sharedpg_advisory_lock_shared works the same as
pg_advisory_lock,
except the lock can be shared with other sessions requesting shared locks.
Only would-be exclusive lockers are locked out.
pg_try_advisory_lockpg_try_advisory_lock is similar to
pg_advisory_lock, except the function will not wait for the
lock to become available. It will either obtain the lock immediately and
return true, or return false if the lock cannot be
acquired immediately.
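For example, a non-blocking locking pattern might look like this (the key 12345 is an application-chosen value):

```sql
SELECT pg_try_advisory_lock(12345);
-- if the call returned true, release the lock when finished:
SELECT pg_advisory_unlock(12345);
```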
pg_try_advisory_lock_sharedpg_try_advisory_lock_shared works the same as
pg_try_advisory_lock, except it attempts to acquire
a shared rather than an exclusive lock.
pg_advisory_unlockpg_advisory_unlock will release a previously-acquired
exclusive session level advisory lock. It
returns true if the lock is successfully released.
If the lock was not held, it will return false,
and in addition, an SQL warning will be reported by the server.
pg_advisory_unlock_sharedpg_advisory_unlock_shared works the same as
pg_advisory_unlock,
except it releases a shared session level advisory lock.
pg_advisory_unlock_allpg_advisory_unlock_all will release all session level advisory
locks held by the current session. (This function is implicitly invoked
at session end, even if the client disconnects ungracefully.)
pg_advisory_xact_lockpg_advisory_xact_lock works the same as
pg_advisory_lock, except the lock is automatically released
at the end of the current transaction and cannot be released explicitly.
pg_advisory_xact_lock_sharedpg_advisory_xact_lock_shared works the same as
pg_advisory_lock_shared, except the lock is automatically released
at the end of the current transaction and cannot be released explicitly.
pg_try_advisory_xact_lockpg_try_advisory_xact_lock works the same as
pg_try_advisory_lock, except the lock, if acquired,
is automatically released at the end of the current transaction and
cannot be released explicitly.
pg_try_advisory_xact_lock_sharedpg_try_advisory_xact_lock_shared works the same as
pg_try_advisory_lock_shared, except the lock, if acquired,
is automatically released at the end of the current transaction and
cannot be released explicitly.
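For example, to hold an exclusive advisory lock only for the duration of the current transaction (the key values are application-chosen):

```sql
BEGIN;
SELECT pg_advisory_xact_lock(1, 42);
-- ... do work while the lock is held ...
COMMIT;  -- the lock is released automatically
```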
Trigger Functionssuppress_redundant_updates_trigger
Currently PostgreSQL provides one built-in trigger
function, suppress_redundant_updates_trigger,
which will prevent any update
that does not actually change the data in the row from taking place, in
contrast to the normal behavior which always performs the update
regardless of whether or not the data has changed. (This normal behavior
makes updates run faster, since no checking is required, and is also
useful in certain cases.)
Ideally, you should avoid running updates that don't actually
change the data in the record. Redundant updates can cost considerable
unnecessary time, especially if there are lots of indexes to alter,
and space in dead rows that will eventually have to be vacuumed.
However, detecting such situations in client code is not
always easy, or even possible, and writing expressions to detect
them can be error-prone. An alternative is to use
suppress_redundant_updates_trigger, which will skip
updates that don't change the data. You should use this with care,
however. The trigger takes a small but non-trivial time for each record,
so if most of the records affected by an update are actually changed,
use of this trigger will actually make the update run slower.
The suppress_redundant_updates_trigger function can be
added to a table like this:
CREATE TRIGGER z_min_update
BEFORE UPDATE ON tablename
FOR EACH ROW EXECUTE FUNCTION suppress_redundant_updates_trigger();
In most cases, you would want to fire this trigger last for each row.
Bearing in mind that triggers fire in name order, you would then
choose a trigger name that comes after the name of any other trigger
you might have on the table.
For more information about creating triggers, see
.
Event Trigger FunctionsPostgreSQL provides these helper functions
to retrieve information from event triggers.
For more information about event triggers,
see .
Capturing Changes at Command Endpg_event_trigger_ddl_commandspg_event_trigger_ddl_commands returns a list of
DDL commands executed by each user action,
when invoked in a function attached to a
ddl_command_end event trigger. If called in any other
context, an error is raised.
pg_event_trigger_ddl_commands returns one row for each
base command executed; some commands that are a single SQL sentence
may return more than one row. This function returns the following
columns:
NameTypeDescriptionclassidoidOID of catalog the object belongs inobjidoidOID of the object itselfobjsubidintegerSub-object ID (e.g. attribute number for a column)command_tagtextCommand tagobject_typetextType of the objectschema_nametext
Name of the schema the object belongs in, if any; otherwise NULL.
No quoting is applied.
object_identitytext
Text rendering of the object identity, schema-qualified. Each
identifier included in the identity is quoted if necessary.
in_extensionboolTrue if the command is part of an extension scriptcommandpg_ddl_command
A complete representation of the command, in internal format.
This cannot be output directly, but it can be passed to other
functions to obtain different pieces of information about the
command.
Processing Objects Dropped by a DDL Commandpg_event_trigger_dropped_objectspg_event_trigger_dropped_objects returns a list of all objects
dropped by the command in whose sql_drop event it is called.
If called in any other context,
pg_event_trigger_dropped_objects raises an error.
pg_event_trigger_dropped_objects returns the following columns:
Name             Type     Description
classid          oid      OID of catalog the object belonged in
objid            oid      OID of the object itself
objsubid         integer  Sub-object ID (e.g., attribute number for a column)
original         bool     True if this was one of the root object(s) of the
                          deletion
normal           bool     True if there was a normal dependency relationship
                          in the dependency graph leading to this object
is_temporary     bool     True if this was a temporary object
object_type      text     Type of the object
schema_name      text     Name of the schema the object belonged in, if any;
                          otherwise NULL. No quoting is applied.
object_name      text     Name of the object, if the combination of schema and
                          name can be used as a unique identifier for the
                          object; otherwise NULL. No quoting is applied, and
                          name is never schema-qualified.
object_identity  text     Text rendering of the object identity,
                          schema-qualified. Each identifier included in the
                          identity is quoted if necessary.
address_names    text[]   An array that, together with object_type and
                          address_args, can be used by the
                          pg_get_object_address() function to recreate the
                          object address in a remote server containing an
                          identically named object of the same kind.
address_args     text[]   Complement for address_names.
The pg_event_trigger_dropped_objects function can be used
in an event trigger like this:
CREATE FUNCTION test_event_trigger_for_drops()
RETURNS event_trigger LANGUAGE plpgsql AS $$
DECLARE
obj record;
BEGIN
FOR obj IN SELECT * FROM pg_event_trigger_dropped_objects()
LOOP
RAISE NOTICE '% dropped object: % %.% %',
tg_tag,
obj.object_type,
obj.schema_name,
obj.object_name,
obj.object_identity;
END LOOP;
END
$$;
CREATE EVENT TRIGGER test_event_trigger_for_drops
ON sql_drop
EXECUTE FUNCTION test_event_trigger_for_drops();
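The address_names and address_args columns are intended to be passed,
together with object_type, to pg_get_object_address() to recompute the
object's address. A minimal sketch, assuming a table named
public.foo exists (the table name is illustrative):
SELECT classid, objid, objsubid
FROM pg_get_object_address('table', '{public,foo}', '{}');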
Handling a Table Rewrite Event
The functions shown in
provide information about a table for which a
table_rewrite event has just been called.
If called in any other context, an error is raised.
Table Rewrite Information

Name                                     Return Type  Description
pg_event_trigger_table_rewrite_oid()     oid          The OID of the table about to be
                                                      rewritten.
pg_event_trigger_table_rewrite_reason()  int          The reason code(s) explaining the
                                                      reason for rewriting. The exact
                                                      meaning of the codes is release
                                                      dependent.
The pg_event_trigger_table_rewrite_oid function can be used
in an event trigger like this:
CREATE FUNCTION test_event_trigger_table_rewrite_oid()
RETURNS event_trigger
LANGUAGE plpgsql AS
$$
BEGIN
RAISE NOTICE 'rewriting table % for reason %',
pg_event_trigger_table_rewrite_oid()::regclass,
pg_event_trigger_table_rewrite_reason();
END;
$$;
CREATE EVENT TRIGGER test_table_rewrite_oid
ON table_rewrite
EXECUTE FUNCTION test_event_trigger_table_rewrite_oid();
Statistics Information Functions

PostgreSQL provides a function to inspect complex
statistics defined using the CREATE STATISTICS command.
Inspecting MCV Lists

pg_mcv_list_items returns one row for each item
stored in a multi-column MCV list, with the
following columns:
Name            Type              Description
index           int               Index of the item in the MCV list
values          text[]            Values stored in the MCV item
nulls           boolean[]         Flags identifying NULL values
frequency       double precision  Frequency of this MCV item
base_frequency  double precision  Base frequency of this MCV item
The pg_mcv_list_items function can be used like this:
SELECT m.* FROM pg_statistic_ext JOIN pg_statistic_ext_data ON (oid = stxoid),
    pg_mcv_list_items(stxdmcv) m WHERE stxname = 'stts';
Values of the pg_mcv_list can be obtained only from the
pg_statistic_ext_data.stxdmcv column.
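For context, a statistics object such as the stts referenced above could
be created as follows; the table, columns, and generated data are
illustrative:
CREATE TABLE t (a int, b int);
INSERT INTO t SELECT mod(i, 100), mod(i, 100)
FROM generate_series(1, 10000) s(i);
CREATE STATISTICS stts (mcv) ON a, b FROM t;
ANALYZE t;
After ANALYZE, the query above returns the MCV items collected for
the (a, b) combination.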