diff --git a/doc/FAQ b/doc/FAQ deleted file mode 100644 index cd57893da40..00000000000 --- a/doc/FAQ +++ /dev/null @@ -1,873 +0,0 @@ - - Frequently Asked Questions (FAQ) for PostgreSQL - - Last updated: Tue Sep 23 16:19:49 EDT 2008 - - Current maintainer: Bruce Momjian (bruce@momjian.us) - - The most recent version of this document can be viewed at - http://www.postgresql.org/files/documentation/faqs/FAQ.html. - - Platform-specific questions are answered at - http://www.postgresql.org/docs/faq/. - _________________________________________________________________ - - General Questions - - 1.1) What is PostgreSQL? How is it pronounced? What is Postgres? - 1.2) Who controls PostgreSQL? - 1.3) What is the copyright of PostgreSQL? - 1.4) What platforms does PostgreSQL support? - 1.5) Where can I get PostgreSQL? - 1.6) What is the most recent release? - 1.7) Where can I get support? - 1.8) How do I submit a bug report? - 1.9) How do I find out about known bugs or missing features? - 1.10) What documentation is available? - 1.11) How can I learn SQL? - 1.12) How do I submit a patch or join the development team? - 1.13) How does PostgreSQL compare to other DBMSs? Can PostgreSQL be - embedded? - 1.14) Will PostgreSQL handle recent daylight saving time changes in - various countries? - 1.15) How do I unsubscribe from the PostgreSQL email lists? How do I - avoid receiving duplicate emails? - - User Client Questions - - 2.1) What interfaces are available for PostgreSQL? - 2.2) What tools are available for using PostgreSQL with Web pages? - 2.3) Does PostgreSQL have a graphical user interface? - - Administrative Questions - - 3.1) How do I install PostgreSQL somewhere other than - /usr/local/pgsql? - 3.2) How do I control connections from other hosts? - 3.3) How do I tune the database engine for better performance? - 3.4) What debugging features are available? - 3.5) Why do I get "Sorry, too many clients" when trying to connect? - 3.6 What is the upgrade process for PostgreSQL? 
- 3.7) What computer hardware should I use? - - Operational Questions - - 4.1) How do I SELECT only the first few rows of a query? A random row? - 4.2) How do I find out what tables, indexes, databases, and users are - defined? How do I see the queries used by psql to display them? - 4.3) How do you change a column's data type? - 4.4) What is the maximum size for a row, a table, and a database? - 4.5) How much database disk space is required to store data from a - typical text file? - 4.6) Why are my queries slow? Why don't they use my indexes? - 4.7) How do I see how the query optimizer is evaluating my query? - 4.8) How do I perform regular expression searches and case-insensitive - regular expression searches? How do I use an index for - case-insensitive searches? - 4.9) In a query, how do I detect if a field is NULL? How do I - concatenate possible NULLs? How can I sort on whether a field is NULL - or not? - 4.10) What is the difference between the various character types? - 4.11.1) How do I create a serial/auto-incrementing field? - 4.11.2) How do I get the value of a SERIAL insert? - 4.11.3) Doesn't currval() lead to a race condition with other users? - 4.11.4) Why aren't my sequence numbers reused on transaction abort? - Why are there gaps in the numbering of my sequence/SERIAL column? - 4.12) What is an OID? What is a CTID? - 4.13) Why do I get the error "ERROR: Memory exhausted in - AllocSetAlloc()"? - 4.14) How do I tell what PostgreSQL version I am running? - 4.15) How do I create a column that will default to the current time? - 4.16) How do I perform an outer join? - 4.17) How do I perform queries using multiple databases? - 4.18) How do I return multiple rows or columns from a function? - 4.19) Why do I get "relation with OID ##### does not exist" errors - when accessing temporary tables in PL/PgSQL functions? - 4.20) What replication solutions are available? - 4.21) Why are my table and column names not recognized in my query? 
- Why is capitalization not preserved? - _________________________________________________________________ - - General Questions - - 1.1) What is PostgreSQL? How is it pronounced? What is Postgres? - - PostgreSQL is pronounced Post-Gres-Q-L. (For those curious about how - to say "PostgreSQL", an audio file is available.) - - PostgreSQL is an object-relational database system that has the - features of traditional commercial database systems with enhancements - to be found in next-generation DBMS systems. PostgreSQL is free and - the complete source code is available. - - PostgreSQL development is performed by a team of mostly volunteer - developers spread throughout the world and communicating via the - Internet. It is a community project and is not controlled by any - company. To get involved, see the developer's FAQ at - http://www.postgresql.org/docs/faqs.FAQ_DEV.html - - Postgres is a widely-used nickname for PostgreSQL. It was the original - name of the project at Berkeley and is strongly preferred over other - nicknames. If you find 'PostgreSQL' hard to pronounce, call it - 'Postgres' instead. - - 1.2) Who controls PostgreSQL? - - If you are looking for a PostgreSQL gatekeeper, central committee, or - controlling company, give up --- there isn't one. We do have a core - committee and CVS committers, but these groups are more for - administrative purposes than control. The project is directed by the - community of developers and users, which anyone can join. All you need - to do is subscribe to the mailing lists and participate in the - discussions. (See the Developer's FAQ for information on how to get - involved in PostgreSQL development.) - - 1.3) What is the copyright of PostgreSQL? - - PostgreSQL is distributed under the classic BSD license. Basically, it - allows users to do anything they want with the code, including - reselling binaries without the source code. 
The only restriction is - that you not hold us legally liable for problems with the software. - There is also the requirement that this copyright appear in all copies - of the software. Here is the actual BSD license we use: - - PostgreSQL Data Base Management System - - Portions Copyright (c) 1996-2009, PostgreSQL Global Development Group - Portions Copyright (c) 1994-1996 Regents of the University of - California - - Permission to use, copy, modify, and distribute this software and its - documentation for any purpose, without fee, and without a written - agreement is hereby granted, provided that the above copyright notice - and this paragraph and the following two paragraphs appear in all - copies. - - IN NO EVENT SHALL THE UNIVERSITY OF CALIFORNIA BE LIABLE TO ANY PARTY - FOR DIRECT, INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES, - INCLUDING LOST PROFITS, ARISING OUT OF THE USE OF THIS SOFTWARE AND - ITS DOCUMENTATION, EVEN IF THE UNIVERSITY OF CALIFORNIA HAS BEEN - ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. - - THE UNIVERSITY OF CALIFORNIA SPECIFICALLY DISCLAIMS ANY WARRANTIES, - INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF - MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE SOFTWARE - PROVIDED HEREUNDER IS ON AN "AS IS" BASIS, AND THE UNIVERSITY OF - CALIFORNIA HAS NO OBLIGATIONS TO PROVIDE MAINTENANCE, SUPPORT, - UPDATES, ENHANCEMENTS, OR MODIFICATIONS. - - 1.4) What platforms does PostgreSQL support? - - In general, any modern Unix-compatible platform should be able to run - PostgreSQL. The platforms that had received explicit testing at the - time of release are listed in the installation instructions. - - PostgreSQL also runs natively on Microsoft Windows NT-based operating - systems like Win2000 SP4, WinXP, and Win2003. A prepackaged installer - is available at http://www.postgresql.org/download/windows. - MSDOS-based versions of Windows (Win95, Win98, WinMe) can run - PostgreSQL using Cygwin. 
- - There is also a Novell Netware 6 port at - http://developer.novell.com/wiki/index.php/Postgresql, and an OS/2 - (eComStation) version at - http://hobbes.nmsu.edu/cgi-bin/h-search?sh=1&button=Search&key=postgre - SQL&stype=all&sort=type&dir=%2F. - - 1.5) Where can I get PostgreSQL? - - Via web browser, use http://www.postgresql.org/ftp/, and via ftp, use - ftp://ftp.postgresql.org/pub/. - - 1.6) What is the most recent release? - - The latest release of PostgreSQL is version 8.3.3. - - We plan to have a major release every year, with minor releases every - few months. - - 1.7) Where can I get support? - - The PostgreSQL community provides assistance to many of its users via - email. The main web site to subscribe to the email lists is - http://www.postgresql.org/community/lists/. The general or bugs lists - are a good place to start. - - The major IRC channel is #postgresql on Freenode (irc.freenode.net). - To connect you can use the Unix program irc -c '#postgresql' "$USER" - irc.freenode.net or use any other IRC clients. A Spanish one also - exists on the same network, (#postgresql-es), a French one, - (#postgresqlfr), and a Brazilian one, (#postgresql-br). There is also - a PostgreSQL channel on EFNet. - - A list of commercial support companies is available at - http://www.postgresql.org/support/professional_support. - - 1.8) How do I submit a bug report? - - Visit the PostgreSQL bug form at - http://www.postgresql.org/support/submitbug. Also check out our ftp - site ftp://ftp.postgresql.org/pub/ to see if there is a more recent - PostgreSQL version. 
- - Bugs submitted using the bug form or posted to any PostgreSQL mailing - list typically generate one of the following replies: - * It is not a bug, and why - * It is a known bug and is already on the TODO list - * The bug has been fixed in the current release - * The bug has been fixed but is not packaged yet in an official - release - * A request is made for more detailed information: - + Operating system - + PostgreSQL version - + Reproducible test case - + Debugging information - + Debugger backtrace output - * The bug is new. The following might happen: - + A patch is created and will be included in the next major or - minor release - + The bug cannot be fixed immediately and is added to the TODO - list - - 1.9) How do I find out about known bugs or missing features? - - PostgreSQL supports an extended subset of SQL:2003. See our TODO list - for known bugs, missing features, and future plans. - - A feature request usually results in one of the following replies: - * The feature is already on the TODO list - * The feature is not desired because: - + It duplicates existing functionality that already follows the - SQL standard - + The feature would increase code complexity but add little - benefit - + The feature would be insecure or unreliable - * The new feature is added to the TODO list - - PostgreSQL does not use a bug tracking system because we find it more - efficient to respond directly to email and keep the TODO list - up-to-date. In practice, bugs don't last very long in the software, - and bugs that affect a large number of users are fixed rapidly. The - only way to find all changes, improvements, and fixes in a - PostgreSQL release is to read the CVS log messages. Even the release - notes do not list every change made to the software. - - 1.10) What documentation is available? - - PostgreSQL includes extensive documentation, including a large manual, - manual pages, and some test examples. See the /doc directory. 
You can - also browse the manuals online at http://www.postgresql.org/docs. - - There are two PostgreSQL books available online at - http://www.postgresql.org/docs/books/awbook.html and - http://www.commandprompt.com/ppbook/. There are a number of PostgreSQL - books available for purchase. One of the most popular ones is by Korry - Douglas. A list of book reviews can be found at - http://www.postgresql.org/docs/books/. There is also a collection of - PostgreSQL technical articles at - http://wiki.postgresql.org/wiki/Community_Generated_Articles%2C_Guides - %2C_and_Documentation. - - The command line client program psql has some \d commands to show - information about types, operators, functions, aggregates, etc. - use - \? to display the available commands. - - Our web site contains even more documentation. - - 1.11) How can I learn SQL? - - First, consider the PostgreSQL-specific books mentioned above. Many of - our users also like The Practical SQL Handbook, Bowman, Judith S., et - al., Addison-Wesley. Others like The Complete Reference SQL, Groff et - al., McGraw-Hill. - - There are also many nice tutorials available online: - * http://www.intermedia.net/support/sql/sqltut.shtm - * http://sqlcourse.com - * http://www.w3schools.com/sql/default.asp - * http://mysite.verizon.net/Graeme_Birchall/id1.html - * http://sqlzoo.net - - 1.12) How do I submit a patch or join the development team? - - See the Developer's FAQ. - - 1.13) How does PostgreSQL compare to other DBMSs? Can PostgreSQL be embedded? - - There are several ways of measuring software: features, performance, - reliability, support, and price. - - Features - PostgreSQL has most features present in large commercial DBMSs, - like transactions, subselects, triggers, views, foreign key - referential integrity, and sophisticated locking. We have some - features they do not have, like user-defined types, - inheritance, rules, and multi-version concurrency control to - reduce lock contention. 
- - Performance - PostgreSQL's performance is comparable to other commercial and - open source databases. It is faster for some things, slower for - others. Our performance is usually +/-10% compared to other - databases. - - Reliability - We realize that a DBMS must be reliable, or it is worthless. We - strive to release well-tested, stable code that has a minimum - of bugs. Each release has at least one month of beta testing, - and our release history shows that we can provide stable, solid - releases that are ready for production use. We believe we - compare favorably to other database software in this area. - - Support - Our mailing lists provide contact with a large group of - developers and users to help resolve any problems encountered. - While we cannot guarantee a fix, commercial DBMSs do not always - supply a fix either. Direct access to developers, the user - community, manuals, and the source code often make PostgreSQL - support superior to other DBMSs. There is commercial - per-incident support available for those who need it. (See FAQ - section 1.7.) - - Price - We are free for all use, both commercial and non-commercial. - You can add our code to your product with no limitations, - except those outlined in our BSD-style license stated above. - - PostgreSQL is designed as a client/server architecture, which requires - separate processes for each client and server, and various helper - processes. Many embedded architectures can support such requirements. - However, if your embedded architecture requires the database server to - run inside the application process, you cannot use Postgres and should - select a lighter-weight database solution. - - 1.14) Will PostgreSQL handle recent daylight saving time changes in various - countries? - - USA daylight saving time changes are included in PostgreSQL release - 8.0.[4+], and all later major releases, e.g. 8.1. 
Canada and Western - Australia changes are included in 8.0.[10+], 8.1.[6+], and all later - major releases. PostgreSQL releases prior to 8.0 use the operating - system's timezone database for daylight saving information. - - 1.15) How do I unsubscribe from the PostgreSQL email lists? How do I avoid - receiving duplicate emails? - - The PostgreSQL Majordomo page allows subscribing or unsubscribing from - any of the PostgreSQL email lists. (You might need to have your - Majordomo password emailed to you to log in.) - - All PostgreSQL email lists are configured so a group reply goes to the - email list and the original email author. This is done so users - receive the quickest possible email replies. If you would prefer not - to receive duplicate email from the list in cases where you already - receive an email directly, check eliminatecc from the Majordomo Change - Settings page. You can also prevent yourself from receiving copies of - emails you post to the lists by unchecking selfcopy. - _________________________________________________________________ - - User Client Questions - - 2.1) What interfaces are available for PostgreSQL? - - The PostgreSQL install includes only the C and embedded C interfaces. - All other interfaces are independent projects that are downloaded - separately; being separate allows them to have their own release - schedule and development teams. - - Some programming languages like PHP include an interface to - PostgreSQL. Interfaces for languages like Perl, TCL, Python, and many - others are available at http://pgfoundry.org. - - 2.2) What tools are available for using PostgreSQL with Web pages? - - A nice introduction to Database-backed Web pages can be seen at: - http://www.webreview.com - - For Web integration, PHP (http://www.php.net) is an excellent - interface. - - For complex cases, many use the Perl and DBD::Pg with CGI.pm or - mod_perl. - - 2.3) Does PostgreSQL have a graphical user interface? 
- - There are a large number of GUI Tools that are available for - PostgreSQL from both commercial and open source developers. A detailed - list can be found in the Community Guide to PostgreSQL GUI Tools. - _________________________________________________________________ - - Administrative Questions - - 3.1) How do I install PostgreSQL somewhere other than /usr/local/pgsql? - - Specify the --prefix option when running configure. - - 3.2) How do I control connections from other hosts? - - By default, PostgreSQL only allows connections from the local machine - using Unix domain sockets or TCP/IP connections. Other machines will - not be able to connect unless you modify listen_addresses in the - postgresql.conf file, enable host-based authentication by modifying - the $PGDATA/pg_hba.conf file, and restart the database server. - - 3.3) How do I tune the database engine for better performance? - - There are three major areas for potential performance improvement: - - Query Changes - This involves modifying queries to obtain better performance: - - + Creation of indexes, including expression and partial indexes - + Use of COPY instead of multiple INSERTs - + Grouping of multiple statements into a single transaction to - reduce commit overhead - + Use of CLUSTER when retrieving many rows from an index - + Use of LIMIT for returning a subset of a query's output - + Use of Prepared queries - + Use of ANALYZE to maintain accurate optimizer statistics - + Regular use of VACUUM or pg_autovacuum - + Dropping of indexes during large data changes - - Server Configuration - A number of postgresql.conf settings affect performance. For - more details, see Administration Guide/Server Run-time - Environment/Run-time Configuration for a full listing, and for - commentary see - http://www.varlena.com/varlena/GeneralBits/Tidbits/annotated_co - nf_e.html and - http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html. 
- - Hardware Selection - The effect of hardware on performance is detailed in - http://www.powerpostgresql.com/PerfList/ and - http://momjian.us/main/writings/pgsql/hw_performance/index.html - . - - 3.4) What debugging features are available? - - There are many log_* server configuration variables at - http://www.postgresql.org/docs/current/interactive/runtime-config-logg - ing.html that enable printing of query and process statistics which - can be very useful for debugging and performance measurements. - - 3.5) Why do I get "Sorry, too many clients" when trying to connect? - - You have reached the default limit of 100 database sessions. You need - to increase the server's limit on how many concurrent backend - processes it can start by changing the max_connections value in - postgresql.conf and restarting the server. - - 3.6) What is the upgrade process for PostgreSQL? - - See http://www.postgresql.org/support/versioning for a general - discussion about upgrading, and - http://www.postgresql.org/docs/current/static/install-upgrading.html - for specific instructions. - - 3.7) What computer hardware should I use? - - Because PC hardware is mostly compatible, people tend to believe that - all PC hardware is of equal quality. It is not. ECC RAM, SCSI, and - quality motherboards are more reliable and have better performance - than less expensive hardware. PostgreSQL will run on almost any - hardware, but if reliability and performance are important it is wise - to research your hardware options thoroughly. A disk controller with a - battery-backed cache is also useful. Our email lists can be used to - discuss hardware options and tradeoffs. - _________________________________________________________________ - - Operational Questions - - 4.1) How do I SELECT only the first few rows of a query? A random row? - - To retrieve only a few rows, if you know the number of rows needed - at the time of the SELECT, use LIMIT. 
If an index matches the ORDER BY, - it is possible that the entire query does not have to be executed. If you - don't know the number of rows at SELECT time, use a cursor and FETCH. - - To SELECT a random row, use: - SELECT col - FROM tab - ORDER BY random() - LIMIT 1; - - 4.2) How do I find out what tables, indexes, databases, and users are - defined? How do I see the queries used by psql to display them? - - Use the \dt command to see tables in psql. For a complete list of - commands inside psql you can use \?. Alternatively you can read the - source code for psql in file pgsql/src/bin/psql/describe.c; it - contains SQL commands that generate the output for psql's backslash - commands. You can also start psql with the -E option so it will print - out the queries it uses to execute the commands you give. PostgreSQL - also provides an SQL-compliant INFORMATION SCHEMA interface you can - query to get information about the database. - - There are also system tables beginning with pg_ that describe these - too. - - Running psql -l will list all databases. - - Also try the file pgsql/src/tutorial/syscat.source. It illustrates - many of the SELECTs needed to get information from the database system - tables. - - 4.3) How do you change a column's data type? - - Changing the data type of a column can be done easily in 8.0 and later - with ALTER TABLE ALTER COLUMN TYPE. - - In earlier releases, do this: - BEGIN; - ALTER TABLE tab ADD COLUMN new_col new_data_type; - UPDATE tab SET new_col = CAST(old_col AS new_data_type); - ALTER TABLE tab DROP COLUMN old_col; - COMMIT; - - You might then want to do VACUUM FULL tab to reclaim the disk space - used by the expired rows. - - 4.4) What is the maximum size for a row, a table, and a database? - - These are the limits: - - Maximum size for a database? unlimited (32 TB databases exist) - Maximum size for a table? 32 TB - Maximum size for a row? 400 GB - Maximum size for a field? 1 GB - Maximum number of rows in a table? 
unlimited - Maximum number of columns in a table? 250-1600 depending on column - types - Maximum number of indexes on a table? unlimited - - Of course, these are not actually unlimited, but limited to available - disk space and memory/swap space. Performance may suffer when these - values get unusually large. - - The maximum table size of 32 TB does not require large file support - from the operating system. Large tables are stored as multiple 1 GB - files so file system size limits are not important. - - The maximum table size, row size, and maximum number of columns can be - quadrupled by increasing the default block size to 32k. The maximum - table size can also be increased using table partitioning. - - One limitation is that indexes can not be created on columns longer - than about 2,000 characters. Fortunately, such indexes are rarely - needed. Uniqueness is best guaranteed by a function index of an MD5 - hash of the long column, and full text indexing allows for searching - of words within the column. - - 4.5) How much database disk space is required to store data from a typical - text file? - - A PostgreSQL database may require up to five times the disk space to - store data from a text file. - - As an example, consider a file of 100,000 lines with an integer and - text description on each line. Suppose the text string averages - twenty bytes in length. The flat file would be 2.8 MB. 
The size of the - PostgreSQL database file containing this data can be estimated as 5.2 - MB: - 24 bytes: each row header (approximate) - 24 bytes: one int field and one text field - + 4 bytes: pointer on page to tuple - ---------------------------------------- - 52 bytes per row - - The data page size in PostgreSQL is 8192 bytes (8 KB), so: - - 8192 bytes per page - ------------------- = 157 rows per database page (rounded down) - 52 bytes per row - - 100000 data rows - -------------------- = 637 database pages (rounded up) - 157 rows per page - -637 database pages * 8192 bytes per page = 5,218,304 bytes (5.2 MB) - - Indexes do not require as much overhead, but do contain the data that - is being indexed, so they can be large also. - - NULLs are stored as bitmaps, so they use very little space. - - 4.6) Why are my queries slow? Why don't they use my indexes? - - Indexes are not used by every query. Indexes are used only if the - table is larger than a minimum size, and the query selects only a - small percentage of the rows in the table. This is because the random - disk access caused by an index scan can be slower than a straight read - through the table, or sequential scan. - - To determine if an index should be used, PostgreSQL must have - statistics about the table. These statistics are collected using - VACUUM ANALYZE, or simply ANALYZE. Using statistics, the optimizer - knows how many rows are in the table, and can better determine if - indexes should be used. Statistics are also valuable in determining - optimal join order and join methods. Statistics collection should be - performed periodically as the contents of the table change. - - Indexes are normally not used for ORDER BY or to perform joins. A - sequential scan followed by an explicit sort is usually faster than an - index scan of a large table. However, LIMIT combined with ORDER BY - often will use an index because only a small portion of the table is - returned. 
- - If you believe the optimizer is incorrect in choosing a sequential - scan, use SET enable_seqscan TO 'off' and run the query again to see if an - index scan is indeed faster. - - When using wild-card operators such as LIKE or ~, indexes can only be - used in certain circumstances: - * The beginning of the search string must be anchored to the start - of the string, i.e. - + LIKE patterns must not start with %. - + ~ (regular expression) patterns must start with ^. - * The search string can not start with a character class, e.g. - [a-e]. - * Case-insensitive searches such as ILIKE and ~* do not utilize - indexes. Instead, use expression indexes, which are described in - section 4.8. - * The default C locale must be used during initdb because it is not - possible to know the next-greatest character in a non-C locale. - You can create a special text_pattern_ops index for such cases - that works only for LIKE indexing. It is also possible to use full - text indexing for word searches. - - 4.7) How do I see how the query optimizer is evaluating my query? - - See the EXPLAIN manual page. - - 4.8) How do I perform regular expression searches and case-insensitive - regular expression searches? How do I use an index for case-insensitive - searches? - - The ~ operator does regular expression matching, and ~* does - case-insensitive regular expression matching. The case-insensitive - variant of LIKE is called ILIKE. - - Case-insensitive equality comparisons are normally expressed as: - SELECT * - FROM tab - WHERE lower(col) = 'abc'; - - This will not use a standard index. However, if you create an - expression index, it will be used: - CREATE INDEX tabindex ON tab (lower(col)); - - If the above index is created as UNIQUE, the column can still store - upper and lowercase characters, but it cannot hold values that - differ only in case. To force a particular case to be stored in the - column, use a CHECK constraint or a trigger. 
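As a rough sketch of the CHECK constraint approach, assuming the same illustrative table tab and column col used in the examples above (the constraint name is hypothetical), lowercase storage could be enforced along these lines:

```sql
-- Hypothetical names: table "tab", column "col", constraint "col_is_lowercase".
-- Reject any value that is not already entirely lowercase.
ALTER TABLE tab
    ADD CONSTRAINT col_is_lowercase CHECK (col = lower(col));
```

With such a constraint in place, an INSERT or UPDATE that supplies a mixed-case value fails with an error instead of being stored. A trigger would be the alternative if the value should be converted rather than rejected.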
- - 4.9) In a query, how do I detect if a field is NULL? How do I concatenate - possible NULLs? How can I sort on whether a field is NULL or not? - - You test the column with IS NULL and IS NOT NULL, like this: - SELECT * - FROM tab - WHERE col IS NULL; - - To concatenate with possible NULLs, use COALESCE(), like this: - SELECT COALESCE(col1, '') || COALESCE(col2, '') - FROM tab; - - To sort by the NULL status, use the IS NULL and IS NOT NULL modifiers - in your ORDER BY clause. Things that are true will sort higher than - things that are false, so the following will put NULL entries at the - top of the resulting list: - SELECT * - FROM tab - ORDER BY (col IS NOT NULL); - - 4.10) What is the difference between the various character types? - - Type Internal Name Notes - VARCHAR(n) varchar size specifies maximum length, no padding - CHAR(n) bpchar blank padded to the specified fixed length - TEXT text no specific upper limit on length - BYTEA bytea variable-length byte array (null-byte safe) - "char" char one character - - You will see the internal name when examining system catalogs and in - some error messages. - - The first four types above are "varlena" types (i.e., the first four - bytes on disk are the length, followed by the data). Thus the actual - space used is slightly greater than the declared size. However, long - values are also subject to compression, so the space on disk might - also be less than expected. - - VARCHAR(n) is best when storing variable-length strings and it limits - how long a string can be. TEXT is for strings of unlimited length, - with a maximum of one gigabyte. - - CHAR(n) is for storing strings that are all the same length. CHAR(n) - pads with blanks to the specified length, while VARCHAR(n) only stores - the characters supplied. BYTEA is for storing binary data, - particularly values that include NULL bytes. All the types described - here have similar performance characteristics. 
- - 4.11.1) How do I create a serial/auto-incrementing field? - - PostgreSQL supports a SERIAL data type. It auto-creates a sequence. - For example, this: - CREATE TABLE person ( - id SERIAL, - name TEXT - ); - - is automatically translated into this: - CREATE SEQUENCE person_id_seq; - CREATE TABLE person ( - id INT4 NOT NULL DEFAULT nextval('person_id_seq'), - name TEXT - ); - - Automatically created sequence are named
Maximum size for a database? | unlimited (32 TB databases exist) |
Maximum size for a table? | 32 TB |
Maximum size for a row? | 400 GB |
Maximum size for a field? | 1 GB |
Maximum number of rows in a table? | unlimited |
Maximum number of columns in a table? | 250-1600 depending on column types |
Maximum number of indexes on a table? | unlimited |
Of course, these are not actually unlimited, but limited to - available disk space and memory/swap space. Performance may suffer - when these values get unusually large.
- -The maximum table size of 32 TB does not require large file - support from the operating system. Large tables are stored as - multiple 1 GB files so file system size limits are not - important.
- -The maximum table size, row size, and maximum number of columns - can be quadrupled by increasing the default block size to 32k. The - maximum table size can also be increased using table partitioning.
- -One limitation is that indexes can not be created on columns - longer than about 2,000 characters. Fortunately, such indexes are - rarely needed. Uniqueness is best guaranteed by a function index - of an MD5 hash of the long column, and full text indexing - allows for searching of words within the column.
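A minimal sketch of the MD5 approach, assuming a hypothetical table docs with a long text column body (neither name comes from the FAQ):

```sql
-- Hypothetical names: table "docs", long text column "body".
-- Index the fixed-length MD5 hash instead of the long value itself.
CREATE UNIQUE INDEX docs_body_md5_idx ON docs (md5(body));

-- A lookup has to use the same expression for the index to be considered.
SELECT *
FROM docs
WHERE md5(body) = md5('some long value');
```

Uniqueness here is enforced on the hash, so it holds only as long as two different values do not produce the same MD5 digest.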
- -A PostgreSQL database may require up to five times the disk - space to store data from a text file.
- -As an example, consider a file of 100,000 lines with an integer - and text description on each line. Suppose the text string - averages twenty bytes in length. The flat file would be 2.8 MB. - The size of the PostgreSQL database file containing this data can - be estimated as 5.2 MB:
-    24 bytes: each row header (approximate)
-    24 bytes: one int field and one text field
-   + 4 bytes: pointer on page to tuple
-   ----------------------------------------
-    52 bytes per row
-
-   The data page size in PostgreSQL is 8192 bytes (8 KB), so:
-
-   8192 bytes per page
-   -------------------   =  157 rows per database page (rounded down)
-     52 bytes per row
-
-   100000 data rows
-   --------------------  =  637 database pages (rounded up)
-      157 rows per page
-
-   637 database pages * 8192 bytes per page  =  5,218,304 bytes (5.2 MB)
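The same estimate can be reproduced in SQL. This is a rough sketch that, like the hand calculation, ignores per-page headers and alignment padding. The CTE names and column aliases are illustrative.

```sql
WITH params AS (
    SELECT 100000      AS n_rows,         -- lines in the flat file
           24 + 24 + 4 AS bytes_per_row,  -- row header + data + page pointer
           8192        AS page_bytes      -- default block size
), per_page AS (
    SELECT n_rows,
           page_bytes,
           page_bytes / bytes_per_row AS rows_per_page  -- integer division rounds down
    FROM params
)
SELECT rows_per_page,
       ceil(n_rows::numeric / rows_per_page)              AS pages,
       ceil(n_rows::numeric / rows_per_page) * page_bytes AS estimated_bytes
FROM per_page;
```

The estimated_bytes column comes out at roughly 5.2 MB for these inputs. Changing n_rows or the per-row byte counts gives a quick first guess for other tables.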
Indexes do not require as much overhead, but do contain the data - that is being indexed, so they can be large also.
- -NULLs are stored as bitmaps, so they - use very little space.
- -Indexes are not used by every query. Indexes are used only if the - table is larger than a minimum size, and the query selects only a - small percentage of the rows in the table. This is because the random - disk access caused by an index scan can be slower than a straight read - through the table, or sequential scan.
- -To determine if an index should be used, PostgreSQL must have - statistics about the table. These statistics are collected using - VACUUM ANALYZE, or simply ANALYZE. - Using statistics, the optimizer knows how many rows are in the - table, and can better determine if indexes should be used. - Statistics are also valuable in determining optimal join order and - join methods. Statistics collection should be performed - periodically as the contents of the table change.
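As a short illustration of the commands named above, using the FAQ's placeholder table tab and column col:

```sql
-- Refresh planner statistics for one table.
ANALYZE tab;

-- Or reclaim space from dead rows and refresh statistics in one pass.
VACUUM ANALYZE tab;

-- Show the plan the optimizer chooses, without running the query.
EXPLAIN SELECT * FROM tab WHERE col = 'abc';
```

Running EXPLAIN before and after ANALYZE on a freshly loaded table is a simple way to see whether the statistics change the chosen plan.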
- -Indexes are normally not used for ORDER BY or to - perform joins. A sequential scan followed by an explicit sort is - usually faster than an index scan of a large table. - However, LIMIT combined with ORDER BY - often will use an index because only a small portion of the table - is returned.
- -If you believe the optimizer is incorrect in choosing a - sequential scan, use SET enable_seqscan TO 'off' and - run the query again to see if an index scan is indeed faster.
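A sketch of that comparison, again with the placeholder names tab and col. Note that EXPLAIN ANALYZE actually executes the query in order to report real timings.

```sql
-- Plan and timing with the planner's default choices.
EXPLAIN ANALYZE SELECT * FROM tab WHERE col = 'abc';

-- Discourage sequential scans for this session only, then compare.
SET enable_seqscan TO 'off';
EXPLAIN ANALYZE SELECT * FROM tab WHERE col = 'abc';

-- Restore the default so later queries are planned normally.
RESET enable_seqscan;
```

The setting is meant for diagnosis. If the index scan is consistently faster, the usual remedy is fresh statistics or tuned cost parameters, not leaving sequential scans disabled.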
When using wild-card operators such as LIKE or - ~, indexes can only be used in certain circumstances:
-text_pattern_ops index for such cases that works only - for LIKE indexing. It is also possible to use - full text indexing for word searches.
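A minimal sketch of a text_pattern_ops index, assuming col is a TEXT column in the placeholder table tab. The index name is illustrative.

```sql
-- Index built with the pattern operator class instead of the default one.
CREATE INDEX tab_col_pattern_idx ON tab (col text_pattern_ops);

-- A pattern anchored at the start of the string can use the index.
SELECT * FROM tab WHERE col LIKE 'abc%';

-- A pattern with a leading wildcard cannot.
SELECT * FROM tab WHERE col LIKE '%abc';
```

EXPLAIN shows whether a given query actually uses the index.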
- See the EXPLAIN manual page.
- -The ~ operator does regular expression matching, and - ~* does case-insensitive regular expression matching. The - case-insensitive variant of LIKE is called - ILIKE.
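The three operators side by side, using the placeholder names tab and col:

```sql
-- Case-sensitive regular expression match: values starting with "abc".
SELECT * FROM tab WHERE col ~ '^abc';

-- Case-insensitive regular expression match: also matches "ABC...", "Abc...".
SELECT * FROM tab WHERE col ~* '^abc';

-- Case-insensitive LIKE with the usual % and _ wildcards.
SELECT * FROM tab WHERE col ILIKE 'abc%';
```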
- -Case-insensitive equality comparisons are normally expressed - as:
-    SELECT *
-    FROM tab
-    WHERE lower(col) = 'abc';
-
-This will not use a standard index. However, if you create an
- expression index, it will be used:
-
- CREATE INDEX tabindex ON tab (lower(col)); --
If the above index is created as UNIQUE, though - the column can store upper and lowercase characters, it can not have - identical values that differ only in case. To force a particular - case to be stored in the column, use a CHECK - constraint or a trigger.
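A sketch of both options on the placeholder table tab. The index and constraint names are illustrative.

```sql
-- Reject values that differ only in case ('Abc' and 'abc' cannot coexist).
CREATE UNIQUE INDEX tab_lower_col_key ON tab (lower(col));

-- Alternatively, force everything stored in the column to be lowercase.
ALTER TABLE tab
    ADD CONSTRAINT tab_col_lowercase CHECK (col = lower(col));
```

The unique expression index keeps the case the user typed. The CHECK constraint rejects mixed-case input outright, so the application (or a trigger) has to lowercase the value before storing it.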
- -You test the column with IS NULL and IS - NOT NULL, like this:
- -- SELECT * - FROM tab - WHERE col IS NULL; -- -
To concatenate with possible NULLs, use COALESCE(), - like this:
-- SELECT COALESCE(col1, '') || COALESCE(col2, '') - FROM tab -- -
To sort by the NULL status, use the IS NULL - and IS NOT NULL modifiers in your ORDER BY clause. - Things that are true will sort higher than things that are false, - so the following will put NULL entries at the top of the resulting list:
- -- SELECT * - FROM tab - ORDER BY (col IS NOT NULL) -- -
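Two related forms, shown as a sketch. The NULLS FIRST and NULLS LAST keywords are available in PostgreSQL 8.3 and later; the boolean-expression form works on older releases too.

```sql
-- Explicit NULL placement (8.3 and later).
SELECT * FROM tab ORDER BY col NULLS FIRST;

-- NULL entries last, then ordinary ordering among the non-NULL values.
SELECT * FROM tab ORDER BY (col IS NULL), col;
```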
-Type            Internal Name   Notes
-VARCHAR(n)      varchar         size specifies maximum length, no padding
-CHAR(n)         bpchar          blank padded to the specified fixed length
-TEXT            text            no specific upper limit on length
-BYTEA           bytea           variable-length byte array (null-byte safe)
-"char"          char            one character
You will see the internal name when examining system catalogs - and in some error messages.
- -The first four types above are "varlena" types (i.e., the first - four bytes on disk are the length, followed by the data). Thus the - actual space used is slightly greater than the declared size. - However, long values are also subject to compression, so the space - on disk might also be less than expected.
- - VARCHAR(n) is best when storing variable-length - strings and it limits how long a string can be. TEXT - is for strings of unlimited length, with a maximum of one gigabyte. -CHAR(n) is for storing strings that are all the - same length. CHAR(n) pads with blanks to the specified - length, while VARCHAR(n) only stores the characters - supplied. BYTEA is for storing binary data, - particularly values that include NULL bytes. All the - types described here have similar performance characteristics.
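The padding difference can be observed directly. This is a small sketch with a hypothetical table chartest; the names are illustrative.

```sql
CREATE TABLE chartest (
    c CHAR(5),
    v VARCHAR(5),
    t TEXT
);

INSERT INTO chartest VALUES ('ab', 'ab', 'ab');

-- octet_length() reports the stored bytes: c is blank padded to five
-- characters, while v and t keep only the two characters supplied.
SELECT octet_length(c) AS char_bytes,
       octet_length(v) AS varchar_bytes,
       octet_length(t) AS text_bytes
FROM chartest;

-- Values longer than the declared size are rejected for CHAR(n) and VARCHAR(n).
-- INSERT INTO chartest (v) VALUES ('too long');   -- would raise an error
```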
- -PostgreSQL supports a SERIAL data type. It - auto-creates a sequence. For example, this:
-- CREATE TABLE person ( - id SERIAL, - name TEXT - ); -- - is automatically translated into this: - -
- CREATE SEQUENCE person_id_seq; - CREATE TABLE person ( - id INT4 NOT NULL DEFAULT nextval('person_id_seq'), - name TEXT - ); -- -
Automatically created sequences are named - <table>_<serialcolumn>_seq, where - table and serialcolumn are the names of the table and - SERIAL column, respectively. See the - create_sequence manual page for more information about - sequences.
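Rather than building the sequence name by hand, the server can be asked for it. A short sketch using the person table from the example above:

```sql
-- Returns the name of the sequence that backs person.id.
SELECT pg_get_serial_sequence('person', 'id');

-- The result can be passed straight to the sequence functions.
SELECT currval(pg_get_serial_sequence('person', 'id'));
```

The second query only works after the current session has inserted a row into person or called nextval() on the sequence.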
- -The simplest way is to retrieve the assigned SERIAL - value with RETURNING. Using the example table in 4.11.1, it would look like this:
- -- INSERT INTO person (name) VALUES ('Blaise Pascal') RETURNING id; -- - You can also call nextval() and use that value in the - INSERT, or call currval() after the - INSERT. - -
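The two alternatives look like this, again using the person table from 4.11.1. The value 7 is a made-up example of what nextval() might return.

```sql
-- Option 1: fetch the value first, then use it explicitly.
SELECT nextval('person_id_seq');          -- suppose this returns 7
INSERT INTO person (id, name) VALUES (7, 'Blaise Pascal');

-- Option 2: let the column default fire, then read the value back
-- in the same session.
INSERT INTO person (name) VALUES ('Blaise Pascal');
SELECT currval('person_id_seq');
```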
No. currval() returns the current value assigned by your - session, not by all sessions.
- -To improve concurrency, sequence values are given out to running - transactions as needed and are not locked until the transaction - completes. This causes gaps in numbering from aborted - transactions.
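A gap is easy to produce on purpose. This sketch reuses the person table from 4.11.1; the names inserted are arbitrary.

```sql
BEGIN;
INSERT INTO person (name) VALUES ('Pierre de Fermat');  -- consumes a sequence value
ROLLBACK;                                               -- the row is discarded

-- The sequence is not rolled back, so this row skips the discarded value.
INSERT INTO person (name) VALUES ('Rene Descartes') RETURNING id;
```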
- -If a table is created WITH OIDS, each row - gets a unique OID. OIDs are - automatically assigned 4-byte integers that are unique - across the entire installation. However, they overflow at 4 - billion, and then the OIDs start being duplicated. - PostgreSQL uses OIDs to link its internal system - tables together.
- -To uniquely number rows in user tables, it is best to use - SERIAL rather than OIDs because - SERIAL sequences are unique only within a single - table, and are therefore less likely to overflow. - SERIAL8 is available for storing eight-byte sequence - values.
- -CTIDs are used to identify specific physical rows - with block and offset values. CTIDs change after rows - are modified or reloaded. They are used by index entries to point - to physical rows.
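A quick way to see a CTID change, assuming the person table from 4.11.1 contains a row with id 1:

```sql
-- ctid is a system column, so it must be selected by name.
SELECT ctid, id, name FROM person WHERE id = 1;

-- Any update writes a new row version at a new physical location.
UPDATE person SET name = name WHERE id = 1;

SELECT ctid, id, name FROM person WHERE id = 1;
```

Because the value moves with every update, a CTID is not suitable as a long-term row identifier.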
- -You probably have run out of virtual memory on your system, - or your kernel has a low limit for certain resources. Try this - before starting the server:
-- ulimit -d 262144 - limit datasize 256m -- - Depending on your shell, only one of these may succeed, but it will - set your process data segment limit much higher and perhaps allow - the query to complete. This command applies to the current process, - and all subprocesses created after the command is run. If you are - having a problem with the SQL client because the - backend is returning too much data, try it before starting the - client. - -
From psql, type SELECT version();
Use CURRENT_TIMESTAMP:
-- CREATE TABLE test (x int, modtime TIMESTAMP DEFAULT CURRENT_TIMESTAMP ); -- -
PostgreSQL supports outer joins using the SQL standard syntax. - Here are two examples:
-- SELECT * - FROM t1 LEFT OUTER JOIN t2 ON (t1.col = t2.col); -- or -
- SELECT * - FROM t1 LEFT OUTER JOIN t2 USING (col); -- -
These identical queries join t1.col to t2.col, and also return - any unjoined rows in t1 (those with no match in t2). A - RIGHT join would add unjoined rows of t2. A - FULL join would return the matched rows plus all - unjoined rows from t1 and t2. The word OUTER is - optional and is assumed in LEFT, - RIGHT, and FULL joins. Ordinary joins - are called INNER joins.
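The other two outer join types follow the same pattern, shown here with the same placeholder tables t1 and t2:

```sql
-- Matched rows plus the unjoined rows of t2.
SELECT *
FROM t1 RIGHT JOIN t2 USING (col);

-- Matched rows plus the unjoined rows of both t1 and t2.
SELECT *
FROM t1 FULL JOIN t2 USING (col);
```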
- -There is no way to query a database other than the current one. - Because PostgreSQL loads database-specific system catalogs, it is - uncertain how a cross-database query should even behave.
- -contrib/dblink allows cross-database queries using - function calls. Of course, a client can also make simultaneous - connections to different databases and merge the results on the - client side.
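A minimal dblink sketch. It assumes the contrib module is installed in the current database and that a second database named otherdb (a hypothetical name) contains the person table from 4.11.1.

```sql
-- dblink() returns generic records, so the column names and types
-- must be spelled out in the alias list.
SELECT *
FROM dblink('dbname=otherdb',
            'SELECT id, name FROM person')
     AS remote_person(id integer, name text);
```

The result can be joined with local tables like any other row source, though the remote query runs in full before the local join sees any rows.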
- -It is easy using set-returning functions, - - http://wiki.postgresql.org/wiki/Return_more_than_one_row_of_data_from_PL/pgSQL_functions.
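A small PL/pgSQL sketch of a set-returning function, built on the person table from 4.11.1. The function name person_names is illustrative.

```sql
CREATE FUNCTION person_names() RETURNS SETOF text AS $$
DECLARE
    r RECORD;
BEGIN
    FOR r IN SELECT name FROM person ORDER BY name LOOP
        RETURN NEXT r.name;   -- append one row to the result set
    END LOOP;
    RETURN;                   -- signal that the result set is complete
END;
$$ LANGUAGE plpgsql;

-- A set-returning function is called in the FROM clause.
SELECT * FROM person_names();
```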
- -In PostgreSQL versions < 8.3, PL/PgSQL caches function scripts, and - an unfortunate side effect is that if a PL/PgSQL function accesses a - temporary table, and that table is later dropped and recreated, and - the function called again, the function will fail because the cached - function contents still point to the old temporary table. The solution - is to use EXECUTE for temporary table access in - PL/PgSQL. This will cause the query to be reparsed every time.
- -This problem does not occur in PostgreSQL 8.3 and later.
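For servers older than 8.3, the EXECUTE workaround might look like the following sketch. The temporary table my_temp and the function count_my_temp are hypothetical names.

```sql
CREATE FUNCTION count_my_temp() RETURNS bigint AS $$
DECLARE
    n bigint;
BEGIN
    -- The query string is parsed on every call, so it always resolves
    -- to whichever my_temp exists at that moment.
    EXECUTE 'SELECT count(*) FROM my_temp' INTO n;
    RETURN n;
END;
$$ LANGUAGE plpgsql;
```

Written with a plain SELECT count(*) INTO n FROM my_temp, the same function would keep pointing at the first my_temp it saw and fail once that table was dropped and recreated.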
- -Though "replication" is a single term, there are several technologies - for doing replication, with advantages and disadvantages for each.
- -Master/slave replication allows a single master to receive read/write - queries, while slaves can only accept read/SELECT - queries. The most popular freely available master-slave PostgreSQL - replication solution is - Slony-I.
- -Multi-master replication allows read/write queries to be sent to - multiple replicated computers. This capability also has a severe impact - on performance due to the need to synchronize changes between servers. - PGCluster is the - most popular such solution freely available for PostgreSQL.
- -There are also commercial and hardware-based replication solutions - available supporting a variety of replication models.
- -The most common cause of unrecognized names is the use of - double-quotes around table or column names during table creation. - When double-quotes are used, table and column names (called - identifiers) are stored - case-sensitive, meaning you must use double-quotes when - referencing the names in a query. Some interfaces, like pgAdmin, - automatically double-quote identifiers during table creation. - So, for identifiers to be recognized, you must either: -
The developer FAQ can be found on the PostgreSQL wiki: