mirror of
https://github.com/postgres/postgres.git
synced 2025-04-22 23:02:54 +03:00
474 lines
16 KiB
Plaintext
<!-- $PostgreSQL: pgsql/doc/src/sgml/advanced.sgml,v 1.54 2007/02/01 00:28:16 momjian Exp $ -->

<chapter id="tutorial-advanced">
<title>Advanced Features</title>

<sect1 id="tutorial-advanced-intro">
<title>Introduction</title>

<para>
The previous chapter covered the basics of using
<acronym>SQL</acronym> to store and access your data in
<productname>PostgreSQL</productname>. We will now discuss some
more advanced features of <acronym>SQL</acronym> that simplify
management and prevent loss or corruption of your data. Finally,
we will look at some <productname>PostgreSQL</productname>
extensions.
</para>

<para>
This chapter occasionally refers to examples found in <xref
linkend="tutorial-sql"> in order to change or improve them, so it is
useful to have read that chapter first. Some examples from
this chapter can also be found in
<filename>advanced.sql</filename> in the tutorial directory. This
file also contains some example data to load, which is not
repeated here. (Refer to <xref linkend="tutorial-sql-intro"> for
how to use the file.)
</para>
</sect1>


<sect1 id="tutorial-views">
<title>Views</title>

<indexterm zone="tutorial-views">
<primary>view</primary>
</indexterm>

<para>
Refer back to the queries in <xref linkend="tutorial-join">.
Suppose the combined listing of weather records and city location
is of particular interest to your application, but you do not want
to type the query each time you need it. You can create a
<firstterm>view</firstterm> over the query, which gives a name to
the query that you can refer to like an ordinary table:

<programlisting>
CREATE VIEW myview AS
    SELECT city, temp_lo, temp_hi, prcp, date, location
        FROM weather, cities
        WHERE city = name;

SELECT * FROM myview;
</programlisting>
</para>

<para>
Making liberal use of views is a key aspect of good SQL database
design. Views allow you to encapsulate the details of the
structure of your tables, which might change as your application
evolves, behind consistent interfaces.
</para>

<para>
Views can be used in almost any place a real table can be used.
Building views upon other views is not uncommon.
</para>
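
<para>
As a brief sketch of a view built upon another view, the following
defines a second view over <classname>myview</classname> that keeps
only the warmer days. The name <classname>warm_days</classname> and
the threshold of 50 degrees are made up for this illustration:

<programlisting>
-- hypothetical view layered on top of myview
CREATE VIEW warm_days AS
    SELECT city, temp_hi, date
        FROM myview
        WHERE temp_hi > 50;

SELECT * FROM warm_days;
</programlisting>
</para>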
</sect1>


<sect1 id="tutorial-fk">
<title>Foreign Keys</title>

<indexterm zone="tutorial-fk">
<primary>foreign key</primary>
</indexterm>

<indexterm zone="tutorial-fk">
<primary>referential integrity</primary>
</indexterm>

<para>
Recall the <classname>weather</classname> and
<classname>cities</classname> tables from <xref
linkend="tutorial-sql">. Consider the following problem: You
want to make sure that no one can insert rows in the
<classname>weather</classname> table that do not have a matching
entry in the <classname>cities</classname> table. This is called
maintaining the <firstterm>referential integrity</firstterm> of
your data. In simplistic database systems this would be
implemented (if at all) by first looking at the
<classname>cities</classname> table to check if a matching record
exists, and then inserting or rejecting the new
<classname>weather</classname> records. This approach is
inconvenient and error-prone (for instance, another session could
delete the matching city between the check and the insertion), so
<productname>PostgreSQL</productname> can do this for you.
</para>

<para>
The new declaration of the tables would look like this:

<programlisting>
CREATE TABLE cities (
    city varchar(80) primary key,
    location point
);

CREATE TABLE weather (
    city varchar(80) references cities(city),
    temp_lo int,
    temp_hi int,
    prcp real,
    date date
);
</programlisting>

Now try inserting an invalid record:

<programlisting>
INSERT INTO weather VALUES ('Berkeley', 45, 53, 0.0, '1994-11-28');
</programlisting>

<screen>
ERROR: insert or update on table "weather" violates foreign key constraint "weather_city_fkey"
DETAIL: Key (city)=(Berkeley) is not present in table "cities".
</screen>
</para>
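
<para>
Conversely, the same insertion should be accepted once the referenced
city exists. The following is a minimal sketch, assuming the two
tables were just created as declared above; the coordinates given for
Berkeley are only placeholder values chosen for this illustration:

<programlisting>
-- first add the row that the foreign key will refer to
INSERT INTO cities VALUES ('Berkeley', '(-122.3, 37.9)');

-- the INSERT that was rejected before now finds a matching key
INSERT INTO weather VALUES ('Berkeley', 45, 53, 0.0, '1994-11-28');
</programlisting>
</para>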

<para>
The behavior of foreign keys can be finely tuned to your
application. This tutorial does not go beyond this simple example;
refer to <xref linkend="ddl">
for more information. Making correct use of
foreign keys will definitely improve the quality of your database
applications, so you are strongly encouraged to learn about them.
</para>
</sect1>


<sect1 id="tutorial-transactions">
<title>Transactions</title>

<indexterm zone="tutorial-transactions">
<primary>transaction</primary>
</indexterm>

<para>
<firstterm>Transactions</> are a fundamental concept of all database
systems. The essential point of a transaction is that it bundles
multiple steps into a single, all-or-nothing operation. The intermediate
states between the steps are not visible to other concurrent transactions,
and if some failure occurs that prevents the transaction from completing,
then none of the steps affect the database at all.
</para>

<para>
For example, consider a bank database that contains balances for various
customer accounts, as well as total deposit balances for branches.
Suppose that we want to record a payment of $100.00 from Alice's account
to Bob's account. Simplifying outrageously, the SQL commands for this
might look like:

<programlisting>
UPDATE accounts SET balance = balance - 100.00
    WHERE name = 'Alice';
UPDATE branches SET balance = balance - 100.00
    WHERE name = (SELECT branch_name FROM accounts WHERE name = 'Alice');
UPDATE accounts SET balance = balance + 100.00
    WHERE name = 'Bob';
UPDATE branches SET balance = balance + 100.00
    WHERE name = (SELECT branch_name FROM accounts WHERE name = 'Bob');
</programlisting>
</para>

<para>
The details of these commands are not important here; the important
point is that there are several separate updates involved to accomplish
this rather simple operation. Our bank's officers will want to be
assured that either all these updates happen, or none of them happen.
It would certainly not do for a system failure to result in Bob
receiving $100.00 that was not debited from Alice. Nor would Alice long
remain a happy customer if she was debited without Bob being credited.
We need a guarantee that if something goes wrong partway through the
operation, none of the steps executed so far will take effect. Grouping
the updates into a <firstterm>transaction</> gives us this guarantee.
A transaction is said to be <firstterm>atomic</>: from the point of
view of other transactions, it either happens completely or not at all.
</para>

<para>
We also want a guarantee that once a transaction is completed and
acknowledged by the database system, it has indeed been permanently
recorded and won't be lost even if a crash ensues shortly thereafter.
For example, if we are recording a cash withdrawal by Bob,
we do not want any chance that the debit to his account will
disappear in a crash just after he walks out the bank door.
A transactional database guarantees that all the updates made by
a transaction are logged in permanent storage (i.e., on disk) before
the transaction is reported complete.
</para>

<para>
Another important property of transactional databases is closely
related to the notion of atomic updates: when multiple transactions
are running concurrently, each one should not be able to see the
incomplete changes made by others. For example, if one transaction
is busy totalling all the branch balances, it would not do for it
to include the debit from Alice's branch but not the credit to
Bob's branch, nor vice versa. So transactions must be all-or-nothing
not only in terms of their permanent effect on the database, but
also in terms of their visibility as they happen. The updates made
so far by an open transaction are invisible to other transactions
until the transaction completes, whereupon all the updates become
visible simultaneously.
</para>

<para>
In <productname>PostgreSQL</>, a transaction is set up by surrounding
the SQL commands of the transaction with
<command>BEGIN</> and <command>COMMIT</> commands. So our banking
transaction would actually look like:

<programlisting>
BEGIN;
UPDATE accounts SET balance = balance - 100.00
    WHERE name = 'Alice';
-- etc etc
COMMIT;
</programlisting>
</para>

<para>
If, partway through the transaction, we decide we do not want to
commit (perhaps we just noticed that Alice's balance went negative),
we can issue the command <command>ROLLBACK</> instead of
<command>COMMIT</>, and all our updates so far will be canceled.
</para>
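
<para>
A sketch of that situation, reusing the <classname>accounts</classname>
table from the banking example, might look like this (the
<command>SELECT</> used to inspect the balance is an addition for
this illustration):

<programlisting>
BEGIN;
UPDATE accounts SET balance = balance - 100.00
    WHERE name = 'Alice';
-- inspect the result before going any further
SELECT balance FROM accounts WHERE name = 'Alice';
-- suppose the balance is now negative: abandon the whole transaction
ROLLBACK;
</programlisting>

After the <command>ROLLBACK</>, Alice's balance is the same as it was
before the <command>BEGIN</>.
</para>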

<para>
<productname>PostgreSQL</> actually treats every SQL statement as being
executed within a transaction. If you do not issue a <command>BEGIN</>
command,
then each individual statement has an implicit <command>BEGIN</> and
(if successful) <command>COMMIT</> wrapped around it. A group of
statements surrounded by <command>BEGIN</> and <command>COMMIT</>
is sometimes called a <firstterm>transaction block</>.
</para>

<note>
<para>
Some client libraries issue <command>BEGIN</> and <command>COMMIT</>
commands automatically, so that you might get the effect of transaction
blocks without asking. Check the documentation for the interface
you are using.
</para>
</note>

<para>
It's possible to control the statements in a transaction in a more
granular fashion through the use of <firstterm>savepoints</>. Savepoints
allow you to selectively discard parts of the transaction, while
committing the rest. After defining a savepoint with
<command>SAVEPOINT</>, you can if needed roll back to the savepoint
with <command>ROLLBACK TO</>. All the transaction's database changes
between defining the savepoint and rolling back to it are discarded, but
changes earlier than the savepoint are kept.
</para>

<para>
A savepoint remains defined after you roll back to it, so you can
roll back to it several times. Conversely, if you are sure you won't need
to roll back to a particular savepoint again, it can be released, so the
system can free some resources. Keep in mind that either releasing or
rolling back to a savepoint
will automatically release all savepoints that were defined after it.
</para>
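
<para>
The following sketch illustrates these rules. The savepoint names
<literal>first_sp</literal> and <literal>second_sp</literal> are
arbitrary names chosen for the example:

<programlisting>
BEGIN;
UPDATE accounts SET balance = balance - 100.00
    WHERE name = 'Alice';
SAVEPOINT first_sp;
UPDATE accounts SET balance = balance + 100.00
    WHERE name = 'Bob';
SAVEPOINT second_sp;
-- undoes the credit to Bob and automatically releases second_sp,
-- because second_sp was defined after first_sp
ROLLBACK TO first_sp;
-- first_sp is still defined, so the credit can be tried again
UPDATE accounts SET balance = balance + 100.00
    WHERE name = 'Bob';
-- no further rollbacks to first_sp are needed, so release it
RELEASE SAVEPOINT first_sp;
COMMIT;
</programlisting>
</para>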

<para>
All this is happening within the transaction block, so none of it
is visible to other database sessions. When and if you commit the
transaction block, the committed actions become visible as a unit
to other sessions, while the rolled-back actions never become visible
at all.
</para>

<para>
Remembering the bank database, suppose we debit $100.00 from Alice's
account, and credit Bob's account, only to find later that we should
have credited Wally's account. We could do it using savepoints like
this:

<programlisting>
BEGIN;
UPDATE accounts SET balance = balance - 100.00
    WHERE name = 'Alice';
SAVEPOINT my_savepoint;
UPDATE accounts SET balance = balance + 100.00
    WHERE name = 'Bob';
-- oops ... forget that and use Wally's account
ROLLBACK TO my_savepoint;
UPDATE accounts SET balance = balance + 100.00
    WHERE name = 'Wally';
COMMIT;
</programlisting>
</para>

<para>
This example is, of course, oversimplified, but there's a lot of control
to be had over a transaction block through the use of savepoints.
Moreover, <command>ROLLBACK TO</> is the only way to regain control of a
transaction block that was put in aborted state by the
system due to an error, short of rolling it back completely and starting
again.
</para>
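
<para>
A sketch of such a recovery follows. The savepoint name
<literal>before_credit</literal> is arbitrary, and the table
<classname>no_such_table</classname> is assumed not to exist, so
that the statement referring to it fails:

<programlisting>
BEGIN;
UPDATE accounts SET balance = balance - 100.00
    WHERE name = 'Alice';
SAVEPOINT before_credit;
-- this statement fails, which puts the transaction block in aborted state
UPDATE no_such_table SET balance = balance + 100.00
    WHERE name = 'Bob';
-- rolling back to the savepoint makes the transaction block usable again
ROLLBACK TO before_credit;
UPDATE accounts SET balance = balance + 100.00
    WHERE name = 'Bob';
COMMIT;
</programlisting>
</para>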

</sect1>


<sect1 id="tutorial-inheritance">
<title>Inheritance</title>

<indexterm zone="tutorial-inheritance">
<primary>inheritance</primary>
</indexterm>

<para>
Inheritance is a concept from object-oriented databases. It opens
up interesting new possibilities of database design.
</para>

<para>
Let's create two tables: A table <classname>cities</classname>
and a table <classname>capitals</classname>. Naturally, capitals
are also cities, so you want some way to show the capitals
implicitly when you list all cities. If you're really clever you
might invent some scheme like this:

<programlisting>
CREATE TABLE capitals (
    name text,
    population real,
    altitude int, -- (in ft)
    state char(2)
);

CREATE TABLE non_capitals (
    name text,
    population real,
    altitude int -- (in ft)
);

CREATE VIEW cities AS
    SELECT name, population, altitude FROM capitals
        UNION
    SELECT name, population, altitude FROM non_capitals;
</programlisting>

This works well enough for querying, but it becomes awkward when,
for example, you need to update several rows.
</para>

<para>
A better solution is this:

<programlisting>
CREATE TABLE cities (
    name text,
    population real,
    altitude int -- (in ft)
);

CREATE TABLE capitals (
    state char(2)
) INHERITS (cities);
</programlisting>
</para>

<para>
In this case, a row of <classname>capitals</classname>
<firstterm>inherits</firstterm> all columns (<structfield>name</>,
<structfield>population</>, and <structfield>altitude</>) from its
<firstterm>parent</firstterm>, <classname>cities</classname>. The
type of the column <structfield>name</structfield> is
<type>text</type>, a native <productname>PostgreSQL</productname>
type for variable-length character strings. State capitals have
an extra column, <structfield>state</>, that shows their state. In
<productname>PostgreSQL</productname>, a table can inherit from
zero or more other tables.
</para>
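
<para>
The example data itself is not listed in this chapter. As an
illustration, rows along the following lines would be consistent with
the query results shown in the rest of this section; the population
figures and the two low-altitude cities are assumptions made for this
sketch:

<programlisting>
INSERT INTO cities VALUES ('San Francisco', 7.24E+5, 63);
INSERT INTO cities VALUES ('Las Vegas', 2.583E+5, 2174);
INSERT INTO cities VALUES ('Mariposa', 1200, 1953);

-- rows in capitals supply the inherited columns first, then state
INSERT INTO capitals VALUES ('Sacramento', 3.694E+5, 30, 'CA');
INSERT INTO capitals VALUES ('Madison', 1.913E+5, 845, 'WI');
</programlisting>
</para>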

<para>
For example, the following query finds the names of all cities,
including state capitals, that are located at an altitude
over 500 feet:

<programlisting>
SELECT name, altitude
    FROM cities
    WHERE altitude > 500;
</programlisting>

which returns:

<screen>
   name    | altitude
-----------+----------
 Las Vegas |     2174
 Mariposa  |     1953
 Madison   |      845
(3 rows)
</screen>
</para>

<para>
On the other hand, the following query finds
all the cities that are not state capitals and
are situated at an altitude over 500 feet:

<programlisting>
SELECT name, altitude
    FROM ONLY cities
    WHERE altitude > 500;
</programlisting>

<screen>
   name    | altitude
-----------+----------
 Las Vegas |     2174
 Mariposa  |     1953
(2 rows)
</screen>
</para>

<para>
Here the <literal>ONLY</literal> before <literal>cities</literal>
indicates that the query should be run over only the
<classname>cities</classname> table, and not tables below
<classname>cities</classname> in the inheritance hierarchy. Many
of the commands that we have already discussed —
<command>SELECT</command>, <command>UPDATE</command>, and
<command>DELETE</command> — support this <literal>ONLY</literal>
notation.
</para>
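
<para>
For instance, the following sketch changes and removes rows in
<classname>cities</classname> without touching
<classname>capitals</classname>. The new altitude and the choice of
rows are arbitrary, picked only for the illustration:

<programlisting>
-- affects rows stored in cities itself, not rows stored in capitals
UPDATE ONLY cities SET altitude = 2000
    WHERE name = 'Las Vegas';

DELETE FROM ONLY cities
    WHERE name = 'Mariposa';
</programlisting>
</para>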

<note>
<para>
Although inheritance is frequently useful, it has not been integrated
with unique constraints or foreign keys, which limits its usefulness.
See <xref linkend="ddl-inherit"> for more detail.
</para>
</note>
</sect1>


<sect1 id="tutorial-conclusion">
<title>Conclusion</title>

<para>
<productname>PostgreSQL</productname> has many features not
touched upon in this tutorial introduction, which has been
oriented toward newer users of <acronym>SQL</acronym>. These
features are discussed in more detail in the remainder of this
book.
</para>

<para>
If you feel you need more introductory material, please visit the PostgreSQL
<ulink url="http://www.postgresql.org">web site</ulink>
for links to more resources.
</para>
</sect1>
</chapter>