mirror of
https://github.com/postgres/postgres.git
synced 2025-05-12 16:21:30 +03:00
2149 lines
71 KiB
Plaintext
2149 lines
71 KiB
Plaintext
<!-- $PostgreSQL: pgsql/doc/src/sgml/sql.sgml,v 1.46 2007/02/16 03:50:29 momjian Exp $ -->
|
|
|
|
<chapter id="sql-intro">
|
|
<title>SQL</title>
|
|
|
|
<abstract>
|
|
<para>
|
|
This chapter introduces the mathematical concepts behind
|
|
relational databases. It is not required reading, so if you bog
|
|
down or want to get straight to some simple examples feel free to
|
|
jump ahead to the next chapter and come back when you have more
|
|
time and patience. This stuff is supposed to be fun!
|
|
</para>
|
|
|
|
<para>
|
|
This material originally appeared as a part of
|
|
Stefan Simkovics' Master's Thesis
|
|
(<xref linkend="SIM98" endterm="SIM98">).
|
|
</para>
|
|
</abstract>
|
|
|
|
<para>
|
|
<acronym>SQL</acronym> has become the most popular relational query
|
|
language.
|
|
The name <quote><acronym>SQL</acronym></quote> is an abbreviation for
|
|
<firstterm>Structured Query Language</firstterm>.
|
|
In 1974 Donald Chamberlin and others defined the
|
|
language SEQUEL (<firstterm>Structured English Query
|
|
Language</firstterm>) at IBM
|
|
Research. This language was first implemented in an IBM
|
|
prototype called SEQUEL-XRM in 1974-75. In 1976-77 a revised version
|
|
of SEQUEL called SEQUEL/2 was defined and the name was changed to
|
|
<acronym>SQL</acronym>
|
|
subsequently.
|
|
</para>
|
|
|
|
<para>
|
|
A new prototype called System R was developed by IBM in 1977. System R
|
|
implemented a large subset of SEQUEL/2 (now <acronym>SQL</acronym>)
|
|
and a number of
|
|
changes were made to <acronym>SQL</acronym> during the project.
|
|
System R was installed in
|
|
a number of user sites, both internal IBM sites and also some selected
|
|
customer sites. Thanks to the success and acceptance of System R at
|
|
those user sites IBM started to develop commercial products that
|
|
implemented the <acronym>SQL</acronym> language based on the System
|
|
R technology.
|
|
</para>
|
|
|
|
<para>
|
|
Over the next years IBM and also a number of other vendors announced
|
|
<acronym>SQL</acronym> products such as
|
|
<productname>SQL/DS</productname> (IBM),
|
|
<productname>DB2</productname> (IBM),
|
|
<productname>ORACLE</productname> (Oracle Corp.),
|
|
<productname>DG/SQL</productname> (Data General Corp.),
|
|
and <productname>SYBASE</productname> (Sybase Inc.).
|
|
</para>
|
|
|
|
<para>
|
|
<acronym>SQL</acronym> is also an official standard now. In 1982
|
|
the American National
|
|
Standards Institute (<acronym>ANSI</acronym>) chartered its
|
|
Database Committee X3H2 to
|
|
develop a proposal for a standard relational language. This proposal
|
|
was ratified in 1986 and consisted essentially of the IBM dialect of
|
|
<acronym>SQL</acronym>. In 1987 this <acronym>ANSI</acronym>
|
|
standard was also accepted as an international
|
|
standard by the International Organization for Standardization
|
|
(<acronym>ISO</acronym>).
|
|
This original standard version of <acronym>SQL</acronym> is often
|
|
referred to,
|
|
informally, as <quote><abbrev>SQL/86</abbrev></quote>. In 1989 the original
|
|
standard was extended
|
|
and this new standard is often, again informally, referred to as
|
|
<quote><abbrev>SQL/89</abbrev></quote>. Also in 1989, a related standard called
|
|
<firstterm>Database Language Embedded <acronym>SQL</acronym></firstterm>
|
|
(<acronym>ESQL</acronym>) was developed.
|
|
</para>
|
|
|
|
<para>
|
|
The <acronym>ISO</acronym> and <acronym>ANSI</acronym> committees
|
|
have been working for many years on the
|
|
definition of a greatly expanded version of the original standard,
|
|
referred to informally as <firstterm><acronym>SQL2</acronym></firstterm>
|
|
or <firstterm><acronym>SQL/92</acronym></firstterm>. This version became a
|
|
ratified standard - <quote>International Standard ISO/IEC 9075:1992,
|
|
Database Language <acronym>SQL</acronym></quote> - in late 1992.
|
|
<acronym>SQL/92</acronym> is the version
|
|
normally meant when people refer to <quote>the <acronym>SQL</acronym>
|
|
standard</quote>. A detailed
|
|
description of <acronym>SQL/92</acronym> is given in
|
|
<xref linkend="DATE97" endterm="DATE97">. At the time of
|
|
writing this document a new standard informally referred to
|
|
as <firstterm><acronym>SQL3</acronym></firstterm>
|
|
is under development. It is planned to make <acronym>SQL</acronym>
|
|
a Turing-complete
|
|
language, i.e. all computable queries (e.g. recursive queries) will be
|
|
possible. This has now been completed as SQL:2003.
|
|
</para>
|
|
|
|
<sect1 id="rel-model">
|
|
<title>The Relational Data Model</title>
|
|
|
|
<para>
|
|
As mentioned before, <acronym>SQL</acronym> is a relational
|
|
language. That means it is
|
|
based on the <firstterm>relational data model</firstterm>
|
|
first published by E.F. Codd in
|
|
1970. We will give a formal description of the relational model
|
|
later (in
|
|
<xref linkend="formal-notion" endterm="formal-notion">)
|
|
but first we want to have a look at it from a more intuitive
|
|
point of view.
|
|
</para>
|
|
|
|
<para>
|
|
A <firstterm>relational database</firstterm> is a database that is
|
|
perceived by its
|
|
users as a <firstterm>collection of tables</firstterm> (and
|
|
nothing else but tables).
|
|
A table consists of rows and columns where each row represents a
|
|
record and each column represents an attribute of the records
|
|
contained in the table.
|
|
<xref linkend="supplier-fig" endterm="supplier-fig">
|
|
shows an example of a database consisting of three tables:
|
|
|
|
<itemizedlist>
|
|
<listitem>
|
|
<para>
|
|
SUPPLIER is a table storing the number
|
|
(SNO), the name (SNAME) and the city (CITY) of a supplier.
|
|
</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>
|
|
PART is a table storing the number (PNO) the name (PNAME) and
|
|
the price (PRICE) of a part.
|
|
</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>
|
|
SELLS stores information about which part (PNO) is sold by which
|
|
supplier (SNO).
|
|
It serves in a sense to connect the other two tables together.
|
|
</para>
|
|
</listitem>
|
|
</itemizedlist>
|
|
|
|
<example>
|
|
<title id="supplier-fig">The Suppliers and Parts Database</title>
|
|
<programlisting>
|
|
SUPPLIER: SELLS:
|
|
SNO | SNAME | CITY SNO | PNO
|
|
----+---------+-------- -----+-----
|
|
1 | Smith | London 1 | 1
|
|
2 | Jones | Paris 1 | 2
|
|
3 | Adams | Vienna 2 | 4
|
|
4 | Blake | Rome 3 | 1
|
|
3 | 3
|
|
4 | 2
|
|
PART: 4 | 3
|
|
PNO | PNAME | PRICE 4 | 4
|
|
----+---------+---------
|
|
1 | Screw | 10
|
|
2 | Nut | 8
|
|
3 | Bolt | 15
|
|
4 | Cam | 25
|
|
</programlisting>
|
|
</example>
|
|
</para>
|
|
|
|
<para>
|
|
The tables PART and SUPPLIER can be regarded as
|
|
<firstterm>entities</firstterm> and
|
|
SELLS can be regarded as a <firstterm>relationship</firstterm>
|
|
between a particular
|
|
part and a particular supplier.
|
|
</para>
|
|
|
|
<para>
|
|
As we will see later, <acronym>SQL</acronym> operates on tables
|
|
like the ones just
|
|
defined but before that we will study the theory of the relational
|
|
model.
|
|
</para>
|
|
</sect1>
|
|
|
|
<sect1 id="relmodel-formal">
|
|
<title id="formal-notion">Relational Data Model Formalities</title>
|
|
|
|
<para>
|
|
The mathematical concept underlying the relational model is the
|
|
set-theoretic <firstterm>relation</firstterm> which is a subset of
|
|
the Cartesian
|
|
product of a list of domains. This set-theoretic relation gives
|
|
the model its name (do not confuse it with the relationship from the
|
|
<firstterm>Entity-Relationship model</firstterm>).
|
|
Formally a domain is simply a set of
|
|
values. For example the set of integers is a domain. Also the set of
|
|
character strings of length 20 and the real numbers are examples of
|
|
domains.
|
|
</para>
|
|
|
|
<para>
|
|
<!--
|
|
\begin{definition}
|
|
The <firstterm>Cartesian product</firstterm> of domains $D_{1},
|
|
D_{2},\ldots, D_{k}$ written
|
|
\mbox{$D_{1} \times D_{2} \times \ldots \times D_{k}$} is the set of
|
|
all $k$-tuples $(v_{1},v_{2},\ldots,v_{k})$ such that \mbox{$v_{1} \in
|
|
D_{1}, v_{2} \in D_{2}, \ldots, v_{k} \in D_{k}$}.
|
|
\end{definition}
|
|
-->
|
|
The <firstterm>Cartesian product</firstterm> of domains
|
|
<parameter>D<subscript>1</subscript></parameter>,
|
|
<parameter>D<subscript>2</subscript></parameter>,
|
|
...
|
|
<parameter>D<subscript>k</subscript></parameter>,
|
|
written
|
|
<parameter>D<subscript>1</subscript></parameter> ×
|
|
<parameter>D<subscript>2</subscript></parameter> ×
|
|
... ×
|
|
<parameter>D<subscript>k</subscript></parameter>
|
|
is the set of all k-tuples
|
|
<parameter>v<subscript>1</subscript></parameter>,
|
|
<parameter>v<subscript>2</subscript></parameter>,
|
|
...
|
|
<parameter>v<subscript>k</subscript></parameter>,
|
|
such that
|
|
<parameter>v<subscript>1</subscript></parameter> ∈
|
|
<parameter>D<subscript>1</subscript></parameter>,
|
|
<parameter>v<subscript>2</subscript></parameter> ∈
|
|
<parameter>D<subscript>2</subscript></parameter>,
|
|
...
|
|
<parameter>v<subscript>k</subscript></parameter> ∈
|
|
<parameter>D<subscript>k</subscript></parameter>.
|
|
</para>
|
|
|
|
<para>
|
|
For example, when we have
|
|
<!--
|
|
$k=2$, $D_{1}=\{0,1\}$ and
|
|
$D_{2}=\{a,b,c\}$, then $D_{1} \times D_{2}$ is
|
|
$\{(0,a),(0,b),(0,c),(1,a),(1,b),(1,c)\}$.
|
|
-->
|
|
<parameter>k</parameter>=2,
|
|
<parameter>D<subscript>1</subscript></parameter>=<literal>{0,1}</literal> and
|
|
<parameter>D<subscript>2</subscript></parameter>=<literal>{a,b,c}</literal> then
|
|
<parameter>D<subscript>1</subscript></parameter> ×
|
|
<parameter>D<subscript>2</subscript></parameter> is
|
|
<literal>{(0,a),(0,b),(0,c),(1,a),(1,b),(1,c)}</literal>.
|
|
</para>
|
|
|
|
<para>
|
|
<!--
|
|
\begin{definition}
|
|
A Relation is any subset of the Cartesian product of one or more
|
|
domains: $R \subseteq$ \mbox{$D_{1} \times D_{2} \times \ldots \times D_{k}$}
|
|
\end{definition}
|
|
-->
|
|
A Relation is any subset of the Cartesian product of one or more
|
|
domains: <parameter>R</parameter> ⊆
|
|
<parameter>D<subscript>1</subscript></parameter> ×
|
|
<parameter>D<subscript>2</subscript></parameter> ×
|
|
... ×
|
|
<parameter>D<subscript>k</subscript></parameter>.
|
|
</para>
|
|
|
|
<para>
|
|
For example <literal>{(0,a),(0,b),(1,a)}</literal> is a relation;
|
|
it is in fact a subset of
|
|
<parameter>D<subscript>1</subscript></parameter> ×
|
|
<parameter>D<subscript>2</subscript></parameter>
|
|
mentioned above.
|
|
</para>
|
|
|
|
<para>
|
|
The members of a relation are called tuples. Each relation of some
|
|
Cartesian product
|
|
<parameter>D<subscript>1</subscript></parameter> ×
|
|
<parameter>D<subscript>2</subscript></parameter> ×
|
|
... ×
|
|
<parameter>D<subscript>k</subscript></parameter>
|
|
is said to have arity <literal>k</literal> and is therefore a set
|
|
of <literal>k</literal>-tuples.
|
|
</para>
|
|
|
|
<para>
|
|
A relation can be viewed as a table (as we already did, remember
|
|
<xref linkend="supplier-fig" endterm="supplier-fig"> where
|
|
every tuple is represented by a row and every column corresponds to
|
|
one component of a tuple. Giving names (called attributes) to the
|
|
columns leads to the definition of a
|
|
<firstterm>relation scheme</firstterm>.
|
|
</para>
|
|
|
|
<para>
|
|
<!--
|
|
\begin{definition}
|
|
A {\it relation scheme} $R$ is a finite set of attributes
|
|
\mbox{$\{A_{1},A_{2},\ldots,A_{k}\}$}. There is a domain $D_{i}$ for
|
|
each attribute $A_{i}, 1 \le i \le k$ where the values of the
|
|
attributes are taken from. We often write a relation scheme as
|
|
\mbox{$R(A_{1},A_{2},\ldots,A_{k})$}.
|
|
\end{definition}
|
|
-->
|
|
A <firstterm>relation scheme</firstterm> <literal>R</literal> is a
|
|
finite set of attributes
|
|
<parameter>A<subscript>1</subscript></parameter>,
|
|
<parameter>A<subscript>2</subscript></parameter>,
|
|
...
|
|
<parameter>A<subscript>k</subscript></parameter>.
|
|
There is a domain
|
|
<parameter>D<subscript>i</subscript></parameter>,
|
|
for each attribute
|
|
<parameter>A<subscript>i</subscript></parameter>,
|
|
1 <= <literal>i</literal> <= <literal>k</literal>,
|
|
where the values of the attributes are taken from. We often write
|
|
a relation scheme as
|
|
<literal>R(<parameter>A<subscript>1</subscript></parameter>,
|
|
<parameter>A<subscript>2</subscript></parameter>,
|
|
...
|
|
<parameter>A<subscript>k</subscript></parameter>)</literal>.
|
|
|
|
<note>
|
|
<para>
|
|
A <firstterm>relation scheme</firstterm> is just a kind of template
|
|
whereas a <firstterm>relation</firstterm> is an instance of a
|
|
<firstterm>relation
|
|
scheme</firstterm>. The relation consists of tuples (and can
|
|
therefore be
|
|
viewed as a table); not so the relation scheme.
|
|
</para>
|
|
</note>
|
|
</para>
|
|
|
|
<sect2>
|
|
<title id="domains">Domains vs. Data Types</title>
|
|
|
|
<para>
|
|
We often talked about <firstterm>domains</firstterm>
|
|
in the last section. Recall that a
|
|
domain is, formally, just a set of values (e.g., the set of integers or
|
|
the real numbers). In terms of database systems we often talk of
|
|
<firstterm>data types</firstterm> instead of domains.
|
|
When we define a table we have to make
|
|
a decision about which attributes to include. Additionally we
|
|
have to decide which kind of data is going to be stored as
|
|
attribute values. For example the values of
|
|
<classname>SNAME</classname> from the table
|
|
<classname>SUPPLIER</classname> will be character strings,
|
|
whereas <classname>SNO</classname> will store
|
|
integers. We define this by assigning a data type to each
|
|
attribute. The type of <classname>SNAME</classname> will be
|
|
<type>VARCHAR(20)</type> (this is the <acronym>SQL</acronym> type
|
|
for character strings of length <= 20),
|
|
the type of <classname>SNO</classname> will be
|
|
<type>INTEGER</type>. With the assignment of a data type we also
|
|
have selected
|
|
a domain for an attribute. The domain of
|
|
<classname>SNAME</classname> is the set of all
|
|
character strings of length <= 20,
|
|
the domain of <classname>SNO</classname> is the set of
|
|
all integer numbers.
|
|
</para>
|
|
</sect2>
|
|
</sect1>
|
|
|
|
<sect1 id="relmodel-oper">
|
|
<title id="operations">Operations in the Relational Data Model</title>
|
|
|
|
<para>
|
|
In the previous section
|
|
(<xref linkend="formal-notion" endterm="formal-notion">)
|
|
we defined the mathematical notion of
|
|
the relational model. Now we know how the data can be stored using a
|
|
relational data model but we do not know what to do with all these
|
|
tables to retrieve something from the database yet. For example somebody
|
|
could ask for the names of all suppliers that sell the part
|
|
'Screw'. Therefore two rather different kinds of notations for
|
|
expressing operations on relations have been defined:
|
|
|
|
<itemizedlist>
|
|
<listitem>
|
|
<para>
|
|
The <firstterm>Relational Algebra</firstterm> which is an
|
|
algebraic notation,
|
|
where queries are expressed by applying specialized operators to the
|
|
relations.
|
|
</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>
|
|
The <firstterm>Relational Calculus</firstterm> which is a
|
|
logical notation,
|
|
where queries are expressed by formulating some logical restrictions
|
|
that the tuples in the answer must satisfy.
|
|
</para>
|
|
</listitem>
|
|
</itemizedlist>
|
|
</para>
|
|
|
|
<sect2>
|
|
<title id="rel-alg">Relational Algebra</title>
|
|
|
|
<para>
|
|
The <firstterm>Relational Algebra</firstterm> was introduced by
|
|
E. F. Codd in 1972. It consists of a set of operations on relations:
|
|
|
|
<itemizedlist>
|
|
<listitem>
|
|
<para>
|
|
SELECT (σ): extracts <firstterm>tuples</firstterm> from
|
|
a relation that
|
|
satisfy a given restriction. Let <parameter>R</parameter> be a
|
|
table that contains an attribute
|
|
<parameter>A</parameter>.
|
|
σ<subscript>A=a</subscript>(R) = {t ∈ R ∣ t(A) = a}
|
|
where <literal>t</literal> denotes a
|
|
tuple of <parameter>R</parameter> and <literal>t(A)</literal>
|
|
denotes the value of attribute <parameter>A</parameter> of
|
|
tuple <literal>t</literal>.
|
|
</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>
|
|
PROJECT (π): extracts specified
|
|
<firstterm>attributes</firstterm> (columns) from a
|
|
relation. Let <classname>R</classname> be a relation
|
|
that contains an attribute <classname>X</classname>.
|
|
π<subscript>X</subscript>(<classname>R</classname>) = {t(X) ∣ t ∈ <classname>R</classname>},
|
|
where <literal>t</literal>(<classname>X</classname>) denotes the value of
|
|
attribute <classname>X</classname> of tuple <literal>t</literal>.
|
|
</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>
|
|
PRODUCT (×): builds the Cartesian product of two
|
|
relations. Let <classname>R</classname> be a table with arity
|
|
<literal>k</literal><subscript>1</subscript> and let
|
|
<classname>S</classname> be a table with
|
|
arity <literal>k</literal><subscript>2</subscript>.
|
|
<classname>R</classname> × <classname>S</classname>
|
|
is the set of all
|
|
<literal>k</literal><subscript>1</subscript>
|
|
+ <literal>k</literal><subscript>2</subscript>-tuples
|
|
whose first <literal>k</literal><subscript>1</subscript>
|
|
components form a tuple in <classname>R</classname> and whose last
|
|
<literal>k</literal><subscript>2</subscript> components form a
|
|
tuple in <classname>S</classname>.
|
|
</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>
|
|
UNION (∪): builds the set-theoretic union of two
|
|
tables. Given the tables <classname>R</classname> and
|
|
<classname>S</classname> (both must have the same arity),
|
|
the union <classname>R</classname> ∪ <classname>S</classname>
|
|
is the set of tuples that are in <classname>R</classname>
|
|
or <classname>S</classname> or both.
|
|
</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>
|
|
INTERSECT (∩): builds the set-theoretic intersection of two
|
|
tables. Given the tables <classname>R</classname> and
|
|
<classname>S</classname>,
|
|
<classname>R</classname> ∩ <classname>S</classname> is the
|
|
set of tuples
|
|
that are in <classname>R</classname> and in
|
|
<classname>S</classname>.
|
|
We again require that <classname>R</classname> and
|
|
<classname>S</classname> have the
|
|
same arity.
|
|
</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>
|
|
DIFFERENCE (− or ∖): builds the set difference of
|
|
two tables. Let <classname>R</classname> and <classname>S</classname>
|
|
again be two tables with the same
|
|
arity. <classname>R</classname> - <classname>S</classname>
|
|
is the set of tuples in <classname>R</classname> but not in
|
|
<classname>S</classname>.
|
|
</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>
|
|
JOIN (∏): connects two tables by their common
|
|
attributes. Let <classname>R</classname> be a table with the
|
|
attributes <classname>A</classname>,<classname>B</classname>
|
|
and <classname>C</classname> and
|
|
let <classname>S</classname> be a table with the attributes
|
|
<classname>C</classname>,<classname>D</classname>
|
|
and <classname>E</classname>. There is one
|
|
attribute common to both relations,
|
|
the attribute <classname>C</classname>.
|
|
<!--
|
|
<classname>R</classname> ∏ <classname>S</classname> =
|
|
π<subscript><classname>R</classname>.<classname>A</classname>,<classname>R</classname>.<classname>B</classname>,<classname>R</classname>.<classname>C</classname>,<classname>S</classname>.<classname>D</classname>,<classname>S</classname>.<classname>E</classname></subscript>(σ<subscript><classname>R</classname>.<classname>C</classname>=<classname>S</classname>.<classname>C</classname></subscript>(<classname>R</classname> × <classname>S</classname>)).
|
|
-->
|
|
R ∏ S = π<subscript>R.A,R.B,R.C,S.D,S.E</subscript>(σ<subscript>R.C=S.C</subscript>(R × S)).
|
|
What are we doing here? We first calculate the Cartesian
|
|
product
|
|
<classname>R</classname> × <classname>S</classname>.
|
|
Then we select those tuples whose values for the common
|
|
attribute <classname>C</classname> are equal
|
|
(σ<subscript>R.C = S.C</subscript>).
|
|
Now we have a table
|
|
that contains the attribute <classname>C</classname>
|
|
two times and we correct this by
|
|
projecting out the duplicate column.
|
|
</para>
|
|
|
|
<example>
|
|
<title id="join-example">An Inner Join</title>
|
|
|
|
<para>
|
|
Let's have a look at the tables that are produced by evaluating the steps
|
|
necessary for a join.
|
|
Let the following two tables be given:
|
|
|
|
<programlisting>
|
|
R: S:
|
|
A | B | C C | D | E
|
|
---+---+--- ---+---+---
|
|
1 | 2 | 3 3 | a | b
|
|
4 | 5 | 6 6 | c | d
|
|
7 | 8 | 9
|
|
</programlisting>
|
|
</para>
|
|
</example>
|
|
|
|
<para>
|
|
First we calculate the Cartesian product
|
|
<classname>R</classname> × <classname>S</classname> and
|
|
get:
|
|
|
|
<programlisting>
|
|
R x S:
|
|
A | B | R.C | S.C | D | E
|
|
---+---+-----+-----+---+---
|
|
1 | 2 | 3 | 3 | a | b
|
|
1 | 2 | 3 | 6 | c | d
|
|
4 | 5 | 6 | 3 | a | b
|
|
4 | 5 | 6 | 6 | c | d
|
|
7 | 8 | 9 | 3 | a | b
|
|
7 | 8 | 9 | 6 | c | d
|
|
</programlisting>
|
|
</para>
|
|
|
|
<para>
|
|
After the selection
|
|
σ<subscript>R.C=S.C</subscript>(R × S)
|
|
we get:
|
|
|
|
<programlisting>
|
|
A | B | R.C | S.C | D | E
|
|
---+---+-----+-----+---+---
|
|
1 | 2 | 3 | 3 | a | b
|
|
4 | 5 | 6 | 6 | c | d
|
|
</programlisting>
|
|
</para>
|
|
|
|
<para>
|
|
To remove the duplicate column
|
|
<classname>S</classname>.<classname>C</classname>
|
|
we project it out by the following operation:
|
|
π<subscript>R.A,R.B,R.C,S.D,S.E</subscript>(σ<subscript>R.C=S.C</subscript>(R × S))
|
|
and get:
|
|
|
|
<programlisting>
|
|
A | B | C | D | E
|
|
---+---+---+---+---
|
|
1 | 2 | 3 | a | b
|
|
4 | 5 | 6 | c | d
|
|
</programlisting>
|
|
</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>
|
|
DIVIDE (÷): Let <classname>R</classname> be a table
|
|
with the attributes A, B, C, and D and let
|
|
<classname>S</classname> be a table with the attributes
|
|
C and D.
|
|
Then we define the division as:
|
|
|
|
<programlisting>
|
|
R ÷ S = {t ∣ ∀ t<subscript>s</subscript> ∈ S ∃ t<subscript>r</subscript> ∈ R
|
|
</programlisting>
|
|
|
|
such that
|
|
t<subscript>r</subscript>(A,B)=t∧t<subscript>r</subscript>(C,D)=t<subscript>s</subscript>}
|
|
where
|
|
t<subscript>r</subscript>(x,y)
|
|
denotes a
|
|
tuple of table <classname>R</classname> that consists only of
|
|
the components <literal>x</literal> and <literal>y</literal>.
|
|
Note that the tuple <literal>t</literal> only consists of the
|
|
components <classname>A</classname> and
|
|
<classname>B</classname> of relation <classname>R</classname>.
|
|
</para>
|
|
|
|
<para id="divide-example">
|
|
Given the following tables
|
|
|
|
<programlisting>
|
|
R: S:
|
|
A | B | C | D C | D
|
|
---+---+---+--- ---+---
|
|
a | b | c | d c | d
|
|
a | b | e | f e | f
|
|
b | c | e | f
|
|
e | d | c | d
|
|
e | d | e | f
|
|
a | b | d | e
|
|
</programlisting>
|
|
|
|
R ÷ S
|
|
is derived as
|
|
|
|
<programlisting>
|
|
A | B
|
|
---+---
|
|
a | b
|
|
e | d
|
|
</programlisting>
|
|
</para>
|
|
</listitem>
|
|
</itemizedlist>
|
|
</para>
|
|
|
|
<para>
|
|
For a more detailed description and definition of the relational
|
|
algebra refer to [<xref linkend="ULL88" endterm="ULL88">] or
|
|
[<xref linkend="DATE04" endterm="DATE04">].
|
|
</para>
|
|
|
|
<example>
|
|
<title id="suppl-rel-alg">A Query Using Relational Algebra</title>
|
|
<para>
|
|
Recall that we formulated all those relational operators to be able to
|
|
retrieve data from the database. Let's return to our example from
|
|
the previous
|
|
section (<xref linkend="operations" endterm="operations">)
|
|
where someone wanted to know the names of all
|
|
suppliers that sell the part <literal>Screw</literal>.
|
|
This question can be answered
|
|
using relational algebra by the following operation:
|
|
|
|
<programlisting>
|
|
π<subscript>SUPPLIER.SNAME</subscript>(σ<subscript>PART.PNAME='Screw'</subscript>(SUPPLIER ∏ SELLS ∏ PART))
|
|
</programlisting>
|
|
</para>
|
|
|
|
<para>
|
|
We call such an operation a query. If we evaluate the above query
|
|
against the our example tables
|
|
(<xref linkend="supplier-fig" endterm="supplier-fig">)
|
|
we will obtain the following result:
|
|
|
|
<programlisting>
|
|
SNAME
|
|
-------
|
|
Smith
|
|
Adams
|
|
</programlisting>
|
|
</para>
|
|
</example>
|
|
</sect2>
|
|
|
|
<sect2 id="rel-calc">
|
|
<title>Relational Calculus</title>
|
|
|
|
<para>
|
|
The relational calculus is based on the
|
|
<firstterm>first order logic</firstterm>. There are
|
|
two variants of the relational calculus:
|
|
|
|
<itemizedlist>
|
|
<listitem>
|
|
<para>
|
|
The <firstterm>Domain Relational Calculus</firstterm>
|
|
(<acronym>DRC</acronym>), where variables
|
|
stand for components (attributes) of the tuples.
|
|
</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>
|
|
The <firstterm>Tuple Relational Calculus</firstterm>
|
|
(<acronym>TRC</acronym>), where variables stand for tuples.
|
|
</para>
|
|
</listitem>
|
|
</itemizedlist>
|
|
</para>
|
|
|
|
<para>
|
|
We want to discuss the tuple relational calculus only because it is
|
|
the one underlying the most relational languages. For a detailed
|
|
discussion on <acronym>DRC</acronym> (and also
|
|
<acronym>TRC</acronym>) see
|
|
<xref linkend="DATE04" endterm="DATE04">
|
|
or
|
|
<xref linkend="ULL88" endterm="ULL88">.
|
|
</para>
|
|
</sect2>
|
|
|
|
<sect2>
|
|
<title>Tuple Relational Calculus</title>
|
|
|
|
<para>
|
|
The queries used in <acronym>TRC</acronym> are of the following
|
|
form:
|
|
|
|
<programlisting>
|
|
x(A) ∣ F(x)
|
|
</programlisting>
|
|
|
|
where <literal>x</literal> is a tuple variable
|
|
<classname>A</classname> is a set of attributes and <literal>F</literal> is a
|
|
formula. The resulting relation consists of all tuples
|
|
<literal>t(A)</literal> that satisfy <literal>F(t)</literal>.
|
|
</para>
|
|
|
|
<para>
|
|
If we want to answer the question from example
|
|
<xref linkend="suppl-rel-alg" endterm="suppl-rel-alg">
|
|
using <acronym>TRC</acronym> we formulate the following query:
|
|
|
|
<programlisting>
|
|
{x(SNAME) ∣ x ∈ SUPPLIER ∧
|
|
∃ y ∈ SELLS ∃ z ∈ PART (y(SNO)=x(SNO) ∧
|
|
z(PNO)=y(PNO) ∧
|
|
z(PNAME)='Screw')}
|
|
</programlisting>
|
|
</para>
|
|
|
|
<para>
|
|
Evaluating the query against the tables from
|
|
<xref linkend="supplier-fig" endterm="supplier-fig">
|
|
again leads to the same result
|
|
as in
|
|
<xref linkend="suppl-rel-alg" endterm="suppl-rel-alg">.
|
|
</para>
|
|
</sect2>
|
|
|
|
<sect2 id="alg-vs-calc">
|
|
<title>Relational Algebra vs. Relational Calculus</title>
|
|
|
|
<para>
|
|
The relational algebra and the relational calculus have the same
|
|
<firstterm>expressive power</firstterm>; i.e. all queries that
|
|
can be formulated using relational algebra can also be formulated
|
|
using the relational calculus and vice versa.
|
|
This was first proved by E. F. Codd in
|
|
1972. This proof is based on an algorithm (<quote>Codd's reduction
|
|
algorithm</quote>) by which an arbitrary expression of the relational
|
|
calculus can be reduced to a semantically equivalent expression of
|
|
relational algebra. For a more detailed discussion on that refer to
|
|
<xref linkend="DATE04" endterm="DATE04">
|
|
and
|
|
<xref linkend="ULL88" endterm="ULL88">.
|
|
</para>
|
|
|
|
<para>
|
|
It is sometimes said that languages based on the relational
|
|
calculus are <quote>higher level</quote> or <quote>more
|
|
declarative</quote> than languages based on relational algebra
|
|
because the algebra (partially) specifies the order of operations
|
|
while the calculus leaves it to a compiler or interpreter to
|
|
determine the most efficient order of evaluation.
|
|
</para>
|
|
</sect2>
|
|
</sect1>
|
|
|
|
<sect1 id="sql-language">
|
|
<title>The <acronym>SQL</acronym> Language</title>
|
|
|
|
<para>
|
|
As is the case with most modern relational languages,
|
|
<acronym>SQL</acronym> is based on the tuple
|
|
relational calculus. As a result every query that can be formulated
|
|
using the tuple relational calculus (or equivalently, relational
|
|
algebra) can also be formulated using
|
|
<acronym>SQL</acronym>. There are, however,
|
|
capabilities beyond the scope of relational algebra or calculus. Here
|
|
is a list of some additional features provided by
|
|
<acronym>SQL</acronym> that are not
|
|
part of relational algebra or calculus:
|
|
|
|
<itemizedlist>
|
|
<listitem>
|
|
<para>
|
|
Commands for insertion, deletion or modification of data.
|
|
</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>
|
|
Arithmetic capability: In <acronym>SQL</acronym> it is possible
|
|
to involve
|
|
arithmetic operations as well as comparisons, e.g.
|
|
|
|
<programlisting>
|
|
A < B + 3.
|
|
</programlisting>
|
|
|
|
Note
|
|
that + or other arithmetic operators appear neither in relational
|
|
algebra nor in relational calculus.
|
|
</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>
|
|
Assignment and Print Commands: It is possible to print a
|
|
relation constructed by a query and to assign a computed relation to a
|
|
relation name.
|
|
</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>
|
|
Aggregate Functions: Operations such as
|
|
<firstterm>average</firstterm>, <firstterm>sum</firstterm>,
|
|
<firstterm>max</firstterm>, etc. can be applied to columns of a
|
|
relation to
|
|
obtain a single quantity.
|
|
</para>
|
|
</listitem>
|
|
</itemizedlist>
|
|
</para>
|
|
|
|
<sect2 id="select">
|
|
<title id="select-title">Select</title>
|
|
|
|
<para>
|
|
The most often used command in <acronym>SQL</acronym> is the
|
|
<command>SELECT</command> statement,
|
|
used to retrieve data. The syntax is:
|
|
|
|
<synopsis>
|
|
SELECT [ ALL | DISTINCT [ ON ( <replaceable class="PARAMETER">expression</replaceable> [, ...] ) ] ]
|
|
* | <replaceable class="PARAMETER">expression</replaceable> [ AS <replaceable class="PARAMETER">output_name</replaceable> ] [, ...]
|
|
[ INTO [ TEMPORARY | TEMP ] [ TABLE ] <replaceable class="PARAMETER">new_table</replaceable> ]
|
|
[ FROM <replaceable class="PARAMETER">from_item</replaceable> [, ...] ]
|
|
[ WHERE <replaceable class="PARAMETER">condition</replaceable> ]
|
|
[ GROUP BY <replaceable class="PARAMETER">expression</replaceable> [, ...] ]
|
|
[ HAVING <replaceable class="PARAMETER">condition</replaceable> [, ...] ]
|
|
[ { UNION | INTERSECT | EXCEPT } [ ALL ] <replaceable class="PARAMETER">select</replaceable> ]
|
|
[ ORDER BY <replaceable class="parameter">expression</replaceable> [ ASC | DESC | USING <replaceable class="parameter">operator</replaceable> ] [ NULLS { FIRST | LAST } ] [, ...] ]
|
|
[ LIMIT { <replaceable class="PARAMETER">count</replaceable> | ALL } ]
|
|
[ OFFSET <replaceable class="PARAMETER">start</replaceable> ]
|
|
[ FOR { UPDATE | SHARE } [ OF <replaceable class="parameter">table_name</replaceable> [, ...] ] [ NOWAIT ] [...] ]
|
|
</synopsis>
|
|
</para>
|
|
|
|
<para>
|
|
Now we will illustrate the complex syntax of the
|
|
<command>SELECT</command> statement with various examples. The
|
|
tables used for the examples are defined in <xref
|
|
linkend="supplier-fig" endterm="supplier-fig">.
|
|
</para>
|
|
|
|
<sect3>
|
|
<title>Simple Selects</title>
|
|
|
|
<para>
|
|
Here are some simple examples using a <command>SELECT</command> statement:
|
|
|
|
<example>
|
|
<title id="simple-query">Simple Query with Qualification</title>
|
|
<para>
|
|
To retrieve all tuples from table PART where the attribute PRICE is
|
|
greater than 10 we formulate the following query:
|
|
|
|
<programlisting>
|
|
SELECT * FROM PART
|
|
WHERE PRICE > 10;
|
|
</programlisting>
|
|
|
|
and get the table:
|
|
|
|
<programlisting>
|
|
PNO | PNAME | PRICE
|
|
-----+---------+--------
|
|
3 | Bolt | 15
|
|
4 | Cam | 25
|
|
</programlisting>
|
|
</para>
|
|
|
|
<para>
|
|
Using <quote>*</quote> in the <command>SELECT</command> statement
|
|
will deliver all attributes from the table. If we want to retrieve
|
|
only the attributes PNAME and PRICE from table PART we use the
|
|
statement:
|
|
|
|
<programlisting>
|
|
SELECT PNAME, PRICE
|
|
FROM PART
|
|
WHERE PRICE > 10;
|
|
</programlisting>
|
|
|
|
In this case the result is:
|
|
|
|
<programlisting>
|
|
PNAME | PRICE
|
|
--------+--------
|
|
Bolt | 15
|
|
Cam | 25
|
|
</programlisting>
|
|
|
|
Note that the <acronym>SQL</acronym> <command>SELECT</command>
|
|
corresponds to the <quote>projection</quote> in relational algebra
|
|
not to the <quote>selection</quote> (see <xref linkend="rel-alg"
|
|
endterm="rel-alg"> for more details).
|
|
</para>
|
|
|
|
<para>
|
|
The qualifications in the WHERE clause can also be logically connected
|
|
using the keywords OR, AND, and NOT:
|
|
|
|
<programlisting>
|
|
SELECT PNAME, PRICE
|
|
FROM PART
|
|
WHERE PNAME = 'Bolt' AND
|
|
(PRICE = 0 OR PRICE <= 15);
|
|
</programlisting>
|
|
|
|
will lead to the result:
|
|
|
|
<programlisting>
|
|
PNAME | PRICE
|
|
--------+--------
|
|
Bolt | 15
|
|
</programlisting>
|
|
</para>
|
|
|
|
<para>
|
|
Arithmetic operations can be used in the target list and in the WHERE
|
|
clause. For example if we want to know how much it would cost if we
|
|
take two pieces of a part we could use the following query:
|
|
|
|
<programlisting>
|
|
SELECT PNAME, PRICE * 2 AS DOUBLE
|
|
FROM PART
|
|
WHERE PRICE * 2 < 50;
|
|
</programlisting>
|
|
|
|
and we get:
|
|
|
|
<programlisting>
|
|
PNAME | DOUBLE
|
|
--------+---------
|
|
Screw | 20
|
|
Nut | 16
|
|
Bolt | 30
|
|
</programlisting>
|
|
|
|
Note that the word DOUBLE after the keyword AS is the new title of the
|
|
second column. This technique can be used for every element of the
|
|
target list to assign a new title to the resulting
|
|
column. This new title
|
|
is often referred to as alias. The alias cannot be used throughout the
|
|
rest of the query.
|
|
</para>
|
|
</example>
|
|
</para>
|
|
</sect3>
|
|
|
|
<sect3>
|
|
<title>Joins</title>
|
|
|
|
<para id="simple-join">
|
|
The following example shows how <firstterm>joins</firstterm> are
|
|
realized in <acronym>SQL</acronym>.
|
|
</para>
|
|
|
|
<para>
|
|
To join the three tables SUPPLIER, PART and SELLS over their common
|
|
attributes we formulate the following statement:
|
|
|
|
<programlisting>
|
|
SELECT S.SNAME, P.PNAME
|
|
FROM SUPPLIER S, PART P, SELLS SE
|
|
WHERE S.SNO = SE.SNO AND
|
|
P.PNO = SE.PNO;
|
|
</programlisting>
|
|
|
|
and get the following table as a result:
|
|
|
|
<programlisting>
|
|
SNAME | PNAME
|
|
-------+-------
|
|
Smith | Screw
|
|
Smith | Nut
|
|
Jones | Cam
|
|
Adams | Screw
|
|
Adams | Bolt
|
|
Blake | Nut
|
|
Blake | Bolt
|
|
Blake | Cam
|
|
</programlisting>
|
|
</para>
|
|
|
|
<para>
|
|
In the FROM clause we introduced an alias name for every relation
|
|
because there are common named attributes (SNO and PNO) among the
|
|
relations. Now we can distinguish between the common named attributes
|
|
by simply prefixing the attribute name with the alias name followed by
|
|
a dot. The join is calculated in the same way as shown in
|
|
<xref linkend="join-example" endterm="join-example">.
|
|
First the Cartesian product
|
|
|
|
SUPPLIER × PART × SELLS
|
|
|
|
is derived. Now only those tuples satisfying the
|
|
conditions given in the WHERE clause are selected (i.e. the common
|
|
named attributes have to be equal). Finally we project out all
|
|
columns but S.SNAME and P.PNAME.
|
|
</para>
|
|
|
|
<para>
|
|
Another way to perform joins is to use the SQL JOIN syntax as follows:
|
|
<programlisting>
|
|
select sname, pname from supplier
|
|
JOIN sells USING (sno)
|
|
JOIN part USING (pno);
|
|
</programlisting>
|
|
giving again:
|
|
<programlisting>
|
|
sname | pname
|
|
-------+-------
|
|
Smith | Screw
|
|
Adams | Screw
|
|
Smith | Nut
|
|
Blake | Nut
|
|
Adams | Bolt
|
|
Blake | Bolt
|
|
Jones | Cam
|
|
Blake | Cam
|
|
(8 rows)
|
|
</programlisting>
|
|
</para>
|
|
|
|
<para>
|
|
A joined table, created using JOIN syntax, is a table reference list
|
|
item that occurs in a FROM clause and before any WHERE, GROUP BY,
|
|
or HAVING clause. Other table references, including table names or
|
|
other JOIN clauses, can be included in the FROM clause if separated
|
|
by commas. JOINed tables are logically like any other
|
|
table listed in the FROM clause.
|
|
</para>
|
|
|
|
<para>
|
|
SQL JOINs come in two main types, CROSS JOINs (unqualified joins)
|
|
and <firstterm>qualified JOINs</>. Qualified joins can be further
|
|
subdivided based on the way in which the <firstterm>join condition</>
|
|
is specified (ON, USING, or NATURAL) and the way in which it is
|
|
applied (INNER or OUTER join).
|
|
</para>
|
|
|
|
<variablelist>
|
|
<title>Join Types</title>
|
|
<varlistentry>
|
|
<term>CROSS JOIN</term>
|
|
<listitem>
|
|
<cmdsynopsis>
|
|
<arg choice="req"> <replaceable class="parameter">T1</replaceable> </arg>
|
|
<command> CROSS JOIN </command>
|
|
<arg choice="req"> <replaceable class="parameter">T2</replaceable> </arg>
|
|
</cmdsynopsis>
|
|
|
|
<para>
|
|
A cross join takes two tables T1 and T2 having N and M rows
|
|
respectively, and returns a joined table containing all
|
|
N*M possible joined rows. For each row R1 of T1, each row
|
|
R2 of T2 is joined with R1 to yield a joined table row JR
|
|
consisting of all fields in R1 and R2. A CROSS JOIN is
|
|
equivalent to an INNER JOIN ON TRUE.
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term>Qualified JOINs</term>
|
|
<listitem>
|
|
|
|
<cmdsynopsis>
|
|
<arg choice="req"> <replaceable class="parameter">T1</replaceable> </arg>
|
|
<arg choice="opt"> NATURAL </arg>
|
|
<group choice="opt">
|
|
<arg choice="opt"> INNER </arg>
|
|
<arg>
|
|
<group choice="req">
|
|
<arg choice="plain"> LEFT </arg>
|
|
<arg choice="plain"> RIGHT </arg>
|
|
<arg choice="plain"> FULL </arg>
|
|
</group>
|
|
<arg choice="opt"> OUTER </arg>
|
|
</arg>
|
|
</group>
|
|
<command> JOIN </command>
|
|
<arg choice="req"> <replaceable class="parameter">T2</replaceable> </arg>
|
|
<group choice="req">
|
|
<arg> ON <replaceable>search condition</replaceable></arg>
|
|
<arg> USING ( <replaceable>join column list</replaceable> ) </arg>
|
|
</group>
|
|
</cmdsynopsis>
|
|
|
|
<para>
|
|
A qualified JOIN must specify its join condition
|
|
by providing one (and only one) of NATURAL, ON, or
|
|
USING. The ON clause
|
|
takes a <replaceable>search condition</replaceable>,
|
|
which is the same as in a WHERE clause. The USING
|
|
clause takes a comma-separated list of column names,
|
|
which the joined tables must have in common, and joins
|
|
the tables on equality of those columns. NATURAL is
|
|
shorthand for a USING clause that lists all the common
|
|
column names of the two tables. A side-effect of both
|
|
USING and NATURAL is that only one copy of each joined
|
|
column is emitted into the result table (compare the
|
|
relational-algebra definition of JOIN, shown earlier).
|
|
</para>
|
|
|
|
<!-- begin join semantics -->
|
|
<variablelist>
|
|
<varlistentry>
|
|
<term>
|
|
<cmdsynopsis>
|
|
<arg> INNER </arg>
|
|
<command> JOIN </command>
|
|
</cmdsynopsis>
|
|
</term>
|
|
<listitem>
|
|
<para>
|
|
For each row R1 of T1, the joined table has a row for each row
|
|
in T2 that satisfies the join condition with R1.
|
|
</para>
|
|
<tip>
|
|
<para>
|
|
The words INNER and OUTER are optional for all JOINs.
|
|
INNER is the default. LEFT, RIGHT, and FULL imply an
|
|
OUTER JOIN.
|
|
</para>
|
|
</tip>
|
|
</listitem>
|
|
</varlistentry>
|
|
<varlistentry>
|
|
<term>
|
|
<cmdsynopsis>
|
|
<arg choice="plain"> LEFT </arg>
|
|
<arg> OUTER </arg>
|
|
<command> JOIN </command>
|
|
</cmdsynopsis>
|
|
</term>
|
|
<listitem>
|
|
<para>
|
|
First, an INNER JOIN is performed.
|
|
Then, for each row in T1 that does not satisfy the join
|
|
condition with any row in T2, an additional joined row is
|
|
returned with null fields in the columns from T2.
|
|
</para>
|
|
<tip>
|
|
<para>
|
|
The joined table unconditionally has a row for each row in T1.
|
|
</para>
|
|
</tip>
|
|
</listitem>
|
|
</varlistentry>
|
|
<varlistentry>
|
|
<term>
|
|
<cmdsynopsis>
|
|
<arg choice="plain"> RIGHT </arg>
|
|
<arg> OUTER </arg>
|
|
<command> JOIN </command>
|
|
</cmdsynopsis>
|
|
</term>
|
|
<listitem>
|
|
<para>
|
|
First, an INNER JOIN is performed.
|
|
Then, for each row in T2 that does not satisfy the join
|
|
condition with any row in T1, an additional joined row is
|
|
returned with null fields in the columns from T1.
|
|
</para>
|
|
<tip>
|
|
<para>
|
|
The joined table unconditionally has a row for each row in T2.
|
|
</para>
|
|
</tip>
|
|
</listitem>
|
|
</varlistentry>
|
|
<varlistentry>
|
|
<term>
|
|
<cmdsynopsis>
|
|
<arg choice="plain"> FULL </arg>
|
|
<arg> OUTER </arg>
|
|
<command> JOIN </command>
|
|
</cmdsynopsis>
|
|
</term>
|
|
<listitem>
|
|
<para>
|
|
First, an INNER JOIN is performed.
|
|
Then, for each row in T1 that does not satisfy the join
|
|
condition with any row in T2, an additional joined row is
|
|
returned with null fields in the columns from T2.
|
|
Also, for each row in T2 that does not satisfy the join
|
|
condition with any row in T1, an additional joined row is
|
|
returned with null fields in the columns from T1.
|
|
</para>
|
|
<tip>
|
|
<para>
|
|
The joined table unconditionally has a row for every row of T1
|
|
and a row for every row of T2.
|
|
</para>
|
|
</tip>
|
|
</listitem>
|
|
</varlistentry>
|
|
</variablelist>
|
|
<!-- end join semantics -->
|
|
|
|
</listitem>
|
|
</varlistentry>
|
|
</variablelist>
|
|
|
|
<para>
|
|
JOINs of all types can be chained together or nested where either or both of
|
|
<replaceable class="parameter">T1</replaceable> and
|
|
<replaceable class="parameter">T2</replaceable> can be JOINed tables.
|
|
Parenthesis can be used around JOIN clauses to control the order
|
|
of JOINs which are otherwise processed left to right.
|
|
</para>
|
|
|
|
</sect3>
|
|
|
|
<sect3>
|
|
<title id="aggregates-tutorial">Aggregate Functions</title>
|
|
|
|
<para>
|
|
<acronym>SQL</acronym> provides aggregate functions such as AVG,
|
|
COUNT, SUM, MIN, and MAX. The argument(s) of an aggregate function
|
|
are evaluated at each row that satisfies the WHERE
|
|
clause, and the aggregate function is calculated over this set
|
|
of input values. Normally, an aggregate delivers a single
|
|
result for a whole <command>SELECT</command> statement. But if
|
|
grouping is specified in the query, then a separate calculation
|
|
is done over the rows of each group, and an aggregate result is
|
|
delivered per group (see next section).
|
|
|
|
<example>
|
|
<title id="aggregates-example">Aggregates</title>
|
|
|
|
<para>
|
|
If we want to know the average cost of all parts in table PART we use
|
|
the following query:
|
|
|
|
<programlisting>
|
|
SELECT AVG(PRICE) AS AVG_PRICE
|
|
FROM PART;
|
|
</programlisting>
|
|
</para>
|
|
|
|
<para>
|
|
The result is:
|
|
|
|
<programlisting>
|
|
AVG_PRICE
|
|
-----------
|
|
14.5
|
|
</programlisting>
|
|
</para>
|
|
|
|
<para>
|
|
If we want to know how many parts are defined in table PART we use
|
|
the statement:
|
|
|
|
<programlisting>
|
|
SELECT COUNT(PNO)
|
|
FROM PART;
|
|
</programlisting>
|
|
|
|
and get:
|
|
|
|
<programlisting>
|
|
COUNT
|
|
-------
|
|
4
|
|
</programlisting>
|
|
|
|
</para>
|
|
</example>
|
|
</para>
|
|
</sect3>
|
|
|
|
<sect3>
|
|
<title>Aggregation by Groups</title>
|
|
|
|
<para>
|
|
<acronym>SQL</acronym> allows one to partition the tuples of a table
|
|
into groups. Then the
|
|
aggregate functions described above can be applied to the groups —
|
|
i.e. the value of the aggregate function is no longer calculated over
|
|
all the values of the specified column but over all values of a
|
|
group. Thus the aggregate function is evaluated separately for every
|
|
group.
|
|
</para>
|
|
|
|
<para>
|
|
The partitioning of the tuples into groups is done by using the
|
|
keywords <command>GROUP BY</command> followed by a list of
|
|
attributes that define the
|
|
groups. If we have
|
|
<command>GROUP BY A<subscript>1</subscript>, ⃛, A<subscript>k</subscript></command>
|
|
we partition
|
|
the relation into groups, such that two tuples are in the same group
|
|
if and only if they agree on all the attributes
|
|
A<subscript>1</subscript>, ⃛, A<subscript>k</subscript>.
|
|
|
|
<example>
|
|
<title id="aggregates-groupby">Aggregates</title>
|
|
<para>
|
|
If we want to know how many parts are sold by every supplier we
|
|
formulate the query:
|
|
|
|
<programlisting>
|
|
SELECT S.SNO, S.SNAME, COUNT(SE.PNO)
|
|
FROM SUPPLIER S, SELLS SE
|
|
WHERE S.SNO = SE.SNO
|
|
GROUP BY S.SNO, S.SNAME;
|
|
</programlisting>
|
|
|
|
and get:
|
|
|
|
<programlisting>
|
|
SNO | SNAME | COUNT
|
|
-----+-------+-------
|
|
1 | Smith | 2
|
|
2 | Jones | 1
|
|
3 | Adams | 2
|
|
4 | Blake | 3
|
|
</programlisting>
|
|
</para>
|
|
|
|
<para>
|
|
Now let's have a look of what is happening here.
|
|
First the join of the
|
|
tables SUPPLIER and SELLS is derived:
|
|
|
|
<programlisting>
|
|
S.SNO | S.SNAME | SE.PNO
|
|
-------+---------+--------
|
|
1 | Smith | 1
|
|
1 | Smith | 2
|
|
2 | Jones | 4
|
|
3 | Adams | 1
|
|
3 | Adams | 3
|
|
4 | Blake | 2
|
|
4 | Blake | 3
|
|
4 | Blake | 4
|
|
</programlisting>
|
|
</para>
|
|
|
|
<para>
|
|
Next we partition the tuples into groups by putting all tuples
|
|
together that agree on both attributes S.SNO and S.SNAME:
|
|
|
|
<programlisting>
|
|
S.SNO | S.SNAME | SE.PNO
|
|
-------+---------+--------
|
|
1 | Smith | 1
|
|
| 2
|
|
--------------------------
|
|
2 | Jones | 4
|
|
--------------------------
|
|
3 | Adams | 1
|
|
| 3
|
|
--------------------------
|
|
4 | Blake | 2
|
|
| 3
|
|
| 4
|
|
</programlisting>
|
|
</para>
|
|
|
|
<para>
|
|
In our example we got four groups and now we can apply the aggregate
|
|
function COUNT to every group leading to the final result of the query
|
|
given above.
|
|
</para>
|
|
</example>
|
|
</para>
|
|
|
|
<para>
|
|
Note that for a query using GROUP BY and aggregate
|
|
functions to make sense, the target list can only refer directly to
|
|
the attributes being grouped by. Other attributes can only be used
|
|
inside the arguments of aggregate functions. Otherwise there would
|
|
not be a unique value to associate with the other attributes.
|
|
</para>
|
|
|
|
<para>
|
|
Also observe that it makes no sense to ask for an aggregate of
|
|
an aggregate, e.g., AVG(MAX(sno)), because a
|
|
<command>SELECT</command> only does one pass of grouping and
|
|
aggregation. You can get a result of this kind by using a
|
|
temporary table or a sub-SELECT in the FROM clause to do the
|
|
first level of aggregation.
|
|
</para>
|
|
</sect3>
|
|
|
|
<sect3>
|
|
<title>Having</title>
|
|
|
|
<para>
|
|
The HAVING clause works much like the WHERE clause and is used to
|
|
consider only those groups satisfying the qualification given in the
|
|
HAVING clause. Essentially, WHERE filters out unwanted input rows
|
|
before grouping and aggregation are done, whereas HAVING filters out
|
|
unwanted group rows post-GROUP. Therefore, WHERE cannot refer to the
|
|
results of aggregate functions. On the other hand, there's no point
|
|
in writing a HAVING condition that doesn't involve an aggregate
|
|
function! If your condition doesn't involve aggregates, you might
|
|
as well write it in WHERE, and thereby avoid the computation of
|
|
aggregates for groups that you're just going to throw away anyway.
|
|
|
|
<example>
|
|
<title id="having-example">Having</title>
|
|
|
|
<para>
|
|
If we want only those suppliers selling more than one part we use the
|
|
query:
|
|
|
|
<programlisting>
|
|
SELECT S.SNO, S.SNAME, COUNT(SE.PNO)
|
|
FROM SUPPLIER S, SELLS SE
|
|
WHERE S.SNO = SE.SNO
|
|
GROUP BY S.SNO, S.SNAME
|
|
HAVING COUNT(SE.PNO) > 1;
|
|
</programlisting>
|
|
|
|
and get:
|
|
|
|
<programlisting>
|
|
SNO | SNAME | COUNT
|
|
-----+-------+-------
|
|
1 | Smith | 2
|
|
3 | Adams | 2
|
|
4 | Blake | 3
|
|
</programlisting>
|
|
</para>
|
|
</example>
|
|
</para>
|
|
</sect3>
|
|
|
|
<sect3>
|
|
<title>Subqueries</title>
|
|
|
|
<para>
|
|
In the WHERE and HAVING clauses the use of subqueries (subselects) is
|
|
allowed in every place where a value is expected. In this case the
|
|
value must be derived by evaluating the subquery first. The usage of
|
|
subqueries extends the expressive power of
|
|
<acronym>SQL</acronym>.
|
|
|
|
<example>
|
|
<title id="subselect-example">Subselect</title>
|
|
|
|
<para>
|
|
If we want to know all parts having a greater price than the part
|
|
named 'Screw' we use the query:
|
|
|
|
<programlisting>
|
|
SELECT *
|
|
FROM PART
|
|
WHERE PRICE > (SELECT PRICE FROM PART
|
|
WHERE PNAME='Screw');
|
|
</programlisting>
|
|
</para>
|
|
|
|
<para>
|
|
The result is:
|
|
|
|
<programlisting>
|
|
PNO | PNAME | PRICE
|
|
-----+---------+--------
|
|
3 | Bolt | 15
|
|
4 | Cam | 25
|
|
</programlisting>
|
|
</para>
|
|
|
|
<para>
|
|
When we look at the above query we can see the keyword
|
|
<command>SELECT</command> two times. The first one at the
|
|
beginning of the query - we will refer to it as outer
|
|
<command>SELECT</command> - and the one in the WHERE clause which
|
|
begins a nested query - we will refer to it as inner
|
|
<command>SELECT</command>. For every tuple of the outer
|
|
<command>SELECT</command> the inner <command>SELECT</command> has
|
|
to be evaluated. After every evaluation we know the price of the
|
|
tuple named 'Screw' and we can check if the price of the actual
|
|
tuple is greater. (Actually, in this example the inner query need
|
|
only be evaluated once, since it does not depend on the state of
|
|
the outer query.)
|
|
</para>
|
|
|
|
<para>
|
|
If we want to know all suppliers that do not sell any part
|
|
(e.g. to be able to remove these suppliers from the database) we use:
|
|
|
|
<programlisting>
|
|
SELECT *
|
|
FROM SUPPLIER S
|
|
WHERE NOT EXISTS
|
|
(SELECT * FROM SELLS SE
|
|
WHERE SE.SNO = S.SNO);
|
|
</programlisting>
|
|
</para>
|
|
|
|
<para>
|
|
In our example the result will be empty because every supplier
|
|
sells at least one part. Note that we use S.SNO from the outer
|
|
<command>SELECT</command> within the WHERE clause of the inner
|
|
<command>SELECT</command>. Here the subquery must be evaluated
|
|
afresh for each tuple from the outer query, i.e. the value for
|
|
S.SNO is always taken from the current tuple of the outer
|
|
<command>SELECT</command>.
|
|
</para>
|
|
</example>
|
|
</para>
|
|
</sect3>
|
|
|
|
<sect3>
|
|
<title>Subqueries in FROM</title>
|
|
|
|
<para>
|
|
A somewhat different way of using subqueries is to put them in the
|
|
FROM clause. This is a useful feature because a subquery of this
|
|
kind can output multiple columns and rows, whereas a subquery used
|
|
in an expression must deliver just a single result. It also lets
|
|
us get more than one round of grouping/aggregation without resorting
|
|
to a temporary table.
|
|
|
|
<example>
|
|
<title id="subselect-in-from-example">Subselect in FROM</title>
|
|
|
|
<para>
|
|
If we want to know the highest average part price among all our
|
|
suppliers, we cannot write MAX(AVG(PRICE)), but we can write:
|
|
|
|
<programlisting>
|
|
SELECT MAX(subtable.avgprice)
|
|
FROM (SELECT AVG(P.PRICE) AS avgprice
|
|
FROM SUPPLIER S, PART P, SELLS SE
|
|
WHERE S.SNO = SE.SNO AND
|
|
P.PNO = SE.PNO
|
|
GROUP BY S.SNO) subtable;
|
|
</programlisting>
|
|
|
|
The subquery returns one row per supplier (because of its GROUP BY)
|
|
and then we aggregate over those rows in the outer query.
|
|
</para>
|
|
</example>
|
|
</para>
|
|
</sect3>
|
|
|
|
<sect3>
|
|
<title>Union, Intersect, Except</title>
|
|
|
|
<para>
|
|
These operations calculate the union, intersection and set theoretic
|
|
difference of the tuples derived by two subqueries.
|
|
|
|
<example>
|
|
<title id="union-example">Union, Intersect, Except</title>
|
|
|
|
<para>
|
|
The following query is an example for UNION:
|
|
|
|
<programlisting>
|
|
SELECT S.SNO, S.SNAME, S.CITY
|
|
FROM SUPPLIER S
|
|
WHERE S.SNAME = 'Jones'
|
|
UNION
|
|
SELECT S.SNO, S.SNAME, S.CITY
|
|
FROM SUPPLIER S
|
|
WHERE S.SNAME = 'Adams';
|
|
</programlisting>
|
|
|
|
gives the result:
|
|
|
|
<programlisting>
|
|
SNO | SNAME | CITY
|
|
-----+-------+--------
|
|
2 | Jones | Paris
|
|
3 | Adams | Vienna
|
|
</programlisting>
|
|
</para>
|
|
|
|
<para>
|
|
Here is an example for INTERSECT:
|
|
|
|
<programlisting>
|
|
SELECT S.SNO, S.SNAME, S.CITY
|
|
FROM SUPPLIER S
|
|
WHERE S.SNO > 1
|
|
INTERSECT
|
|
SELECT S.SNO, S.SNAME, S.CITY
|
|
FROM SUPPLIER S
|
|
WHERE S.SNO < 3;
|
|
</programlisting>
|
|
|
|
gives the result:
|
|
|
|
<programlisting>
|
|
SNO | SNAME | CITY
|
|
-----+-------+--------
|
|
2 | Jones | Paris
|
|
</programlisting>
|
|
|
|
The only tuple returned by both parts of the query is the one having SNO=2.
|
|
</para>
|
|
|
|
<para>
|
|
Finally an example for EXCEPT:
|
|
|
|
<programlisting>
|
|
SELECT S.SNO, S.SNAME, S.CITY
|
|
FROM SUPPLIER S
|
|
WHERE S.SNO > 1
|
|
EXCEPT
|
|
SELECT S.SNO, S.SNAME, S.CITY
|
|
FROM SUPPLIER S
|
|
WHERE S.SNO > 3;
|
|
</programlisting>
|
|
|
|
gives the result:
|
|
|
|
<programlisting>
|
|
SNO | SNAME | CITY
|
|
-----+-------+--------
|
|
2 | Jones | Paris
|
|
3 | Adams | Vienna
|
|
</programlisting>
|
|
</para>
|
|
</example>
|
|
</para>
|
|
</sect3>
|
|
</sect2>
|
|
|
|
<sect2 id="datadef">
|
|
<title>Data Definition</title>
|
|
|
|
<para>
|
|
There is a set of commands used for data definition included in the
|
|
<acronym>SQL</acronym> language.
|
|
</para>
|
|
|
|
<sect3 id="create">
|
|
<title id="create-title">Create Table</title>
|
|
|
|
<para>
|
|
The most fundamental command for data definition is the
|
|
one that creates a new relation (a new table). The syntax of the
|
|
<command>CREATE TABLE</command> command is:
|
|
|
|
<synopsis>
|
|
CREATE TABLE <replaceable class="parameter">table_name</replaceable>
|
|
(<replaceable class="parameter">name_of_attr_1</replaceable> <replaceable class="parameter">type_of_attr_1</replaceable>
|
|
[, <replaceable class="parameter">name_of_attr_2</replaceable> <replaceable class="parameter">type_of_attr_2</replaceable>
|
|
[, ...]]);
|
|
</synopsis>
|
|
|
|
<example>
|
|
<title id="table-create">Table Creation</title>
|
|
|
|
<para>
|
|
To create the tables defined in
|
|
<xref linkend="supplier-fig" endterm="supplier-fig"> the
|
|
following <acronym>SQL</acronym> statements are used:
|
|
|
|
<programlisting>
|
|
CREATE TABLE SUPPLIER
|
|
(SNO INTEGER,
|
|
SNAME VARCHAR(20),
|
|
CITY VARCHAR(20));
|
|
</programlisting>
|
|
|
|
<programlisting>
|
|
CREATE TABLE PART
|
|
(PNO INTEGER,
|
|
PNAME VARCHAR(20),
|
|
PRICE DECIMAL(4 , 2));
|
|
</programlisting>
|
|
|
|
<programlisting>
|
|
CREATE TABLE SELLS
|
|
(SNO INTEGER,
|
|
PNO INTEGER);
|
|
</programlisting>
|
|
</para>
|
|
</example>
|
|
</para>
|
|
</sect3>
|
|
|
|
<sect3>
|
|
<title>Data Types in <acronym>SQL</acronym></title>
|
|
|
|
<para>
|
|
The following is a list of some data types that are supported by
|
|
<acronym>SQL</acronym>:
|
|
|
|
<itemizedlist>
|
|
<listitem>
|
|
<para>
|
|
INTEGER: signed fullword binary integer (31 bits precision).
|
|
</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>
|
|
SMALLINT: signed halfword binary integer (15 bits precision).
|
|
</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>
|
|
DECIMAL (<replaceable class="parameter">p</replaceable>[,<replaceable class="parameter">q</replaceable>]):
|
|
signed packed decimal number of up to
|
|
<replaceable class="parameter">p</replaceable>
|
|
digits, with
|
|
<replaceable class="parameter">q</replaceable>
|
|
digits to the right of the decimal point.
|
|
If <replaceable class="parameter">q</replaceable>
|
|
is omitted it is assumed to be 0.
|
|
</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>
|
|
FLOAT: signed doubleword floating point number.
|
|
</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>
|
|
VARCHAR(<replaceable class="parameter">n</replaceable>):
|
|
varying length character string of maximum length
|
|
<replaceable class="parameter">n</replaceable>.
|
|
</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>
|
|
CHAR(<replaceable class="parameter">n</replaceable>):
|
|
fixed length character string of length
|
|
<replaceable class="parameter">n</replaceable>.
|
|
</para>
|
|
</listitem>
|
|
|
|
</itemizedlist>
|
|
</para>
|
|
</sect3>
|
|
|
|
<sect3>
|
|
<title>Create Index</title>
|
|
|
|
<para>
|
|
Indexes are used to speed up access to a relation. If a relation <classname>R</classname>
|
|
has an index on attribute <classname>A</classname> then we can
|
|
retrieve all tuples <replaceable>t</replaceable>
|
|
having
|
|
<replaceable>t</replaceable>(<classname>A</classname>) = <replaceable>a</replaceable>
|
|
in time roughly proportional to the number of such
|
|
tuples <replaceable>t</replaceable>
|
|
rather than in time proportional to the size of <classname>R</classname>.
|
|
</para>
|
|
|
|
<para>
|
|
To create an index in <acronym>SQL</acronym>
|
|
the <command>CREATE INDEX</command> command is used. The syntax is:
|
|
|
|
<programlisting>
|
|
CREATE INDEX <replaceable class="parameter">index_name</replaceable>
|
|
ON <replaceable class="parameter">table_name</replaceable> ( <replaceable class="parameter">name_of_attribute</replaceable> );
|
|
</programlisting>
|
|
</para>
|
|
|
|
<para>
|
|
<example>
|
|
<title id="index-create">Create Index</title>
|
|
|
|
<para>
|
|
To create an index named I on attribute SNAME of relation SUPPLIER
|
|
we use the following statement:
|
|
|
|
<programlisting>
|
|
CREATE INDEX I ON SUPPLIER (SNAME);
|
|
</programlisting>
|
|
</para>
|
|
|
|
<para>
|
|
The created index is maintained automatically, i.e. whenever a new
|
|
tuple is inserted into the relation SUPPLIER the index I is
|
|
adapted. Note that the only changes a user can perceive when an
|
|
index is present are increased speed for <command>SELECT</command>
|
|
and decreases in speed of updates.
|
|
</para>
|
|
</example>
|
|
</para>
|
|
</sect3>
|
|
|
|
<sect3>
|
|
<title>Create View</title>
|
|
|
|
<para>
|
|
A view can be regarded as a <firstterm>virtual table</firstterm>,
|
|
i.e. a table that
|
|
does not <emphasis>physically</emphasis> exist in the database
|
|
but looks to the user
|
|
as if it does. By contrast, when we talk of a
|
|
<firstterm>base table</firstterm> there is
|
|
really a physically stored counterpart of each row of the table
|
|
somewhere in the physical storage.
|
|
</para>
|
|
|
|
<para>
|
|
Views do not have their own, physically separate, distinguishable
|
|
stored data. Instead, the system stores the definition of the
|
|
view (i.e. the rules about how to access physically stored base
|
|
tables in order to materialize the view) somewhere in the system
|
|
catalogs (see
|
|
<xref linkend="tutorial-catalogs-title" endterm="tutorial-catalogs-title">). For a
|
|
discussion on different techniques to implement views refer to
|
|
<!--
|
|
section
|
|
<xref linkend="view-impl" endterm="view-impl">.
|
|
-->
|
|
<citetitle>SIM98</citetitle>.
|
|
</para>
|
|
|
|
<para>
|
|
In <acronym>SQL</acronym> the <command>CREATE VIEW</command>
|
|
command is used to define a view. The syntax
|
|
is:
|
|
|
|
<programlisting>
|
|
CREATE VIEW <replaceable class="parameter">view_name</replaceable>
|
|
AS <replaceable class="parameter">select_stmt</replaceable>
|
|
</programlisting>
|
|
|
|
where <replaceable class="parameter">select_stmt</replaceable>
|
|
is a valid select statement as defined
|
|
in <xref linkend="select-title" endterm="select-title">.
|
|
Note that <replaceable class="parameter">select_stmt</replaceable> is
|
|
not executed when the view is created. It is just stored in the
|
|
<firstterm>system catalogs</firstterm>
|
|
and is executed whenever a query against the view is made.
|
|
</para>
|
|
|
|
<para>
|
|
Let the following view definition be given (we use
|
|
the tables from
|
|
<xref linkend="supplier-fig" endterm="supplier-fig"> again):
|
|
|
|
<programlisting>
|
|
CREATE VIEW London_Suppliers
|
|
AS SELECT S.SNAME, P.PNAME
|
|
FROM SUPPLIER S, PART P, SELLS SE
|
|
WHERE S.SNO = SE.SNO AND
|
|
P.PNO = SE.PNO AND
|
|
S.CITY = 'London';
|
|
</programlisting>
|
|
</para>
|
|
|
|
<para>
|
|
Now we can use this <firstterm>virtual relation</firstterm>
|
|
<classname>London_Suppliers</classname> as
|
|
if it were another base table:
|
|
|
|
<programlisting>
|
|
SELECT * FROM London_Suppliers
|
|
WHERE PNAME = 'Screw';
|
|
</programlisting>
|
|
|
|
which will return the following table:
|
|
|
|
<programlisting>
|
|
SNAME | PNAME
|
|
-------+-------
|
|
Smith | Screw
|
|
</programlisting>
|
|
</para>
|
|
|
|
<para>
|
|
To calculate this result the database system has to do a
|
|
<emphasis>hidden</emphasis>
|
|
access to the base tables SUPPLIER, SELLS and PART first. It
|
|
does so by executing the query given in the view definition against
|
|
those base tables. After that the additional qualifications
|
|
(given in the
|
|
query against the view) can be applied to obtain the resulting
|
|
table.
|
|
</para>
|
|
</sect3>
|
|
|
|
<sect3>
|
|
<title>Drop Table, Drop Index, Drop View</title>
|
|
|
|
<para>
|
|
To destroy a table (including all tuples stored in that table) the
|
|
<command>DROP TABLE</command> command is used:
|
|
|
|
<programlisting>
|
|
DROP TABLE <replaceable class="parameter">table_name</replaceable>;
|
|
</programlisting>
|
|
</para>
|
|
|
|
<para>
|
|
To destroy the SUPPLIER table use the following statement:
|
|
|
|
<programlisting>
|
|
DROP TABLE SUPPLIER;
|
|
</programlisting>
|
|
</para>
|
|
|
|
<para>
|
|
The <command>DROP INDEX</command> command is used to destroy an index:
|
|
|
|
<programlisting>
|
|
DROP INDEX <replaceable class="parameter">index_name</replaceable>;
|
|
</programlisting>
|
|
</para>
|
|
|
|
<para>
|
|
Finally to destroy a given view use the command <command>DROP
|
|
VIEW</command>:
|
|
|
|
<programlisting>
|
|
DROP VIEW <replaceable class="parameter">view_name</replaceable>;
|
|
</programlisting>
|
|
</para>
|
|
</sect3>
|
|
</sect2>
|
|
|
|
<sect2>
|
|
<title>Data Manipulation</title>
|
|
|
|
<sect3>
|
|
<title>Insert Into</title>
|
|
|
|
<para>
|
|
Once a table is created (see
|
|
<xref linkend="create-title" endterm="create-title">), it can be filled
|
|
with tuples using the command <command>INSERT INTO</command>.
|
|
The syntax is:
|
|
|
|
<programlisting>
|
|
INSERT INTO <replaceable class="parameter">table_name</replaceable> (<replaceable class="parameter">name_of_attr_1</replaceable>
|
|
[, <replaceable class="parameter">name_of_attr_2</replaceable> [,...]])
|
|
VALUES (<replaceable class="parameter">val_attr_1</replaceable> [, <replaceable class="parameter">val_attr_2</replaceable> [, ...]]);
|
|
</programlisting>
|
|
</para>
|
|
|
|
<para>
|
|
To insert the first tuple into the relation SUPPLIER (from
|
|
<xref linkend="supplier-fig" endterm="supplier-fig">) we use the
|
|
following statement:
|
|
|
|
<programlisting>
|
|
INSERT INTO SUPPLIER (SNO, SNAME, CITY)
|
|
VALUES (1, 'Smith', 'London');
|
|
</programlisting>
|
|
</para>
|
|
|
|
<para>
|
|
To insert the first tuple into the relation SELLS we use:
|
|
|
|
<programlisting>
|
|
INSERT INTO SELLS (SNO, PNO)
|
|
VALUES (1, 1);
|
|
</programlisting>
|
|
</para>
|
|
</sect3>
|
|
|
|
<sect3>
|
|
<title>Update</title>
|
|
|
|
<para>
|
|
To change one or more attribute values of tuples in a relation the
|
|
<command>UPDATE</command> command is used. The syntax is:
|
|
|
|
<programlisting>
|
|
UPDATE <replaceable class="parameter">table_name</replaceable>
|
|
SET <replaceable class="parameter">name_of_attr_1</replaceable> = <replaceable class="parameter">value_1</replaceable>
|
|
[, ... [, <replaceable class="parameter">name_of_attr_k</replaceable> = <replaceable class="parameter">value_k</replaceable>]]
|
|
WHERE <replaceable class="parameter">condition</replaceable>;
|
|
</programlisting>
|
|
</para>
|
|
|
|
<para>
|
|
To change the value of attribute PRICE of the part 'Screw' in the
|
|
relation PART we use:
|
|
|
|
<programlisting>
|
|
UPDATE PART
|
|
SET PRICE = 15
|
|
WHERE PNAME = 'Screw';
|
|
</programlisting>
|
|
</para>
|
|
|
|
<para>
|
|
The new value of attribute PRICE of the tuple whose name is 'Screw' is
|
|
now 15.
|
|
</para>
|
|
</sect3>
|
|
|
|
<sect3>
|
|
<title>Delete</title>
|
|
|
|
<para>
|
|
To delete a tuple from a particular table use the command DELETE
|
|
FROM. The syntax is:
|
|
|
|
<programlisting>
|
|
DELETE FROM <replaceable class="parameter">table_name</replaceable>
|
|
WHERE <replaceable class="parameter">condition</replaceable>;
|
|
</programlisting>
|
|
</para>
|
|
|
|
<para>
|
|
To delete the supplier called 'Smith' of the table SUPPLIER the
|
|
following statement is used:
|
|
|
|
<programlisting>
|
|
DELETE FROM SUPPLIER
|
|
WHERE SNAME = 'Smith';
|
|
</programlisting>
|
|
</para>
|
|
</sect3>
|
|
</sect2>
|
|
|
|
<sect2 id="tutorial-catalogs">
|
|
<title id="tutorial-catalogs-title">System Catalogs</title>
|
|
|
|
<para>
|
|
In every <acronym>SQL</acronym> database system
|
|
<firstterm>system catalogs</firstterm> are used to keep
|
|
track of which tables, views indexes etc. are defined in the
|
|
database. These system catalogs can be queried as if they were normal
|
|
relations. For example there is one catalog used for the definition of
|
|
views. This catalog stores the query from the view definition. Whenever
|
|
a query against a view is made, the system first gets the
|
|
<firstterm>view definition query</firstterm> out of the catalog
|
|
and materializes the view
|
|
before proceeding with the user query (see
|
|
<!--
|
|
section
|
|
<xref linkend="view-impl" endterm="view-impl">.
|
|
<citetitle>SIM98</citetitle>
|
|
-->
|
|
<xref linkend="SIM98" endterm="SIM98">
|
|
for a more detailed
|
|
description). For more information about system catalogs refer to
|
|
<xref linkend="DATE04" endterm="DATE04">.
|
|
</para>
|
|
</sect2>
|
|
|
|
<sect2>
|
|
<title>Embedded <acronym>SQL</acronym></title>
|
|
|
|
<para>
|
|
In this section we will sketch how <acronym>SQL</acronym> can be
|
|
embedded into a host language (e.g. <literal>C</literal>).
|
|
There are two main reasons why we want to use <acronym>SQL</acronym>
|
|
from a host language:
|
|
|
|
<itemizedlist>
|
|
<listitem>
|
|
<para>
|
|
There are queries that cannot be formulated using pure <acronym>SQL</acronym>
|
|
(i.e. recursive queries). To be able to perform such queries we need a
|
|
host language with a greater expressive power than
|
|
<acronym>SQL</acronym>.
|
|
</para>
|
|
</listitem>
|
|
|
|
<listitem>
|
|
<para>
|
|
We simply want to access a database from some application that
|
|
is written in the host language (e.g. a ticket reservation system
|
|
with a graphical user interface is written in C and the information
|
|
about which tickets are still left is stored in a database that can be
|
|
accessed using embedded <acronym>SQL</acronym>).
|
|
</para>
|
|
</listitem>
|
|
</itemizedlist>
|
|
</para>
|
|
|
|
<para>
|
|
A program using embedded <acronym>SQL</acronym>
|
|
in a host language consists of statements
|
|
of the host language and of
|
|
<firstterm>embedded <acronym>SQL</acronym></firstterm>
|
|
(<acronym>ESQL</acronym>) statements. Every <acronym>ESQL</acronym>
|
|
statement begins with the keywords <command>EXEC SQL</command>.
|
|
The <acronym>ESQL</acronym> statements are
|
|
transformed to statements of the host language
|
|
by a <firstterm>precompiler</firstterm>
|
|
(which usually inserts
|
|
calls to library routines that perform the various <acronym>SQL</acronym>
|
|
commands).
|
|
</para>
|
|
|
|
<para>
|
|
When we look at the examples throughout
|
|
<xref linkend="select-title" endterm="select-title"> we
|
|
realize that the result of the queries is very often a set of
|
|
tuples. Most host languages are not designed to operate on sets so we
|
|
need a mechanism to access every single tuple of the set of tuples
|
|
returned by a SELECT statement. This mechanism can be provided by
|
|
declaring a <firstterm>cursor</firstterm>.
|
|
After that we can use the <command>FETCH</command> command to
|
|
retrieve a tuple and set the cursor to the next tuple.
|
|
</para>
|
|
|
|
<para>
|
|
For a detailed discussion on embedded <acronym>SQL</acronym>
|
|
refer to
|
|
<xref linkend="DATE97" endterm="DATE97">,
|
|
<xref linkend="DATE04" endterm="DATE04">,
|
|
or
|
|
<xref linkend="ULL88" endterm="ULL88">.
|
|
</para>
|
|
</sect2>
|
|
</sect1>
|
|
</chapter>
|