Updates for schema features.

2025-11-03 09:13:20 +03:00 · 2002-04-25 20:14:43 +00:00
parent 52200befd0
commit e358a61d76
4 changed files with 352 additions and 80 deletions
--- a/doc/src/sgml/syntax.sgml
+++ b/doc/src/sgml/syntax.sgml
@@ -1,5 +1,5 @@
 <!--
-$Header: /cvsroot/pgsql/doc/src/sgml/syntax.sgml,v 1.59 2002/03/22 19:20:31 petere Exp $
+$Header: /cvsroot/pgsql/doc/src/sgml/syntax.sgml,v 1.60 2002/04/25 20:14:43 tgl Exp $
 -->

 <chapter id="sql-syntax">
@@ -623,14 +623,197 @@ CAST ( '<replaceable>string</replaceable>' AS <replaceable>type</replaceable> )
  </sect2>
 </sect1>

+ <sect1 id="sql-naming">
+  <title>Schemas and naming conventions</title>

-  <sect1 id="sql-syntax-columns">
-   <title>Columns</title>
+     <indexterm>
+      <primary>schemas</primary>
+     </indexterm>
+
+     <indexterm>
+      <primary>search path</primary>
+     </indexterm>
+
+     <indexterm>
+      <primary>namespaces</primary>
+     </indexterm>
+
+   <para>
+    A <productname>PostgreSQL</productname> database cluster (installation)
+    contains one or more named databases.  Users and groups of users are
+    shared across the entire cluster, but no other data is shared across
+    databases.  Any given client connection to the server can access
+    only the data in a single database, the one specified in the connection
+    request.
+   </para>
+
+   <note>
+    <para>
+     Users of a cluster do not necessarily have the privilege to access every
+     database in the cluster.  Sharing of user names means that there
+     cannot be different users named, say, <literal>joe</> in two databases
+     in the same cluster; but the system can be configured to allow
+     <literal>joe</> access to only some of the databases.
+    </para>
+   </note>
+
+   <para>
+    A database contains one or more named <firstterm>schemas</>, which
+    in turn contain tables.  Schemas also contain other kinds of named
+    objects, including datatypes, functions, and operators.  The same
+    object name can be used in different schemas without conflict; for
+    example, both <literal>schema1</> and <literal>myschema</> may
+    contain tables named <literal>mytable</>.  Unlike databases, schemas
+    are not rigidly separated: a user may access objects in any of the
+    schemas in the database the user is connected to, provided the user
+    has the privileges to do so.
+   </para>
+
+     <indexterm>
+      <primary>qualified names</primary>
+     </indexterm>
+
+     <indexterm>
+      <primary>names</primary>
+      <secondary>qualified</secondary>
+     </indexterm>
+
+   <para>
+    To name a table precisely, write a <firstterm>qualified name</> consisting
+    of the schema name and table name separated by a dot:
+<synopsis>
+    <replaceable>schema</><literal>.</><replaceable>table</>
+</synopsis>
+    Actually, the even more general syntax
+<synopsis>
+    <replaceable>database</><literal>.</><replaceable>schema</><literal>.</><replaceable>table</>
+</synopsis>
+    can be used too, but at present this is just for pro-forma compliance
+    with the SQL standard; if you write a database name, it must be the
+    name of the database you are connected to.
+   </para>
+
+     <indexterm>
+      <primary>unqualified names</primary>
+     </indexterm>
+
+     <indexterm>
+      <primary>names</primary>
+      <secondary>unqualified</secondary>
+     </indexterm>
+
+   <para>
+    Qualified names are tedious to write, and it's often best not to
+    wire a particular schema name into applications anyway.  Therefore
+    tables are often referred to by <firstterm>unqualified names</>,
+    which consist of just the table name.  The system determines which table
+    is meant by following a <firstterm>search path</>, which is a list
+    of schemas to look in.  The first matching table in the search path
+    is taken to be the one wanted.  If there is no match in the search
+    path, an error is reported, even if matching table names exist
+    in other schemas in the database.
+   </para>
+
+   <para>
+    The first schema named in the search path is called the current schema.
+    Aside from being the first schema searched, it is also the schema in
+    which new tables will be created if the <command>CREATE TABLE</>
+    command does not specify a schema name.
+   </para>
+
+   <para>
+    The search path works in the same way for datatype names, function names,
+    and operator names as it does for table names.  Datatype and function
+    names can be qualified in exactly the same way as table names.  If you
+    need to write a qualified operator name in an expression, there is a
+    special provision: you must write
+<synopsis>
+    <literal>OPERATOR(</><replaceable>schema</><literal>.</><replaceable>operator</><literal>)</>
+</synopsis>
+    This is needed to avoid syntactic ambiguity.  An example is
+<programlisting>
+    SELECT 3 OPERATOR(pg_catalog.+) 4;
+</programlisting>
+    In practice one usually relies on the search path for operators,
+    so as not to have to write anything so ugly as that.
+   </para>
+
+   <para>
+    The standard search path in <productname>PostgreSQL</productname>
+    contains first the schema having the same name as the session user
+    (if it exists), and second the schema named <literal>public</>
+    (if it exists, which it does by default).  This arrangement allows
+    a flexible combination of private and shared tables.  If no per-user
+    schemas are created then all user tables will exist in the shared
+    <literal>public</> schema, providing behavior that is backwards-compatible
+    with pre-7.3 <productname>PostgreSQL</productname> releases.
+   </para>
+
+   <note>
+    <para>
+     There is no concept of a <literal>public</> schema in the SQL standard.
+     To achieve closest conformance to the standard, the DBA should
+     create per-user schemas for every user, and not use (perhaps even
+     remove) the <literal>public</> schema.
+    </para>
+   </note>
+
+   <para>
+    In addition to <literal>public</> and user-created schemas, each database
+    contains a 
+    <literal>pg_catalog</> schema, which contains the system tables
+    and all the built-in datatypes, functions, and operators.
+    <literal>pg_catalog</> is always effectively part of the search path.
+    If it is not named explicitly in the path then it is implicitly searched
+    <emphasis>before</> searching the path's schemas.  This ensures that
+    built-in names will always be findable.  However, you may explicitly
+    place <literal>pg_catalog</> at the end of your search path if you
+    prefer to have user-defined names override built-in names.
+   </para>
+
+  <sect2 id="sql-reserved-names">
+   <title>Reserved names</title>
+
+     <indexterm>
+      <primary>reserved names</primary>
+     </indexterm>
+
+     <indexterm>
+      <primary>names</primary>
+      <secondary>reserved</secondary>
+     </indexterm>

    <para>
-     A <firstterm>column</firstterm>
-     is either a user-defined column of a given table or one of the
-     following system-defined columns:
+     There are several restrictions on the names that can be chosen for
+     user-defined database objects.  These restrictions vary depending
+     on the kind of object.  (Note that these restrictions are
+     separate from whether the name is a key word or not; quoting a
+     name will not allow you to escape these restrictions.)
+    </para>
+
+    <para>
+     Schema names beginning with <literal>pg_</> are reserved for system
+     purposes and may not be created by users.
+    </para>
+
+    <para>
+     In <productname>PostgreSQL</productname> versions before 7.3, table
+     names beginning with <literal>pg_</> were reserved.  This is no longer
+     true: you may create a table with such a name if you wish, in any non-system
+     schema.  However, it's best to continue to avoid such names,
+     to ensure that you won't suffer a conflict if some future version
+     defines a system catalog named the same as your table.  (With the
+     default search path, an unqualified reference to your table name
+     would be resolved as the system catalog instead.)  System catalogs will
+     continue to follow the convention of having names beginning with
+     <literal>pg_</>, so that they will not conflict with unqualified
+     user-table names so long as users avoid the <literal>pg_</> prefix.
+    </para>
+
+    <para>
+     Every table has several <firstterm>system columns</> that are
+     implicitly defined by the system.  Therefore, these names cannot
+     be used as names of user-defined columns:

     <indexterm>
      <primary>columns</primary>
@@ -648,7 +831,7 @@ CAST ( '<replaceable>string</replaceable>' AS <replaceable>type</replaceable> )
 	 The object identifier (object ID) of a row.  This is a serial number
 	 that is automatically added by <productname>PostgreSQL</productname> to all table rows (unless
 	 the table was created WITHOUT OIDS, in which case this column is
-	 not present).
+	 not present).  See <xref linkend="datatype-oid"> for more information.
 	</para>
       </listitem>
      </varlistentry>
@@ -715,13 +898,13 @@ CAST ( '<replaceable>string</replaceable>' AS <replaceable>type</replaceable> )
      <term><structfield>ctid</></term>
       <listitem>
 	<para>
-	 The tuple ID of the tuple within its table.  This is a pair
-	 (block number, tuple index within block) that identifies the
-	 physical location of the tuple.  Note that although the <structfield>ctid</structfield>
-	 can be used to locate the tuple very quickly, a row's <structfield>ctid</structfield>
-	 will change each time it is updated or moved by <command>VACUUM
-	 FULL</>.
-	 Therefore <structfield>ctid</structfield> is useless as a long-term row identifier.
+	 The physical location of the tuple within its table.
+	 Note that although the <structfield>ctid</structfield>
+	 can be used to locate the tuple very quickly, a row's
+	 <structfield>ctid</structfield> will change each time it is updated
+	 or moved by <command>VACUUM FULL</>.
+	 Therefore <structfield>ctid</structfield> is useless as a long-term
+	 row identifier.
 	 The OID, or even better a user-defined serial number, should
 	 be used to identify logical rows.
 	</para>
@@ -729,38 +912,8 @@ CAST ( '<replaceable>string</replaceable>' AS <replaceable>type</replaceable> )
      </varlistentry>
     </variablelist>
    </para>
-
-    <para>
-     OIDs are 32-bit quantities and are assigned from a single cluster-wide
-     counter.  In a large or long-lived database, it is possible for the
-     counter to wrap around.  Hence, it is bad practice to assume that OIDs
-     are unique, unless you take steps to ensure that they are unique.
-     Recommended practice when using OIDs for row identification is to create
-     a unique constraint on the OID column of each table for which the OID will be
-     used.  Never assume that OIDs are unique across tables; use the
-     combination of <structfield>tableoid</> and row OID if you need a database-wide
-     identifier.  (Future releases of <productname>PostgreSQL</productname> are likely to use a separate
-     OID counter for each table, so that <structfield>tableoid</> <emphasis>must</> be
-     included to arrive at a globally unique identifier.)
-    </para>
-
-    <para>
-     Transaction identifiers are 32-bit quantities.  In a long-lived
-     database it is possible for transaction IDs to wrap around.  This
-     is not a fatal problem given appropriate maintenance procedures;
-     see the <citetitle>Administrator's Guide</> for details.  However, it is
-     unwise to depend on uniqueness of transaction IDs over the long term
-     (more than one billion transactions).
-    </para>
-
-    <para>
-     Command identifiers are also 32-bit quantities.  This creates a hard
-     limit of 2<superscript>32</> (4 billion) SQL commands within a single transaction.
-     In practice this limit is not a problem --- note that the limit is on
-     number of SQL queries, not number of tuples processed.
-    </para>
-  </sect1>
-
+  </sect2>
+ </sect1>

 <sect1 id="sql-expressions">
  <title>Value Expressions</title>
@@ -864,8 +1017,9 @@ CAST ( '<replaceable>string</replaceable>' AS <replaceable>type</replaceable> )
 <replaceable>correlation</replaceable>.<replaceable>columnname</replaceable> `['<replaceable>subscript</replaceable>`]'
 </synopsis>

-    <replaceable>correlation</replaceable> is either the name of a
-    table, an alias for a table defined by means of a FROM clause, or
+    <replaceable>correlation</replaceable> is the name of a
+    table (possibly qualified), or an alias for a table defined by means of a
+    FROM clause, or 
    the key words <literal>NEW</literal> or <literal>OLD</literal>.
    (NEW and OLD can only appear in the action portion of a rule,
    while other correlation names can be used in any SQL statement.)
@@ -918,9 +1072,13 @@ CREATE FUNCTION dept (text) RETURNS dept
     <member><replaceable>expression</replaceable> <replaceable>operator</replaceable> (unary postfix operator)</member>
    </simplelist>
    where the <replaceable>operator</replaceable> token follows the syntax
-    rules of <xref linkend="sql-syntax-operators"> or is one of the
-    tokens <token>AND</token>, <token>OR</token>, and
-    <token>NOT</token>.  Which particular operators exist and whether
+    rules of <xref linkend="sql-syntax-operators">, or is one of the
+    key words <token>AND</token>, <token>OR</token>, and
+    <token>NOT</token>, or is a qualified operator name
+<synopsis>
+    <literal>OPERATOR(</><replaceable>schema</><literal>.</><replaceable>operatorname</><literal>)</>
+</synopsis>
+    Which particular operators exist and whether
    they are unary or binary depends on what operators have been
    defined by the system or the user.  <xref linkend="functions">
    describes the built-in operators.
@@ -932,8 +1090,7 @@ CREATE FUNCTION dept (text) RETURNS dept

   <para>
    The syntax for a function call is the name of a function
-    (which is subject to the syntax rules for identifiers of <xref
-    linkend="sql-syntax-identifiers">), followed by its argument list
+    (possibly qualified with a schema name), followed by its argument list
    enclosed in parentheses:

 <synopsis>
@@ -976,7 +1133,8 @@ sqrt(2)
    </simplelist>

    where <replaceable>aggregate_name</replaceable> is a previously
-    defined aggregate, and <replaceable>expression</replaceable> is
+    defined aggregate (possibly a qualified name), and
+    <replaceable>expression</replaceable> is 
    any value expression that does not itself contain an aggregate
    expression.
   </para>
@@ -1044,10 +1202,14 @@ CAST ( <replaceable>expression</replaceable> AS <replaceable>type</replaceable>
   </para>

   <para>
-    An explicit type cast may be omitted if there is no ambiguity as to the
-    type that a value expression must produce (for example, when it is
+    An explicit type cast may usually be omitted if there is no ambiguity as
+    to the type that a value expression must produce (for example, when it is
    assigned to a table column); the system will automatically apply a
-    type cast in such cases.
+    type cast in such cases.  However, automatic casting is only done for
+    cast functions that are marked <quote>okay to apply implicitly</>
+    in the system catalogs.  Other cast functions must be invoked with
+    explicit casting syntax.  This restriction is intended to prevent
+    surprising conversions from being applied silently.
   </para>

   <para>
@@ -1061,7 +1223,7 @@ CAST ( <replaceable>expression</replaceable> AS <replaceable>type</replaceable>
    can't be used this way, but the equivalent <literal>float8</literal>
    can.  Also, the names <literal>interval</>, <literal>time</>, and
    <literal>timestamp</> can only be used in this fashion if they are
-    double-quoted, because of parser conflicts.  Therefore, the use of
+    double-quoted, because of syntactic conflicts.  Therefore, the use of
    the function-like cast syntax leads to inconsistencies and should
    probably be avoided in new applications.
   </para>
@@ -1142,6 +1304,12 @@ SELECT (5 !) - 6;
     </thead>

     <tbody>
+      <row>
+       <entry><token>.</token></entry>
+       <entry>left</entry>
+       <entry>table/column name separator</entry>
+      </row>
+
      <row>
       <entry><token>::</token></entry>
       <entry>left</entry>
@@ -1154,12 +1322,6 @@ SELECT (5 !) - 6;
       <entry>array element selection</entry>
      </row>

-      <row>
-       <entry><token>.</token></entry>
-       <entry>left</entry>
-       <entry>table/column name separator</entry>
-      </row>
-
      <row>
       <entry><token>-</token></entry>
       <entry>right</entry>