mirror of
https://github.com/postgres/postgres.git
synced 2025-12-19 17:02:53 +03:00
First files for reference pages.
435
doc/src/sgml/ref/copy.sgml
Normal file
@@ -0,0 +1,435 @@
|
||||
<REFENTRY ID="SQL-COPY-1">
|
||||
<REFMETA>
|
||||
<REFENTRYTITLE>
|
||||
COPY
|
||||
</REFENTRYTITLE>
|
||||
<REFMISCINFO>SQL - Language Statements</REFMISCINFO>
|
||||
</REFMETA>
|
||||
<REFNAMEDIV>
|
||||
<REFNAME>
|
||||
COPY
|
||||
</REFNAME>
|
||||
<REFPURPOSE>
|
||||
Copies data between files and tables
|
||||
</REFPURPOSE>
|
||||
<REFSYNOPSISDIV>
|
||||
<REFSYNOPSISDIVINFO>
|
||||
<DATE>1998-04-15</DATE>
|
||||
</REFSYNOPSISDIVINFO>
|
||||
<SYNOPSIS>
|
||||
COPY [BINARY] <replaceable class="parameter">table</replaceable> [WITH OIDS]
|
||||
TO|FROM '<replaceable class="parameter">filename</replaceable>'|stdin|stdout
|
||||
[USING DELIMITERS '<replaceable class="parameter">delimiter</replaceable>']
|
||||
</SYNOPSIS>
|
||||
|
||||
<REFSECT2 ID="R2-SQL-COPY-1">
|
||||
<REFSECT2INFO>
|
||||
<DATE>1998-04-15</DATE>
|
||||
</REFSECT2INFO>
|
||||
<TITLE>
|
||||
Inputs
|
||||
</TITLE>
|
||||
<PARA>
|
||||
</PARA>
|
||||
<VARIABLELIST>
|
||||
<VARLISTENTRY>
|
||||
<TERM>
|
||||
</TERM>
|
||||
<LISTITEM>
|
||||
<PARA>
|
||||
<VARIABLELIST>
|
||||
<VARLISTENTRY>
|
||||
<TERM>
|
||||
<ReturnValue><replaceable class="parameter">table</replaceable></ReturnValue>
|
||||
</TERM>
|
||||
<LISTITEM>
|
||||
<PARA>
|
||||
The name of a table.
|
||||
</PARA>
|
||||
</LISTITEM>
|
||||
</VARLISTENTRY>
|
||||
<VARLISTENTRY>
|
||||
<TERM>
|
||||
<ReturnValue><replaceable class="parameter">delimiter</replaceable></ReturnValue>
|
||||
</TERM>
|
||||
<LISTITEM>
|
||||
<PARA>
|
||||
A character that delimits fields.
|
||||
</PARA>
|
||||
</LISTITEM>
|
||||
</VARLISTENTRY>
|
||||
</variablelist>
|
||||
</LISTITEM>
|
||||
</VARLISTENTRY>
|
||||
</VARIABLELIST>
|
||||
</REFSECT2>
|
||||
|
||||
<REFSECT2 ID="R2-SQL-COPY-2">
|
||||
<REFSECT2INFO>
|
||||
<DATE>1998-04-15</DATE>
|
||||
</REFSECT2INFO>
|
||||
<TITLE>
|
||||
Outputs
|
||||
</TITLE>
|
||||
<PARA>
|
||||
</PARA>
|
||||
<VARIABLELIST>
|
||||
<VARLISTENTRY>
|
||||
<TERM>
|
||||
Status
|
||||
</TERM>
|
||||
<LISTITEM>
|
||||
<PARA>
|
||||
<VARIABLELIST>
|
||||
<VARLISTENTRY>
|
||||
<TERM>
|
||||
<ReturnValue>COPY</ReturnValue>
|
||||
</TERM>
|
||||
<LISTITEM>
|
||||
<PARA>
|
||||
The copy completed successfully.
|
||||
</PARA>
|
||||
</LISTITEM>
|
||||
</VARLISTENTRY>
|
||||
<VARLISTENTRY>
|
||||
<TERM>
|
||||
<ReturnValue>ERROR: <replaceable>error message</replaceable></ReturnValue>
|
||||
</TERM>
|
||||
<LISTITEM>
|
||||
<PARA>
|
||||
The copy failed for the reason stated in the error message.
|
||||
</PARA>
|
||||
</LISTITEM>
|
||||
</VARLISTENTRY>
|
||||
</variablelist>
|
||||
</LISTITEM>
|
||||
</VARLISTENTRY>
|
||||
</VARIABLELIST>
|
||||
</REFSECT2>
|
||||
</REFSYNOPSISDIV>
|
||||
|
||||
<REFSECT1 ID="R1-SQL-COPY-1">
|
||||
<REFSECT1INFO>
|
||||
<DATE>1998-04-15</DATE>
|
||||
</REFSECT1INFO>
|
||||
<TITLE>
|
||||
Description
|
||||
</TITLE>
|
||||
<PARA>
|
||||
<command>COPY</command> moves data between PostgreSQL tables and
|
||||
standard Unix files. The keyword <function>BINARY</function>
|
||||
changes the behavior of field formatting, as described
|
||||
below. <replaceable class="parameter">Table</replaceable> is the
|
||||
name of an existing table. The keyword <function>WITH
|
||||
OIDS</function> copies the internal unique object id (OID) for each
|
||||
row. <replaceable class="parameter">Filename</replaceable> is the
|
||||
absolute Unix pathname of the file. In place of a filename, the
|
||||
keywords <function>stdin</function> and <function>stdout</function>
|
||||
can be used, so that input to <command>COPY</command> can be written
|
||||
by a libpq application and output from <command>COPY</command> can
|
||||
be read by a libpq application.
|
||||
</para>
|
||||
<para>
|
||||
The <function>BINARY</function> keyword will force all data to be
|
||||
stored/read as binary objects rather than as ASCII text. It is
|
||||
somewhat faster than the normal copy command, but is not
|
||||
generally portable, and the files generated are somewhat larger,
|
||||
although this factor is highly dependent on the data itself. By
|
||||
default, an ASCII copy uses a tab (\t) character as a delimiter.
|
||||
The delimiter may also be changed to any other single character
|
||||
with the keyword <function>USING DELIMITERS</function>. Characters
|
||||
in data fields which happen to match the delimiter character will
be escaped with a preceding backslash.
|
||||
</para>
|
||||
|
||||
<REFSECT2 ID="R2-SQL-COPY-3">
|
||||
<REFSECT2INFO>
|
||||
<DATE>1998-04-15</DATE>
|
||||
</REFSECT2INFO>
|
||||
<TITLE>
|
||||
Notes
|
||||
</TITLE>
|
||||
<para>
|
||||
You must have select access on any table whose values are read by
|
||||
<command>COPY</command>, and either insert or update access to a
|
||||
table into which values are being inserted by <command>COPY</command>.
|
||||
The backend also needs appropriate Unix permissions for any file read
|
||||
or written by <command>COPY</command>.
|
||||
<comment>
|
||||
Is this right? The man page talked of read, write and append access, which
|
||||
is neither SQL nor Unix terminology.
|
||||
</comment>
|
||||
</para>
|
||||
<para>
|
||||
The keyword <function>USING DELIMITERS</function> is inaptly
|
||||
named, since only a single character may be specified. (If a
|
||||
group of characters is specified, only the first character is
|
||||
used.)
|
||||
</para>
|
||||
<para>
|
||||
WARNING: do not confuse <command>COPY</command> with the
|
||||
<command>psql</command> instruction <command>\copy</command>.
|
||||
</para>
|
||||
</REFSECT2>
|
||||
</refsect1>
|
||||
<refsect1 ID="R1-SQL-COPY-2">
|
||||
<refsect1info>
|
||||
<date>1998-05-04</date>
|
||||
</refsect1info>
|
||||
<title>Format of output files</title>
|
||||
<refsect2>
|
||||
<refsect2info>
|
||||
<date>1998-05-04</date>
|
||||
</refsect2info>
|
||||
<title>ASCII copy format</title>
|
||||
<para>
|
||||
When <command>COPY</command> is used without <function>BINARY</function>,
|
||||
the file generated will have each instance on a single line, with each
|
||||
attribute separated by the delimiter character. Embedded
|
||||
delimiter characters will be preceded by a backslash character
|
||||
(\). The attribute values themselves are strings generated by the
|
||||
output function associated with each attribute type. The output
|
||||
function for a type should not try to generate the backslash
|
||||
character; this will be handled by <command>COPY</command> itself.
|
||||
</para>
|
||||
<para>
|
||||
The actual format for each instance is
|
||||
<programlisting>
|
||||
&lt;attr1&gt;&lt;<replaceable class="parameter">separator</replaceable>&gt;&lt;attr2&gt;&lt;<replaceable class="parameter">separator</replaceable>&gt;...&lt;<replaceable class="parameter">separator</replaceable>&gt;&lt;attr<replaceable class="parameter">n</replaceable>&gt;&lt;newline&gt;</programlisting>
|
||||
The oid is placed on the beginning of the line
|
||||
if <function>WITH OIDS</function> is specified.
|
||||
</para>
|
||||
<para>
|
||||
If <command>COPY</command> is sending its output to standard
|
||||
output instead of a file, it will send a backslash (\) and a period
|
||||
(.) followed immediately by a newline, on a separate line,
|
||||
when it is done. Similarly, if <command>COPY</command> is reading
|
||||
from standard input, it will expect a backslash (\) and a period
|
||||
(.) followed by a newline, as the first three characters on a
|
||||
line, to denote end-of-file. However, if a true EOF is encountered
first, <command>COPY</command> will terminate, followed by the
backend itself.
|
||||
</para>
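<para>
As a rough illustration of the end-of-data convention described above,
the following Python sketch reads lines from a text stream until it sees
the backslash-period marker or a true EOF. The function name
<function>read_until_marker</function> and the in-memory stream are
assumptions made for this example only; it does not show how the backend
or libpq actually implements the check.
</para>

```python
import io

# Backslash, period, newline: the three characters that end COPY input.
END_OF_DATA = "\\.\n"


def read_until_marker(stream):
    """Yield data lines until the end-of-data marker or a true EOF."""
    for line in stream:
        if line == END_OF_DATA:
            return  # marker seen: stop without yielding it
        yield line


# Example: anything after the marker line is never read as data.
source = io.StringIO("AF\tAFGHANISTAN\nAL\tALBANIA\n\\.\nignored\n")
rows = list(read_until_marker(source))
```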
|
||||
<para>
|
||||
The backslash character has special meaning. NULL attributes are
|
||||
output as \N. A literal backslash character is output as two
|
||||
consecutive backslashes. A literal tab character is represented
|
||||
as a backslash and a tab. A literal newline character is
|
||||
represented as a backslash and a newline. When loading ASCII data
|
||||
not generated by PostgreSQL, you will need to convert backslash
|
||||
characters (\) to double-backslashes (\\) to ensure that they are loaded
|
||||
properly.
|
||||
</para>
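<para>
When preparing ASCII data outside PostgreSQL, the escaping rules above
could be applied with something like the following Python sketch. The
helper names <function>escape_field</function> and
<function>format_row</function> are hypothetical, and the sketch assumes
that only the backslash, the newline, and the delimiter in use need a
preceding backslash, with <literal>None</literal> standing in for a
NULL attribute.
</para>

```python
def escape_field(value, delimiter="\t"):
    """Return one attribute value in ASCII copy form (sketch only)."""
    if value is None:
        return "\\N"  # NULL attributes are written as backslash-N
    out = []
    for ch in value:
        # Backslash, newline, and the delimiter get a preceding backslash.
        if ch in ("\\", "\n", delimiter):
            out.append("\\")
        out.append(ch)
    return "".join(out)


def format_row(values, delimiter="\t"):
    """Join escaped attributes with the delimiter and end the line."""
    return delimiter.join(escape_field(v, delimiter) for v in values) + "\n"
```

<para>
For instance, <literal>format_row(["AF", "AFGHANISTAN", None])</literal>
would produce a line with two tab-separated values followed by \N.
</para>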
|
||||
</refsect2>
|
||||
<refsect2>
|
||||
<refsect2info>
|
||||
<date>1998-05-04</date>
|
||||
</refsect2info>
|
||||
<title>Binary copy format</title>
|
||||
<para>
|
||||
In the case of <command>COPY BINARY</command>, the first four
|
||||
bytes in the file will be the number of instances in the file. If
|
||||
this number is zero, the <command>COPY BINARY</command> command
|
||||
will read until end of file is encountered. Otherwise, it will
|
||||
stop reading when this number of instances has been read.
|
||||
Remaining data in the file will be ignored.
|
||||
</para>
|
||||
<para>
|
||||
The format for each instance in the file is as follows. Note that
|
||||
this format must be followed <emphasis>exactly</emphasis>.
|
||||
Unsigned four-byte integer quantities are called uint32 in the
|
||||
table below.
|
||||
</para>
|
||||
<table frame="all">
|
||||
<title>Contents of a binary copy file</title>
|
||||
<tgroup cols="2" colsep="1" rowsep="1" align="center">
|
||||
<COLSPEC COLNAME="col1">
|
||||
<COLSPEC COLNAME="col2">
|
||||
<spanspec namest="col1" nameend="col2" spanname="subhead">
|
||||
<tbody>
|
||||
<row>
|
||||
<entry align="center" spanname="subhead">At the start of the file</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry>uint32</entry>
|
||||
<entry>number of tuples</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry align="center" spanname="subhead">For each tuple</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry>uint32</entry>
|
||||
<entry>total length of tuple data</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry>uint32</entry>
|
||||
<entry>oid (if specified)</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry>uint32</entry>
|
||||
<entry>number of null attributes</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry>[uint32</entry>
|
||||
<entry>attribute number of first null attribute, counting from 0</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry>...</entry>
|
||||
<entry>...</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry>uint32</entry>
|
||||
<entry>attribute number of last null attribute]</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry>-</entry>
|
||||
<entry>&lt;tuple data&gt;</entry>
|
||||
</row>
|
||||
</tbody>
|
||||
</tgroup>
|
||||
</table>
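<para>
The layout in the table can be walked with a short reader such as the
Python sketch below. It is only an illustration under stated
assumptions: the function name <function>read_binary_copy</function> is
hypothetical, the byte order defaults to little-endian (as on an Intel
machine), and the tuple data is returned as raw bytes without decoding
individual attributes.
</para>

```python
import struct


def read_binary_copy(data, with_oids=False, byteorder="<"):
    """Split a COPY BINARY byte string into per-tuple records.

    byteorder is a struct prefix; "<" (little-endian) is an assumption
    that matches an Intel machine. Tuple data is returned undecoded.
    """
    u32 = struct.Struct(byteorder + "I")
    pos = 0

    def next_u32():
        nonlocal pos
        (value,) = u32.unpack_from(data, pos)
        pos += u32.size
        return value

    count = next_u32()  # number of tuples; zero means "read until EOF"
    tuples = []
    while pos < len(data) and (count == 0 or len(tuples) < count):
        length = next_u32()                      # total length of tuple data
        oid = next_u32() if with_oids else None  # present only WITH OIDS
        null_count = next_u32()                  # number of null attributes
        # Attribute numbers of the null attributes, counting from 0.
        nulls = [next_u32() for _ in range(null_count)]
        tuples.append({"oid": oid, "nulls": nulls,
                       "data": data[pos:pos + length]})
        pos += length
    return tuples  # any bytes after the last counted tuple are ignored
```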
|
||||
|
||||
</refsect2>
|
||||
<refsect2>
|
||||
<refsect2info>
|
||||
<date>1998-05-04</date>
|
||||
</refsect2info>
|
||||
<title>Alignment of binary data</title>
|
||||
<para>
|
||||
On Sun-3s, 2-byte attributes are aligned on two-byte boundaries,
|
||||
and all larger attributes are aligned on four-byte boundaries.
|
||||
Character attributes are aligned on single-byte boundaries. On
|
||||
other machines, all attributes larger than 1 byte are aligned on
|
||||
four-byte boundaries. Note that variable length attributes are
|
||||
preceded by the attribute's length; arrays are simply contiguous
|
||||
streams of the array element type.
|
||||
</para>
|
||||
</refsect2>
|
||||
</refsect1>
|
||||
|
||||
|
||||
<REFSECT1 ID="R1-SQL-COPY-3">
|
||||
<TITLE>
|
||||
Usage
|
||||
</TITLE>
|
||||
<PARA>
|
||||
To copy a table to standard output, using | as a delimiter:
|
||||
</PARA>
|
||||
<ProgramListing>
|
||||
COPY country TO stdout USING DELIMITERS '|';
|
||||
</ProgramListing>
|
||||
<PARA>
|
||||
To copy data from a Unix file into a table:
|
||||
</PARA>
|
||||
<ProgramListing>
|
||||
COPY country FROM '/usr1/proj/bray/sql/country_data';
|
||||
</ProgramListing>
|
||||
<PARA>
|
||||
A sample of data suitable for copying into a table from <filename>stdin</filename> (so it
|
||||
has the termination sequence on the last line):
|
||||
</PARA>
|
||||
<ProgramListing>
|
||||
AF AFGHANISTAN
|
||||
AL ALBANIA
|
||||
DZ ALGERIA
|
||||
...
|
||||
ZM ZAMBIA
|
||||
ZW ZIMBABWE
|
||||
\.
|
||||
</ProgramListing>
|
||||
<PARA>
|
||||
The same data, output in binary format on a Linux Intel machine.
|
||||
The data is shown after filtering through the Unix utility <command>od -c</command>. The table has
|
||||
three fields; the first is <classname>char(2)</classname> and the second is <classname>text</classname>. All the
rows have a null value in the third field. Notice how the <classname>char(2)</classname>
|
||||
field is padded with nulls to four bytes and the text field is
|
||||
preceded by its length:
|
||||
</PARA>
|
||||
<ProgramListing>
|
||||
355 \0 \0 \0 027 \0 \0 \0 001 \0 \0 \0 002 \0 \0 \0
|
||||
006 \0 \0 \0 A F \0 \0 017 \0 \0 \0 A F G H
|
||||
A N I S T A N 023 \0 \0 \0 001 \0 \0 \0 002
|
||||
\0 \0 \0 006 \0 \0 \0 A L \0 \0 \v \0 \0 \0 A
|
||||
L B A N I A 023 \0 \0 \0 001 \0 \0 \0 002 \0
|
||||
\0 \0 006 \0 \0 \0 D Z \0 \0 \v \0 \0 \0 A L
|
||||
G E R I A
|
||||
... \n \0 \0 \0 Z A M B I A 024 \0
|
||||
\0 \0 001 \0 \0 \0 002 \0 \0 \0 006 \0 \0 \0 Z W
|
||||
\0 \0 \f \0 \0 \0 Z I M B A B W E
|
||||
</ProgramListing>
|
||||
</refsect1>
|
||||
|
||||
<refsect1 ID="R1-SQL-COPY-4">
|
||||
<title>See also</title>
|
||||
<para>
|
||||
insert(l), create table(l), vacuum(l), libpq.
|
||||
</para>
|
||||
</refsect1>
|
||||
|
||||
<refsect1 ID="R1-SQL-COPY-5">
|
||||
<title>Bugs</title>
|
||||
<para>
|
||||
<command>COPY</command> stops operation at the first error. This
should not lead to problems in the event of a copy to, but the
target relation will, of course, be partially modified in a copy
from. The <command>VACUUM</command> command should be used to clean up
after a failed copy.
|
||||
</para>
|
||||
<para>
|
||||
Because the Postgres backend's current directory is not the same as
the user's working directory, copying to a file "foo" (without
additional path information) may yield unexpected results. In this
case, "foo" will wind up in $PGDATA/foo. In general, the full
pathname should be used when specifying files to be copied.
|
||||
</para>
|
||||
<para>
|
||||
Files used as arguments to the copy command must reside on or be
|
||||
accessible to the database server machine by being either on
|
||||
local disks or on a networked file system.
|
||||
</para>
|
||||
<para>
|
||||
When a TCP/IP connection from one machine to another is used, and a
|
||||
target file is specified, the target file will be written on the
|
||||
machine where the backend is running rather than the user's
|
||||
machine.
|
||||
</para>
|
||||
</refsect1>
|
||||
|
||||
<REFSECT1 ID="R1-SQL-COPY-6">
|
||||
<TITLE>
|
||||
Compatibility
|
||||
</TITLE>
|
||||
<PARA>
|
||||
</PARA>
|
||||
|
||||
<REFSECT2 ID="R2-SQL-COPY-4">
|
||||
<REFSECT2INFO>
|
||||
<DATE>1998-04-15</DATE>
|
||||
</REFSECT2INFO>
|
||||
<TITLE>
|
||||
SQL92
|
||||
</TITLE>
|
||||
<PARA>
|
||||
There is no COPY statement in SQL92.
|
||||
</PARA>
|
||||
</refsect2>
|
||||
</refsect1>
|
||||
</REFENTRY>
|
||||
|
||||
<!-- Keep this comment at the end of the file
|
||||
Local variables:
|
||||
mode: sgml
|
||||
sgml-omittag:t
|
||||
sgml-shorttag:t
|
||||
sgml-minimize-attributes:nil
|
||||
sgml-always-quote-attributes:t
|
||||
sgml-indent-step:1
|
||||
sgml-indent-data:t
|
||||
sgml-parent-document:nil
|
||||
sgml-default-dtd-file:"../reference.ced"
|
||||
sgml-exposed-tags:nil
|
||||
sgml-local-catalogs:"/usr/lib/sgml/catalog"
|
||||
sgml-local-ecat-files:nil
|
||||
End:
|
||||
-->
|
||||