mirror of
https://github.com/postgres/postgres.git
synced 2025-04-21 12:05:57 +03:00
Support an optional asynchronous commit mode, in which we don't flush WAL
before reporting a transaction committed. Data consistency is still guaranteed (unlike setting fsync = off), but a crash may lose the effects of the last few transactions. Patch by Simon, some editorialization by Tom.
parent c722628a43
commit 4a78cdeb6b
@ -1,4 +1,4 @@
|
|||||||
<!-- $PostgreSQL: pgsql/doc/src/sgml/config.sgml,v 1.133 2007/07/24 04:54:08 tgl Exp $ -->
|
<!-- $PostgreSQL: pgsql/doc/src/sgml/config.sgml,v 1.134 2007/08/01 22:45:07 tgl Exp $ -->
|
||||||
|
|
||||||
<chapter Id="runtime-config">
|
<chapter Id="runtime-config">
|
||||||
<title>Server Configuration</title>
|
<title>Server Configuration</title>
|
||||||
@ -1307,6 +1307,13 @@ SET ENABLE_SEQSCAN TO OFF;
|
|||||||
disabling <varname>fsync</varname>.
|
disabling <varname>fsync</varname>.
|
||||||
</para>
|
</para>
|
||||||
|
|
||||||
<para>
In many situations, turning off <xref linkend="guc-synchronous-commit">
for noncritical transactions can provide much of the potential
performance benefit of turning off <varname>fsync</varname>, without
the attendant risks of data corruption.
</para>
|
|
||||||
<para>
|
<para>
|
||||||
This parameter can only be set in the <filename>postgresql.conf</>
|
This parameter can only be set in the <filename>postgresql.conf</>
|
||||||
file or on the server command line.
|
file or on the server command line.
|
||||||
@ -1315,6 +1322,38 @@ SET ENABLE_SEQSCAN TO OFF;
|
|||||||
</para>
|
</para>
|
||||||
</listitem>
|
</listitem>
|
||||||
</varlistentry>
|
</varlistentry>
|
||||||
|
|
||||||
<varlistentry id="guc-synchronous-commit" xreflabel="synchronous_commit">
<term><varname>synchronous_commit</varname> (<type>boolean</type>)</term>
<indexterm>
<primary><varname>synchronous_commit</> configuration parameter</primary>
</indexterm>
<listitem>
<para>
Specifies whether transaction commit will wait for WAL records
to be written to disk before the command returns a <quote>success</>
indication to the client. The default, and safe, setting is
<literal>on</>. When <literal>off</>, there can be a delay between
when success is reported to the client and when the transaction is
really guaranteed to be safe against a server crash. (The maximum
delay is three times <xref linkend="guc-wal-writer-delay">.) Unlike
<xref linkend="guc-fsync">, setting this parameter to <literal>off</>
does not create any risk of database inconsistency: a crash might
result in some recent allegedly-committed transactions being lost, but
the database state will be just the same as if those transactions had
been aborted cleanly. So, turning <varname>synchronous_commit</> off
can be a useful alternative when performance is more important than
exact certainty about the durability of a transaction. For more
discussion see <xref linkend="wal-async-commit">.
</para>
<para>
This parameter can be changed at any time; the behavior for any
one transaction is determined by the setting in effect when it
commits. It is therefore possible, and useful, to have some
transactions commit synchronously and others asynchronously.
</para>
</listitem>
</varlistentry>
|
|
||||||
<varlistentry id="guc-wal-sync-method" xreflabel="wal_sync_method">
|
<varlistentry id="guc-wal-sync-method" xreflabel="wal_sync_method">
|
||||||
<term><varname>wal_sync_method</varname> (<type>string</type>)</term>
|
<term><varname>wal_sync_method</varname> (<type>string</type>)</term>
|
||||||
|
@ -1,10 +1,10 @@
|
|||||||
<!-- $PostgreSQL: pgsql/doc/src/sgml/wal.sgml,v 1.44 2007/06/28 00:02:37 tgl Exp $ -->
|
<!-- $PostgreSQL: pgsql/doc/src/sgml/wal.sgml,v 1.45 2007/08/01 22:45:07 tgl Exp $ -->
|
||||||
|
|
||||||
<chapter id="wal">
|
<chapter id="wal">
|
||||||
<title>Reliability and the Write-Ahead Log</title>
|
<title>Reliability and the Write-Ahead Log</title>
|
||||||
|
|
||||||
<para>
|
<para>
|
||||||
This chapter explain how the Write-Ahead Log is used to obtain
|
This chapter explains how the Write-Ahead Log is used to obtain
|
||||||
efficient, reliable operation.
|
efficient, reliable operation.
|
||||||
</para>
|
</para>
|
||||||
|
|
||||||
@ -71,7 +71,7 @@
|
|||||||
write caches. At the drive level, disable write-back caching if the
|
write caches. At the drive level, disable write-back caching if the
|
||||||
drive cannot guarantee the data will be written before shutdown.
|
drive cannot guarantee the data will be written before shutdown.
|
||||||
</para>
|
</para>
|
||||||
|
|
||||||
<para>
|
<para>
|
||||||
Another risk of data loss is posed by the disk platter write
|
Another risk of data loss is posed by the disk platter write
|
||||||
operations themselves. Disk platters are divided into sectors,
|
operations themselves. Disk platters are divided into sectors,
|
||||||
@ -86,11 +86,11 @@
|
|||||||
disk. By doing this, during crash recovery <productname>PostgreSQL</> can
|
disk. By doing this, during crash recovery <productname>PostgreSQL</> can
|
||||||
restore partially-written pages. If you have a battery-backed disk
|
restore partially-written pages. If you have a battery-backed disk
|
||||||
controller or file-system software that prevents partial page writes
|
controller or file-system software that prevents partial page writes
|
||||||
(e.g., ReiserFS 4), you can turn off this page imaging by using the
|
(e.g., ReiserFS 4), you can turn off this page imaging by using the
|
||||||
<xref linkend="guc-full-page-writes"> parameter.
|
<xref linkend="guc-full-page-writes"> parameter.
|
||||||
</para>
|
</para>
|
||||||
</sect1>
|
</sect1>
|
||||||
|
|
||||||
<sect1 id="wal-intro">
|
<sect1 id="wal-intro">
|
||||||
<title>Write-Ahead Logging (<acronym>WAL</acronym>)</title>
|
<title>Write-Ahead Logging (<acronym>WAL</acronym>)</title>
|
||||||
|
|
||||||
@ -105,12 +105,12 @@
|
|||||||
|
|
||||||
<para>
|
<para>
|
||||||
<firstterm>Write-Ahead Logging</firstterm> (<acronym>WAL</acronym>)
|
<firstterm>Write-Ahead Logging</firstterm> (<acronym>WAL</acronym>)
|
||||||
is a standard approach to transaction logging. Its detailed
|
is a standard method for ensuring data integrity. A detailed
|
||||||
description can be found in most (if not all) books about
|
description can be found in most (if not all) books about
|
||||||
transaction processing. Briefly, <acronym>WAL</acronym>'s central
|
transaction processing. Briefly, <acronym>WAL</acronym>'s central
|
||||||
concept is that changes to data files (where tables and indexes
|
concept is that changes to data files (where tables and indexes
|
||||||
reside) must be written only after those changes have been logged,
|
reside) must be written only after those changes have been logged,
|
||||||
that is, when log records describing the changes have been flushed
|
that is, after log records describing the changes have been flushed
|
||||||
to permanent storage. If we follow this procedure, we do not need
|
to permanent storage. If we follow this procedure, we do not need
|
||||||
to flush data pages to disk on every transaction commit, because we
|
to flush data pages to disk on every transaction commit, because we
|
||||||
know that in the event of a crash we will be able to recover the
|
know that in the event of a crash we will be able to recover the
|
||||||
@ -120,17 +120,17 @@
|
|||||||
</para>
|
</para>
|
||||||
|
|
||||||
<para>
|
<para>
|
||||||
A major benefit of using <acronym>WAL</acronym> is a
|
Using <acronym>WAL</acronym> results in a
|
||||||
significantly reduced number of disk writes, because only the log
|
significantly reduced number of disk writes, because only the log
|
||||||
file needs to be flushed to disk at the time of transaction
|
file needs to be flushed to disk to guarantee that a transaction is
|
||||||
commit, rather than every data file changed by the transaction.
|
committed, rather than every data file changed by the transaction.
|
||||||
In multiuser environments, commits of many transactions
|
The log file is written sequentially,
|
||||||
can be accomplished with a single <function>fsync</function> of
|
|
||||||
the log file. Furthermore, the log file is written sequentially,
|
|
||||||
and so the cost of syncing the log is much less than the cost of
|
and so the cost of syncing the log is much less than the cost of
|
||||||
flushing the data pages. This is especially true for servers
|
flushing the data pages. This is especially true for servers
|
||||||
handling many small transactions touching different parts of the data
|
handling many small transactions touching different parts of the data
|
||||||
store.
|
store. Furthermore, when the server is processing many small concurrent
|
||||||
|
transactions, one <function>fsync</function> of the log file may
|
||||||
|
suffice to commit many transactions.
|
||||||
</para>
|
</para>
|
||||||
|
|
||||||
<para>
|
<para>
|
||||||
@ -147,6 +147,139 @@
|
|||||||
</para>
|
</para>
|
||||||
</sect1>
|
</sect1>
|
||||||
|
|
||||||
<sect1 id="wal-async-commit">
<title>Asynchronous Commit</title>

<indexterm>
<primary>synchronous commit</primary>
</indexterm>

<indexterm>
<primary>asynchronous commit</primary>
</indexterm>

<para>
<firstterm>Asynchronous commit</> is an option that allows transactions
to complete more quickly, at the cost that the most recent transactions may
be lost if the database should crash. In many applications this is an
acceptable tradeoff.
</para>

<para>
As described in the previous section, transaction commit is normally
<firstterm>synchronous</>: the server waits for the transaction's
<acronym>WAL</acronym> records to be flushed to permanent storage
before returning a success indication to the client. The client is
therefore guaranteed that a transaction reported to be committed will
be preserved, even in the event of a server crash immediately after.
However, for short transactions this delay is a major component of the
total transaction time. Selecting asynchronous commit mode means that
the server returns success as soon as the transaction is logically
completed, before the <acronym>WAL</acronym> records it generated have
actually made their way to disk. This can provide a significant boost
in throughput for small transactions.
</para>

<para>
Asynchronous commit introduces the risk of data loss. There is a short
time window between the report of transaction completion to the client
and the time that the transaction is truly committed (that is, it is
guaranteed not to be lost if the server crashes). Thus asynchronous
commit should not be used if the client will take external actions
relying on the assumption that the transaction will be remembered.
As an example, a bank would certainly not use asynchronous commit for
a transaction recording an ATM's dispensing of cash. But in many
scenarios, such as event logging, there is no need for a strong
guarantee of this kind.
</para>

<para>
The risk that is taken by using asynchronous commit is of data loss,
not data corruption. If the database should crash, it will recover
by replaying <acronym>WAL</acronym> up to the last record that was
flushed. The database will therefore be restored to a self-consistent
state, but any transactions that were not yet flushed to disk will
not be reflected in that state. The net effect is therefore loss of
the last few transactions. Because the transactions are replayed in
commit order, no inconsistency can be introduced &mdash; for example,
if transaction B made changes relying on the effects of a previous
transaction A, it is not possible for A's effects to be lost while B's
effects are preserved.
</para>

<para>
The user can select the commit mode of each transaction, so that
it is possible to have both synchronous and asynchronous commit
transactions running concurrently. This allows flexible tradeoffs
between performance and certainty of transaction durability.
The commit mode is controlled by the user-settable parameter
<xref linkend="guc-synchronous-commit">, which can be changed in any of
the ways that a configuration parameter can be set. The mode used for
any one transaction depends on the value of
<varname>synchronous_commit</varname> when transaction commit begins.
</para>

<para>
Certain utility commands, for instance <command>DROP TABLE</>, are
forced to commit synchronously regardless of the setting of
<varname>synchronous_commit</varname>. This is to ensure consistency
between the server's filesystem and the logical state of the database.
The commands supporting two-phase commit, such as <command>PREPARE
TRANSACTION</>, are also always synchronous.
</para>

<para>
If the database crashes during the risk window between an
asynchronous commit and the writing of the transaction's
<acronym>WAL</acronym> records,
then changes made during that transaction <emphasis>will</> be lost.
The duration of the
risk window is limited because a background process (the <quote>wal
writer</>) flushes unwritten <acronym>WAL</acronym> records to disk
every <xref linkend="guc-wal-writer-delay"> milliseconds.
The actual maximum duration of the risk window is three times
<varname>wal_writer_delay</varname> because the wal writer is
designed to favor writing whole pages at a time during busy periods.
</para>

<caution>
<para>
An immediate-mode shutdown is equivalent to a server crash, and will
therefore cause loss of any unflushed asynchronous commits.
</para>
</caution>

<para>
Asynchronous commit provides behavior different from setting
<xref linkend="guc-fsync"> = off.
<varname>fsync</varname> is a server-wide
setting that will alter the behavior of all transactions. It disables
all logic within <productname>PostgreSQL</> that attempts to synchronize
writes to different portions of the database, and therefore a system
crash (that is, a hardware or operating system crash, not a failure of
<productname>PostgreSQL</> itself) could result in arbitrarily bad
corruption of the database state. In many scenarios, asynchronous
commit provides most of the performance improvement that could be
obtained by turning off <varname>fsync</varname>, but without the risk
of data corruption.
</para>

<para>
<xref linkend="guc-commit-delay"> also sounds very similar to
asynchronous commit, but it is actually a synchronous commit method
(in fact, <varname>commit_delay</varname> is ignored during an
asynchronous commit). <varname>commit_delay</varname> causes a delay
just before a synchronous commit attempts to flush
<acronym>WAL</acronym> to disk, in the hope that a single flush
executed by one such transaction can also serve other transactions
committing at about the same time. Setting <varname>commit_delay</varname>
can only help when there are many concurrently committing transactions,
and it is difficult to tune it to a value that actually helps rather
than hurting throughput.
</para>

</sect1>
|
|
||||||
<sect1 id="wal-configuration">
|
<sect1 id="wal-configuration">
|
||||||
<title><acronym>WAL</acronym> Configuration</title>
|
<title><acronym>WAL</acronym> Configuration</title>
|
||||||
|
|
||||||
@ -188,13 +321,13 @@
|
|||||||
<varname>checkpoint_timeout</varname> causes checkpoints to be done
|
<varname>checkpoint_timeout</varname> causes checkpoints to be done
|
||||||
more often. This allows faster after-crash recovery (since less work
|
more often. This allows faster after-crash recovery (since less work
|
||||||
will need to be redone). However, one must balance this against the
|
will need to be redone). However, one must balance this against the
|
||||||
increased cost of flushing dirty data pages more often. If
|
increased cost of flushing dirty data pages more often. If
|
||||||
<xref linkend="guc-full-page-writes"> is set (as is the default), there is
|
<xref linkend="guc-full-page-writes"> is set (as is the default), there is
|
||||||
another factor to consider. To ensure data page consistency,
|
another factor to consider. To ensure data page consistency,
|
||||||
the first modification of a data page after each checkpoint results in
|
the first modification of a data page after each checkpoint results in
|
||||||
logging the entire page content. In that case,
|
logging the entire page content. In that case,
|
||||||
a smaller checkpoint interval increases the volume of output to the WAL log,
|
a smaller checkpoint interval increases the volume of output to the WAL log,
|
||||||
partially negating the goal of using a smaller interval,
|
partially negating the goal of using a smaller interval,
|
||||||
and in any case causing more disk I/O.
|
and in any case causing more disk I/O.
|
||||||
</para>
|
</para>
|
||||||
|
|
||||||
@ -206,8 +339,8 @@
|
|||||||
don't happen too often. As a simple sanity check on your checkpointing
|
don't happen too often. As a simple sanity check on your checkpointing
|
||||||
parameters, you can set the <xref linkend="guc-checkpoint-warning">
|
parameters, you can set the <xref linkend="guc-checkpoint-warning">
|
||||||
parameter. If checkpoints happen closer together than
|
parameter. If checkpoints happen closer together than
|
||||||
<varname>checkpoint_warning</> seconds,
|
<varname>checkpoint_warning</> seconds,
|
||||||
a message will be output to the server log recommending increasing
|
a message will be output to the server log recommending increasing
|
||||||
<varname>checkpoint_segments</varname>. Occasional appearance of such
|
<varname>checkpoint_segments</varname>. Occasional appearance of such
|
||||||
a message is not cause for alarm, but if it appears often then the
|
a message is not cause for alarm, but if it appears often then the
|
||||||
checkpoint control parameters should be increased. Bulk operations such
|
checkpoint control parameters should be increased. Bulk operations such
|
||||||
@ -280,9 +413,9 @@
|
|||||||
modifying the configuration parameter <xref
|
modifying the configuration parameter <xref
|
||||||
linkend="guc-wal-buffers">. The default number of <acronym>WAL</acronym>
|
linkend="guc-wal-buffers">. The default number of <acronym>WAL</acronym>
|
||||||
buffers is 8. Increasing this value will
|
buffers is 8. Increasing this value will
|
||||||
correspondingly increase shared memory usage. When
|
correspondingly increase shared memory usage. When
|
||||||
<xref linkend="guc-full-page-writes"> is set and the system is very busy,
|
<xref linkend="guc-full-page-writes"> is set and the system is very busy,
|
||||||
setting this value higher will help smooth response times during the
|
setting this value higher will help smooth response times during the
|
||||||
period immediately following each checkpoint.
|
period immediately following each checkpoint.
|
||||||
</para>
|
</para>
|
||||||
|
|
||||||
@ -307,7 +440,7 @@
|
|||||||
<para>
|
<para>
|
||||||
The <xref linkend="guc-wal-sync-method"> parameter determines how
|
The <xref linkend="guc-wal-sync-method"> parameter determines how
|
||||||
<productname>PostgreSQL</productname> will ask the kernel to force
|
<productname>PostgreSQL</productname> will ask the kernel to force
|
||||||
<acronym>WAL</acronym> updates out to disk.
|
<acronym>WAL</acronym> updates out to disk.
|
||||||
All the options should be the same as far as reliability goes,
|
All the options should be the same as far as reliability goes,
|
||||||
but it's quite platform-specific which one will be the fastest.
|
but it's quite platform-specific which one will be the fastest.
|
||||||
Note that this parameter is irrelevant if <varname>fsync</varname>
|
Note that this parameter is irrelevant if <varname>fsync</varname>
|
||||||
@ -360,7 +493,7 @@
|
|||||||
The aim of <acronym>WAL</acronym>, to ensure that the log is
|
The aim of <acronym>WAL</acronym>, to ensure that the log is
|
||||||
written before database records are altered, can be subverted by
|
written before database records are altered, can be subverted by
|
||||||
disk drives<indexterm><primary>disk drive</></> that falsely report a
|
disk drives<indexterm><primary>disk drive</></> that falsely report a
|
||||||
successful write to the kernel,
|
successful write to the kernel,
|
||||||
when in fact they have only cached the data and not yet stored it
|
when in fact they have only cached the data and not yet stored it
|
||||||
on the disk. A power failure in such a situation might still lead to
|
on the disk. A power failure in such a situation might still lead to
|
||||||
irrecoverable data corruption. Administrators should try to ensure
|
irrecoverable data corruption. Administrators should try to ensure
|
||||||
|
@ -1,4 +1,4 @@
|
|||||||
$PostgreSQL: pgsql/src/backend/access/transam/README,v 1.5 2006/03/31 23:32:05 tgl Exp $
|
$PostgreSQL: pgsql/src/backend/access/transam/README,v 1.6 2007/08/01 22:45:07 tgl Exp $
|
||||||
|
|
||||||
The Transaction System
|
The Transaction System
|
||||||
----------------------
|
----------------------
|
||||||
@ -409,4 +409,113 @@ two separate WAL records. The replay code has to remember "unfinished" split
|
|||||||
operations, and match them up to subsequent insertions in the parent level.
|
operations, and match them up to subsequent insertions in the parent level.
|
||||||
If no matching insert has been found by the time the WAL replay ends, the
|
If no matching insert has been found by the time the WAL replay ends, the
|
||||||
replay code has to do the insertion on its own to restore the index to
|
replay code has to do the insertion on its own to restore the index to
|
||||||
consistency.
|
consistency. Such insertions occur after WAL is operational, so they can
|
||||||
|
and should write WAL records for the additional generated actions.
|
||||||
|
|
||||||
|
|
||||||
Asynchronous Commit
-------------------

As of PostgreSQL 8.3 it is possible to perform asynchronous commits - i.e.,
we don't wait while the WAL record for the commit is fsync'ed.
We perform an asynchronous commit when synchronous_commit = off. Instead
of performing an XLogFlush() up to the LSN of the commit, we merely note
the LSN in shared memory. The backend then continues with other work.
We record the LSN only for an asynchronous commit, not an abort; there's
never any need to flush an abort record, since the presumption after a
crash would be that the transaction aborted anyway.
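
The commit-time choice described above can be sketched as follows. This is
a minimal illustration, not the code from this commit: WalState,
flush_wal(), record_commit() and commit_is_durable() are hypothetical
names, an LSN is reduced to a plain integer, and "shared memory" is just a
struct field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t Lsn;           /* simplified stand-in for a WAL position */

typedef struct WalState
{
    Lsn flushed_upto;           /* WAL is durable up to this position */
    Lsn async_commit_lsn;       /* latest async commit LSN that was noted */
} WalState;

/* Hypothetical stand-in for a WAL flush: make WAL durable up to lsn. */
static void
flush_wal(WalState *wal, Lsn lsn)
{
    assert(wal != NULL);
    if (lsn > wal->flushed_upto)
        wal->flushed_upto = lsn;
}

/* Record a commit whose commit record ends at commit_lsn. */
static void
record_commit(WalState *wal, Lsn commit_lsn, bool synchronous_commit)
{
    assert(wal != NULL);
    if (synchronous_commit)
        flush_wal(wal, commit_lsn);             /* wait for durability */
    else if (commit_lsn > wal->async_commit_lsn)
        wal->async_commit_lsn = commit_lsn;     /* only note the LSN */
    /* An abort would do neither: its record never needs a flush. */
}

/* A commit is safe against a crash once WAL is flushed past its LSN. */
static bool
commit_is_durable(const WalState *wal, Lsn commit_lsn)
{
    return commit_lsn <= wal->flushed_upto;
}
```

In this sketch an asynchronous commit returns without touching
flushed_upto, so it stays non-durable until some later flush passes its
LSN; a later synchronous commit makes every earlier commit durable as a
side effect.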

We always force synchronous commit when the transaction is deleting
relations, to ensure the commit record is down to disk before the relations
are removed from the filesystem. Also, certain utility commands that have
non-roll-backable side effects (such as filesystem changes) force sync
commit to minimize the window in which the filesystem change has been made
but the transaction isn't guaranteed committed.

Every wal_writer_delay milliseconds, the walwriter process performs an
XLogBackgroundFlush(). This checks the location of the last completely
filled WAL page. If that has moved forwards, then we write all the changed
buffers up to that point, so that under full load we write only whole
buffers. If there has been a break in activity and the current WAL page is
the same as before, then we find out the LSN of the most recent
asynchronous commit, and flush up to that point, if required (i.e.,
if it's in the current WAL page). This arrangement in itself would
guarantee that an async commit record reaches disk during at worst the
second walwriter cycle after the transaction completes. However, we also
allow XLogFlush to flush full buffers "flexibly" (ie, not wrapping around
at the end of the circular WAL buffer area), so as to minimize the number
of writes issued under high load when multiple WAL pages are filled per
walwriter cycle. This makes the worst-case delay three walwriter cycles.
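
The per-cycle decision described above might look roughly like the sketch
below. It is a simplification under stated assumptions: WalWriterState,
background_flush_target() and worst_case_delay_ms() are hypothetical names,
the WAL page size is assumed to be 8192 bytes, and the circular-buffer
"flexible" write behavior is not modeled at all.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t Lsn;           /* simplified stand-in for a WAL position */

#define WAL_PAGE_SIZE 8192u     /* assumed WAL page size for this sketch */

typedef struct WalWriterState
{
    Lsn written_upto;           /* WAL already written out */
    Lsn insert_upto;            /* end of WAL generated so far */
    Lsn async_commit_lsn;       /* most recent asynchronous commit */
} WalWriterState;

/* End of the last completely filled WAL page at or before lsn. */
static Lsn
last_full_page_end(Lsn lsn)
{
    return lsn - (lsn % WAL_PAGE_SIZE);
}

/* Decide how far one walwriter cycle should write. */
static Lsn
background_flush_target(const WalWriterState *s)
{
    Lsn full;

    assert(s != NULL);
    full = last_full_page_end(s->insert_upto);
    if (full > s->written_upto)
        return full;                    /* busy: write whole pages only */
    if (s->async_commit_lsn > s->written_upto)
        return s->async_commit_lsn;     /* quiet: cover the latest async commit */
    return s->written_upto;             /* nothing new to write */
}

/* Upper bound on the risk window stated in the text: three cycles. */
static unsigned
worst_case_delay_ms(unsigned wal_writer_delay_ms)
{
    return 3u * wal_writer_delay_ms;
}
```

With an example wal_writer_delay of 200 ms (a value chosen only for
illustration), worst_case_delay_ms() gives 600 ms as the longest time an
asynchronous commit can remain unflushed.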

There are some other subtle points to consider with asynchronous commits.
First, for each page of CLOG we must remember the LSN of the latest commit
affecting the page, so that we can enforce the same flush-WAL-before-write
rule that we do for ordinary relation pages. Otherwise the record of the
commit might reach disk before the WAL record does. Again, abort records
need not factor into this consideration.

In fact, we store more than one LSN for each clog page. This relates to
the way we set transaction status hint bits during visibility tests.
We must not set a transaction-committed hint bit on a relation page and
have that record make it to disk prior to the WAL record of the commit.
Since visibility tests are normally made while holding buffer share locks,
we do not have the option of changing the page's LSN to guarantee WAL
synchronization. Instead, we defer the setting of the hint bit if we have
not yet flushed WAL as far as the LSN associated with the transaction.
This requires tracking the LSN of each unflushed async commit. It is
convenient to associate this data with clog buffers: because we will flush
WAL before writing a clog page, we know that we do not need to remember a
transaction's LSN longer than the clog page holding its commit status
remains in memory. However, the naive approach of storing an LSN for each
clog position is unattractive: the LSNs are 32x bigger than the two-bit
commit status fields, and so we'd need 256K of additional shared memory for
each 8K clog buffer page. We choose instead to store a smaller number of
LSNs per page, where each LSN is the highest LSN associated with any
transaction commit in a contiguous range of transaction IDs on that page.
This saves storage at the price of some possibly-unnecessary delay in
setting transaction hint bits.
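
The grouped-LSN idea can be sketched as below. This is an illustration
under assumptions, not the clog.c implementation: ClogPageLsns,
lsn_group_index(), note_async_commit() and can_set_hint_bit() are
hypothetical names, an 8K clog page is assumed (32768 two-bit status
fields), and the group size of 32 matches CLOG_XACTS_PER_LSN_GROUP.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t Lsn;                   /* simplified stand-in for an LSN */

#define XACTS_PER_PAGE      32768u      /* assumed: 8K page, 4 xacts per byte */
#define XACTS_PER_LSN_GROUP 32u         /* group size from the text */
#define LSNS_PER_PAGE       (XACTS_PER_PAGE / XACTS_PER_LSN_GROUP)

/* One cached LSN per group of consecutive transaction IDs on a page. */
typedef struct ClogPageLsns
{
    Lsn group_lsn[LSNS_PER_PAGE];
} ClogPageLsns;

/* Which LSN slot on its page covers this transaction ID. */
static unsigned
lsn_group_index(uint32_t xid)
{
    return (xid % XACTS_PER_PAGE) / XACTS_PER_LSN_GROUP;
}

/* Keep the highest commit LSN seen for the group containing xid. */
static void
note_async_commit(ClogPageLsns *page, uint32_t xid, Lsn commit_lsn)
{
    unsigned idx = lsn_group_index(xid);

    assert(page != NULL);
    if (page->group_lsn[idx] < commit_lsn)
        page->group_lsn[idx] = commit_lsn;
}

/* A hint bit may be set only once WAL is flushed past the group's LSN. */
static bool
can_set_hint_bit(const ClogPageLsns *page, uint32_t xid, Lsn flushed_upto)
{
    return page->group_lsn[lsn_group_index(xid)] <= flushed_upto;
}
```

Because transactions 33 and 40 fall in the same group, an unflushed async
commit of 40 also delays hinting for 33, which is the "possibly-unnecessary
delay" the text accepts in exchange for the smaller cache.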

How many transactions should share the same cached LSN (N)? If the
system's workload consists only of small async-commit transactions, then
it's reasonable to have N similar to the number of transactions per
walwriter cycle, since that is the granularity with which transactions will
become truly committed (and thus hintable) anyway. The worst case is where
a sync-commit xact shares a cached LSN with an async-commit xact that
commits a bit later; even though we paid to sync the first xact to disk,
we won't be able to hint its outputs until the second xact is sync'd, up to
three walwriter cycles later. This argues for keeping N (the group size)
as small as possible. For the moment we are setting the group size to 32,
which makes the LSN cache space the same size as the actual clog buffer
space (independently of BLCKSZ).
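
The size arithmetic behind the choice of 32 can be checked with a short
sketch. The helper names are hypothetical; the only assumptions are the
ones the text implies: two-bit status fields (four transactions per byte)
and LSNs that are 32 times larger, i.e. 8 bytes each.

```c
#include <assert.h>
#include <stddef.h>

#define XACTS_PER_BYTE 4u       /* two-bit commit status fields */
#define LSN_BYTES      8u       /* 32x the size of a two-bit field */

/* Number of transaction status fields on one clog page. */
static size_t
xacts_per_page(size_t blcksz)
{
    return blcksz * XACTS_PER_BYTE;
}

/* Naive approach: one LSN per transaction on the page. */
static size_t
naive_lsn_bytes(size_t blcksz)
{
    return xacts_per_page(blcksz) * LSN_BYTES;
}

/* Grouped approach: one LSN per group_size transactions. */
static size_t
grouped_lsn_bytes(size_t blcksz, size_t group_size)
{
    assert(group_size > 0);
    return xacts_per_page(blcksz) / group_size * LSN_BYTES;
}
```

For an 8K page the naive layout needs 256K of LSNs, while a group size of
32 needs exactly 8K. Since 4 * 8 / 32 = 1, the grouped cache is always the
same size as the page itself, whatever the block size.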

It is useful that we can run both synchronous and asynchronous commit
transactions concurrently, but the safety of this is perhaps not
immediately obvious. Assume we have two transactions, T1 and T2. The Log
Sequence Number (LSN) is the point in the WAL sequence where a transaction
commit is recorded, so LSN1 and LSN2 are the commit records of those
transactions. If T2 can see changes made by T1 then when T2 commits it
must be true that LSN2 follows LSN1. Thus when T2 commits it is certain
that all of the changes made by T1 are also now recorded in the WAL. This
is true whether T1 was asynchronous or synchronous. As a result, it is
safe for asynchronous commits and synchronous commits to work concurrently
without endangering data written by synchronous commits. Sub-transactions
are not important here since the final write to disk only occurs at the
commit of the top level transaction.

Changes to data blocks cannot reach disk unless WAL is flushed up to the
point of the LSN of the data blocks. Any attempt to write unsafe data to
disk will trigger a write which ensures the safety of all data written by
that and prior transactions. Data blocks and clog pages are both protected
by LSNs.
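
The flush-WAL-before-write rule stated above can be sketched as a guard in
front of every page write. Again this is only an illustration with
hypothetical names (Wal, Page, flush_wal(), write_page()); real buffer
management involves locking and I/O that are omitted here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t Lsn;           /* simplified stand-in for a WAL position */

typedef struct Wal
{
    Lsn      flushed_upto;      /* WAL is durable up to this position */
    unsigned flush_count;       /* how many flushes were forced */
} Wal;

typedef struct Page
{
    Lsn  page_lsn;              /* LSN of the last WAL record touching the page */
    bool on_disk;               /* has the page been written out? */
} Page;

/* Make WAL durable up to lsn, counting only flushes that do real work. */
static void
flush_wal(Wal *wal, Lsn lsn)
{
    assert(wal != NULL);
    if (lsn > wal->flushed_upto)
    {
        wal->flushed_upto = lsn;
        wal->flush_count++;
    }
}

/* Write a data or clog page, flushing WAL first if the page is "unsafe". */
static void
write_page(Wal *wal, Page *page)
{
    assert(wal != NULL && page != NULL);
    if (page->page_lsn > wal->flushed_upto)
        flush_wal(wal, page->page_lsn);     /* covers this and all prior records */
    page->on_disk = true;
}
```

Because WAL is flushed as a prefix, the flush forced by one page also makes
every earlier record durable, so a second page with a lower LSN can be
written without any further flush.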

Changes to a temp table are not WAL-logged, hence could reach disk in
advance of T1's commit, but we don't care since temp table contents don't
survive crashes anyway.

Database writes made via any of the paths we have introduced to avoid WAL
overhead for bulk updates are also safe. In these cases it's entirely
possible for the data to reach disk before T1's commit, because T1 will
fsync it down to disk without any sort of interlock, as soon as it finishes
the bulk update. However, all these paths are designed to write data that
no other transaction can see until after T1 commits. The situation is thus
not different from ordinary WAL-logged updates.
|
@ -14,17 +14,19 @@
|
|||||||
* CLOG page is initialized to zeroes. Other writes of CLOG come from
|
* CLOG page is initialized to zeroes. Other writes of CLOG come from
|
||||||
* recording of transaction commit or abort in xact.c, which generates its
|
* recording of transaction commit or abort in xact.c, which generates its
|
||||||
* own XLOG records for these events and will re-perform the status update
|
* own XLOG records for these events and will re-perform the status update
|
||||||
* on redo; so we need make no additional XLOG entry here. Also, the XLOG
|
* on redo; so we need make no additional XLOG entry here. For synchronous
|
||||||
* is guaranteed flushed through the XLOG commit record before we are called
|
* transaction commits, the XLOG is guaranteed flushed through the XLOG commit
|
||||||
* to log a commit, so the WAL rule "write xlog before data" is satisfied
|
* record before we are called to log a commit, so the WAL rule "write xlog
|
||||||
* automatically for commits, and we don't really care for aborts. Therefore,
|
* before data" is satisfied automatically. However, for async commits we
|
||||||
* we don't need to mark CLOG pages with LSN information; we have enough
|
* must track the latest LSN affecting each CLOG page, so that we can flush
|
||||||
* synchronization already.
|
* XLOG that far and satisfy the WAL rule. We don't have to worry about this
|
||||||
|
* for aborts (whether sync or async), since the post-crash assumption would
|
||||||
|
* be that such transactions failed anyway.
|
||||||
*
|
*
|
||||||
* Portions Copyright (c) 1996-2007, PostgreSQL Global Development Group
|
* Portions Copyright (c) 1996-2007, PostgreSQL Global Development Group
|
||||||
* Portions Copyright (c) 1994, Regents of the University of California
|
* Portions Copyright (c) 1994, Regents of the University of California
|
||||||
*
|
*
|
||||||
* $PostgreSQL: pgsql/src/backend/access/transam/clog.c,v 1.42 2007/01/05 22:19:23 momjian Exp $
|
* $PostgreSQL: pgsql/src/backend/access/transam/clog.c,v 1.43 2007/08/01 22:45:07 tgl Exp $
|
||||||
*
|
*
|
||||||
*-------------------------------------------------------------------------
|
*-------------------------------------------------------------------------
|
||||||
*/
|
*/
|
||||||
@ -57,6 +59,13 @@
|
|||||||
#define TransactionIdToByte(xid) (TransactionIdToPgIndex(xid) / CLOG_XACTS_PER_BYTE)
|
#define TransactionIdToByte(xid) (TransactionIdToPgIndex(xid) / CLOG_XACTS_PER_BYTE)
|
||||||
#define TransactionIdToBIndex(xid) ((xid) % (TransactionId) CLOG_XACTS_PER_BYTE)
|
#define TransactionIdToBIndex(xid) ((xid) % (TransactionId) CLOG_XACTS_PER_BYTE)
|
||||||
|
|
||||||
|
/* We store the latest async LSN for each group of transactions */
|
||||||
|
#define CLOG_XACTS_PER_LSN_GROUP 32 /* keep this a power of 2 */
|
||||||
|
#define CLOG_LSNS_PER_PAGE (CLOG_XACTS_PER_PAGE / CLOG_XACTS_PER_LSN_GROUP)
|
||||||
|
|
||||||
|
#define GetLSNIndex(slotno, xid) ((slotno) * CLOG_LSNS_PER_PAGE + \
|
||||||
|
((xid) % (TransactionId) CLOG_XACTS_PER_PAGE) / CLOG_XACTS_PER_LSN_GROUP)
|
||||||
|
|
||||||
|
|
||||||
/*
|
/*
|
||||||
* Link to shared-memory data structures for CLOG control
|
* Link to shared-memory data structures for CLOG control
|
||||||
@ -75,11 +84,16 @@ static void WriteTruncateXlogRec(int pageno);
|
|||||||
/*
|
/*
|
||||||
* Record the final state of a transaction in the commit log.
|
* Record the final state of a transaction in the commit log.
|
||||||
*
|
*
|
||||||
|
* lsn must be the WAL location of the commit record when recording an async
|
||||||
|
* commit. For a synchronous commit it can be InvalidXLogRecPtr, since the
|
||||||
|
* caller guarantees the commit record is already flushed in that case. It
|
||||||
|
* should be InvalidXLogRecPtr for abort cases, too.
|
||||||
|
*
|
||||||
* NB: this is a low-level routine and is NOT the preferred entry point
|
* NB: this is a low-level routine and is NOT the preferred entry point
|
||||||
* for most uses; TransactionLogUpdate() in transam.c is the intended caller.
|
* for most uses; TransactionLogUpdate() in transam.c is the intended caller.
|
||||||
*/
|
*/
|
||||||
void
|
void
|
||||||
TransactionIdSetStatus(TransactionId xid, XidStatus status)
|
TransactionIdSetStatus(TransactionId xid, XidStatus status, XLogRecPtr lsn)
|
||||||
{
|
{
|
||||||
int pageno = TransactionIdToPage(xid);
|
int pageno = TransactionIdToPage(xid);
|
||||||
int byteno = TransactionIdToByte(xid);
|
int byteno = TransactionIdToByte(xid);
|
||||||
@ -94,7 +108,16 @@ TransactionIdSetStatus(TransactionId xid, XidStatus status)
|
|||||||
|
|
||||||
LWLockAcquire(CLogControlLock, LW_EXCLUSIVE);
|
LWLockAcquire(CLogControlLock, LW_EXCLUSIVE);
|
||||||
|
|
||||||
slotno = SimpleLruReadPage(ClogCtl, pageno, xid);
|
/*
|
||||||
|
* If we're doing an async commit (ie, lsn is valid), then we must wait
|
||||||
|
* for any active write on the page slot to complete. Otherwise our
|
||||||
|
* update could reach disk in that write, which will not do since we
|
||||||
|
* mustn't let it reach disk until we've done the appropriate WAL flush.
|
||||||
|
* But when lsn is invalid, it's OK to scribble on a page while it is
|
||||||
|
* write-busy, since we don't care if the update reaches disk sooner than
|
||||||
|
* we think. Hence, pass write_ok = XLogRecPtrIsInvalid(lsn).
|
||||||
|
*/
|
||||||
|
slotno = SimpleLruReadPage(ClogCtl, pageno, XLogRecPtrIsInvalid(lsn), xid);
|
||||||
byteptr = ClogCtl->shared->page_buffer[slotno] + byteno;
|
byteptr = ClogCtl->shared->page_buffer[slotno] + byteno;
|
||||||
|
|
||||||
/* Current state should be 0, subcommitted or target state */
|
/* Current state should be 0, subcommitted or target state */
|
||||||
@ -110,22 +133,48 @@ TransactionIdSetStatus(TransactionId xid, XidStatus status)
|
|||||||
|
|
||||||
ClogCtl->shared->page_dirty[slotno] = true;
|
ClogCtl->shared->page_dirty[slotno] = true;
|
||||||
|
|
||||||
|
/*
|
||||||
|
* Update the group LSN if the transaction completion LSN is higher.
|
||||||
|
*
|
||||||
|
* Note: lsn will be invalid when supplied during InRecovery processing,
|
||||||
|
* so we don't need to do anything special to avoid LSN updates during
|
||||||
|
* recovery. After recovery completes the next clog change will set the
|
||||||
|
* LSN correctly.
|
||||||
|
*/
|
||||||
|
if (!XLogRecPtrIsInvalid(lsn))
|
||||||
|
{
|
||||||
|
int lsnindex = GetLSNIndex(slotno, xid);
|
||||||
|
|
||||||
|
if (XLByteLT(ClogCtl->shared->group_lsn[lsnindex], lsn))
|
||||||
|
ClogCtl->shared->group_lsn[lsnindex] = lsn;
|
||||||
|
}
|
||||||
|
|
||||||
LWLockRelease(CLogControlLock);
|
LWLockRelease(CLogControlLock);
|
||||||
}
|
}
|
||||||
|
|
||||||
/*
|
/*
|
||||||
* Interrogate the state of a transaction in the commit log.
|
* Interrogate the state of a transaction in the commit log.
|
||||||
*
|
*
|
||||||
|
* Aside from the actual commit status, this function returns (into *lsn)
|
||||||
|
* an LSN that is late enough to be able to guarantee that if we flush up to
|
||||||
|
* that LSN then we will have flushed the transaction's commit record to disk.
|
||||||
|
* The result is not necessarily the exact LSN of the transaction's commit
|
||||||
|
* record! For example, for long-past transactions (those whose clog pages
|
||||||
|
* already migrated to disk), we'll return InvalidXLogRecPtr. Also, because
|
||||||
|
* we group transactions on the same clog page to conserve storage, we might
|
||||||
|
* return the LSN of a later transaction that falls into the same group.
|
||||||
|
*
|
||||||
* NB: this is a low-level routine and is NOT the preferred entry point
|
* NB: this is a low-level routine and is NOT the preferred entry point
|
||||||
* for most uses; TransactionLogFetch() in transam.c is the intended caller.
|
* for most uses; TransactionLogFetch() in transam.c is the intended caller.
|
||||||
*/
|
*/
|
||||||
XidStatus
|
XidStatus
|
||||||
TransactionIdGetStatus(TransactionId xid)
|
TransactionIdGetStatus(TransactionId xid, XLogRecPtr *lsn)
|
||||||
{
|
{
|
||||||
int pageno = TransactionIdToPage(xid);
|
int pageno = TransactionIdToPage(xid);
|
||||||
int byteno = TransactionIdToByte(xid);
|
int byteno = TransactionIdToByte(xid);
|
||||||
int bshift = TransactionIdToBIndex(xid) * CLOG_BITS_PER_XACT;
|
int bshift = TransactionIdToBIndex(xid) * CLOG_BITS_PER_XACT;
|
||||||
int slotno;
|
int slotno;
|
||||||
|
int lsnindex;
|
||||||
char *byteptr;
|
char *byteptr;
|
||||||
XidStatus status;
|
XidStatus status;
|
||||||
|
|
||||||
@ -136,6 +185,9 @@ TransactionIdGetStatus(TransactionId xid)
|
|||||||
|
|
||||||
status = (*byteptr >> bshift) & CLOG_XACT_BITMASK;
|
status = (*byteptr >> bshift) & CLOG_XACT_BITMASK;
|
||||||
|
|
||||||
|
lsnindex = GetLSNIndex(slotno, xid);
|
||||||
|
*lsn = ClogCtl->shared->group_lsn[lsnindex];
|
||||||
|
|
||||||
LWLockRelease(CLogControlLock);
|
LWLockRelease(CLogControlLock);
|
||||||
|
|
||||||
return status;
|
return status;
|
||||||
@ -148,14 +200,14 @@ TransactionIdGetStatus(TransactionId xid)
|
|||||||
Size
|
Size
|
||||||
CLOGShmemSize(void)
|
CLOGShmemSize(void)
|
||||||
{
|
{
|
||||||
return SimpleLruShmemSize(NUM_CLOG_BUFFERS);
|
return SimpleLruShmemSize(NUM_CLOG_BUFFERS, CLOG_LSNS_PER_PAGE);
|
||||||
}
|
}
|
||||||
|
|
||||||
void
|
void
|
||||||
CLOGShmemInit(void)
|
CLOGShmemInit(void)
|
||||||
{
|
{
|
||||||
ClogCtl->PagePrecedes = CLOGPagePrecedes;
|
ClogCtl->PagePrecedes = CLOGPagePrecedes;
|
||||||
SimpleLruInit(ClogCtl, "CLOG Ctl", NUM_CLOG_BUFFERS,
|
SimpleLruInit(ClogCtl, "CLOG Ctl", NUM_CLOG_BUFFERS, CLOG_LSNS_PER_PAGE,
|
||||||
CLogControlLock, "pg_clog");
|
CLogControlLock, "pg_clog");
|
||||||
}
|
}
|
||||||
|
|
||||||
@ -240,7 +292,7 @@ StartupCLOG(void)
|
|||||||
int slotno;
|
int slotno;
|
||||||
char *byteptr;
|
char *byteptr;
|
||||||
|
|
||||||
slotno = SimpleLruReadPage(ClogCtl, pageno, xid);
|
slotno = SimpleLruReadPage(ClogCtl, pageno, false, xid);
|
||||||
byteptr = ClogCtl->shared->page_buffer[slotno] + byteno;
|
byteptr = ClogCtl->shared->page_buffer[slotno] + byteno;
|
||||||
|
|
||||||
/* Zero so-far-unused positions in the current byte */
|
/* Zero so-far-unused positions in the current byte */
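To make the group-LSN indexing added to clog.c above more concrete, the following standalone sketch reproduces the GetLSNIndex arithmetic outside the server. The macros CLOG_XACTS_PER_LSN_GROUP, CLOG_LSNS_PER_PAGE and GetLSNIndex are copied from the diff. The values of BLCKSZ (8192) and CLOG_XACTS_PER_BYTE (4) are assumptions, since this excerpt does not show them, and the wrapper lsn_index_for is a hypothetical helper added only for illustration.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t TransactionId;

/* Assumed values, not shown in this excerpt of the diff. */
#define BLCKSZ					8192
#define CLOG_XACTS_PER_BYTE		4
#define CLOG_XACTS_PER_PAGE		(BLCKSZ * CLOG_XACTS_PER_BYTE)

/* Copied from the clog.c hunk above. */
#define CLOG_XACTS_PER_LSN_GROUP	32	/* keep this a power of 2 */
#define CLOG_LSNS_PER_PAGE	(CLOG_XACTS_PER_PAGE / CLOG_XACTS_PER_LSN_GROUP)

#define GetLSNIndex(slotno, xid)	((slotno) * CLOG_LSNS_PER_PAGE + \
	((xid) % (TransactionId) CLOG_XACTS_PER_PAGE) / CLOG_XACTS_PER_LSN_GROUP)

/*
 * Hypothetical wrapper: position of xid's LSN group within the flat
 * group_lsn[] array, given the buffer slot currently holding its clog page.
 */
static int
lsn_index_for(int slotno, TransactionId xid)
{
	assert(slotno >= 0);
	return (int) GetLSNIndex(slotno, xid);
}
```

Under these assumed sizes each clog page covers 32768 transactions and therefore carries 1024 group LSNs. Transactions 0 through 31 on a page share one entry, which is why TransactionIdGetStatus may report the LSN of a later transaction in the same group.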
|
||||||
|
@ -42,7 +42,7 @@
|
|||||||
* Portions Copyright (c) 1996-2007, PostgreSQL Global Development Group
|
* Portions Copyright (c) 1996-2007, PostgreSQL Global Development Group
|
||||||
* Portions Copyright (c) 1994, Regents of the University of California
|
* Portions Copyright (c) 1994, Regents of the University of California
|
||||||
*
|
*
|
||||||
* $PostgreSQL: pgsql/src/backend/access/transam/multixact.c,v 1.23 2007/01/05 22:19:23 momjian Exp $
|
* $PostgreSQL: pgsql/src/backend/access/transam/multixact.c,v 1.24 2007/08/01 22:45:07 tgl Exp $
|
||||||
*
|
*
|
||||||
*-------------------------------------------------------------------------
|
*-------------------------------------------------------------------------
|
||||||
*/
|
*/
|
||||||
@ -749,7 +749,7 @@ RecordNewMultiXact(MultiXactId multi, MultiXactOffset offset,
|
|||||||
* enough that a MultiXactId is really involved. Perhaps someday we'll
|
* enough that a MultiXactId is really involved. Perhaps someday we'll
|
||||||
* take the trouble to generalize the slru.c error reporting code.
|
* take the trouble to generalize the slru.c error reporting code.
|
||||||
*/
|
*/
|
||||||
slotno = SimpleLruReadPage(MultiXactOffsetCtl, pageno, multi);
|
slotno = SimpleLruReadPage(MultiXactOffsetCtl, pageno, true, multi);
|
||||||
offptr = (MultiXactOffset *) MultiXactOffsetCtl->shared->page_buffer[slotno];
|
offptr = (MultiXactOffset *) MultiXactOffsetCtl->shared->page_buffer[slotno];
|
||||||
offptr += entryno;
|
offptr += entryno;
|
||||||
|
|
||||||
@ -773,7 +773,7 @@ RecordNewMultiXact(MultiXactId multi, MultiXactOffset offset,
|
|||||||
|
|
||||||
if (pageno != prev_pageno)
|
if (pageno != prev_pageno)
|
||||||
{
|
{
|
||||||
slotno = SimpleLruReadPage(MultiXactMemberCtl, pageno, multi);
|
slotno = SimpleLruReadPage(MultiXactMemberCtl, pageno, true, multi);
|
||||||
prev_pageno = pageno;
|
prev_pageno = pageno;
|
||||||
}
|
}
|
||||||
|
|
||||||
@ -993,7 +993,7 @@ retry:
|
|||||||
pageno = MultiXactIdToOffsetPage(multi);
|
pageno = MultiXactIdToOffsetPage(multi);
|
||||||
entryno = MultiXactIdToOffsetEntry(multi);
|
entryno = MultiXactIdToOffsetEntry(multi);
|
||||||
|
|
||||||
slotno = SimpleLruReadPage(MultiXactOffsetCtl, pageno, multi);
|
slotno = SimpleLruReadPage(MultiXactOffsetCtl, pageno, true, multi);
|
||||||
offptr = (MultiXactOffset *) MultiXactOffsetCtl->shared->page_buffer[slotno];
|
offptr = (MultiXactOffset *) MultiXactOffsetCtl->shared->page_buffer[slotno];
|
||||||
offptr += entryno;
|
offptr += entryno;
|
||||||
offset = *offptr;
|
offset = *offptr;
|
||||||
@ -1025,7 +1025,7 @@ retry:
|
|||||||
entryno = MultiXactIdToOffsetEntry(tmpMXact);
|
entryno = MultiXactIdToOffsetEntry(tmpMXact);
|
||||||
|
|
||||||
if (pageno != prev_pageno)
|
if (pageno != prev_pageno)
|
||||||
slotno = SimpleLruReadPage(MultiXactOffsetCtl, pageno, tmpMXact);
|
slotno = SimpleLruReadPage(MultiXactOffsetCtl, pageno, true, tmpMXact);
|
||||||
|
|
||||||
offptr = (MultiXactOffset *) MultiXactOffsetCtl->shared->page_buffer[slotno];
|
offptr = (MultiXactOffset *) MultiXactOffsetCtl->shared->page_buffer[slotno];
|
||||||
offptr += entryno;
|
offptr += entryno;
|
||||||
@ -1061,7 +1061,7 @@ retry:
|
|||||||
|
|
||||||
if (pageno != prev_pageno)
|
if (pageno != prev_pageno)
|
||||||
{
|
{
|
||||||
slotno = SimpleLruReadPage(MultiXactMemberCtl, pageno, multi);
|
slotno = SimpleLruReadPage(MultiXactMemberCtl, pageno, true, multi);
|
||||||
prev_pageno = pageno;
|
prev_pageno = pageno;
|
||||||
}
|
}
|
||||||
|
|
||||||
@ -1289,8 +1289,8 @@ MultiXactShmemSize(void)
|
|||||||
mul_size(sizeof(MultiXactId) * 2, MaxBackends))
|
mul_size(sizeof(MultiXactId) * 2, MaxBackends))
|
||||||
|
|
||||||
size = SHARED_MULTIXACT_STATE_SIZE;
|
size = SHARED_MULTIXACT_STATE_SIZE;
|
||||||
size = add_size(size, SimpleLruShmemSize(NUM_MXACTOFFSET_BUFFERS));
|
size = add_size(size, SimpleLruShmemSize(NUM_MXACTOFFSET_BUFFERS, 0));
|
||||||
size = add_size(size, SimpleLruShmemSize(NUM_MXACTMEMBER_BUFFERS));
|
size = add_size(size, SimpleLruShmemSize(NUM_MXACTMEMBER_BUFFERS, 0));
|
||||||
|
|
||||||
return size;
|
return size;
|
||||||
}
|
}
|
||||||
@ -1306,10 +1306,10 @@ MultiXactShmemInit(void)
|
|||||||
MultiXactMemberCtl->PagePrecedes = MultiXactMemberPagePrecedes;
|
MultiXactMemberCtl->PagePrecedes = MultiXactMemberPagePrecedes;
|
||||||
|
|
||||||
SimpleLruInit(MultiXactOffsetCtl,
|
SimpleLruInit(MultiXactOffsetCtl,
|
||||||
"MultiXactOffset Ctl", NUM_MXACTOFFSET_BUFFERS,
|
"MultiXactOffset Ctl", NUM_MXACTOFFSET_BUFFERS, 0,
|
||||||
MultiXactOffsetControlLock, "pg_multixact/offsets");
|
MultiXactOffsetControlLock, "pg_multixact/offsets");
|
||||||
SimpleLruInit(MultiXactMemberCtl,
|
SimpleLruInit(MultiXactMemberCtl,
|
||||||
"MultiXactMember Ctl", NUM_MXACTMEMBER_BUFFERS,
|
"MultiXactMember Ctl", NUM_MXACTMEMBER_BUFFERS, 0,
|
||||||
MultiXactMemberControlLock, "pg_multixact/members");
|
MultiXactMemberControlLock, "pg_multixact/members");
|
||||||
|
|
||||||
/* Initialize our shared state struct */
|
/* Initialize our shared state struct */
|
||||||
@ -1442,7 +1442,7 @@ StartupMultiXact(void)
|
|||||||
int slotno;
|
int slotno;
|
||||||
MultiXactOffset *offptr;
|
MultiXactOffset *offptr;
|
||||||
|
|
||||||
slotno = SimpleLruReadPage(MultiXactOffsetCtl, pageno, multi);
|
slotno = SimpleLruReadPage(MultiXactOffsetCtl, pageno, true, multi);
|
||||||
offptr = (MultiXactOffset *) MultiXactOffsetCtl->shared->page_buffer[slotno];
|
offptr = (MultiXactOffset *) MultiXactOffsetCtl->shared->page_buffer[slotno];
|
||||||
offptr += entryno;
|
offptr += entryno;
|
||||||
|
|
||||||
@ -1472,7 +1472,7 @@ StartupMultiXact(void)
|
|||||||
int slotno;
|
int slotno;
|
||||||
TransactionId *xidptr;
|
TransactionId *xidptr;
|
||||||
|
|
||||||
slotno = SimpleLruReadPage(MultiXactMemberCtl, pageno, offset);
|
slotno = SimpleLruReadPage(MultiXactMemberCtl, pageno, true, offset);
|
||||||
xidptr = (TransactionId *) MultiXactMemberCtl->shared->page_buffer[slotno];
|
xidptr = (TransactionId *) MultiXactMemberCtl->shared->page_buffer[slotno];
|
||||||
xidptr += entryno;
|
xidptr += entryno;
|
||||||
|
|
||||||
|
@ -41,7 +41,7 @@
|
|||||||
* Portions Copyright (c) 1996-2007, PostgreSQL Global Development Group
|
* Portions Copyright (c) 1996-2007, PostgreSQL Global Development Group
|
||||||
* Portions Copyright (c) 1994, Regents of the University of California
|
* Portions Copyright (c) 1994, Regents of the University of California
|
||||||
*
|
*
|
||||||
* $PostgreSQL: pgsql/src/backend/access/transam/slru.c,v 1.40 2007/01/05 22:19:23 momjian Exp $
|
* $PostgreSQL: pgsql/src/backend/access/transam/slru.c,v 1.41 2007/08/01 22:45:07 tgl Exp $
|
||||||
*
|
*
|
||||||
*-------------------------------------------------------------------------
|
*-------------------------------------------------------------------------
|
||||||
*/
|
*/
|
||||||
@ -140,6 +140,8 @@ static SlruErrorCause slru_errcause;
|
|||||||
static int slru_errno;
|
static int slru_errno;
|
||||||
|
|
||||||
|
|
||||||
|
static void SimpleLruZeroLSNs(SlruCtl ctl, int slotno);
|
||||||
|
static void SimpleLruWaitIO(SlruCtl ctl, int slotno);
|
||||||
static bool SlruPhysicalReadPage(SlruCtl ctl, int pageno, int slotno);
|
static bool SlruPhysicalReadPage(SlruCtl ctl, int pageno, int slotno);
|
||||||
static bool SlruPhysicalWritePage(SlruCtl ctl, int pageno, int slotno,
|
static bool SlruPhysicalWritePage(SlruCtl ctl, int pageno, int slotno,
|
||||||
SlruFlush fdata);
|
SlruFlush fdata);
|
||||||
@ -152,7 +154,7 @@ static int SlruSelectLRUPage(SlruCtl ctl, int pageno);
|
|||||||
*/
|
*/
|
||||||
|
|
||||||
Size
|
Size
|
||||||
SimpleLruShmemSize(int nslots)
|
SimpleLruShmemSize(int nslots, int nlsns)
|
||||||
{
|
{
|
||||||
Size sz;
|
Size sz;
|
||||||
|
|
||||||
@ -165,18 +167,21 @@ SimpleLruShmemSize(int nslots)
|
|||||||
sz += MAXALIGN(nslots * sizeof(int)); /* page_lru_count[] */
|
sz += MAXALIGN(nslots * sizeof(int)); /* page_lru_count[] */
|
||||||
sz += MAXALIGN(nslots * sizeof(LWLockId)); /* buffer_locks[] */
|
sz += MAXALIGN(nslots * sizeof(LWLockId)); /* buffer_locks[] */
|
||||||
|
|
||||||
|
if (nlsns > 0)
|
||||||
|
sz += MAXALIGN(nslots * nlsns * sizeof(XLogRecPtr)); /* group_lsn[] */
|
||||||
|
|
||||||
return BUFFERALIGN(sz) + BLCKSZ * nslots;
|
return BUFFERALIGN(sz) + BLCKSZ * nslots;
|
||||||
}
|
}
|
||||||
|
|
||||||
void
|
void
|
||||||
SimpleLruInit(SlruCtl ctl, const char *name, int nslots,
|
SimpleLruInit(SlruCtl ctl, const char *name, int nslots, int nlsns,
|
||||||
LWLockId ctllock, const char *subdir)
|
LWLockId ctllock, const char *subdir)
|
||||||
{
|
{
|
||||||
SlruShared shared;
|
SlruShared shared;
|
||||||
bool found;
|
bool found;
|
||||||
|
|
||||||
shared = (SlruShared) ShmemInitStruct(name,
|
shared = (SlruShared) ShmemInitStruct(name,
|
||||||
SimpleLruShmemSize(nslots),
|
SimpleLruShmemSize(nslots, nlsns),
|
||||||
&found);
|
&found);
|
||||||
|
|
||||||
if (!IsUnderPostmaster)
|
if (!IsUnderPostmaster)
|
||||||
@ -193,6 +198,7 @@ SimpleLruInit(SlruCtl ctl, const char *name, int nslots,
|
|||||||
shared->ControlLock = ctllock;
|
shared->ControlLock = ctllock;
|
||||||
|
|
||||||
shared->num_slots = nslots;
|
shared->num_slots = nslots;
|
||||||
|
shared->lsn_groups_per_page = nlsns;
|
||||||
|
|
||||||
shared->cur_lru_count = 0;
|
shared->cur_lru_count = 0;
|
||||||
|
|
||||||
@ -212,8 +218,14 @@ SimpleLruInit(SlruCtl ctl, const char *name, int nslots,
|
|||||||
offset += MAXALIGN(nslots * sizeof(int));
|
offset += MAXALIGN(nslots * sizeof(int));
|
||||||
shared->buffer_locks = (LWLockId *) (ptr + offset);
|
shared->buffer_locks = (LWLockId *) (ptr + offset);
|
||||||
offset += MAXALIGN(nslots * sizeof(LWLockId));
|
offset += MAXALIGN(nslots * sizeof(LWLockId));
|
||||||
ptr += BUFFERALIGN(offset);
|
|
||||||
|
|
||||||
|
if (nlsns > 0)
|
||||||
|
{
|
||||||
|
shared->group_lsn = (XLogRecPtr *) (ptr + offset);
|
||||||
|
offset += MAXALIGN(nslots * nlsns * sizeof(XLogRecPtr));
|
||||||
|
}
|
||||||
|
|
||||||
|
ptr += BUFFERALIGN(offset);
|
||||||
for (slotno = 0; slotno < nslots; slotno++)
|
for (slotno = 0; slotno < nslots; slotno++)
|
||||||
{
|
{
|
||||||
shared->page_buffer[slotno] = ptr;
|
shared->page_buffer[slotno] = ptr;
|
||||||
@ -266,15 +278,37 @@ SimpleLruZeroPage(SlruCtl ctl, int pageno)
|
|||||||
/* Set the buffer to zeroes */
|
/* Set the buffer to zeroes */
|
||||||
MemSet(shared->page_buffer[slotno], 0, BLCKSZ);
|
MemSet(shared->page_buffer[slotno], 0, BLCKSZ);
|
||||||
|
|
||||||
|
/* Set the LSNs for this new page to zero */
|
||||||
|
SimpleLruZeroLSNs(ctl, slotno);
|
||||||
|
|
||||||
/* Assume this page is now the latest active page */
|
/* Assume this page is now the latest active page */
|
||||||
shared->latest_page_number = pageno;
|
shared->latest_page_number = pageno;
|
||||||
|
|
||||||
return slotno;
|
return slotno;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/*
|
||||||
|
* Zero all the LSNs we store for this slru page.
|
||||||
|
*
|
||||||
|
* This should be called each time we create a new page, and each time we read
|
||||||
|
* in a page from disk into an existing buffer. (Such an old page cannot
|
||||||
|
* have any interesting LSNs, since we'd have flushed them before writing
|
||||||
|
* the page in the first place.)
|
||||||
|
*/
|
||||||
|
static void
|
||||||
|
SimpleLruZeroLSNs(SlruCtl ctl, int slotno)
|
||||||
|
{
|
||||||
|
SlruShared shared = ctl->shared;
|
||||||
|
|
||||||
|
if (shared->lsn_groups_per_page > 0)
|
||||||
|
MemSet(&shared->group_lsn[slotno * shared->lsn_groups_per_page], 0,
|
||||||
|
shared->lsn_groups_per_page * sizeof(XLogRecPtr));
|
||||||
|
}
|
||||||
|
|
||||||
/*
|
/*
|
||||||
* Wait for any active I/O on a page slot to finish. (This does not
|
* Wait for any active I/O on a page slot to finish. (This does not
|
||||||
* guarantee that new I/O hasn't been started before we return, though.)
|
* guarantee that new I/O hasn't been started before we return, though.
|
||||||
|
* In fact the slot might not even contain the same page anymore.)
|
||||||
*
|
*
|
||||||
* Control lock must be held at entry, and will be held at exit.
|
* Control lock must be held at entry, and will be held at exit.
|
||||||
*/
|
*/
|
||||||
@ -305,8 +339,7 @@ SimpleLruWaitIO(SlruCtl ctl, int slotno)
|
|||||||
/* indeed, the I/O must have failed */
|
/* indeed, the I/O must have failed */
|
||||||
if (shared->page_status[slotno] == SLRU_PAGE_READ_IN_PROGRESS)
|
if (shared->page_status[slotno] == SLRU_PAGE_READ_IN_PROGRESS)
|
||||||
shared->page_status[slotno] = SLRU_PAGE_EMPTY;
|
shared->page_status[slotno] = SLRU_PAGE_EMPTY;
|
||||||
else
|
else /* write_in_progress */
|
||||||
/* write_in_progress */
|
|
||||||
{
|
{
|
||||||
shared->page_status[slotno] = SLRU_PAGE_VALID;
|
shared->page_status[slotno] = SLRU_PAGE_VALID;
|
||||||
shared->page_dirty[slotno] = true;
|
shared->page_dirty[slotno] = true;
|
||||||
@ -320,6 +353,11 @@ SimpleLruWaitIO(SlruCtl ctl, int slotno)
|
|||||||
* Find a page in a shared buffer, reading it in if necessary.
|
* Find a page in a shared buffer, reading it in if necessary.
|
||||||
* The page number must correspond to an already-initialized page.
|
* The page number must correspond to an already-initialized page.
|
||||||
*
|
*
|
||||||
|
* If write_ok is true then it is OK to return a page that is in
|
||||||
|
* WRITE_IN_PROGRESS state; it is the caller's responsibility to be sure
|
||||||
|
* that modification of the page is safe. If write_ok is false then we
|
||||||
|
* will not return the page until it is not undergoing active I/O.
|
||||||
|
*
|
||||||
* The passed-in xid is used only for error reporting, and may be
|
* The passed-in xid is used only for error reporting, and may be
|
||||||
* InvalidTransactionId if no specific xid is associated with the action.
|
* InvalidTransactionId if no specific xid is associated with the action.
|
||||||
*
|
*
|
||||||
@ -329,7 +367,8 @@ SimpleLruWaitIO(SlruCtl ctl, int slotno)
|
|||||||
* Control lock must be held at entry, and will be held at exit.
|
* Control lock must be held at entry, and will be held at exit.
|
||||||
*/
|
*/
|
||||||
int
|
int
|
||||||
SimpleLruReadPage(SlruCtl ctl, int pageno, TransactionId xid)
|
SimpleLruReadPage(SlruCtl ctl, int pageno, bool write_ok,
|
||||||
|
TransactionId xid)
|
||||||
{
|
{
|
||||||
SlruShared shared = ctl->shared;
|
SlruShared shared = ctl->shared;
|
||||||
|
|
||||||
@ -346,8 +385,13 @@ SimpleLruReadPage(SlruCtl ctl, int pageno, TransactionId xid)
|
|||||||
if (shared->page_number[slotno] == pageno &&
|
if (shared->page_number[slotno] == pageno &&
|
||||||
shared->page_status[slotno] != SLRU_PAGE_EMPTY)
|
shared->page_status[slotno] != SLRU_PAGE_EMPTY)
|
||||||
{
|
{
|
||||||
/* If page is still being read in, we must wait for I/O */
|
/*
|
||||||
if (shared->page_status[slotno] == SLRU_PAGE_READ_IN_PROGRESS)
|
* If page is still being read in, we must wait for I/O. Likewise
|
||||||
|
* if the page is being written and the caller said that's not OK.
|
||||||
|
*/
|
||||||
|
if (shared->page_status[slotno] == SLRU_PAGE_READ_IN_PROGRESS ||
|
||||||
|
(shared->page_status[slotno] == SLRU_PAGE_WRITE_IN_PROGRESS &&
|
||||||
|
!write_ok))
|
||||||
{
|
{
|
||||||
SimpleLruWaitIO(ctl, slotno);
|
SimpleLruWaitIO(ctl, slotno);
|
||||||
/* Now we must recheck state from the top */
|
/* Now we must recheck state from the top */
|
||||||
@ -383,6 +427,9 @@ SimpleLruReadPage(SlruCtl ctl, int pageno, TransactionId xid)
|
|||||||
/* Do the read */
|
/* Do the read */
|
||||||
ok = SlruPhysicalReadPage(ctl, pageno, slotno);
|
ok = SlruPhysicalReadPage(ctl, pageno, slotno);
|
||||||
|
|
||||||
|
/* Set the LSNs for this newly read-in page to zero */
|
||||||
|
SimpleLruZeroLSNs(ctl, slotno);
|
||||||
|
|
||||||
/* Re-acquire control lock and update page state */
|
/* Re-acquire control lock and update page state */
|
||||||
LWLockAcquire(shared->ControlLock, LW_EXCLUSIVE);
|
LWLockAcquire(shared->ControlLock, LW_EXCLUSIVE);
|
||||||
|
|
||||||
@ -443,7 +490,7 @@ SimpleLruReadPage_ReadOnly(SlruCtl ctl, int pageno, TransactionId xid)
|
|||||||
LWLockRelease(shared->ControlLock);
|
LWLockRelease(shared->ControlLock);
|
||||||
LWLockAcquire(shared->ControlLock, LW_EXCLUSIVE);
|
LWLockAcquire(shared->ControlLock, LW_EXCLUSIVE);
|
||||||
|
|
||||||
return SimpleLruReadPage(ctl, pageno, xid);
|
return SimpleLruReadPage(ctl, pageno, true, xid);
|
||||||
}
|
}
|
||||||
|
|
||||||
/*
|
/*
|
||||||
@ -621,6 +668,47 @@ SlruPhysicalWritePage(SlruCtl ctl, int pageno, int slotno, SlruFlush fdata)
|
|||||||
char path[MAXPGPATH];
|
char path[MAXPGPATH];
|
||||||
int fd = -1;
|
int fd = -1;
|
||||||
|
|
||||||
|
/*
|
||||||
|
* Honor the write-WAL-before-data rule, if appropriate, so that we do
|
||||||
|
* not write out data before associated WAL records. This is the same
|
||||||
|
* action performed during FlushBuffer() in the main buffer manager.
|
||||||
|
*/
|
||||||
|
if (shared->group_lsn != NULL)
|
||||||
|
{
|
||||||
|
/*
|
||||||
|
* We must determine the largest async-commit LSN for the page.
|
||||||
|
* This is a bit tedious, but since this entire function is a slow
|
||||||
|
* path anyway, it seems better to do this here than to maintain
|
||||||
|
* a per-page LSN variable (which'd need an extra comparison in the
|
||||||
|
* transaction-commit path).
|
||||||
|
*/
|
||||||
|
XLogRecPtr max_lsn;
|
||||||
|
int lsnindex, lsnoff;
|
||||||
|
|
||||||
|
lsnindex = slotno * shared->lsn_groups_per_page;
|
||||||
|
max_lsn = shared->group_lsn[lsnindex++];
|
||||||
|
for (lsnoff = 1; lsnoff < shared->lsn_groups_per_page; lsnoff++)
|
||||||
|
{
|
||||||
|
XLogRecPtr this_lsn = shared->group_lsn[lsnindex++];
|
||||||
|
|
||||||
|
if (XLByteLT(max_lsn, this_lsn))
|
||||||
|
max_lsn = this_lsn;
|
||||||
|
}
|
||||||
|
|
||||||
|
if (!XLogRecPtrIsInvalid(max_lsn))
|
||||||
|
{
|
||||||
|
/*
|
||||||
|
* As noted above, elog(ERROR) is not acceptable here, so if
|
||||||
|
* XLogFlush were to fail, we must PANIC. This isn't much of
|
||||||
|
* a restriction because XLogFlush is just about all critical
|
||||||
|
* section anyway, but let's make sure.
|
||||||
|
*/
|
||||||
|
START_CRIT_SECTION();
|
||||||
|
XLogFlush(max_lsn);
|
||||||
|
END_CRIT_SECTION();
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
/*
|
/*
|
||||||
* During a Flush, we may already have the desired file open.
|
* During a Flush, we may already have the desired file open.
|
||||||
*/
|
*/
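The scan that SlruPhysicalWritePage performs in the hunk above, finding the largest group LSN for one buffer slot before flushing WAL, can be sketched in isolation. This is a minimal illustration, not the server code: ToyRecPtr, toy_lt, toy_is_invalid and toy_max_group_lsn are hypothetical names, the two-field pointer layout is an assumption modelled on the {0, 0} initializer of InvalidXLogRecPtr seen elsewhere in this diff, and "invalid" is assumed to mean both fields are zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a two-part WAL position. */
typedef struct ToyRecPtr
{
	uint32_t	xlogid;		/* high-order part */
	uint32_t	xrecoff;	/* low-order part */
} ToyRecPtr;

/* True if a is strictly earlier than b. */
static bool
toy_lt(ToyRecPtr a, ToyRecPtr b)
{
	return a.xlogid < b.xlogid ||
		(a.xlogid == b.xlogid && a.xrecoff < b.xrecoff);
}

/* Assumption for this sketch: the all-zero pointer means "no LSN recorded". */
static bool
toy_is_invalid(ToyRecPtr p)
{
	return p.xlogid == 0 && p.xrecoff == 0;
}

/*
 * Return the largest LSN among the groups belonging to one buffer slot.
 * group_lsn is a flat array holding groups_per_page entries per slot.
 */
static ToyRecPtr
toy_max_group_lsn(const ToyRecPtr *group_lsn, int slotno, int groups_per_page)
{
	int			idx = slotno * groups_per_page;
	int			off;
	ToyRecPtr	max_lsn;

	assert(groups_per_page > 0);
	max_lsn = group_lsn[idx++];
	for (off = 1; off < groups_per_page; off++)
	{
		ToyRecPtr	this_lsn = group_lsn[idx++];

		if (toy_lt(max_lsn, this_lsn))
			max_lsn = this_lsn;
	}
	return max_lsn;
}
```

A caller would flush WAL up to the returned position only when toy_is_invalid reports false, mirroring the XLogRecPtrIsInvalid(max_lsn) check in the diff. A slot whose LSNs were all zeroed, for example a page freshly read from disk, needs no flush.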
|
||||||
|
@ -22,7 +22,7 @@
|
|||||||
* Portions Copyright (c) 1996-2007, PostgreSQL Global Development Group
|
* Portions Copyright (c) 1996-2007, PostgreSQL Global Development Group
|
||||||
* Portions Copyright (c) 1994, Regents of the University of California
|
* Portions Copyright (c) 1994, Regents of the University of California
|
||||||
*
|
*
|
||||||
* $PostgreSQL: pgsql/src/backend/access/transam/subtrans.c,v 1.18 2007/01/05 22:19:23 momjian Exp $
|
* $PostgreSQL: pgsql/src/backend/access/transam/subtrans.c,v 1.19 2007/08/01 22:45:07 tgl Exp $
|
||||||
*
|
*
|
||||||
*-------------------------------------------------------------------------
|
*-------------------------------------------------------------------------
|
||||||
*/
|
*/
|
||||||
@ -78,7 +78,7 @@ SubTransSetParent(TransactionId xid, TransactionId parent)
|
|||||||
|
|
||||||
LWLockAcquire(SubtransControlLock, LW_EXCLUSIVE);
|
LWLockAcquire(SubtransControlLock, LW_EXCLUSIVE);
|
||||||
|
|
||||||
slotno = SimpleLruReadPage(SubTransCtl, pageno, xid);
|
slotno = SimpleLruReadPage(SubTransCtl, pageno, true, xid);
|
||||||
ptr = (TransactionId *) SubTransCtl->shared->page_buffer[slotno];
|
ptr = (TransactionId *) SubTransCtl->shared->page_buffer[slotno];
|
||||||
ptr += entryno;
|
ptr += entryno;
|
||||||
|
|
||||||
@ -165,14 +165,14 @@ SubTransGetTopmostTransaction(TransactionId xid)
|
|||||||
Size
|
Size
|
||||||
SUBTRANSShmemSize(void)
|
SUBTRANSShmemSize(void)
|
||||||
{
|
{
|
||||||
return SimpleLruShmemSize(NUM_SUBTRANS_BUFFERS);
|
return SimpleLruShmemSize(NUM_SUBTRANS_BUFFERS, 0);
|
||||||
}
|
}
|
||||||
|
|
||||||
void
|
void
|
||||||
SUBTRANSShmemInit(void)
|
SUBTRANSShmemInit(void)
|
||||||
{
|
{
|
||||||
SubTransCtl->PagePrecedes = SubTransPagePrecedes;
|
SubTransCtl->PagePrecedes = SubTransPagePrecedes;
|
||||||
SimpleLruInit(SubTransCtl, "SUBTRANS Ctl", NUM_SUBTRANS_BUFFERS,
|
SimpleLruInit(SubTransCtl, "SUBTRANS Ctl", NUM_SUBTRANS_BUFFERS, 0,
|
||||||
SubtransControlLock, "pg_subtrans");
|
SubtransControlLock, "pg_subtrans");
|
||||||
/* Override default assumption that writes should be fsync'd */
|
/* Override default assumption that writes should be fsync'd */
|
||||||
SubTransCtl->do_fsync = false;
|
SubTransCtl->do_fsync = false;
|
||||||
|
@ -8,7 +8,7 @@
|
|||||||
*
|
*
|
||||||
*
|
*
|
||||||
* IDENTIFICATION
|
* IDENTIFICATION
|
||||||
* $PostgreSQL: pgsql/src/backend/access/transam/transam.c,v 1.69 2007/01/05 22:19:23 momjian Exp $
|
* $PostgreSQL: pgsql/src/backend/access/transam/transam.c,v 1.70 2007/08/01 22:45:07 tgl Exp $
|
||||||
*
|
*
|
||||||
* NOTES
|
* NOTES
|
||||||
* This file contains the high level access-method interface to the
|
* This file contains the high level access-method interface to the
|
||||||
@ -27,14 +27,17 @@
|
|||||||
|
|
||||||
static XidStatus TransactionLogFetch(TransactionId transactionId);
|
static XidStatus TransactionLogFetch(TransactionId transactionId);
|
||||||
static void TransactionLogUpdate(TransactionId transactionId,
|
static void TransactionLogUpdate(TransactionId transactionId,
|
||||||
XidStatus status);
|
XidStatus status, XLogRecPtr lsn);
|
||||||
|
|
||||||
/* ----------------
|
/*
|
||||||
* Single-item cache for results of TransactionLogFetch.
|
* Single-item cache for results of TransactionLogFetch.
|
||||||
* ----------------
|
|
||||||
*/
|
*/
|
||||||
static TransactionId cachedFetchXid = InvalidTransactionId;
|
static TransactionId cachedFetchXid = InvalidTransactionId;
|
||||||
static XidStatus cachedFetchXidStatus;
|
static XidStatus cachedFetchXidStatus;
|
||||||
|
static XLogRecPtr cachedCommitLSN;
|
||||||
|
|
||||||
|
/* Handy constant for an invalid xlog recptr */
|
||||||
|
static const XLogRecPtr InvalidXLogRecPtr = {0, 0};
|
||||||
|
|
||||||
|
|
||||||
/* ----------------------------------------------------------------
|
/* ----------------------------------------------------------------
|
||||||
@ -52,6 +55,7 @@ static XidStatus
|
|||||||
TransactionLogFetch(TransactionId transactionId)
|
TransactionLogFetch(TransactionId transactionId)
|
||||||
{
|
{
|
||||||
XidStatus xidstatus;
|
XidStatus xidstatus;
|
||||||
|
XLogRecPtr xidlsn;
|
||||||
|
|
||||||
/*
|
/*
|
||||||
* Before going to the commit log manager, check our single item cache to
|
* Before going to the commit log manager, check our single item cache to
|
||||||
@ -73,9 +77,9 @@ TransactionLogFetch(TransactionId transactionId)
|
|||||||
}
|
}
|
||||||
|
|
||||||
/*
|
/*
|
||||||
* Get the status.
|
* Get the transaction status.
|
||||||
*/
|
*/
|
||||||
xidstatus = TransactionIdGetStatus(transactionId);
|
xidstatus = TransactionIdGetStatus(transactionId, &xidlsn);
|
||||||
|
|
||||||
/*
|
/*
|
||||||
* DO NOT cache status for unfinished or sub-committed transactions! We
|
* DO NOT cache status for unfinished or sub-committed transactions! We
|
||||||
@ -84,8 +88,9 @@ TransactionLogFetch(TransactionId transactionId)
|
|||||||
if (xidstatus != TRANSACTION_STATUS_IN_PROGRESS &&
|
if (xidstatus != TRANSACTION_STATUS_IN_PROGRESS &&
|
||||||
xidstatus != TRANSACTION_STATUS_SUB_COMMITTED)
|
xidstatus != TRANSACTION_STATUS_SUB_COMMITTED)
|
||||||
{
|
{
|
||||||
TransactionIdStore(transactionId, &cachedFetchXid);
|
cachedFetchXid = transactionId;
|
||||||
cachedFetchXidStatus = xidstatus;
|
cachedFetchXidStatus = xidstatus;
|
||||||
|
cachedCommitLSN = xidlsn;
|
||||||
}
|
}
|
||||||
|
|
||||||
return xidstatus;
|
return xidstatus;
|
||||||
@ -93,16 +98,19 @@ TransactionLogFetch(TransactionId transactionId)
|
|||||||
|
|
||||||
/* --------------------------------
|
/* --------------------------------
|
||||||
* TransactionLogUpdate
|
* TransactionLogUpdate
|
||||||
|
*
|
||||||
|
* Store the new status of a transaction. The commit record LSN must be
|
||||||
|
* passed when recording an async commit; else it should be InvalidXLogRecPtr.
|
||||||
* --------------------------------
|
* --------------------------------
|
||||||
*/
|
*/
|
||||||
static void
|
static inline void
|
||||||
TransactionLogUpdate(TransactionId transactionId, /* trans id to update */
|
TransactionLogUpdate(TransactionId transactionId,
|
||||||
XidStatus status) /* new trans status */
|
XidStatus status, XLogRecPtr lsn)
|
||||||
{
|
{
|
||||||
/*
|
/*
|
||||||
* update the commit log
|
* update the commit log
|
||||||
*/
|
*/
|
||||||
TransactionIdSetStatus(transactionId, status);
|
TransactionIdSetStatus(transactionId, status, lsn);
|
||||||
}
|
}
|
||||||
|
|
||||||
/*
|
/*
|
||||||
@ -111,15 +119,16 @@ TransactionLogUpdate(TransactionId transactionId, /* trans id to update */
|
|||||||
* Update multiple transaction identifiers to a given status.
|
* Update multiple transaction identifiers to a given status.
|
||||||
* Don't depend on this being atomic; it's not.
|
* Don't depend on this being atomic; it's not.
|
||||||
*/
|
*/
|
||||||
static void
|
static inline void
|
||||||
TransactionLogMultiUpdate(int nxids, TransactionId *xids, XidStatus status)
|
TransactionLogMultiUpdate(int nxids, TransactionId *xids,
|
||||||
|
XidStatus status, XLogRecPtr lsn)
|
||||||
{
|
{
|
||||||
int i;
|
int i;
|
||||||
|
|
||||||
Assert(nxids != 0);
|
Assert(nxids != 0);
|
||||||
|
|
||||||
for (i = 0; i < nxids; i++)
|
for (i = 0; i < nxids; i++)
|
||||||
TransactionIdSetStatus(xids[i], status);
|
TransactionIdSetStatus(xids[i], status, lsn);
|
||||||
}
|
}
|
||||||
|
|
||||||
/* ----------------------------------------------------------------
|
/* ----------------------------------------------------------------
|
||||||
@ -269,31 +278,49 @@ TransactionIdDidAbort(TransactionId transactionId)
|
|||||||
void
|
void
|
||||||
TransactionIdCommit(TransactionId transactionId)
|
TransactionIdCommit(TransactionId transactionId)
|
||||||
{
|
{
|
||||||
TransactionLogUpdate(transactionId, TRANSACTION_STATUS_COMMITTED);
|
TransactionLogUpdate(transactionId, TRANSACTION_STATUS_COMMITTED,
|
||||||
|
InvalidXLogRecPtr);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/*
|
||||||
|
* TransactionIdAsyncCommit
|
||||||
|
* Same as above, but for async commits. The commit record LSN is needed.
|
||||||
|
*/
|
||||||
|
void
|
||||||
|
TransactionIdAsyncCommit(TransactionId transactionId, XLogRecPtr lsn)
|
||||||
|
{
|
||||||
|
TransactionLogUpdate(transactionId, TRANSACTION_STATUS_COMMITTED, lsn);
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
/*
|
/*
|
||||||
* TransactionIdAbort
|
* TransactionIdAbort
|
||||||
* Aborts the transaction associated with the identifier.
|
* Aborts the transaction associated with the identifier.
|
||||||
*
|
*
|
||||||
* Note:
|
* Note:
|
||||||
* Assumes transaction identifier is valid.
|
* Assumes transaction identifier is valid.
|
||||||
|
* No async version of this is needed.
|
||||||
*/
|
*/
|
||||||
void
|
void
|
||||||
TransactionIdAbort(TransactionId transactionId)
|
TransactionIdAbort(TransactionId transactionId)
|
||||||
{
|
{
|
||||||
TransactionLogUpdate(transactionId, TRANSACTION_STATUS_ABORTED);
|
TransactionLogUpdate(transactionId, TRANSACTION_STATUS_ABORTED,
|
||||||
|
InvalidXLogRecPtr);
|
||||||
}
|
}
|
||||||
|
|
||||||
/*
|
/*
|
||||||
* TransactionIdSubCommit
|
* TransactionIdSubCommit
|
||||||
* Marks the subtransaction associated with the identifier as
|
* Marks the subtransaction associated with the identifier as
|
||||||
* sub-committed.
|
* sub-committed.
|
||||||
|
*
|
||||||
|
* Note:
|
||||||
|
* No async version of this is needed.
|
||||||
*/
|
*/
|
||||||
void
|
void
|
||||||
TransactionIdSubCommit(TransactionId transactionId)
|
TransactionIdSubCommit(TransactionId transactionId)
|
||||||
{
|
{
|
||||||
TransactionLogUpdate(transactionId, TRANSACTION_STATUS_SUB_COMMITTED);
|
TransactionLogUpdate(transactionId, TRANSACTION_STATUS_SUB_COMMITTED,
|
||||||
|
InvalidXLogRecPtr);
|
||||||
}
|
}
|
||||||
|
|
||||||
/*
|
/*
|
||||||
@ -309,9 +336,23 @@ void
|
|||||||
TransactionIdCommitTree(int nxids, TransactionId *xids)
|
TransactionIdCommitTree(int nxids, TransactionId *xids)
|
||||||
{
|
{
|
||||||
if (nxids > 0)
|
if (nxids > 0)
|
||||||
TransactionLogMultiUpdate(nxids, xids, TRANSACTION_STATUS_COMMITTED);
|
TransactionLogMultiUpdate(nxids, xids, TRANSACTION_STATUS_COMMITTED,
|
||||||
|
InvalidXLogRecPtr);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/*
|
||||||
|
* TransactionIdAsyncCommitTree
|
||||||
|
* Same as above, but for async commits. The commit record LSN is needed.
|
||||||
|
*/
|
||||||
|
void
|
||||||
|
TransactionIdAsyncCommitTree(int nxids, TransactionId *xids, XLogRecPtr lsn)
|
||||||
|
{
|
||||||
|
if (nxids > 0)
|
||||||
|
TransactionLogMultiUpdate(nxids, xids, TRANSACTION_STATUS_COMMITTED,
|
||||||
|
lsn);
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
/*
|
/*
|
||||||
* TransactionIdAbortTree
|
* TransactionIdAbortTree
|
||||||
* Marks all the given transaction ids as aborted.
|
* Marks all the given transaction ids as aborted.
|
||||||
@ -323,7 +364,8 @@ void
|
|||||||
TransactionIdAbortTree(int nxids, TransactionId *xids)
|
TransactionIdAbortTree(int nxids, TransactionId *xids)
|
||||||
{
|
{
|
||||||
if (nxids > 0)
|
if (nxids > 0)
|
||||||
TransactionLogMultiUpdate(nxids, xids, TRANSACTION_STATUS_ABORTED);
|
TransactionLogMultiUpdate(nxids, xids, TRANSACTION_STATUS_ABORTED,
|
||||||
|
InvalidXLogRecPtr);
|
||||||
}
|
}
|
||||||
|
|
||||||
/*
|
/*
|
||||||
@ -389,3 +431,43 @@ TransactionIdFollowsOrEquals(TransactionId id1, TransactionId id2)
|
|||||||
diff = (int32) (id1 - id2);
|
diff = (int32) (id1 - id2);
|
||||||
return (diff >= 0);
|
return (diff >= 0);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/*
|
||||||
|
* TransactionIdGetCommitLSN
|
||||||
|
*
|
||||||
|
* This function returns an LSN that is late enough to be able
|
||||||
|
* to guarantee that if we flush up to the LSN returned then we
|
||||||
|
* will have flushed the transaction's commit record to disk.
|
||||||
|
*
|
||||||
|
* The result is not necessarily the exact LSN of the transaction's
|
||||||
|
* commit record! For example, for long-past transactions (those whose
|
||||||
|
* clog pages already migrated to disk), we'll return InvalidXLogRecPtr.
|
||||||
|
* Also, because we group transactions on the same clog page to conserve
|
||||||
|
* storage, we might return the LSN of a later transaction that falls into
|
||||||
|
* the same group.
|
||||||
|
*/
|
||||||
|
XLogRecPtr
|
||||||
|
TransactionIdGetCommitLSN(TransactionId xid)
|
||||||
|
{
|
||||||
|
XLogRecPtr result;
|
||||||
|
|
||||||
|
/*
|
||||||
|
* Currently, all uses of this function are for xids that were just
|
||||||
|
* reported to be committed by TransactionLogFetch, so we expect that
|
||||||
|
* checking TransactionLogFetch's cache will usually succeed and avoid an
|
||||||
|
* extra trip to shared memory.
|
||||||
|
*/
|
||||||
|
if (TransactionIdEquals(xid, cachedFetchXid))
|
||||||
|
return cachedCommitLSN;
|
||||||
|
|
||||||
|
/* Special XIDs are always known committed */
|
||||||
|
if (!TransactionIdIsNormal(xid))
|
||||||
|
return InvalidXLogRecPtr;
|
||||||
|
|
||||||
|
/*
|
||||||
|
* Get the transaction status.
|
||||||
|
*/
|
||||||
|
(void) TransactionIdGetStatus(xid, &result);
|
||||||
|
|
||||||
|
return result;
|
||||||
|
}
|
||||||
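Illustrative sketch (not part of the commit): the transam.c changes above extend the single-item fetch cache so that it remembers a commit LSN next to the cached status, and TransactionIdGetCommitLSN consults that cache before going back to the commit log. The standalone program fragment below mimics that shape under stated assumptions. The names fetch_status, get_commit_lsn, slow_get_status and slow_lookups are hypothetical stand-ins, the two-field XLogRecPtr layout is assumed from the {0, 0} initializer in the diff, and the special-XID check is omitted.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in types; the real definitions live in PostgreSQL headers. */
typedef uint32_t TransactionId;
typedef struct { uint32_t xlogid; uint32_t xrecoff; } XLogRecPtr;
typedef enum { XS_IN_PROGRESS, XS_COMMITTED, XS_ABORTED, XS_SUB_COMMITTED } XidStatus;

#define InvalidTransactionId ((TransactionId) 0)

/* Single-item cache: last finished xid, its status, and its commit LSN. */
static TransactionId cached_xid = InvalidTransactionId;
static XidStatus cached_status;
static XLogRecPtr cached_lsn;

/* Counts trips to the (simulated) commit log, for demonstration only. */
static int slow_lookups = 0;

/* Hypothetical replacement for the clog lookup: even xids are committed. */
static XidStatus
slow_get_status(TransactionId xid, XLogRecPtr *lsn)
{
    slow_lookups++;
    lsn->xlogid = 0;
    lsn->xrecoff = xid * 16;
    return (xid % 2 == 0) ? XS_COMMITTED : XS_IN_PROGRESS;
}

/* Fetch a status, caching only final (committed or aborted) results. */
static XidStatus
fetch_status(TransactionId xid)
{
    XLogRecPtr lsn;
    XidStatus status;

    assert(xid != InvalidTransactionId);
    if (xid == cached_xid)
        return cached_status;

    status = slow_get_status(xid, &lsn);
    if (status != XS_IN_PROGRESS && status != XS_SUB_COMMITTED)
    {
        cached_xid = xid;
        cached_status = status;
        cached_lsn = lsn;
    }
    return status;
}

/* Return the commit LSN, reusing the cache when the xid was just fetched. */
static XLogRecPtr
get_commit_lsn(TransactionId xid)
{
    XLogRecPtr lsn;

    if (xid == cached_xid)
        return cached_lsn;
    (void) slow_get_status(xid, &lsn);
    return lsn;
}
```

Calling fetch_status and then get_commit_lsn for the same committed xid costs one lookup in this sketch, which is the pattern the comment in TransactionIdGetCommitLSN describes.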
|
@ -7,7 +7,7 @@
|
|||||||
* Portions Copyright (c) 1994, Regents of the University of California
|
* Portions Copyright (c) 1994, Regents of the University of California
|
||||||
*
|
*
|
||||||
* IDENTIFICATION
|
* IDENTIFICATION
|
||||||
* $PostgreSQL: pgsql/src/backend/access/transam/twophase.c,v 1.31 2007/05/27 03:50:39 tgl Exp $
|
* $PostgreSQL: pgsql/src/backend/access/transam/twophase.c,v 1.32 2007/08/01 22:45:07 tgl Exp $
|
||||||
*
|
*
|
||||||
* NOTES
|
* NOTES
|
||||||
* Each global transaction is associated with a global transaction
|
* Each global transaction is associated with a global transaction
|
||||||
@ -1706,7 +1706,11 @@ RecordTransactionCommitPrepared(TransactionId xid,
|
|||||||
XLOG_XACT_COMMIT_PREPARED | XLOG_NO_TRAN,
|
XLOG_XACT_COMMIT_PREPARED | XLOG_NO_TRAN,
|
||||||
rdata);
|
rdata);
|
||||||
|
|
||||||
/* we don't currently try to sleep before flush here ... */
|
/*
|
||||||
|
* We don't currently try to sleep before flush here ... nor is there
|
||||||
|
* any support for async commit of a prepared xact (the very idea is
|
||||||
|
* probably a contradiction)
|
||||||
|
*/
|
||||||
|
|
||||||
/* Flush XLOG to disk */
|
/* Flush XLOG to disk */
|
||||||
XLogFlush(recptr);
|
XLogFlush(recptr);
|
||||||
|
@ -10,7 +10,7 @@
|
|||||||
*
|
*
|
||||||
*
|
*
|
||||||
* IDENTIFICATION
|
* IDENTIFICATION
|
||||||
* $PostgreSQL: pgsql/src/backend/access/transam/xact.c,v 1.245 2007/06/07 21:45:58 tgl Exp $
|
* $PostgreSQL: pgsql/src/backend/access/transam/xact.c,v 1.246 2007/08/01 22:45:07 tgl Exp $
|
||||||
*
|
*
|
||||||
*-------------------------------------------------------------------------
|
*-------------------------------------------------------------------------
|
||||||
*/
|
*/
|
||||||
@ -55,6 +55,8 @@ int XactIsoLevel;
|
|||||||
bool DefaultXactReadOnly = false;
|
bool DefaultXactReadOnly = false;
|
||||||
bool XactReadOnly;
|
bool XactReadOnly;
|
||||||
|
|
||||||
|
bool XactSyncCommit = true;
|
||||||
|
|
||||||
int CommitDelay = 0; /* precommit delay in microseconds */
|
int CommitDelay = 0; /* precommit delay in microseconds */
|
||||||
int CommitSiblings = 5; /* # concurrent xacts needed to sleep */
|
int CommitSiblings = 5; /* # concurrent xacts needed to sleep */
|
||||||
|
|
||||||
@ -174,6 +176,11 @@ static TimestampTz xactStopTimestamp;
|
|||||||
*/
|
*/
|
||||||
static char *prepareGID;
|
static char *prepareGID;
|
||||||
|
|
||||||
|
/*
|
||||||
|
* Some commands want to force synchronous commit.
|
||||||
|
*/
|
||||||
|
static bool forceSyncCommit = false;
|
||||||
|
|
||||||
/*
|
/*
|
||||||
* Private context for transaction-abort work --- we reserve space for this
|
* Private context for transaction-abort work --- we reserve space for this
|
||||||
* at startup to ensure that AbortTransaction and AbortSubTransaction can work
|
* at startup to ensure that AbortTransaction and AbortSubTransaction can work
|
||||||
@ -554,6 +561,18 @@ CommandCounterIncrement(void)
|
|||||||
AtStart_Cache();
|
AtStart_Cache();
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/*
|
||||||
|
* ForceSyncCommit
|
||||||
|
*
|
||||||
|
* Interface routine to allow commands to force a synchronous commit of the
|
||||||
|
* current top-level transaction
|
||||||
|
*/
|
||||||
|
void
|
||||||
|
ForceSyncCommit(void)
|
||||||
|
{
|
||||||
|
forceSyncCommit = true;
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
/* ----------------------------------------------------------------
|
/* ----------------------------------------------------------------
|
||||||
* StartTransaction stuff
|
* StartTransaction stuff
|
||||||
@ -724,6 +743,7 @@ RecordTransactionCommit(void)
|
|||||||
{
|
{
|
||||||
TransactionId xid = GetCurrentTransactionId();
|
TransactionId xid = GetCurrentTransactionId();
|
||||||
bool madeTCentries;
|
bool madeTCentries;
|
||||||
|
bool isAsyncCommit = false;
|
||||||
XLogRecPtr recptr;
|
XLogRecPtr recptr;
|
||||||
|
|
||||||
/* Tell bufmgr and smgr to prepare for commit */
|
/* Tell bufmgr and smgr to prepare for commit */
|
||||||
@ -810,21 +830,44 @@ RecordTransactionCommit(void)
|
|||||||
if (MyXactMadeXLogEntry)
|
if (MyXactMadeXLogEntry)
|
||||||
{
|
{
|
||||||
/*
|
/*
|
||||||
* Sleep before flush! So we can flush more than one commit
|
* If the user has set synchronous_commit = off, and we're
|
||||||
* records per single fsync. (The idea is some other backend may
|
* not doing cleanup of any rels nor committing any command
|
||||||
* do the XLogFlush while we're sleeping. This needs work still,
|
* that wanted to force sync commit, then we can defer fsync.
|
||||||
* because on most Unixen, the minimum select() delay is 10msec or
|
|
||||||
* more, which is way too long.)
|
|
||||||
*
|
|
||||||
* We do not sleep if enableFsync is not turned on, nor if there
|
|
||||||
* are fewer than CommitSiblings other backends with active
|
|
||||||
* transactions.
|
|
||||||
*/
|
*/
|
||||||
if (CommitDelay > 0 && enableFsync &&
|
if (XactSyncCommit || forceSyncCommit || nrels > 0)
|
||||||
CountActiveBackends() >= CommitSiblings)
|
{
|
||||||
pg_usleep(CommitDelay);
|
/*
|
||||||
|
* Synchronous commit case.
|
||||||
|
*
|
||||||
|
* Sleep before flush! So we can flush more than one commit
|
||||||
|
* record per single fsync. (The idea is some other backend
|
||||||
|
* may do the XLogFlush while we're sleeping. This needs work
|
||||||
|
* still, because on most Unixen, the minimum select() delay
|
||||||
|
* is 10msec or more, which is way too long.)
|
||||||
|
*
|
||||||
|
* We do not sleep if enableFsync is not turned on, nor if
|
||||||
|
* there are fewer than CommitSiblings other backends with
|
||||||
|
* active transactions.
|
||||||
|
*/
|
||||||
|
if (CommitDelay > 0 && enableFsync &&
|
||||||
|
CountActiveBackends() >= CommitSiblings)
|
||||||
|
pg_usleep(CommitDelay);
|
||||||
|
|
||||||
XLogFlush(recptr);
|
XLogFlush(recptr);
|
||||||
|
}
|
||||||
|
else
|
||||||
|
{
|
||||||
|
/*
|
||||||
|
* Asynchronous commit case.
|
||||||
|
*/
|
||||||
|
isAsyncCommit = true;
|
||||||
|
|
||||||
|
/*
|
||||||
|
* Report the latest async commit LSN, so that
|
||||||
|
* the WAL writer knows to flush this commit.
|
||||||
|
*/
|
||||||
|
XLogSetAsyncCommitLSN(recptr);
|
||||||
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
/*
|
/*
|
||||||
@ -835,12 +878,24 @@ RecordTransactionCommit(void)
|
|||||||
* emitted an XLOG record for our commit, and so in the event of a
|
* emitted an XLOG record for our commit, and so in the event of a
|
||||||
* crash the clog update might be lost. This is okay because no one
|
* crash the clog update might be lost. This is okay because no one
|
||||||
* else will ever care whether we committed.
|
* else will ever care whether we committed.
|
||||||
|
*
|
||||||
|
* The recptr here refers to the last xlog entry by this transaction
|
||||||
|
* so is the correct value to use for setting the clog.
|
||||||
*/
|
*/
|
||||||
if (madeTCentries || MyXactMadeTempRelUpdate)
|
if (madeTCentries || MyXactMadeTempRelUpdate)
|
||||||
{
|
{
|
||||||
TransactionIdCommit(xid);
|
if (isAsyncCommit)
|
||||||
/* to avoid race conditions, the parent must commit first */
|
{
|
||||||
TransactionIdCommitTree(nchildren, children);
|
TransactionIdAsyncCommit(xid, recptr);
|
||||||
|
/* to avoid race conditions, the parent must commit first */
|
||||||
|
TransactionIdAsyncCommitTree(nchildren, children, recptr);
|
||||||
|
}
|
||||||
|
else
|
||||||
|
{
|
||||||
|
TransactionIdCommit(xid);
|
||||||
|
/* to avoid race conditions, the parent must commit first */
|
||||||
|
TransactionIdCommitTree(nchildren, children);
|
||||||
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
/* Checkpoint can proceed now */
|
/* Checkpoint can proceed now */
|
||||||
@ -1406,6 +1461,7 @@ StartTransaction(void)
|
|||||||
FreeXactSnapshot();
|
FreeXactSnapshot();
|
||||||
XactIsoLevel = DefaultXactIsoLevel;
|
XactIsoLevel = DefaultXactIsoLevel;
|
||||||
XactReadOnly = DefaultXactReadOnly;
|
XactReadOnly = DefaultXactReadOnly;
|
||||||
|
forceSyncCommit = false;
|
||||||
|
|
||||||
/*
|
/*
|
||||||
* reinitialize within-transaction counters
|
* reinitialize within-transaction counters
|
||||||
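Illustrative sketch (not part of the commit): the xact.c changes above reduce to one predicate. A commit is flushed synchronously if synchronous_commit is on, if some command called ForceSyncCommit during the transaction, or if relations are being cleaned up (nrels > 0); otherwise the flush is deferred. The fragment below restates that logic in isolation. The helper names start_transaction, force_sync_commit and commit_must_flush are hypothetical, and everything else that RecordTransactionCommit does is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Mirrors the synchronous_commit setting; on by default, as in the diff. */
static bool XactSyncCommit = true;

/* Per-transaction override, cleared at the start of every transaction. */
static bool forceSyncCommit = false;

/* Hypothetical stand-in for the reset done in StartTransaction. */
static void
start_transaction(void)
{
    forceSyncCommit = false;
}

/* Hypothetical stand-in for ForceSyncCommit. */
static void
force_sync_commit(void)
{
    forceSyncCommit = true;
}

/*
 * Return true when the commit record must be flushed before the commit
 * is reported; false means the flush can be deferred (async commit).
 */
static bool
commit_must_flush(int nrels)
{
    assert(nrels >= 0);
    return XactSyncCommit || forceSyncCommit || nrels > 0;
}
```

Because the override is reset when a transaction starts, a command that forces a synchronous commit affects only the transaction it runs in.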
|
@ -7,7 +7,7 @@
|
|||||||
* Portions Copyright (c) 1996-2007, PostgreSQL Global Development Group
|
* Portions Copyright (c) 1996-2007, PostgreSQL Global Development Group
|
||||||
* Portions Copyright (c) 1994, Regents of the University of California
|
* Portions Copyright (c) 1994, Regents of the University of California
|
||||||
*
|
*
|
||||||
* $PostgreSQL: pgsql/src/backend/access/transam/xlog.c,v 1.275 2007/07/24 04:54:08 tgl Exp $
|
* $PostgreSQL: pgsql/src/backend/access/transam/xlog.c,v 1.276 2007/08/01 22:45:08 tgl Exp $
|
||||||
*
|
*
|
||||||
*-------------------------------------------------------------------------
|
*-------------------------------------------------------------------------
|
||||||
*/
|
*/
|
||||||
@ -305,6 +305,7 @@ typedef struct XLogCtlData
|
|||||||
XLogwrtResult LogwrtResult;
|
XLogwrtResult LogwrtResult;
|
||||||
uint32 ckptXidEpoch; /* nextXID & epoch of latest checkpoint */
|
uint32 ckptXidEpoch; /* nextXID & epoch of latest checkpoint */
|
||||||
TransactionId ckptXid;
|
TransactionId ckptXid;
|
||||||
|
XLogRecPtr asyncCommitLSN; /* LSN of newest async commit */
|
||||||
|
|
||||||
/* Protected by WALWriteLock: */
|
/* Protected by WALWriteLock: */
|
||||||
XLogCtlWrite Write;
|
XLogCtlWrite Write;
|
||||||
@ -1643,6 +1644,22 @@ XLogWrite(XLogwrtRqst WriteRqst, bool flexible, bool xlog_switch)
|
|||||||
Write->LogwrtResult = LogwrtResult;
|
Write->LogwrtResult = LogwrtResult;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/*
|
||||||
|
* Record the LSN for an asynchronous transaction commit.
|
||||||
|
* (This should not be called for aborts, nor for synchronous commits.)
|
||||||
|
*/
|
||||||
|
void
|
||||||
|
XLogSetAsyncCommitLSN(XLogRecPtr asyncCommitLSN)
|
||||||
|
{
|
||||||
|
/* use volatile pointer to prevent code rearrangement */
|
||||||
|
volatile XLogCtlData *xlogctl = XLogCtl;
|
||||||
|
|
||||||
|
SpinLockAcquire(&xlogctl->info_lck);
|
||||||
|
if (XLByteLT(xlogctl->asyncCommitLSN, asyncCommitLSN))
|
||||||
|
xlogctl->asyncCommitLSN = asyncCommitLSN;
|
||||||
|
SpinLockRelease(&xlogctl->info_lck);
|
||||||
|
}
|
||||||
|
|
||||||
/*
|
/*
|
||||||
* Ensure that all XLOG data through the given position is flushed to disk.
|
* Ensure that all XLOG data through the given position is flushed to disk.
|
||||||
*
|
*
|
||||||
@ -1797,19 +1814,17 @@ XLogBackgroundFlush(void)
|
|||||||
/* back off to last completed page boundary */
|
/* back off to last completed page boundary */
|
||||||
WriteRqstPtr.xrecoff -= WriteRqstPtr.xrecoff % XLOG_BLCKSZ;
|
WriteRqstPtr.xrecoff -= WriteRqstPtr.xrecoff % XLOG_BLCKSZ;
|
||||||
|
|
||||||
#ifdef NOT_YET /* async commit patch is still to come */
|
|
||||||
/* if we have already flushed that far, consider async commit records */
|
/* if we have already flushed that far, consider async commit records */
|
||||||
if (XLByteLE(WriteRqstPtr, LogwrtResult.Flush))
|
if (XLByteLE(WriteRqstPtr, LogwrtResult.Flush))
|
||||||
{
|
{
|
||||||
/* use volatile pointer to prevent code rearrangement */
|
/* use volatile pointer to prevent code rearrangement */
|
||||||
volatile XLogCtlData *xlogctl = XLogCtl;
|
volatile XLogCtlData *xlogctl = XLogCtl;
|
||||||
|
|
||||||
SpinLockAcquire(&xlogctl->async_commit_lck);
|
SpinLockAcquire(&xlogctl->info_lck);
|
||||||
WriteRqstPtr = xlogctl->asyncCommitLSN;
|
WriteRqstPtr = xlogctl->asyncCommitLSN;
|
||||||
SpinLockRelease(&xlogctl->async_commit_lck);
|
SpinLockRelease(&xlogctl->info_lck);
|
||||||
flexible = false; /* ensure it all gets written */
|
flexible = false; /* ensure it all gets written */
|
||||||
}
|
}
|
||||||
#endif
|
|
||||||
|
|
||||||
/* Done if already known flushed */
|
/* Done if already known flushed */
|
||||||
if (XLByteLE(WriteRqstPtr, LogwrtResult.Flush))
|
if (XLByteLE(WriteRqstPtr, LogwrtResult.Flush))
|
||||||
@ -1841,6 +1856,23 @@ XLogBackgroundFlush(void)
|
|||||||
END_CRIT_SECTION();
|
END_CRIT_SECTION();
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/*
|
||||||
|
* Flush any previous asynchronously-committed transactions' commit records.
|
||||||
|
*/
|
||||||
|
void
|
||||||
|
XLogAsyncCommitFlush(void)
|
||||||
|
{
|
||||||
|
XLogRecPtr WriteRqstPtr;
|
||||||
|
/* use volatile pointer to prevent code rearrangement */
|
||||||
|
volatile XLogCtlData *xlogctl = XLogCtl;
|
||||||
|
|
||||||
|
SpinLockAcquire(&xlogctl->info_lck);
|
||||||
|
WriteRqstPtr = xlogctl->asyncCommitLSN;
|
||||||
|
SpinLockRelease(&xlogctl->info_lck);
|
||||||
|
|
||||||
|
XLogFlush(WriteRqstPtr);
|
||||||
|
}
|
||||||
|
|
||||||
/*
|
/*
|
||||||
* Test whether XLOG data has been flushed up to (at least) the given position.
|
* Test whether XLOG data has been flushed up to (at least) the given position.
|
||||||
*
|
*
|
||||||
@ -5466,7 +5498,7 @@ ShutdownXLOG(int code, Datum arg)
|
|||||||
(errmsg("database system is shut down")));
|
(errmsg("database system is shut down")));
|
||||||
}
|
}
|
||||||
|
|
||||||
/*
|
/*
|
||||||
* Log start of a checkpoint.
|
* Log start of a checkpoint.
|
||||||
*/
|
*/
|
||||||
static void
|
static void
|
||||||
@ -5481,7 +5513,7 @@ LogCheckpointStart(int flags)
|
|||||||
(flags & CHECKPOINT_CAUSE_TIME) ? " time" : "");
|
(flags & CHECKPOINT_CAUSE_TIME) ? " time" : "");
|
||||||
}
|
}
|
||||||
|
|
||||||
/*
|
/*
|
||||||
* Log end of a checkpoint.
|
* Log end of a checkpoint.
|
||||||
*/
|
*/
|
||||||
static void
|
static void
|
||||||
@ -5523,7 +5555,7 @@ LogCheckpointEnd(void)
|
|||||||
* flags is a bitwise OR of the following:
|
* flags is a bitwise OR of the following:
|
||||||
* CHECKPOINT_IS_SHUTDOWN: checkpoint is for database shutdown.
|
* CHECKPOINT_IS_SHUTDOWN: checkpoint is for database shutdown.
|
||||||
* CHECKPOINT_IMMEDIATE: finish the checkpoint ASAP,
|
* CHECKPOINT_IMMEDIATE: finish the checkpoint ASAP,
|
||||||
* ignoring checkpoint_completion_target parameter.
|
* ignoring checkpoint_completion_target parameter.
|
||||||
* CHECKPOINT_FORCE: force a checkpoint even if no XLOG activity has occurred
|
* CHECKPOINT_FORCE: force a checkpoint even if no XLOG activity has occurred
|
||||||
* since the last one (implied by CHECKPOINT_IS_SHUTDOWN).
|
* since the last one (implied by CHECKPOINT_IS_SHUTDOWN).
|
||||||
*
|
*
|
||||||
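Illustrative sketch (not part of the commit): XLogSetAsyncCommitLSN above only ever moves the shared asyncCommitLSN forward, and XLogAsyncCommitFlush reads it under the same lock before flushing. The fragment below shows that monotonic-advance pattern on its own. It assumes a two-field XLogRecPtr compared as (xlogid, xrecoff), and it uses a C11 atomic_flag as a stand-in for the info_lck spinlock; lsn_lt, set_async_commit_lsn and get_async_commit_lsn are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: log file id plus byte offset within it. */
typedef struct { uint32_t xlogid; uint32_t xrecoff; } XLogRecPtr;

/* Strict "a is before b" ordering on (xlogid, xrecoff). */
static bool
lsn_lt(XLogRecPtr a, XLogRecPtr b)
{
    return a.xlogid < b.xlogid ||
        (a.xlogid == b.xlogid && a.xrecoff < b.xrecoff);
}

/* Stand-in for the info_lck spinlock. */
static atomic_flag info_lck = ATOMIC_FLAG_INIT;

static void
spin_acquire(void)
{
    while (atomic_flag_test_and_set_explicit(&info_lck, memory_order_acquire))
        ;                       /* busy-wait until the flag is clear */
}

static void
spin_release(void)
{
    atomic_flag_clear_explicit(&info_lck, memory_order_release);
}

/* Newest async commit LSN reported so far. */
static XLogRecPtr async_commit_lsn = {0, 0};

/* Advance the shared LSN, never moving it backwards. */
static void
set_async_commit_lsn(XLogRecPtr lsn)
{
    spin_acquire();
    if (lsn_lt(async_commit_lsn, lsn))
        async_commit_lsn = lsn;
    /* Postcondition: the stored value is at least the reported one. */
    assert(!lsn_lt(async_commit_lsn, lsn));
    spin_release();
}

/* Read the shared LSN under the lock, as a flush routine would. */
static XLogRecPtr
get_async_commit_lsn(void)
{
    XLogRecPtr result;

    spin_acquire();
    result = async_commit_lsn;
    spin_release();
    return result;
}
```

Keeping only the maximum means a reader needs a single flush target that covers every async commit reported so far, regardless of the order in which callers reported them.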
|
@ -13,7 +13,7 @@
|
|||||||
*
|
*
|
||||||
*
|
*
|
||||||
* IDENTIFICATION
|
* IDENTIFICATION
|
||||||
* $PostgreSQL: pgsql/src/backend/commands/dbcommands.c,v 1.196 2007/06/28 00:02:38 tgl Exp $
|
* $PostgreSQL: pgsql/src/backend/commands/dbcommands.c,v 1.197 2007/08/01 22:45:08 tgl Exp $
|
||||||
*
|
*
|
||||||
*-------------------------------------------------------------------------
|
*-------------------------------------------------------------------------
|
||||||
*/
|
*/
|
||||||
@ -515,7 +515,11 @@ createdb(const CreatedbStmt *stmt)
|
|||||||
heap_close(pg_database_rel, NoLock);
|
heap_close(pg_database_rel, NoLock);
|
||||||
|
|
||||||
/*
|
/*
|
||||||
* Set flag to update flat database file at commit.
|
* Set flag to update flat database file at commit. Note: this also
|
||||||
|
* forces synchronous commit, which minimizes the window between
|
||||||
|
* creation of the database files and committal of the transaction.
|
||||||
|
* If we crash before committing, we'll have a DB that's taking up
|
||||||
|
* disk space but is not in pg_database, which is not good.
|
||||||
*/
|
*/
|
||||||
database_file_update_needed();
|
database_file_update_needed();
|
||||||
}
|
}
|
||||||
@ -675,7 +679,11 @@ dropdb(const char *dbname, bool missing_ok)
|
|||||||
heap_close(pgdbrel, NoLock);
|
heap_close(pgdbrel, NoLock);
|
||||||
|
|
||||||
/*
|
/*
|
||||||
* Set flag to update flat database file at commit.
|
* Set flag to update flat database file at commit. Note: this also
|
||||||
|
* forces synchronous commit, which minimizes the window between
|
||||||
|
* removal of the database files and committal of the transaction.
|
||||||
|
* If we crash before committing, we'll have a DB that's gone on disk
|
||||||
|
* but still there according to pg_database, which is not good.
|
||||||
*/
|
*/
|
||||||
database_file_update_needed();
|
database_file_update_needed();
|
||||||
}
|
}
|
||||||
|
@ -37,7 +37,7 @@
|
|||||||
*
|
*
|
||||||
*
|
*
|
||||||
* IDENTIFICATION
|
* IDENTIFICATION
|
||||||
* $PostgreSQL: pgsql/src/backend/commands/tablespace.c,v 1.48 2007/06/07 19:19:56 tgl Exp $
|
* $PostgreSQL: pgsql/src/backend/commands/tablespace.c,v 1.49 2007/08/01 22:45:08 tgl Exp $
|
||||||
*
|
*
|
||||||
*-------------------------------------------------------------------------
|
*-------------------------------------------------------------------------
|
||||||
*/
|
*/
|
||||||
@ -354,6 +354,14 @@ CreateTableSpace(CreateTableSpaceStmt *stmt)
|
|||||||
(void) XLogInsert(RM_TBLSPC_ID, XLOG_TBLSPC_CREATE, rdata);
|
(void) XLogInsert(RM_TBLSPC_ID, XLOG_TBLSPC_CREATE, rdata);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/*
|
||||||
|
* Force synchronous commit, to minimize the window between creating
|
||||||
|
* the symlink on-disk and marking the transaction committed. It's
|
||||||
|
* not great that there is any window at all, but definitely we don't
|
||||||
|
* want to make it larger than necessary.
|
||||||
|
*/
|
||||||
|
ForceSyncCommit();
|
||||||
|
|
||||||
pfree(linkloc);
|
pfree(linkloc);
|
||||||
pfree(location);
|
pfree(location);
|
||||||
|
|
||||||
@ -480,6 +488,14 @@ DropTableSpace(DropTableSpaceStmt *stmt)
|
|||||||
* entries for relations in the tablespace.
|
* entries for relations in the tablespace.
|
||||||
*/
|
*/
|
||||||
|
|
||||||
|
/*
|
||||||
|
* Force synchronous commit, to minimize the window between removing
|
||||||
|
* the files on-disk and marking the transaction committed. It's
|
||||||
|
* not great that there is any window at all, but definitely we don't
|
||||||
|
* want to make it larger than necessary.
|
||||||
|
*/
|
||||||
|
ForceSyncCommit();
|
||||||
|
|
||||||
/*
|
/*
|
||||||
* Allow TablespaceCreateDbspace again.
|
* Allow TablespaceCreateDbspace again.
|
||||||
*/
|
*/
|
||||||
|
@ -13,7 +13,7 @@
|
|||||||
*
|
*
|
||||||
*
|
*
|
||||||
* IDENTIFICATION
|
* IDENTIFICATION
|
||||||
* $PostgreSQL: pgsql/src/backend/commands/vacuum.c,v 1.353 2007/06/14 13:53:14 alvherre Exp $
|
* $PostgreSQL: pgsql/src/backend/commands/vacuum.c,v 1.354 2007/08/01 22:45:08 tgl Exp $
|
||||||
*
|
*
|
||||||
*-------------------------------------------------------------------------
|
*-------------------------------------------------------------------------
|
||||||
*/
|
*/
|
||||||
@ -27,6 +27,7 @@
|
|||||||
#include "access/heapam.h"
|
#include "access/heapam.h"
|
||||||
#include "access/transam.h"
|
#include "access/transam.h"
|
||||||
#include "access/xact.h"
|
#include "access/xact.h"
|
||||||
|
#include "access/xlog.h"
|
||||||
#include "catalog/namespace.h"
|
#include "catalog/namespace.h"
|
||||||
#include "catalog/pg_database.h"
|
#include "catalog/pg_database.h"
|
||||||
#include "commands/dbcommands.h"
|
#include "commands/dbcommands.h"
|
||||||
@ -1162,6 +1163,16 @@ full_vacuum_rel(Relation onerel, VacuumStmt *vacstmt)
|
|||||||
vacuum_set_xid_limits(vacstmt->freeze_min_age, onerel->rd_rel->relisshared,
|
vacuum_set_xid_limits(vacstmt->freeze_min_age, onerel->rd_rel->relisshared,
|
||||||
&OldestXmin, &FreezeLimit);
|
&OldestXmin, &FreezeLimit);
|
||||||
|
|
||||||
|
/*
|
||||||
|
* VACUUM FULL assumes that all tuple states are well-known prior to
|
||||||
|
* moving tuples around --- see comment "known dead" in repair_frag(),
|
||||||
|
* as well as simplifications in tqual.c. So before we start we must
|
||||||
|
* ensure that any asynchronously-committed transactions with changes
|
||||||
|
* against this table have been flushed to disk. It's sufficient to do
|
||||||
|
* this once after we've acquired AccessExclusiveLock.
|
||||||
|
*/
|
||||||
|
XLogAsyncCommitFlush();
|
||||||
|
|
||||||
/*
|
/*
|
||||||
* Set up statistics-gathering machinery.
|
* Set up statistics-gathering machinery.
|
||||||
*/
|
*/
|
||||||
@ -2373,8 +2384,15 @@ repair_frag(VRelStats *vacrelstats, Relation onerel,
|
|||||||
* exclusive access to the relation. However, that would require a
|
* exclusive access to the relation. However, that would require a
|
||||||
* lot of extra code to close and re-open the relation, indexes, etc.
|
* lot of extra code to close and re-open the relation, indexes, etc.
|
||||||
* For now, a quick hack: record status of current transaction as
|
* For now, a quick hack: record status of current transaction as
|
||||||
* committed, and continue.
|
* committed, and continue. We force the commit to be synchronous
|
||||||
|
* so that it's down to disk before we truncate. (Note: tqual.c
|
||||||
|
* knows that VACUUM FULL always uses sync commit, too.)
|
||||||
|
*
|
||||||
|
* XXX This desperately needs to be revisited. Any failure after
|
||||||
|
* this point will result in a PANIC "cannot abort transaction nnn,
|
||||||
|
* it was already committed"!
|
||||||
*/
|
*/
|
||||||
|
ForceSyncCommit();
|
||||||
RecordTransactionCommit();
|
RecordTransactionCommit();
|
||||||
}
|
}
|
||||||
|
|
||||||
|
@ -23,7 +23,7 @@
|
|||||||
* Portions Copyright (c) 1996-2007, PostgreSQL Global Development Group
|
* Portions Copyright (c) 1996-2007, PostgreSQL Global Development Group
|
||||||
* Portions Copyright (c) 1994, Regents of the University of California
|
* Portions Copyright (c) 1994, Regents of the University of California
|
||||||
*
|
*
|
||||||
* $PostgreSQL: pgsql/src/backend/utils/init/flatfiles.c,v 1.26 2007/06/12 17:16:52 tgl Exp $
|
* $PostgreSQL: pgsql/src/backend/utils/init/flatfiles.c,v 1.27 2007/08/01 22:45:08 tgl Exp $
|
||||||
*
|
*
|
||||||
*-------------------------------------------------------------------------
|
*-------------------------------------------------------------------------
|
||||||
*/
|
*/
|
||||||
@ -855,6 +855,14 @@ AtEOXact_UpdateFlatFiles(bool isCommit)
|
|||||||
* Signal the postmaster to reload its caches.
|
* Signal the postmaster to reload its caches.
|
||||||
*/
|
*/
|
||||||
SendPostmasterSignal(PMSIGNAL_PASSWORD_CHANGE);
|
SendPostmasterSignal(PMSIGNAL_PASSWORD_CHANGE);
|
||||||
|
|
||||||
|
/*
|
||||||
|
* Force synchronous commit, to minimize the window between changing
|
||||||
|
* the flat files on-disk and marking the transaction committed. It's
|
||||||
|
* not great that there is any window at all, but definitely we don't
|
||||||
|
* want to make it larger than necessary.
|
||||||
|
*/
|
||||||
|
ForceSyncCommit();
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
|
@ -10,7 +10,7 @@
|
|||||||
* Written by Peter Eisentraut <peter_e@gmx.net>.
|
* Written by Peter Eisentraut <peter_e@gmx.net>.
|
||||||
*
|
*
|
||||||
* IDENTIFICATION
|
* IDENTIFICATION
|
||||||
* $PostgreSQL: pgsql/src/backend/utils/misc/guc.c,v 1.407 2007/07/24 04:54:09 tgl Exp $
|
* $PostgreSQL: pgsql/src/backend/utils/misc/guc.c,v 1.408 2007/08/01 22:45:09 tgl Exp $
|
||||||
*
|
*
|
||||||
*--------------------------------------------------------------------
|
*--------------------------------------------------------------------
|
||||||
*/
|
*/
|
||||||
@ -553,6 +553,14 @@ static struct config_bool ConfigureNamesBool[] =
|
|||||||
&enableFsync,
|
&enableFsync,
|
||||||
true, NULL, NULL
|
true, NULL, NULL
|
||||||
},
|
},
|
||||||
|
{
|
||||||
|
{"synchronous_commit", PGC_USERSET, WAL_SETTINGS,
|
||||||
|
gettext_noop("Sets immediate fsync at commit."),
|
||||||
|
NULL
|
||||||
|
},
|
||||||
|
&XactSyncCommit,
|
||||||
|
true, NULL, NULL
|
||||||
|
},
|
||||||
{
|
{
|
||||||
{"zero_damaged_pages", PGC_SUSET, DEVELOPER_OPTIONS,
|
{"zero_damaged_pages", PGC_SUSET, DEVELOPER_OPTIONS,
|
||||||
gettext_noop("Continues processing past damaged page headers."),
|
gettext_noop("Continues processing past damaged page headers."),
|
||||||
@ -1521,7 +1529,7 @@ static struct config_int ConfigureNamesInt[] =
|
|||||||
},
|
},
|
||||||
|
|
||||||
{
|
{
|
||||||
{"commit_delay", PGC_USERSET, WAL_CHECKPOINTS,
|
{"commit_delay", PGC_USERSET, WAL_SETTINGS,
|
||||||
gettext_noop("Sets the delay in microseconds between transaction commit and "
|
gettext_noop("Sets the delay in microseconds between transaction commit and "
|
||||||
"flushing WAL to disk."),
|
"flushing WAL to disk."),
|
||||||
NULL
|
NULL
|
||||||
@ -1531,7 +1539,7 @@ static struct config_int ConfigureNamesInt[] =
|
|||||||
},
|
},
|
||||||
|
|
||||||
{
|
{
|
||||||
{"commit_siblings", PGC_USERSET, WAL_CHECKPOINTS,
|
{"commit_siblings", PGC_USERSET, WAL_SETTINGS,
|
||||||
gettext_noop("Sets the minimum concurrent open transactions before performing "
|
gettext_noop("Sets the minimum concurrent open transactions before performing "
|
||||||
"commit_delay."),
|
"commit_delay."),
|
||||||
NULL
|
NULL
|
||||||
|
@ -149,6 +149,7 @@
|
|||||||
# - Settings -
|
# - Settings -
|
||||||
|
|
||||||
#fsync = on # turns forced synchronization on or off
|
#fsync = on # turns forced synchronization on or off
|
||||||
|
#synchronous_commit = on # immediate fsync at commit
|
||||||
#wal_sync_method = fsync # the default is the first option
|
#wal_sync_method = fsync # the default is the first option
|
||||||
# supported by the operating system:
|
# supported by the operating system:
|
||||||
# open_datasync
|
# open_datasync
|
||||||
|
@ -31,7 +31,7 @@
|
|||||||
* Portions Copyright (c) 1994, Regents of the University of California
|
* Portions Copyright (c) 1994, Regents of the University of California
|
||||||
*
|
*
|
||||||
* IDENTIFICATION
|
* IDENTIFICATION
|
||||||
* $PostgreSQL: pgsql/src/backend/utils/time/tqual.c,v 1.102 2007/03/25 19:45:14 tgl Exp $
|
* $PostgreSQL: pgsql/src/backend/utils/time/tqual.c,v 1.103 2007/08/01 22:45:09 tgl Exp $
|
||||||
*
|
*
|
||||||
*-------------------------------------------------------------------------
|
*-------------------------------------------------------------------------
|
||||||
*/
|
*/
|
||||||
@ -80,6 +80,44 @@ TransactionId RecentGlobalXmin = InvalidTransactionId;
|
|||||||
static bool XidInMVCCSnapshot(TransactionId xid, Snapshot snapshot);
|
static bool XidInMVCCSnapshot(TransactionId xid, Snapshot snapshot);
|
||||||
|
|
||||||
|
|
||||||
|
/*
|
||||||
|
* HeapTupleSetHintBits()
|
||||||
|
*
|
||||||
|
* Set commit/abort hint bits on a tuple, if appropriate at this time.
|
||||||
|
*
|
||||||
|
* We cannot change the LSN of the page here because we may hold only a share
|
||||||
|
* lock on the buffer, so it is only safe to set a transaction-committed hint
|
||||||
|
* bit if we know the transaction's commit record has been flushed to disk.
|
||||||
|
*
|
||||||
|
* We can always set hint bits when marking a transaction aborted. Also,
|
||||||
|
* if we are cleaning up HEAP_MOVED_IN or HEAP_MOVED_OFF entries, then
|
||||||
|
* we can always set the hint bits, since VACUUM FULL always uses synchronous
|
||||||
|
* commits.
|
||||||
|
*
|
||||||
|
* Normal commits may be asynchronous, so for those we need to get the LSN
|
||||||
|
* of the transaction and then check whether this is flushed.
|
||||||
|
*
|
||||||
|
* The caller should pass xid as the XID of the transaction to check, or
|
||||||
|
* InvalidTransactionId if no check is needed.
|
||||||
|
*/
|
||||||
|
static inline void
|
||||||
|
HeapTupleSetHintBits(HeapTupleHeader tuple, Buffer buffer,
|
||||||
|
uint16 infomask, TransactionId xid)
|
||||||
|
{
|
||||||
|
if (TransactionIdIsValid(xid))
|
||||||
|
{
|
||||||
|
/* NB: xid must be known committed here! */
|
||||||
|
XLogRecPtr commitLSN = TransactionIdGetCommitLSN(xid);
|
||||||
|
|
||||||
|
if (XLogNeedsFlush(commitLSN))
|
||||||
|
return; /* not flushed yet, so don't set hint */
|
||||||
|
}
|
||||||
|
|
||||||
|
tuple->t_infomask |= infomask;
|
||||||
|
SetBufferCommitInfoNeedsSave(buffer);
|
||||||
|
}
|
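Note on HeapTupleSetHintBits() above: the rule it implements is that a "committed" hint bit may be set only once the commit record is known to be flushed, while "aborted/invalid" hints skip the check. The following is a minimal standalone sketch of that rule under simplified assumptions. SketchLsn, SketchWal, sketch_needs_flush() and sketch_set_hint() are hypothetical stand-ins for XLogRecPtr, the WAL flush position, XLogNeedsFlush() and the real function; the bit values are arbitrary and buffer handling is omitted.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for XLogRecPtr: a single increasing WAL position. */
typedef uint64_t SketchLsn;

/* Arbitrary illustrative bit values, not PostgreSQL's real infomask bits. */
#define SKETCH_XMIN_COMMITTED 0x0100u
#define SKETCH_XMIN_INVALID   0x0200u

typedef struct
{
    SketchLsn flushed_upto;    /* WAL is durable on disk up to this position */
} SketchWal;

/* True if a record at 'lsn' has not yet been flushed to disk. */
static bool
sketch_needs_flush(const SketchWal *wal, SketchLsn lsn)
{
    return lsn > wal->flushed_upto;
}

/*
 * Set hint bits unless a commit check is requested and the commit record
 * is not yet durable.  Returns true if the bits were set.  check_commit
 * plays the role of passing a valid xid rather than InvalidTransactionId.
 */
static bool
sketch_set_hint(uint16_t *infomask, uint16_t bits, const SketchWal *wal,
                bool check_commit, SketchLsn commit_lsn)
{
    if (check_commit && sketch_needs_flush(wal, commit_lsn))
        return false;          /* not flushed yet, so don't set hint */

    *infomask |= bits;
    return true;
}
```

A caller that finds the hint unset simply repeats the commit-status lookup on a later visit; skipping the hint costs time, not correctness.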
||||||
|
|
||||||
|
|
||||||
/*
|
/*
|
||||||
* HeapTupleSatisfiesSelf
|
* HeapTupleSatisfiesSelf
|
||||||
* True iff heap tuple is valid "for itself".
|
* True iff heap tuple is valid "for itself".
|
||||||
@ -122,12 +160,12 @@ HeapTupleSatisfiesSelf(HeapTupleHeader tuple, Snapshot snapshot, Buffer buffer)
|
|||||||
{
|
{
|
||||||
if (TransactionIdDidCommit(xvac))
|
if (TransactionIdDidCommit(xvac))
|
||||||
{
|
{
|
||||||
tuple->t_infomask |= HEAP_XMIN_INVALID;
|
HeapTupleSetHintBits(tuple, buffer, HEAP_XMIN_INVALID,
|
||||||
SetBufferCommitInfoNeedsSave(buffer);
|
InvalidTransactionId);
|
||||||
return false;
|
return false;
|
||||||
}
|
}
|
||||||
tuple->t_infomask |= HEAP_XMIN_COMMITTED;
|
HeapTupleSetHintBits(tuple, buffer, HEAP_XMIN_COMMITTED,
|
||||||
SetBufferCommitInfoNeedsSave(buffer);
|
InvalidTransactionId);
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
else if (tuple->t_infomask & HEAP_MOVED_IN)
|
else if (tuple->t_infomask & HEAP_MOVED_IN)
|
||||||
@ -139,14 +177,12 @@ HeapTupleSatisfiesSelf(HeapTupleHeader tuple, Snapshot snapshot, Buffer buffer)
|
|||||||
if (TransactionIdIsInProgress(xvac))
|
if (TransactionIdIsInProgress(xvac))
|
||||||
return false;
|
return false;
|
||||||
if (TransactionIdDidCommit(xvac))
|
if (TransactionIdDidCommit(xvac))
|
||||||
{
|
HeapTupleSetHintBits(tuple, buffer, HEAP_XMIN_COMMITTED,
|
||||||
tuple->t_infomask |= HEAP_XMIN_COMMITTED;
|
InvalidTransactionId);
|
||||||
SetBufferCommitInfoNeedsSave(buffer);
|
|
||||||
}
|
|
||||||
else
|
else
|
||||||
{
|
{
|
||||||
tuple->t_infomask |= HEAP_XMIN_INVALID;
|
HeapTupleSetHintBits(tuple, buffer, HEAP_XMIN_INVALID,
|
||||||
SetBufferCommitInfoNeedsSave(buffer);
|
InvalidTransactionId);
|
||||||
return false;
|
return false;
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
@ -164,8 +200,8 @@ HeapTupleSatisfiesSelf(HeapTupleHeader tuple, Snapshot snapshot, Buffer buffer)
|
|||||||
/* deleting subtransaction aborted? */
|
/* deleting subtransaction aborted? */
|
||||||
if (TransactionIdDidAbort(HeapTupleHeaderGetXmax(tuple)))
|
if (TransactionIdDidAbort(HeapTupleHeaderGetXmax(tuple)))
|
||||||
{
|
{
|
||||||
tuple->t_infomask |= HEAP_XMAX_INVALID;
|
HeapTupleSetHintBits(tuple, buffer, HEAP_XMAX_INVALID,
|
||||||
SetBufferCommitInfoNeedsSave(buffer);
|
InvalidTransactionId);
|
||||||
return true;
|
return true;
|
||||||
}
|
}
|
||||||
|
|
||||||
@ -176,15 +212,13 @@ HeapTupleSatisfiesSelf(HeapTupleHeader tuple, Snapshot snapshot, Buffer buffer)
|
|||||||
else if (TransactionIdIsInProgress(HeapTupleHeaderGetXmin(tuple)))
|
else if (TransactionIdIsInProgress(HeapTupleHeaderGetXmin(tuple)))
|
||||||
return false;
|
return false;
|
||||||
else if (TransactionIdDidCommit(HeapTupleHeaderGetXmin(tuple)))
|
else if (TransactionIdDidCommit(HeapTupleHeaderGetXmin(tuple)))
|
||||||
{
|
HeapTupleSetHintBits(tuple, buffer, HEAP_XMIN_COMMITTED,
|
||||||
tuple->t_infomask |= HEAP_XMIN_COMMITTED;
|
HeapTupleHeaderGetXmin(tuple));
|
||||||
SetBufferCommitInfoNeedsSave(buffer);
|
|
||||||
}
|
|
||||||
else
|
else
|
||||||
{
|
{
|
||||||
/* it must have aborted or crashed */
|
/* it must have aborted or crashed */
|
||||||
tuple->t_infomask |= HEAP_XMIN_INVALID;
|
HeapTupleSetHintBits(tuple, buffer, HEAP_XMIN_INVALID,
|
||||||
SetBufferCommitInfoNeedsSave(buffer);
|
InvalidTransactionId);
|
||||||
return false;
|
return false;
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
@ -221,8 +255,8 @@ HeapTupleSatisfiesSelf(HeapTupleHeader tuple, Snapshot snapshot, Buffer buffer)
|
|||||||
if (!TransactionIdDidCommit(HeapTupleHeaderGetXmax(tuple)))
|
if (!TransactionIdDidCommit(HeapTupleHeaderGetXmax(tuple)))
|
||||||
{
|
{
|
||||||
/* it must have aborted or crashed */
|
/* it must have aborted or crashed */
|
||||||
tuple->t_infomask |= HEAP_XMAX_INVALID;
|
HeapTupleSetHintBits(tuple, buffer, HEAP_XMAX_INVALID,
|
||||||
SetBufferCommitInfoNeedsSave(buffer);
|
InvalidTransactionId);
|
||||||
return true;
|
return true;
|
||||||
}
|
}
|
||||||
|
|
||||||
@ -230,13 +264,13 @@ HeapTupleSatisfiesSelf(HeapTupleHeader tuple, Snapshot snapshot, Buffer buffer)
|
|||||||
|
|
||||||
if (tuple->t_infomask & HEAP_IS_LOCKED)
|
if (tuple->t_infomask & HEAP_IS_LOCKED)
|
||||||
{
|
{
|
||||||
tuple->t_infomask |= HEAP_XMAX_INVALID;
|
HeapTupleSetHintBits(tuple, buffer, HEAP_XMAX_INVALID,
|
||||||
SetBufferCommitInfoNeedsSave(buffer);
|
InvalidTransactionId);
|
||||||
return true;
|
return true;
|
||||||
}
|
}
|
||||||
|
|
||||||
tuple->t_infomask |= HEAP_XMAX_COMMITTED;
|
HeapTupleSetHintBits(tuple, buffer, HEAP_XMAX_COMMITTED,
|
||||||
SetBufferCommitInfoNeedsSave(buffer);
|
HeapTupleHeaderGetXmax(tuple));
|
||||||
return false;
|
return false;
|
||||||
}
|
}
|
||||||
|
|
||||||
@ -299,12 +333,12 @@ HeapTupleSatisfiesNow(HeapTupleHeader tuple, Snapshot snapshot, Buffer buffer)
|
|||||||
{
|
{
|
||||||
if (TransactionIdDidCommit(xvac))
|
if (TransactionIdDidCommit(xvac))
|
||||||
{
|
{
|
||||||
tuple->t_infomask |= HEAP_XMIN_INVALID;
|
HeapTupleSetHintBits(tuple, buffer, HEAP_XMIN_INVALID,
|
||||||
SetBufferCommitInfoNeedsSave(buffer);
|
InvalidTransactionId);
|
||||||
return false;
|
return false;
|
||||||
}
|
}
|
||||||
tuple->t_infomask |= HEAP_XMIN_COMMITTED;
|
HeapTupleSetHintBits(tuple, buffer, HEAP_XMIN_COMMITTED,
|
||||||
SetBufferCommitInfoNeedsSave(buffer);
|
InvalidTransactionId);
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
else if (tuple->t_infomask & HEAP_MOVED_IN)
|
else if (tuple->t_infomask & HEAP_MOVED_IN)
|
||||||
@ -316,14 +350,12 @@ HeapTupleSatisfiesNow(HeapTupleHeader tuple, Snapshot snapshot, Buffer buffer)
|
|||||||
if (TransactionIdIsInProgress(xvac))
|
if (TransactionIdIsInProgress(xvac))
|
||||||
return false;
|
return false;
|
||||||
if (TransactionIdDidCommit(xvac))
|
if (TransactionIdDidCommit(xvac))
|
||||||
{
|
HeapTupleSetHintBits(tuple, buffer, HEAP_XMIN_COMMITTED,
|
||||||
tuple->t_infomask |= HEAP_XMIN_COMMITTED;
|
InvalidTransactionId);
|
||||||
SetBufferCommitInfoNeedsSave(buffer);
|
|
||||||
}
|
|
||||||
else
|
else
|
||||||
{
|
{
|
||||||
tuple->t_infomask |= HEAP_XMIN_INVALID;
|
HeapTupleSetHintBits(tuple, buffer, HEAP_XMIN_INVALID,
|
||||||
SetBufferCommitInfoNeedsSave(buffer);
|
InvalidTransactionId);
|
||||||
return false;
|
return false;
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
@ -344,8 +376,8 @@ HeapTupleSatisfiesNow(HeapTupleHeader tuple, Snapshot snapshot, Buffer buffer)
|
|||||||
/* deleting subtransaction aborted? */
|
/* deleting subtransaction aborted? */
|
||||||
if (TransactionIdDidAbort(HeapTupleHeaderGetXmax(tuple)))
|
if (TransactionIdDidAbort(HeapTupleHeaderGetXmax(tuple)))
|
||||||
{
|
{
|
||||||
tuple->t_infomask |= HEAP_XMAX_INVALID;
|
HeapTupleSetHintBits(tuple, buffer, HEAP_XMAX_INVALID,
|
||||||
SetBufferCommitInfoNeedsSave(buffer);
|
InvalidTransactionId);
|
||||||
return true;
|
return true;
|
||||||
}
|
}
|
||||||
|
|
||||||
@ -359,15 +391,13 @@ HeapTupleSatisfiesNow(HeapTupleHeader tuple, Snapshot snapshot, Buffer buffer)
|
|||||||
else if (TransactionIdIsInProgress(HeapTupleHeaderGetXmin(tuple)))
|
else if (TransactionIdIsInProgress(HeapTupleHeaderGetXmin(tuple)))
|
||||||
return false;
|
return false;
|
||||||
else if (TransactionIdDidCommit(HeapTupleHeaderGetXmin(tuple)))
|
else if (TransactionIdDidCommit(HeapTupleHeaderGetXmin(tuple)))
|
||||||
{
|
HeapTupleSetHintBits(tuple, buffer, HEAP_XMIN_COMMITTED,
|
||||||
tuple->t_infomask |= HEAP_XMIN_COMMITTED;
|
HeapTupleHeaderGetXmin(tuple));
|
||||||
SetBufferCommitInfoNeedsSave(buffer);
|
|
||||||
}
|
|
||||||
else
|
else
|
||||||
{
|
{
|
||||||
/* it must have aborted or crashed */
|
/* it must have aborted or crashed */
|
||||||
tuple->t_infomask |= HEAP_XMIN_INVALID;
|
HeapTupleSetHintBits(tuple, buffer, HEAP_XMIN_INVALID,
|
||||||
SetBufferCommitInfoNeedsSave(buffer);
|
InvalidTransactionId);
|
||||||
return false;
|
return false;
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
@ -407,8 +437,8 @@ HeapTupleSatisfiesNow(HeapTupleHeader tuple, Snapshot snapshot, Buffer buffer)
|
|||||||
if (!TransactionIdDidCommit(HeapTupleHeaderGetXmax(tuple)))
|
if (!TransactionIdDidCommit(HeapTupleHeaderGetXmax(tuple)))
|
||||||
{
|
{
|
||||||
/* it must have aborted or crashed */
|
/* it must have aborted or crashed */
|
||||||
tuple->t_infomask |= HEAP_XMAX_INVALID;
|
HeapTupleSetHintBits(tuple, buffer, HEAP_XMAX_INVALID,
|
||||||
SetBufferCommitInfoNeedsSave(buffer);
|
InvalidTransactionId);
|
||||||
return true;
|
return true;
|
||||||
}
|
}
|
||||||
|
|
||||||
@ -416,13 +446,13 @@ HeapTupleSatisfiesNow(HeapTupleHeader tuple, Snapshot snapshot, Buffer buffer)
|
|||||||
|
|
||||||
if (tuple->t_infomask & HEAP_IS_LOCKED)
|
if (tuple->t_infomask & HEAP_IS_LOCKED)
|
||||||
{
|
{
|
||||||
tuple->t_infomask |= HEAP_XMAX_INVALID;
|
HeapTupleSetHintBits(tuple, buffer, HEAP_XMAX_INVALID,
|
||||||
SetBufferCommitInfoNeedsSave(buffer);
|
InvalidTransactionId);
|
||||||
return true;
|
return true;
|
||||||
}
|
}
|
||||||
|
|
||||||
tuple->t_infomask |= HEAP_XMAX_COMMITTED;
|
HeapTupleSetHintBits(tuple, buffer, HEAP_XMAX_COMMITTED,
|
||||||
SetBufferCommitInfoNeedsSave(buffer);
|
HeapTupleHeaderGetXmax(tuple));
|
||||||
return false;
|
return false;
|
||||||
}
|
}
|
||||||
|
|
||||||
@ -469,12 +499,12 @@ HeapTupleSatisfiesToast(HeapTupleHeader tuple, Snapshot snapshot,
|
|||||||
{
|
{
|
||||||
if (TransactionIdDidCommit(xvac))
|
if (TransactionIdDidCommit(xvac))
|
||||||
{
|
{
|
||||||
tuple->t_infomask |= HEAP_XMIN_INVALID;
|
HeapTupleSetHintBits(tuple, buffer, HEAP_XMIN_INVALID,
|
||||||
SetBufferCommitInfoNeedsSave(buffer);
|
InvalidTransactionId);
|
||||||
return false;
|
return false;
|
||||||
}
|
}
|
||||||
tuple->t_infomask |= HEAP_XMIN_COMMITTED;
|
HeapTupleSetHintBits(tuple, buffer, HEAP_XMIN_COMMITTED,
|
||||||
SetBufferCommitInfoNeedsSave(buffer);
|
InvalidTransactionId);
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
else if (tuple->t_infomask & HEAP_MOVED_IN)
|
else if (tuple->t_infomask & HEAP_MOVED_IN)
|
||||||
@ -486,14 +516,12 @@ HeapTupleSatisfiesToast(HeapTupleHeader tuple, Snapshot snapshot,
|
|||||||
if (TransactionIdIsInProgress(xvac))
|
if (TransactionIdIsInProgress(xvac))
|
||||||
return false;
|
return false;
|
||||||
if (TransactionIdDidCommit(xvac))
|
if (TransactionIdDidCommit(xvac))
|
||||||
{
|
HeapTupleSetHintBits(tuple, buffer, HEAP_XMIN_COMMITTED,
|
||||||
tuple->t_infomask |= HEAP_XMIN_COMMITTED;
|
InvalidTransactionId);
|
||||||
SetBufferCommitInfoNeedsSave(buffer);
|
|
||||||
}
|
|
||||||
else
|
else
|
||||||
{
|
{
|
||||||
tuple->t_infomask |= HEAP_XMIN_INVALID;
|
HeapTupleSetHintBits(tuple, buffer, HEAP_XMIN_INVALID,
|
||||||
SetBufferCommitInfoNeedsSave(buffer);
|
InvalidTransactionId);
|
||||||
return false;
|
return false;
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
@ -550,12 +578,12 @@ HeapTupleSatisfiesUpdate(HeapTupleHeader tuple, CommandId curcid,
|
|||||||
{
|
{
|
||||||
if (TransactionIdDidCommit(xvac))
|
if (TransactionIdDidCommit(xvac))
|
||||||
{
|
{
|
||||||
tuple->t_infomask |= HEAP_XMIN_INVALID;
|
HeapTupleSetHintBits(tuple, buffer, HEAP_XMIN_INVALID,
|
||||||
SetBufferCommitInfoNeedsSave(buffer);
|
InvalidTransactionId);
|
||||||
return HeapTupleInvisible;
|
return HeapTupleInvisible;
|
||||||
}
|
}
|
||||||
tuple->t_infomask |= HEAP_XMIN_COMMITTED;
|
HeapTupleSetHintBits(tuple, buffer, HEAP_XMIN_COMMITTED,
|
||||||
SetBufferCommitInfoNeedsSave(buffer);
|
InvalidTransactionId);
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
else if (tuple->t_infomask & HEAP_MOVED_IN)
|
else if (tuple->t_infomask & HEAP_MOVED_IN)
|
||||||
@ -567,14 +595,12 @@ HeapTupleSatisfiesUpdate(HeapTupleHeader tuple, CommandId curcid,
|
|||||||
if (TransactionIdIsInProgress(xvac))
|
if (TransactionIdIsInProgress(xvac))
|
||||||
return HeapTupleInvisible;
|
return HeapTupleInvisible;
|
||||||
if (TransactionIdDidCommit(xvac))
|
if (TransactionIdDidCommit(xvac))
|
||||||
{
|
HeapTupleSetHintBits(tuple, buffer, HEAP_XMIN_COMMITTED,
|
||||||
tuple->t_infomask |= HEAP_XMIN_COMMITTED;
|
InvalidTransactionId);
|
||||||
SetBufferCommitInfoNeedsSave(buffer);
|
|
||||||
}
|
|
||||||
else
|
else
|
||||||
{
|
{
|
||||||
tuple->t_infomask |= HEAP_XMIN_INVALID;
|
HeapTupleSetHintBits(tuple, buffer, HEAP_XMIN_INVALID,
|
||||||
SetBufferCommitInfoNeedsSave(buffer);
|
InvalidTransactionId);
|
||||||
return HeapTupleInvisible;
|
return HeapTupleInvisible;
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
@ -595,8 +621,8 @@ HeapTupleSatisfiesUpdate(HeapTupleHeader tuple, CommandId curcid,
|
|||||||
/* deleting subtransaction aborted? */
|
/* deleting subtransaction aborted? */
|
||||||
if (TransactionIdDidAbort(HeapTupleHeaderGetXmax(tuple)))
|
if (TransactionIdDidAbort(HeapTupleHeaderGetXmax(tuple)))
|
||||||
{
|
{
|
||||||
tuple->t_infomask |= HEAP_XMAX_INVALID;
|
HeapTupleSetHintBits(tuple, buffer, HEAP_XMAX_INVALID,
|
||||||
SetBufferCommitInfoNeedsSave(buffer);
|
InvalidTransactionId);
|
||||||
return HeapTupleMayBeUpdated;
|
return HeapTupleMayBeUpdated;
|
||||||
}
|
}
|
||||||
|
|
||||||
@ -610,15 +636,13 @@ HeapTupleSatisfiesUpdate(HeapTupleHeader tuple, CommandId curcid,
|
|||||||
else if (TransactionIdIsInProgress(HeapTupleHeaderGetXmin(tuple)))
|
else if (TransactionIdIsInProgress(HeapTupleHeaderGetXmin(tuple)))
|
||||||
return HeapTupleInvisible;
|
return HeapTupleInvisible;
|
||||||
else if (TransactionIdDidCommit(HeapTupleHeaderGetXmin(tuple)))
|
else if (TransactionIdDidCommit(HeapTupleHeaderGetXmin(tuple)))
|
||||||
{
|
HeapTupleSetHintBits(tuple, buffer, HEAP_XMIN_COMMITTED,
|
||||||
tuple->t_infomask |= HEAP_XMIN_COMMITTED;
|
HeapTupleHeaderGetXmin(tuple));
|
||||||
SetBufferCommitInfoNeedsSave(buffer);
|
|
||||||
}
|
|
||||||
else
|
else
|
||||||
{
|
{
|
||||||
/* it must have aborted or crashed */
|
/* it must have aborted or crashed */
|
||||||
tuple->t_infomask |= HEAP_XMIN_INVALID;
|
HeapTupleSetHintBits(tuple, buffer, HEAP_XMIN_INVALID,
|
||||||
SetBufferCommitInfoNeedsSave(buffer);
|
InvalidTransactionId);
|
||||||
return HeapTupleInvisible;
|
return HeapTupleInvisible;
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
@ -642,8 +666,8 @@ HeapTupleSatisfiesUpdate(HeapTupleHeader tuple, CommandId curcid,
|
|||||||
|
|
||||||
if (MultiXactIdIsRunning(HeapTupleHeaderGetXmax(tuple)))
|
if (MultiXactIdIsRunning(HeapTupleHeaderGetXmax(tuple)))
|
||||||
return HeapTupleBeingUpdated;
|
return HeapTupleBeingUpdated;
|
||||||
tuple->t_infomask |= HEAP_XMAX_INVALID;
|
HeapTupleSetHintBits(tuple, buffer, HEAP_XMAX_INVALID,
|
||||||
SetBufferCommitInfoNeedsSave(buffer);
|
InvalidTransactionId);
|
||||||
return HeapTupleMayBeUpdated;
|
return HeapTupleMayBeUpdated;
|
||||||
}
|
}
|
||||||
|
|
||||||
@ -663,8 +687,8 @@ HeapTupleSatisfiesUpdate(HeapTupleHeader tuple, CommandId curcid,
|
|||||||
if (!TransactionIdDidCommit(HeapTupleHeaderGetXmax(tuple)))
|
if (!TransactionIdDidCommit(HeapTupleHeaderGetXmax(tuple)))
|
||||||
{
|
{
|
||||||
/* it must have aborted or crashed */
|
/* it must have aborted or crashed */
|
||||||
tuple->t_infomask |= HEAP_XMAX_INVALID;
|
HeapTupleSetHintBits(tuple, buffer, HEAP_XMAX_INVALID,
|
||||||
SetBufferCommitInfoNeedsSave(buffer);
|
InvalidTransactionId);
|
||||||
return HeapTupleMayBeUpdated;
|
return HeapTupleMayBeUpdated;
|
||||||
}
|
}
|
||||||
|
|
||||||
@ -672,13 +696,13 @@ HeapTupleSatisfiesUpdate(HeapTupleHeader tuple, CommandId curcid,
|
|||||||
|
|
||||||
if (tuple->t_infomask & HEAP_IS_LOCKED)
|
if (tuple->t_infomask & HEAP_IS_LOCKED)
|
||||||
{
|
{
|
||||||
tuple->t_infomask |= HEAP_XMAX_INVALID;
|
HeapTupleSetHintBits(tuple, buffer, HEAP_XMAX_INVALID,
|
||||||
SetBufferCommitInfoNeedsSave(buffer);
|
InvalidTransactionId);
|
||||||
return HeapTupleMayBeUpdated;
|
return HeapTupleMayBeUpdated;
|
||||||
}
|
}
|
||||||
|
|
||||||
tuple->t_infomask |= HEAP_XMAX_COMMITTED;
|
HeapTupleSetHintBits(tuple, buffer, HEAP_XMAX_COMMITTED,
|
||||||
SetBufferCommitInfoNeedsSave(buffer);
|
HeapTupleHeaderGetXmax(tuple));
|
||||||
return HeapTupleUpdated; /* updated by other */
|
return HeapTupleUpdated; /* updated by other */
|
||||||
}
|
}
|
||||||
|
|
||||||
@ -723,12 +747,12 @@ HeapTupleSatisfiesDirty(HeapTupleHeader tuple, Snapshot snapshot,
|
|||||||
{
|
{
|
||||||
if (TransactionIdDidCommit(xvac))
|
if (TransactionIdDidCommit(xvac))
|
||||||
{
|
{
|
||||||
tuple->t_infomask |= HEAP_XMIN_INVALID;
|
HeapTupleSetHintBits(tuple, buffer, HEAP_XMIN_INVALID,
|
||||||
SetBufferCommitInfoNeedsSave(buffer);
|
InvalidTransactionId);
|
||||||
return false;
|
return false;
|
||||||
}
|
}
|
||||||
tuple->t_infomask |= HEAP_XMIN_COMMITTED;
|
HeapTupleSetHintBits(tuple, buffer, HEAP_XMIN_COMMITTED,
|
||||||
SetBufferCommitInfoNeedsSave(buffer);
|
InvalidTransactionId);
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
else if (tuple->t_infomask & HEAP_MOVED_IN)
|
else if (tuple->t_infomask & HEAP_MOVED_IN)
|
||||||
@ -740,14 +764,12 @@ HeapTupleSatisfiesDirty(HeapTupleHeader tuple, Snapshot snapshot,
|
|||||||
if (TransactionIdIsInProgress(xvac))
|
if (TransactionIdIsInProgress(xvac))
|
||||||
return false;
|
return false;
|
||||||
if (TransactionIdDidCommit(xvac))
|
if (TransactionIdDidCommit(xvac))
|
||||||
{
|
HeapTupleSetHintBits(tuple, buffer, HEAP_XMIN_COMMITTED,
|
||||||
tuple->t_infomask |= HEAP_XMIN_COMMITTED;
|
InvalidTransactionId);
|
||||||
SetBufferCommitInfoNeedsSave(buffer);
|
|
||||||
}
|
|
||||||
else
|
else
|
||||||
{
|
{
|
||||||
tuple->t_infomask |= HEAP_XMIN_INVALID;
|
HeapTupleSetHintBits(tuple, buffer, HEAP_XMIN_INVALID,
|
||||||
SetBufferCommitInfoNeedsSave(buffer);
|
InvalidTransactionId);
|
||||||
return false;
|
return false;
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
@ -765,8 +787,8 @@ HeapTupleSatisfiesDirty(HeapTupleHeader tuple, Snapshot snapshot,
|
|||||||
/* deleting subtransaction aborted? */
|
/* deleting subtransaction aborted? */
|
||||||
if (TransactionIdDidAbort(HeapTupleHeaderGetXmax(tuple)))
|
if (TransactionIdDidAbort(HeapTupleHeaderGetXmax(tuple)))
|
||||||
{
|
{
|
||||||
tuple->t_infomask |= HEAP_XMAX_INVALID;
|
HeapTupleSetHintBits(tuple, buffer, HEAP_XMAX_INVALID,
|
||||||
SetBufferCommitInfoNeedsSave(buffer);
|
InvalidTransactionId);
|
||||||
return true;
|
return true;
|
||||||
}
|
}
|
||||||
|
|
||||||
@ -781,15 +803,13 @@ HeapTupleSatisfiesDirty(HeapTupleHeader tuple, Snapshot snapshot,
|
|||||||
return true; /* in insertion by other */
|
return true; /* in insertion by other */
|
||||||
}
|
}
|
||||||
else if (TransactionIdDidCommit(HeapTupleHeaderGetXmin(tuple)))
|
else if (TransactionIdDidCommit(HeapTupleHeaderGetXmin(tuple)))
|
||||||
{
|
HeapTupleSetHintBits(tuple, buffer, HEAP_XMIN_COMMITTED,
|
||||||
tuple->t_infomask |= HEAP_XMIN_COMMITTED;
|
HeapTupleHeaderGetXmin(tuple));
|
||||||
SetBufferCommitInfoNeedsSave(buffer);
|
|
||||||
}
|
|
||||||
else
|
else
|
||||||
{
|
{
|
||||||
/* it must have aborted or crashed */
|
/* it must have aborted or crashed */
|
||||||
tuple->t_infomask |= HEAP_XMIN_INVALID;
|
HeapTupleSetHintBits(tuple, buffer, HEAP_XMIN_INVALID,
|
||||||
SetBufferCommitInfoNeedsSave(buffer);
|
InvalidTransactionId);
|
||||||
return false;
|
return false;
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
@ -829,8 +849,8 @@ HeapTupleSatisfiesDirty(HeapTupleHeader tuple, Snapshot snapshot,
|
|||||||
if (!TransactionIdDidCommit(HeapTupleHeaderGetXmax(tuple)))
|
if (!TransactionIdDidCommit(HeapTupleHeaderGetXmax(tuple)))
|
||||||
{
|
{
|
||||||
/* it must have aborted or crashed */
|
/* it must have aborted or crashed */
|
||||||
tuple->t_infomask |= HEAP_XMAX_INVALID;
|
HeapTupleSetHintBits(tuple, buffer, HEAP_XMAX_INVALID,
|
||||||
SetBufferCommitInfoNeedsSave(buffer);
|
InvalidTransactionId);
|
||||||
return true;
|
return true;
|
||||||
}
|
}
|
||||||
|
|
||||||
@ -838,13 +858,13 @@ HeapTupleSatisfiesDirty(HeapTupleHeader tuple, Snapshot snapshot,
|
|||||||
|
|
||||||
if (tuple->t_infomask & HEAP_IS_LOCKED)
|
if (tuple->t_infomask & HEAP_IS_LOCKED)
|
||||||
{
|
{
|
||||||
tuple->t_infomask |= HEAP_XMAX_INVALID;
|
HeapTupleSetHintBits(tuple, buffer, HEAP_XMAX_INVALID,
|
||||||
SetBufferCommitInfoNeedsSave(buffer);
|
InvalidTransactionId);
|
||||||
return true;
|
return true;
|
||||||
}
|
}
|
||||||
|
|
||||||
tuple->t_infomask |= HEAP_XMAX_COMMITTED;
|
HeapTupleSetHintBits(tuple, buffer, HEAP_XMAX_COMMITTED,
|
||||||
SetBufferCommitInfoNeedsSave(buffer);
|
HeapTupleHeaderGetXmax(tuple));
|
||||||
return false; /* updated by other */
|
return false; /* updated by other */
|
||||||
}
|
}
|
||||||
|
|
||||||
@ -888,12 +908,12 @@ HeapTupleSatisfiesMVCC(HeapTupleHeader tuple, Snapshot snapshot,
|
|||||||
{
|
{
|
||||||
if (TransactionIdDidCommit(xvac))
|
if (TransactionIdDidCommit(xvac))
|
||||||
{
|
{
|
||||||
tuple->t_infomask |= HEAP_XMIN_INVALID;
|
HeapTupleSetHintBits(tuple, buffer, HEAP_XMIN_INVALID,
|
||||||
SetBufferCommitInfoNeedsSave(buffer);
|
InvalidTransactionId);
|
||||||
return false;
|
return false;
|
||||||
}
|
}
|
||||||
tuple->t_infomask |= HEAP_XMIN_COMMITTED;
|
HeapTupleSetHintBits(tuple, buffer, HEAP_XMIN_COMMITTED,
|
||||||
SetBufferCommitInfoNeedsSave(buffer);
|
InvalidTransactionId);
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
else if (tuple->t_infomask & HEAP_MOVED_IN)
|
else if (tuple->t_infomask & HEAP_MOVED_IN)
|
||||||
@ -905,14 +925,12 @@ HeapTupleSatisfiesMVCC(HeapTupleHeader tuple, Snapshot snapshot,
|
|||||||
if (TransactionIdIsInProgress(xvac))
|
if (TransactionIdIsInProgress(xvac))
|
||||||
return false;
|
return false;
|
||||||
if (TransactionIdDidCommit(xvac))
|
if (TransactionIdDidCommit(xvac))
|
||||||
{
|
HeapTupleSetHintBits(tuple, buffer, HEAP_XMIN_COMMITTED,
|
||||||
tuple->t_infomask |= HEAP_XMIN_COMMITTED;
|
InvalidTransactionId);
|
||||||
SetBufferCommitInfoNeedsSave(buffer);
|
|
||||||
}
|
|
||||||
else
|
else
|
||||||
{
|
{
|
||||||
tuple->t_infomask |= HEAP_XMIN_INVALID;
|
HeapTupleSetHintBits(tuple, buffer, HEAP_XMIN_INVALID,
|
||||||
SetBufferCommitInfoNeedsSave(buffer);
|
InvalidTransactionId);
|
||||||
return false;
|
return false;
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
@ -934,8 +952,8 @@ HeapTupleSatisfiesMVCC(HeapTupleHeader tuple, Snapshot snapshot,
|
|||||||
/* FIXME -- is this correct w.r.t. the cmax of the tuple? */
|
/* FIXME -- is this correct w.r.t. the cmax of the tuple? */
|
||||||
if (TransactionIdDidAbort(HeapTupleHeaderGetXmax(tuple)))
|
if (TransactionIdDidAbort(HeapTupleHeaderGetXmax(tuple)))
|
||||||
{
|
{
|
||||||
tuple->t_infomask |= HEAP_XMAX_INVALID;
|
HeapTupleSetHintBits(tuple, buffer, HEAP_XMAX_INVALID,
|
||||||
SetBufferCommitInfoNeedsSave(buffer);
|
InvalidTransactionId);
|
||||||
return true;
|
return true;
|
||||||
}
|
}
|
||||||
|
|
||||||
@ -949,15 +967,13 @@ HeapTupleSatisfiesMVCC(HeapTupleHeader tuple, Snapshot snapshot,
|
|||||||
else if (TransactionIdIsInProgress(HeapTupleHeaderGetXmin(tuple)))
|
else if (TransactionIdIsInProgress(HeapTupleHeaderGetXmin(tuple)))
|
||||||
return false;
|
return false;
|
||||||
else if (TransactionIdDidCommit(HeapTupleHeaderGetXmin(tuple)))
|
else if (TransactionIdDidCommit(HeapTupleHeaderGetXmin(tuple)))
|
||||||
{
|
HeapTupleSetHintBits(tuple, buffer, HEAP_XMIN_COMMITTED,
|
||||||
tuple->t_infomask |= HEAP_XMIN_COMMITTED;
|
HeapTupleHeaderGetXmin(tuple));
|
||||||
SetBufferCommitInfoNeedsSave(buffer);
|
|
||||||
}
|
|
||||||
else
|
else
|
||||||
{
|
{
|
||||||
/* it must have aborted or crashed */
|
/* it must have aborted or crashed */
|
||||||
tuple->t_infomask |= HEAP_XMIN_INVALID;
|
HeapTupleSetHintBits(tuple, buffer, HEAP_XMIN_INVALID,
|
||||||
SetBufferCommitInfoNeedsSave(buffer);
|
InvalidTransactionId);
|
||||||
return false;
|
return false;
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
@ -998,14 +1014,14 @@ HeapTupleSatisfiesMVCC(HeapTupleHeader tuple, Snapshot snapshot,
|
|||||||
if (!TransactionIdDidCommit(HeapTupleHeaderGetXmax(tuple)))
|
if (!TransactionIdDidCommit(HeapTupleHeaderGetXmax(tuple)))
|
||||||
{
|
{
|
||||||
/* it must have aborted or crashed */
|
/* it must have aborted or crashed */
|
||||||
tuple->t_infomask |= HEAP_XMAX_INVALID;
|
HeapTupleSetHintBits(tuple, buffer, HEAP_XMAX_INVALID,
|
||||||
SetBufferCommitInfoNeedsSave(buffer);
|
InvalidTransactionId);
|
||||||
return true;
|
return true;
|
||||||
}
|
}
|
||||||
|
|
||||||
/* xmax transaction committed */
|
/* xmax transaction committed */
|
||||||
tuple->t_infomask |= HEAP_XMAX_COMMITTED;
|
HeapTupleSetHintBits(tuple, buffer, HEAP_XMAX_COMMITTED,
|
||||||
SetBufferCommitInfoNeedsSave(buffer);
|
HeapTupleHeaderGetXmax(tuple));
|
||||||
}
|
}
|
||||||
|
|
||||||
/*
|
/*
|
||||||
@ -1054,12 +1070,12 @@ HeapTupleSatisfiesVacuum(HeapTupleHeader tuple, TransactionId OldestXmin,
|
|||||||
return HEAPTUPLE_DELETE_IN_PROGRESS;
|
return HEAPTUPLE_DELETE_IN_PROGRESS;
|
||||||
if (TransactionIdDidCommit(xvac))
|
if (TransactionIdDidCommit(xvac))
|
||||||
{
|
{
|
||||||
tuple->t_infomask |= HEAP_XMIN_INVALID;
|
HeapTupleSetHintBits(tuple, buffer, HEAP_XMIN_INVALID,
|
||||||
SetBufferCommitInfoNeedsSave(buffer);
|
InvalidTransactionId);
|
||||||
return HEAPTUPLE_DEAD;
|
return HEAPTUPLE_DEAD;
|
||||||
}
|
}
|
||||||
tuple->t_infomask |= HEAP_XMIN_COMMITTED;
|
HeapTupleSetHintBits(tuple, buffer, HEAP_XMIN_COMMITTED,
|
||||||
SetBufferCommitInfoNeedsSave(buffer);
|
InvalidTransactionId);
|
||||||
}
|
}
|
||||||
else if (tuple->t_infomask & HEAP_MOVED_IN)
|
else if (tuple->t_infomask & HEAP_MOVED_IN)
|
||||||
{
|
{
|
||||||
@ -1070,14 +1086,12 @@ HeapTupleSatisfiesVacuum(HeapTupleHeader tuple, TransactionId OldestXmin,
|
|||||||
if (TransactionIdIsInProgress(xvac))
|
if (TransactionIdIsInProgress(xvac))
|
||||||
return HEAPTUPLE_INSERT_IN_PROGRESS;
|
return HEAPTUPLE_INSERT_IN_PROGRESS;
|
||||||
if (TransactionIdDidCommit(xvac))
|
if (TransactionIdDidCommit(xvac))
|
||||||
{
|
HeapTupleSetHintBits(tuple, buffer, HEAP_XMIN_COMMITTED,
|
||||||
tuple->t_infomask |= HEAP_XMIN_COMMITTED;
|
InvalidTransactionId);
|
||||||
SetBufferCommitInfoNeedsSave(buffer);
|
|
||||||
}
|
|
||||||
else
|
else
|
||||||
{
|
{
|
||||||
tuple->t_infomask |= HEAP_XMIN_INVALID;
|
HeapTupleSetHintBits(tuple, buffer, HEAP_XMIN_INVALID,
|
||||||
SetBufferCommitInfoNeedsSave(buffer);
|
InvalidTransactionId);
|
||||||
return HEAPTUPLE_DEAD;
|
return HEAPTUPLE_DEAD;
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
@ -1091,21 +1105,22 @@ HeapTupleSatisfiesVacuum(HeapTupleHeader tuple, TransactionId OldestXmin,
|
|||||||
return HEAPTUPLE_DELETE_IN_PROGRESS;
|
return HEAPTUPLE_DELETE_IN_PROGRESS;
|
||||||
}
|
}
|
||||||
else if (TransactionIdDidCommit(HeapTupleHeaderGetXmin(tuple)))
|
else if (TransactionIdDidCommit(HeapTupleHeaderGetXmin(tuple)))
|
||||||
{
|
HeapTupleSetHintBits(tuple, buffer, HEAP_XMIN_COMMITTED,
|
||||||
tuple->t_infomask |= HEAP_XMIN_COMMITTED;
|
HeapTupleHeaderGetXmin(tuple));
|
||||||
SetBufferCommitInfoNeedsSave(buffer);
|
|
||||||
}
|
|
||||||
else
|
else
|
||||||
{
|
{
|
||||||
/*
|
/*
|
||||||
* Not in Progress, Not Committed, so either Aborted or crashed
|
* Not in Progress, Not Committed, so either Aborted or crashed
|
||||||
*/
|
*/
|
||||||
tuple->t_infomask |= HEAP_XMIN_INVALID;
|
HeapTupleSetHintBits(tuple, buffer, HEAP_XMIN_INVALID,
|
||||||
SetBufferCommitInfoNeedsSave(buffer);
|
InvalidTransactionId);
|
||||||
return HEAPTUPLE_DEAD;
|
return HEAPTUPLE_DEAD;
|
||||||
}
|
}
|
||||||
/* Should only get here if we set XMIN_COMMITTED */
|
/*
|
||||||
Assert(tuple->t_infomask & HEAP_XMIN_COMMITTED);
|
* At this point the xmin is known committed, but we might not have
|
||||||
|
* been able to set the hint bit yet; so we can no longer Assert
|
||||||
|
* that it's set.
|
||||||
|
*/
|
||||||
}
|
}
|
||||||
|
|
||||||
/*
|
/*
|
||||||
@ -1143,8 +1158,8 @@ HeapTupleSatisfiesVacuum(HeapTupleHeader tuple, TransactionId OldestXmin,
|
|||||||
* We know that xmax did lock the tuple, but it did not and will
|
* We know that xmax did lock the tuple, but it did not and will
|
||||||
* never actually update it.
|
* never actually update it.
|
||||||
*/
|
*/
|
||||||
tuple->t_infomask |= HEAP_XMAX_INVALID;
|
HeapTupleSetHintBits(tuple, buffer, HEAP_XMAX_INVALID,
|
||||||
SetBufferCommitInfoNeedsSave(buffer);
|
InvalidTransactionId);
|
||||||
}
|
}
|
||||||
return HEAPTUPLE_LIVE;
|
return HEAPTUPLE_LIVE;
|
||||||
}
|
}
|
||||||
@ -1161,21 +1176,22 @@ HeapTupleSatisfiesVacuum(HeapTupleHeader tuple, TransactionId OldestXmin,
|
|||||||
if (TransactionIdIsInProgress(HeapTupleHeaderGetXmax(tuple)))
|
if (TransactionIdIsInProgress(HeapTupleHeaderGetXmax(tuple)))
|
||||||
return HEAPTUPLE_DELETE_IN_PROGRESS;
|
return HEAPTUPLE_DELETE_IN_PROGRESS;
|
||||||
else if (TransactionIdDidCommit(HeapTupleHeaderGetXmax(tuple)))
|
else if (TransactionIdDidCommit(HeapTupleHeaderGetXmax(tuple)))
|
||||||
{
|
HeapTupleSetHintBits(tuple, buffer, HEAP_XMAX_COMMITTED,
|
||||||
tuple->t_infomask |= HEAP_XMAX_COMMITTED;
|
HeapTupleHeaderGetXmax(tuple));
|
||||||
SetBufferCommitInfoNeedsSave(buffer);
|
|
||||||
}
|
|
||||||
else
|
else
|
||||||
{
|
{
|
||||||
/*
|
/*
|
||||||
* Not in Progress, Not Committed, so either Aborted or crashed
|
* Not in Progress, Not Committed, so either Aborted or crashed
|
||||||
*/
|
*/
|
||||||
tuple->t_infomask |= HEAP_XMAX_INVALID;
|
HeapTupleSetHintBits(tuple, buffer, HEAP_XMAX_INVALID,
|
||||||
SetBufferCommitInfoNeedsSave(buffer);
|
InvalidTransactionId);
|
||||||
return HEAPTUPLE_LIVE;
|
return HEAPTUPLE_LIVE;
|
||||||
}
|
}
|
||||||
/* Should only get here if we set XMAX_COMMITTED */
|
/*
|
||||||
Assert(tuple->t_infomask & HEAP_XMAX_COMMITTED);
|
* At this point the xmax is known committed, but we might not have
|
||||||
|
* been able to set the hint bit yet; so we can no longer Assert
|
||||||
|
* that it's set.
|
||||||
|
*/
|
||||||
}
|
}
|
||||||
|
|
||||||
/*
|
/*
|
||||||
|
@ -6,7 +6,7 @@
|
|||||||
* Portions Copyright (c) 1996-2007, PostgreSQL Global Development Group
|
* Portions Copyright (c) 1996-2007, PostgreSQL Global Development Group
|
||||||
* Portions Copyright (c) 1994, Regents of the University of California
|
* Portions Copyright (c) 1994, Regents of the University of California
|
||||||
*
|
*
|
||||||
* $PostgreSQL: pgsql/src/include/access/clog.h,v 1.19 2007/01/05 22:19:50 momjian Exp $
|
* $PostgreSQL: pgsql/src/include/access/clog.h,v 1.20 2007/08/01 22:45:09 tgl Exp $
|
||||||
*/
|
*/
|
||||||
#ifndef CLOG_H
|
#ifndef CLOG_H
|
||||||
#define CLOG_H
|
#define CLOG_H
|
||||||
@ -32,8 +32,8 @@ typedef int XidStatus;
|
|||||||
#define NUM_CLOG_BUFFERS 8
|
#define NUM_CLOG_BUFFERS 8
|
||||||
|
|
||||||
|
|
||||||
extern void TransactionIdSetStatus(TransactionId xid, XidStatus status);
|
extern void TransactionIdSetStatus(TransactionId xid, XidStatus status, XLogRecPtr lsn);
|
||||||
extern XidStatus TransactionIdGetStatus(TransactionId xid);
|
extern XidStatus TransactionIdGetStatus(TransactionId xid, XLogRecPtr *lsn);
|
||||||
|
|
||||||
extern Size CLOGShmemSize(void);
|
extern Size CLOGShmemSize(void);
|
||||||
extern void CLOGShmemInit(void);
|
extern void CLOGShmemInit(void);
|
||||||
|
@ -7,7 +7,7 @@
|
|||||||
* Portions Copyright (c) 1996-2007, PostgreSQL Global Development Group
|
* Portions Copyright (c) 1996-2007, PostgreSQL Global Development Group
|
||||||
* Portions Copyright (c) 1994, Regents of the University of California
|
* Portions Copyright (c) 1994, Regents of the University of California
|
||||||
*
|
*
|
||||||
* $PostgreSQL: pgsql/src/include/access/gist_private.h,v 1.26 2007/01/20 18:43:35 neilc Exp $
|
* $PostgreSQL: pgsql/src/include/access/gist_private.h,v 1.27 2007/08/01 22:45:09 tgl Exp $
|
||||||
*
|
*
|
||||||
*-------------------------------------------------------------------------
|
*-------------------------------------------------------------------------
|
||||||
*/
|
*/
|
||||||
@ -200,8 +200,6 @@ typedef struct GistSplitVector
|
|||||||
* distributed between left and right pages */
|
* distributed between left and right pages */
|
||||||
} GistSplitVector;
|
} GistSplitVector;
|
||||||
|
|
||||||
#define XLogRecPtrIsInvalid( r ) ( (r).xlogid == 0 && (r).xrecoff == 0 )
|
|
||||||
|
|
||||||
typedef struct
|
typedef struct
|
||||||
{
|
{
|
||||||
Relation r;
|
Relation r;
|
||||||
|
@ -6,13 +6,14 @@
|
|||||||
* Portions Copyright (c) 1996-2007, PostgreSQL Global Development Group
|
* Portions Copyright (c) 1996-2007, PostgreSQL Global Development Group
|
||||||
* Portions Copyright (c) 1994, Regents of the University of California
|
* Portions Copyright (c) 1994, Regents of the University of California
|
||||||
*
|
*
|
||||||
* $PostgreSQL: pgsql/src/include/access/slru.h,v 1.20 2007/01/05 22:19:51 momjian Exp $
|
* $PostgreSQL: pgsql/src/include/access/slru.h,v 1.21 2007/08/01 22:45:09 tgl Exp $
|
||||||
*
|
*
|
||||||
*-------------------------------------------------------------------------
|
*-------------------------------------------------------------------------
|
||||||
*/
|
*/
|
||||||
#ifndef SLRU_H
|
#ifndef SLRU_H
|
||||||
#define SLRU_H
|
#define SLRU_H
|
||||||
|
|
||||||
|
#include "access/xlogdefs.h"
|
||||||
#include "storage/lwlock.h"
|
#include "storage/lwlock.h"
|
||||||
|
|
||||||
|
|
||||||
@ -51,6 +52,17 @@ typedef struct SlruSharedData
|
|||||||
int *page_lru_count;
|
int *page_lru_count;
|
||||||
LWLockId *buffer_locks;
|
LWLockId *buffer_locks;
|
||||||
|
|
||||||
|
/*
|
||||||
|
* Optional array of WAL flush LSNs associated with entries in the SLRU
|
||||||
|
* pages. If not zero/NULL, we must flush WAL before writing pages (true
|
||||||
|
* for pg_clog, false for multixact and pg_subtrans). group_lsn[] has
|
||||||
|
* lsn_groups_per_page entries per buffer slot, each containing the
|
||||||
|
* highest LSN known for a contiguous group of SLRU entries on that slot's
|
||||||
|
* page.
|
||||||
|
*/
|
||||||
|
XLogRecPtr *group_lsn;
|
||||||
|
int lsn_groups_per_page;
|
||||||
|
|
||||||
/*----------
|
/*----------
|
||||||
* We mark a page "most recently used" by setting
|
* We mark a page "most recently used" by setting
|
||||||
* page_lru_count[slotno] = ++cur_lru_count;
|
* page_lru_count[slotno] = ++cur_lru_count;
|
||||||
@ -81,8 +93,8 @@ typedef struct SlruCtlData
|
|||||||
SlruShared shared;
|
SlruShared shared;
|
||||||
|
|
||||||
/*
|
/*
|
||||||
* This flag tells whether to fsync writes (true for pg_clog, false for
|
* This flag tells whether to fsync writes (true for pg_clog and multixact
|
||||||
* pg_subtrans).
|
* stuff, false for pg_subtrans).
|
||||||
*/
|
*/
|
||||||
bool do_fsync;
|
bool do_fsync;
|
||||||
|
|
||||||
@ -106,11 +118,12 @@ typedef SlruCtlData *SlruCtl;
|
|||||||
typedef struct SlruFlushData *SlruFlush;
|
typedef struct SlruFlushData *SlruFlush;
|
||||||
|
|
||||||
|
|
||||||
extern Size SimpleLruShmemSize(int nslots);
|
extern Size SimpleLruShmemSize(int nslots, int nlsns);
|
||||||
extern void SimpleLruInit(SlruCtl ctl, const char *name, int nslots,
|
extern void SimpleLruInit(SlruCtl ctl, const char *name, int nslots, int nlsns,
|
||||||
LWLockId ctllock, const char *subdir);
|
LWLockId ctllock, const char *subdir);
|
||||||
extern int SimpleLruZeroPage(SlruCtl ctl, int pageno);
|
extern int SimpleLruZeroPage(SlruCtl ctl, int pageno);
|
||||||
extern int SimpleLruReadPage(SlruCtl ctl, int pageno, TransactionId xid);
|
extern int SimpleLruReadPage(SlruCtl ctl, int pageno, bool write_ok,
|
||||||
|
TransactionId xid);
|
||||||
extern int SimpleLruReadPage_ReadOnly(SlruCtl ctl, int pageno,
|
extern int SimpleLruReadPage_ReadOnly(SlruCtl ctl, int pageno,
|
||||||
TransactionId xid);
|
TransactionId xid);
|
||||||
extern void SimpleLruWritePage(SlruCtl ctl, int slotno, SlruFlush fdata);
|
extern void SimpleLruWritePage(SlruCtl ctl, int slotno, SlruFlush fdata);
|
||||||
|
@ -7,13 +7,15 @@
|
|||||||
* Portions Copyright (c) 1996-2007, PostgreSQL Global Development Group
|
* Portions Copyright (c) 1996-2007, PostgreSQL Global Development Group
|
||||||
* Portions Copyright (c) 1994, Regents of the University of California
|
* Portions Copyright (c) 1994, Regents of the University of California
|
||||||
*
|
*
|
||||||
* $PostgreSQL: pgsql/src/include/access/transam.h,v 1.60 2007/01/05 22:19:51 momjian Exp $
|
* $PostgreSQL: pgsql/src/include/access/transam.h,v 1.61 2007/08/01 22:45:09 tgl Exp $
|
||||||
*
|
*
|
||||||
*-------------------------------------------------------------------------
|
*-------------------------------------------------------------------------
|
||||||
*/
|
*/
|
||||||
#ifndef TRANSAM_H
|
#ifndef TRANSAM_H
|
||||||
#define TRANSAM_H
|
#define TRANSAM_H
|
||||||
|
|
||||||
|
#include "access/xlogdefs.h"
|
||||||
|
|
||||||
|
|
||||||
/* ----------------
|
/* ----------------
|
||||||
* Special transaction ID values
|
* Special transaction ID values
|
||||||
@ -115,14 +117,17 @@ extern VariableCache ShmemVariableCache;
|
|||||||
extern bool TransactionIdDidCommit(TransactionId transactionId);
|
extern bool TransactionIdDidCommit(TransactionId transactionId);
|
||||||
extern bool TransactionIdDidAbort(TransactionId transactionId);
|
extern bool TransactionIdDidAbort(TransactionId transactionId);
|
||||||
extern void TransactionIdCommit(TransactionId transactionId);
|
extern void TransactionIdCommit(TransactionId transactionId);
|
||||||
|
extern void TransactionIdAsyncCommit(TransactionId transactionId, XLogRecPtr lsn);
|
||||||
extern void TransactionIdAbort(TransactionId transactionId);
|
extern void TransactionIdAbort(TransactionId transactionId);
|
||||||
extern void TransactionIdSubCommit(TransactionId transactionId);
|
extern void TransactionIdSubCommit(TransactionId transactionId);
|
||||||
extern void TransactionIdCommitTree(int nxids, TransactionId *xids);
|
extern void TransactionIdCommitTree(int nxids, TransactionId *xids);
|
||||||
|
extern void TransactionIdAsyncCommitTree(int nxids, TransactionId *xids, XLogRecPtr lsn);
|
||||||
extern void TransactionIdAbortTree(int nxids, TransactionId *xids);
|
extern void TransactionIdAbortTree(int nxids, TransactionId *xids);
|
||||||
extern bool TransactionIdPrecedes(TransactionId id1, TransactionId id2);
|
extern bool TransactionIdPrecedes(TransactionId id1, TransactionId id2);
|
||||||
extern bool TransactionIdPrecedesOrEquals(TransactionId id1, TransactionId id2);
|
extern bool TransactionIdPrecedesOrEquals(TransactionId id1, TransactionId id2);
|
||||||
extern bool TransactionIdFollows(TransactionId id1, TransactionId id2);
|
extern bool TransactionIdFollows(TransactionId id1, TransactionId id2);
|
||||||
extern bool TransactionIdFollowsOrEquals(TransactionId id1, TransactionId id2);
|
extern bool TransactionIdFollowsOrEquals(TransactionId id1, TransactionId id2);
|
||||||
|
extern XLogRecPtr TransactionIdGetCommitLSN(TransactionId xid);
|
||||||
|
|
||||||
/* in transam/varsup.c */
|
/* in transam/varsup.c */
|
||||||
extern TransactionId GetNewTransactionId(bool isSubXact);
|
extern TransactionId GetNewTransactionId(bool isSubXact);
|
||||||
|
@ -7,7 +7,7 @@
|
|||||||
* Portions Copyright (c) 1996-2007, PostgreSQL Global Development Group
|
* Portions Copyright (c) 1996-2007, PostgreSQL Global Development Group
|
||||||
* Portions Copyright (c) 1994, Regents of the University of California
|
* Portions Copyright (c) 1994, Regents of the University of California
|
||||||
*
|
*
|
||||||
* $PostgreSQL: pgsql/src/include/access/xact.h,v 1.87 2007/04/30 21:01:53 tgl Exp $
|
* $PostgreSQL: pgsql/src/include/access/xact.h,v 1.88 2007/08/01 22:45:09 tgl Exp $
|
||||||
*
|
*
|
||||||
*-------------------------------------------------------------------------
|
*-------------------------------------------------------------------------
|
||||||
*/
|
*/
|
||||||
@ -41,6 +41,9 @@ extern int XactIsoLevel;
|
|||||||
extern bool DefaultXactReadOnly;
|
extern bool DefaultXactReadOnly;
|
||||||
extern bool XactReadOnly;
|
extern bool XactReadOnly;
|
||||||
|
|
||||||
|
/* Asynchronous commits */
|
||||||
|
extern bool XactSyncCommit;
|
||||||
|
|
||||||
/*
|
/*
|
||||||
* start- and end-of-transaction callbacks for dynamically loaded modules
|
* start- and end-of-transaction callbacks for dynamically loaded modules
|
||||||
*/
|
*/
|
||||||
@ -147,6 +150,7 @@ extern void SetCurrentStatementStartTimestamp(void);
|
|||||||
extern int GetCurrentTransactionNestLevel(void);
|
extern int GetCurrentTransactionNestLevel(void);
|
||||||
extern bool TransactionIdIsCurrentTransactionId(TransactionId xid);
|
extern bool TransactionIdIsCurrentTransactionId(TransactionId xid);
|
||||||
extern void CommandCounterIncrement(void);
|
extern void CommandCounterIncrement(void);
|
||||||
|
extern void ForceSyncCommit(void);
|
||||||
extern void StartTransactionCommand(void);
|
extern void StartTransactionCommand(void);
|
||||||
extern void CommitTransactionCommand(void);
|
extern void CommitTransactionCommand(void);
|
||||||
extern void AbortCurrentTransaction(void);
|
extern void AbortCurrentTransaction(void);
|
||||||
|
@ -6,7 +6,7 @@
|
|||||||
* Portions Copyright (c) 1996-2007, PostgreSQL Global Development Group
|
* Portions Copyright (c) 1996-2007, PostgreSQL Global Development Group
|
||||||
* Portions Copyright (c) 1994, Regents of the University of California
|
* Portions Copyright (c) 1994, Regents of the University of California
|
||||||
*
|
*
|
||||||
* $PostgreSQL: pgsql/src/include/access/xlog.h,v 1.81 2007/07/24 04:54:09 tgl Exp $
|
* $PostgreSQL: pgsql/src/include/access/xlog.h,v 1.82 2007/08/01 22:45:09 tgl Exp $
|
||||||
*/
|
*/
|
||||||
#ifndef XLOG_H
|
#ifndef XLOG_H
|
||||||
#define XLOG_H
|
#define XLOG_H
|
||||||
@ -197,8 +197,11 @@ extern CheckpointStatsData CheckpointStats;
|
|||||||
extern XLogRecPtr XLogInsert(RmgrId rmid, uint8 info, XLogRecData *rdata);
|
extern XLogRecPtr XLogInsert(RmgrId rmid, uint8 info, XLogRecData *rdata);
|
||||||
extern void XLogFlush(XLogRecPtr RecPtr);
|
extern void XLogFlush(XLogRecPtr RecPtr);
|
||||||
extern void XLogBackgroundFlush(void);
|
extern void XLogBackgroundFlush(void);
|
||||||
|
extern void XLogAsyncCommitFlush(void);
|
||||||
extern bool XLogNeedsFlush(XLogRecPtr RecPtr);
|
extern bool XLogNeedsFlush(XLogRecPtr RecPtr);
|
||||||
|
|
||||||
|
extern void XLogSetAsyncCommitLSN(XLogRecPtr record);
|
||||||
|
|
||||||
extern void xlog_redo(XLogRecPtr lsn, XLogRecord *record);
|
extern void xlog_redo(XLogRecPtr lsn, XLogRecord *record);
|
||||||
extern void xlog_desc(StringInfo buf, uint8 xl_info, char *rec);
|
extern void xlog_desc(StringInfo buf, uint8 xl_info, char *rec);
|
||||||
|
|
||||||
|
@ -7,7 +7,7 @@
|
|||||||
* Portions Copyright (c) 1996-2007, PostgreSQL Global Development Group
|
* Portions Copyright (c) 1996-2007, PostgreSQL Global Development Group
|
||||||
* Portions Copyright (c) 1994, Regents of the University of California
|
* Portions Copyright (c) 1994, Regents of the University of California
|
||||||
*
|
*
|
||||||
* $PostgreSQL: pgsql/src/include/access/xlogdefs.h,v 1.17 2007/02/14 05:00:40 momjian Exp $
|
* $PostgreSQL: pgsql/src/include/access/xlogdefs.h,v 1.18 2007/08/01 22:45:09 tgl Exp $
|
||||||
*/
|
*/
|
||||||
#ifndef XLOG_DEFS_H
|
#ifndef XLOG_DEFS_H
|
||||||
#define XLOG_DEFS_H
|
#define XLOG_DEFS_H
|
||||||
@ -33,6 +33,8 @@ typedef struct XLogRecPtr
|
|||||||
uint32 xrecoff; /* byte offset of location in log file */
|
uint32 xrecoff; /* byte offset of location in log file */
|
||||||
} XLogRecPtr;
|
} XLogRecPtr;
|
||||||
|
|
||||||
|
#define XLogRecPtrIsInvalid(r) ((r).xrecoff == 0)
|
||||||
|
|
||||||
|
|
||||||
/*
|
/*
|
||||||
* Macros for comparing XLogRecPtrs
|
* Macros for comparing XLogRecPtrs
|
||||||
|