mirror of https://github.com/postgres/postgres.git, synced 2025-11-03 09:13:20 +03:00
	Add 2phase TODO.detail.
2161  doc/TODO.detail/2phase  (Normal file; diff suppressed because it is too large)
								
								
									
827  doc/TODO.detail/pitr  (Normal file)
							@@ -0,0 +1,827 @@
From pgsql-admin-owner+M15281=pgman=candle.pha.pa.us@postgresql.org Thu Oct 21 18:57:36 2004
Return-path: <pgsql-admin-owner+M15281=pgman=candle.pha.pa.us@postgresql.org>
Received: from svr1.postgresql.org (svr1.postgresql.org [200.46.204.71])
	by candle.pha.pa.us (8.11.6/8.11.6) with ESMTP id i9LLvYf17059
	for <pgman@candle.pha.pa.us>; Thu, 21 Oct 2004 17:57:34 -0400 (EDT)
Received: from localhost (unknown [200.46.204.144])
	by svr1.postgresql.org (Postfix) with ESMTP id 79D9132A71A
	for <pgman@candle.pha.pa.us>; Thu, 21 Oct 2004 22:57:29 +0100 (BST)
Received: from svr1.postgresql.org ([200.46.204.71])
	by localhost (av.hub.org [200.46.204.144]) (amavisd-new, port 10024)
	with ESMTP id 80515-02 for <pgman@candle.pha.pa.us>;
	Thu, 21 Oct 2004 21:57:26 +0000 (GMT)
Received: from postgresql.org (svr1.postgresql.org [200.46.204.71])
	by svr1.postgresql.org (Postfix) with ESMTP id 1209432A70E
	for <pgman@candle.pha.pa.us>; Thu, 21 Oct 2004 22:57:29 +0100 (BST)
X-Original-To: pgsql-admin-postgresql.org@localhost.postgresql.org
Received: from localhost (unknown [200.46.204.144])
	by svr1.postgresql.org (Postfix) with ESMTP id 4B39932A6C3
	for <pgsql-admin-postgresql.org@localhost.postgresql.org>; Thu, 21 Oct 2004 22:51:01 +0100 (BST)
Received: from svr1.postgresql.org ([200.46.204.71])
	by localhost (av.hub.org [200.46.204.144]) (amavisd-new, port 10024)
	with ESMTP id 78125-02
	for <pgsql-admin-postgresql.org@localhost.postgresql.org>;
	Thu, 21 Oct 2004 21:50:48 +0000 (GMT)
Received: from news.hub.org (news.hub.org [200.46.204.72])
	by svr1.postgresql.org (Postfix) with ESMTP id 27F0632A6C2
	for <pgsql-admin@postgresql.org>; Thu, 21 Oct 2004 22:50:49 +0100 (BST)
Received: from news.hub.org (news.hub.org [200.46.204.72])
	by news.hub.org (8.12.9/8.12.9) with ESMTP id i9LLojJ7079086
	for <pgsql-admin@postgresql.org>; Thu, 21 Oct 2004 21:50:45 GMT
	(envelope-from news@news.hub.org)
Received: (from news@localhost)
	by news.hub.org (8.12.9/8.12.9/Submit) id i9LLnd7p078783
	for pgsql-admin@postgresql.org; Thu, 21 Oct 2004 21:49:39 GMT
From: Gaetano Mendola <mendola@bigfoot.com>
X-Newsgroups: comp.databases.postgresql.admin
Subject: Re: [ADMIN] replication using WAL archives
Date: Thu, 21 Oct 2004 23:49:35 +0200
Organization: PYRENET Midi-pyrenees Provider
Lines: 216
Message-ID: <41782EEF.5040708@bigfoot.com>
References: <002801c4b739$68450870$7201a8c0@mst1x5r347kymb> <1098384082.15573.14.camel@camel>
MIME-Version: 1.0
Content-Type: multipart/mixed;
	boundary="------------060900090803090101060101"
X-Complaints-To: abuse@pyrenet.fr
cc: iain@mst.co.jp
To: Robert Treat <xzilla@users.sourceforge.net>
User-Agent: Mozilla Thunderbird 0.8 (Windows/20040913)
X-Accept-Language: en-us, en
In-Reply-To: <1098384082.15573.14.camel@camel>
X-Enigmail-Version: 0.86.1.0
X-Enigmail-Supports: pgp-inline, pgp-mime
To: pgsql-admin@postgresql.org
X-Virus-Scanned: by amavisd-new at hub.org
X-Mailing-List: pgsql-admin
Precedence: bulk
Sender: pgsql-admin-owner@postgresql.org
X-Virus-Scanned: by amavisd-new at hub.org
X-Spam-Checker-Version: SpamAssassin 2.61 (1.212.2.1-2003-12-09-exp) on 
	candle.pha.pa.us
X-Spam-Status: No, hits=-4.9 required=5.0 tests=BAYES_00 autolearn=ham 
	version=2.61
Status: OR

This is a multi-part message in MIME format.
--------------060900090803090101060101
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit

Robert Treat wrote:
> On Thu, 2004-10-21 at 02:44, Iain wrote:
> 
>>Hi,
>> 
>>I thought I read something about this in relation to v8, but I can't
>>find any reference to it now... is it (or will it be) possible to do
>>master-slave style database replication by transmitting log files to the
>>standby server and having it process them?
>> 
> 
> 
> I'm not certain if this is 8.0, but some folks have created a working
> version against the 8.0 code that will do something like this. Search
> the pgsql-hacker mail list archives for more information on it. 

I sent a post to hackers; here it is:

=======================================================================
Hi all,
I saw that Eric Kerin did the work suggested by Tom about
how to use PITR to get a hot spare postgres, by
writing a C program.

I did the same by writing two shell scripts: one performs
the restore, and the other delivers the partially filled WAL and
checks whether the postmaster is alive (it checks whether the
process with that PID still exists).

With these two scripts I'm able to have a hot spare installation,
and the spare goes live when the first postmaster dies.

How to test it:

1) Master node:
	modify postgresql.conf using:

	archive_command = 'cp %p /mnt/server/archivedir/%f'

	launch postgres and perform a backup as the doc

	http://developer.postgresql.org/docs/postgres/backup-online.html

	suggests

	launch the script:

	partial_wal_deliver.sh <PID> /mnt/server/partialdir <pg_xlog path>

	this script will deliver the "current" WAL file every 10 seconds
	and touch the "alive" file to notify the spare node that
	the master node is up and running


2) Spare node:
	create a recovery.conf with the line:

	restore_command = 'restore.sh /mnt/server/archivedir/%f %p /mnt/server/partialdir'

	replace the contents of the data directory with the backup performed in step 1,
	remove any files present in the pg_xlog directory (leaving the archive_status
	directory in place) and remove the postmaster.pid file (this is necessary if you
	are running the spare postgres on the same hardware).

	launch the postmaster; the restore will continue until the "alive" file in the
	/mnt/server/partialdir directory has not been updated for 60 seconds (you can
	modify this value inside the restore.sh script).
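
The "alive" check described in step 2 could be sketched roughly as follows. This is
only an illustration under stated assumptions: the function names heartbeat_age and
master_is_dead are made up here and are not part of restore.sh, the sketch assumes
GNU stat and date, and treating a missing heartbeat file as "dead" is a choice made
for the example.

```shell
# Sketch only: helper names are hypothetical, not taken from the attached scripts.

# Print the age of a file in seconds, based on its modification time (GNU stat).
heartbeat_age() {
    local file="$1"
    local now mtime
    now=$(date +%s)
    mtime=$(stat -c '%Y' "$file" 2>/dev/null) || return 2
    echo $(( now - mtime ))
}

# Succeed (exit status 0) when the heartbeat file is older than the limit.
# A missing heartbeat file is treated as "dead" (an assumption of this sketch).
master_is_dead() {
    local file="$1" limit="$2" age
    age=$(heartbeat_age "$file") || return 0
    [ "$age" -gt "$limit" ]
}
```

With the paths used above, the spare node would call something like
master_is_dead /mnt/server/partialdir/alive 60 inside its polling loop.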

Be sure that restore.sh and all the directories involved are accessible.

Let me know.


This is a first step; of course, as Eric Kerin did, it would be better to port these scripts
to C and make them more robust.

Postgres could help this process, as suggested by Tom, by providing a pg_current_wal()
function or, even better, two new GUC parameters: archive_current_wal_command and
archive_current_wal_delay.

A problem I discovered during the tests is that if you shut down the spare node
while the restore_command is still waiting for a file, the postmaster will never
exit  :-(
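
One way a restore script could avoid waiting forever is to put an upper bound on its
own wait. The sketch below is only an illustration: wait_for_file is a hypothetical
helper, not part of restore.sh, and it does not change how the postmaster itself
behaves at shutdown. Keep in mind that a non-zero exit from restore_command tells
the server the file is not available, so any limit would have to be chosen with
that in mind.

```shell
# Sketch only: wait_for_file is a hypothetical helper.
# Wait until a file exists, polling once per interval; give up after max_wait seconds.
# Returns 0 if the file appeared, 1 on timeout.
wait_for_file() {
    local path="$1" max_wait="$2" interval="${3:-1}"
    local waited=0
    while [ ! -f "$path" ]; do
        if [ "$waited" -ge "$max_wait" ]; then
            return 1
        fi
        sleep "$interval"
        waited=$(( waited + interval ))
    done
    return 0
}
```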
==========================================================================

I hope that is clear.


Regards
Gaetano Mendola


--------------060900090803090101060101
Content-Type: text/plain;
 name="restore.sh"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
 filename="restore.sh"

#!/bin/bash
# Usage: restore.sh <archived WAL path> <target path> <partial WAL dir>
# Meant to be called from restore_command; waits for the requested file.

SOURCE=$1
TARGET=$2
PARTIAL=$3

SIZE_EXPECTED=16777216		# bytes (16 MB)
DIED_TIME=60                    # seconds

function test_existence
{
    if [ -f "${SOURCE}" ]
    then
       COUNTER=0

       # The file may still be being copied;
       # assume it will reach the right
       # size within a few seconds.

       while [ "$(stat -c '%s' "${SOURCE}")" -lt "$SIZE_EXPECTED" ]
       do
          sleep 1
          let COUNTER+=1
          if [ 20 -lt $COUNTER ]
          then
             exit 1    # BAILING OUT
          fi
       done

       # Report failure if the copy itself fails
       cp "$SOURCE" "$TARGET" || exit 1
       exit 0
    fi
    echo "${SOURCE}> not found"

    # If a history file is requested and it does not exist,
    # exit immediately.
    echo "$SOURCE" | grep history > /dev/null 2>&1 && exit 1
}


while true
do 

   test_existence

   # CHECK IF THE MASTER IS ALIVE
   DELTA_TIME=$(( $( date +'%s' ) - $( stat -c '%Z' "${PARTIAL}/alive" ) ))
   if [ $DIED_TIME -lt $DELTA_TIME ]
   then
       echo "Master is dead..."
       # Master is dead: hand over the partial copy of the requested WAL file, if any
       CURRENT_WAL=$( basename "$SOURCE" )
       echo "Partial: " "${PARTIAL}"
       echo "Current wal: " "${CURRENT_WAL}"
       echo "Target: " "${TARGET}"
       cp "${PARTIAL}/${CURRENT_WAL}.partial" "${TARGET}"  > /dev/null 2>&1 && exit 0
       exit 1
   fi

   sleep 1

done

--------------060900090803090101060101
Content-Type: text/plain;
 name="partial_wal_deliver.sh"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
 filename="partial_wal_deliver.sh"

#!/bin/bash
# Usage: partial_wal_deliver.sh <postmaster PID> <partial WAL dir> <pg_xlog path>

PID=$1
PARTIAL=$2
PGXLOG=$3

# Copy the most recently modified WAL file to the partial directory.
function copy_last_wal
{
   FILE=$( ls -t1p "$PGXLOG" | grep -v / | head -1 )

   echo "Last Wal> " $FILE

   # Copy under a temporary name, then rename, so a reader never sees a half-copied file
   cp "${PGXLOG}/${FILE}" "${PARTIAL}/${FILE}.tmp"
   mv "${PARTIAL}/${FILE}.tmp" "${PARTIAL}/${FILE}.partial"
   # Remove older partial files
   find "${PARTIAL}" -name '*.partial' | grep -v "${FILE}" | xargs -I{} rm -fr {}
}


while true
do 
   ps --pid "$PID" > /dev/null 2>&1
   ALIVE=$?
   
   if [ "${ALIVE}" == "1" ]
   then
       # The process is dead
       echo "Process dead"
       copy_last_wal 
       exit 1
   fi

   # The process still exists
   touch "${PARTIAL}/alive"
   copy_last_wal 

   sleep 10
done

--------------060900090803090101060101
Content-Type: text/plain
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
MIME-Version: 1.0


---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

               http://www.postgresql.org/docs/faqs/FAQ.html

--------------060900090803090101060101--

From pgsql-admin-owner+M15295=pgman=candle.pha.pa.us@postgresql.org Fri Oct 22 06:32:38 2004
Return-path: <pgsql-admin-owner+M15295=pgman=candle.pha.pa.us@postgresql.org>
Received: from svr1.postgresql.org (svr1.postgresql.org [200.46.204.71])
	by candle.pha.pa.us (8.11.6/8.11.6) with ESMTP id i9M9Waf18397
	for <pgman@candle.pha.pa.us>; Fri, 22 Oct 2004 05:32:36 -0400 (EDT)
Received: from localhost (unknown [200.46.204.144])
	by svr1.postgresql.org (Postfix) with ESMTP id 9C9A532AC61
	for <pgman@candle.pha.pa.us>; Fri, 22 Oct 2004 10:32:32 +0100 (BST)
Received: from svr1.postgresql.org ([200.46.204.71])
	by localhost (av.hub.org [200.46.204.144]) (amavisd-new, port 10024)
	with ESMTP id 53654-01 for <pgman@candle.pha.pa.us>;
	Fri, 22 Oct 2004 09:32:26 +0000 (GMT)
Received: from postgresql.org (svr1.postgresql.org [200.46.204.71])
	by svr1.postgresql.org (Postfix) with ESMTP id 3132D32AC53
	for <pgman@candle.pha.pa.us>; Fri, 22 Oct 2004 10:32:32 +0100 (BST)
X-Original-To: pgsql-admin-postgresql.org@localhost.postgresql.org
Received: from localhost (unknown [200.46.204.144])
	by svr1.postgresql.org (Postfix) with ESMTP id DC46E32A095
	for <pgsql-admin-postgresql.org@localhost.postgresql.org>; Fri, 22 Oct 2004 10:23:07 +0100 (BST)
Received: from svr1.postgresql.org ([200.46.204.71])
	by localhost (av.hub.org [200.46.204.144]) (amavisd-new, port 10024)
	with ESMTP id 49812-03
	for <pgsql-admin-postgresql.org@localhost.postgresql.org>;
	Fri, 22 Oct 2004 09:22:52 +0000 (GMT)
Received: from cmailg3.svr.pol.co.uk (cmailg3.svr.pol.co.uk [195.92.195.173])
	by svr1.postgresql.org (Postfix) with ESMTP id 5AEA6329F2F
	for <pgsql-admin@postgresql.org>; Fri, 22 Oct 2004 10:22:57 +0100 (BST)
Received: from modem-21.monkey.dialup.pol.co.uk ([217.135.208.21] helo=Nightingale)
	by cmailg3.svr.pol.co.uk with smtp (Exim 4.41)
	id 1CKvdM-0005eh-NO; Fri, 22 Oct 2004 10:22:53 +0100
Message-ID: <011a01c4b818$b7370a20$06e887d9@Nightingale>
From: "Simon Riggs" <simon@2ndquadrant.com>
To: "Gaetano Mendola" <mendola@bigfoot.com>,
   "Robert Treat" <xzilla@users.sourceforge.net>, <pgsql-admin@postgresql.org>
cc: <iain@mst.co.jp>
References: <002801c4b739$68450870$7201a8c0@mst1x5r347kymb> <1098384082.15573.14.camel@camel> <41782EEF.5040708@bigfoot.com>
Subject: Re: [ADMIN] replication using WAL archives
Date: Fri, 22 Oct 2004 10:22:54 +0100
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 6.00.2800.1409
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1409
X-Virus-Scanned: by amavisd-new at hub.org
X-Mailing-List: pgsql-admin
Precedence: bulk
Sender: pgsql-admin-owner@postgresql.org
X-Virus-Scanned: by amavisd-new at hub.org
X-Spam-Checker-Version: SpamAssassin 2.61 (1.212.2.1-2003-12-09-exp) on 
	candle.pha.pa.us
X-Spam-Status: No, hits=-4.9 required=5.0 tests=BAYES_00 autolearn=ham 
	version=2.61
Status: OR

> Gaetano Mendola wrote
> Postgres can help this process, as suggested by Tom creating a pg_current_wal()
> or even better having two new GUC parameters: archive_current_wal_command and
> archive_current_wal_delay.

OK, we can modify the archiver to do this as well as the archive-when-full
functionality. I'd already agreed to do something similar for 8.1.

PROPOSAL:
By default, archive_max_delay would be 10 seconds.
By default, archive_current_wal_command is not set.
If archive_current_wal_command is not set, the archiver will archive a file
using archive_command only when the file is full.
If archive_current_wal_command is set, the archiver would archive a file
when whichever of these occurs first...
- it is full
- the archive_max_delay timeout occurs (default: disabled)
...as you can see I've renamed archive_current_wal_delay to reflect the fact
that there is an interaction between the current mechanism (only when full)
and this additional mechanism (no longer than X secs between log files).
With that design, if the logs are being created quickly enough, then a
partial log file is never created, only full ones.
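
The rule in the proposal above could be sketched as a small decision function. This
is purely illustrative: archive_decision and its arguments are hypothetical names
made up for this sketch, and nothing here is archiver code. It only restates "full
wins, otherwise a timeout produces a partial file, but only when
archive_current_wal_command is set".

```shell
# Sketch only: archive_decision is a hypothetical helper.
# Arguments (all illustrative):
#   $1  "yes" if the current WAL file is full, anything else otherwise
#   $2  seconds since the last file was shipped
#   $3  the archive_max_delay timeout in seconds
#   $4  the archive_current_wal_command setting (empty string = not set)
# Prints "full", "partial" or "wait".
archive_decision() {
    local is_full="$1" elapsed="$2" max_delay="$3" current_cmd="$4"
    if [ "$is_full" = "yes" ]; then
        echo full
    elif [ -n "$current_cmd" ] && [ "$elapsed" -ge "$max_delay" ]; then
        echo partial
    else
        echo wait
    fi
}
```

For example, archive_decision no 12 10 'cp %p /mnt/server/partialdir/%f' reports a
partial shipment, while the same call with an empty fourth argument reports wait.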

When an xlog file is archived because it is full, then it is sent to both
archive_current_wal_command and archive_command (in that order). When the
timeout occurs and we have a partial xlog file, it would only be sent to
archive_current_wal_command. It may also be desirable to not use
archive_command at all, only to use archive_current_wal_command. That's not
currently possible because archive_command is the switch by which all of the
archive functionality is enabled, so you can't actually turn this off.

There is already a timeout feature designed into the archiver for safety...so I
can make that read the GUCs above and act accordingly.

There is an unresolved resilience issue: if the archiver goes down (or
whatever does the partial_wal copy functionality) then it is possible
that users will continue writing to the database and creating xlog records.
It would be up to the user to define how to handle records that had been
committed to the first database in the interim before cutover. It would also
be up to the user to shut down the first node from the second - Shoot the
Other Node in the Head, as it's known. All of that is up to the second node,
and as Tom says, is "the hard part"....I'm not proposing to do anything
about that at this stage, since it is implementation dependent.

I was thinking perhaps to move to having variable size xlog files, since
their contents are now variable - no padded records at EOF. If we did that,
then the archiver could simply issue a "switch logfile" and then the
archiver would cut in anyway to copy away the xlog. Having said that, it is
lots easier just to put a blind timeout in the archiver and copy the file -
though I'm fairly uneasy about the point that we'd be ignoring the fact that
many people are still writing to it. But I propose doing it the easy way....

Thoughts?

= - = - =

Gaetano - skim-reading your script, how do you handle the situation when a
new xlog file has been written within 10 seconds? That way the current file
number will have jumped by 2, so when your script looks for the "Last wal"
using head -1 it will find file N+2 and the intermediate file will never be
copied. Looks like a problem to me...

> I problem I discover during the tests is that if you shut down the spare node
> and the restore_command is still waiting for a file then the postmaster will never
> exit  :-(

Hmm....Are you reporting this as a bug for 8.0? It's not on the bug list...

Do we consider that to be desirable or not?

Best Regards, Simon Riggs


---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
    (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)

From pgsql-admin-owner+M15302=pgman=candle.pha.pa.us@postgresql.org Fri Oct 22 13:56:14 2004
Return-path: <pgsql-admin-owner+M15302=pgman=candle.pha.pa.us@postgresql.org>
Received: from svr1.postgresql.org (svr1.postgresql.org [200.46.204.71])
	by candle.pha.pa.us (8.11.6/8.11.6) with ESMTP id i9MGuCf28637
	for <pgman@candle.pha.pa.us>; Fri, 22 Oct 2004 12:56:13 -0400 (EDT)
Received: from localhost (unknown [200.46.204.144])
	by svr1.postgresql.org (Postfix) with ESMTP id 54E77EAEDAA
	for <pgman@candle.pha.pa.us>; Fri, 22 Oct 2004 17:55:51 +0100 (BST)
Received: from svr1.postgresql.org ([200.46.204.71])
	by localhost (av.hub.org [200.46.204.144]) (amavisd-new, port 10024)
	with ESMTP id 86116-09 for <pgman@candle.pha.pa.us>;
	Fri, 22 Oct 2004 16:55:57 +0000 (GMT)
Received: from postgresql.org (svr1.postgresql.org [200.46.204.71])
	by svr1.postgresql.org (Postfix) with ESMTP id 0EC98EAEDA7
	for <pgman@candle.pha.pa.us>; Fri, 22 Oct 2004 17:55:51 +0100 (BST)
X-Original-To: pgsql-admin-postgresql.org@localhost.postgresql.org
Received: from localhost (unknown [200.46.204.144])
	by svr1.postgresql.org (Postfix) with ESMTP id 5DB98EAEDBE
	for <pgsql-admin-postgresql.org@localhost.postgresql.org>; Fri, 22 Oct 2004 17:45:13 +0100 (BST)
Received: from svr1.postgresql.org ([200.46.204.71])
	by localhost (av.hub.org [200.46.204.144]) (amavisd-new, port 10024)
	with ESMTP id 82473-08
	for <pgsql-admin-postgresql.org@localhost.postgresql.org>;
	Fri, 22 Oct 2004 16:45:11 +0000 (GMT)
Received: from pns.mm.eutelsat.org (pns.mm.eutelsat.org [194.214.173.227])
	by svr1.postgresql.org (Postfix) with ESMTP id E49F0EAEDB5
	for <pgsql-admin@postgresql.org>; Fri, 22 Oct 2004 17:45:00 +0100 (BST)
Received: from nts-03.mm.eutelsat.org (localhost [127.0.0.1])
	by pns.mm.eutelsat.org (8.11.6/linuxconf) with ESMTP id i9MGh0U26124;
	Fri, 22 Oct 2004 18:43:01 +0200
Received: from [127.0.0.1] (accesspoint.mm.eutelsat.org [194.214.173.4])
	by nts-03.mm.eutelsat.org (8.11.6/linuxconf) with ESMTP id i9MGj5f09681;
	Fri, 22 Oct 2004 18:45:05 +0200
Message-ID: <4179390B.10700@bigfoot.com>
Date: Fri, 22 Oct 2004 18:44:59 +0200
From: Gaetano Mendola <mendola@bigfoot.com>
User-Agent: Mozilla Thunderbird 0.8 (Windows/20040913)
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: Simon Riggs <simon@2ndquadrant.com>
cc: Robert Treat <xzilla@users.sourceforge.net>, pgsql-admin@postgresql.org,
   iain@mst.co.jp
Subject: Re: [ADMIN] replication using WAL archives
References: <002801c4b739$68450870$7201a8c0@mst1x5r347kymb> <1098384082.15573.14.camel@camel> <41782EEF.5040708@bigfoot.com> <011a01c4b818$b7370a20$06e887d9@Nightingale>
In-Reply-To: <011a01c4b818$b7370a20$06e887d9@Nightingale>
X-Enigmail-Version: 0.86.1.0
X-Enigmail-Supports: pgp-inline, pgp-mime
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-Virus-Scanned: by amavisd-new at hub.org
X-Mailing-List: pgsql-admin
Precedence: bulk
Sender: pgsql-admin-owner@postgresql.org
X-Virus-Scanned: by amavisd-new at hub.org
X-Spam-Checker-Version: SpamAssassin 2.61 (1.212.2.1-2003-12-09-exp) on 
	candle.pha.pa.us
X-Spam-Status: No, hits=-4.9 required=5.0 tests=BAYES_00 autolearn=ham 
	version=2.61
Status: OR

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Simon Riggs wrote:
|>Gaetano Mendola wrote
|>Postgres can help this process, as suggested by Tom creating a pg_current_wal()
|>or even better having two new GUC parameters: archive_current_wal_command and
|>archive_current_wal_delay.
|
| OK, we can modify the archiver to do this as well as the archive-when-full
| functionality. I'd already agreed to do something similar for 8.1
|
| PROPOSAL:
| By default, archive_max_delay would be 10 seconds.
| By default, archive_current_wal_command is not set.
| If archive_current_wal_command is not set, the archiver will archive a file
| using archive_command only when the file is full.
| If archive_current_wal_command is set, the archiver would archive a file
| whichever of these occurs first...
| - it is full
| - the archive_max_delay timeout occurs (default: disabled)
| ...as you can see I've renamed archive_current_wal_delay to reflect the fact
| that there is an interaction between the current mechanism (only when full)
| and this additional mechanism (no longer than X secs between log files).
| With that design, if the logs are being created quickly enough, then a
| partial log file is never created, only full ones.
|
| When an xlog file is archived because it is full, then it is sent to both
| archive_current_wal_command and archive_command (in that order). When the
| timeout occurs and we have a partial xlog file, it would only be sent to
| archive_current_wal_command. It may also be desirable to not use
| archive_command at all, only to use archive_current_wal_command. That's not
| currently possible because archive_command is the switch by which all of the
| archive functionality is enabled, so you can't actually turn this off.

It seems good to me; the script behind archive_command can be a no-op if someone
wants to use only archive_current_wal_command.


| = - = - =
|
| Gaetano - skim-reading your script, how do you handle the situation when a
| new xlog file has been written within 10 seconds? That way the current file
| number will have jumped by 2, so when your script looks for the "Last wal"
| using head -1 it will find the N+2 and the intermediate file will never be
| copied. Looks like a problem to me...


Yes, the only failure window I have seen (though I don't know whether it is possible) is:

Master:
	log N created
	log N filled
	archive log N
	log N+1 created
	log N+1 filled
	log N+2 created
	           <---- the master dies here, before archiving log N+1
	archive log N+1


In this case, as you point out, the last log archived is N, the partial N+2
WAL is added to the archived WAL collection, and in the recovery phase
the recovery stops after processing log N.

Is it possible for the postmaster to create the N+2 file before it has finished
archiving N+1? (I suspect yes :-(  )

The only cure I see here is to look for WAL files that have not been archived (if possible).
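
One possible way to look for WAL files that have not been archived yet is to check
the archive_status directory under pg_xlog, where the server keeps a
<segment>.ready marker for each segment still waiting for archive_command. A rough
sketch, assuming that layout; the function name pending_wal_segments is
hypothetical and this is not part of partial_wal_deliver.sh:

```shell
# Sketch only: pending_wal_segments is a hypothetical helper.
# List WAL segments that have a ".ready" marker, i.e. are still waiting to be archived.
# $1 is the pg_xlog directory. Prints one segment name per line, sorted.
pending_wal_segments() {
    local xlog_dir="$1" marker
    for marker in "$xlog_dir"/archive_status/*.ready; do
        [ -e "$marker" ] || continue     # the glob did not match anything
        basename "$marker" .ready
    done | sort
}
```

A delivery script could copy every segment this prints, in addition to the newest
file, so that a filled but not yet archived N+1 is not skipped.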
 | 
			
		||||
 | 
			
		||||
|>I problem I discover during the tests is that if you shut down the spare
 | 
			
		||||
|>node and the restore_command is still waiting for a file then the postmaster
 | 
			
		||||
|>will never exit  :-(
 | 
			
		||||
|
 | 
			
		||||
|
 | 
			
		||||
| Hmm....Are you reporting this as a bug for 8.0? It's not on the bug list...
 | 
			
		||||
 | 
			
		||||
For me is a behave to avoid.
 | 
			
		||||
 | 
			
		||||
 | 
			
		||||
 | 
			
		||||
Regards
 | 
			
		||||
Gaetano Mendola
 | 
			
		||||
 | 
			
		||||
 | 
			
		||||
 | 
			
		||||
 | 
			
		||||
 | 
			
		||||
 | 
			
		||||
 | 
			
		||||
-----BEGIN PGP SIGNATURE-----
 | 
			
		||||
Version: GnuPG v1.2.5 (MingW32)
 | 
			
		||||
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org
 | 
			
		||||
 | 
			
		||||
iD8DBQFBeTkJ7UpzwH2SGd4RAsMxAKCbV7W+wrGBocf2Ftlthm0egAlIWACgp87L
 | 
			
		||||
KU/YusyHuvT7jSFwZVKpP3M=
 | 
			
		||||
=rWZx
 | 
			
		||||
-----END PGP SIGNATURE-----
 | 
			
		||||
 | 
			
		||||
 | 
			
		||||
---------------------------(end of broadcast)---------------------------
 | 
			
		||||
TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
 | 
			
		||||

From pgsql-admin-owner+M15303=pgman=candle.pha.pa.us@postgresql.org Fri Oct 22 14:43:36 2004
Return-path: <pgsql-admin-owner+M15303=pgman=candle.pha.pa.us@postgresql.org>
Received: from svr1.postgresql.org (svr1.postgresql.org [200.46.204.71])
	by candle.pha.pa.us (8.11.6/8.11.6) with ESMTP id i9MHhZf06453
	for <pgman@candle.pha.pa.us>; Fri, 22 Oct 2004 13:43:35 -0400 (EDT)
Received: from localhost (unknown [200.46.204.144])
	by svr1.postgresql.org (Postfix) with ESMTP id 01DD2EADBB7
	for <pgman@candle.pha.pa.us>; Fri, 22 Oct 2004 18:43:13 +0100 (BST)
Received: from svr1.postgresql.org ([200.46.204.71])
	by localhost (av.hub.org [200.46.204.144]) (amavisd-new, port 10024)
	with ESMTP id 01872-03 for <pgman@candle.pha.pa.us>;
	Fri, 22 Oct 2004 17:43:19 +0000 (GMT)
Received: from postgresql.org (svr1.postgresql.org [200.46.204.71])
	by svr1.postgresql.org (Postfix) with ESMTP id 9E633EADAD4
	for <pgman@candle.pha.pa.us>; Fri, 22 Oct 2004 18:43:12 +0100 (BST)
X-Original-To: pgsql-admin-postgresql.org@localhost.postgresql.org
Received: from localhost (unknown [200.46.204.144])
	by svr1.postgresql.org (Postfix) with ESMTP id C1133EAED89
	for <pgsql-admin-postgresql.org@localhost.postgresql.org>; Fri, 22 Oct 2004 18:31:20 +0100 (BST)
Received: from svr1.postgresql.org ([200.46.204.71])
	by localhost (av.hub.org [200.46.204.144]) (amavisd-new, port 10024)
	with ESMTP id 97130-03
	for <pgsql-admin-postgresql.org@localhost.postgresql.org>;
	Fri, 22 Oct 2004 17:31:17 +0000 (GMT)
Received: from cmailm2.svr.pol.co.uk (cmailm2.svr.pol.co.uk [195.92.193.210])
	by svr1.postgresql.org (Postfix) with ESMTP id 276CAEADBBD
	for <pgsql-admin@postgresql.org>; Fri, 22 Oct 2004 18:31:07 +0100 (BST)
Received: from modem-558.snake.dialup.pol.co.uk ([62.137.114.46] helo=[192.168.0.102])
	by cmailm2.svr.pol.co.uk with esmtp (Exim 4.41)
	id 1CL3G3-0001Tx-K5; Fri, 22 Oct 2004 18:31:20 +0100
Subject: Re: [ADMIN] replication using WAL archives
From: Simon Riggs <simon@2ndquadrant.com>
To: Gaetano Mendola <mendola@bigfoot.com>
cc: Robert Treat <xzilla@users.sourceforge.net>, pgsql-admin@postgresql.org,
   iain@mst.co.jp
In-Reply-To: <4179390B.10700@bigfoot.com>
References: <002801c4b739$68450870$7201a8c0@mst1x5r347kymb>
  <1098384082.15573.14.camel@camel> <41782EEF.5040708@bigfoot.com>
  <011a01c4b818$b7370a20$06e887d9@Nightingale>  <4179390B.10700@bigfoot.com>
Content-Type: text/plain
Organization: 2nd Quadrant
Message-ID: <1098466150.20926.13.camel@localhost.localdomain>
MIME-Version: 1.0
X-Mailer: Ximian Evolution 1.4.6 (1.4.6-2)
Date: Fri, 22 Oct 2004 18:29:10 +0100
Content-Transfer-Encoding: 7bit
X-Virus-Scanned: by amavisd-new at hub.org
X-Mailing-List: pgsql-admin
Precedence: bulk
Sender: pgsql-admin-owner@postgresql.org
X-Virus-Scanned: by amavisd-new at hub.org
X-Spam-Checker-Version: SpamAssassin 2.61 (1.212.2.1-2003-12-09-exp) on
	candle.pha.pa.us
X-Spam-Status: No, hits=-4.9 required=5.0 tests=BAYES_00 autolearn=ham
	version=2.61
Status: OR

On Fri, 2004-10-22 at 17:44, Gaetano Mendola wrote:
> | Gaetano - skim-reading your script, how do you handle the situation when a
> | new xlog file has been written within 10 seconds? That way the current file
> | number will have jumped by 2, so when your script looks for the "Last wal"
> | using head -1 it will find the N+2 and the intermediate file will never be
> | copied. Looks like a problem to me...
>
>
> Yes, that is the only failure window I have seen (but I don't know if it's possible):
>
> Master:
>         log N created
>         log N filled
>         archive log N
>         log N+1 created
>         log N+1 filled
>         log N+2 created
>                    <---- the master dies here, before archiving log N+1
>         archive log N+1
>
>
> In this case, as you point out, the last log archived is N, the partial N+2
> WAL is added to the archived WAL collection, and in the recovery phase
> the recovery stops after processing log N.
>
> Is it possible for the postmaster to create the N+2 file before it has finished
> archiving N+1? (I suspect yes :-( )
>
> The only cure I see here is to look for WAL files not yet archived (if possible).
>

Hmm...well you aren't looking for archived wal, you're just looking at
wal...which is a different thing...

Situation I thought I saw was:

- copy away current partially filled xlog N
- xlog N fills, N+1 starts
- xlog N+1 fills, N+2 starts
- copy away current partially filled xlog: N+2 (+10 secs later)

i.e. if time to fill xlog (is ever) < time to copy away current xlog,
then you miss one.

So problem: you can miss one and never know you've missed one until the
recovery can't find it, which it never returns from...so it just hangs.

[Just so we're all clear: we're talking about Gaetano's script, not the
PostgreSQL archiver. The PostgreSQL archiver doesn't do it that way, so
it never misses one.]
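
The sequence above can be reproduced with a small simulation. This is only a
sketch with plain integers standing in for segment numbers; it does not
reproduce the script being discussed, and both function names are invented
for the example.

```python
def newest_only_poller(newest_at_each_poll):
    """Copy only the newest segment visible at each poll, as `head -1` would."""
    copied = []
    for newest in newest_at_each_poll:
        if newest not in copied:
            copied.append(newest)
    return copied


def missing_segments(copied):
    """Return segment numbers absent between the lowest and highest copied."""
    if not copied:
        return []
    present = set(copied)
    return [n for n in range(min(copied), max(copied) + 1) if n not in present]


# Segments N=5, N+1=6, N+2=7: two segment switches happen between the second
# and third poll, so the poller never sees segment 6 as "current".
polls = [5, 5, 7]
copied = newest_only_poller(polls)
print(copied)                    # [5, 7]
print(missing_segments(copied))  # [6]
```

Running a gap check like `missing_segments` after every copy would at least
report the missed file instead of leaving it to be found during recovery.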

-- 
Best Regards, Simon Riggs


---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
      joining column's datatypes do not match

From pgsql-admin-owner+M15306=pgman=candle.pha.pa.us@postgresql.org Fri Oct 22 17:56:07 2004
Return-path: <pgsql-admin-owner+M15306=pgman=candle.pha.pa.us@postgresql.org>
Received: from svr1.postgresql.org (svr1.postgresql.org [200.46.204.71])
	by candle.pha.pa.us (8.11.6/8.11.6) with ESMTP id i9MKu6f05264
	for <pgman@candle.pha.pa.us>; Fri, 22 Oct 2004 16:56:06 -0400 (EDT)
Received: from localhost (unknown [200.46.204.144])
	by svr1.postgresql.org (Postfix) with ESMTP id 4F4C2EAE4AE
	for <pgman@candle.pha.pa.us>; Fri, 22 Oct 2004 21:55:41 +0100 (BST)
Received: from svr1.postgresql.org ([200.46.204.71])
	by localhost (av.hub.org [200.46.204.144]) (amavisd-new, port 10024)
	with ESMTP id 62857-05 for <pgman@candle.pha.pa.us>;
	Fri, 22 Oct 2004 20:55:48 +0000 (GMT)
Received: from postgresql.org (svr1.postgresql.org [200.46.204.71])
	by svr1.postgresql.org (Postfix) with ESMTP id 095CEEAE4AC
	for <pgman@candle.pha.pa.us>; Fri, 22 Oct 2004 21:55:41 +0100 (BST)
X-Original-To: pgsql-admin-postgresql.org@localhost.postgresql.org
Received: from localhost (unknown [200.46.204.144])
	by svr1.postgresql.org (Postfix) with ESMTP id 3FC9BEAE486
	for <pgsql-admin-postgresql.org@localhost.postgresql.org>; Fri, 22 Oct 2004 21:50:48 +0100 (BST)
Received: from svr1.postgresql.org ([200.46.204.71])
	by localhost (av.hub.org [200.46.204.144]) (amavisd-new, port 10024)
	with ESMTP id 62565-02
	for <pgsql-admin-postgresql.org@localhost.postgresql.org>;
	Fri, 22 Oct 2004 20:50:48 +0000 (GMT)
Received: from news.hub.org (news.hub.org [200.46.204.72])
	by svr1.postgresql.org (Postfix) with ESMTP id 06C49EAE46B
	for <pgsql-admin@postgresql.org>; Fri, 22 Oct 2004 21:50:40 +0100 (BST)
Received: from news.hub.org (news.hub.org [200.46.204.72])
	by news.hub.org (8.12.9/8.12.9) with ESMTP id i9MKolJB062812
	for <pgsql-admin@postgresql.org>; Fri, 22 Oct 2004 20:50:48 GMT
	(envelope-from news@news.hub.org)
Received: (from news@localhost)
	by news.hub.org (8.12.9/8.12.9/Submit) id i9MKoRHh062731
	for pgsql-admin@postgresql.org; Fri, 22 Oct 2004 20:50:27 GMT
From: Gaetano Mendola <mendola@bigfoot.com>
X-Newsgroups: comp.databases.postgresql.admin
Subject: Re: [ADMIN] replication using WAL archives
Date: Fri, 22 Oct 2004 22:50:34 +0200
Organization: PYRENET Midi-pyrenees Provider
Lines: 39
Message-ID: <4179729A.5020401@bigfoot.com>
References: <002801c4b739$68450870$7201a8c0@mst1x5r347kymb>	 <1098384082.15573.14.camel@camel> <41782EEF.5040708@bigfoot.com>	 <011a01c4b818$b7370a20$06e887d9@Nightingale>  <4179390B.10700@bigfoot.com> <1098466150.20926.13.camel@localhost.localdomain>
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-Complaints-To: abuse@pyrenet.fr
To: Simon Riggs <simon@2ndquadrant.com>
User-Agent: Mozilla Thunderbird 0.8 (Windows/20040913)
X-Accept-Language: en-us, en
In-Reply-To: <1098466150.20926.13.camel@localhost.localdomain>
X-Enigmail-Version: 0.86.1.0
X-Enigmail-Supports: pgp-inline, pgp-mime
To: pgsql-admin@postgresql.org
X-Virus-Scanned: by amavisd-new at hub.org
X-Mailing-List: pgsql-admin
Precedence: bulk
Sender: pgsql-admin-owner@postgresql.org
X-Virus-Scanned: by amavisd-new at hub.org
X-Spam-Checker-Version: SpamAssassin 2.61 (1.212.2.1-2003-12-09-exp) on
	candle.pha.pa.us
X-Spam-Status: No, hits=-4.9 required=5.0 tests=BAYES_00 autolearn=ham
	version=2.61
Status: OR

Simon Riggs wrote:

 > Situation I thought I saw was:
 >
 > - copy away current partially filled xlog N
 > - xlog N fills, N+1 starts
 > - xlog N+1 fills, N+2 starts
 > - copy away current partially filled xlog: N+2 (+10 secs later)
 >
 > i.e. if time to fill xlog (is ever) < time to copy away current xlog,
 > then you miss one.
 >
 > So problem: you can miss one and never know you've missed one until the
 > recovery can't find it, which it never returns from...so it just hangs.

No. restore.sh is not smart enough to know the last WAL file that must be
replayed; the only "smart thing" it does is copy the supposed "current WAL"
into the archive directory.

The script hangs (and this is a feature, not a bug) if and only if the master
is alive (at least I'm not seeing any other hang).

In your example, the archive directory will contain the files up to log N,
and log N+2 (the current WAL) is in the partial directory. If the master dies,
restore.sh will copy log N+2 into the archive directory; the spare node will
execute restore.sh with file log N+1 as its argument, and if it is not found,
restore.sh will exit.
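
The behaviour described above could be sketched roughly as follows. This is
an illustration under assumptions, not the actual restore.sh: the function
name, the directory arguments, the `master_alive` callable and the polling
interval are all hypothetical.

```python
import os
import shutil
import time


def restore(segment, dest, archive_dir, partial_dir, master_alive, poll_seconds=10):
    """Return 0 if `segment` was copied to `dest`, 1 if it cannot be supplied.

    `master_alive` is a caller-supplied function returning True while the
    master is reachable (hypothetical; how to detect that is not shown here).
    """
    source = os.path.join(archive_dir, segment)

    # Wait for the segment only while the master is still alive.
    while not os.path.exists(source) and master_alive():
        time.sleep(poll_seconds)

    if not os.path.exists(source):
        # The master is gone: move the partial "current" WAL into the archive.
        for name in os.listdir(partial_dir):
            shutil.copy(os.path.join(partial_dir, name),
                        os.path.join(archive_dir, name))

    if os.path.exists(source):
        shutil.copy(source, dest)
        return 0

    # The requested segment was never archived, so recovery ends here.
    return 1
```

With log N in the archive directory and log N+2 in the partial directory, a
request for log N+1 after the master has died returns 1, which matches the
exit described above: recovery stops after log N.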


Regards
Gaetano Mendola









---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend