mirror of
				https://github.com/postgres/postgres.git
				synced 2025-10-29 22:49:41 +03:00 
			
		
		
		
	
		
			
				
	
	
		
			
		
		
	
	
| From goran@kirra.net Mon Dec 20 14:30:54 1999
 | |
| Received: from villa.bildbasen.se (villa.bildbasen.se [193.45.225.97])
 | |
| 	by candle.pha.pa.us (8.9.0/8.9.0) with SMTP id PAA29058
 | |
| 	for <pgman@candle.pha.pa.us>; Mon, 20 Dec 1999 15:30:17 -0500 (EST)
 | |
| Received: (qmail 2485 invoked from network); 20 Dec 1999 20:29:53 -0000
 | |
| Received: from a112.dial.kiruna.se (HELO kirra.net) (193.45.238.12)
 | |
|   by villa.bildbasen.se with SMTP; 20 Dec 1999 20:29:53 -0000
 | |
| Sender: goran
 | |
| Message-ID: <385E9192.226CC37D@kirra.net>
 | |
| Date: Mon, 20 Dec 1999 21:29:06 +0100
 | |
| From: Goran Thyni <goran@kirra.net>
 | |
| Organization: kirra.net
 | |
| X-Mailer: Mozilla 4.6 [en] (X11; U; Linux 2.2.13 i586)
 | |
| X-Accept-Language: sv, en
 | |
| MIME-Version: 1.0
 | |
| To: Bruce Momjian <pgman@candle.pha.pa.us>
 | |
| CC: "neil d. quiogue" <nquiogue@ieee.org>,
 | |
|         PostgreSQL-development <pgsql-hackers@postgreSQL.org>
 | |
| Subject: Re: [HACKERS] Re: QUESTION: Replication
 | |
| References: <199912201508.KAA20572@candle.pha.pa.us>
 | |
| Content-Type: text/plain; charset=iso-8859-1
 | |
| Content-Transfer-Encoding: 8bit
 | |
| Status: OR
 | |
| 
 | |
| Bruce Momjian wrote:
 | |
| > We need major work in this area, or at least a plan and an FAQ item.
 | |
| > We are getting major questions on this, and I don't know enough even to
 | |
| > make an FAQ item telling people their options.
 | |
| 
 | |
| My 2 cents, or 2 ören since I'm a Swede, on this:
 | |
| 
 | |
| It is pretty simple to build replication with pg_dump: dump, transfer,
| empty the replica, and reload.
| But if we want "live replicas" we had better base our efforts on a
| mechanism that uses WAL logs to roll the replicas forward.
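
The dump, transfer, empty, and reload cycle described above could be scripted roughly as follows. This is only a sketch: the host and database names are placeholders, `reload_commands` is a hypothetical helper, and it assumes the standard PostgreSQL client programs `pg_dump`, `dropdb`, `createdb`, and `psql` are installed. The function only builds the command lines; running them is left to the caller.

```python
import shlex


def reload_commands(source_host, replica_host, dbname):
    """Build the shell steps for a full dump-and-reload of one database.

    Hypothetical helper: nothing is executed here.
    """
    src = shlex.quote(source_host)
    dst = shlex.quote(replica_host)
    db = shlex.quote(dbname)
    return [
        # Empty the replica by dropping and recreating the database.
        f"dropdb -h {dst} {db}",
        f"createdb -h {dst} {db}",
        # Dump the source and stream the SQL straight into the replica.
        f"pg_dump -h {src} {db} | psql -h {dst} -d {db}",
    ]


cmds = reload_commands("primary.example", "replica.example", "sales")
```

Every run copies the whole database, which is why this approach only suits small databases or infrequent refreshes.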
 | |
| 
 | |
| regards, 
 | |
| -----------------
 | |
| Göran Thyni
 | |
| On quiet nights you can hear Windows NT reboot!
 | |
| 
 | |
| From owner-pgsql-hackers@hub.org Fri Dec 24 10:01:18 1999
 | |
| Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
 | |
| 	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id LAA11295
 | |
| 	for <pgman@candle.pha.pa.us>; Fri, 24 Dec 1999 11:01:17 -0500 (EST)
 | |
| Received: from hub.org (hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id KAA20310 for <pgman@candle.pha.pa.us>; Fri, 24 Dec 1999 10:39:18 -0500 (EST)
 | |
| Received: from localhost (majordom@localhost)
 | |
| 	by hub.org (8.9.3/8.9.3) with SMTP id KAA61760;
 | |
| 	Fri, 24 Dec 1999 10:31:13 -0500 (EST)
 | |
| 	(envelope-from owner-pgsql-hackers)
 | |
| Received: by hub.org (bulk_mailer v1.5); Fri, 24 Dec 1999 10:30:48 -0500
 | |
| Received: (from majordom@localhost)
 | |
| 	by hub.org (8.9.3/8.9.3) id KAA58879
 | |
| 	for pgsql-hackers-outgoing; Fri, 24 Dec 1999 10:29:51 -0500 (EST)
 | |
| 	(envelope-from owner-pgsql-hackers@postgreSQL.org)
 | |
| Received: from bocs170n.black-oak.COM ([38.149.137.131])
 | |
| 	by hub.org (8.9.3/8.9.3) with ESMTP id KAA58795
 | |
| 	for <pgsql-hackers@postgreSQL.org>; Fri, 24 Dec 1999 10:29:00 -0500 (EST)
 | |
| 	(envelope-from DWalker@black-oak.com)
 | |
| From: DWalker@black-oak.com
 | |
| To: pgsql-hackers@postgreSQL.org
 | |
| Subject: [HACKERS] database replication
 | |
| Date: Fri, 24 Dec 1999 10:27:59 -0500
 | |
| Message-ID: <OFD38C9424.B391F434-ON85256851.0054F41A@black-oak.COM>
 | |
| X-Priority: 3 (Normal)
 | |
| X-MIMETrack: Serialize by Router on notes01n/BOCS(Release 5.0.1|July 16, 1999) at 12/24/99
 | |
| 	10:28:01 AM
 | |
| MIME-Version: 1.0
 | |
| MIME-Version: 1.0
 | |
| Content-Type: text/html; charset=ISO-8859-1
 | |
| Content-Transfer-Encoding: quoted-printable
 | |
| Sender: owner-pgsql-hackers@postgreSQL.org
 | |
| Status: OR
 | |
| 
 | |
| I've been toying with the idea of implementing database replication for
| the last few days.  The system I'm proposing will be a separate program
| which can be run on any machine and will most likely be implemented in
| Python.  What I'm looking for at this point are gaping holes in my
| thinking/logic/etc.  Here's what I'm thinking...
| 
| 1) I want to make this program an additional layer over PostgreSQL.  I
| really don't want to hack server code if I can get away with it.  At this
| point I don't feel I need to.
| 
| 2) The replication system will need to add at least one field to each
| table in each database that needs to be replicated.  This field will be a
| date/time stamp which identifies the "last update" of the record.  This
| field will be called PGR_TIME for lack of a better name.  Because this
| field will be used from within programs and triggers it can be longer so
| as to not mistake it for a user field.
| 
| 3) For each table to be replicated the replication system will
| programmatically add one plpgsql function and trigger to modify the
| PGR_TIME field on both UPDATEs and INSERTs.  The name of this function
| and trigger will be along the lines of
| <table_name>_replication_update_trigger and
| <table_name>_replication_update_function.  The function is a simple
| two-line chunk of code to set the field PGR_TIME equal to NOW.  The
| trigger is called before each insert/update.  When looking at the Docs I
| see that times are stored in Zulu (GMT) time.  Because of this I don't
| have to worry about time zones and the like.  I need direction on this
| part (such as "hey dummy, look at page N of file X.").
| 
| 4) At this point we have tables which can, at a basic level, tell the
| replication system when they were last updated.
| 
| 5) The replication system will have a database of its own to record the
| last replication event, hold configuration, logs, etc.  I'd prefer to
| store the configuration in a PostgreSQL table but it could just as easily
| be stored in a text file on the filesystem somewhere.
| 
| 6) To handle replication I basically check the local "last replication
| time" and compare it against the remote PGR_TIME fields.  If the remote
| PGR_TIME is greater than the last replication time then change the local
| copy of the database, otherwise, change the remote end of the database.
| At this point I don't have a way to know WHICH field changed between the
| two replicas so either I do ROW level replication or I check each field.
| I check PGR_TIME to determine which field is the most current.  Some fine
| tuning of this process will have to occur no doubt.
| 
| 7) The commandline utility, fired off by something like cron, could run
| several times during the day -- command line parameters can be
| implemented to say PUSH ALL CHANGES TO SERVER A, or PULL ALL CHANGES FROM
| SERVER B.
| 
| Questions/Concerns:
| 
| 1) How far do I go with this?  Do I start manhandling the system catalogs
| (pg_* tables)?
| 
| 2) As to #2 and #3 above, I really don't like tools automagically
| changing my tables but at this point I don't see a way around it.  I
| guess this is where the testing comes into play.
| 
| 3) Security: the replication app will have to have pretty good rights to
| the database so it can add the necessary functions and triggers, modify
| table schema, etc.
| 
| So, any "you're insane and should run home to momma" comments?
| 
|               Damond
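
Steps 2 and 3 of the proposal could be generated by a small Python helper along these lines. This is a sketch, not the author's code: the function `replication_ddl`, the `timestamp` column type, and the exact plpgsql body are assumptions, and the SQL syntax shown is that of current PostgreSQL releases rather than the 1999 server.

```python
def replication_ddl(table_name):
    """Return SQL that adds PGR_TIME and a stamping trigger to one table.

    Assumes table_name is a trusted, plain identifier (no quoting is done).
    """
    func = f"{table_name}_replication_update_function"
    trig = f"{table_name}_replication_update_trigger"
    return [
        # Step 2: the "last update" column.
        f"ALTER TABLE {table_name} ADD COLUMN pgr_time timestamp;",
        # Step 3: a function that stamps the row with the current time.
        f"CREATE FUNCTION {func}() RETURNS trigger AS $$\n"
        "BEGIN\n"
        "    NEW.pgr_time := now();\n"
        "    RETURN NEW;\n"
        "END;\n"
        "$$ LANGUAGE plpgsql;",
        # Step 3: fire the function before every INSERT or UPDATE.
        f"CREATE TRIGGER {trig}\n"
        f"    BEFORE INSERT OR UPDATE ON {table_name}\n"
        f"    FOR EACH ROW EXECUTE PROCEDURE {func}();",
    ]


stmts = replication_ddl("orders")
```

Generating the statements as text keeps the helper outside the server, in line with the goal in step 1 of not hacking server code.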
 | |
| 
 | |
| ************
 | |
| 
 | |
| From owner-pgsql-hackers@hub.org Fri Dec 24 18:31:03 1999
 | |
| Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
 | |
| 	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id TAA26244
 | |
| 	for <pgman@candle.pha.pa.us>; Fri, 24 Dec 1999 19:31:02 -0500 (EST)
 | |
| Received: from hub.org (hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id TAA12730 for <pgman@candle.pha.pa.us>; Fri, 24 Dec 1999 19:30:05 -0500 (EST)
 | |
| Received: from localhost (majordom@localhost)
 | |
| 	by hub.org (8.9.3/8.9.3) with SMTP id TAA57851;
 | |
| 	Fri, 24 Dec 1999 19:23:31 -0500 (EST)
 | |
| 	(envelope-from owner-pgsql-hackers)
 | |
| Received: by hub.org (bulk_mailer v1.5); Fri, 24 Dec 1999 19:22:54 -0500
 | |
| Received: (from majordom@localhost)
 | |
| 	by hub.org (8.9.3/8.9.3) id TAA57710
 | |
| 	for pgsql-hackers-outgoing; Fri, 24 Dec 1999 19:21:56 -0500 (EST)
 | |
| 	(envelope-from owner-pgsql-hackers@postgreSQL.org)
 | |
| Received: from Mail.austin.rr.com (sm2.texas.rr.com [24.93.35.55])
 | |
| 	by hub.org (8.9.3/8.9.3) with ESMTP id TAA57680
 | |
| 	for <pgsql-hackers@postgresql.org>; Fri, 24 Dec 1999 19:21:25 -0500 (EST)
 | |
| 	(envelope-from ELOEHR@austin.rr.com)
 | |
| Received: from austin.rr.com ([24.93.40.248]) by Mail.austin.rr.com  with Microsoft SMTPSVC(5.5.1877.197.19);
 | |
|   Fri, 24 Dec 1999 18:12:50 -0600
 | |
| Message-ID: <38640E2D.75136600@austin.rr.com>
 | |
| Date: Fri, 24 Dec 1999 18:22:05 -0600
 | |
| From: Ed Loehr <ELOEHR@austin.rr.com>
 | |
| X-Mailer: Mozilla 4.7 [en] (X11; U; Linux 2.2.12-20smp i686)
 | |
| X-Accept-Language: en
 | |
| MIME-Version: 1.0
 | |
| To: DWalker@black-oak.com
 | |
| CC: pgsql-hackers@postgreSQL.org
 | |
| Subject: Re: [HACKERS] database replication
 | |
| References: <OFD38C9424.B391F434-ON85256851.0054F41A@black-oak.COM>
 | |
| Content-Type: text/plain; charset=us-ascii
 | |
| Content-Transfer-Encoding: 7bit
 | |
| Sender: owner-pgsql-hackers@postgreSQL.org
 | |
| Status: OR
 | |
| 
 | |
| DWalker@black-oak.com wrote:
 | |
| 
 | |
| > 6) To handle replication I basically check the local "last
 | |
| > replication time" and compare it against the remote PGR_TIME
 | |
| > fields.  If the remote PGR_TIME is greater than the last replication
 | |
| > time then change the local copy of the database, otherwise, change
 | |
| > the remote end of the database.  At this point I don't have a way to
 | |
| > know WHICH field changed between the two replicas so either I do ROW
 | |
| > level replication or I check each field.  I check PGR_TIME to
 | |
| > determine which field is the most current.  Some fine tuning of this
 | |
| > process will have to occur no doubt.
 | |
| 
 | |
| Interesting idea.  I can see how this might sync up two databases
 | |
| somehow.  For true replication, however, I would always want every
 | |
| replicated database to be, at the very least, internally consistent
 | |
| (i.e., referential integrity), even if it was a little behind on
 | |
| processing transactions.  In this method, it's not clear how
| consistency is ever achieved/guaranteed at any point in time if the
 | |
| input stream of changes is continuous.  If the input stream ceased,
 | |
| then I can see how this approach might eventually catch up and totally
 | |
| resync everything, but it looks *very* computationally  expensive.
 | |
| 
 | |
| But I might have missed something.  How would internal consistency be
 | |
| maintained?
 | |
| 
 | |
| 
 | |
| > 7) The commandline utility, fired off by something like cron, could
 | |
| > run several times during the day -- command line parameters can be
 | |
| > implemented to say PUSH ALL CHANGES TO SERVER A, or PULL ALL CHANGES
 | |
| > FROM SERVER B.
 | |
| 
 | |
| My two cents is that, while I can see this kind of database syncing as
 | |
| valuable, this is not the kind of "replication" I had in mind.  This
 | |
| may already be possible by simply copying the database.  What replication
 | |
| means to me is a live, continuously streaming sequence of updates from
 | |
| one database to another where the replicated database is always
 | |
| internally consistent, available for read-only queries, and never "too
 | |
| far" out of sync with the source/primary database.
 | |
| 
 | |
| What does replication mean to others?
 | |
| 
 | |
| Cheers,
 | |
| Ed Loehr
 | |
| 
 | |
| 
 | |
| 
 | |
| ************
 | |
| 
 | |
| From owner-pgsql-hackers@hub.org Fri Dec 24 21:31:10 1999
 | |
| Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
 | |
| 	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id WAA02578
 | |
| 	for <pgman@candle.pha.pa.us>; Fri, 24 Dec 1999 22:31:09 -0500 (EST)
 | |
| Received: from hub.org (hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id WAA16641 for <pgman@candle.pha.pa.us>; Fri, 24 Dec 1999 22:18:56 -0500 (EST)
 | |
| Received: from localhost (majordom@localhost)
 | |
| 	by hub.org (8.9.3/8.9.3) with SMTP id WAA89135;
 | |
| 	Fri, 24 Dec 1999 22:11:12 -0500 (EST)
 | |
| 	(envelope-from owner-pgsql-hackers)
 | |
| Received: by hub.org (bulk_mailer v1.5); Fri, 24 Dec 1999 22:10:56 -0500
 | |
| Received: (from majordom@localhost)
 | |
| 	by hub.org (8.9.3/8.9.3) id WAA89019
 | |
| 	for pgsql-hackers-outgoing; Fri, 24 Dec 1999 22:09:59 -0500 (EST)
 | |
| 	(envelope-from owner-pgsql-hackers@postgreSQL.org)
 | |
| Received: from bocs170n.black-oak.COM ([38.149.137.131])
 | |
| 	by hub.org (8.9.3/8.9.3) with ESMTP id WAA88957;
 | |
| 	Fri, 24 Dec 1999 22:09:11 -0500 (EST)
 | |
| 	(envelope-from dwalker@black-oak.com)
 | |
| Received: from gcx80 ([151.196.99.113])
 | |
|           by bocs170n.black-oak.COM (Lotus Domino Release 5.0.1)
 | |
|           with SMTP id 1999122422080835:6 ;
 | |
|           Fri, 24 Dec 1999 22:08:08 -0500 
 | |
| Message-ID: <001b01bf4e9e$647287d0$af63a8c0@walkers.org>
 | |
| From: "Damond Walker" <dwalker@black-oak.com>
 | |
| To: <owner-pgsql-hackers@postgreSQL.org>
 | |
| Cc: <pgsql-hackers@postgreSQL.org>
 | |
| References: <OFD38C9424.B391F434-ON85256851.0054F41A@black-oak.COM> <38640E2D.75136600@austin.rr.com>
 | |
| Subject: Re: [HACKERS] database replication
 | |
| Date: Fri, 24 Dec 1999 22:07:55 -0800
 | |
| MIME-Version: 1.0
 | |
| X-Priority: 3 (Normal)
 | |
| X-MSMail-Priority: Normal
 | |
| X-Mailer: Microsoft Outlook Express 5.00.2314.1300
 | |
| X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300
 | |
| X-MIMETrack: Itemize by SMTP Server on notes01n/BOCS(Release 5.0.1|July 16, 1999) at 12/24/99
 | |
| 	10:08:09 PM,
 | |
| 	Serialize by Router on notes01n/BOCS(Release 5.0.1|July 16, 1999) at 12/24/99
 | |
| 	10:08:11 PM,
 | |
| 	Serialize complete at 12/24/99 10:08:11 PM
 | |
| Content-Transfer-Encoding: 7bit
 | |
| Content-Type: text/plain;
 | |
| 	charset="iso-8859-1"
 | |
| Sender: owner-pgsql-hackers@postgreSQL.org
 | |
| Status: OR
 | |
| 
 | |
| >
 | |
| > Interesting idea.  I can see how this might sync up two databases
 | |
| > somehow.  For true replication, however, I would always want every
 | |
| > replicated database to be, at the very least, internally consistent
 | |
| > (i.e., referential integrity), even if it was a little behind on
 | |
| > processing transactions.  In this method, it's not clear how
| > consistency is ever achieved/guaranteed at any point in time if the
 | |
| > input stream of changes is continuous.  If the input stream ceased,
 | |
| > then I can see how this approach might eventually catch up and totally
 | |
| > resync everything, but it looks *very* computationally  expensive.
 | |
| >
 | |
| 
 | |
|     What's the typical unit of work for the database?  Are we talking about
 | |
| update transactions which span the entire DB?  Or are we talking about
 | |
| updating maybe 1% or less of the database everyday?  I'd think it would be
 | |
| more towards the latter than the former.  So, yes, this process would be
 | |
| computationally expensive but how many records would actually have to be
 | |
| sent back and forth?
 | |
| 
 | |
| > But I might have missed something.  How would internal consistency be
 | |
| > maintained?
 | |
| >
 | |
| 
 | |
|     Updates that occur at site A will be moved to site B and vice versa.
 | |
| Consistency would be maintained.  The only problem that I can see right off
 | |
| the bat would be what if site A and site B made changes to a row and then
 | |
| site C was brought into the picture?  Which one wins?
 | |
| 
 | |
|     Someone *has* to win when it comes to this type of thing.  You really
 | |
| DON'T want to start merging row changes...
 | |
| 
 | |
| >
 | |
| > My two cents is that, while I can see this kind of database syncing as
 | |
| > valuable, this is not the kind of "replication" I had in mind.  This
 | |
| > may already be possible by simply copying the database.  What replication
 | |
| > means to me is a live, continuously streaming sequence of updates from
 | |
| > one database to another where the replicated database is always
 | |
| > internally consistent, available for read-only queries, and never "too
 | |
| > far" out of sync with the source/primary database.
 | |
| >
 | |
| 
 | |
|     Sounds like you're talking about distributed transactions to me.  That's
 | |
| an entirely different subject altogether.  What you describe can be done
 | |
| by copying a database...but as you say, this would only work in a read-only
 | |
| situation.
 | |
| 
 | |
| 
 | |
|                 Damond
 | |
| 
 | |
| 
 | |
| ************
 | |
| 
 | |
| From owner-pgsql-hackers@hub.org Sat Dec 25 16:35:07 1999
 | |
| Received: from hub.org (hub.org [216.126.84.1])
 | |
| 	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id RAA28890
 | |
| 	for <pgman@candle.pha.pa.us>; Sat, 25 Dec 1999 17:35:05 -0500 (EST)
 | |
| Received: from localhost (majordom@localhost)
 | |
| 	by hub.org (8.9.3/8.9.3) with SMTP id RAA86997;
 | |
| 	Sat, 25 Dec 1999 17:29:10 -0500 (EST)
 | |
| 	(envelope-from owner-pgsql-hackers)
 | |
| Received: by hub.org (bulk_mailer v1.5); Sat, 25 Dec 1999 17:28:09 -0500
 | |
| Received: (from majordom@localhost)
 | |
| 	by hub.org (8.9.3/8.9.3) id RAA86863
 | |
| 	for pgsql-hackers-outgoing; Sat, 25 Dec 1999 17:27:11 -0500 (EST)
 | |
| 	(envelope-from owner-pgsql-hackers@postgreSQL.org)
 | |
| Received: from mtiwmhc08.worldnet.att.net (mtiwmhc08.worldnet.att.net [204.127.131.19])
 | |
| 	by hub.org (8.9.3/8.9.3) with ESMTP id RAA86798
 | |
| 	for <pgsql-hackers@postgreSQL.org>; Sat, 25 Dec 1999 17:26:34 -0500 (EST)
 | |
| 	(envelope-from pgsql@rkirkpat.net)
 | |
| Received: from [192.168.3.100] ([12.74.72.219])
 | |
|           by mtiwmhc08.worldnet.att.net (InterMail v03.02.07.07 118-134)
 | |
|           with ESMTP id <19991225222554.VIOL28505@[12.74.72.219]>;
 | |
|           Sat, 25 Dec 1999 22:25:54 +0000
 | |
| Date: Sat, 25 Dec 1999 15:25:47 -0700 (MST)
 | |
| From: Ryan Kirkpatrick <pgsql@rkirkpat.net>
 | |
| X-Sender: rkirkpat@excelsior.rkirkpat.net
 | |
| To: DWalker@black-oak.com
 | |
| cc: pgsql-hackers@postgreSQL.org
 | |
| Subject: Re: [HACKERS] database replication
 | |
| In-Reply-To: <OFD38C9424.B391F434-ON85256851.0054F41A@black-oak.COM>
 | |
| Message-ID: <Pine.LNX.4.10.9912251433310.1551-100000@excelsior.rkirkpat.net>
 | |
| MIME-Version: 1.0
 | |
| Content-Type: TEXT/PLAIN; charset=US-ASCII
 | |
| Sender: owner-pgsql-hackers@postgreSQL.org
 | |
| Status: OR
 | |
| 
 | |
| On Fri, 24 Dec 1999 DWalker@black-oak.com wrote:
 | |
| 
 | |
| > I've been toying with the idea of implementing database replication
 | |
| > for the last few days.
 | |
| 
 | |
| 	I too have been thinking about this some over the last year or
 | |
| two, just trying to find a quick and easy way to do it. I am not so
 | |
| interested in replication as in synchronization, for example between a
| desktop machine and a laptop, so I can keep the databases on each in sync
| with each other. For this sort of purpose, both the local and remote
| databases would be "idle" at the time of syncing.
 | |
| 
 | |
| > 2) The replication system will need to add at least one field to each
 | |
| > table in each database that needs to be replicated. This field will be
 | |
| > a date/time stamp which identifies the "last update" of the record.  
 | |
| > This field will be called PGR_TIME for lack of a better name.  
 | |
| > Because this field will be used from within programs and triggers it
 | |
| > can be longer so as to not mistake it for a user field.
 | |
| 
 | |
| 	How about a single, separate table with the fields 'database',
| 'tablename', 'oid', 'last_changed', that would store the same data as your
| PGR_TIME field? It would be separate from the actual data tables, and
| therefore would be totally transparent to any database interface
| applications. The 'oid' field would hold each row's OID, a nice, unique
| identification number for the row, while the other fields would tell which
| table and database the OID is in. This table can then be compared with
| the same table on a remote machine to quickly find updates and changes,
| and each difference can be dealt with in turn.
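
Comparing the two tracking tables might look like the following sketch. The rows are modeled as plain dictionaries keyed by (database, tablename, oid) and mapping to last_changed; the function name `diff_tracking` and this data layout are illustrative assumptions, not part of the proposal.

```python
def diff_tracking(local, remote):
    """Compare two tracking tables.

    Each argument maps (database, tablename, oid) -> last_changed.
    Returns the keys that differ, sorted, each paired with both timestamps
    (None when the row is missing on that side).
    """
    differences = []
    for key in sorted(set(local) | set(remote)):
        local_ts = local.get(key)
        remote_ts = remote.get(key)
        if local_ts != remote_ts:
            differences.append((key, local_ts, remote_ts))
    return differences


local = {("db", "t", 1): 10, ("db", "t", 2): 20}
remote = {("db", "t", 1): 10, ("db", "t", 2): 25, ("db", "t", 3): 5}
changes = diff_tracking(local, remote)
```

Only the keys and timestamps cross the wire during the comparison; the row data itself is fetched afterwards for the entries that differ.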
 | |
| 
 | |
| > 3) For each table to be replicated the replication system will
 | |
| > programmatically add one plpgsql function and trigger to modify the
| > PGR_TIME field on both UPDATEs and INSERTs.  The name of this function
| > and trigger will be along the lines of
| > <table_name>_replication_update_trigger and
| > <table_name>_replication_update_function.  The function is a simple
| > two-line chunk of code to set the field PGR_TIME equal to NOW.  The
| > trigger is called before each insert/update.  When looking at the Docs
| > I see that times are stored in Zulu (GMT) time.  Because of this I
 | |
| > don't have to worry about time zones and the like.  I need direction
 | |
| > on this part (such as "hey dummy, look at page N of file X.").
 | |
| 
 | |
| 	I like this idea, better than any I have come up with yet. Though,
 | |
| how are you going to handle DELETEs? 
 | |
| 
 | |
| > 6) To handle replication I basically check the local "last replication
 | |
| > time" and compare it against the remote PGR_TIME fields.  If the
 | |
| > remote PGR_TIME is greater than the last replication time then change
 | |
| > the local copy of the database, otherwise, change the remote end of
 | |
| > the database.  At this point I don't have a way to know WHICH field
 | |
| > changed between the two replicas so either I do ROW level replication
 | |
| > or I check each field.  I check PGR_TIME to determine which field is
 | |
| > the most current.  Some fine tuning of this process will have to occur
 | |
| > no doubt.
 | |
| 
 | |
| 	Yea, this is indeed the sticky part, and would indeed require some
| fine-tuning. Basically, the way I see it, is if the two timestamps for a
 | |
| single row do not match (or even if the row and therefore timestamp is
 | |
| missing on one side or the other altogether):
 | |
| 	local ts > remote ts => Local row is exported to remote.
 | |
| 	remote ts > local ts => Remote row is exported to local.
 | |
| 	local ts > last sync time && no remote ts => 
 | |
| 		Local row is inserted on remote.
 | |
| 	local ts < last sync time && no remote ts =>
 | |
| 		Local row is deleted.
 | |
| 	remote ts > last sync time && no local ts =>
 | |
| 		Remote row is inserted on local.
 | |
| 	remote ts < last sync time && no local ts =>
 | |
| 		Remote row is deleted.
 | |
| where the synchronization process is running on the local machine. By
 | |
| exported, I mean the local values are sent to the remote machine, and the
 | |
| row on that remote machine is updated to the local values. How does this
 | |
| sound?
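
The six rules above translate almost directly into a decision function. The sketch below is one reading of them: `None` stands for a row (and timestamp) missing on that side, the function name and the returned strings are made up for illustration, and the cases the rules leave open (equal timestamps, or a timestamp exactly equal to the last sync time) are reported rather than guessed.

```python
def sync_action(local_ts, remote_ts, last_sync):
    """Decide what to do with one row; None means the row is absent."""
    if local_ts is not None and remote_ts is not None:
        if local_ts > remote_ts:
            return "export local row to remote"
        if remote_ts > local_ts:
            return "export remote row to local"
        return "in sync"
    if local_ts is not None:
        # Row exists only locally: new since the last sync, or deleted remotely.
        if local_ts > last_sync:
            return "insert local row on remote"
        if local_ts < last_sync:
            return "delete local row"
        return "undecided"
    if remote_ts is not None:
        # Row exists only remotely: new since the last sync, or deleted locally.
        if remote_ts > last_sync:
            return "insert remote row on local"
        if remote_ts < last_sync:
            return "delete remote row"
        return "undecided"
    return "nothing to do"
```

Any comparable timestamp type works, as long as both machines' clocks are close enough for the comparisons to be meaningful.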
 | |
| 
 | |
| > 7) The commandline utility, fired off by something like cron, could
 | |
| > run several times during the day -- command line parameters can be
 | |
| > implemented to say PUSH ALL CHANGES TO SERVER A, or PULL ALL CHANGES
 | |
| > FROM SERVER B.
 | |
| 
 | |
| 	Or run manually for my purposes. Also, maybe follow it
 | |
| with a vacuum run on both sides for all databases, as this is going to
 | |
| potentially cause lots of table changes that could stand a cleanup.
 | |
| 
 | |
| > 1) How far do I go with this?  Do I start manhandling the system catalogs (pg_* tables)?
 | |
| 
 | |
| 	Initially, I would just stick to user table data... If you have
 | |
| changes in triggers and other meta-data/executable code, you are going to
 | |
| want to make syncs of that stuff manually anyway. At least I would want
 | |
| to.
 | |
| 
 | |
| > 2) As to #2 and #3 above, I really don't like tools automagically
 | |
| > changing my tables but at this point I don't see a way around it.  I
 | |
| > guess this is where the testing comes into play.
 | |
| 
 | |
| 	Hence the reason for the separate table with just a row's
| identification and last update time. The only modification to the synced
| database is the update trigger, which should be pretty harmless.
 | |
| 
 | |
| > 3) Security: the replication app will have to have pretty good rights
 | |
| > to the database so it can add the necessary functions and triggers,
 | |
| > modify table schema, etc.
 | |
| 
 | |
| 	Just run the sync program as the postgres super user, and there
 | |
| are no problems. :)
 | |
| 
 | |
| >   So, any "you're insane and should run home to momma" comments?
 | |
| 
 | |
| 	No, not at all. Though it probably should be renamed from
| replication to synchronization. The former is usually associated with a
| continuous stream of updates between the local and remote databases, so
| they are almost always in sync, with a queuing ability in case their
| connection is lost for a span of time. It is very complex and difficult to
| implement, and would require hacking server code. :( It is something only
| Sybase and Oracle have (as far as I know), and from what I have seen of
| Sybase's replication server support (dated by 5 years) it was a pain to set
| up and get running correctly.
 | |
| 	The latter, synchronization, is much more manageable, and can still
 | |
| be useful, especially when you have a large database you want in two
 | |
| places, mainly for read only purposes at one end or the other, but don't
 | |
| want to waste the time/bandwidth to move and load the entire database each
 | |
| time it changes on one end or the other. Same idea as mirroring software
 | |
| for FTP sites, just transfers the changes, and nothing more.
 | |
| 	I also like the idea of using Python. I have been using it
 | |
| recently for some database interfaces (to PostgreSQL of course :), and it
 | |
| is a very nice language to work with. I have some worries about the
| performance of the program though, as Python is only an interpreted
| language, and I have yet to be impressed with the execution speed of my
| database interfaces.
 | |
| 	Anyway, it sounds like a good project, and finally one where I
| actually have a clue of what is going on, and the skills to help. So, if
| you are interested in pursuing this project, I would be more than glad to
 | |
| help. TTYL.
 | |
| 
 | |
| ---------------------------------------------------------------------------
 | |
| |   "For to me to live is Christ, and to die is gain."                    |
 | |
| |                                            --- Philippians 1:21 (KJV)   |
 | |
| ---------------------------------------------------------------------------
 | |
| |   Ryan Kirkpatrick  |  Boulder, Colorado  |  http://www.rkirkpat.net/   |
 | |
| ---------------------------------------------------------------------------
 | |
| 
 | |
| 
 | |
| 
 | |
| ************
 | |
| 
 | |
| From owner-pgsql-hackers@hub.org Sun Dec 26 08:31:09 1999
 | |
| Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
 | |
| 	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id JAA17976
 | |
| 	for <pgman@candle.pha.pa.us>; Sun, 26 Dec 1999 09:31:07 -0500 (EST)
 | |
| Received: from hub.org (hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id JAA23337 for <pgman@candle.pha.pa.us>; Sun, 26 Dec 1999 09:28:36 -0500 (EST)
 | |
| Received: from localhost (majordom@localhost)
 | |
| 	by hub.org (8.9.3/8.9.3) with SMTP id JAA90738;
 | |
| 	Sun, 26 Dec 1999 09:21:58 -0500 (EST)
 | |
| 	(envelope-from owner-pgsql-hackers)
 | |
| Received: by hub.org (bulk_mailer v1.5); Sun, 26 Dec 1999 09:19:19 -0500
 | |
| Received: (from majordom@localhost)
 | |
| 	by hub.org (8.9.3/8.9.3) id JAA90498
 | |
| 	for pgsql-hackers-outgoing; Sun, 26 Dec 1999 09:18:21 -0500 (EST)
 | |
| 	(envelope-from owner-pgsql-hackers@postgreSQL.org)
 | |
| Received: from bocs170n.black-oak.COM ([38.149.137.131])
 | |
| 	by hub.org (8.9.3/8.9.3) with ESMTP id JAA90452
 | |
| 	for <pgsql-hackers@postgreSQL.org>; Sun, 26 Dec 1999 09:17:54 -0500 (EST)
 | |
| 	(envelope-from dwalker@black-oak.com)
 | |
| Received: from vmware98 ([151.196.99.113])
 | |
|           by bocs170n.black-oak.COM (Lotus Domino Release 5.0.1)
 | |
|           with SMTP id 1999122609164808:7 ;
 | |
|           Sun, 26 Dec 1999 09:16:48 -0500 
 | |
| Message-ID: <002201bf4fb3$623f0220$b263a8c0@vmware98.walkers.org>
 | |
| From: "Damond Walker" <dwalker@black-oak.com>
 | |
| To: "Ryan Kirkpatrick" <pgsql@rkirkpat.net>
 | |
| Cc: <pgsql-hackers@postgreSQL.org>
 | |
| Subject: Re: [HACKERS] database replication
 | |
| Date: Sun, 26 Dec 1999 10:10:41 -0500
 | |
| MIME-Version: 1.0
 | |
| X-Priority: 3 (Normal)
 | |
| X-MSMail-Priority: Normal
 | |
| X-Mailer: Microsoft Outlook Express 4.72.3110.1
 | |
| X-MimeOLE: Produced By Microsoft MimeOLE V4.72.3110.3
 | |
| X-MIMETrack: Itemize by SMTP Server on notes01n/BOCS(Release 5.0.1|July 16, 1999) at 12/26/99
 | |
| 	09:16:51 AM,
 | |
| 	Serialize by Router on notes01n/BOCS(Release 5.0.1|July 16, 1999) at 12/26/99
 | |
| 	09:16:54 AM,
 | |
| 	Serialize complete at 12/26/99 09:16:54 AM
 | |
| Content-Transfer-Encoding: 7bit
 | |
| Content-Type: text/plain;
 | |
| 	charset="iso-8859-1"
 | |
| Sender: owner-pgsql-hackers@postgreSQL.org
 | |
| Status: OR
 | |
| 
 | |
| >
 | |
| >     I too have been thinking about this some over the last year or
 | |
| >two, just trying to find a quick and easy way to do it. I am not so
 | |
| >interested in replication as in synchronization, for example between a
| >desktop machine and a laptop, so I can keep the databases on each in sync
| >with each other. For this sort of purpose, both the local and remote
| >databases would be "idle" at the time of syncing.
 | |
| >
 | |
| 
 | |
|     I don't think it would matter if the databases are idle or not to be
 | |
| honest with you.  At any single point in time when you replicate I'd figure
 | |
| that the database would be in a consistent state.  So, you should be able to
 | |
| replicate (or sync) a remote database that is in use.  After all, you're
 | |
| getting a snapshot of the database as it stands at 8:45 PM.  At 8:46 PM it
 | |
| may be totally different...but the next time syncing takes place those
 | |
| changes would appear in your local copy.
 | |
| 
 | |
|     The one problem you may run into is if the remote host is running a
 | |
| large batch process.  It's very likely that you will get 50% of their
 | |
| changes when you replicate...but then again, that's why you can schedule the
 | |
| event to work around such things.
 | |
| 
 | |
| >     How about a single, seperate table with the fields of 'database',
 | |
| >'tablename', 'oid', 'last_changed', that would store the same data as your
 | |
| >PGR_TIME field. It would be seperated from the actually data tables, and
 | |
| >therefore would be totally transparent to any database interface
 | |
| >applications. The 'oid' field would hold each row's OID, a nice, unique
 | |
| >identification number for the row, while the other fields would tell which
 | |
| >table and database the oid is in. Then this table can be compared with the
 | |
| >same table on a remote machine to quickly find updates and changes, then
 | |
| >each difference can be dealt with in turn.
 | |
| >
 | |
| 
 | |
|     The problem with OID's is that they are unique at the local level but if
 | |
| you try and use them between servers you can run into overlap.  Also, if a
 | |
| database is under heavy use this table could quickly become VERY large.  Add
 | |
| indexes to this table to help performance and you're taking up even more
 | |
| disk space.
 | |
| 
 | |
|     Using the PGR_TIME field with an index will allow us to find rows which
 | |
| have changed VERY quickly.  All we need to do now is somehow programmatically
 | |
| find the primary key for a table so the person setting up replication (or
 | |
| syncing) doesn't have to have an in-depth knowledge of the schema in order to
 | |
| set up a syncing schedule.
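A minimal sketch of the two steps discussed above: finding changed rows through an indexed timestamp column, and discovering a table's primary key from the catalog. It uses Python's built-in sqlite3 module as a stand-in so that it runs anywhere; it is not PostgreSQL code. The `customer` table, the integer `pgr_time` values, and both function names are assumptions made for illustration.

```python
import sqlite3

def changed_since(conn, table, last_sync):
    """Return rows whose pgr_time is newer than last_sync."""
    # Table names cannot be bound as parameters; `table` is trusted here.
    cur = conn.execute(
        f"SELECT * FROM {table} WHERE pgr_time > ? ORDER BY pgr_time",
        (last_sync,),
    )
    return cur.fetchall()

def primary_key_columns(conn, table):
    """Discover the primary key columns from the catalog (SQLite flavour)."""
    info = conn.execute(f"PRAGMA table_info({table})").fetchall()
    # Each row is (cid, name, type, notnull, dflt_value, pk); pk > 0 gives
    # the column's 1-based position inside the primary key.
    keyed = sorted((row[5], row[1]) for row in info if row[5] > 0)
    return [name for _, name in keyed]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (region TEXT, num INTEGER, name TEXT, "
    "pgr_time INTEGER, PRIMARY KEY (region, num))"
)
# The index is what keeps the "changed since" lookup cheap on large tables.
conn.execute("CREATE INDEX customer_pgr_time ON customer (pgr_time)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?, ?)",
    [("east", 1, "Ann", 100), ("east", 2, "Bob", 205), ("west", 1, "Cy", 310)],
)

recent = changed_since(conn, "customer", 200)   # rows touched after time 200
keys = primary_key_columns(conn, "customer")    # key discovered, not configured
```

On PostgreSQL the key lookup would presumably go through the system catalogs rather than a PRAGMA, but the shape of the sync tool's logic would be the same.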
 | |
| 
 | |
| >
 | |
| >     I like this idea, better than any I have come up with yet. Though,
 | |
| >how are you going to handle DELETEs?
 | |
| >
 | |
| 
 | |
|     Oops...how about defining a trigger for this?  With deletion I guess we
 | |
| would have to move a flag into another table saying we deleted record 'X'
 | |
| with this primary key from this table.
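The delete trigger idea could look roughly like the following. This is only a sketch: it uses SQLite through Python's sqlite3 module instead of PostgreSQL and plpgsql, and the table name `pgr_deleted`, its columns, and the trigger name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);

    -- Side table holding one stub per deleted row.
    CREATE TABLE pgr_deleted (
        tablename  TEXT,
        pkey       TEXT,
        deleted_at TEXT DEFAULT CURRENT_TIMESTAMP
    );

    -- Record the table name and primary key of every deleted row.
    CREATE TRIGGER customer_log_delete AFTER DELETE ON customer
    BEGIN
        INSERT INTO pgr_deleted (tablename, pkey)
        VALUES ('customer', CAST(OLD.id AS TEXT));
    END;
""")
conn.execute("INSERT INTO customer VALUES (1, 'Ann')")
conn.execute("INSERT INTO customer VALUES (2, 'Bob')")
conn.execute("DELETE FROM customer WHERE id = 2")

# The sync process would read these stubs and replay the deletes remotely.
stubs = conn.execute("SELECT tablename, pkey FROM pgr_deleted").fetchall()
remaining = conn.execute("SELECT id FROM customer").fetchall()
```

Keeping the stubs in their own table leaves the user tables untouched apart from the trigger itself.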
 | |
| 
 | |
| >
 | |
| >     Yea, this is indeed the sticky part, and would indeed require some
 | |
| >fine-tuning. Basically, the way I see it, is if the two timestamps for a
 | |
| >single row do not match (or even if the row and therefore timestamp is
 | |
| >missing on one side or the other altogether):
 | |
| >     local ts > remote ts => Local row is exported to remote.
 | |
| >     remote ts > local ts => Remote row is exported to local.
 | |
| >     local ts > last sync time && no remote ts =>
 | |
| >          Local row is inserted on remote.
 | |
| >     local ts < last sync time && no remote ts =>
 | |
| >          Local row is deleted.
 | |
| >     remote ts > last sync time && no local ts =>
 | |
| >          Remote row is inserted on local.
 | |
| >     remote ts < last sync time && no local ts =>
 | |
| >          Remote row is deleted.
 | |
| >where the synchronization process is running on the local machine. By
 | |
| >exported, I mean the local values are sent to the remote machine, and the
 | |
| >row on that remote machine is updated to the local values. How does this
 | |
| >sound?
 | |
| >
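The rules quoted above can be written down as one small function. This is a sketch in Python under some assumptions: timestamps are plain numbers, a missing row is represented by `None`, and the function name and returned strings are made up for illustration. The quoted rules do not say what happens when a timestamp exactly equals the last sync time, so that case is reported as undecided rather than guessed.

```python
from typing import Optional

def sync_action(local_ts: Optional[float],
                remote_ts: Optional[float],
                last_sync: float) -> str:
    """Decide what to do with one row, following the quoted rule table."""
    if local_ts is not None and remote_ts is not None:
        if local_ts > remote_ts:
            return "export local row to remote"
        if remote_ts > local_ts:
            return "export remote row to local"
        return "in sync"

    if local_ts is not None:            # row exists only locally
        if local_ts > last_sync:
            return "insert local row on remote"   # created since last sync
        if local_ts < last_sync:
            return "delete local row"             # remote deleted it since
        return "undecided"                        # tie: not covered by the rules

    if remote_ts is not None:           # row exists only remotely
        if remote_ts > last_sync:
            return "insert remote row on local"
        if remote_ts < last_sync:
            return "delete remote row"
        return "undecided"

    return "nothing to do"              # row missing on both sides

# Example: last sync at t=100, row changed locally at t=150 and absent remotely.
example = sync_action(150, None, 100)
```

Note that every branch depends on the two machines agreeing on what time it is, which is the weakness of a timestamp-based scheme.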
 | |
| 
 | |
|     The replication part will be the most complex...that much is for
 | |
| certain...
 | |
| 
 | |
|     I've been writing systems in Lotus Notes/Domino for the last year or so
 | |
| and I've grown quite spoiled with what it can do in regards to replication.
 | |
| It's not real-time but you have to gear your applications to this type of
 | |
| thing (it's possible to create documents, fire off email to notify people of
 | |
| changes and have the email arrive before the replicated documents do).
 | |
| Replicating large Notes/Domino databases takes quite a while....I don't see
 | |
| any kind of replication or syncing running in a blink of an eye.
 | |
| 
 | |
|     Having said that, a good algo will have to be written to cut down on
 | |
| network traffic and to keep database conversations down to a minimum.  This
 | |
| will be appreciated by people with low bandwidth connections I'm sure
 | |
| (dial-ups, fractional T1's, etc).
 | |
| 
 | |
| >     Or run manually for my purposes. Also, maybe follow it
 | |
| >with a vacuum run on both sides for all databases, as this is going to
 | |
| >potentially cause lots of table changes that could stand with a cleanup.
 | |
| >
 | |
| 
 | |
|     What would a vacuum do to a system being used by many people?
 | |
| 
 | |
| >     No, not at all. Though it probably should be renamed from
 | |
| >replication to synchronization. The former is usually associated with a
 | |
| >continuous stream of updates between the local and remote databases, so
 | |
| >they are almost always in sync, and have a queuing ability if their
 | |
| >connection is lost for a span of time as well. Very complex and difficult to
 | |
| >implement, and would require hacking server code. :( Something only Sybase
 | |
| >and Oracle have (as far as I know), and from what I have seen of Sybase's
 | |
| >replication server support (dated by 5yrs) it was a pain to set up and get
 | |
| >running correctly.
 | |
| 
 | |
|     It could probably be named either way...but the one thing I really don't
 | |
| want to do is start hacking server code.  The PostgreSQL people have enough
 | |
| to do without worrying about trying to meld anything I've done to their
 | |
| server.   :)
 | |
| 
 | |
|     Besides, I like the idea of having it operate as a stand-alone product.
 | |
| The only PostgreSQL feature we would require would be triggers and
 | |
| plpgsql...what was the earliest version of PostgreSQL that supported
 | |
| plpgsql?  Even then I don't see the triggers being that complex to boot.
 | |
| 
 | |
| >     I also like the idea of using Python. I have been using it
 | |
| >recently for some database interfaces (to PostgreSQL of course :), and it
 | |
| >is a very nice language to work with. Some worries about performance of
 | |
| >the program though, as python is only an interpreted language, and I have
 | |
| >yet to really be impressed with the speed of execution of my database
 | |
| >interfaces yet.
 | |
| 
 | |
|     The only thing we'd need for Python is the Python extensions for
 | |
| PostgreSQL...which in turn requires libpq and that's about it.  So, it
 | |
| should be able to run on any platform supported by Python and libpq.  Using
 | |
| TK for the interface components will require NT people to get additional
 | |
| software from the 'net.  At least it did with older versions of Windows
 | |
| Python.  Unix folks should be happy....assuming they have X running on the
 | |
| machine doing the replication or syncing.  Even then I wrote a curses based
 | |
| Python interface awhile back which allows buttons, progress bars, input
 | |
| fields, etc (I called it tinter and it's available at
 | |
| http://iximd.com/~dwalker).  It's a simple interface and could probably be
 | |
| cleaned up a bit but it works.  :)
 | |
| 
 | |
| >     Anyway, it sounds like a good project, and finally one where I
 | |
| >actually have a clue of what is going on, and the skills to help. So, if
 | |
| >you are interested in pursuing this project, I would be more than glad to
 | |
| >help. TTYL.
 | |
| >
 | |
| 
 | |
| 
 | |
|     That would be a Good Thing.  Have webspace somewhere?  If I can get
 | |
| permission from the "powers that be" at the office I could host a website on
 | |
| our (Domino) webserver.
 | |
| 
 | |
|                 Damond
 | |
| 
 | |
| 
 | |
| ************
 | |
| 
 | |
| From owner-pgsql-hackers@hub.org Sun Dec 26 19:11:48 1999
 | |
| Received: from hub.org (hub.org [216.126.84.1])
 | |
| 	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id UAA26661
 | |
| 	for <pgman@candle.pha.pa.us>; Sun, 26 Dec 1999 20:11:46 -0500 (EST)
 | |
| Received: from localhost (majordom@localhost)
 | |
| 	by hub.org (8.9.3/8.9.3) with SMTP id UAA14959;
 | |
| 	Sun, 26 Dec 1999 20:08:15 -0500 (EST)
 | |
| 	(envelope-from owner-pgsql-hackers)
 | |
| Received: by hub.org (bulk_mailer v1.5); Sun, 26 Dec 1999 20:07:27 -0500
 | |
| Received: (from majordom@localhost)
 | |
| 	by hub.org (8.9.3/8.9.3) id UAA14820
 | |
| 	for pgsql-hackers-outgoing; Sun, 26 Dec 1999 20:06:28 -0500 (EST)
 | |
| 	(envelope-from owner-pgsql-hackers@postgreSQL.org)
 | |
| Received: from mtiwmhc02.worldnet.att.net (mtiwmhc02.worldnet.att.net [204.127.131.37])
 | |
| 	by hub.org (8.9.3/8.9.3) with ESMTP id UAA14749
 | |
| 	for <pgsql-hackers@postgreSQL.org>; Sun, 26 Dec 1999 20:05:39 -0500 (EST)
 | |
| 	(envelope-from rkirkpat@rkirkpat.net)
 | |
| Received: from [192.168.3.100] ([12.74.72.56])
 | |
|           by mtiwmhc02.worldnet.att.net (InterMail v03.02.07.07 118-134)
 | |
|           with ESMTP id <19991227010506.WJVW1914@[12.74.72.56]>;
 | |
|           Mon, 27 Dec 1999 01:05:06 +0000
 | |
| Date: Sun, 26 Dec 1999 18:05:02 -0700 (MST)
 | |
| From: Ryan Kirkpatrick <pgsql@rkirkpat.net>
 | |
| X-Sender: rkirkpat@excelsior.rkirkpat.net
 | |
| To: Damond Walker <dwalker@black-oak.com>
 | |
| cc: pgsql-hackers@postgreSQL.org
 | |
| Subject: Re: [HACKERS] database replication
 | |
| In-Reply-To: <002201bf4fb3$623f0220$b263a8c0@vmware98.walkers.org>
 | |
| Message-ID: <Pine.LNX.4.10.9912261742550.7666-100000@excelsior.rkirkpat.net>
 | |
| MIME-Version: 1.0
 | |
| Content-Type: TEXT/PLAIN; charset=US-ASCII
 | |
| Sender: owner-pgsql-hackers@postgreSQL.org
 | |
| Status: OR
 | |
| 
 | |
| On Sun, 26 Dec 1999, Damond Walker wrote:
 | |
| 
 | |
| > >     How about a single, separate table with the fields of 'database',
 | |
| > >'tablename', 'oid', 'last_changed', that would store the same data as your
 | |
| > >PGR_TIME field. It would be separated from the actual data tables, and
 | |
| ...
 | |
| >     The problem with OID's is that they are unique at the local level but if
 | |
| > you try and use them between servers you can run into overlap.  
 | |
| 
 | |
| 	Yea, forgot about that point, but became dead obvious once you
 | |
| mentioned it. Boy, I feel stupid now. :)
 | |
| 
 | |
| >     Using the PGR_TIME field with an index will allow us to find rows which
 | |
| > have changed VERY quickly.  All we need to do now is somehow programmatically
 | |
| > find the primary key for a table so the person setting up replication (or
 | |
| > syncing) doesn't have to have an in-depth knowledge of the schema in order to
 | |
| > set up a syncing schedule.
 | |
| 
 | |
| 	Hmm... Yea, maybe look to see which field(s) has a primary, unique
 | |
| index on it? Then use those field(s) as a primary key. Just require that
 | |
| any table to be synchronized have some set of fields that uniquely
 | |
| identify each row. Either that, or add another field to each table with
 | |
| our own, cross system consistent, identification system. Don't know which
 | |
| would be more efficient and easier to work with.
 | |
| 	The former could potentially get sticky if it takes a lot of
 | |
| fields to generate a unique key value, but has the smallest effect on the
 | |
| table to be synced. The latter could be difficult to keep straight between
 | |
| systems (local vs. remote), and would require a trigger on inserts to
 | |
| generate a new, unique id number, that does not exist locally or
 | |
| remotely (nasty issue there), but would remove the uniqueness
 | |
| requirement.
 | |
| 
 | |
| >     Oops...how about defining a trigger for this?  With deletion I guess we
 | |
| > would have to move a flag into another table saying we deleted record 'X'
 | |
| > with this primary key from this table.
 | |
| 
 | |
| 	Or, according to my logic below, if a row is missing on one side
 | |
| or the other, then just compare the remaining row's timestamp to the last
 | |
| synchronization time (stored in a separate table/db elsewhere). The
 | |
| results of the comparison and the state of row existences tell one if the
 | |
| row was inserted or deleted since the last sync, and what should be done
 | |
| to perform the sync.
 | |
| 
 | |
| > >     Yea, this is indeed the sticky part, and would indeed require some
 | |
| > >fine-tuning. Basically, the way I see it, is if the two timestamps for a
 | |
| > >single row do not match (or even if the row and therefore timestamp is
 | |
| > >missing on one side or the other altogether):
 | |
| > >     local ts > remote ts => Local row is exported to remote.
 | |
| > >     remote ts > local ts => Remote row is exported to local.
 | |
| > >     local ts > last sync time && no remote ts =>
 | |
| > >          Local row is inserted on remote.
 | |
| > >     local ts < last sync time && no remote ts =>
 | |
| > >          Local row is deleted.
 | |
| > >     remote ts > last sync time && no local ts =>
 | |
| > >          Remote row is inserted on local.
 | |
| > >     remote ts < last sync time && no local ts =>
 | |
| > >          Remote row is deleted.
 | |
| > >where the synchronization process is running on the local machine. By
 | |
| > >exported, I mean the local values are sent to the remote machine, and the
 | |
| > >row on that remote machine is updated to the local values. How does this
 | |
| > >sound?
 | |
| 
 | |
| >     Having said that, a good algo will have to be written to cut down on
 | |
| > network traffic and to keep database conversations down to a minimum.  This
 | |
| > will be appreciated by people with low bandwidth connections I'm sure
 | |
| > (dial-ups, fractional T1's, etc).
 | |
| 
 | |
| 	Of course! In reflection, the assigned identification number I
 | |
| mentioned above might be the best then, instead of having to transfer the
 | |
| entire set of key fields back and forth.
 | |
| 
 | |
| >     What would a vacuum do to a system being used by many people?
 | |
| 
 | |
| 	Probably lock them out of tables while they are vacuumed... Maybe
 | |
| not really required in the end, possibly optional?
 | |
| 
 | |
| >     It could probably be named either way...but the one thing I really don't
 | |
| > want to do is start hacking server code.  The PostgreSQL people have enough
 | |
| > to do without worrying about trying to meld anything I've done to their
 | |
| > server.   :)
 | |
| 
 | |
| 	Yea, they probably would appreciate that. They already have enough
 | |
| on their plate for 7.x as it is! :)
 | |
| 
 | |
| >     Besides, I like the idea of having it operate as a stand-alone product.
 | |
| > The only PostgreSQL feature we would require would be triggers and
 | |
| > plpgsql...what was the earliest version of PostgreSQL that supported
 | |
| > plpgsql?  Even then I don't see the triggers being that complex to boot.
 | |
| 
 | |
| 	No, provided that we don't do the identification number idea
 | |
| (which the more I think about it, probably will not work). As for what
 | |
| version supports plpgsql, I don't know, one of the more hard-core pgsql
 | |
| hackers can probably tell us that.
 | |
| 
 | |
| >     The only thing we'd need for Python is the Python extensions for
 | |
| > PostgreSQL...which in turn requires libpq and that's about it.  So, it
 | |
| > should be able to run on any platform supported by Python and libpq.  
 | |
| 
 | |
| 	Of course. If it ran on NT as well as Linux/Unix, that would be
 | |
| even better. :)
 | |
| 
 | |
| > Unix folks should be happy....assuming they have X running on the
 | |
| > machine doing the replication or syncing.  Even then I wrote a curses
 | |
| > based Python interface awhile back which allows buttons, progress
 | |
| > bars, input fields, etc (I called it tinter and it's available at
 | |
| > http://iximd.com/~dwalker).  It's a simple interface and could
 | |
| > probably be cleaned up a bit but it works.  :)
 | |
| 
 | |
| 	Why would we want any type of GUI (X11 or curses) for this sync
 | |
| program? I imagine just a command line program with a few options (local
 | |
| machine, remote machine, db name, etc...), and nothing else.
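A command line front end of the kind imagined above might be sketched like this in Python. Everything here is hypothetical: the program name `pgsync`, the option names, and the optional vacuum switch (suggested only by the earlier remark that a vacuum run might be optional).

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    """Build the option parser for a hypothetical sync command."""
    parser = argparse.ArgumentParser(
        prog="pgsync",
        description="Synchronize one database between two machines.",
    )
    parser.add_argument("--local", default="localhost",
                        help="host name of the local machine")
    parser.add_argument("--remote", required=True,
                        help="host name of the remote machine")
    parser.add_argument("--dbname", required=True,
                        help="name of the database to synchronize")
    parser.add_argument("--vacuum", action="store_true",
                        help="vacuum both sides after syncing (optional)")
    return parser

# Parse an explicit argument list so the sketch runs without a real shell.
args = build_parser().parse_args(["--remote", "laptop", "--dbname", "addresses"])
```

With no GUI at all, the tool could be run by hand or from a scheduler, which fits both the manual and the scheduled use discussed in this thread.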
 | |
| 	Though I will take a look at your curses interface, as I have been
 | |
| wanting to make a curses interface to a few db interfaces I have, in as
 | |
| simple a manner as possible.
 | |
| 
 | |
| >     That would be a Good Thing.  Have webspace somewhere?  If I can get
 | |
| > permission from the "powers that be" at the office I could host a website on
 | |
| > our (Domino) webserver.
 | |
| 
 | |
| 	Yea, I got my own web server (www.rkirkpat.net) with 1GB+ of disk
 | |
| space available, sitting on a decent speed DSL. Can even set up a
 | |
| virtual server if we want (i.e. pgsync.rkirkpat.net :). CVS repository,
 | |
| email lists, etc... possible with some effort (and time). 
 | |
| 	So, where should we start? TTYL.
 | |
| 
 | |
| 	PS. The current pages on my web site are very out of date at the
 | |
| moment (save for the pgsql information). I hope to have updated ones up
 | |
| within the week. 
 | |
| 
 | |
| ---------------------------------------------------------------------------
 | |
| |   "For to me to live is Christ, and to die is gain."                    |
 | |
| |                                            --- Philippians 1:21 (KJV)   |
 | |
| ---------------------------------------------------------------------------
 | |
| |   Ryan Kirkpatrick  |  Boulder, Colorado  |  http://www.rkirkpat.net/   |
 | |
| ---------------------------------------------------------------------------
 | |
| 
 | |
| 
 | |
| ************
 | |
| 
 | |
| From owner-pgsql-hackers@hub.org Mon Dec 27 12:33:32 1999
 | |
| Received: from hub.org (hub.org [216.126.84.1])
 | |
| 	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id NAA24817
 | |
| 	for <pgman@candle.pha.pa.us>; Mon, 27 Dec 1999 13:33:29 -0500 (EST)
 | |
| Received: from localhost (majordom@localhost)
 | |
| 	by hub.org (8.9.3/8.9.3) with SMTP id NAA53391;
 | |
| 	Mon, 27 Dec 1999 13:29:02 -0500 (EST)
 | |
| 	(envelope-from owner-pgsql-hackers)
 | |
| Received: by hub.org (bulk_mailer v1.5); Mon, 27 Dec 1999 13:28:38 -0500
 | |
| Received: (from majordom@localhost)
 | |
| 	by hub.org (8.9.3/8.9.3) id NAA53248
 | |
| 	for pgsql-hackers-outgoing; Mon, 27 Dec 1999 13:27:40 -0500 (EST)
 | |
| 	(envelope-from owner-pgsql-hackers@postgreSQL.org)
 | |
| Received: from gtv.ca (h139-142-238-17.cg.fiberone.net [139.142.238.17])
 | |
| 	by hub.org (8.9.3/8.9.3) with ESMTP id NAA53170
 | |
| 	for <pgsql-hackers@hub.org>; Mon, 27 Dec 1999 13:26:40 -0500 (EST)
 | |
| 	(envelope-from aaron@genisys.ca)
 | |
| Received: from stilborne (24.67.90.252.ab.wave.home.com [24.67.90.252])
 | |
| 	by gtv.ca (8.9.3/8.8.7) with SMTP id MAA01200
 | |
| 	for <pgsql-hackers@hub.org>; Mon, 27 Dec 1999 12:36:39 -0700
 | |
| From: "Aaron J. Seigo" <aaron@gtv.ca>
 | |
| To: pgsql-hackers@hub.org
 | |
| Subject: Re: [HACKERS] database replication
 | |
| Date: Mon, 27 Dec 1999 11:23:19 -0700
 | |
| X-Mailer: KMail [version 1.0.28]
 | |
| Content-Type: text/plain
 | |
| References: <199912271135.TAA10184@netrinsics.com>
 | |
| In-Reply-To: <199912271135.TAA10184@netrinsics.com>
 | |
| MIME-Version: 1.0
 | |
| Message-Id: <99122711245600.07929@stilborne>
 | |
| Content-Transfer-Encoding: 8bit
 | |
| Sender: owner-pgsql-hackers@postgreSQL.org
 | |
| Status: OR
 | |
| 
 | |
| hi..
 | |
| 
 | |
| > Before anyone starts implementing any database replication, I'd strongly
 | |
| > suggest doing some research, first:
 | |
| > 
 | |
| > http://sybooks.sybase.com:80/onlinebooks/group-rs/rsg1150e/rs_admin/@Generic__BookView;cs=default;ts=default
 | |
| 
 | |
| good idea, but perhaps sybase isn't the best study case.. here's some extremely
 | |
| detailed online coverage of Oracle 8i's replication, from the oracle online
 | |
| library:
 | |
| 
 | |
| http://bach.towson.edu/oracledocs/DOC/server803/A54651_01/toc.htm
 | |
| 
 | |
| -- 
 | |
| Aaron J. Seigo
 | |
| Sys Admin
 | |
| 
 | |
| ************
 | |
| 
 | |
| From owner-pgsql-hackers@hub.org Thu Dec 30 08:01:09 1999
 | |
| Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
 | |
| 	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id JAA10317
 | |
| 	for <pgman@candle.pha.pa.us>; Thu, 30 Dec 1999 09:01:08 -0500 (EST)
 | |
| Received: from hub.org (hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id IAA02365 for <pgman@candle.pha.pa.us>; Thu, 30 Dec 1999 08:37:10 -0500 (EST)
 | |
| Received: from localhost (majordom@localhost)
 | |
| 	by hub.org (8.9.3/8.9.3) with SMTP id IAA87902;
 | |
| 	Thu, 30 Dec 1999 08:34:22 -0500 (EST)
 | |
| 	(envelope-from owner-pgsql-hackers)
 | |
| Received: by hub.org (bulk_mailer v1.5); Thu, 30 Dec 1999 08:32:24 -0500
 | |
| Received: (from majordom@localhost)
 | |
| 	by hub.org (8.9.3/8.9.3) id IAA85771
 | |
| 	for pgsql-hackers-outgoing; Thu, 30 Dec 1999 08:31:27 -0500 (EST)
 | |
| 	(envelope-from owner-pgsql-hackers@postgreSQL.org)
 | |
| Received: from sandman.acadiau.ca (dcurrie@sandman.acadiau.ca [131.162.129.111])
 | |
| 	by hub.org (8.9.3/8.9.3) with ESMTP id IAA85234
 | |
| 	for <pgsql-hackers@postgresql.org>; Thu, 30 Dec 1999 08:31:10 -0500 (EST)
 | |
| 	(envelope-from dcurrie@sandman.acadiau.ca)
 | |
| Received: (from dcurrie@localhost)
 | |
| 	by sandman.acadiau.ca (8.8.8/8.8.8/Debian/GNU) id GAA18698;
 | |
| 	Thu, 30 Dec 1999 06:30:58 -0400
 | |
| From: Duane Currie <dcurrie@sandman.acadiau.ca>
 | |
| Message-Id: <199912301030.GAA18698@sandman.acadiau.ca>
 | |
| Subject: Re: [HACKERS] database replication
 | |
| In-Reply-To: <OFD38C9424.B391F434-ON85256851.0054F41A@black-oak.COM> from "DWalker@black-oak.com" at "Dec 24, 99 10:27:59 am"
 | |
| To: DWalker@black-oak.com
 | |
| Date: Thu, 30 Dec 1999 10:30:58 +0000 (AST)
 | |
| Cc: pgsql-hackers@postgresql.org
 | |
| X-Mailer: ELM [version 2.4ME+ PL39 (25)]
 | |
| MIME-Version: 1.0
 | |
| Content-Type: text/plain; charset=US-ASCII
 | |
| Content-Transfer-Encoding: 7bit
 | |
| Sender: owner-pgsql-hackers@postgresql.org
 | |
| Status: OR
 | |
| 
 | |
| Hi Guys,
 | |
| 
 | |
| Now for one of my REALLY rare posts.
 | |
| Having done a little bit of distributed data systems, I figured I'd
 | |
| pitch in a couple cents worth.
 | |
| 
 | |
| > 2) The replication system will need to add at least one field to each 
 | |
| >    table in each database that needs to be replicated.  This 
 | |
| >    field will be a date/time stamp which identifies the "last 
 | |
| >    update" of the record.  This field will be called PGR_TIME 
 | |
| >    for lack of a better name.  Because this field will be used 
 | |
| >    from within programs and triggers it can be longer so as to not 
 | |
| >    mistake it for a user field.
 | |
| 
 | |
| I just started reading this thread, but I figured I'd throw in a couple
 | |
| suggestions for distributed data control  (a few idioms I've had to
 | |
| deal with b4):
 | |
| 	- Never use time (not reliable from system to system).  Use
 | |
| 	  a version number of some sort that can stay consistent across
 | |
| 	  all replicas
 | |
| 
 | |
| 	  This way, if a system's time is or goes out of whack, it doesn't
 | |
| 	  cause your database to disintegrate, and it's easier to track
 | |
| 	  conflicts (see below.  If using time, the algorithm gets
 | |
| 	  nightmarish)
 | |
| 
 | |
| 	- On an insert, set to version 1
 | |
| 
 | |
| 	- On an update, version++
 | |
| 
 | |
| 	- On a delete, mark deleted, and add a delete stub somewhere for the
 | |
| 	  replicator process to deal with in sync'ing the databases.
 | |
| 
 | |
| 	- If two records have the same version but different data, there's
 | |
| 	  a conflict.  A few choices:
 | |
| 	  	1.  Pick one as the correct one (yuck!! invisible data loss)
 | |
| 		2.  Store both copies, pick one as current, and alert 
 | |
| 		    database owner of the conflict, so they can deal with
 | |
| 		    it "manually."
 | |
| 		3.  If possible, some conflicts can be merged.  If a disjoint
 | |
| 		    set of fields were changed in each instance, these changes
 | |
| 		    may both be applied and the record merged.  (Problem:
 | |
| 		    takes a lot more space.  Requires a version number for
 | |
| 		    every field, or persistent storage of some old records.
 | |
| 		    However, this might help the "which fields changed" issue
 | |
| 		    you were talking about in #6)
 | |
| 
 | |
| 	- A unique id across all systems should exist (or something that
 | |
| 	  effectively simulates a unique id.  Maybe a composition of the
 | |
| 	  originating oid (from the insert) and the originating database
 | |
| 	  (oid of the database's record?) might do it).  Store this as
 | |
| 	  an extra field in every record.  
 | |
| 	  
 | |
| 	  (Two extra fields so far: 'unique id' and 'version')
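The version-number scheme listed above can be sketched in a few lines of Python. The names (`Record`, `delete_stubs`, `compare`, `merge_disjoint`) are invented for illustration, and two points are assumptions rather than part of the suggestion: that a higher version number means the newer copy, and that the unique id is the pair (originating database, originating oid). The merge step needs the old record to be kept around, as the list notes.

```python
import copy
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

GlobalId = Tuple[str, int]          # (originating database, originating oid)

@dataclass
class Record:
    gid: GlobalId
    data: Dict[str, str]
    version: int = 1                # on an insert, set to version 1
    deleted: bool = False

delete_stubs: List[GlobalId] = []   # work list for the replicator process

def insert(origin_db: str, origin_oid: int, data: Dict[str, str]) -> Record:
    return Record((origin_db, origin_oid), dict(data))

def update(rec: Record, **changes: str) -> None:
    rec.data.update(changes)
    rec.version += 1                # on an update, version++

def delete(rec: Record) -> None:
    rec.deleted = True              # mark deleted ...
    delete_stubs.append(rec.gid)    # ... and leave a stub for the replicator

def compare(a: Record, b: Record) -> str:
    if a.version != b.version:      # assumption: higher version is newer
        return "a is newer" if a.version > b.version else "b is newer"
    return "in sync" if a.data == b.data else "conflict"

def merge_disjoint(base: Dict[str, str], a: Dict[str, str],
                   b: Dict[str, str]) -> Optional[Dict[str, str]]:
    """Merge two edits of `base` if they touched disjoint sets of fields."""
    changed_a = {k for k in a if a[k] != base.get(k)}
    changed_b = {k for k in b if b[k] != base.get(k)}
    if changed_a & changed_b:
        return None                 # same field edited twice: needs a human
    merged = dict(base)
    merged.update({k: a[k] for k in changed_a})
    merged.update({k: b[k] for k in changed_b})
    return merged

# Both replicas start from the same record, then diverge.
base = insert("laptop_db", 1001, {"name": "Ann", "city": "Oslo"})
local, remote = copy.deepcopy(base), copy.deepcopy(base)
update(local, city="Kiruna")
update(remote, name="Anne")
status = compare(local, remote)     # same version, different data
merged = merge_disjoint(base.data, local.data, remote.data)
```

Unlike the timestamp comparison, nothing here depends on the two machines having synchronized clocks.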
 | |
| 
 | |
| I do like your approach:  triggers and a separate process. (Maintainable!! :)
 | |
| 
 | |
| Anyway, just figured I'd throw in a few suggestions,
 | |
| Duane
 | |
| 
 | |
| ************
 | |
| 
 | |
| From owner-pgsql-patches@hub.org Sun Jan  2 23:01:38 2000
 | |
| Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
 | |
| 	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id AAA16274
 | |
| 	for <pgman@candle.pha.pa.us>; Mon, 3 Jan 2000 00:01:28 -0500 (EST)
 | |
| Received: from hub.org (hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id XAA02655 for <pgman@candle.pha.pa.us>; Sun, 2 Jan 2000 23:45:55 -0500 (EST)
 | |
| Received: from hub.org (hub.org [216.126.84.1])
 | |
| 	by hub.org (8.9.3/8.9.3) with ESMTP id XAA13828;
 | |
| 	Sun, 2 Jan 2000 23:40:47 -0500 (EST)
 | |
| 	(envelope-from owner-pgsql-patches@hub.org)
 | |
| Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Sun, 02 Jan 2000 23:38:34 +0000 (EST)
 | |
| Received: (from majordom@localhost)
 | |
| 	by hub.org (8.9.3/8.9.3) id XAA13624
 | |
| 	for pgsql-patches-outgoing; Sun, 2 Jan 2000 23:37:36 -0500 (EST)
 | |
| 	(envelope-from owner-pgsql-patches@postgreSQL.org)
 | |
| Received: from falla.videotron.net (falla.videotron.net [205.151.222.106])
 | |
| 	by hub.org (8.9.3/8.9.3) with ESMTP id XAA13560
 | |
| 	for <pgsql-patches@postgresql.org>; Sun, 2 Jan 2000 23:37:02 -0500 (EST)
 | |
| 	(envelope-from P.Marchesso@Videotron.ca)
 | |
| Received: from Videotron.ca ([207.253.210.234])
 | |
| 	by falla.videotron.net (Sun Internet Mail Server sims.3.5.1999.07.30.00.05.p8)
 | |
| 	with ESMTP id <0FNQ000TEST8VI@falla.videotron.net> for pgsql-patches@postgresql.org; Sun,
 | |
| 	2 Jan 2000 23:37:01 -0500 (EST)
 | |
| Date: Sun, 02 Jan 2000 23:39:23 -0500
 | |
| From: Philippe Marchesseault <P.Marchesso@Videotron.ca>
 | |
| Subject: [PATCHES] Distributed PostgreSQL!
 | |
| To: pgsql-patches@postgreSQL.org
 | |
| Message-id: <387027FB.EB88D757@Videotron.ca>
 | |
| MIME-version: 1.0
 | |
| X-Mailer: Mozilla 4.51 [en] (X11; I; Linux 2.2.11 i586)
 | |
| Content-type: MULTIPART/MIXED; BOUNDARY="Boundary_(ID_GeYGc69fE1/bkYLTPwOGFg)"
 | |
| X-Accept-Language: en
 | |
| Sender: owner-pgsql-patches@postgreSQL.org
 | |
| Precedence: bulk
 | |
| Status: ORr
 | |
| 
 | |
| This is a multi-part message in MIME format.
 | |
| 
 | |
| --Boundary_(ID_GeYGc69fE1/bkYLTPwOGFg)
 | |
| Content-type: text/plain; charset=us-ascii
 | |
| Content-transfer-encoding: 7bit
 | |
| 
 | |
| Hi all!
 | |
| 
 | |
| Here is a small patch to make postgres a distributed database. By
 | |
| distributed I mean that you can have the same copy of the database on N
 | |
| different machines and keep them all in sync.
 | |
| It does not improve performance unless you distribute your clients in a
 | |
| sensible manner. It does not allow you to do parallel selects.
 | |
| 
 | |
| The support page is : pages.infinit.net/daemon  and soon to be in
 | |
| English.
 | |
| 
 | |
| The patch was tested with RedHat Linux 6.0 on Intel with kernel 2.2.11.
 | |
| Only two machines were used so I'm not completely sure that it works
 | |
| with more than two. -But it should-
 | |
| 
 | |
| I would like to know if somebody else is interested in this otherwise
 | |
| I'm probably not going to keep developing it. So please reply to my e-mail
 | |
| (P.Marchesso@videotron.ca) to give me an idea of the amount of people
 | |
| interested in this.
 | |
| 
 | |
| Thanks all.
 | |
| 
 | |
| Philippe Marchesseault
 | |
| 
 | |
| --
 | |
| It's not the size of the dog in the fight,
 | |
| but the size of the fight in the dog.
 | |
|                         -Archie Griffen
 | |
| 
 | |
| 
 | |
| 
 | |
| --Boundary_(ID_GeYGc69fE1/bkYLTPwOGFg)
 | |
| Content-type: application/octet-stream; name=replicator-0.1.tgz
 | |
| Content-disposition: attachment; filename=replicator-0.1.tgz
 | |
| Content-transfer-encoding: base64
 | |
| 
 | |
| H4sIAOeZbzgAA+w8a3PayLL7Nara/9Dx3iRAME/brO11tjDGDjc2OIBvTm5OihLSAFoLiUgi
 | |
| LGfX//12z0MaAX7UOU72nLpW7cagmenp6enp9xCwmetYZuQHxR++1QM7pVq5DD8AQG2vxv+W
 | |
| d3b4X/mUsKVUrdbKO7XKLrZWdkqVH2D3m2GkPfMwMgOAH2Z+GI0DFt7ejwW3Nv7nPkGy/xfm
 | |
| NRs5Lnv0Ocql0l6pdOv+l8u1XbH/e7W90m6NWsu7pR+g9OiYbHj+n+9/q904vzppwhFst6A4
 | |
| D4NiGFjF2Tj84vJPjme5c5vFra5vma5sl21FwzBdFw7AGwb+NQtgKv8mvAXYZWw6HnayXGZ6
 | |
| d3cVXQ7AeBZMYXukOquv0/TXZKhhGLJrjEvBAvlhYjwbW5b2/r8ycu1Z2B6P7SFs+8lMYRQ4
 | |
| s/irodA8UJOvwZ3eCXeahqu+GtqyD7SFrEFPNW2aQAO07c6+wPb52m65ztB49pNAQKfZzIys
 | |
| CR7DA+OZ7YxGm5hgaFrXzLOLkeXPiuqgFHzXfnhnC96A+rJX2C3wWf9q5n96dPn/zbaH5L+U
 | |
| 95vkf6VUS/T/Xq2Crbt7ldqT/P8eD+6NWS6V8+VSzXgDxdzlBfTnDE6YBZWfcSMOqtWDnX0o
 | |
| 7+/v54rY4yelELaKE3/KYqYpaoykZNeW3v+XcBkWo+UMZcHkzVqDM7M2vZ6GY/G6mHuOmHEM
 | |
| jL3arrn3cxU/PnO8SJmQan72fs6C5SH22sNeu/m9/T3qWcwBAuiZEfw36hYoQ7l6UKrg6pD9
 | |
| SiXqATlojSCaMECOiNiUeVFIHwP8s3CiCZjw6vWrPPXw0t3AQkqEMAr8qQREzZpIngW+xcKw
 | |
| IFppEepxRpD5QggPSDB740+lz3B0RBNl/+C9Ya0VXsGrQ9GWXjM2ndbPe03eeEP/MDdkf9zS
 | |
| td+90npy+iJme7V9a29/3/gFnskH4CVHIQoYG7hOGOXBZvSvabmdrywIHJtlD43t7W0x0V3P
 | |
| gyDVSjsmegH5WnWfID7j/6wxZvWgXDso7SSMSX0APuC2oPKRG4DThRHR2IleheD5Ee5hyFxm
 | |
| Rcnm8aEgQcTbQThuv8F9nZqe3UeuhedH0Lg4GfSa581GH16+XKGo2C5cf69/0R9c9M4AefdQ
 | |
| viM2nhLvI93L6mU4wy2NRhnRyH5HQmy9fhFu5VNbnlXd5R+HDwg9m/6MWZRpd06ag3fNj3lo
 | |
| XTYGjW6z3oc/obS3t5fNw0vslAfEZtBr/W8zD6Us/FKKUQWYIdX9ILN1EY57SLVTE81/+wC2
 | |
| 4lk5cwgWURshGOWvlltPz+M8mtiOTdjHnuNu/69cre0I/69Sq5XLpSq1Vit7T/r/ezzFnP4Y
 | |
| /YkTQivk8nOGhrw5ZvgZtSZ6eP4ijIWe43vKnO+9PwfbjMyhGTJAF880bIdk13AeMRuY99UJ
 | |
| fI8kbcEwGv5sGTjjSQSZRpbLboDLieM6M5SNF2ZgTVBNMnPuRoZABTXnODCn4JB6ZSjc/VG0
 | |
| MAN2CEt/Dhaq8oAl06GcB5TXRXRrpj46M0t6Mfds9NxoQRELpiH4Qsmfta/gjHksMF24nA9x
 | |
| VXDuWMwLmWHivPQmnOAChkve/ZRm78nZ4dRHqJwKh8DQNsAJUIGFRJWKISeQ0PKABkAGKYgI
 | |
| B+DPaFAWsVyCi7ojHldYX7BORkeYHBN/JvcDF7Zw0OseMkDOHM3dvIE94UOr/7Zz1Yd6+yN8
 | |
| qHe79Xb/4yE3X3xsZV+l6eJMcRsRLC4mML1oiUQxLprdxlvsXz9unbf6Hwnt01a/3ez14LTT
 | |
| hTpc1rv9VuPqvN6Fy6vuZafXLKC6Y4QQM24nJ4wQ0tRHqtksQgWDZpDxETcvRJTQg5yYX0lV
 | |
| W8z5igiZaEvNlg/ZI9f3xsIuizS6HZK2R0Wfh0XgIENE/vruGcnuodL0rEIedvehz5AmDC5d
 | |
| 02KwDb05Da9WS3k4Rj6nrhd1gFKlXC5vl6ulWh6uevWCkTo+RcNAIyQJBRxAf0J8HdJGyxCC
 | |
| Iw+XMAlxBeh8R7gJ4RS3lO+NhwSbiiNGJiXSxPNtZqBZqT8CXKhW6HMmpI6qpUCTI9Vxwvbr
 | |
| 8goWwdzz0LoA31uFS8Cmoi9KPOROvn9tDlrABIQmV4FNBI0vwrQsNkNb2fI9Dy0sRB93GnIE
 | |
| /wOZzAsmrGkSJfTF9gnKwsRl0wwEThuKJ5060QmnSQiKJBla2pmLeq/f7MJxt/MO/1x2Ow1k
 | |
| 02Yva5Ahp7kQke345ECkXrnOcPUdWVrpd2gYeatD0SEJfeuaRen3Hosc/L/oeOv9aXn01rjF
 | |
| D9roBW30gZK3W4l/ZRhffQeF7O9ONPDwZLjLDLlEhOWpjXYeWnjCyNxquH5IG56QmIw8AAvf
 | |
| s4wcQC8IVqaEH2+ImXsT349WvB3JcZp/E02CuWAc3B9SGWjBzhluBceuPpu5y54anrEmZpBD
 | |
| cFM8pN3mWavX79b7rU4bN3gQsDHHmhbxxbHzgIZul4WoDwi1tGmNL4q5fhoRjzGbH4lrz1+I
 | |
| 07NANzWF2wLFO/oIEaKXtsz59NtvWic0F/KENVvqxjlhzClEExdPjq/OFG0z8sRnIV7lAbxA
 | |
| Nn1hZ//uoUGfghK4zEvgZhVIXC7iII16wd+3mPWHdyDw/qp1cnAEL2w+L8JU4BO/gdP1NsdA
 | |
| dwq0ZZFLNeLOgeAaDpID5eeEXNZm5bh1lo35jX9VnVP96g08rFpH/r23setp/eq8r3Xl3zf2
 | |
| bJ10L7SO9HVzv3a/q/fDr7f0+5/6eaojft/Ys91BKmo9+XfVkx+mMv9yIw6UMLNQmnFtfzr3
 | |
| +FnkTqsQqa0TMmHAM6dMqUEu1ddOHXq05ERzkcfPWZeNHZLbbeyeuedkrR8nvqX6cUAIg/7H
 | |
| yyY1Tdl05TRwiMjOzj+YP0rN9k9y9L/Ko2L1gdCbTwz7CAyLNOMsC2/JjOD6GY1G5B+p/Ikt
 | |
| j4UqSpiwIfRL32/HrasqKeY7cjAYl+w6/yCLJzt5KFnWm0+PlxELlfzNdYXFmOrMJTquNaN6
 | |
| I/uhZflVqjfkKL33ZuYlNruTzxrcaqVYUrABhY0HH9JHM4WF4v0FekEMR4hp9RWmI5J8jXJP
 | |
| b1+noGy8QkXx1dU9cIVqUSsLo+fGkCeLMG6gsYD2/oL8ExRXaOBJ+5zJrY6RD7HBmmTEWyFx
 | |
| NJQs8iTPO413g8t6412zf4BWKzOvD1c7nMXtsgPi0PFc9Kz8cXrWZ3wEJ4MaYjyDVcskRodL
 | |
| uNVtegYb0Wj+rZUA1Q0xZVMhUk18jQRFq95dHnC6MImjE/4aEwXQQRqR73uQkGKzpJPO0tC0
 | |
| 5RIpQqnvEmzEJG6+0XYP/yczihzsWBkJz8xjX3lmNpoHdLQS9bX5KRp0UKem42X4Xl6z5QDP
 | |
| PVp7zH7HlofSWsPXSpeJNlIufrBEFTieCrNMnXm2uERNBZufYu4qFO45+QeZrDj8ZJ6awwYe
 | |
| Jjtg3qdy6fOhWp6UWqQzSbGSN27JfvFYLx6KR6q0OnUaEB6+IdIHQXEwtKfclw1n5sJjNoeJ
 | |
| 7DO3hPAzbTsYoBvPU7J1/KJEGzWOUCELg7aH5J6tOEO4yejaFnJFgYYUcmIc4il8kkz9dNBq
 | |
| N/t56NHJQYHWrF/I8w7Jgb/ruFvI3hGT8BJ2Sh35G6VM43UU0K8YjMyp41JaQWJxuNZj5geU
 | |
| qzmCCTrTYeYcRW6zPbjsdPvZ9c4m/8D/xCNa7frJSXdQb3/kZ+qcBqApwek8/AcL/MzLTBoK
 | |
| vUSR93NiASDdhg7aF4J2ecisbFAuCy9jGLH4XOn0UP1AM+n6QD+OI1vqBoEUMSV6BQqtYxQn
 | |
| A5Rv+tbdNZMY/pC5ijkZBzj3/VlIR4cOOadhWgNt4Fyh/RXrxme0x6VL8gop30O6EWNuJB+C
 | |
| ynLvjbD5oNx/buNqnlzC/VI+cn2nZkPgItxw504mCKM0l2il1d9dRBUTrDnMd1CY20xSHxZz
 | |
| ZzzdlAx/LheiVi6iJxTdYYs4HCS7o2CTMjIxstJ0uOTGtRB9Dz/h1P8hy9ARbY1ItJnKBOSy
 | |
| TqKboKQwOtIQ4fLeU0uiBZHM1UJU4HsrsatCoRCrw02WZMwA2cM1raWrq8RM2kSNNbixxuM2
 | |
| DwF4/vxuMpGkNkcsWuK0UYK1IJxIuyoifCCdx2YU5LOuVzVPgqiuthI19Po1ZXylJkzvzQ0Z
 | |
| FujmIMQPdHQ5pI1benNH/H9D2v7Rcwz31P9VK3s7qv6vWqtWeP3fTvUp//M9nqf8z1P+5yn/
 | |
| 83j5H+MnZ4QMN5K5icHbH42f8KvjMe1N/Oqi/rdBo9NuNxsUfOhRLVLcplnIXEzu7lSrCSxp
 | |
| ISrXpFJKA20fJ63VSnm/koBFt/ld86MaCLs/7yRDVTGJatwraQglITw5Eo9v0ijDdGpkuVRB
 | |
| uCLWf3Lcrl80j/7Ymi7t4dbNITmSypMKtYyCTDvRN4asTSzOQwkUjyS9ibpcSB6HDKUppTMc
 | |
| Mg+pA5GcoUsGf2iBg7wWJMjr7n9ed9vzcajzBujfQeeyKaJBiKmCLczLP7ingvPykAXPn+AK
 | |
| gUcMPikSfBbRMxV0wW+x4xwHfvG/eZhIl8TK48HfdaLIxsiX0SYWGLLMjKxHbjrBBZ0xPXtI
 | |
| 4hHlA5rDOJOQNsrz5quZB4xH7mDDKosS0dYJODai44wcFgIzrQmX+3KrkjQintwpNhIrZBx+
 | |
| egOmEJiitPs1cdBFgoUTboKg2uaUfdpFR924ScUBkfh4dpiHchyxUYcnV/zxqfboER/N/vP+
 | |
| mvof2N2t7Kj6H5Rxu7z+p7z7ZP99j+fJ/nuy/57sv0et/7mj2sXxnMgxXQqP3RoA/5E0O1kB
 | |
| vDRGi55oJpKQ1NQT1eG/Y71Lqre9Mn/kTNm3LIrR8hA/bg7CCzKqUKbsQMYIGV453Lcx43aJ
 | |
| nibPqTz5vclKFZN0WfxKptyFYXVpBiFTMf6RM55Ls5YuX8a1UEJSiPiQIeNIVN4iA4ZyY2Tk
 | |
| 6RHyAd6/kg8oFvudkw78Znrz0BDRLEVCRAg/EW2HSypqyGzxXgpcshO35hHWOt2TSljrz9MI
 | |
| 1D+XUWFi+TaX1TDdfjPhLyUQlVNYAbaeVNictFH7It/cGahOprgr57CeCti0Z4m8EFBvyUhL
 | |
| vCl+OA/Ws+hyy/iG6Y0FZbFz7halSDleFIiOgvI3vCWKZJTOc3QbuPAzw+vVkjFicx+dBREV
 | |
| D1EL2f40Q5Ih0746PxdFJKmZcYIjkP14KxKXymZiyj4wu/9AQvJrJA9I7ce05NKeRtHSabVk
 | |
| FKXlNl9rqizmjtsbh2s5mdWCnY2lFEmCQaVWzLj0TZaTKlP3C5oP4rJmEsEWNTiB9fWWGpx0
 | |
| Gc4mEqq6BVF+A4nc0LLnWoJBrwFAumhO+qFKRFHpUTo3n5TPpYt70vxwVxXEg4UggbxD9tH/
 | |
| SQ3lveH2f7tH8//0S8aPOsd9/l9V+n90/3Nnl+5/7JbK5Sf/73s838gES16i9Tn7sj1i/HUx
 | |
| d+b6Q9Mls+byjHRVDsYDSn3hx08UM/x8aKRLF3mzXtynuol6DUrihTyiCOl0oFSEFGcjISys
 | |
| JinuVOIxQMM7EtG3cK2A7Vj0lfXKswGOp8m40ODBKzlDT1zN3C1XPitrz7nXTDTimsoVrRXP
 | |
| c0dtJVU5kd9IMTkqtUZVnJoAvFd073GGfqzNfjPR0QRunl3URbUEKoWMc1RCX+mXmILw+rUj
 | |
| 5WEx1/ahzUQOWoUbxbVXzxc2KDbI1cdqAzLrO+V8pnJqygqvaPKsFKcy26rp0nUgEkPKg6ar
 | |
| WrOoVV3XtzK3EkqS+HaYDzIaOKjXr2NekwpH3d9MsUEetshCOnqBHilaqUe7O9UK2EOyo474
 | |
| 3c5bjCkRIt9Q2v0i5JXcqUl4t+TYOESay/eyiz3MrHc2xA5dvqdCwnmY0QdnaX+SJMTguH4i
 | |
| GUHEtsPxp3Ll589SGyeXVnGpf/cyidKIM+vEgcgg8vzAi/AAqSGLcR9CgFgf4yR31lHH0fT4
 | |
| 0JszimuE6xcWeOBann4aFqLVazHun6hyLq3gWomJxcQXsfj5Csik3pWXEMbCIswk3uJsQPgT
 | |
| JS/PAn6HIYeLD1My4iGH8dQJ8DBbE2Zdg7NyQR1N7qk063gUJEacAg7eqwgWplg/t2adSNWg
 | |
| FIsOdzGs6Uz5Y2SBEjccHaFxhByEPv6cxeXDfDHbb6SVdrThmIqzvjpUzpbiaHE3QYcodzng
 | |
| layX79nvzEqxqNY7sfmIoZ/TkD//xDGCwj3B3vglSxe2L8+6zd6g0bm4qLdPBp13scEXc3al
 | |
| tKNY+x7mjstEgY6kNG7JvCYeBnFMdSzzt9BIsXxsUq4xO9BhdpkZ8HXoBrNeWLK5zw0SWxaF
 | |
| IG4rNZv8qlcH3b3f/CFnexLvps1DVNoNmVAxVBxu4gqSx6QUK/EYFQ7z6W69cnKUCaXYMOR5
 | |
| IQk8uT8m7qT5VKiLHiBBcjzkaM+S50+Lgmr5pQQdylmlq01/TEdZ5DECPFXGP3ET4T6n6y/x
 | |
| r1a48R4nS3xedbUc9WMD8hCrtKdWsfVBFrGmNT/uiG49pau/YN1kWnXPbimLEjPp8qwg5Knk
 | |
| KV1oJ3VhqxL3ZXJyhEP2H+V//dWP5v/hOTi5aH6DOe75/Z+9cq0W+3/lCv3+G358uv//XZ5u
 | |
| EpL7WiqUDcM4b120+tzy7cmUlMwD0sVJLrUj9C4oeUTJFxd1/e+Abp/HXKgUKoVymaR1l9lv
 | |
| zUi27hVKBaPuhn4qrUjgdEgC9II8i+lsHtHdZin2PYavg+sC/T5P6E8ZaY0JjqYK3ZCrMZ5e
 | |
| kvUHiDAqxlD+EAyaRzzxRj/lQvdnCpChlFX9qv+208Uu3G40hsz1F1nDaIEZhvMpaSuyRma+
 | |
| F5pDx3UiMiqTZJPKQhbgCueJc3sLDwInvMY5WrCgAJLhOtdSlgpx7Xi4skS9cp5aVXp5uvqs
 | |
| sptkTZkWKjpmmOIermkHjJSx+MaErufjbd+aCyHa15EEDpptm+5sYj7nqTeOHIVqIz8ykeoX
 | |
| 9ROO5dwTkCiLS+jYc2FW60lcIABccg8Z8wycwWN2gX7/Rz2G8eEtatRWD1r9X9Ms5JAVQQUp
 | |
| 28JHSFLISLM4yUz1IiKlzHeRLqGHwEOlvayBo4gZcPR4wsgaoIEhMnEkbPTIxB0Ac8qrdAiQ
 | |
| MF5wmDPFJSHeqFJ5jQwaHGEhjZ/tM/E7QUuGQOcznmmg2xeui+zNf5cHKM3KE3NEQwph2+YS
 | |
| 9TtZNgbn4YW5jPdvw5x5osJwmaR3ycARPMrr1YntDfolPjSUKMQh7Ka6NKPk7w9pP1FFW6yu
 | |
| xseRDX6JQZ2IgsFvRdH7dqe9fSuImTk2tdLzpCybUrbG284HOOk0aVfhQ6f77leDJzrp1wm8
 | |
| NZi0RpXUzSeXWMmhinO9aY+KZ7P5kU3ngFNuHP8dJ7E2vY48idvym+ho1dG+slQfHr9Irtmu
 | |
| XWfXTF2+l15ee21wSPEvS8WG8vpVeFXIv2K5huSnye0wMglgDWPQadx/2wTyW67arYaIINUv
 | |
| Ou0zXqDXk+Tvo9xzPGeqsslobwUmUsfkXMCNJ48Hktbse9pvrcbMSGWV11eP7hJd7OPL5dfp
 | |
| nWidFgU4UTxrqHKQUJ6SuUx3ciqIWfJJMQJ/q1yAAjQV4kAyPvKN2BDFI87cUXKpMLVlHQHM
 | |
| JikcOrTsib+gKwF55TXrW2mjQBdLxAW5hB1tVhj5MzprkAkZo/pCd5mFse/bVANnamKTV2CM
 | |
| jERLMJ3hN/pPYqkyPSQYk4nfjdM8mwIdNC491Ulrd+izsU1JYyGFxA+YhbLjqx78X3vX2tzG
 | |
| dWQ/79T+iAlSFYlVEEyKlmRbqexCJCQhoUiZDzv6tBkSQ3BsAIOdAUhjf/32Od195w4ethMn
 | |
| dtUuUXFsAjP32befp/ueDN5eppdn8th/yHOXEZAy+z4v71WDn43lt2+9dJ09IGP+Xo4d8CUL
 | |
| 2Rd5qNYFDKadHLeQqiAN9MGg00tznd5nfEoZrBH67WTVCw8Ko8dshd6XdbqcA3MxG+9ZQZOe
 | |
| DTa7Lu+1WFsB2xECXv6vYpLepByPeYoc+omXPoihXzyTcwPblSJxHkBJpFFKJ7cvXazizeEm
 | |
| N3Y2s6VaX/omv8mWTGjNZ2sPweAUTQFnqc1KMjtzsqBTUS3b9U1aT97K2a3vxJhNP+TZjEZn
 | |
| d53v8qSOSiUZ7qIJ9xsnxbqc3CtKSWaNXU7Xp8OF2CiKF6UXyUHIyHbrnMvcFKdBdZiGkRj9
 | |
| Z7TQ1sw2DmqKsr3ZTMfvW+ioQll+1CCxc0BIAaZRyaFZgGaS5M3VuwsnUlPzfJE7VonvtS2O
 | |
| /9lBa3Xuf+t6jblewjhSLBCcSYRQLR4KsJchZVPGCiEzEeZjTn74ZEq5/9+E/dTLClirB92m
 | |
| H7CZohD04hMU0RxfUm6yAs6HYXifalH5ZHvpRankkKXXxUJdstATqijR2z0JHSQPutM0dIyn
 | |
| R6CxDoBEo+sOv6HkbKz2abnWFouxQBdayrFe9Ri+gDNR8Qq2cJ1iflN31MeDrvB3Ne3YIa00
 | |
| AQzrIntPtYobUf8OUkuV6WQnlE9+6Pk35X/eCy8tF6JO9m6yRyjvb/2J7P/l/F/Uh9j/r37E
 | |
| /k+fvzww+3//4OWrA9j/+4fPH+3/X+Pz+999Jqzms/ouSX5vduNNVcyDOrCcwxosq1FefSVP
 | |
| HOzFCoar355bKQ881wdOt2nn8vPhnmkPQTzFbs2k50Uo/yD/OWv+s3nmD7/1gv0f+0TnH5rf
 | |
| v6SPnzr/6SvD/7/Yf7XP+t+Hh4eP+P9f5dOcf1gC0Pv9cgL/e7b2d0Mxv/XYHz+//BOd/6Oz
 | |
| j5+Gp+/++X3I+f989/n//HD/1cs0ffn54YsXL/afHwL/dfji5WP9/1/lo3XAmWgxOB2c90/S
 | |
| j1dvToZHqfwzOL0YJP/mNd2/8QSXbvrnpZhmB19+eZAk6XpKzxdfdvnTzowZzblI0p2fl69e
 | |
| oFppnfbvxSI+yoQhFaNxzgSM/ecHh18y9SJJB/d5tVIjERb9tFgsHJo0X9GSiTKD5Nlr6X6K
 | |
| H4u8ToLXfGJZKu4979LBe3OXzeh5KGgush4IXBpwdSf/pmvyscplbJMctUwuWVmILdWW8FIv
 | |
| Ghc8zfi8LsYzc69m38uX5iiuEqQ2jeAyKjUlhIPnEJDS1EvTNytGAqoM5fK3Z8Yknm5D//1C
 | |
| 7H3tarzMkOKTWxzkx7rCb4mP+dkzQsi/N2tYrfYmosBsLEa/malbayyjl1KHTHZkAgV4Tqnr
 | |
| Y76XXdk5T6IghoEPguUZIAIPd7C0s+XirqyYFsw6umWyrHX7ZEhPL+BT0td2UWVrcjfwmtGr
 | |
| kvhinxTXVVatduU4wX2ZZ6PeXsr4CFz/WRPJTrj0NmJ4CsqyB6oJmULzPGNpkFaOW9frrlT5
 | |
| LW5HoFPDN7ALmkzmFX0aBHVsH1m9QXvxnmroKgmB+Ig6orOjR2ZjfOlTo51qTFJINDaWV/fS
 | |
| tTs3Hor6bq8bugrODgVgSdNiJ6DcjyzYOEdgJfEXgVwqFtGrjIcppbaoUV6Hl07GeKOjRCMz
 | |
| IhY4Xl/31+bdsOZYItbbHZXmSaKTrebuXJZ4dYFsCu4fuVzNXXEQJNdShIaslJUtZvOyGNfF
 | |
| KBFiBXvCYuYzjQlpJ9oSBg6Srr/Xn0rsSpWHdEV9ipGMer0XFIxG/iDZXV4tcLeWuaKLELDE
 | |
| 8UTLuqLJ1h2NV5Kpirb8IX+SS/FWfsh/yJAh1/UntjZXL1EJugllPtzlOHbJGO5bzlgRM7e5
 | |
| NMR+4CQdu/tLqKOYaxDMgxa2VlhXHCM6unp6yvjuGjnDCccD1g2kFpEXgh0R5Uk7fSGJMI6a
 | |
| Pr+7fOrEwIRRjVuvlGDo0kt8axLeOLKFSixI8SB7usjn9Vfp04M9yiUVle1VF7JMnorhXCJ+
 | |
| YmQSSaaHO5TXxhrV/HGSj+WYU+LVtYEt0XQ33mFNg/VtjPvjqBGN73Iv6N5V9vmk9qkQS6m5
 | |
| QUrwjiPkahvBJVzw3KUw82wRxBnVYSuUnc7KJtFUXeEmQJImkB7u3onYMAdfqHfbHcIAEXFo
 | |
| 88zinRhfYtyijinIy0BzMA9OHOopNZlOd7tsSTHLJoi765QgZGQhRLRPKUsZCNdhqNdTXa5a
 | |
| LeqW18l4JStrKzF59ATW0nKRWTV0nCT8PFl12UnMnhSHyow7MGoR91jLhYgQzt6E4xw/w5cM
 | |
| ugNvJQchElVTjoQ7VjrjKDG6VJRBposeJCcmUcxGxX0xoms4La/JSLSToM904QHKhTZveNos
 | |
| vSk0g8BxVeSiRa96xjQR2ltwm7uhLOI0GzE/m6jF1NfZJqTH7zroUIqsdNJ6YuoGuDxLxS2a
 | |
| 5zQ1vec62Bz7H04u5VMpM1Suecu6cvMVcdpN+F1pXTPG3aV/W0Lb6yX/nvykgiy/Xg7OP1yk
 | |
| /dNj4KiPh1rLBUnTZlJ102PAyYdvrjSUKw9+ODsevrXYLga/bzGULaqSkSMXG/Ec6jHEQChn
 | |
| IEZEFJAEsacFZO8cScwhdb1hO3flBMKlzlam2k5FA73O40zzZGu6vAxsu3rR02XvfNTxdUR7
 | |
| BpSmm1BnCcOnWIjmgNF71KnDqSBcHNAL3lqCiFLtGfbRL2iDIJG8Ku5lx+5zXRAdfDPhSfbw
 | |
| lZ5phbvKzKVbfdaWzbP14paJ3QcZUJnoJk2Kv9oQmAH4e0wytbPcIJuRtc/5c8eSiZzNZTbG
 | |
| kj1FRWVhBLcLhKf9BatsY5kzo5C1PylEpbWfZ4nvTNqJe+9A82Tk3E6GYm1Go0phIVmddkR2
 | |
| dOSg9IW936uCUNq6Emi041y0JkllEopnoyErdRg5vFYWS61suUAYnky5ltadVIBmKm8Tx0fE
 | |
| S29M2TUdIEcsHoXWWsgpeyWJlHXGOXnXF8sokk8qGy0WlIjpBqEl3vNTYYP5HKrXjFaJxlKJ
 | |
| cFLGJfPcMuK9HjAwvohKZNUS6vbc4c8ud8IkATEiuzpQKBWO4M8wWF1Xs2ae1LEeg+2NlWuo
 | |
| zcBwyQmZihRYiiKGwHdBk9CVfizNvLhZlst6or0LzyEvF9qVb6wERsCS2CDjp5LmpBnnsUnc
 | |
| TLJiqsVxXfK/1iqIheLZTLtL9LXaJdatlyiKOeEsIJIACLOYPeYWmk7wDJXIxj6MFIH20rXw
 | |
| RU0/7YoS4WkrVsFdUkuHymsI6c/vVnUBPJLStR5mN9e0J1XwVtZKuxKI6XxBPYr0LwjdH9wy
 | |
| d6WZlPO8oRzT7wziiFlV2wnGOaZxtkQ5W0pkBrAQOtydrLhrslTpNFY0ydrbjNAY/LbKKxc2
 | |
| uYOEsI8tdCmkIQr3NM8XjkLxEL/L8a+0eGy21xgBitNg0Nt1RhRTIM+/kbXlwsocFVBHkrPk
 | |
| YkJSIhuT6608R1twDjSCtWWEp0/1dBzXG+MgbRKj6M1G6wWAiZ4sM22LGZtRHBIMMNZ7XgSx
 | |
| zu9qFXUxtqi9sZZLXZnaXd7CCGppVADMWi8ZVsHpGSKKp7GoRqEVENAuTcBFv07/Zs9V97D0
 | |
| Lui14rmwEiB16gZrAPdUlUEMCZ+xyQujFQYb2YS6lKBR/shiKtK2c2GcCJCeJjY0DVJJtGo5
 | |
| 6mOqRiJpK8IyuSZM8sGqyaaIogSCVnqazcqlcBeFL1II81C0OF66leNlbMC+2G37PIVOO0FJ
 | |
| INPAAn3YKdBxhBf2GoeFIoVx4lv1jsiBbLW5XWxh/cCYGM0nE5dfaM4BLvdF/rDGE9lKo+E9
 | |
| HfyAUsjS1FeOpQsiWyF5RWtT4UFlEwFOGyhBF1+9BLPWkneVibU4kM9mU0MgWGzUbOdaY709
 | |
| 0dzdb8JnDeiq2KWyTa/sszkeNEYTr3qXiRnoOHE7AzAn+YoqQzuPZpdyyapiyBmoy5m0Rlcu
 | |
| VKOKGmKjd2jlP6CNF6rO1qbvTWWN7w2vOGudQd1ZQp9xRLsOP4rmWYpoC8N/CCiiFg1NU96J
 | |
| FHcNp/NyEV5I1oiOmM3QLApmZQr3chajpgmRVs2WJutChYw1VjhNaGkbbhTaW86FkvYKqAO4
 | |
| cYeonadKgCvDNcCxeVAkEmxtZd24kql54uoOkS9ofOq0qnycVaMJEXG3xDA/QEyrc+xSXuxG
 | |
| YQKMlP73RWCYdZMUS8Uo8v9RUa0XSew6coQ7KlhJO6kOVh0B8tzrVHbpjoZD0xXNmyT/Ia/U
 | |
| /HXHmWVqL6pysnWxIwOqrESdm8Cb4eZUvVUVkDkPieEvNJozJTpsPMYqebMOAuQ8WAt/S0PJ
 | |
| uq5FBmlXLe/URPa02OR9OVkyYTcBALOsAHRTnt7MT3XfhgtdV87/otEp2yRNw0rZKuUOf1xV
 | |
| X5/C+uhhQqowdfXn+R5RNdffwafiPnBFKpLfQCPbIn+TCz9xBxzDcwPP7lCihBnAZWZnSl0a
 | |
| sgKN/tS/Qa4N1BWWk7PdwHeTnLKuUp8yBaGh159BmGOQqkA1RkjXzryf2rh83W5NUGVNezpa
 | |
| TVQ370ZaK6dZhapQS3cMNU5CCB3Vxl7LEnaDRrY5syycJ6rc3fQ+mxTaHBKVhDsv6H/Tea3y
 | |
| rGKgpjErqCCRIay6ppCbBjWzBCQa0gzoUTGyCJdbCJB+ijsm1FwXLqbXLqWwrj1bWF/xVl5E
 | |
| e3Na+0DFTwXwz9uD3euvM/kH9uBmF3XFBWojm5X6qSeeYINU9q/FoXZMGToKvWfZRMYyU35m
 | |
| aoyFbdU9oDdNzJgMJpxSzLYNd4e7ESD08H4rXSX4an/y8HK+QUHNAtXBLJd1qSzp7WJ57dLh
 | |
| WldfVBfmrsTm/W3DVNQjpmNhWFC3YxokJx5CMM48tW3LjCk0PY3nZLN40OqRC0dfe0/Yu3bp
 | |
| 8ZiNcQEyVI6WsJWKxmoRy26yrGmZZHVd3hTuEJMjgAqLLBtdhLJ3/rzyYQAQrUCySDWXXxhc
 | |
| YX4yqj3wkE8mWaw4NDOSWb73BBDodkk9z7njuSuz3Y35xMeFIT5IDfPHJaGsmbt6glIbv/YU
 | |
| Zru6C61lWaNrWiAJ9mkvTl/5jhrAVCia2ulTnSFGrAmVqprUYON7NsPEUiFwAFb1QlQ3Opl4
 | |
| i2xr/rCUZFWXM+otHHPoKjG1PbMT6iUzotUTIX+7oS1ErUPFik4AojXmJyOhIxOhSc8k37Lk
 | |
| eA1Fkxropjat1t9Koa4La8Yo1xrYoD5Xt6mMsjHkRVDPr5NtamWLS1r+Ubkc30W8vbCIuTo5
 | |
| p/Oc6TFbhrDmLooWA1GDNP280Rm0Lh8cQequEfuPTvRQHna7LpEopYJ68x/mcOTSgDJR7+w8
 | |
| UlUQzYSDSahivkio4zxQGyx3dr+7d/BPxJWUBhkrypYQAwsTZpAiBTayFffcMqwknENfYKjQ
 | |
| 7bqv6rPiYniYndsLCeEaWuQTDPE3Ry4UVQO/CQPj0eE2wbwBL/YBiD2IQJf873Y5Uc4yKZh+
 | |
| Bun1QrfOzbvY2rRLgdoWSF3AKenBaZKOwS3IbMP0LYVHY5hjmPjqtm2Hcs2lJyx8x8aUzIRb
 | |
| j30o9gYWb+ZWWcUg3V1xXSzUVT/JHkL0vowvOVojo0t4L3A3VBcQGB0Q07CqdtbqhuLrjq9t
 | |
| TvY9de4g4HgTqEb7b9X+8D3WGwQRpobH0WFGf09gT0cchp+sLeKaiWNQh5eWgYhSlKag/Jiq
 | |
| /xMzXsSghrUDZMQPE9lPo7O0xAPJ9osiRfQQt32JUYDfxyWn+zstAZosIhhDe80cQmHsqRDJ
 | |
| YJ7L22XFeFULcBIKprlT/UkajE3PoFMGQLqWpbhjiKuXtE+SIVSszm0OXfBGr6X2E2ghpYgd
 | |
| cx5rFtkrFANQwU53CrLaPDIAISBW+3fLEespp6qkRNapxpwT0UQhcXJ/6Nb20+MH8NekTzXa
 | |
| PC0MW2jxas1L2+smERVSGeY6khBAO08N/4JJ6aiYSSEDF3PZO2449Z7LaUD95JgsTNMPXayd
 | |
| kVbOGcQFnJ/oN4jG3e8q5MLwT3g99umXpo3XQO0IedXFdDmRY5prsEgDGCJDxqZXNlw/icM2
 | |
| EVovrxbqfo9eM9G/sYlQvZ0wd5w9C/tvIpMy392AngmVPBUjmlblSsyE1TNCCqLDHekJ3osw
 | |
| P1V79VrTpo6DhVhGhWYvqts+/CVmJLUKmYdOkZynXRuAo/LlvZZFstr1bR4YFWpARL2C0Aru
 | |
| IG7yjwxfdbgo6LPhkJL/vMsn0KTVGAaSbqaHMqeWp6KXTeAw3iwnqHtYVDfLqaZrK4e7ziZx
 | |
| Lm3UfIRETdQp6fEUfygKS6whVw1AOVMSSuJuEUEdtlxu82VFDrbF5yY7szT5zL/01Efok7qB
 | |
| VcDRL6S6Mu8Z3XUO1DNfnToOChbNZyP0ZuuTr9udM4GdKuOkNUKP8hmSBpMeV9biwmCYjYHd
 | |
| 2mJV+rvBv5qglD85iYr4ucIznPrndMljwdL0A/cxL1HmPkBykjFwHXKsletYN8EUf0AIv2IM
 | |
| Eui+jSHlo8SpnazLbBKiEY2fay3OutDi6nHerDnTE33ptTlRUUnbwr0EUX02Kme6ASORPiMi
 | |
| Swm1Sus70gyUQYr3lrMgjNXH1zAjG6TCTwJewtigSUJlxHdlQZ3wcu3UxGRKSBwGil7g3SfA
 | |
| 6cGMxGtZhvxeDwBqY69LK5Wq9WKDPdOI+KLnwbV1P8Vnhnpd41isQuDwCYQPHBxKw6jiHT6F
 | |
| S6SI+q9XTWQrttOVRzfqyAaWCFyRplfdGsemGUCOno1G6ncAEch2j3M8Pr9jBL01xQj0InJN
 | |
| Y3GJMuIwla5CM7NF+9VWOoC6c2ZUAlATJmkWQlnHsrYO8hFE4kyDU7jaudvmxaLkl3KCESKp
 | |
| ydCjIco5F6p0B6OFH6/L0QbKgMrLlyyxsRuKjpVy9EWV3xeM3uqWA9RsN4bUftHIrms3qANA
 | |
| i8VxwpULaXqBucVt8PCAMEXCF2DuMvZ6XlRRUT+hJxxce0PTIzBCLVGDF/ReD7J4BRz5laaK
 | |
| oNQwhxAiIZBUrv2yFFkY+Ffhb8QWyh4vZdLgi/6E3rLc4EPdNqY3h3dSZWvPbhgSyiojQJ1J
 | |
| 2g6Yd+sSlk63seIosh2j0TjPIwdqW6F2kJhHCH1QZeWogVZX22+YSbaQw8bcm4CGLsJq2xKs
 | |
| BclWAcNSup7vr/Da1B+/7ybKyVDo0n7PlUfHoEang7rCBv6EWDjlvzEKtbb4XesErynVoRjS
 | |
| 2q0uKh8Sw9BDfW8saVMNgxQI8ciYzf3Eyv/YJTKtK4CQwlFOcxyyOqE8CE7GOiCeLU0DQozr
 | |
| 7tf3CMmPmrEAMj4uswlPN89ede9kp2oBS5ySpuT9xglQ+9WybeUBtrNqHNMy2OzI/FFsA644
 | |
| MDESXhkrP2FRDkt1Oj0Ltwlx/w966ZvBUf/qYsBKRR/Pz96d9z+g5JehYo/Tt+eDQXr2Nj16
 | |
| 3z9/N+jiufMBnojbAkY2aqCLMjb4e/DXy8HpZfpxcP5heHkprb35lPY/fpTG+29OBulJ/1tZ
 | |
| zcFfjwYfL9Nv3w9OkzM0/+1QxnNx2ccLw9P02/Ph5fD0ndVS+vjpfPju/WX6/uzkeHBOtO5n
 | |
| 0jtf1KuNBheJjOOb4XF7Up3+hQy7E65W8sFjcrhm6S/D0+NuOhiyocFfP54PLmT+ibQ9/CAj
 | |
| HsiPw9Ojk6tjAoHfXGlNH1bZk3FennFp/FlvXQYj7W/cyQTk8M+4lIlLKI3Igp8PL/6S9i8S
 | |
| W9ivr/qhIVldaeND//SIG7W2kZhu+unsClJD5n1yjAcSfwALNUiPB29RNPob2V55Urq5uPow
 | |
| sPW+uOQCnZykp4MjGW///FN6MTj/ZniEdUjOBx/7Q1l+YKTPz7X0tPKW5z1snlDJ4BvQwNXp
 | |
| CWZ7Pvj6SuazhRLQRv+dUBsWM9r35NuhdI4dWt/8Ll+RH5rN/yRkhALpnxSY/cnIQ4YZkNtt
 | |
| qhCiaKiz/+YMa/BGxjPksGQgWBBs0XH/Q//d4KKbBCJg1wYm76YXHwdHQ/yH/C6kJ3t9oqsi
 | |
| p+jrK+yifGGNpH3ZTkwNdGhbhjMIWjt1GpG+18/l06bvNfoDXZycXYDYpJPLfsoRy7/fDPD0
 | |
| +eBU1ovHqX90dHUuRwtP4A0ZzcWVHLbhKTclwXx5mofnx36euM7p2/7w5Op8g8akZ71oc6C0
 | |
| FjbEiexir0saSIdvpauj97Z7aevUfkrfy1a8Gchj/eNvhuA82k8iZ+FiaGtyZi3YOpKxMddU
 | |
| 5sfntwD4gf3vzwHOKX74Ck5cyAEtT6t+1ktqAfLlJ7DdU1F5TNbVoGOTjyMRr5Ny3lzz3qAp
 | |
| oyw3w+qZyBwzC6ReJGKJqLNsWQcppAae2d05K0yt1DN9B0NDVR9Fu1MSFYukLRFUEoa0nY07
 | |
| 9KKE0BAydiei58W5Y3axyCzw1ChIAdLr+qM6I1KrvFRnt5gaRhzenjb3khqMiDAci7TwWixP
 | |
| GdU8FEUOippwn68sciUqfG3KWgM5JpAHTbGN+MY5j/lTk+8EpaCDkqXmvErnJe0gLXuXWxIs
 | |
| AwYG9UMaE9QAg0L+EevJ9x03EC3Ak1pLzGvT12KB3GpVOUKK9EY/YsP/xLbW06pXK2nfa9Sr
 | |
| 4vMn7fUX3ZNo/mNFUf4DdyWma3clBsTez78v0TuJ7ktkKz/rzsRtC/D33ptIyMgvuzsRTfyy
 | |
| +xNDktHPvkMRb/zyexQVIvGP36WI9zfvU/x5KfxIRgFOCV6BGBYCz5myWy9/CyIWBZmVD6ty
 | |
| JuPXFEDR91EDfqKuzhZCo4VI7Tov9ESSDMtWBRCv1umF+5qAR9S3XNCIYRZFC9sqByY3BNW7
 | |
| mSjV96rNOzm/lKltObvtk7vxthaC5A7031ycnYi2cfIp1pRfkwJs8/VG7L8xW/XhSa85BOun
 | |
| v5EzZPz5BP1olbgWM2ALljsV/EVugr2Ou7t5Eg+kp1CVu9Uchh3jWg3K28fHMYS3jVo907aV
 | |
| TdKyG3fmm53dMpRi0Y+mP4aKa3g1V3BoIMbGCLDYZfQoRMlOW4dmuUvqmedpv86TaSlNPruR
 | |
| EXxPR8Y0ny1lwfJp/ewZuDaN5xr1/9I4x799xSnBeEg/5iO4Z7RcoS6WZ7oH+LG9PUVdWM3d
 | |
| rpIaJvtEYxszRbAjuIzEucYZ16TcdJrMFNc1UKgUqfG1Zmi+N2R6BtzEfCIigqgpvgMy1fyK
 | |
| T+WqHK1muZ9oyD8tWmw+8WwSewN5QqCNGMO1zqWhv0V0/gQBMWIE5TTWmsLLUtYOfKn3ghNN
 | |
| OvszRpO+R6H/igzvjwodQbK3UMnlSk5aOftTNz0QvawqJqw+AgVFf+iiQkddeE7XN0JB5snd
 | |
| wWSDX8UiRY1PI5RonkfejCTKfA1FBkJYrYpZUYagbFUiJg1mw1ISwSmTOB6cGZlg8iqZGG7U
 | |
| kYhSQTRX3GOr6qnjUBJr3J1GyhQeHBbqadwjUd48Y2ZLdYtke3WLTWfmb1295vHzSz9R/afh
 | |
| qZhzJyf//D5+4v6H588P9v3+h8NXuAvw4PDF4eP9f7/KZ+P+h6+vhkd/SY0WNi+AiG9siC6A
 | |
| D+pP7wWFqvy7dwhFbiiCYWLXQSThOohu+8YIYe4He2l6NUNPUYG5ZzKi3mL8P+o8Vqba9Gnm
 | |
| 4aJiWuRzaeBDA4ahwed7itHMs8XNXagOrY9UdYB3qrEXCrpDqBmCGkH9EhyVuup0rYvm0kNZ
 | |
| kygZn0/opWahloNDmsF5P51dnadeJ51yt5ce57eolwuO3ZmuRtedXnoUKvV+EJGJHnvJ4Z5d
 | |
| gqOS1ufF5bHQZpSu3NEHnn3cT/+4uSCd5HNp7Tw3ARwe6KVPj3mBmGgTLJxSsjg664QTGbaX
 | |
| vJAXj5pyPPPxf91dZz3caG0phaJVBS+Dgrn4p5WxfvpJhO3QK3QYWLteip6117NyWvLeUPP5
 | |
| 6jUfgzT5FV/g5Vv+gUoizONLIamXXwjl7IO5vHjR83/2+dSikiOfvOS01d8Q7q1SRO+qQXU+
 | |
| KwDoBaY8X3gd5730dpKNheJeSRNUjTs0LqT3jtosupSRGlQnX+BZSw1O14C6pKNZKJbaVE7d
 | |
| CwlJ/NIFuJORQjIKRHFd7fKbUpIvMb2lftlZzjsGqpZzts+rzDDrCZMQUXn70nZX94couspf
 | |
| RlVQf113revbxvLwSxTJlzZ21H7eWfn5QlSSP4sunh6mz/f3fxVW//h5/Dx+Hj+Pn8fP4+fx
 | |
| 8/h5/Dx+Hj+Pn8fP/9fP/wKykq3cAMgAAA==
 | |
| 
 | |
| --Boundary_(ID_GeYGc69fE1/bkYLTPwOGFg)--
 | |
| 
 | |
| ************
 | |
| 
 | |
| 
 | |
| From owner-pgsql-hackers@hub.org Mon Jan  3 13:47:07 2000
 | |
| Received: from hub.org (hub.org [216.126.84.1])
 | |
| 	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id OAA23987
 | |
| 	for <pgman@candle.pha.pa.us>; Mon, 3 Jan 2000 14:47:06 -0500 (EST)
 | |
| Received: from localhost (majordom@localhost)
 | |
| 	by hub.org (8.9.3/8.9.3) with SMTP id OAA03234;
 | |
| 	Mon, 3 Jan 2000 14:39:56 -0500 (EST)
 | |
| 	(envelope-from owner-pgsql-hackers)
 | |
| Received: by hub.org (bulk_mailer v1.5); Mon, 3 Jan 2000 14:39:49 -0500
 | |
| Received: (from majordom@localhost)
 | |
| 	by hub.org (8.9.3/8.9.3) id OAA03050
 | |
| 	for pgsql-hackers-outgoing; Mon, 3 Jan 2000 14:38:50 -0500 (EST)
 | |
| 	(envelope-from owner-pgsql-hackers@postgreSQL.org)
 | |
| Received: from ara.zf.jcu.cz (zakkr@ara.zf.jcu.cz [160.217.161.4])
 | |
| 	by hub.org (8.9.3/8.9.3) with ESMTP id OAA02975
 | |
| 	for <pgsql-hackers@postgreSQL.org>; Mon, 3 Jan 2000 14:38:05 -0500 (EST)
 | |
| 	(envelope-from zakkr@zf.jcu.cz)
 | |
| Received: from localhost (zakkr@localhost)
 | |
| 	by ara.zf.jcu.cz (8.9.3/8.9.3/Debian/GNU) with SMTP id UAA19297;
 | |
| 	Mon, 3 Jan 2000 20:23:35 +0100
 | |
| Date: Mon, 3 Jan 2000 20:23:35 +0100 (CET)
 | |
| From: Karel Zak - Zakkr <zakkr@zf.jcu.cz>
 | |
| To: P.Marchesso@videotron.ca
 | |
| cc: pgsql-hackers <pgsql-hackers@postgresql.org>
 | |
| Subject: [HACKERS] replicator
 | |
| Message-ID: <Pine.LNX.3.96.1000103194931.19115A-100000@ara.zf.jcu.cz>
 | |
| MIME-Version: 1.0
 | |
| Content-Type: TEXT/PLAIN; charset=US-ASCII
 | |
| Sender: owner-pgsql-hackers@postgresql.org
 | |
| Status: OR
 | |
| 
 | |
| 
 | |
| Hi,
 | |
| 
 | |
| I looked at your (Philippe's) replicator, but I don't fully understand
 | |
| your replication concept.
 | |
| 
 | |
| 
 | |
|     node1:  SQL --IPC--> node-broker
 | |
|                        |
 | |
|                       TCP/IP
 | |
|                        |
 | |
|                     master-node --IPC--> replicator
 | |
|                                          |   |   |
 | |
|                                            libpq
 | |
|                                          |   |   |
 | |
|                                        node2 node..n     
 | |
| 
 | |
| (Is this the right picture?)
 | |
| 
 | |
| If I understand correctly, all nodes connect to the master node, and the
 | |
| "replicator" on this master node replicates the data. But the master node
 | |
| is a very critical point in this concept: if the master node does not
 | |
| work, replication for *all* nodes is lost. Hmm... but I want to use
 | |
| replication for high-availability applications...
 | |
| 
 | |
| IMHO there is a problem with node registration / authentication on the
 | |
| master node. Why is the concept not more direct? For example:
 | |
| 
 | |
| 	SQL --IPC--> node-replicator
 | |
| 			|  |  | 
 | |
| 		     via libpq send data to all nodes with
 | |
|                      current client/backend auth.
 | |
| 
 | |
| 	(there is no master node; every node is connected to every other node)
 | |
| 
 | |
| 
 | |
| Using the replicator as an external process and copying data from SQL to
 | |
| this replicator via IPC is a very good idea of yours.
 | |
| 
 | |
| 							Karel
 | |
| 
 | |
| 
 | |
| ----------------------------------------------------------------------
 | |
| Karel Zak <zakkr@zf.jcu.cz>              http://home.zf.jcu.cz/~zakkr/
 | |
| 
 | |
| Docs:        http://docs.linux.cz                    (big docs archive)	
 | |
| Kim Project: http://home.zf.jcu.cz/~zakkr/kim/        (process manager)
 | |
| FTP:         ftp://ftp2.zf.jcu.cz/users/zakkr/        (C/ncurses/PgSQL)
 | |
| -----------------------------------------------------------------------
 | |
| 
 | |
| 
 | |
| ************
 | |
| 
 | |
| From owner-pgsql-hackers@hub.org Tue Jan  4 10:31:01 2000
 | |
| Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
 | |
| 	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id LAA17522
 | |
| 	for <pgman@candle.pha.pa.us>; Tue, 4 Jan 2000 11:31:00 -0500 (EST)
 | |
| Received: from hub.org (hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id LAA01541 for <pgman@candle.pha.pa.us>; Tue, 4 Jan 2000 11:27:30 -0500 (EST)
 | |
| Received: from localhost (majordom@localhost)
 | |
| 	by hub.org (8.9.3/8.9.3) with SMTP id LAA09992;
 | |
| 	Tue, 4 Jan 2000 11:18:07 -0500 (EST)
 | |
| 	(envelope-from owner-pgsql-hackers)
 | |
| Received: by hub.org (bulk_mailer v1.5); Tue, 4 Jan 2000 11:17:58 -0500
 | |
| Received: (from majordom@localhost)
 | |
| 	by hub.org (8.9.3/8.9.3) id LAA09856
 | |
| 	for pgsql-hackers-outgoing; Tue, 4 Jan 2000 11:17:17 -0500 (EST)
 | |
| 	(envelope-from owner-pgsql-hackers@postgreSQL.org)
 | |
| Received: from ara.zf.jcu.cz (zakkr@ara.zf.jcu.cz [160.217.161.4])
 | |
| 	by hub.org (8.9.3/8.9.3) with ESMTP id LAA09763
 | |
| 	for <pgsql-hackers@postgreSQL.org>; Tue, 4 Jan 2000 11:16:43 -0500 (EST)
 | |
| 	(envelope-from zakkr@zf.jcu.cz)
 | |
| Received: from localhost (zakkr@localhost)
 | |
| 	by ara.zf.jcu.cz (8.9.3/8.9.3/Debian/GNU) with SMTP id RAA31673;
 | |
| 	Tue, 4 Jan 2000 17:02:06 +0100
 | |
| Date: Tue, 4 Jan 2000 17:02:06 +0100 (CET)
 | |
| From: Karel Zak - Zakkr <zakkr@zf.jcu.cz>
 | |
| To: Philippe Marchesseault <P.Marchesso@Videotron.ca>
 | |
| cc: pgsql-hackers <pgsql-hackers@postgreSQL.org>
 | |
| Subject: Re: [HACKERS] replicator
 | |
| In-Reply-To: <38714B6F.2DECAEC0@Videotron.ca>
 | |
| Message-ID: <Pine.LNX.3.96.1000104162226.27234D-100000@ara.zf.jcu.cz>
 | |
| MIME-Version: 1.0
 | |
| Content-Type: TEXT/PLAIN; charset=US-ASCII
 | |
| Sender: owner-pgsql-hackers@postgreSQL.org
 | |
| Status: OR
 | |
| 
 | |
| 
 | |
| On Mon, 3 Jan 2000, Philippe Marchesseault wrote:
 | |
| 
 | |
| > So it could become:
 | |
| > 
 | |
| > SQL --IPC--> node-replicator
 | |
| >                            |   |   |
 | |
| >       via TCP send statements to each node
 | |
| >                       replicator (on local node)
 | |
| >                            |
 | |
| >          via libpq send data to
 | |
| >         current (local) backend.
 | |
| > 
 | |
| > >  (there is no master node; every node is connected to every other node)
 | |
| > 
 | |
| > Exactly, if the replicator dies only the node dies, everything else keeps
 | |
| > working.
 | |
| 
 | |
| 
 | |
|  Hi,
 | |
| 
 | |
|  I explored the replication concepts of Oracle and Sybase a little (in the
 | |
| manuals). (Does anyone know of interesting links or publications about this?)
 | |
| 
 | |
|  First, I am sure it is premature to write replication for PgSQL now, while
 | |
| we do not have a precise concept for it. It needs more suggestions from more
 | |
| developers. We first need answers to the following questions:
 | |
| 
 | |
| 	1/ Which replication concept should we choose for PG?
 | |
| 	2/ How do we manage transactions across nodes? (We also need to
 | |
|            define a replication protocol for this.)
 | |
| 	3/ How do we integrate replication into the current PG transaction code?
 | |
| 
 | |
| My idea (dream :-) is replication that allows full read-write access on all
 | |
| nodes and that uses the current transaction method in PG, so that there is
 | |
| no difference between several backends on one host and several backends on
 | |
| several hosts. That gives "global transaction consistency".
 | |
| 
 | |
| Today transactions are managed via IPC (one host); my dream is to manage
 | |
| transactions in the same way, but between several hosts via TCP. (And to
 | |
| optimize this by transferring only committed data/commands.)
 | |
| 
 | |
| 
 | |
| Any suggestions?
 | |
| 
 | |
| 
 | |
| -------------------
 | |
| Note:
 | |
|  
 | |
| (transaction oriented replication)
 | |
| 
 | |
|  Sybase - I. model (only one node is read-write) 
 | |
| 
 | |
| 	 primary SQL data (READ-WRITE)
 | |
|                 |
 | |
| 	 replication agent (transaction log monitoring)
 | |
| 		|
 | |
| 	 primary distribution server (one or more repl. servers)
 | |
| 	        |               /  |  \
 | |
|                 |            nodes (READ-ONLY)
 | |
|                 |
 | |
|          secondary dist. server
 | |
|                           /  |  \
 | |
|                        nodes (READ-ONLY)
 | |
| 
 | |
| 
 | |
|        If the primary SQL server is read-write and the other nodes are
 | |
|        *read-only* => the system keeps working while a connection is down
 | |
|           (data is saved to the replication log, and when the connection
 | |
|           is available again the log is written to the node).
 | |
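
The store-and-forward behaviour noted for model I can be sketched roughly as
follows. This is only an illustration in Python, not Sybase's actual
mechanism: the class name ReplicaLink and its attributes are hypothetical,
and a plain list stands in for the data on a read-only node.

```python
from collections import deque


class ReplicaLink:
    """Hypothetical link from the primary to one read-only node."""

    def __init__(self):
        self.connected = True
        self.pending = deque()  # replication log kept while the node is unreachable
        self.applied = []       # stands in for the data on the read-only node

    def publish(self, txn):
        """Record a committed transaction and try to deliver it."""
        self.pending.append(txn)
        self.flush()

    def flush(self):
        """Write logged transactions to the node, in commit order, while connected."""
        while self.connected and self.pending:
            self.applied.append(self.pending.popleft())


link = ReplicaLink()
link.publish("txn-1")    # delivered immediately
link.connected = False   # the connection goes down
link.publish("txn-2")    # stays in the replication log
link.publish("txn-3")
link.connected = True    # the connection comes back
link.flush()             # the log is written to the node
```

The primary never blocks on an unreachable node here; it only appends to the
log, which is why the read-only nodes can lag behind without stopping it.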
| 
 | |
| 
 | |
|  Sybase - II. model (all nodes read-write)
 | |
| 
 | |
|      	    SQL data 1 --->--+                        NODE I.
 | |
|                 |            |
 | |
|                 ^            |
 | |
| 	        |     replication agent 1 (transaction log monitoring)
 | |
|                 V        |
 | |
| 		|        V
 | |
|                 |        |
 | |
|          replication server 1
 | |
|                 |
 | |
| 		^
 | |
|                 V
 | |
|                 |
 | |
|          replication server 2                        NODE II.
 | |
|                 |         |
 | |
|                 ^         +-<-->--- SQL data 2
 | |
|                 |                    |                      
 | |
|               replication agent 2 -<--
 | |
| 
 | |
| 
 | |
| 
 | |
| Sorry, I am not sure I redrew the previous picture entirely correctly...
 | |
| 
 | |
| 								Karel   
 | |
| 
 | |
| 
 | |
| 	
 | |
|     
 | |
| 
 | |
| 
 | |
| ************
 | |
| 
 | |
| From pgsql-hackers-owner+M3133@hub.org Fri Jun  9 15:02:25 2000
 | |
| Received: from hub.org (root@hub.org [216.126.84.1])
 | |
| 	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id PAA22319
 | |
| 	for <pgman@candle.pha.pa.us>; Fri, 9 Jun 2000 15:02:24 -0400 (EDT)
 | |
| Received: from hub.org (majordom@localhost [127.0.0.1])
 | |
| 	by hub.org (8.10.1/8.10.1) with SMTP id e59IsET81137;
 | |
| 	Fri, 9 Jun 2000 14:54:14 -0400 (EDT)
 | |
| Received: from ultra2.quiknet.com (ultra2.quiknet.com [207.183.249.4])
 | |
| 	by hub.org (8.10.1/8.10.1) with SMTP id e59IrQT80458
 | |
| 	for <pgsql-hackers@postgresql.org>; Fri, 9 Jun 2000 14:53:26 -0400 (EDT)
 | |
| Received: (qmail 13302 invoked from network); 9 Jun 2000 18:53:21 -0000
 | |
| Received: from 18.67.tc1.oro.pmpool.quiknet.com (HELO quiknet.com) (pecondon@207.231.67.18)
 | |
|   by ultra2.quiknet.com with SMTP; 9 Jun 2000 18:53:21 -0000
 | |
| Message-ID: <39413D08.A6BDC664@quiknet.com>
 | |
| Date: Fri, 09 Jun 2000 11:52:57 -0700
 | |
| From: Paul Condon <pecondon@quiknet.com>
 | |
| X-Mailer: Mozilla 4.73 [en] (X11; U; Linux 2.2.14-5.0 i686)
 | |
| X-Accept-Language: en
 | |
| MIME-Version: 1.0
 | |
| To: ohp@pyrenet.fr, pgsql-hackers@postgresql.org
 | |
| Subject: [HACKERS] Re: Big project, please help
 | |
| Content-Type: text/plain; charset=us-ascii
 | |
| Content-Transfer-Encoding: 7bit
 | |
| X-Mailing-List: pgsql-hackers@postgresql.org
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@hub.org
 | |
| Status: OR
 | |
| 
 | |
| Two way replication on a single "table" is available in Lotus Notes. In
 | |
| Notes, every record has a time-stamp, which contains the time of the
 | |
| last update. (It also has a creation timestamp.) During replication,
 | |
| timestamps are compared at the row/record level, and compared with the
 | |
| timestamp of the last replication. If, for corresponding rows in two
 | |
| replicas, the timestamp of one row is newer than the last replication,
 | |
| the contents of this newer row are copied to the other replica. But if
 | |
| both of the corresponding rows have newer timestamps, there is a
 | |
| problem. The Lotus Notes solution is to:
 | |
|   1. send a replication conflict message to the Notes Administrator,
 | |
| which message contains full copies of both rows.
 | |
|   2. copy the newest row over the less new row in the replicas.
 | |
|   3. there is a mechanism for the Administrator to reverse the default
 | |
| decision in 2, if the semantics of the message history, or off-line
 | |
| investigation indicates that the wrong decision was made.
 | |
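
The comparison and the default decision described above can be sketched
roughly as follows. This is an illustration in Python, not Notes code: the
names Row and replicate are hypothetical, the replicas are plain dicts, and
it assumes both replicas hold the same row keys and that a row untouched
since the last replication has a timestamp no newer than that replication.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    key: str
    data: str
    updated_at: float  # time of the last update


def replicate(replica_a, replica_b, last_replication):
    """Synchronise two dicts (key -> Row) in place; return the conflicts.

    Each conflict is a pair holding full copies of both rows, i.e. what
    would go into the message to the administrator (step 1).
    """
    conflicts = []
    for key in sorted(replica_a.keys() & replica_b.keys()):
        a, b = replica_a[key], replica_b[key]
        a_changed = a.updated_at > last_replication
        b_changed = b.updated_at > last_replication
        if a_changed and b_changed:
            conflicts.append((a, b))  # both rows changed: report it
        if a_changed or b_changed:
            # Step 2: the newest row overwrites the less new one.
            newest = a if a.updated_at >= b.updated_at else b
            replica_a[key] = replica_b[key] = newest
    return conflicts


site_a = {"r1": Row("r1", "edited at A", 30.0), "r2": Row("r2", "same", 5.0)}
site_b = {"r1": Row("r1", "edited at B", 35.0), "r2": Row("r2", "same", 5.0)}
reported = replicate(site_a, site_b, last_replication=20.0)
```

Step 3, letting the administrator reverse the default decision, is left out;
the returned pairs keep both versions, so the losing row is still available
for that.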
| 
 | |
| In practice, the Administrator is not overwhelmed with replication
 | |
| conflict messages because updates usually only originate at the site
 | |
| that originally created the row. Or updates fill only fields that were
 | |
| originally 'TBD'. The full logic is perhaps more complicated than I have
 | |
| described here, but it is already complicated enough to give you an idea
 | |
| of what you're really being asked to do. I am not aware of a supplier of
 | |
| relational database who really supports two way replication at the level
 | |
| that Notes supports it, but Notes isn't a relational database.
 | |
| 
 | |
| The difficulty of the position that you appear to be in is that
 | |
| management might believe that the full problem is solved in brand X
 | |
| RDBMS, and you will have trouble convincing management that this is not
 | |
| really true.
 | |
| 
 | |
| 
 | |
| From pgsql-hackers-owner+M2401@hub.org Tue May 23 12:19:54 2000
 | |
| Received: from news.tht.net (news.hub.org [216.126.91.242])
 | |
| 	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id MAA28410
 | |
| 	for <pgman@candle.pha.pa.us>; Tue, 23 May 2000 12:19:53 -0400 (EDT)
 | |
| Received: from hub.org (majordom@hub.org [216.126.84.1])
 | |
| 	by news.tht.net (8.9.3/8.9.3) with ESMTP id MAB53304;
 | |
| 	Tue, 23 May 2000 12:00:08 -0400 (EDT)
 | |
| 	(envelope-from pgsql-hackers-owner+M2401@hub.org)
 | |
| Received: from gwineta.repas.de (gwineta.repas.de [193.101.49.1])
 | |
| 	by hub.org (8.9.3/8.9.3) with ESMTP id LAA39896
 | |
| 	for <pgsql-hackers@postgresql.org>; Tue, 23 May 2000 11:57:31 -0400 (EDT)
 | |
| 	(envelope-from kardos@repas-aeg.de)
 | |
| Received: (from smap@localhost)
 | |
| 	by gwineta.repas.de (8.8.8/8.8.8) id RAA27154
 | |
| 	for <pgsql-hackers@postgresql.org>; Tue, 23 May 2000 17:57:23 +0200
 | |
| Received: from dragon.dr.repas.de(172.30.48.206) by gwineta.repas.de via smap (V2.1)
 | |
| 	id xma027101; Tue, 23 May 00 17:56:20 +0200
 | |
| Received: from kardos.dr.repas.de ([172.30.48.153])
 | |
|   by dragon.dr.repas.de (UCX V4.2-21C, OpenVMS V6.2 Alpha);
 | |
| 	Tue, 23 May 2000 17:57:24 +0200
 | |
| Message-ID: <010201bfc4cf$7334d5a0$99301eac@Dr.repas.de>
 | |
| From: "Kardos, Dr. Andreas" <kardos@repas-aeg.de>
 | |
| To: "Todd M. Shrider" <tshrider@varesearch.com>,
 | |
|         <pgsql-hackers@postgresql.org>
 | |
| References: <Pine.LNX.4.04.10005180846290.15739-100000@silicon.su.valinux.com>
 | |
| Subject: Re: [HACKERS] failing over with postgresql
 | |
| Date: Tue, 23 May 2000 17:56:20 +0200
 | |
| Organization: repas AEG Automation GmbH
 | |
| MIME-Version: 1.0
 | |
| Content-Type: text/plain;
 | |
| 	charset="iso-8859-1"
 | |
| Content-Transfer-Encoding: 8bit
 | |
| X-Priority: 3
 | |
| X-MSMail-Priority: Normal
 | |
| X-Mailer: Microsoft Outlook Express 5.00.2314.1300
 | |
| X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300
 | |
| X-Mailing-List: pgsql-hackers@postgresql.org
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@hub.org
 | |
| Status: OR
 | |
| 
 | |
| For a SCADA system (Supervisory Control and Data Acquisition) which consists
 | |
| of one master and one hot-standby server I have implemented such a
 | |
| solution. Client workstations (NT and/or UNIX) are connected to these UNIX
 | |
| servers. The database client programs run on both the client and server side.
 | |
| 
 | |
| When developing this approach I had two goals in mind:
 | |
| 1) Not to become dependent on the PostgreSQL sources, since they change very
 | |
| dynamically.
 | |
| 2) Not to become dependent on the fe/be protocol, since there are discussions
 | |
| about changing it.
 | |
| 
 | |
| So the approach is quite simple: Forward all database requests to the
 | |
| standby server on TCP/IP level.
 | |
| 
 | |
| On both servers the postmaster listens on port 5433 and not on 5432. On
 | |
| standard port 5432 my program listens instead. This program forks twice for
 | |
| every incoming connection. The first instance forwards all packets from the
 | |
| frontend to both backends. The second instance receives the packets from all
 | |
| backends and forwards the packets from the master backend to the frontend.
 | |
| So a frontend running on a server machine connects to port 5432 of
 | |
| localhost.
 | |
| 
 | |
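The forwarding scheme in the paragraph above can be sketched roughly as follows. This is only an illustration in Python, not the program described in this mail (whose source is not shown); the function names `pump_frontend` and `pump_backends` are hypothetical, and the two functions stand in for the two forked instances per connection.

```python
import selectors
import socket


def pump_frontend(frontend, backends, bufsize=4096):
    """First instance: copy every chunk from the frontend to all backends."""
    while True:
        chunk = frontend.recv(bufsize)
        if not chunk:  # frontend closed its sending side
            break
        for backend in backends:
            backend.sendall(chunk)


def pump_backends(master, standby, frontend, bufsize=4096):
    """Second instance: read from both backends, relay only the master's data."""
    sel = selectors.DefaultSelector()
    sel.register(master, selectors.EVENT_READ)
    sel.register(standby, selectors.EVENT_READ)
    still_open = 2
    while still_open:
        for key, _events in sel.select():
            chunk = key.fileobj.recv(bufsize)
            if not chunk:
                sel.unregister(key.fileobj)
                still_open -= 1
            elif key.fileobj is master:
                frontend.sendall(chunk)
            # data from the standby is read and dropped
    sel.close()
```

Because the standby's replies are discarded, anything that only the standby backend reports never reaches the frontend, which fits the cancel limitation mentioned further down in this mail.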
| On the client machine runs another program (on NT as a service). This
 | |
| program forks twice for every incoming connection. The first instance
 | |
| forwards all packets to port 5432 of the current master server and the
 | |
| second instance forwards the packets from the master server to the frontend.
 | |
| 
 | |
| During standby computer startup the database of the master computer is
 | |
| dumped, zipped, copied to the standby computer, unzipped and loaded into
 | |
| that database.
 | |
| If a standby startup took place, all client connections are aborted to allow
 | |
| a login into the standby database. The frontends need to reconnect in this
 | |
| case. So the database of the standby computer is always in sync.
 | |
| 
 | |
| The disadvantage of this method is that a query cannot be canceled in the
 | |
| standby server since the request key of this connections gets lost. But we
 | |
| can live with that.
 | |
| 
 | |
| Both programs are able to run on Unix and on (native!) NT. On NT threads
 | |
| are created instead of forked processes.
 | |
| 
 | |
| This approach is simple, but it is effective and it works.
 | |
| 
 | |
| We hope to survive this way until real replication will be implemented in
 | |
| PostgreSQL.
 | |
| 
 | |
| Andreas Kardos
 | |
| 
 | |
| -----Original Message-----
 | |
| From: Todd M. Shrider <tshrider@varesearch.com>
 | |
| To: <pgsql-hackers@postgresql.org>
 | |
| Sent: Thursday, May 18, 2000 17:48
 | |
| Subject: [HACKERS] failing over with postgresql
 | |
| 
 | |
| 
 | |
| >
 | |
| > is anyone working on or have working a fail-over implementation for the
 | |
| > postgresql stuff. i'd be interested in seeing if and how any might be
 | |
| > dealing with just general issues as well as the database syncing issues.
 | |
| >
 | |
| > we are looking to do this with heartbeat and lvs in mind. also if anyone
 | |
| > is load balancing their databases that would be cool to talk about too.
 | |
| >
 | |
| > ---
 | |
| > Todd M. Shrider VA Linux Systems
 | |
| > Systems Engineer
 | |
| > tshrider@valinux.com www.valinux.com
 | |
| >
 | |
| 
 | |
| 
 | |
| From pgsql-hackers-owner+M3662@postgresql.org Tue Jan 23 16:23:34 2001
 | |
| Received: from mail.postgresql.org (webmail.postgresql.org [216.126.85.28])
 | |
| 	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id QAA04456
 | |
| 	for <pgman@candle.pha.pa.us>; Tue, 23 Jan 2001 16:23:34 -0500 (EST)
 | |
| Received: from mail.postgresql.org (webmail.postgresql.org [216.126.85.28])
 | |
| 	by mail.postgresql.org (8.11.1/8.11.1) with SMTP id f0NLKf004705;
 | |
| 	Tue, 23 Jan 2001 16:20:41 -0500 (EST)
 | |
| 	(envelope-from pgsql-hackers-owner+M3662@postgresql.org)
 | |
| Received: from sectorbase2.sectorbase.com ([208.48.122.131])
 | |
| 	by mail.postgresql.org (8.11.1/8.11.1) with SMTP id f0NLAe003753
 | |
| 	for <pgsql-hackers@postgresql.org>; Tue, 23 Jan 2001 16:10:40 -0500 (EST)
 | |
| 	(envelope-from vmikheev@SECTORBASE.COM)
 | |
| Received: by sectorbase2.sectorbase.com with Internet Mail Service (5.5.2653.19)
 | |
| 	id <DG1W4Q8F>; Tue, 23 Jan 2001 12:49:07 -0800
 | |
| Message-ID: <8F4C99C66D04D4118F580090272A7A234D32AF@sectorbase1.sectorbase.com>
 | |
| From: "Mikheev, Vadim" <vmikheev@SECTORBASE.COM>
 | |
| To: "'dom@idealx.com'" <dom@idealx.com>, pgsql-hackers@postgresql.org
 | |
| Subject: RE: [HACKERS] Re: AW: Re: MySQL and BerkleyDB (fwd)
 | |
| Date: Tue, 23 Jan 2001 13:10:34 -0800
 | |
| MIME-Version: 1.0
 | |
| X-Mailer: Internet Mail Service (5.5.2653.19)
 | |
| Content-Type: text/plain;
 | |
| 	charset="iso-8859-1"
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Status: ORr
 | |
| 
 | |
| >   I had thought that the pre-commit information could be stored in an
 | |
| > auxiliary table by the middleware program ; we would then have
 | |
| > to re-implement some sort of higher-level WAL (I thought of the list
 | |
| > of the commands performed in the current transaction, with a sequence
 | |
| > number for each of them that would guarantee correct ordering between
 | |
| > concurrent transactions in case of a REDO). But I fear I am missing
 | |
| 
 | |
| This wouldn't work for READ COMMITTED isolation level.
 | |
| But why do you want to log commands into WAL where each modification
 | |
| is already logged in, hm, correct order?
 | |
| Well, it makes sense if you're looking for async replication, but
 | |
| you don't need two-phase commit for this and should be aware of the
 | |
| problems with READ COMMITTED isolevel.
 | |
| 
 | |
| Back to two-phase commit - it's the easiest part of the work required for
 | |
| distributed transaction processing.
 | |
| Currently we place a single commit record in the log and the transaction is
 | |
| committed when this record (and so all other transaction records)
 | |
| is on disk.
 | |
| Two-phase commit:
 | |
| 
 | |
| 1. For the 1st phase we'll place a "prepared-to-commit" record into the log,
 | |
|    and this phase is complete once the record is flushed to disk.
 | |
|    At this point the transaction may be committed at any time because
 | |
|    all its modifications are logged. But it may still be rolled back
 | |
|    if this phase fails on other sites of the distributed system.
 | |
| 
 | |
| 2. When all sites are prepared to commit we'll place a "committed"
 | |
|    record into the log. There is no need to flush it because, in the event of
 | |
|    a crash, the recoverer will have to contact the other sites to learn the
 | |
|    status of all "prepared" transactions anyway.
 | |
| 
 | |
| That's all! It is really hard to implement distributed lock- and
 | |
| communication- managers but there is no problem with logging two
 | |
| records instead of one. Period.
 | |
| 
 | |
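The two log records described above might be modelled like this. It is a toy sketch under stated assumptions: the `Site` class, its in-memory `log` list and the `flushed` counter are hypothetical stand-ins for WAL and fsync, and no networking, locking or crash recovery is modelled.

```python
class Site:
    """One participant; `log` stands in for its WAL (hypothetical model)."""

    def __init__(self, name, can_prepare=True):
        self.name = name
        self.can_prepare = can_prepare
        self.log = []      # records appended so far
        self.flushed = 0   # how many log records are known to be on disk

    def prepare(self, xid):
        """Phase 1: log "prepared-to-commit" and flush before answering."""
        if not self.can_prepare:
            return False
        self.log.append(("prepared-to-commit", xid))
        self.flushed = len(self.log)  # the phase is done only after the flush
        return True

    def finish(self, xid, commit):
        """Phase 2: log the outcome; no flush is forced here."""
        self.log.append(("committed" if commit else "aborted", xid))


def two_phase_commit(sites, xid):
    """Commit only if every site managed to prepare."""
    votes = [site.prepare(xid) for site in sites]
    decision = all(votes)
    for site in sites:
        site.finish(xid, decision)
    return decision
```

Note that `finish` leaves `flushed` untouched, mirroring the point that the second record need not be forced to disk.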
| Vadim
 | |
| 
 | |
| From pgsql-hackers-owner+M3665@postgresql.org Tue Jan 23 17:05:26 2001
 | |
| Received: from mail.postgresql.org (webmail.postgresql.org [216.126.85.28])
 | |
| 	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id RAA05972
 | |
| 	for <pgman@candle.pha.pa.us>; Tue, 23 Jan 2001 17:05:24 -0500 (EST)
 | |
| Received: from mail.postgresql.org (webmail.postgresql.org [216.126.85.28])
 | |
| 	by mail.postgresql.org (8.11.1/8.11.1) with SMTP id f0NM31008120;
 | |
| 	Tue, 23 Jan 2001 17:03:01 -0500 (EST)
 | |
| 	(envelope-from pgsql-hackers-owner+M3665@postgresql.org)
 | |
| Received: from candle.pha.pa.us (candle.navpoint.com [162.33.245.46])
 | |
| 	by mail.postgresql.org (8.11.1/8.11.1) with ESMTP id f0NLsU007188
 | |
| 	for <pgsql-hackers@postgresql.org>; Tue, 23 Jan 2001 16:54:30 -0500 (EST)
 | |
| 	(envelope-from pgman@candle.pha.pa.us)
 | |
| Received: (from pgman@localhost)
 | |
| 	by candle.pha.pa.us (8.9.0/8.9.0) id QAA05300;
 | |
| 	Tue, 23 Jan 2001 16:53:53 -0500 (EST)
 | |
| From: Bruce Momjian <pgman@candle.pha.pa.us>
 | |
| Message-Id: <200101232153.QAA05300@candle.pha.pa.us>
 | |
| Subject: Re: [HACKERS] Re: AW: Re: MySQL and BerkleyDB (fwd)
 | |
| In-Reply-To: <8F4C99C66D04D4118F580090272A7A234D32AF@sectorbase1.sectorbase.com>
 | |
| 	"from Mikheev, Vadim at Jan 23, 2001 01:10:34 pm"
 | |
| To: "Mikheev, Vadim" <vmikheev@SECTORBASE.COM>
 | |
| Date: Tue, 23 Jan 2001 16:53:53 -0500 (EST)
 | |
| CC: "'dom@idealx.com'" <dom@idealx.com>, pgsql-hackers@postgresql.org
 | |
| X-Mailer: ELM [version 2.4ME+ PL77 (25)]
 | |
| MIME-Version: 1.0
 | |
| Content-Transfer-Encoding: 7bit
 | |
| Content-Type: text/plain; charset=US-ASCII
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Status: OR
 | |
| 
 | |
| [ Charset ISO-8859-1 unsupported, converting... ]
 | |
| > >   I had thought that the pre-commit information could be stored in an
 | |
| > > auxiliary table by the middleware program ; we would then have
 | |
| > > to re-implement some sort of higher-level WAL (I thought of the list
 | |
| > > of the commands performed in the current transaction, with a sequence
 | |
| > > number for each of them that would guarantee correct ordering between
 | |
| > > concurrent transactions in case of a REDO). But I fear I am missing
 | |
| > 
 | |
| > This wouldn't work for READ COMMITTED isolation level.
 | |
| > But why do you want to log commands into WAL where each modification
 | |
| > is already logged in, hm, correct order?
 | |
| > Well, it has sense if you're looking for async replication but
 | |
| > you need not in two-phase commit for this and should aware about
 | |
| > problems with READ COMMITTED isolevel.
 | |
| > 
 | |
| 
 | |
| I believe the issue here is that while SERIALIZABLE ISOLATION means all
 | |
| queries can be run serially, our default is READ COMMITTED, meaning that
 | |
| open transactions see committed transactions, even if the transaction
 | |
| committed after our transaction started.  (FYI, see my chapter on
 | |
| transactions for help,  http://www.postgresql.org/docs/awbook.html.)
 | |
| 
 | |
| To do higher-level WAL, you would have to record not only the queries,
 | |
| but the other queries that were committed at the start of each command
 | |
| in your transaction.
 | |
| 
 | |
| Ideally, you could number every commit by its XID in your log, and then
 | |
| when processing the query, pass the "committed" transaction ids that
 | |
| were visible at the time each command began.
 | |
| 
 | |
| In other words, you can replay the queries in transaction commit order,
 | |
| except that you have to have some transactions committed at specific
 | |
| points while other transactions are open, i.e.:
 | |
| 
 | |
| XID	Open XIDS	Query
 | |
| 500			UPDATE t SET col = 3;
 | |
| 501	500		BEGIN;
 | |
| 501	500		UPDATE t SET col = 4;
 | |
| 501			UPDATE t SET col = 5;
 | |
| 501			COMMIT;
 | |
| 
 | |
| This is a silly example, but it shows that 500 must commit after the
 | |
| first command in transaction 501, but before the second command in the
 | |
| transaction.  This is because UPDATE t SET col = 5 actually sees the
 | |
| changes made by transaction 500 in READ COMMITTED isolation level.
 | |
| 
 | |
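To make the ordering constraint in the table concrete, here is a small Python sketch that derives, from such a log, the point at which each open transaction has to be committed during replay. The tuple layout and the name `commit_points` are assumptions for illustration only; nothing like this exists in the backend.

```python
def commit_points(log):
    """Return (xid, index) pairs: xid must be committed before log[index] runs.

    Each log entry is (xid, open_xids, query), as in the table above.
    """
    points = []
    previously_open = set()
    for index, (_xid, open_xids, _query) in enumerate(log):
        # A transaction that was open for the previous command but is no
        # longer open for this one must have committed in between.
        for done in sorted(previously_open - set(open_xids)):
            points.append((done, index))
        previously_open = set(open_xids)
    return points


log = [
    (500, [], "UPDATE t SET col = 3;"),
    (501, [500], "BEGIN;"),
    (501, [500], "UPDATE t SET col = 4;"),
    (501, [], "UPDATE t SET col = 5;"),
    (501, [], "COMMIT;"),
]
print(commit_points(log))
```

For the log above this reports that transaction 500 has to be committed before entry 3 (the second UPDATE of transaction 501) is replayed.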
| I am not advocating this.  I think WAL is a better choice.  I just
 | |
| wanted to outline how replaying the queries in commit order is 
 | |
| insufficient.
 | |
| 
 | |
| > Back to two-phase commit - it's easiest part of work required for
 | |
| > distributed transaction processing.
 | |
| > Currently we place single commit record to log and transaction is
 | |
| > committed when this record (and so all other transaction records)
 | |
| > is on disk.
 | |
| > Two-phase commit:
 | |
| > 
 | |
| > 1. For 1st phase we'll place into log "prepared-to-commit" record
 | |
| >    and this phase will be accomplished after record is flushed on disk.
 | |
| >    At this point transaction may be committed at any time because of
 | |
| >    all its modifications are logged. But it still may be rolled back
 | |
| >    if this phase failed on other sites of distributed system.
 | |
| > 
 | |
| > 2. When all sites are prepared to commit we'll place "committed"
 | |
| >    record into log. No need to flush it because of in the event of
 | |
| >    crash for all "prepared" transactions recoverer will have to
 | |
| >    communicate other sites to know their statuses anyway.
 | |
| > 
 | |
| > That's all! It is really hard to implement distributed lock- and
 | |
| > communication- managers but there is no problem with logging two
 | |
| > records instead of one. Period.
 | |
| 
 | |
| Great.
 | |
| 
 | |
| 
 | |
| -- 
 | |
|   Bruce Momjian                        |  http://candle.pha.pa.us
 | |
|   pgman@candle.pha.pa.us               |  (610) 853-3000
 | |
|   +  If your life is a hard drive,     |  830 Blythe Avenue
 | |
|   +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
 | |
| 
 | |
| From pgsql-general-owner+M805@postgresql.org Tue Nov 21 23:53:04 2000
 | |
| Received: from mail.postgresql.org (webmail.postgresql.org [216.126.85.28])
 | |
| 	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id AAA19262
 | |
| 	for <pgman@candle.pha.pa.us>; Wed, 22 Nov 2000 00:53:03 -0500 (EST)
 | |
| Received: from mail.postgresql.org (webmail.postgresql.org [216.126.85.28])
 | |
| 	by mail.postgresql.org (8.11.1/8.11.1) with SMTP id eAM5qYs47249;
 | |
| 	Wed, 22 Nov 2000 00:52:34 -0500 (EST)
 | |
| 	(envelope-from pgsql-general-owner+M805@postgresql.org)
 | |
| Received: from racerx.cabrion.com (racerx.cabrion.com [166.82.231.4])
 | |
| 	by mail.postgresql.org (8.11.1/8.11.1) with ESMTP id eAM5lJs46653
 | |
| 	for <pgsql-general@postgresql.org>; Wed, 22 Nov 2000 00:47:19 -0500 (EST)
 | |
| 	(envelope-from rob@cabrion.com)
 | |
| Received: from cabrionhome (gso163-25-211.triad.rr.com [24.163.25.211])
 | |
| 	by racerx.cabrion.com (8.8.7/8.8.7) with SMTP id AAA13731
 | |
| 	for <pgsql-general@postgresql.org>; Wed, 22 Nov 2000 00:45:20 -0500
 | |
| Message-ID: <006501c05447$fb9aa0c0$4100fd0a@cabrion.org>
 | |
| From: "rob" <rob@cabrion.com>
 | |
| To: <pgsql-general@postgresql.org>
 | |
| Subject: [GENERAL] Synchronization Toolkit
 | |
| Date: Wed, 22 Nov 2000 00:49:29 -0500
 | |
| MIME-Version: 1.0
 | |
| Content-Type: multipart/mixed;
 | |
| 	boundary="----=_NextPart_000_0062_01C0541E.125CAF30"
 | |
| X-Priority: 3
 | |
| X-MSMail-Priority: Normal
 | |
| X-Mailer: Microsoft Outlook Express 5.50.4133.2400
 | |
| X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4133.2400
 | |
| Precedence: bulk
 | |
| Sender: pgsql-general-owner@postgresql.org
 | |
| Status: OR
 | |
| 
 | |
| This is a multi-part message in MIME format.
 | |
| 
 | |
| ------=_NextPart_000_0062_01C0541E.125CAF30
 | |
| Content-Type: text/plain; charset="iso-8859-1"
 | |
| Content-Transfer-Encoding: 7bit
 | |
| 
 | |
| Not to be confused with replication, my concept of synchronization is to
 | |
| manage changes between a server table (or tables) and one or more mobile,
 | |
| disconnected databases (i.e. PalmPilot, laptop, etc.).
 | |
| 
 | |
| I read through the notes in the TODO for this topic and devised a tool kit
 | |
| for doing synchronization.  I hope that the Postgresql development community
 | |
| will find this useful and will help me refine this concept by offering
 | |
| insight, experience and some good old-fashioned hacking if you are so
 | |
| inclined.
 | |
| 
 | |
| The bottom of this message describes how to use the attached files.
 | |
| 
 | |
| I look forward to your feedback.
 | |
| 
 | |
| --rob
 | |
| 
 | |
| 
 | |
| Methodology:
 | |
| 
 | |
| I devised a concept that I call "session versioning".  This means that every
 | |
| time a row changes it does NOT get a new version.  Rather it gets stamped
 | |
| with the current session version common to all published tables.  Clients,
 | |
| when they connect for synchronization, will immediately increment this
 | |
| common version number, reserve the result as a "post version" and then
 | |
| increment the session version again.  This version number, implemented as a
 | |
| sequence, is common to all synchronized tables and rows.
 | |
| 
 | |
| Any time the server makes changes, the row gets stamped with the current
 | |
| session version; when the client posts its changes it uses the reserved
 | |
| "post version".  The client then makes all its changes, stamping the changed
 | |
| rows with its reserved "post version" rather than the current version.  The
 | |
| reason why is explained later.  It is important that the client post all its
 | |
| own changes first so that it does not end up receiving records which changed
 | |
| since its last session that it is about to update anyway.
 | |
| 
 | |
| Reserving the post version is a two step process.  First, the number is
 | |
| simply stored in a variable for later use.  Second, the value is added to a
 | |
| lock table (last_stable) to indicate to any concurrent sessions that rows
 | |
| with higher version numbers are to be considered "unstable" at the moment
 | |
| and they should not attempt to retrieve them at this time.  Each client,
 | |
| upon connection, will use the lowest value in this lock table (max_version)
 | |
| to determine the upper boundary for versions it should retrieve.  The lower
 | |
| boundary is simply the  previous session's "max_version" plus one.  Thus
 | |
| when the client retrieves changes it uses the following SQL "where"
 | |
| expression:
 | |
| 
 | |
| WHERE row_version >= max_version and row_version <= last_stable_version and
 | |
| version <> this_post_version
 | |
| 
 | |
| The point of reserving and locking a post version is important in that it
 | |
| allows concurrent synchronization by multiple clients.  The first, of many,
 | |
| clients to connect basically dictates to all future clients that they must
 | |
| not take any rows equal to or greater than the one which it just reserved
 | |
| and locked.  The reason the session version is incremented a second time is
 | |
| so that the server may continue to post changes concurrent with any client
 | |
| changes and be certain that these concurrent server changes will not taint
 | |
| rows the client is about to retrieve. Once the client is finished with its
 | |
| session it removes the lock on its post version.
 | |
| 
 | |
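A rough model of the reserve-and-lock scheme, written in Python rather than the Perl/plpgsql of the attached toolkit. Everything here is a hypothetical sketch: `VersionClock` stands in for the shared sequence plus the lock table, `rows_to_fetch` simply mirrors the "where" expression quoted above, and falling back to the current version when nothing is locked is my own assumption.

```python
import itertools


class VersionClock:
    """Shared session version plus the lock table of reserved post versions."""

    def __init__(self):
        self._seq = itertools.count(1)
        self.current = next(self._seq)
        self.locked = set()  # post versions reserved by sessions in progress

    def _bump(self):
        self.current = next(self._seq)
        return self.current

    def begin_client_session(self):
        post_version = self._bump()  # reserved for this client's changes
        self.locked.add(post_version)
        self._bump()                 # server changes now get a newer version
        return post_version

    def last_stable(self):
        # Lowest locked version; assumed to be the current version if none.
        return min(self.locked) if self.locked else self.current

    def end_client_session(self, post_version):
        self.locked.discard(post_version)


def rows_to_fetch(rows, lower, last_stable, post_version):
    """Apply the version filter: within bounds and not our own post version."""
    return [
        row for row in rows
        if lower <= row["version"] <= last_stable
        and row["version"] != post_version
    ]
```

With two clients connecting one after the other, the first reserves version 2 and the second version 4; until the first session ends, `last_stable()` stays at 2 for everyone.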
| Partitioning data for use by each node is the next challenge we face.  How
 | |
| can we control which "slice" of data each client receives?  A slice can be
 | |
| horizontal or vertical within a table.  Horizontal slices are easy,  it's
 | |
| just the where clause of an SQL statement that says "give me the rows that
 | |
| match X criteria".  We handle this by storing and appending a where clause
 | |
| to each client's retrieval statement in addition to the where clause described
 | |
| above.  Actually, two where clauses are stored and appended.  One is per
 | |
| client and one is per publication (table).
 | |
| 
 | |
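Appending the two stored where clauses to the version filter could look something like the following Python sketch. The helper name `build_where` is hypothetical and the toolkit itself may assemble its SQL differently; note also that pasting stored SQL fragments together is only reasonable when those fragments come from a trusted administrator.

```python
def build_where(version_filter, publication_where=None, client_where=None):
    """Combine the version filter with the optional per-publication and
    per-client clauses, wrapping each part in parentheses."""
    parts = [version_filter]
    parts += [clause for clause in (publication_where, client_where) if clause]
    return " AND ".join(f"({part})" for part in parts)


print(build_where("row_version >= 5", "region = 'east'", "owner = 'JOE'"))
```

The parentheses keep an OR inside one stored clause from changing the meaning of the others.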
| We defined horizontal slices by filtering rows.  Vertical slices are limits
 | |
| by column.  The tool kit does provide a mechanism for pseudo vertical
 | |
| partitioning.  When a client is "subscribed" to a publication, the toolkit
 | |
| stores what columns that node is to receive during a session.  These are
 | |
| stored in the subscribed_cols table.  While this does limit the number of
 | |
| columns transmitted, the insert/update/delete triggers do not recognize
 | |
| changes based on columns.   The "pseudo" nature of our vertical partitioning
 | |
| is evident by example:
 | |
| 
 | |
| Say you have a table with name, address and phone number as columns.  You
 | |
| restrict a client to see only name and address.  This means that phone
 | |
| number information will not be sent to the client during synchronization,
 | |
| and the client can't attempt to alter the phone number of a given entry.
 | |
| Great, but . . . if, on the server, the phone number (but not the name or
 | |
| address) is changed, the entire row gets marked with a new version.  This
 | |
| means that the name and address will get sent to the client even though they
 | |
| didn't change.
 | |
| 
 | |
| Well, there's the flaw in vertical partitioning.  Other than wasting
 | |
| bandwidth, the extra row does no harm to the process.  The workaround for
 | |
| this is to highly normalize your schema when possible.
 | |
| 
 | |
| Collisions are the next crux one encounters with synchronization.  When two
 | |
| clients retrieve the same row and both make (different) changes, which one is
 | |
| correct?  So far the system operates totally independent of time.  This is
 | |
| good because it doesn't rely on the server or client to keep accurate time.
 | |
| We can just ignore time altogether, but then we force our clients to
 | |
| synchronize on a strict schedule in order to avoid (or reduce) collisions.
 | |
| If every node synchronized immediately after making changes we could just
 | |
| stop here.  Unfortunately this isn't reality.  Reality dictates that of two
 | |
| clients: Client A & B will each pick up the same record on Monday.  A will
 | |
| make changes on Monday, then leave for vacation.  B will make changes on
 | |
| Wednesday because new information was gathered in A's absence.  Client B
 | |
| posts those changes Wednesday.  Meanwhile, client A returns from vacation on
 | |
| Friday and synchronizes his changes.  A overwrites B's changes even though
 | |
| A made changes before the most recent information was posted by B.
 | |
| 
 | |
| It is clear that we need some form of time stamp to cope with the above
 | |
| example.  While clocks aren't the most reliable, they are the only common
 | |
| version control available to solve this problem.  The system is set up to
 | |
| accept (but not require) timestamps from clients and changes on the server
 | |
| are time stamped.  The system, when presented a time stamp with a row, will
 | |
| compare them to figure out who wins in a tie.   The system makes certain
 | |
| "sanity" checks with regard to these time stamps.  A client may not attempt
 | |
| to post a change with a timestamp that is more than one hour in the future
 | |
| (according to what the server thinks "now" is) nor one hour before its last
 | |
| synchronization date/time.  The client row will be immediately placed into
 | |
| the collision table if the timestamp is that far out of whack.
 | |
| Implementations of the tool kit should take care to ensure that client &
 | |
| server agree on what "now" is before attempting to submit changes with
 | |
| timestamps.
 | |
| 
 | |
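The timestamp sanity check described above amounts to a simple window test. A minimal Python sketch, assuming the one-hour tolerance applies on both ends as stated; the function name and the `TOLERANCE` constant are hypothetical and not taken from SynKit.pm.

```python
from datetime import datetime, timedelta

TOLERANCE = timedelta(hours=1)


def timestamp_is_sane(change_time, server_now, last_sync):
    """True if the client's timestamp is neither more than one hour in the
    future (by the server's clock) nor more than one hour before the
    client's last synchronization."""
    return last_sync - TOLERANCE <= change_time <= server_now + TOLERANCE
```

A row whose timestamp fails this test would go straight to the collision table instead of being compared against the server's copy.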
| Time stamps are not required.  Should a client be incapable of tracking
 | |
| timestamps, etc., the system will assume that any server row which has been
 | |
| changed since the client's last session will win a tie.  This is quite error
 | |
| prone, so timestamps are encouraged where possible.
 | |
| 
 | |
| Inserts pose an interesting challenge.  Since multiple clients cannot share
 | |
| a sequence (often used as a primary key) while disconnected, they will be
 | |
| responsible for their own unique "row_id" when inserting records.   Inserts
 | |
| accept any arbitrary key, and write back to the client a special kind of
 | |
| update that gives the server's row_id.  The client is responsible for making
 | |
| sure that this update takes place locally.
 | |
| 
 | |
| Deletes are the last portion of the process.  When deletes occur, the
 | |
| row_id, version, etc. are stored in a "deleted" table.  These entries are
 | |
| retrieved by the client using the same version filter as described above.
 | |
| The table is pruned at the end of each session by deleting all records with
 | |
| versions that are less than the lowest 'last_version' stored for each
 | |
| client.
 | |
| 
 | |
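The pruning rule for the "deleted" table can be stated in a few lines of Python. This is a sketch only; the dictionary shapes and the name `prune_deleted` are made up for illustration, and the real toolkit does this in SQL.

```python
def prune_deleted(deleted_rows, last_version_by_client):
    """Keep only the delete records some client may still need.

    A record can go once its version is below the lowest 'last_version'
    recorded for any client, since every client has then already seen it.
    """
    if not last_version_by_client:
        return list(deleted_rows)  # no subscribers known: prune nothing
    cutoff = min(last_version_by_client.values())
    return [row for row in deleted_rows if row["version"] >= cutoff]
```

Keeping everything when no client is recorded is a conservative assumption on my part, not something the description above specifies.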
| Having wrapped up the synchronization process, I'll move on to describe some
 | |
| points about managing clients, publications and the like.
 | |
| 
 | |
| The tool kit is split into two objects: SyncManagement and Synchronization.
 | |
| The Synchronization object exposes an API that client implementations use to
 | |
| communicate and receive changes.  The management functions handle system
 | |
| install and uninstall in addition to publication of tables and client
 | |
| subscriptions.
 | |
| 
 | |
| Installation and uninstallation are handled by their corresponding functions
 | |
| in the API.  All system tables are prefixed and suffixed with four
 | |
| underscores, in hopes that this avoids conflict with any existing tables.
 | |
| Calling the install function more than once will generate an error message.
 | |
| Uninstall will remove all related tables, sequences,  functions and triggers
 | |
| from the system.
 | |
| 
 | |
| The first step, after installing the system, is to publish a table.  A table
 | |
| can be published more than once under different names.  Simply provide a
 | |
| unique name as the second argument to the publish function.  Since object
 | |
| names are restricted to 32 characters in Postgres, each table is given a
 | |
| unique id and this id is used to create the trigger and sequence names.
 | |
| Since one table can be published multiple times, but only needs one set of
 | |
| triggers and one sequence for change management a reference count is kept so
 | |
| that we know when to add/drop triggers and functions.  By default, all
 | |
| columns are published, but the third argument to the publish function
 | |
| accepts an array reference of column names that allows you to specify a
 | |
| limited set.  Information about the table is stored in the "tables" table,
 | |
| info about the publication is in the "publications" table and column names
 | |
| are stored in "subscribed_cols" table.
 | |
| 
 | |
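The reference counting mentioned above (one set of triggers per table, however many publications point at it) could be sketched in Python as below. `TriggerRegistry` and its `events` list are hypothetical; the list merely records when triggers would be created or dropped so the behaviour can be inspected.

```python
class TriggerRegistry:
    """Track how many publications use each table (illustrative model)."""

    def __init__(self):
        self.refcount = {}
        self.events = []  # what a real implementation would do in the database

    def publish(self, table):
        count = self.refcount.get(table, 0)
        if count == 0:
            self.events.append(("create_triggers", table))
        self.refcount[table] = count + 1

    def unpublish(self, table):
        count = self.refcount.get(table, 0)
        if count == 0:
            raise KeyError(f"{table} is not published")
        if count == 1:
            del self.refcount[table]
            self.events.append(("drop_triggers", table))
        else:
            self.refcount[table] = count - 1
```

Publishing the same table twice creates the triggers once, and they are dropped only when the last publication goes away.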
| The next step is to subscribe a client to a table.  A client is identified
 | |
| by a user name and a node name.  The subscribe function takes three
 | |
| arguments: user, node & publication.  The subscription process writes an
 | |
| entry into the "subscribed" table with default values.  Of note, the
 | |
| "RefreshOnce" attribute is set to true whenever a table is published.  This
 | |
| indicates to the system that a full table refresh should be sent the next
 | |
| time the client connects even if the client requests synchronization rather
 | |
| than refresh.
 | |
| 
 | |
| The toolkit does not, yet, provide a way to manage the whereclause stored at
 | |
| either the publication or client level.  To use or test this feature, you
 | |
| will need to set the whereclause attributes manually.
 | |
| 
 | |
| Tables and users can be unpublished and unsubscribed using the corresponding
 | |
| functions within the tool kit's management interface.  Because postgres
 | |
| lacks an "ALTER TABLE DROP COLUMN" function, the unpublish function only
 | |
| removes default values and indexes for those columns.
 | |
| 
 | |
| The API isn't the most robust thing in the world right now.  All functions
 | |
| return undef on success and an error string otherwise (like DBD).  I hope to
 | |
| clean up the API considerably over the next month.  The code has not been
 | |
| field tested at this time.
 | |
| 
 | |
| 
 | |
| The files attached are:
 | |
| 
 | |
| 1) SynKit.pm (A perl module that contains install/uninstall functions and a
 | |
| simple api for synchronization & management)
 | |
| 
 | |
| 2) sync_install.pl (Sample code to demonstrate the installation, publishing
 | |
| and subscribe process)
 | |
| 
 | |
| 3) sync_uninstall.pl (Sample code to demonstrate the uninstallation,
 | |
| unpublishing and unsubscribe process)
 | |
| 
 | |
| 
 | |
| To use them on Linux (don't know about Win32 but should work fine):
 | |
| 
 | |
|  - set up a test database and make SURE plpgsql is installed
 | |
| 
 | |
|  - install perl 5.05 along with Date::Parse(TimeDate-1.1) , DBI and DBD::Pg
 | |
| modules [www.cpan.org]
 | |
| 
 | |
|  - copy all three attached files to a test directory
 | |
| 
 | |
|  - cd to your test directory
 | |
| 
 | |
|  - edit all three files and change the three DBI variables to suit your
 | |
| system (they are clearly marked)
 | |
| 
 | |
|  - % perl sync_install.pl
 | |
| 
 | |
|  - check out the tables, functions & triggers installed
 | |
| 
 | |
|  - % perl sync.pl
 | |
| 
 | |
|  - check out the 'sync_test' table, do some updates/inserts/deletes and run
 | |
| sync.pl again
 | |
|         NOTE: Sanity checks default to allow no more than 50% of the table
 | |
| to be changed by the client in a single session.
 | |
|         If you delete all (or most of) the rows  you will get errors when
 | |
| you run sync.pl again! (by design)
 | |
| 
 | |
|  - % perl sync_uninstall.pl  (when you are done)
 | |
| 
 | |
|  - check out  the sample scripts and the perl module code (commented, but
 | |
| not documented)
 | |
| 
 | |
| 
 | |
| 
 | |
| 
 | |
| 
 | |
| 
 | |
| ------=_NextPart_000_0062_01C0541E.125CAF30
 | |
| Content-Type: application/octet-stream; name="sync.pl"
 | |
| Content-Transfer-Encoding: quoted-printable
 | |
| Content-Disposition: attachment; filename="sync.pl"
 | |
| 
 | |
| 
 | |
| 
 | |
| # This script depicts the synchronization process for two users.
 | |
| 
 | |
| 
 | |
| ##  CHANGE THESE THREE VARIABLE TO MATCH YOUR SYSTEM  ###########
 | |
| my $dbi_connect_string = 'dbi:Pg:dbname=test;host=snoopy';	#
 | |
| my $db_user = 'test';						#
 | |
| my $db_pass = 'test';						#
 | |
| #################################################################
 | |
| 
 | |
| my $ret; #holds return value
 | |
| 
 | |
| use SynKit;
 | |
| 
 | |
| #create a synchronization object (pass dbi connection info)
 | |
| my $s = Synchronize->new($dbi_connect_string,$db_user,$db_pass);
 | |
| 
 | |
| #start a session by passing a user name, "node" identifier and a collision queue name (client or server)
 | |
| $ret = $s->start_session('JOE','REMOTE_NODE_NAME','server');
 | |
| print "Handle this error: $ret\n\n" if $ret;
 | |
| 
 | |
| #call this once before attempting to apply individual changes
 | |
| $ret = $s->start_changes('sync_test',['name']);
 | |
| print "Handle this error: $ret\n\n" if $ret;
 | |
| 
 | |
| #call this for each change the client wants to make to the database
 | |
| $ret = $s->apply_change(CLIENTROWID,'insert',undef,['ted']);
 | |
| print "Handle this error: $ret\n\n" if $ret;
 | |
| 
 | |
| #call this for each change the client wants to make to the database
 | |
| $ret = $s->apply_change(CLIENTROWID,'insert','1973-11-10 11:25:00 AM -05',['tim']);
 | |
| print "Handle this error: $ret\n\n" if $ret;
 | |
| 
 | |
| #call this for each change the client wants to make to the database
 | |
| $ret = $s->apply_change(999,'update',undef,['tom']);
 | |
| print "Handle this error: $ret\n\n" if $ret;
 | |
| 
 | |
| #call this for each change the client wants to make to the database
 | |
| $ret = $s->apply_change(1,'update',undef,['tom']);
 | |
| print "Handle this error: $ret\n\n" if $ret;
 | |
| 
 | |
| #call this once after all changes have been submitted
 | |
| $ret = $s->end_changes();
 | |
| print "Handle this error: $ret\n\n" if $ret;
 | |
| 
 | |
| #call this to get updates from all subscribed tables
 | |
| $ret = $s->get_all_updates();
 | |
| print "Handle this error: $ret\n\n" if $ret;
 | |
| 
 | |
| print "\n\nSynchronization session is complete. (JOE) \n\n";
 | |
| 
 | |
| 
 | |
| # make some changes to the database (server perspective)
 | |
| 
 | |
| print "\n\nMaking changes to the database. (server side) \n\n";
 | |
| 
 | |
| use DBI;
 | |
| my $dbh = DBI->connect($dbi_connect_string,$db_user,$db_pass);
 | |
| 
 | |
| $dbh->do("insert into sync_test values ('roger')");
 | |
| $dbh->do("insert into sync_test values ('john')");
 | |
| $dbh->do("insert into sync_test values ('harry')");
 | |
| $dbh->do("delete from sync_test where name = 'roger'");
 | |
| $dbh->do("update sync_test set name = 'tom' where name = 'harry'");
 | |
| 
 | |
| $dbh->disconnect;
 | |
| 
 | |
| 
 | |
| #now do another session for a different user
 | |
| 
 | |
| #start a session by passing a user name, "node" identifier and a collision queue name (client or server)
 | |
| $ret = $s->start_session('KEN','ANOTHER_REMOTE_NODE_NAME','server');
 | |
| print "Handle this error: $ret\n\n" if $ret;
 | |
| 
 | |
| #call this to get updates from all subscribed tables
 | |
| $ret = $s->get_all_updates();
 | |
| print "Handle this error: $ret\n\n" if $ret;
 | |
| 
 | |
| print "\n\nSynchronization session is complete. (KEN)\n\n";
 | |
| 
 | |
| print "Now look at your database and see what happened, make changes to the test table, etc. and run this again.\n\n";
 | |
| 
 | |
| ------=_NextPart_000_0062_01C0541E.125CAF30
 | |
| Content-Type: application/octet-stream; name="sync_uninstall.pl"
 | |
| Content-Transfer-Encoding: quoted-printable
 | |
| Content-Disposition: attachment; filename="sync_uninstall.pl"
 | |
| 
 | |
| 
 | |
| # this script uninstalls the synchronization system using the SyncManager object;
 | |
| 
 | |
| use SynKit;
 | |
| 
 | |
| ###  CHANGE THESE TO MATCH YOUR SYSTEM   ########################
 | |
| my $dbi_connect_string =3D 'dbi:Pg:dbname=3Dtest;host=3Dsnoopy';	#
 | |
| my $db_user =3D 'test';						#
 | |
| my $db_pass =3D 'test';						#
 | |
| #################################################################
 | |
| 
 | |
| 
 | |
| my $ret; #holds return value
 | |
| 
 | |
| #create an instance of the SyncManager object
 | |
| my $m =3D SyncManager->new($dbi_connect_string,$db_user,$db_pass);
 | |
| 
 | |
| # call this to unsubscribe a user/node (not necessary if you are uninstalli=
 | |
| ng)
 | |
| print $m->unsubscribe('KEN','ANOTHER_REMOTE_NODE_NAME','sync_test');
 | |
| 
 | |
| #call this to unpublish a table (not necessary if you are uninstalling)
 | |
| print $m->unpublish('sync_test');
 | |
| 
 | |
| #call this to uninstall the synchronization system
 | |
| #  NOTE: this will automatically unpublish & unsubscribe all users
 | |
| print $m->UNINSTALL;
 | |
| 
 | |
| # now let's drop our little test table
 | |
| use DBI;
 | |
| my $dbh =3D DBI->connect($dbi_connect_string,$db_user,$db_pass);
 | |
| $dbh->do("drop table sync_test");
 | |
| $dbh->disconnect;
 | |
| 
 | |
| print "\n\nI hope you enjoyed this little demonstration\n\n";
 | |
| 
 | |
| 
 | |
| 
 | |
| ------=_NextPart_000_0062_01C0541E.125CAF30
 | |
| Content-Type: application/octet-stream; name="sync_install.pl"
 | |
| Content-Transfer-Encoding: quoted-printable
 | |
| Content-Disposition: attachment; filename="sync_install.pl"
 | |
| 
 | |
| 
 | |
| # This script shows how to install the synchronization system=20
 | |
| # using the SyncManager object
 | |
| 
 | |
| use SynKit;
 | |
| 
 | |
| ### CHANGE THESE TO MATCH YOUR SYSTEM  ##########################
 | |
| my $dbi_connect_string =3D 'dbi:Pg:dbname=3Dtest;host=3Dsnoopy';	#
 | |
| my $db_user =3D 'test';						#
 | |
| my $db_pass =3D 'test';						#
 | |
| #################################################################
 | |
| my $ret; #holds return value
 | |
| 
 | |
| 
 | |
| #create an instance of the sync manager object
 | |
| my $m =3D SyncManager->new($dbi_connect_string,$db_user,$db_pass);
 | |
| 
 | |
| #Call this to install the synchronization management tables, etc.
 | |
| $ret =3D $m->INSTALL;
 | |
| die "Handle this error: $ret\n\n" if $ret;
 | |
| 
 | |
| 
 | |
| 
 | |
| #create a test table for us to demonstrate with
 | |
| use DBI;
 | |
| my $dbh =3D DBI->connect($dbi_connect_string,$db_user,$db_pass);
 | |
| $dbh->do("create table sync_test (name text)");
 | |
| $dbh->do("insert into sync_test values ('rob')");
 | |
| $dbh->do("insert into sync_test values ('rob')");
 | |
| $dbh->do("insert into sync_test values ('rob')");
 | |
| $dbh->do("insert into sync_test values ('ted')");
 | |
| $dbh->do("insert into sync_test values ('ted')");
 | |
| $dbh->do("insert into sync_test values ('ted')");
 | |
| $dbh->disconnect;
 | |
| 
 | |
| 
 | |
| 
 | |
| 
 | |
| #call this to "publish" a table
 | |
| $ret =3D $m->publish('sync_test');
 | |
| print "Handle this error: $ret\n\n" if $ret;
 | |
| 
 | |
| #call this to "subscribe" a user/node to a publication (table)
 | |
| $ret =3D $m->subscribe('JOE','REMOTE_NODE_NAME','sync_test');
 | |
| print "Handle this error: $ret\n\n" if $ret;
 | |
| 
 | |
| #call this to "subscribe" a user/node to a publication (table)
 | |
| $ret =3D $m->subscribe('KEN','ANOTHER_REMOTE_NODE_NAME','sync_test');
 | |
| print "Handle this error: $ret\n\n" if $ret;
 | |
| 
 | |
| 
 | |
| print "Now you can do: 'perl sync.pl' a few times to play\n\n";
 | |
| print "Do 'perl sync_uninstall.pl' to uninstall the system\n";
 | |
| 
 | |
| 
 | |
| ------=_NextPart_000_0062_01C0541E.125CAF30
 | |
| Content-Type: application/octet-stream; name="SynKit.pm"
 | |
| Content-Transfer-Encoding: quoted-printable
 | |
| Content-Disposition: attachment; filename="SynKit.pm"
 | |
| 
 | |
| # Perl DB synchronization toolkit
 | |
| 
 | |
| #created for postgres 7.0.2 +
 | |
| use strict;
 | |
| 
 | |
| BEGIN {
 | |
|         use vars       qw($VERSION);
 | |
|         # set the version for version checking
 | |
|         $VERSION     =3D 1.00;
 | |
| }
 | |
| 
 | |
| 
 | |
| package Synchronize;
 | |
| 
 | |
| use DBI;
 | |
| 
 | |
| use Date::Parse;
 | |
| 
 | |
| # new requires 3 arguments: dbi connection string, plus the corresponding u=
 | |
| sername and password to get connected to the database
 | |
| sub new {
 | |
| 	my $proto =3D shift;
 | |
| 	my $class =3D ref($proto) || $proto;
 | |
| 	my $self =3D {};
 | |
| 
 | |
| 	my $dbi =3D shift;
 | |
| 	my $user =3D shift;
 | |
| 	my $pass =3D shift;
 | |
| 
 | |
| 	$self->{DBH} =3D DBI->connect($dbi,$user,$pass) || die "Failed to connect =
 | |
| to database: ".DBI->errstr();
 | |
| 
 | |
| 	$self->{user} =3D undef;
 | |
| 	$self->{node} =3D undef;
 | |
| 	$self->{status} =3D undef; # holds status of table update portion of sessi=
 | |
| on
 | |
| 	$self->{pubs} =3D {}; #holds hash of pubs available to session with val =
 | |
| =3D 1 if ok to request sync
 | |
| 	$self->{orderpubs} =3D undef; #holds array ref of subscribed pubs ordered =
 | |
| by sync_order
 | |
| 	$self->{this_post_ver} =3D undef; #holds the version number under which th=
 | |
| is session will post changes
 | |
| 	$self->{max_ver} =3D undef; #holds the maximum safe version for getting up=
 | |
| dates
 | |
| 	$self->{current} =3D {}; #holds the current publication info to which chan=
 | |
| ges are being applied
 | |
| 	$self->{queue} =3D 'server'; # tells collide function what to do with coll=
 | |
| isions. (default is to hold on server)
 | |
| 
 | |
| 	$self->{DBLOG}=3D DBI->connect($dbi,$user,$pass) || die "cannot log to DB:=
 | |
|  ".DBI->errstr();=20
 | |
| 
 | |
| 
 | |
| 	return bless ($self, $class);
 | |
| }
 | |
| 
 | |
| sub dblog {=20
 | |
| 	my $self =3D shift;
 | |
| 	my $msg =3D $self->{DBLOG}->quote($_[0]);
 | |
| 	my $quser =3D $self->{DBH}->quote($self->{user});
 | |
| 	my $qnode =3D $self->{DBH}->quote($self->{node});
 | |
| 	$self->{DBLOG}->do("insert into ____sync_log____ (username, nodename,stamp=
 | |
| , message) values($quser, $qnode, now(), $msg)");
 | |
| }
 | |
| 
 | |
| 
 | |
| #start_session establishes session wide information and other housekeeping =
 | |
| chores
 | |
| 	# Accepts username, nodename and queue (client or server) as arguments;
 | |
| 
 | |
| sub start_session {
 | |
| 	my $self =3D shift;
 | |
| 	$self->{user} =3D shift || die 'Username is required';
 | |
| 	$self->{node} =3D shift || die 'Nodename is required';
 | |
| 	$self->{queue} =3D shift;
 | |
| 
 | |
| 
 | |
| 	if ($self->{queue} ne 'server' && $self->{queue} ne 'client') {
 | |
| 		die "You must provide a queue argument of either 'server' or 'client'";
 | |
| 	}
 | |
| 
 | |
| 	my $quser =3D $self->{DBH}->quote($self->{user});
 | |
| 	my $qnode =3D $self->{DBH}->quote($self->{node});
 | |
| 
 | |
| 	my $sql =3D "select pubname from ____subscribed____ where username =3D $qu=
 | |
| ser and nodename =3D $qnode";
 | |
| 	my @pubs =3D $self->GetColList($sql);
 | |
| 
 | |
| 	return 'User/Node has no subscriptions!' if !@pubs;
 | |
| 
 | |
| 	# go through the list and check permissions and rules for each
 | |
| 	foreach my $pub (@pubs) {
 | |
| 		my $qpub =3D $self->{DBH}->quote($pub);
 | |
| 		my $sql =3D "select disabled, pubname, fullrefreshonly, refreshonce,post_=
 | |
| ver from ____subscribed____ where username =3D $quser and pubname =3D $qpub=
 | |
|  and nodename =3D $qnode";
 | |
| 		my $sth =3D $self->{DBH}->prepare($sql) || die $self->{DBH}->errstr;
 | |
| 		$sth->execute || die $self->{DBH}->errstr;
 | |
| 		my @row;
 | |
| 		while (@row =3D $sth->fetchrow_array) {
 | |
| 			next if $row[0]; #publication is disabled
 | |
| 			next if !defined($row[1]); #publication does not exist (should never occ=
 | |
| ur)
 | |
| 			if ($row[2] || $row[3]) { #full-refresh-only or refresh-once flag is set
 | |
| 				$self->{pubs}->{$pub} =3D 0; #refresh only
 | |
| 				next;
 | |
| 			}
 | |
| 			if (!defined($row[4])) { #no previous session exists, must refresh
 | |
| 				$self->{pubs}->{$pub} =3D 0; #refresh only
 | |
| 				next;
 | |
| 			}
 | |
| 			$self->{pubs}->{$pub} =3D 1; #OK for sync
 | |
| 		}
 | |
| 		$sth->finish;
 | |
| 	}
 | |
| 
 | |
| 
 | |
| 	$sql =3D "select pubname from ____publications____ order by sync_order";
 | |
| 	my @op =3D $self->GetColList($sql);
 | |
| 	my @orderpubs;
 | |
| 
 | |
| 	#loop through ordered pubs and remove non subscribed publications
 | |
| 	foreach my $pub (@op) {
 | |
| 		push @orderpubs, $pub if defined($self->{pubs}->{$pub});
 | |
| 	}
 | |
| =09
 | |
| 	$self->{orderpubs} =3D \@orderpubs;
 | |
| 
 | |
| # Now we obtain a session version number, etc.
 | |
| 
 | |
| 	$self->{DBH}->{AutoCommit} =3D 0; #allows "transactions"
 | |
| 	$self->{DBH}->{RaiseError} =3D 1; #script [or eval] will automatically die=
 | |
|  on errors
 | |
| 
 | |
| 	eval { #start DB transaction
 | |
| 
 | |
| 	#lock the version sequence until we determine that we have gotten
 | |
| 	#a good  value.  Lock will be released on commit.
 | |
| 		$self->{DBH}->do('lock ____version_seq____ in access exclusive mode');
 | |
| 
 | |
| 	# remove stale locks if they exist
 | |
| 		my $sql =3D "delete from ____last_stable____ where username =3D $quser an=
 | |
| d nodename =3D $qnode";
 | |
| 		$self->{DBH}->do($sql);
 | |
| 
 | |
| 	# increment version sequence & grab the next val as post_ver
 | |
| 		$sql =3D "select nextval('____version_seq____')";
 | |
| 		my $sth =3D $self->{DBH}->prepare($sql);
 | |
| 		$sth->execute;
 | |
| 		($self->{this_post_ver}) =3D $sth->fetchrow_array();
 | |
| 		$sth->finish;
 | |
| 	# grab max_ver from last_stable
 | |
| 
 | |
| 		$sql =3D "select min(version) from ____last_stable____";=20
 | |
| 		$sth =3D $self->{DBH}->prepare($sql);
 | |
| 		$sth->execute;
 | |
| 		($self->{max_ver}) =3D $sth->fetchrow_array();
 | |
| 		$sth->finish;
 | |
| 
 | |
| 	# if there was no version in lock table, then take the ID that was in use
 | |
| 	# when we started the session (this_post_ver - 1)
 | |
| 
 | |
| 		$self->{max_ver} =3D $self->{this_post_ver} -1 if (!defined($self->{max_v=
 | |
| er}));
 | |
| 
 | |
| 	# lock post_ver by placing it in last_stable
 | |
| 		$self->{DBH}->do("insert into ____last_stable____ (version, username, nod=
 | |
| ename) values ($self->{this_post_ver}, $quser,$qnode)");
 | |
| 
 | |
| 	# increment version sequence again (discard result)
 | |
| 		$sql =3D "select nextval('____version_seq____')";
 | |
| 		$sth =3D $self->{DBH}->prepare($sql);
 | |
| 		$sth->execute;
 | |
| 		$sth->fetchrow_array();
 | |
| 		$sth->finish;
 | |
| 
 | |
| 	}; #end eval/transaction
 | |
| 
 | |
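| 	if ($@) { # part of transaction failed
 | |
| 		$self->{DBH}->rollback; # undo the partial work before reporting
 | |
| 		$self->{DBH}->{AutoCommit} =3D 1;
 | |
| 		$self->{DBH}->{RaiseError} =3D 0;
 | |
| 		return 'Start session failed';
 | |
| 	} else { # all's well commit block
 | |
| 		$self->{DBH}->commit;
 | |
| 	}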
 | |
| 	$self->{DBH}->{AutoCommit} =3D 1;
 | |
| 	$self->{DBH}->{RaiseError} =3D 0;
 | |
| 
 | |
| 	return undef;
 | |
| 
 | |
| }
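
The version bookkeeping that start_session performs inside its transaction can be tried out without a database. The following is only a minimal sketch: the in-memory variables $version_seq and %last_stable, and the subs open_session and close_session, are hypothetical stand-ins for ____version_seq____, ____last_stable____ and the module's own methods. It mirrors the order of the steps above and makes no claim about locking or concurrency.

```perl
use strict;
use warnings;

# In-memory stand-ins (assumptions) for ____version_seq____ and
# ____last_stable____; the real module keeps both in PostgreSQL.
my $version_seq = 0;
my %last_stable;    # "user\tnode" => version locked by that session

sub open_session {
    my ($user, $node) = @_;
    my $key = "$user\t$node";

    delete $last_stable{$key};        # remove a stale lock, if any
    my $post_ver = ++$version_seq;    # nextval(): version we post under

    # Smallest version locked by any open session: min(version).
    my ($max_ver) = sort { $a <=> $b } values %last_stable;
    # Lock table empty: fall back to the version before our own.
    $max_ver = $post_ver - 1 unless defined $max_ver;

    $last_stable{$key} = $post_ver;   # lock our posting version
    ++$version_seq;                   # second nextval(), result discarded
    return ($post_ver, $max_ver);
}

sub close_session {
    my ($user, $node) = @_;
    delete $last_stable{"$user\t$node"};   # what DESTROY does in SQL
}

my @joe = open_session('JOE', 'REMOTE_NODE_NAME');
my @ken = open_session('KEN', 'ANOTHER_REMOTE_NODE_NAME');
print "JOE posts as $joe[0], reads up to $joe[1]\n";
print "KEN posts as $ken[0], reads up to $ken[1]\n";
```

Each session consumes two sequence values, so posting versions of consecutive sessions are never adjacent.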
 | |
| 
 | |
| #start changes should be called once before applying individual change requ=
 | |
| ests
 | |
| 	# Requires publication and ref to columns that will be updated as arguments
 | |
| sub start_changes {
 | |
| 	my $self =3D shift;
 | |
| 	my $pub =3D shift || die 'Publication is required';
 | |
| 	my $colref =3D shift || die 'Reference to column array is required';
 | |
| 
 | |
| 	$self->{status} =3D 'starting';
 | |
| 
 | |
| 	my $qpub =3D $self->{DBH}->quote($pub);
 | |
| 	my $quser =3D $self->{DBH}->quote($self->{user});
 | |
| 	my $qnode =3D $self->{DBH}->quote($self->{node});
 | |
| 
 | |
| 	my @cols =3D @{$colref};
 | |
| 	my @subcols =3D $self->GetColList("select col_name from ____subscribed_col=
 | |
| s____ where username =3D $quser and nodename =3D $qnode and pubname =3D $qp=
 | |
| ub");
 | |
| 	my %subcols;
 | |
| 	foreach my $col (@subcols) {
 | |
| 		$subcols{$col} =3D 1;
 | |
| 	}
 | |
| 	foreach my $col (@cols) {=09
 | |
| 		return "User/node is not subscribed to column '$col'" if !$subcols{$col};
 | |
| 	}
 | |
| 
 | |
| 	my $sql =3D "select pubname, readonly, last_session, post_ver, last_ver, w=
 | |
| hereclause, sanity_limit,=20
 | |
| sanity_delete, sanity_update, sanity_insert from ____subscribed____ where u=
 | |
| sername =3D $quser and pubname =3D $qpub and nodename =3D $qnode";
 | |
| 	my ($junk, $readonly, $last_session, $post_ver, $last_ver, $whereclause, $=
 | |
| sanity_limit,=20
 | |
| $sanity_delete, $sanity_update, $sanity_insert) =3D $self->GetOneRow($sql);
 | |
| =09
 | |
| 	return 'Publication is read only' if $readonly;
 | |
| 
 | |
| 	$sql =3D "select whereclause from ____publications____ where pubname =3D $=
 | |
| qpub";
 | |
| 	my ($wc) =3D $self->GetOneRow($sql);
 | |
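| 	$whereclause =3D '('.$whereclause.')' if $whereclause;
 | |
| 	$wc =3D '('.$wc.')' if $wc;
 | |
| 	# join only the clauses that exist, so no stray leading "and" is produced
 | |
| 	$whereclause =3D join(' and ', grep { $_ } $whereclause, $wc);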
 | |
| 
 | |
| 	my ($table) =3D $self->GetOneRow("select tablename from ____publications__=
 | |
| __ where pubname =3D $qpub");
 | |
| 
 | |
| 	return 'Publication is not registered correctly' if !defined($table);
 | |
| 
 | |
| 	my %info;
 | |
| 	$info{pub} =3D $pub;
 | |
| 	$info{whereclause} =3D $whereclause;
 | |
| 	$info{post_ver} =3D $post_ver;
 | |
| 	$last_session =3D~ s/([+-]\d\d?)$/ $1/;	#put a space before timezone=09
 | |
| 	$last_session =3D str2time ($last_session); #convert to perltime (seconds =
 | |
| since 1970)
 | |
| 	$info{last_session} =3D $last_session;
 | |
| 	$info{last_ver} =3D $last_ver;
 | |
| 	$info{table}  =3D $table;
 | |
| 	$info{cols} =3D \@cols;
 | |
| 
 | |
| 	$sql =3D "select count(oid) from $table";
 | |
| 	$sql =3D $sql .' where '.$whereclause if $whereclause;
 | |
| 	my ($rowcount) =3D $self->GetOneRow($sql);
 | |
| 
 | |
| 	#calculate sanity levels (convert from % to number of rows)
 | |
| 	# limits defined as less than 1 mean no limit
 | |
| 	$info{sanitylimit} =3D $rowcount * ($sanity_limit / 100) if $sanity_limit =
 | |
| > 0;
 | |
| 	$info{insertlimit} =3D $rowcount * ($sanity_insert / 100) if $sanity_inser=
 | |
| t > 0;
 | |
| 	$info{updatelimit} =3D $rowcount * ($sanity_update / 100) if $sanity_updat=
 | |
| e > 0;
 | |
| 	$info{deletelimit} =3D $rowcount * ($sanity_delete / 100) if $sanity_delet=
 | |
| e > 0;
 | |
| 
 | |
| 	$self->{sanitycount} =3D 0;
 | |
| 	$self->{updatecount} =3D 0;
 | |
| 	$self->{insertcount} =3D 0;
 | |
| 	$self->{deletecount} =3D 0;
 | |
| 
 | |
| 	$self->{current} =3D \%info;
 | |
| 
 | |
| 	$self->{DBH}->{AutoCommit} =3D 0; #turn on transaction behavior so we can =
 | |
| roll back on sanity limits, etc.
 | |
| 
 | |
| 	$self->{status} =3D 'ready';
 | |
| 
 | |
| 	return undef;
 | |
| }
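
The sanity limits are stored as percentages and turned into row counts once per publication. As a rough illustration only, the arithmetic can be isolated like this; sanity_rows and exceeds are hypothetical helper names, not part of SynKit, and the sketch follows the "> 0" test used in the code rather than the "less than 1" wording of the comment.

```perl
use strict;
use warnings;

# Convert a percentage limit into a row count, as start_changes does.
# An undefined or non-positive percentage means "no limit" (undef).
sub sanity_rows {
    my ($rowcount, $percent) = @_;
    return undef unless defined $percent && $percent > 0;
    return $rowcount * ($percent / 100);
}

# True (1) once the running change count has passed the limit.
# A missing limit never trips, matching the checks in apply_change.
sub exceeds {
    my ($count, $limit) = @_;
    return ($limit && $count > $limit) ? 1 : 0;
}

my $limit = sanity_rows(200, 25);    # 25% of a 200-row table
print "limit is $limit rows\n";
print "50 changes accepted\n" unless exceeds(50, $limit);
print "51 changes rejected\n" if exceeds(51, $limit);
```

Because the row count is taken when start_changes runs, the limit does not move while the client's changes are being applied.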
 | |
| 
 | |
| #call this once all changes are submitted to commit them;
 | |
| sub end_changes {
 | |
| 	my $self =3D shift;
 | |
| 	return undef if $self->{status} ne 'ready';
 | |
| 	$self->{DBH}->commit;
 | |
| 	$self->{DBH}->{AutoCommit} =3D 1;
 | |
| 	$self->{status} =3D 'success';
 | |
| 	return undef;
 | |
| }
 | |
| 
 | |
| #call apply_change once for each row level client update
 | |
| 	# Accepts 4 params: rowid, action, timestamp and reference to data array
 | |
| 	#	Note: timestamp can be undef, data can be undef
 | |
| 	#		timestamp is passed through str2time(), so supply a date string it can =
 | |
| parse; it is held as perl time (secs since 1970) from then on
 | |
| 
 | |
| #this routine checks basic timestamp info and sanity limits, then passes th=
 | |
| e info along to do_action() for processing
 | |
| sub apply_change {
 | |
| 	my $self =3D shift;
 | |
| 	my $rowid =3D shift || return 'Row ID is required'; #don't die just for on=
 | |
| e bad row
 | |
| 	my $action =3D shift || return 'Action is required'; #don't die just for o=
 | |
| ne bad row
 | |
| 	my $timestamp =3D shift;
 | |
| 	my $dataref =3D shift;
 | |
| 	$action =3D lc($action);
 | |
| 
 | |
| 	$timestamp =3D str2time($timestamp) if $timestamp;
 | |
| 
 | |
| 	return 'Status failure, cannot accept changes: '.$self->{status} if $self-=
 | |
| >{status} ne 'ready';
 | |
| 
 | |
| 	my %info =3D %{$self->{current}};
 | |
| 
 | |
| 	$self->{sanitycount}++;
 | |
| 	if ($info{sanitylimit} && $self->{sanitycount} > $info{sanitylimit}) {
 | |
| 		# too many changes from client
 | |
| 		my $ret =3D $self->sanity('limit');
 | |
| 		return $ret if $ret;
 | |
| 	}
 | |
| 
 | |
| =09
 | |
| 	if ($timestamp && $timestamp > time() + 3600) { # current time + one hour
 | |
| 		#client's clock is way off, cannot submit changes in future
 | |
| 		my $ret =3D $self->collide('future', $info{table}, $rowid, $action, undef=
 | |
| , $timestamp, $dataref, $self->{queue});
 | |
| 		return $ret if $ret;
 | |
| 	}
 | |
| 
 | |
| 	if ($timestamp && $timestamp < $info{last_session} - 3600) { # last sessio=
 | |
| n time less one hour
 | |
| 		#client's clock is way off, cannot submit changes that occurred before la=
 | |
| st sync date
 | |
| 		my $ret =3D $self->collide('past', $info{table}, $rowid, $action, undef, =
 | |
| $timestamp, $dataref , $self->{queue});
 | |
| 		return $ret if $ret;
 | |
| 	}
 | |
| 
 | |
| 	my ($crow, $cver, $ctime); #current row,ver,time
 | |
| 	if ($action ne 'insert') {
 | |
| 		my $sql =3D "select ____rowid____, ____rowver____, ____stamp____ from $in=
 | |
| fo{table} where ____rowid____ =3D $rowid";
 | |
| 		($crow, $cver, $ctime) =3D $self->GetOneRow($sql);
 | |
| 		if (!defined($crow)) {
 | |
| 			my $ret =3D $self->collide('norow', $info{table}, $rowid, $action, undef=
 | |
| , $timestamp, $dataref , $self->{queue});
 | |
| 			return $ret if $ret;=09=09
 | |
| 		}
 | |
| 
 | |
| 		$ctime =3D~ s/([+-]\d\d?)$/ $1/; #put space before timezone
 | |
| 		$ctime =3D str2time($ctime) if $ctime; #convert to perl time
 | |
| 
 | |
| 		if ($timestamp) {
 | |
| 			if ($ctime < $timestamp) {
 | |
| 				my $ret =3D $self->collide('time', $info{table}, $rowid, $action, undef=
 | |
| , $timestamp, $dataref, $self->{queue} );=09=09
 | |
| 				return $ret if $ret;
 | |
| 			}
 | |
| 
 | |
| 		} else {
 | |
| 			if ($cver > $self->{this_post_ver}) {
 | |
| 				my $ret =3D $self->collide('version', $info{table}, $rowid, $action, un=
 | |
| def, $timestamp, $dataref, $self->{queue} );
 | |
| 				return $ret if $ret;
 | |
| 			}
 | |
| 		}
 | |
| =09
 | |
| 	}
 | |
| 
 | |
| 	if ($action eq 'insert') {
 | |
| 		$self->{insertcount}++;
 | |
| 		if ($info{insertlimit} && $self->{insertcount} > $info{insertlimit}) {
 | |
| 			# too many changes from client
 | |
| 			my $ret =3D $self->sanity('insert');
 | |
| 			return $ret if $ret;
 | |
| 		}
 | |
| 
 | |
| 		my $qtable =3D $self->{DBH}->quote($info{table});
 | |
| 		# fetch in list context; in scalar context GetOneRow yields a row count
 | |
| 		my ($table_id) =3D $self->GetOneRow("select table_id from ____tables____ =
 | |
| where tablename =3D $qtable");
 | |
| 		return 'Table incorrectly registered, cannot get rowid sequence name: '.$=
 | |
| self->{DBH}->errstr() if not defined $table_id;
 | |
| 		my $rowidsequence =3D '_'.$table_id.'__rowid_seq';
 | |
| 
 | |
| 		my @data;
 | |
| 		foreach my $val (@{$dataref}) {
 | |
| 			push @data, $self->{DBH}->quote($val);
 | |
| 		}
 | |
| 		my $sql =3D "insert into $info{table} (";
 | |
| 		if ($timestamp) {
 | |
| 			$sql =3D $sql . join(',',@{$info{cols}}) . ',____rowver____, ____stamp__=
 | |
| __) values (';
 | |
| 			$sql =3D $sql . join (',',@data) .','.$self->{this_post_ver}.',\''.local=
 | |
| time($timestamp).'\')';
 | |
| 		} else {
 | |
| 			$sql =3D $sql . join(',',@{$info{cols}}) . ',____rowver____) values (';
 | |
| 			$sql =3D $sql . join (',',@data) .','.$self->{this_post_ver}.')';
 | |
| 		}
 | |
| 		my $ret =3D $self->{DBH}->do($sql);
 | |
| 		if (!$ret) {
 | |
| 			my $ret =3D $self->collide($self->{DBH}->errstr(), $info{table}, $rowid,=
 | |
|  $action, undef, $timestamp, $dataref , $self->{queue});
 | |
| 			return $ret if $ret;=09=09
 | |
| 		}
 | |
| 		my ($newrowid) =3D $self->GetOneRow("select currval('$rowidsequence')");
 | |
| 		return 'Failed to get current rowid on inserted row: '.$self->{DBH}->errs=
 | |
| tr if not defined $newrowid;
 | |
| 		$self->changerowid($rowid, $newrowid);
 | |
| 	}
 | |
| 
 | |
| 	if ($action eq 'update') {
 | |
| 		$self->{updatecount}++;
 | |
| 		if ($info{updatelimit} && $self->{updatecount} > $info{updatelimit}) {
 | |
| 			# too many changes from client
 | |
| 			my $ret =3D $self->sanity('update');
 | |
| 			return $ret if $ret;
 | |
| 		}
 | |
| 		my @data;
 | |
| 		foreach my $val (@{$dataref}) {
 | |
| 			push @data, $self->{DBH}->quote($val);
 | |
| 		}=09
 | |
| 
 | |
| 		my $sql =3D "update $info{table} set ";
 | |
| 		my @cols =3D @{$info{cols}};
 | |
| 		foreach my $col (@cols) {
 | |
| 			my $val =3D shift @data;
 | |
| 			$sql =3D $sql . "$col =3D $val,";
 | |
| 		}
 | |
| 		$sql =3D $sql." ____rowver____ =3D $self->{this_post_ver}";
 | |
| 		$sql =3D $sql.", ____stamp____ =3D '".localtime($timestamp)."'" if $times=
 | |
| tamp;
 | |
| 		$sql =3D $sql." where ____rowid____ =3D $rowid";
 | |
| 		$sql =3D $sql." and $info{whereclause}" if $info{whereclause};
 | |
| 		my $ret =3D $self->{DBH}->do($sql);
 | |
| 		if (!$ret) {
 | |
| 			my $ret =3D $self->collide($self->{DBH}->errstr(), $info{table}, $rowid,=
 | |
|  $action, undef, $timestamp, $dataref , $self->{queue});
 | |
| 			return $ret if $ret;=09=09
 | |
| 		}
 | |
| 
 | |
| 	}
 | |
| 
 | |
| 	if ($action eq 'delete') {
 | |
| 		$self->{deletecount}++;
 | |
| 		if ($info{deletelimit} && $self->{deletecount} > $info{deletelimit}) {
 | |
| 			# too many changes from client
 | |
| 			my $ret =3D $self->sanity('delete');
 | |
| 			return $ret if $ret;
 | |
| 		}
 | |
| 		if ($timestamp) {
 | |
| 			my $sql =3D "update $info{table} set ____rowver____ =3D $self->{this_pos=
 | |
| t_ver}, ____stamp____ =3D '".localtime($timestamp)."'  where ____rowid____ =
 | |
| =3D $rowid";
 | |
| 			$sql =3D $sql . " and $info{whereclause}" if $info{whereclause};
 | |
| 			$self->{DBH}->do($sql) || return 'Predelete update failed: '.$self->{DBH=
 | |
| }->errstr;
 | |
| 		} else {
 | |
| 			my $sql =3D "update $info{table} set ____rowver____ =3D $self->{this_pos=
 | |
| t_ver} where ____rowid____ =3D $rowid";
 | |
| 			$sql =3D $sql . " and $info{whereclause}" if $info{whereclause};
 | |
| 			$self->{DBH}->do($sql) || return 'Predelete update failed: '.$self->{DBH=
 | |
| }->errstr;
 | |
| 		}
 | |
| 		my $sql =3D "delete from $info{table} where ____rowid____ =3D $rowid";
 | |
| 		$sql =3D $sql . " and $info{whereclause}" if $info{whereclause};
 | |
| 		my $ret =3D $self->{DBH}->do($sql);
 | |
| 		if (!$ret) {
 | |
| 			my $ret =3D $self->collide($self->{DBH}->errstr(), $info{table}, $rowid,=
 | |
|  $action, undef, $timestamp, $dataref , $self->{queue});
 | |
| 			return $ret if $ret;=09=09
 | |
| 		}
 | |
| 	}
 | |
| =09
 | |
| =09
 | |
| 	return undef;
 | |
| }
 | |
| 
 | |
| sub changerowid {
 | |
| 	my $self =3D shift;
 | |
| 	my $oldid =3D shift;
 | |
| 	my $newid =3D shift;
 | |
| 	$self->writeclient('changeid',"$oldid\t$newid");
 | |
| }
 | |
| 
 | |
| #writes info to client
 | |
| sub writeclient {
 | |
| 	my $self =3D shift;
 | |
| 	my $type =3D shift;
 | |
| 	my @info =3D @_;
 | |
| 	print "$type: ",join("\t",@info),"\n";
 | |
| 	return undef;
 | |
| }
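
writeclient emits one line per message: the message type, a colon and a space, then the fields joined by tabs. A client could split such lines apart roughly as follows. This is a sketch under assumptions: parse_line is a hypothetical name, and since the format above does no escaping, a field that itself contains a tab or newline cannot be recovered this way.

```perl
use strict;
use warnings;

# Parse one line written by writeclient(): "<type>: <field>\t<field>...".
# Returns the message type followed by its fields, or an empty list
# when the line does not have that shape.
sub parse_line {
    my ($line) = @_;
    chomp $line;
    my ($type, $rest) = $line =~ /^([^:]+): (.*)$/s or return;
    # The -1 limit keeps trailing empty fields (e.g. NULL columns).
    return ($type, split /\t/, $rest, -1);
}

my ($type, @fields) = parse_line("changeid: 17\t42\n");
print "$type => old rowid $fields[0], new rowid $fields[1]\n";
```

Message types such as 'update/insert' and 'start synchronize' contain slashes and spaces, which is why the type is matched as "everything up to the first colon".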
 | |
| 
 | |
| # Override this for custom behavior.  Default is to echo back the sanity fa=
 | |
| ilure reason.=20=20
 | |
| # If you want to override a sanity failure, you can do so by returning undef.
 | |
| sub sanity {
 | |
| 	my $self =3D shift;
 | |
| 	my $reason =3D shift;
 | |
| 	$self->{status} =3D 'sanity exceeded';
 | |
| 	$self->{DBH}->rollback;
 | |
| 	return $reason;
 | |
| }
 | |
| 
 | |
| # Override this for custom behavior.  Default is to echo back the failure r=
 | |
| eason.=20=20
 | |
| # If you want to override a collision, you can do so by returning undef.
 | |
| sub collide {
 | |
| 	my $self =3D shift;
 | |
| 	my ($reason,$table,$rowid,$action,$rowver,$timestamp,$data, $queue) =3D @_;
 | |
| 
 | |
| 	my @data;
 | |
| 	foreach my $val (@{$data}) {
 | |
| 		push @data, $self->{DBH}->quote($val);
 | |
| 	}=09
 | |
| 
 | |
| 	if ($reason =3D~ /integrity/i || $reason =3D~ /constraint/i) {
 | |
| 		$self->{status} =3D 'integrity violation';
 | |
| 		$self->{DBH}->rollback;
 | |
| 	}
 | |
| 
 | |
| 	my $datastring =3D ''; #start empty so the appends below do not warn
 | |
| 	my @cols =3D @{$self->{current}->{cols}};
 | |
| 	foreach my $col (@cols) {
 | |
| 		my $val =3D shift @data;
 | |
| 		$datastring =3D $datastring . "$col =3D $val,";
 | |
| 	}
 | |
| 	chop $datastring; #remove trailing comma
 | |
| 
 | |
| 	if ($queue eq 'server') {
 | |
| 		$timestamp =3D localtime($timestamp) if defined($timestamp);
 | |
| 		$rowid =3D $self->{DBH}->quote($rowid);
 | |
| 		$rowid =3D 'null' if !defined($rowid);
 | |
| 		$rowver =3D 'null' if !defined($rowver);
 | |
| 		$timestamp =3D $self->{DBH}->quote($timestamp);
 | |
| 		$data =3D $self->{DBH}->quote($data);
 | |
| 		my $qtable =3D $self->{DBH}->quote($table);
 | |
| 		my $qreason =3D $self->{DBH}->quote($reason);
 | |
| 		my $qaction =3D $self->{DBH}->quote($action);
 | |
| 		my $quser =3D $self->{DBH}->quote($self->{user});
 | |
| 		my $qnode =3D $self->{DBH}->quote($self->{node});
 | |
| 		$datastring =3D $self->{DBH}->quote($datastring);
 | |
| 
 | |
| 
 | |
| 		my $qqueue =3D $self->{DBH}->quote($queue); #value for the queue column
 | |
| 		my $sql =3D "insert into ____collision____ (rowid,
 | |
| tablename, rowver, stamp, data, reason, action, username,
 | |
| nodename, queue) values($rowid,$qtable, $rowver, $timestamp,$datastring,
 | |
| $qreason, $qaction,$quser, $qnode, $qqueue)";
 | |
| 		$self->{DBH}->do($sql) || die 'Failed to write to collision table: '.$sel=
 | |
| f->{DBH}->errstr;
 | |
| 
 | |
| 	} else {
 | |
| 
 | |
| 		$self->writeclient('collision',$rowid,$table, $rowver, $timestamp,$reason=
 | |
| , $action,$self->{user}, $self->{node}, $data);
 | |
| 
 | |
| 	}
 | |
| 	return $reason;
 | |
| }
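
The comments above invite overriding collide in a subclass. One possible shape of such an override is sketched below. StubSynchronize is a stand-in written only for this example so that it runs without a database; in real use the parent would be the Synchronize package. The "client wins on timestamp collisions" policy and the class name ClientWinsOnTime are purely illustrative assumptions.

```perl
use strict;
use warnings;

# Stand-in for the Synchronize class (assumption): only the default
# collide() behaviour of echoing the reason back is reproduced here.
package StubSynchronize;
sub new     { return bless {}, shift }
sub collide { my ($self, $reason) = @_; return $reason }

# Hypothetical subclass: timestamp collisions are waved through,
# every other collision is left to the parent class.
package ClientWinsOnTime;
our @ISA = ('StubSynchronize');
sub collide {
    my ($self, $reason, @rest) = @_;
    return undef if $reason eq 'time';    # undef tells the caller to go on
    return $self->SUPER::collide($reason, @rest);
}

package main;
my $s = ClientWinsOnTime->new;
for my $reason ('time', 'norow') {
    my $ret = $s->collide($reason);
    print "$reason: ", defined $ret ? "rejected ($ret)" : 'accepted', "\n";
}
```

apply_change only stops when collide returns a true value, so returning undef lets the change proceed to the insert, update or delete step.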
 | |
| 
 | |
| #calls get_updates once for each publication the user/node is subscribed to=
 | |
|  in correct sync_order
 | |
| sub get_all_updates {
 | |
| 	my $self =3D shift;
 | |
| 	my $quser =3D $self->{DBH}->quote($self->{user});
 | |
| 	my $qnode =3D $self->{DBH}->quote($self->{node});
 | |
| 
 | |
| 	foreach my $pub (@{$self->{orderpubs}}) {
 | |
| 		#request update as sync unless overridden by flags
 | |
| 		my $ret =3D $self->get_updates($pub, 1);
 | |
| 		return $ret if $ret; #pass the first error back to the caller
 | |
| 	}
 | |
| 	return undef;
 | |
| 
 | |
| }
 | |
| 
 | |
| # Call this once for each table the client needs refreshed or sync'ed AFTER=
 | |
|  all inbound client changes have been posted
 | |
| #	Accepts publication and sync flag as arguments
 | |
| sub get_updates {
 | |
| 	my $self =3D shift;
 | |
| 	my $pub =3D shift || die 'Publication is required';
 | |
| 	my $sync =3D shift;
 | |
| 
 | |
| 	my $qpub =3D $self->{DBH}->quote($pub);
 | |
| 	my $quser =3D $self->{DBH}->quote($self->{user});
 | |
| 	my $qnode =3D $self->{DBH}->quote($self->{node});
 | |
| 
 | |
| 	#enforce refresh and refreshonce flags
 | |
| 	undef $sync if !$self->{pubs}->{$pub};=20
 | |
| 
 | |
| 
 | |
| 	my %info =3D %{$self->{current}};
 | |
| 
 | |
| 	my @cols =3D $self->GetColList("select col_name from ____subscribed_cols__=
 | |
| __ where username =3D $quser and nodename =3D $qnode and pubname =3D $qpub"=
 | |
| );
 | |
| 
 | |
| 	my ($table) =3D $self->GetOneRow("select tablename from ____publications__=
 | |
| __ where pubname =3D $qpub");
 | |
| 	return 'Table incorrectly registered for read' if !defined($table);
 | |
| 	my $qtable =3D $self->{DBH}->quote($table);=09
 | |
| 
 | |
| 
 | |
| 	my $sql =3D "select pubname, last_session, post_ver, last_ver, whereclause=
 | |
|  from ____subscribed____ where username =3D $quser and pubname =3D $qpub an=
 | |
| d nodename =3D $qnode";
 | |
| 	my ($junk, $last_session, $post_ver, $last_ver, $whereclause) =3D $self->G=
 | |
| etOneRow($sql);
 | |
| 
 | |
| 	my ($wc) =3D $self->GetOneRow("select whereclause from ____publications___=
 | |
| _ where pubname =3D $qpub");
 | |
| 
 | |
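| 	$whereclause =3D '('.$whereclause.')' if $whereclause;
 | |
| 	$wc =3D '('.$wc.')' if $wc;
 | |
| 	# join only the clauses that exist, so no stray leading "and" is produced
 | |
| 	$whereclause =3D join(' and ', grep { $_ } $whereclause, $wc);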
 | |
| 
 | |
| 
 | |
| 	if ($sync) {
 | |
| 		$self->writeclient('start synchronize', $pub);
 | |
| 	} else {
 | |
| 		$self->writeclient('start refresh', $pub);
 | |
| 		$self->{DBH}->do("update ____subscribed____ set refreshonce =3D false whe=
 | |
| re pubname =3D $qpub and username =3D $quser and nodename =3D $qnode") || r=
 | |
| eturn 'Failed to clear RefreshOnce flag: '.$self->{DBH}->errstr;
 | |
| 	}
 | |
| 
 | |
| 	$self->writeclient('columns',@cols);
 | |
| 
 | |
| 
 | |
| 
 | |
| 	$sql =3D "select ____rowid____, ".join(',', @cols)." from $table";
 | |
| 	if ($sync) {
 | |
| 		$sql =3D $sql." where (____rowver____ <=3D $self->{max_ver} and ____rowve=
 | |
| r____ > $last_ver)";
 | |
| 		if (defined($self->{this_post_ver})) {
 | |
| 			$sql =3D $sql . " and (____rowver____ <> $post_ver)";
 | |
| 		}
 | |
| 	} else {
 | |
| 		$sql =3D $sql." where (____rowver____ <=3D $self->{max_ver})";
 | |
| 	}
 | |
| 	$sql =3D $sql." and $whereclause" if $whereclause;
 | |
| =09
 | |
| 	my $sth =3D $self->{DBH}->prepare($sql) || return 'Failed to prepare SQL f=
 | |
| or updates: '.$self->{DBH}->errstr;
 | |
| 	$sth->execute || return 'Failed to execute SQL for updates: '.$self->{DBH}=
 | |
| ->errstr;
 | |
| 	my @row;
 | |
| 	while (@row =3D $sth->fetchrow_array) {
 | |
| 		$self->writeclient('update/insert',@row);
 | |
| 	}
 | |
| 
 | |
| 	$sth->finish;
 | |
| 
 | |
| 	# now get deleted rows
 | |
| 	if ($sync) {
 | |
| 		$sql =3D "select rowid from ____deleted____ where (tablename =3D $qtable)=
 | |
| ";
 | |
| 		$sql =3D $sql." and (rowver <=3D $self->{max_ver} and rowver > $last_ver)=
 | |
| ";
 | |
| 		if (defined($post_ver)) { #skip deletes this node posted last session
 | |
| 			$sql =3D $sql . " and (rowver <> $post_ver)";
 | |
| 		}
 | |
| 		$sql =3D $sql." and $whereclause" if $whereclause;
 | |
| 
 | |
| 		$sth =3D $self->{DBH}->prepare($sql) || return 'Failed to prepare SQL for=
 | |
|  deletes: '.$self->{DBH}->errstr;
 | |
| 		$sth->execute || return 'Failed to execute SQL for deletes: '.$self->{DBH=
 | |
| }->errstr;
 | |
| 		my @row;
 | |
| 		while (@row =3D $sth->fetchrow_array) {
 | |
| 			$self->writeclient('delete',@row);
 | |
| 		}
 | |
| 
 | |
| 		$sth->finish;
 | |
| 	}
 | |
| 
 | |
| 	if ($sync) {
 | |
| 		$self->writeclient('end synchronize', $pub);
 | |
| 	} else {
 | |
| 		$self->writeclient('end refresh', $pub);
 | |
| 	}
 | |
| 
 | |
| 	#$qpub, $quser and $qnode are still in scope from the top of this sub
 | |
| 
 | |
| 	$self->{DBH}->do("update ____subscribed____ set last_ver =3D $self->{max_v=
 | |
| er}, last_session =3D now(), post_ver =3D $self->{this_post_ver} where user=
 | |
| name =3D $quser and nodename =3D $qnode and pubname =3D $qpub");
 | |
| 	return undef;
 | |
| }
 | |
| 
 | |
| 
 | |
| # Call this once when everything else is done.  Does housekeeping.=20
 | |
| # (MAKE THIS AN OBJECT DESTRUCTOR?)
 | |
| sub DESTROY {
 | |
| 	my $self =3D shift;
 | |
| 
 | |
| #release version from lock table (including old ones)
 | |
| 	my $quser =3D $self->{DBH}->quote($self->{user});
 | |
| 	my $qnode =3D $self->{DBH}->quote($self->{node});
 | |
| 	my $sql =3D "delete from ____last_stable____ where username =3D $quser and=
 | |
|  nodename =3D $qnode";
 | |
| 	$self->{DBH}->do($sql);
 | |
| 
 | |
| #clean up deleted table
 | |
| 	my ($version) =3D $self->GetOneRow("select min(last_ver) from ____subscrib=
 | |
| ed____");
 | |
| 	if (defined $version) { #prune, but always fall through to the disconnects
 | |
| 		$self->{DBH}->do("delete from ____deleted____ where rowver < $version") =
 | |
| || warn 'Failed to prune deleted table: '.$self->{DBH}->errstr;
 | |
| 	}
 | |
| 
 | |
| 
 | |
| #disconnect from DBD sessions
 | |
| 	$self->{DBH}->disconnect;
 | |
| 	$self->{DBLOG}->disconnect;
 | |
| 	return undef;
 | |
| }
 | |
| 
 | |
| ############# Helper Subs ############
 | |
| sub GetColList {
 | |
| 	my $self =3D shift;
 | |
| 	my $sql =3D shift || die 'Must provide sql select statement';
 | |
| 	my $sth =3D $self->{DBH}->prepare($sql) || return; #empty list on failure
 | |
| 	$sth->execute || return;
 | |
| 	my $val;
 | |
| 	my @col;
 | |
| 	while (($val) =3D $sth->fetchrow_array) {
 | |
| 		push @col, $val;
 | |
| 	}
 | |
| 	$sth->finish;
 | |
| 	return @col;
 | |
| }
 | |
| 
 | |
| sub GetOneRow {
 | |
| 	my $self =3D shift;
 | |
| 	my $sql =3D shift || die 'Must provide sql select statement';
 | |
| 	my $sth =3D $self->{DBH}->prepare($sql) || return undef;
 | |
| 	$sth->execute || return undef;
 | |
| 	my @row =3D $sth->fetchrow_array;
 | |
| 	$sth->finish;
 | |
| 	return @row;
 | |
| }
 | |
| 
 | |
| =20
 | |
| 
 | |
| 
 | |
| 
 | |
package SyncManager;

use DBI;
# new requires 3 arguments: dbi connection string, plus the corresponding username and password

sub new {
	my $proto = shift;
	my $class = ref($proto) || $proto;
	my $self = {};

	my $dbi = shift;
	my $user = shift;
	my $pass = shift;

	$self->{DBH} = DBI->connect($dbi,$user,$pass) || die "Failed to connect to database: ".DBI->errstr();

	$self->{DBLOG} = DBI->connect($dbi,$user,$pass) || die "cannot log to DB: ".DBI->errstr();

	return bless ($self, $class);
}

sub dblog {
	my $self = shift;
	my $msg = $self->{DBLOG}->quote($_[0]);
	my $quser = $self->{DBH}->quote($self->{user});
	my $qnode = $self->{DBH}->quote($self->{node});
	$self->{DBLOG}->do("insert into ____sync_log____ (username, nodename,stamp, message) values($quser, $qnode, now(), $msg)");
}
 | |
| 
 | |
# this should never need to be called, but it might if a node bails without releasing its locks
sub ReleaseAllLocks {
	my $self = shift;
	$self->{DBH}->do("delete from ____last_stable____");
}
 | |
# Adds a publication to the system.  Also adds triggers, sequences, etc associated with the table if appropriate.
	# accepts two arguments: the name of a physical table and the name under which to publish it
	# 	NOTE: the publication name is optional and will default to the table name if not supplied
	# returns undef if ok, else error string;
sub publish {
	my $self = shift;
	my $table = shift || die 'You must provide a table name (and optionally a unique publication name)';
	my $pub = shift;
	$pub = $table if not defined($pub);

	my $qpub = $self->{DBH}->quote($pub);
	my $sql = "select tablename from ____publications____ where pubname = $qpub";
	my ($junk) = $self->GetOneRow($sql);
	return 'Publication already exists' if defined($junk);

	my $qtable = $self->{DBH}->quote($table);

	$sql = "select table_id, refcount from ____tables____ where tablename = $qtable";
	my ($id, $refcount) = $self->GetOneRow($sql);

	if(!defined($id)) {
		$self->{DBH}->do("insert into ____tables____ (tablename, refcount) values ($qtable,1)") || return 'Failed to register table: ' . $self->{DBH}->errstr;
		$sql = "select table_id from ____tables____ where tablename = $qtable";
		($id) = $self->GetOneRow($sql);
	}

	if (defined($refcount)) {
		$self->{DBH}->do("update ____tables____ set refcount = refcount+1 where table_id = $id") || return 'Failed to update reference count: ' . $self->{DBH}->errstr;
	} else {

		$id = '_'.$id.'_';

		my @cols = $self->GetTableCols($table, 1); # 1 = get hidden cols too
		my %skip;
		foreach my $col (@cols) {
			$skip{$col} = 1;
		}

		if (!$skip{____rowver____}) {
			$self->{DBH}->do("alter table $table add column ____rowver____ int4"); #don't fail here in case table is being republished, just accept the error silently
		}
		$self->{DBH}->do("update $table set ____rowver____ = ____version_seq____.last_value - 1") || return 'Failed to initialize rowver: ' . $self->{DBH}->errstr;

		if (!$skip{____rowid____}) {
			$self->{DBH}->do("alter table $table add column ____rowid____ int4"); #don't fail here in case table is being republished, just accept the error silently
		}

		my $index = $id.'____rowid____idx';
		$self->{DBH}->do("create index $index on $table(____rowid____)") || return 'Failed to create rowid index: ' . $self->{DBH}->errstr;

		my $sequence = $id.'_rowid_seq';
		$self->{DBH}->do("create sequence $sequence") || return 'Failed to create rowid sequence: ' . $self->{DBH}->errstr;

		$self->{DBH}->do("alter table $table alter column ____rowid____ set default nextval('$sequence')"); #don't fail here in case table is being republished, just accept the error silently

		$self->{DBH}->do("update $table set ____rowid____ =  nextval('$sequence')") || return 'Failed to initialize rowid: ' . $self->{DBH}->errstr;

		if (!$skip{____stamp____}) {
			$self->{DBH}->do("alter table $table add column ____stamp____ timestamp"); #don't fail here in case table is being republished, just accept the error silently
		}

		$self->{DBH}->do("update $table set ____stamp____ =  now()") || return 'Failed to initialize stamp: ' . $self->{DBH}->errstr;

		# one variable reused for all three trigger names
		my $trigger = $id.'_ver_ins';
		$self->{DBH}->do("create trigger $trigger before insert on $table for each row execute procedure sync_insert_ver()") || return 'Failed to create trigger: ' . $self->{DBH}->errstr;

		$trigger = $id.'_ver_upd';
		$self->{DBH}->do("create trigger $trigger before update on $table for each row execute procedure sync_update_ver()") || return 'Failed to create trigger: ' . $self->{DBH}->errstr;

		$trigger = $id.'_del_row';
		$self->{DBH}->do("create trigger $trigger after delete on $table for each row execute procedure sync_delete_row()") || return 'Failed to create trigger: ' . $self->{DBH}->errstr;
	}

	# use the DBI-quoted values so names containing quotes cannot break the statement
	$self->{DBH}->do("insert into ____publications____ (pubname, tablename) values ($qpub, $qtable)") || return 'Failed to create publication entry: '.$self->{DBH}->errstr;

	return undef;
}
 | |
| 
 | |
| 
 | |
# Removes a publication from the system.  Also drops triggers, sequences, etc associated with the table if appropriate.
	# accepts one argument: the name of a publication
	# returns undef if ok, else error string;
sub unpublish {
	my $self = shift;
	my $pub = shift || return 'You must provide a publication name';
	my $qpub = $self->{DBH}->quote($pub);
	my $sql = "select tablename from ____publications____ where pubname = $qpub";
	my ($table) = $self->GetOneRow($sql);
	return 'Publication does not exist' if !defined($table);

	my $qtable = $self->{DBH}->quote($table);

	$sql = "select table_id, refcount from ____tables____ where tablename = $qtable";
	my ($id, $refcount) = $self->GetOneRow($sql);
	return "Table: $table is not correctly registered!" if not defined($id);

	$self->{DBH}->do("update ____tables____ set refcount = refcount -1 where tablename = $qtable") || return 'Failed to decrement reference count: ' . $self->{DBH}->errstr;

	$self->{DBH}->do("delete from ____subscribed____ where pubname = $qpub") || return 'Failed to delete user subscriptions: ' . $self->{DBH}->errstr;
	$self->{DBH}->do("delete from ____subscribed_cols____ where pubname = $qpub") || return 'Failed to delete subscribed columns: ' . $self->{DBH}->errstr;
	$self->{DBH}->do("delete from ____publications____ where tablename = $qtable and pubname = $qpub") || return 'Failed to delete from publications: ' . $self->{DBH}->errstr;

	#if this is the last reference, we want to drop triggers, etc;
	if ($refcount <= 1) {
		$id = "_".$id."_";

		$self->{DBH}->do("alter table $table alter column ____rowver____ drop default") || return 'Failed to alter column default: ' . $self->{DBH}->errstr;
		$self->{DBH}->do("alter table $table alter column ____rowid____ drop default") || return 'Failed to alter column default: ' . $self->{DBH}->errstr;
		$self->{DBH}->do("alter table $table alter column ____stamp____ drop default") || return 'Failed to alter column default: ' . $self->{DBH}->errstr;

		# one variable reused for all three trigger names
		my $trigger = $id.'_ver_upd';
		$self->{DBH}->do("drop trigger $trigger on $table") || return 'Failed to drop trigger: ' . $self->{DBH}->errstr;

		$trigger = $id.'_ver_ins';
		$self->{DBH}->do("drop trigger $trigger on $table") || return 'Failed to drop trigger: ' . $self->{DBH}->errstr;

		$trigger = $id.'_del_row';
		$self->{DBH}->do("drop trigger $trigger on $table") || return 'Failed to drop trigger: ' . $self->{DBH}->errstr;

		my $sequence = $id.'_rowid_seq';
		$self->{DBH}->do("drop sequence $sequence") || return 'Failed to drop sequence: ' . $self->{DBH}->errstr;

		my $index = $id.'____rowid____idx';
		$self->{DBH}->do("drop index $index") || return 'Failed to drop index: ' . $self->{DBH}->errstr;
		$self->{DBH}->do("delete from ____tables____ where tablename = $qtable") || return 'Failed to remove entry from tables: ' . $self->{DBH}->errstr;
	}
	return undef;
}
 | |
| 
 | |
| 
 | |
| 
 | |
| 
 | |
| 
 | |
# Subscribe user/node to a publication
	# Accepts 3 arguments: Username, Nodename, Publication
	# 	NOTE: the remaining arguments can be supplied as column names to which the user/node should be subscribed
	# Returns undef if ok, else returns an error string

sub subscribe {
	my $self = shift;
	my $user = shift || die 'You must provide user, node and publication as arguments';
	my $node = shift || die 'You must provide user, node and publication as arguments';
	my $pub = shift || die 'You must provide user, node and publication as arguments';
	my @cols = @_;

	my $quser = $self->{DBH}->quote($user);
	my $qnode = $self->{DBH}->quote($node);
	my $qpub = $self->{DBH}->quote($pub);

	my $sql = "select tablename from ____publications____ where pubname = $qpub";
	my ($table) = $self->GetOneRow($sql);
	return "Publication $pub does not exist." if not defined $table;
	my $qtable = $self->{DBH}->quote($table);

	@cols = $self->GetTableCols($table) if !@cols; # get defaults if cols were not specified by caller

	# use the DBI-quoted values rather than interpolating raw strings into the SQL
	$self->{DBH}->do("insert into ____subscribed____ (username, nodename,pubname,last_ver,refreshonce) values($quser, $qnode, $qpub, 0, true)") || return 'Failed to create subscription: ' . $self->{DBH}->errstr;

	foreach my $col (@cols) {
		my $qcol = $self->{DBH}->quote($col);
		$self->{DBH}->do("insert into ____subscribed_cols____ (username, nodename, pubname, col_name) values ($quser, $qnode, $qpub, $qcol)") || return 'Failed to subscribe column: ' . $self->{DBH}->errstr;
	}

	return undef;
}
 | |
| 
 | |
| 
 | |
# Unsubscribe user/node from a publication
	# Accepts 3 arguments: Username, Nodename, Publication
	# Returns undef if ok, else returns an error string

sub unsubscribe {
	my $self = shift;
	my $user = shift || die 'You must provide user, node and publication as arguments';
	my $node = shift || die 'You must provide user, node and publication as arguments';
	my $pub = shift || die 'You must provide user, node and publication as arguments';
	my @cols = @_;

	my $quser = $self->{DBH}->quote($user);
	my $qnode = $self->{DBH}->quote($node);
	my $qpub = $self->{DBH}->quote($pub);

	my $sql = "select tablename from ____publications____ where pubname = $qpub";
	# list assignment: in scalar context GetOneRow would yield the column count, not the value
	my ($table) = $self->GetOneRow($sql);
	return "Publication $pub does not exist." if not defined $table;

	$self->{DBH}->do("delete from ____subscribed_cols____ where pubname = $qpub and username = $quser and nodename = $qnode") || return 'Failed to remove column subscription: '. $self->{DBH}->errstr;
	$self->{DBH}->do("delete from ____subscribed____ where pubname = $qpub and username = $quser and nodename = $qnode") || return 'Failed to remove subscription: '. $self->{DBH}->errstr;


	return undef;
}
 | |
| 
 | |
| 
 | |
| 
 | |
#INSTALL creates the necessary management tables.
	#returns undef if everything is ok, else returns a string describing the error;
sub INSTALL {
my $self = shift;

#check to see if management tables are already installed

my ($test) = $self->GetOneRow("select * from pg_class where relname = '____publications____'");
if (defined($test)) {
	return 'It appears that synchronization management tables are already installed here.  Please uninstall before reinstalling.';
};


#install the management tables, etc.

$self->{DBH}->do("create table ____publications____ (pubname text primary key,description text, tablename text, sync_order int4, whereclause text)") || return $self->{DBH}->errstr();

$self->{DBH}->do("create table ____subscribed_cols____ (nodename text, username text, pubname text, col_name text, description text, primary key(nodename, username, pubname,col_name))") || return $self->{DBH}->errstr();

$self->{DBH}->do("create table ____subscribed____ (nodename text, username text, pubname text, last_session timestamp, post_ver int4, last_ver int4, whereclause text, sanity_limit int4 default 0, sanity_delete int4 default 0, sanity_update int4 default 0, sanity_insert int4 default 50, readonly boolean, disabled boolean, fullrefreshonly boolean, refreshonce boolean, primary key(nodename, username, pubname))") || return $self->{DBH}->errstr();

$self->{DBH}->do("create table ____last_stable____ (version int4, username text, nodename text, primary key(version, username, nodename))") || return $self->{DBH}->errstr();

$self->{DBH}->do("create table ____tables____ (tablename text, table_id int4, refcount int4, primary key(tablename, table_id))") || return $self->{DBH}->errstr();

$self->{DBH}->do("create sequence ____table_id_seq____") || return $self->{DBH}->errstr();

$self->{DBH}->do("alter table ____tables____ alter column table_id set default nextval('____table_id_seq____')") || return $self->{DBH}->errstr();

$self->{DBH}->do("create table ____deleted____ (rowid int4, tablename text, rowver int4, stamp timestamp, primary key (rowid, tablename))") || return $self->{DBH}->errstr();

$self->{DBH}->do("create table ____collision____ (rowid text, tablename text, rowver int4, stamp timestamp, faildate timestamp default now(),data text,reason text, action text, username text, nodename text,queue text)") || return $self->{DBH}->errstr();

$self->{DBH}->do("create sequence ____version_seq____") || return $self->{DBH}->errstr();

$self->{DBH}->do("create table ____sync_log____ (username text, nodename text, stamp timestamp, message text)") || return $self->{DBH}->errstr();

$self->{DBH}->do("create function sync_insert_ver() returns opaque as
'begin
if new.____rowver____ isnull then
new.____rowver____ := ____version_seq____.last_value;
end if;
if new.____stamp____ isnull then
new.____stamp____ := now();
end if;
return NEW;
end;' language 'plpgsql'") || return $self->{DBH}->errstr();

$self->{DBH}->do("create function sync_update_ver() returns opaque as
'begin
if new.____rowver____ = old.____rowver____ then
new.____rowver____ := ____version_seq____.last_value;
end if;
if new.____stamp____ = old.____stamp____ then
new.____stamp____ := now();
end if;
return NEW;
end;' language 'plpgsql'") || return $self->{DBH}->errstr();


$self->{DBH}->do("create function sync_delete_row() returns opaque as
'begin
insert into ____deleted____ (rowid,tablename,rowver,stamp) values
(old.____rowid____, TG_RELNAME, old.____rowver____,old.____stamp____);
return old;
end;' language 'plpgsql'") || return $self->{DBH}->errstr();

return undef;
}
 | |
| 
 | |
#removes all management tables & related stuff
	#returns undef if ok, else returns an error message as a string
sub UNINSTALL {
my $self = shift;

#Make sure all tables are unpublished first
my $sth = $self->{DBH}->prepare("select pubname from ____publications____");
$sth->execute;
my $pub;
while (($pub) = $sth->fetchrow_array) {
	$self->unpublish($pub);
}
$sth->finish;

$self->{DBH}->do("drop table ____publications____") || return $self->{DBH}->errstr();
$self->{DBH}->do("drop table ____subscribed_cols____") || return $self->{DBH}->errstr();
$self->{DBH}->do("drop table ____subscribed____") || return $self->{DBH}->errstr();
$self->{DBH}->do("drop table ____last_stable____") || return $self->{DBH}->errstr();
$self->{DBH}->do("drop table ____deleted____") || return $self->{DBH}->errstr();
$self->{DBH}->do("drop table ____collision____") || return $self->{DBH}->errstr();
$self->{DBH}->do("drop table ____tables____") || return $self->{DBH}->errstr();
$self->{DBH}->do("drop table ____sync_log____") || return $self->{DBH}->errstr();

$self->{DBH}->do("drop sequence ____table_id_seq____") || return $self->{DBH}->errstr();
$self->{DBH}->do("drop sequence ____version_seq____") || return $self->{DBH}->errstr();

$self->{DBH}->do("drop function sync_insert_ver()") || return $self->{DBH}->errstr();
$self->{DBH}->do("drop function sync_update_ver()") || return $self->{DBH}->errstr();
$self->{DBH}->do("drop function sync_delete_row()") || return $self->{DBH}->errstr();

return undef;

}
 | |
| 
 | |
sub DESTROY {
	my $self = shift;

	$self->{DBH}->disconnect;
	$self->{DBLOG}->disconnect;
	return undef;
}

############# Helper Subs ############

sub GetOneRow {
	my $self = shift;
	my $sql = shift || die 'Must provide sql select statement';
	my $sth = $self->{DBH}->prepare($sql) || return undef;
	$sth->execute || return undef;
	my @row = $sth->fetchrow_array;
	$sth->finish;
	return @row;
}

#call this with second non-zero value to get hidden columns
sub GetTableCols {
	my $self = shift;
	my $table = shift || die 'Must provide table name';
	my $wanthidden = shift;
	my $sql = "select * from $table where 0 = 1";
	my $sth = $self->{DBH}->prepare($sql) || return undef;
	$sth->execute || return undef;
	my @row = @{$sth->{NAME}};
	$sth->finish;
	return @row if $wanthidden;
	my @cols;
	foreach my $col (@row) {
		next if $col eq '____rowver____';
		next if $col eq '____stamp____';
		next if $col eq '____rowid____';
		push @cols, $col;
	}
	return @cols;
}
 | |
| 
 | |
| 
 | |
| 1; #happy require
 | |
| 
 | |
| ------=_NextPart_000_0062_01C0541E.125CAF30--
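
The tombstone bookkeeping in the attached module (the sync_delete_row trigger, and the pruning of ____deleted____ against min(last_ver) in DESTROY) can be sketched in a few lines. This is only an illustration under stated assumptions: it uses Python with an in-memory SQLite database instead of Perl/DBI and PostgreSQL, and the table names `items`, `deleted`, and `subscribed` and the function names are made up for the example. It mirrors the pruning rule as written in the module and makes no claim about the rest of the sync protocol.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with one tracked table and a tombstone table."""
    db = sqlite3.connect(":memory:")
    db.executescript(
        """
        create table items (row_id integer primary key, name text, rowver integer);
        create table deleted (row_id integer, tablename text, rowver integer);
        create table subscribed (nodename text primary key, last_ver integer);

        -- Counterpart of sync_delete_row(): record every deleted row.
        create trigger items_del_row after delete on items
        begin
            insert into deleted (row_id, tablename, rowver)
            values (old.row_id, 'items', old.rowver);
        end;
        """
    )
    return db


def prune_deleted(db: sqlite3.Connection) -> int:
    """Remove tombstones older than the slowest subscriber; return rows removed."""
    (min_ver,) = db.execute("select min(last_ver) from subscribed").fetchone()
    if min_ver is None:
        # No subscribers recorded: keep every tombstone.
        return 0
    cur = db.execute("delete from deleted where rowver < ?", (min_ver,))
    return cur.rowcount
```

A tombstone is kept as long as at least one subscriber's last_ver is not past it, which is why the slowest subscriber determines how much of the deleted table can be discarded.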
 | |
| 
 | |
| 
 | |
| From pgsql-hackers-owner+M9917@postgresql.org Mon Jun 11 15:53:25 2001
 | |
| Return-path: <pgsql-hackers-owner+M9917@postgresql.org>
 | |
| Received: from postgresql.org (webmail.postgresql.org [216.126.85.28])
 | |
| 	by candle.pha.pa.us (8.10.1/8.10.1) with ESMTP id f5BJrPL01206
 | |
| 	for <pgman@candle.pha.pa.us>; Mon, 11 Jun 2001 15:53:25 -0400 (EDT)
 | |
| Received: from postgresql.org.org (webmail.postgresql.org [216.126.85.28])
 | |
| 	by postgresql.org (8.11.3/8.11.1) with SMTP id f5BJrPE67753;
 | |
| 	Mon, 11 Jun 2001 15:53:25 -0400 (EDT)
 | |
| 	(envelope-from pgsql-hackers-owner+M9917@postgresql.org)
 | |
| Received: from mail.greatbridge.com (mail.greatbridge.com [65.196.68.36])
 | |
| 	by postgresql.org (8.11.3/8.11.1) with ESMTP id f5BJmLE65620
 | |
| 	for <pgsql-hackers@postgresql.org>; Mon, 11 Jun 2001 15:48:21 -0400 (EDT)
 | |
| 	(envelope-from djohnson@greatbridge.com)
 | |
| Received: from j2.us.greatbridge.com (djohnsonpc.us.greatbridge.com [65.196.69.70])
 | |
| 	by mail.greatbridge.com (8.11.2/8.11.2) with SMTP id f5BJm2Q28847
 | |
| 	for <pgsql-hackers@postgresql.org>; Mon, 11 Jun 2001 15:48:02 -0400
 | |
| From: Darren Johnson <djohnson@greatbridge.com>
 | |
| Date: Mon, 11 Jun 2001 19:46:44 GMT
 | |
| Message-ID: <20010611.19464400@j2.us.greatbridge.com>
 | |
| Subject: [HACKERS] Postgres Replication
 | |
| To: pgsql-hackers@postgresql.org
 | |
| Reply-To: Darren Johnson <djohnson@greatbridge.com>
 | |
| X-Mailer: Mozilla/3.0 (compatible; StarOffice/5.2;Linux)
 | |
| X-Priority: 3 (Normal)
 | |
| MIME-Version: 1.0
 | |
| Content-Type: text/plain; charset=ISO-8859-1
 | |
| Content-Transfer-Encoding: 8bit
 | |
| X-MIME-Autoconverted: from quoted-printable to 8bit by postgresql.org id f5BJmLE65621
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Status: OR
 | |
| 
 | |
We have been researching replication for several months now, and
I have some opinions to share with the community for feedback,
discussion, and/or participation. Our goal is to get a replication
solution for PostgreSQL that will meet most needs of users
and applications alike (mission impossible theme here :).

My research work, along with that of other contributors, has been
collected and presented here:
http://www.greatbridge.org/genpage?replication_top
If there is something missing, especially PostgreSQL-related
work, I would like to know about it, and my apologies to anyone
who got left off the list. This work is ongoing and doesn't
draw a conclusion, which IMHO should be left up to the user,
but I'm offering my opinions to spur discussion and/or feedback
from this list, and I will try not to offend anyone.

Here's my opinion: of the approaches we've surveyed, the most
promising one is the Postgres-R project from the Information and
Communication Systems Group, ETH in Zurich, Switzerland, originally
produced by Bettina Kemme, Gustavo Alonso, and others.  Although
Postgres-R is a synchronous approach, I believe it is the closest to
the goal mentioned above. Here is a summary of the advantages.
 | |
| 
 | |
1) Postgres-R is built on the PostgreSQL-6.4.2 code base.  The
replication functionality is an optional parameter, so there will be
insignificant overhead for non-replication situations. The replication
and communication managers are the two new modules added to the
PostgreSQL code base.

2) The replication manager's main function is controlling the
replication protocol via a message handling process. It receives
messages from the local and remote backends and forwards write
sets and decision messages via the communication manager to the
other servers. The replication manager controls all the transactions
running on the local server by keeping track of the states, including
which protocol phase (read, send, lock, or write) the transaction is
in. The replication manager maintains a two-way channel,
implemented as buffered sockets, to each backend.

3) The main task of the communication manager is to provide a simple
socket-based interface between the replication manager and the
group communication system (currently Ensemble). The
communication system is a cluster of servers connected via
the communication manager.  The replication manager also maintains
three one-way channels to the communication system: a broadcast
channel to send messages, a total-order channel to receive
totally ordered write sets, and a no-order channel to listen for
decision messages from the communication system. Decision
messages can be received at any time, whereas the reception of
totally ordered write sets can be blocked in certain phases.

4) Because Postgres-R is based on a two-phase locking approach, all
deadlock situations are local; they can be detected by the Postgres-R
code base and aborted.

5) The write set messages used to send database changes to other
servers can carry either the SQL statements or the actual tuples
changed. Which form is used is a parameter based on the number of
tuples changed by a transaction. While sending the tuple changes
reduces overhead in query parse, plan and execution, sending a large
write set across the network has a negative effect.

6) Postgres-R uses a synchronous approach that keeps the data on
all sites consistent and provides serializability. The user does not
have to bother with conflict resolution, and receives the same
correctness and consistency as with a centralized system.

7) Postgres-R could be part of a good fault-resilient and load
distribution solution.  It is peer-to-peer based and incurs low
overhead propagating updates to the other cluster members.  All
replicated databases process queries locally.

8) Compared to other synchronous replication strategies (e.g., standard
distributed 2-phase-locking + 2-phase-commit), Postgres-R has much
better performance using 2-phase-locking.
 | |
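
The trade-off in point 5 can be sketched as follows. This is not Postgres-R code: the `WriteSet` type, the function name, and the threshold value are hypothetical and exist only for illustration. The post says only that the choice between shipping SQL statements and shipping changed tuples is a parameter based on the number of tuples a transaction changed.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical cutoff; the post does not give an actual value.
TUPLE_THRESHOLD = 100


@dataclass
class WriteSet:
    kind: str      # "tuples" or "statements"
    payload: List  # changed tuples, or SQL statement text


def build_write_set(statements: List[str],
                    changed_tuples: List[Tuple],
                    threshold: int = TUPLE_THRESHOLD) -> WriteSet:
    """Pick the write-set encoding from the number of tuples changed."""
    if len(changed_tuples) <= threshold:
        # Few tuples: ship them so remote sites skip parse, plan and execution.
        return WriteSet("tuples", list(changed_tuples))
    # Many tuples: ship the SQL text so the network message stays small.
    return WriteSet("statements", list(statements))
```

With these assumptions, a single-row update is sent as one tuple, while a bulk update touching hundreds of rows is sent as its SQL text and re-executed on each remote server.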
| 
 | |
| 
 | |
There are some issues that are not currently addressed by
Postgres-R, but some enhancements made to PostgreSQL since the
6.4.2 tree are very favorable to addressing these shortcomings.

1) The addition of WAL in 7.1 provides the information needed for
recovering failed/off-line servers. Currently all the servers would
have to be stopped, and a copy would be used to get all the servers
synchronized before starting again.

2) Being synchronous, Postgres-R would not be a good solution
for off-line/WAN scenarios where asynchronous replication is
required.  There are some theories on this issue which involve servers
connecting and disconnecting from the cluster.

3) As in any serialized synchronous approach, there is a change in the
flow of execution of a transaction. While most of these changes can
be solved by calling newly developed functions at certain time points,
synchronous replica control is tightly coupled with the concurrency
control. Hence, especially in PostgreSQL 7.2, some parts of the
concurrency control (MVCC) might have to be adjusted. This can lead to
slightly more complicated maintenance than a system that does not
change the backend.

4) Partial replication is not addressed.
 | |
| 
 | |
| Any feedback on this post will be appreciated.
 | |
| 
 | |
| Thanks,
 | |
| 
 | |
| Darren 
 | |
| 
 | |
| ---------------------------(end of broadcast)---------------------------
 | |
| TIP 2: you can get off all lists at once with the unregister command
 | |
|     (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
 | |
| 
 | |
| From pgsql-hackers-owner+M9923@postgresql.org Mon Jun 11 18:14:23 2001
 | |
| Return-path: <pgsql-hackers-owner+M9923@postgresql.org>
 | |
| Received: from postgresql.org (webmail.postgresql.org [216.126.85.28])
 | |
| 	by candle.pha.pa.us (8.10.1/8.10.1) with ESMTP id f5BMENL18644
 | |
| 	for <pgman@candle.pha.pa.us>; Mon, 11 Jun 2001 18:14:23 -0400 (EDT)
 | |
| Received: from postgresql.org.org (webmail.postgresql.org [216.126.85.28])
 | |
| 	by postgresql.org (8.11.3/8.11.1) with SMTP id f5BMEQE14877;
 | |
| 	Mon, 11 Jun 2001 18:14:26 -0400 (EDT)
 | |
| 	(envelope-from pgsql-hackers-owner+M9923@postgresql.org)
 | |
| Received: from spoetnik.xs4all.nl (spoetnik.xs4all.nl [194.109.249.226])
 | |
| 	by postgresql.org (8.11.3/8.11.1) with ESMTP id f5BM6ME12270
 | |
| 	for <pgsql-hackers@postgresql.org>; Mon, 11 Jun 2001 18:06:23 -0400 (EDT)
 | |
| 	(envelope-from reinoud@xs4all.nl)
 | |
| Received: from KAYAK (kayak [192.168.1.20])
 | |
| 	by spoetnik.xs4all.nl (Postfix) with SMTP id 865A33E1B
 | |
| 	for <pgsql-hackers@postgresql.org>; Tue, 12 Jun 2001 00:06:16 +0200 (CEST)
 | |
| From: reinoud@xs4all.nl (Reinoud van Leeuwen)
 | |
| To: pgsql-hackers@postgresql.org
 | |
| Subject: Re: [HACKERS] Postgres Replication
 | |
| Date: Mon, 11 Jun 2001 22:06:07 GMT
 | |
| Organization: Not organized in any way
 | |
| Reply-To: reinoud@xs4all.nl
 | |
| Message-ID: <3b403d96.562404297@192.168.1.10>
 | |
| References: <20010611.19464400@j2.us.greatbridge.com>
 | |
| In-Reply-To: <20010611.19464400@j2.us.greatbridge.com>
 | |
| X-Mailer: Forte Agent 1.5/32.451
 | |
| MIME-Version: 1.0
 | |
| Content-Type: text/plain; charset=us-ascii
 | |
| Content-Transfer-Encoding: 8bit
 | |
| X-MIME-Autoconverted: from quoted-printable to 8bit by postgresql.org id f5BM6PE12276
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Status: OR
 | |
| 
 | |
| On Mon, 11 Jun 2001 19:46:44 GMT, you wrote:
 | |
| 
 | |
| >We have been researching replication for several months now, and
 | |
| >I have some opinions to share to the community for feedback,
 | |
| >discussion, and/or participation. Our goal is to get a replication
 | |
| >solution for PostgreSQL that will meet most needs of users
 | |
| >and applications alike (mission impossible theme here :). 
 | |
| >
 | |
| >My research work along with others contributors has been collected
 | |
| >and presented here http://www.greatbridge.org/genpage?replication_top
 | |
| >If there is something missing, especially PostgreSQL related
 | |
| >work, I would like to know about it, and my apologies to any
 | |
| >one who got left off the list. This work is ongoing and doesn't
 | |
| >draw a conclusion, which IMHO should be left up to the user,
 | |
| >but I'm offering my opinions to spur discussion and/or feed back
 | |
| >from this list, and try not to offend any one.
 | |
| >
 | |
| >Here's my opinion: of the approaches we've surveyed, the most
 | |
| >promising one is the Postgres-R project from the Information and
 | |
| >Communication Systems Group, ETH  in Zurich, Switzerland, originally 
 | |
| >produced by Bettina Kemme, Gustavo Alonso, and others.  Although 
 | |
| >Postgres-R is a synchronous approach, I believe it is the closest to 
 | |
| >the goal mentioned above. Here is an abstract of the advantages.
 | |
| >
 | |
| >1) Postgres-R is built on the PostgreSQL-6.4.2 code base.  The
 | |
| >replication functionality is an optional parameter, so there will be
 | |
| >insignificant overhead for non replication situations. The replication
 | |
| >and communication managers are the two new modules added to the
 | |
| >PostgreSQL code base.
 | |
| >
 | |
| >2) The replication manager's main function is controlling the
 | |
| >replication protocol via a message handling process. It receives
 | |
| >messages from the local and remote backends and forwards write
 | |
| >sets and decision messages via the communication manager to the
 | |
| >other servers. The replication manager controls all the transactions 
 | |
| >running on the local server by keeping track of the states, including 
 | |
| >which protocol phase (read, send, lock, or write) the transaction is
 | |
| >in. The replication manager maintains a two way channel
 | |
| >implemented as buffered sockets to each backend.
 | |
| 
 | |
| what does "manager controls all the transactions" mean? I hope it does
 | |
| *not* mean that a bug in the manager would cause transactions not to
 | |
| commit...
 | |
| 
 | |
| >
 | |
| >3) The main task of the communication manager is to provide simple
 | |
| >socket based interface between the replication manager and the
 | |
| >group communication system (currently Ensemble). The
 | |
| >communication system is a cluster of servers connected via
 | |
| >the communication manager.  The replication manager also maintains
 | |
| >three one-way channels to the communication system: a broadcast
 | |
| >channel to send messages, a total-order channel to receive
 | |
| >totally ordered write sets, and a no-order channel to listen for
 | |
| >decision messages from the communication system. Decision
 | |
| >messages can be received at any time where the reception of
 | |
| >totally ordered write sets can be blocked in certain phases.
 | |
| >
 | |
| >4) Based on a two phase locking approach, all dead lock situations
 | |
| >are local and detectable by Postgres-R code base, and aborted.
 | |
| 
 | |
| Does this imply locking over different servers? That would mean a
 | |
| grinding halt when a network outage occurs...
 | |
| 
 | |
| >5) The write set messages used to send database changes to other
 | |
| >servers, can use either the SQL statements or the actual tuples
 | |
| >changed. This is a parameter based on number of tuples changed
 | |
| >by a transaction. While sending the tuple changes reduces
 | |
| >overhead in query parse, plan and execution, there is a negative
 | |
| >effect in sending a large write set across the network.
 | |
| >
 | |
| >6) Postgres-R uses a synchronous approach that keeps the data on 
 | |
| >all sites consistent and provides serializability. The user does not 
 | |
| >have to bother with conflict resolution, and receives the same 
 | |
| >correctness and consistency of a centralized system.
 | |
| >
 | |
| >7) Postgres-R could be part of a good fault-resilient and load
 | |
| >distribution solution.  It is peer-to-peer based and incurs low
 | |
| >overhead propagating updates to the other cluster members.  All
 | |
| >replicated databases locally process queries.
 | |
| >
 | |
| >8) Compared to other synchronous replication strategies (e.g., standard 
 | |
| >distributed 2-phase-locking + 2-phase-commit), Postgres-R has much 
 | |
| >better performance using 2-phase-locking.
 | |
| 
 | |
| Coming from a Sybase background I have some experience with
 | |
| replication. The way it works in Sybase Replication server is as
 | |
| follows:
 | |
| - for each replicated database, there is a "log reader" process that
 | |
| reads the WAL and captures only *committed transactions* to the
 | |
| replication server. (it does not make much sense to replicate other
 | |
| things IMHO :-).
 | |
| - the replication server stores incoming data in a queue ("stable
 | |
| device"), until it is sure it has reached its final destination
 | |
| 
 | |
| - a replication server can send data to another replication server in
 | |
| a compact (read: WAN friendly) way. A chain of replication servers can
 | |
| be made, depending on network architecture.
 | |
| 
 | |
| - the final replication server makes an almost standard client
 | |
| connection to the target database and translates the compact
 | |
| transactions back to SQL statements. By using masks, extra
 | |
| functionality can be built in. 
 | |
| 
 | |
| This kind of architecture has several advantages:
 | |
| - only committed transactions are replicated which saves overhead
 | |
| - it does not have very much impact on performance of the source
 | |
| server (apart from reading the WAL)
 | |
| - since every replication server has a stable device, data is stored
 | |
| when the network is down and nothing gets lost (nor stops performing)
 | |
| - because only the log reader and the connection from the final
 | |
| replication server are RDBMS specific, it is possible to replicate
 | |
| from MS to Oracle using a Sybase replication server (or different
 | |
| versions etc).
 | |
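The store-and-forward behaviour listed above can be sketched roughly as follows. This is only an illustration of the idea (capture committed transactions, hold them in a queue until the next hop accepts them), not Sybase's actual implementation; the StableQueue class, the record layout (txn_id, kind, payload) and the send callback are all assumptions made for the example.

```python
from collections import deque


class StableQueue:
    """In-memory stand-in for a replication server's 'stable device' (hypothetical)."""

    def __init__(self):
        self._pending = deque()

    def capture(self, log_records):
        """Enqueue only transactions whose log contains a commit record."""
        open_txns = {}
        for txn_id, kind, payload in log_records:
            if kind == "change":
                open_txns.setdefault(txn_id, []).append(payload)
            elif kind == "commit":
                self._pending.append((txn_id, open_txns.pop(txn_id, [])))
            elif kind == "abort":
                open_txns.pop(txn_id, None)  # aborted work is never replicated

    def forward(self, send):
        """Deliver queued transactions in order; keep them if send() fails."""
        delivered = 0
        while self._pending:
            if not send(self._pending[0]):
                break  # destination unreachable: retry later, nothing is lost
            self._pending.popleft()
            delivered += 1
        return delivered

    def __len__(self):
        return len(self._pending)
```

A real stable device would of course be persistent storage rather than a Python deque; the point is only that a transaction leaves the queue after the next hop has acknowledged it, so a network outage delays delivery without dropping data.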
| 
 | |
| I do not know how much of this is patented or copyrighted, but the
 | |
| architecture seems elegant and robust to me. I have done
 | |
| implementations of bi-directional replication too. It *is* possible
 | |
| but does require some funky setup and maintenance. (but it is better
 | |
| than letting offices on different continents work on the same
 | |
| database :-)
 | |
| 
 | |
| just my 2 EURO cts  :-)
 | |
| 
 | |
| 
 | |
| -- 
 | |
| __________________________________________________
 | |
| "Nothing is as subjective as reality"
 | |
| Reinoud van Leeuwen       reinoud@xs4all.nl
 | |
| http://www.xs4all.nl/~reinoud
 | |
| __________________________________________________
 | |
| 
 | |
| ---------------------------(end of broadcast)---------------------------
 | |
| TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
 | |
| 
 | |
| From pgsql-hackers-owner+M9924@postgresql.org Mon Jun 11 18:41:51 2001
 | |
| Return-path: <pgsql-hackers-owner+M9924@postgresql.org>
 | |
| Received: from postgresql.org (webmail.postgresql.org [216.126.85.28])
 | |
| 	by candle.pha.pa.us (8.10.1/8.10.1) with ESMTP id f5BMfpL28917
 | |
| 	for <pgman@candle.pha.pa.us>; Mon, 11 Jun 2001 18:41:51 -0400 (EDT)
 | |
| Received: from postgresql.org.org (webmail.postgresql.org [216.126.85.28])
 | |
| 	by postgresql.org (8.11.3/8.11.1) with SMTP id f5BMfsE25092;
 | |
| 	Mon, 11 Jun 2001 18:41:54 -0400 (EDT)
 | |
| 	(envelope-from pgsql-hackers-owner+M9924@postgresql.org)
 | |
| Received: from spider.pilosoft.com (p55-222.acedsl.com [160.79.55.222])
 | |
| 	by postgresql.org (8.11.3/8.11.1) with ESMTP id f5BMalE23024
 | |
| 	for <pgsql-hackers@postgresql.org>; Mon, 11 Jun 2001 18:36:47 -0400 (EDT)
 | |
| 	(envelope-from alex@pilosoft.com)
 | |
| Received: from localhost (alexmail@localhost)
 | |
| 	by spider.pilosoft.com (8.9.3/8.9.3) with ESMTP id SAA06092;
 | |
| 	Mon, 11 Jun 2001 18:46:05 -0400 (EDT)
 | |
| Date: Mon, 11 Jun 2001 18:46:05 -0400 (EDT)
 | |
| From: Alex Pilosov <alex@pilosoft.com>
 | |
| To: Reinoud van Leeuwen <reinoud@xs4all.nl>
 | |
| cc: pgsql-hackers@postgresql.org
 | |
| Subject: Re: [HACKERS] Postgres Replication
 | |
| In-Reply-To: <3b403d96.562404297@192.168.1.10>
 | |
| Message-ID: <Pine.BSO.4.10.10106111828450.9902-100000@spider.pilosoft.com>
 | |
| MIME-Version: 1.0
 | |
| Content-Type: TEXT/PLAIN; charset=US-ASCII
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Status: OR
 | |
| 
 | |
| On Mon, 11 Jun 2001, Reinoud van Leeuwen wrote:
 | |
| 
 | |
| > On Mon, 11 Jun 2001 19:46:44 GMT, you wrote:
 | |
| 
 | |
| > what does "manager controls all the transactions" mean? I hope it does
 | |
| > *not* mean that a bug in the manager would cause transactions not to
 | |
| > commit...
 | |
| Well yeah it does. Bugs are a fact of life. :)
 | |
| 
 | |
| > >4) Based on a two phase locking approach, all dead lock situations
 | |
| > >are local and detectable by Postgres-R code base, and aborted.
 | |
| > 
 | |
| > Does this imply locking over different servers? That would mean a
 | |
| > grinding halt when a network outage occurs...
 | |
| Don't know, but see below.
 | |
| 
 | |
| > Coming from a Sybase background I have some experience with
 | |
| > replication. The way it works in Sybase Replication server is as
 | |
| > follows:
 | |
| > - for each replicated database, there is a "log reader" process that
 | |
| > reads the WAL and captures only *committed transactions* to the
 | |
| > replication server. (it does not make much sense to replicate other
 | |
| > things IMHO :-).
 | |
| > - the replication server stores incoming data in a queue ("stable
 | |
| > device"), until it is sure it has reached its final destination
 | |
| > 
 | |
| > - a replication server can send data to another replication server in
 | |
| > a compact (read: WAN friendly) way. A chain of replication servers can
 | |
| > be made, depending on network architecture.
 | |
| > 
 | |
| > - the final replication server makes an almost standard client
 | |
| > connection to the target database and translates the compact
 | |
| > transactions back to SQL statements. By using masks, extra
 | |
| > functionality can be built in. 
 | |
| > 
 | |
| > This kind of architecture has several advantages:
 | |
| > - only committed transactions are replicated which saves overhead
 | |
| > - it does not have very much impact on performance of the source
 | |
| > server (apart from reading the WAL)
 | |
| > - since every replication server has a stable device, data is stored
 | |
| > when the network is down and nothing gets lost (nor stops performing)
 | |
| > - because only the log reader and the connection from the final
 | |
| > replication server are RDBMS specific, it is possible to replicate
 | |
| > from MS to Oracle using a Sybase replication server (or different
 | |
| > versions etc).
 | |
| > 
 | |
| > I do not know how much of this is patented or copyrighted, but the
 | |
| > architecture seems elegant and robust to me. I have done
 | |
| > implementations of bi-directional replication too. It *is* possible
 | |
| > but does require some funky setup and maintenance. (but it is better
 | |
| > than letting offices on different continents work on the same
 | |
| > database :-)
 | |
| Yes, the above architecture is what almost every vendor of replication
 | |
| software uses. And I'm sure if you worked much with Sybase, you hate the
 | |
| garbage that their repserver is :). 
 | |
| 
 | |
| The architecture of postgres-r and repserver are fundamentally different
 | |
| for a good reason: repserver only wants to replicate committed
 | |
| transactions, while postgres-r is more of a 'clustering' solution (albeit
 | |
| they don't say this word), and is capable of doing much more than a simple rep
 | |
| server. 
 | |
| 
 | |
| I.e., you can safely put half of your clients on a second server in a
 | |
| replicated postgres-r cluster without being worried that a conflict (or a
 | |
| weird locking situation) may occur.
 | |
| 
 | |
| Try that with sybase, it is fundamentally designed for one-way
 | |
| replication, and the fact that you can do one-way replication in both
 | |
| directions doesn't mean it's safe to do that!
 | |
| 
 | |
| I'm not sure how postgres-r handles network problems. To be useful, a good
 | |
| replication solution must have an option of "no network->no updates" as
 | |
| well as "no network->queue updates and send them later". However, it is
 | |
| far easier to add queuing to a correct 'eager locking' database than it is
 | |
| to add proper locking to a queue-based replicator.
 | |
| 
 | |
| -alex
 | |
| 
 | |
| 
 | |
| ---------------------------(end of broadcast)---------------------------
 | |
| TIP 3: if posting/reading through Usenet, please send an appropriate
 | |
| subscribe-nomail command to majordomo@postgresql.org so that your
 | |
| message can get through to the mailing list cleanly
 | |
| 
 | |
| From pgsql-hackers-owner+M9932@postgresql.org Mon Jun 11 22:17:54 2001
 | |
| Return-path: <pgsql-hackers-owner+M9932@postgresql.org>
 | |
| Received: from postgresql.org (webmail.postgresql.org [216.126.85.28])
 | |
| 	by candle.pha.pa.us (8.10.1/8.10.1) with ESMTP id f5C2HsL15803
 | |
| 	for <pgman@candle.pha.pa.us>; Mon, 11 Jun 2001 22:17:54 -0400 (EDT)
 | |
| Received: from postgresql.org.org (webmail.postgresql.org [216.126.85.28])
 | |
| 	by postgresql.org (8.11.3/8.11.1) with SMTP id f5C2HtE86836;
 | |
| 	Mon, 11 Jun 2001 22:17:55 -0400 (EDT)
 | |
| 	(envelope-from pgsql-hackers-owner+M9932@postgresql.org)
 | |
| Received: from femail15.sdc1.sfba.home.com (femail15.sdc1.sfba.home.com [24.0.95.142])
 | |
| 	by postgresql.org (8.11.3/8.11.1) with ESMTP id f5C2BXE85020
 | |
| 	for <pgsql-hackers@postgresql.org>; Mon, 11 Jun 2001 22:11:33 -0400 (EDT)
 | |
| 	(envelope-from djohnson@greatbridge.com)
 | |
| Received: from greatbridge.com ([65.2.95.27])
 | |
|           by femail15.sdc1.sfba.home.com
 | |
|           (InterMail vM.4.01.03.20 201-229-121-120-20010223) with ESMTP
 | |
|           id <20010612021124.OZRG17243.femail15.sdc1.sfba.home.com@greatbridge.com>;
 | |
|           Mon, 11 Jun 2001 19:11:24 -0700
 | |
| Message-ID: <3B257969.6050405@greatbridge.com>
 | |
| Date: Mon, 11 Jun 2001 22:07:37 -0400
 | |
| From: Darren Johnson <djohnson@greatbridge.com>
 | |
| User-Agent: Mozilla/5.0 (Windows; U; WinNT4.0; en-US; m18) Gecko/20001108 Netscape6/6.0
 | |
| X-Accept-Language: en
 | |
| MIME-Version: 1.0
 | |
| To: Alex Pilosov <alex@pilosoft.com>, Reinoud van Leeuwen <reinoud@xs4all.nl>
 | |
| cc: pgsql-hackers@postgresql.org
 | |
| Subject: Re: [HACKERS] Postgres Replication
 | |
| References: <Pine.BSO.4.10.10106111828450.9902-100000@spider.pilosoft.com>
 | |
| Content-Type: text/plain; charset=us-ascii; format=flowed
 | |
| Content-Transfer-Encoding: 7bit
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Status: OR
 | |
| 
 | |
| 
 | |
| Thanks for the feedback.  I'll try to address both your issues here.
 | |
| 
 | |
| >> what does "manager controls all the transactions" mean? 
 | |
| > 
 | |
| The replication manager controls the transactions by serializing the
 | |
| write set messages.  This ensures all transactions are committed in the
 | |
| same order on each server, so bugs here are not allowed  ;-)
 | |
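To illustrate why serializing the write sets gives the same commit order everywhere, here is a minimal sketch. It assumes each write set arrives tagged with a global sequence number from the total-order channel; the Replica class and its fields are made up for the example and are not taken from the Postgres-R code.

```python
class Replica:
    """Applies write sets strictly in global sequence order (hypothetical sketch)."""

    def __init__(self):
        self.next_seq = 1    # sequence number this replica must apply next
        self.buffered = {}   # write sets that arrived ahead of their turn
        self.applied = []    # write sets in the order they were applied

    def receive(self, seq, write_set):
        """Buffer the write set, then apply every one whose turn has come."""
        self.buffered[seq] = write_set
        while self.next_seq in self.buffered:
            self.applied.append(self.buffered.pop(self.next_seq))
            self.next_seq += 1
```

Two replicas that receive the same write sets in different network orders still end up with identical applied lists, which is the property the replication manager is responsible for.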
| 
 | |
| >> I hope it does
 | |
| >> *not* mean that a bug in the manager would cause transactions not to
 | |
| >> commit...
 | |
| > 
 | |
| > Well yeah it does. Bugs are a fact of life. :)
 | |
| 
 | |
| > 
 | |
| >>> 4) Based on a two phase locking approach, all dead lock situations
 | |
| >>> are local and detectable by Postgres-R code base, and aborted.
 | |
| >> 
 | |
| >> Does this imply locking over different servers? That would mean a
 | |
| >> grinding halt when a network outage occurs...
 | |
| > 
 | |
| > Don't know, but see below.
 | |
| 
 | |
| There is a branch of the Postgres-R code that has some failure detection
 | |
| implemented, so we will have to merge this functionality with the
 | |
| version of Postgres-R we have, and test this issue.  I'll let you know
 | |
| the results.
 | |
| 
 | |
| >> 
 | |
| >> - the replication server stores incoming data in a queue ("stable
 | |
| >> device"), until it is sure it has reached its final destination
 | |
| > 
 | |
| I like this idea for recovering servers that have been down a short
 | |
| period of time, using WAL to recover transactions missed during the
 | |
| outage.
 | |
| 
 | |
| >> 
 | |
| >> This kind of architecture has several advantages:
 | |
| >> - only committed transactions are replicated which saves overhead
 | |
| >> - it does not have very much impact on performance of the source
 | |
| >> server (apart from reading the WAL)
 | |
| >> - since every replication server has a stable device, data is stored
 | |
| >> when the network is down and nothing gets lost (nor stops performing)
 | |
| >> - because only the log reader and the connection from the final
 | |
| >> replication server are RDBMS specific, it is possible to replicate
 | |
| >> from MS to Oracle using a Sybase replication server (or different
 | |
| >> versions etc).
 | |
| > 
 | |
| There are some issues with the "log reader" approach:
 | |
| 1) The databases are not synchronized until the log reader completes its
 | |
| processing.
 | |
| 2) I'm not sure about Sybase, but the log reader sends SQL statements to
 | |
| the other servers, which are then parsed, planned and executed.  This
 | |
| overhead could be avoided if only the tuple changes are replicated.
 | |
| 3) Works fine for read-only situations, but peer-to-peer applications
 | |
| using this approach must be designed with a conflict resolution scheme.
 | |
| 
 | |
| Don't get me wrong, I believe we can learn from the replication
 | |
| techniques used by commercial databases like Sybase, and try to
 | |
| implement the good ones into PostgreSQL.  Postgres-R is a synchronous
 | |
| approach which outperforms the traditional approaches to synchronous
 | |
| replication.  Being based on PostgreSQL-6.4.2, getting this approach in
 | |
| the 7.2 tree might be better than reinventing the wheel.
 | |
| 
 | |
| Thanks again,
 | |
| 
 | |
| Darren
 | |
| 
 | |
| 
 | |
| 
 | |
| ---------------------------(end of broadcast)---------------------------
 | |
| TIP 6: Have you searched our list archives?
 | |
| 
 | |
| http://www.postgresql.org/search.mpl
 | |
| 
 | |
| From pgsql-hackers-owner+M9936@postgresql.org Tue Jun 12 03:22:51 2001
 | |
| Return-path: <pgsql-hackers-owner+M9936@postgresql.org>
 | |
| Received: from postgresql.org (webmail.postgresql.org [216.126.85.28])
 | |
| 	by candle.pha.pa.us (8.10.1/8.10.1) with ESMTP id f5C7MoL11061
 | |
| 	for <pgman@candle.pha.pa.us>; Tue, 12 Jun 2001 03:22:50 -0400 (EDT)
 | |
| Received: from postgresql.org.org (webmail.postgresql.org [216.126.85.28])
 | |
| 	by postgresql.org (8.11.3/8.11.1) with SMTP id f5C7MPE35441;
 | |
| 	Tue, 12 Jun 2001 03:22:25 -0400 (EDT)
 | |
| 	(envelope-from pgsql-hackers-owner+M9936@postgresql.org)
 | |
| Received: from reorxrsm.server.lan.at (zep3.it-austria.net [213.150.1.73])
 | |
| 	by postgresql.org (8.11.3/8.11.1) with ESMTP id f5C72ZE25009
 | |
| 	for <pgsql-hackers@postgresql.org>; Tue, 12 Jun 2001 03:02:36 -0400 (EDT)
 | |
| 	(envelope-from ZeugswetterA@wien.spardat.at)
 | |
| Received: from gz0153.gc.spardat.at (gz0153.gc.spardat.at [172.20.10.149])
 | |
| 	by reorxrsm.server.lan.at (8.11.2/8.11.2) with ESMTP id f5C72Qu27966
 | |
| 	for <pgsql-hackers@postgresql.org>; Tue, 12 Jun 2001 09:02:26 +0200
 | |
| Received: by sdexcgtw01.f000.d0188.sd.spardat.at with Internet Mail Service (5.5.2650.21)
 | |
| 	id <M3L15341>; Tue, 12 Jun 2001 09:02:21 +0200
 | |
| Message-ID: <11C1E6749A55D411A9670001FA68796336831B@sdexcsrv1.f000.d0188.sd.spardat.at>
 | |
| From: Zeugswetter Andreas SB  <ZeugswetterA@wien.spardat.at>
 | |
| To: "'Darren Johnson'" <djohnson@greatbridge.com>,
 | |
|    pgsql-hackers@postgresql.org
 | |
| Subject: AW: [HACKERS] Postgres Replication
 | |
| Date: Tue, 12 Jun 2001 09:02:20 +0200
 | |
| MIME-Version: 1.0
 | |
| X-Mailer: Internet Mail Service (5.5.2650.21)
 | |
| Content-Type: text/plain;
 | |
| 	charset="iso-8859-1"
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Status: OR
 | |
| 
 | |
| 
 | |
| > Although 
 | |
| > Postgres-R is a synchronous approach, I believe it is the closest to 
 | |
| > the goal mentioned above. Here is an abstract of the advantages.
 | |
| 
 | |
| If you only want synchronous replication, why not simply use triggers ?
 | |
| All you would then need is remote query access and two phase commit,
 | |
| and maybe a little script that helps create the appropriate triggers.
 | |
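For readers unfamiliar with two phase commit, the coordinator logic that such a trigger setup would rely on looks roughly like this. It is a bare sketch under simplifying assumptions (no timeouts, no crash recovery, no real network); the Participant class and two_phase_commit function are invented names for the example, not an existing PostgreSQL interface.

```python
class Participant:
    """A remote server taking part in the transaction (hypothetical stand-in)."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "idle"

    def prepare(self):
        """Phase 1: vote yes only if the local work can be made durable."""
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


def two_phase_commit(participants):
    """Commit everywhere only if every participant votes yes in phase 1."""
    votes = [p.prepare() for p in participants]
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.abort()
    return False
```

A single "no" vote aborts the transaction on every server, which is what keeps the copies identical, and also why one unreachable server blocks all updates.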
| 
 | |
| A replicate-all-or-nothing approach that only works synchronously
 | |
| is imho not flexible enough.
 | |
| 
 | |
| Andreas
 | |
| 
 | |
| ---------------------------(end of broadcast)---------------------------
 | |
| TIP 6: Have you searched our list archives?
 | |
| 
 | |
| http://www.postgresql.org/search.mpl
 | |
| 
 | |
| From pgsql-hackers-owner+M9945@postgresql.org Tue Jun 12 10:18:29 2001
 | |
| Return-path: <pgsql-hackers-owner+M9945@postgresql.org>
 | |
| Received: from postgresql.org (webmail.postgresql.org [216.126.85.28])
 | |
| 	by candle.pha.pa.us (8.10.1/8.10.1) with ESMTP id f5CEISL06372
 | |
| 	for <pgman@candle.pha.pa.us>; Tue, 12 Jun 2001 10:18:28 -0400 (EDT)
 | |
| Received: from postgresql.org.org (webmail.postgresql.org [216.126.85.28])
 | |
| 	by postgresql.org (8.11.3/8.11.1) with SMTP id f5CEIQE77517;
 | |
| 	Tue, 12 Jun 2001 10:18:26 -0400 (EDT)
 | |
| 	(envelope-from pgsql-hackers-owner+M9945@postgresql.org)
 | |
| Received: from krypton.netropolis.org ([208.222.215.99])
 | |
| 	by postgresql.org (8.11.3/8.11.1) with ESMTP id f5CEDuE75514
 | |
| 	for <pgsql-hackers@postgresql.org>; Tue, 12 Jun 2001 10:13:56 -0400 (EDT)
 | |
| 	(envelope-from root@generalogic.com)
 | |
| Received: from [132.216.183.103] (helo=localhost)
 | |
| 	by krypton.netropolis.org with esmtp (Exim 3.12 #1 (Debian))
 | |
| 	id 159ouq-0003MU-00
 | |
| 	for <pgsql-hackers@postgresql.org>; Tue, 12 Jun 2001 10:13:08 -0400
 | |
| To: pgsql-hackers@postgresql.org
 | |
| Subject: Re: AW: [HACKERS] Postgres Replication
 | |
| In-Reply-To: <20010612.13321600@j2.us.greatbridge.com>
 | |
| References: <Pine.BSF.4.33.0106120605130.411-100000@mobile.hub.org>
 | |
| 	<20010612.13321600@j2.us.greatbridge.com>
 | |
| X-Mailer: Mew version 1.94.2 on Emacs 20.7 / Mule 4.0 (HANANOEN)
 | |
| MIME-Version: 1.0
 | |
| Content-Type: Text/Plain; charset=us-ascii
 | |
| Content-Transfer-Encoding: 7bit
 | |
| Message-ID: <20010612123623O.root@generalogic.com>
 | |
| Date: Tue, 12 Jun 2001 12:36:23 +0530
 | |
| From: root <root@generalogic.com>
 | |
| X-Dispatcher: imput version 20000414(IM141)
 | |
| Lines: 47
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Status: OR
 | |
| 
 | |
| 
 | |
| Hello
 | |
| 
 | |
| I have hacked up a replication layer for Perl code accessing a
 | |
| database through the DBI interface. It works pretty well with MySQL
 | |
| (I can run pre-bender slashcode replicated, haven't tried the more
 | |
| recent releases).
 | |
| 
 | |
| Potentially this hack should also work with Pg but I haven't tried
 | |
| yet. If someone would like to test it out with a complex Pg app and
 | |
| let me know how it went that would be cool.
 | |
| 
 | |
| The replication layer is based on Eric Newton's Recall replication
 | |
| library (www.fault-tolerant.org/recall), and requires that all
 | |
| database accesses be through the DBI interface.
 | |
| 
 | |
| The replicas are live, in that every operation affects all the
 | |
| replicas in real time. Replica outages are invisible to the user, so
 | |
| long as a majority of the replicas are functioning. Disconnected
 | |
| replicas can be used for read-only access.
 | |
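The majority rule described above can be written down in a few lines. This is just a sketch of the rule as stated in this message, not code from the Recall library; the function names are made up for the example.

```python
def has_majority(reachable, total):
    """True when strictly more than half of the replicas can be reached."""
    return reachable > total // 2


def allowed_operations(reachable, total):
    """Writes need a majority; a disconnected minority stays read-only."""
    if has_majority(reachable, total):
        return {"read", "write"}
    return {"read"}
```

With the three replicas from the connect() example below, losing one replica leaves two of three reachable and writes continue, while a replica cut off on its own can only serve reads.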
| 
 | |
| The only code modification that should be required to use the
 | |
| replication layer is to change the DSN in connect():
 | |
| 
 | |
|   my $replicas = '192.168.1.1:7000,192.168.1.2:7000,192.168.1.3:7000';
 | |
|   my $dbh = DBI->connect("DBI:Recall:database=$replicas");
 | |
| 
 | |
| You should be able to install the replication modules with:
 | |
| 
 | |
| perl -MCPAN -eshell
 | |
| cpan> install Replication::Recall::DBServer
 | |
| 
 | |
| and then install DBD::Recall (which doesn't seem to be accessible from
 | |
| the CPAN shell yet, for some reason), by:
 | |
| 
 | |
| wget http://www.cpan.org/authors/id/AGUL/DBD-Recall-1.10.tar.gz
 | |
| tar xzvf DBD-Recall-1.10.tar.gz
 | |
| cd DBD-Recall-1.10
 | |
| perl Makefile.PL
 | |
| make install
 | |
| 
 | |
| I would be very interested in hearing about your experiences with
 | |
| this...
 | |
| 
 | |
| Thanks
 | |
| 
 | |
| #!
 | |
| 
 | |
| ---------------------------(end of broadcast)---------------------------
 | |
| TIP 3: if posting/reading through Usenet, please send an appropriate
 | |
| subscribe-nomail command to majordomo@postgresql.org so that your
 | |
| message can get through to the mailing list cleanly
 | |
| 
 | |
| From pgsql-hackers-owner+M9938@postgresql.org Tue Jun 12 05:12:54 2001
 | |
| Return-path: <pgsql-hackers-owner+M9938@postgresql.org>
 | |
| Received: from postgresql.org (webmail.postgresql.org [216.126.85.28])
 | |
| 	by candle.pha.pa.us (8.10.1/8.10.1) with ESMTP id f5C9CrL15228
 | |
| 	for <pgman@candle.pha.pa.us>; Tue, 12 Jun 2001 05:12:53 -0400 (EDT)
 | |
| Received: from postgresql.org.org (webmail.postgresql.org [216.126.85.28])
 | |
| 	by postgresql.org (8.11.3/8.11.1) with SMTP id f5C9CnE91297;
 | |
| 	Tue, 12 Jun 2001 05:12:49 -0400 (EDT)
 | |
| 	(envelope-from pgsql-hackers-owner+M9938@postgresql.org)
 | |
| Received: from mobile.hub.org (SHW39-29.accesscable.net [24.138.39.29])
 | |
| 	by postgresql.org (8.11.3/8.11.1) with ESMTP id f5C98DE89175
 | |
| 	for <pgsql-hackers@postgresql.org>; Tue, 12 Jun 2001 05:08:13 -0400 (EDT)
 | |
| 	(envelope-from scrappy@hub.org)
 | |
| Received: from localhost (scrappy@localhost)
 | |
| 	by mobile.hub.org (8.11.3/8.11.1) with ESMTP id f5C97f361630;
 | |
| 	Tue, 12 Jun 2001 06:07:46 -0300 (ADT)
 | |
| 	(envelope-from scrappy@hub.org)
 | |
| X-Authentication-Warning: mobile.hub.org: scrappy owned process doing -bs
 | |
| Date: Tue, 12 Jun 2001 06:07:41 -0300 (ADT)
 | |
| From: The Hermit Hacker <scrappy@hub.org>
 | |
| To: Zeugswetter Andreas SB <ZeugswetterA@wien.spardat.at>
 | |
| cc: "'Darren Johnson'" <djohnson@greatbridge.com>,
 | |
|    <pgsql-hackers@postgresql.org>
 | |
| Subject: Re: AW: [HACKERS] Postgres Replication
 | |
| In-Reply-To: <11C1E6749A55D411A9670001FA68796336831B@sdexcsrv1.f000.d0188.sd.spardat.at>
 | |
| Message-ID: <Pine.BSF.4.33.0106120605130.411-100000@mobile.hub.org>
 | |
| MIME-Version: 1.0
 | |
| Content-Type: TEXT/PLAIN; charset=US-ASCII
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Status: OR
 | |
| 
 | |
| 
 | |
| which I believe is what the rserv implementation in contrib currently does
 | |
| ... no?
 | |
| 
 | |
| it's funny ... what is in contrib right now was developed in a weekend by
 | |
| Vadim, put in contrib, yet nobody has either used it *or* seen fit to
 | |
| submit patches to improve it ... ?
 | |
| 
 | |
| On Tue, 12 Jun 2001, Zeugswetter Andreas SB wrote:
 | |
| 
 | |
| >
 | |
| > > Although
 | |
| > > Postgres-R is a synchronous approach, I believe it is the closest to
 | |
| > > the goal mentioned above. Here is an abstract of the advantages.
 | |
| >
 | |
| > If you only want synchronous replication, why not simply use triggers ?
 | |
| > All you would then need is remote query access and two phase commit,
 | |
| > and maybe a little script that helps create the appropriate triggers.
 | |
| >
 | |
| > Doing a replicate all or nothing approach that only works synchronous
 | |
| > is imho not flexible enough.
 | |
| >
 | |
| > Andreas
 | |
| >
 | |
| > ---------------------------(end of broadcast)---------------------------
 | |
| > TIP 6: Have you searched our list archives?
 | |
| >
 | |
| > http://www.postgresql.org/search.mpl
 | |
| >
 | |
| 
 | |
| Marc G. Fournier                   ICQ#7615664               IRC Nick: Scrappy
 | |
| Systems Administrator @ hub.org
 | |
| primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org
 | |
| 
 | |
| 
 | |
| ---------------------------(end of broadcast)---------------------------
 | |
| TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
 | |
| 
 | |
| From pgsql-hackers-owner+M9940@postgresql.org Tue Jun 12 09:39:08 2001
 | |
| Return-path: <pgsql-hackers-owner+M9940@postgresql.org>
 | |
| Received: from postgresql.org (webmail.postgresql.org [216.126.85.28])
 | |
| 	by candle.pha.pa.us (8.10.1/8.10.1) with ESMTP id f5CDd8L03200
 | |
| 	for <pgman@candle.pha.pa.us>; Tue, 12 Jun 2001 09:39:08 -0400 (EDT)
 | |
| Received: from postgresql.org.org (webmail.postgresql.org [216.126.85.28])
 | |
| 	by postgresql.org (8.11.3/8.11.1) with SMTP id f5CDcmE58175;
 | |
| 	Tue, 12 Jun 2001 09:38:48 -0400 (EDT)
 | |
| 	(envelope-from pgsql-hackers-owner+M9940@postgresql.org)
 | |
| Received: from mail.greatbridge.com (mail.greatbridge.com [65.196.68.36])
 | |
| 	by postgresql.org (8.11.3/8.11.1) with ESMTP id f5CDYAE56164
 | |
| 	for <pgsql-hackers@postgresql.org>; Tue, 12 Jun 2001 09:34:10 -0400 (EDT)
 | |
| 	(envelope-from djohnson@greatbridge.com)
 | |
| Received: from j2.us.greatbridge.com (djohnsonpc.us.greatbridge.com [65.196.69.70])
 | |
| 	by mail.greatbridge.com (8.11.2/8.11.2) with SMTP id f5CDXeQ03585;
 | |
| 	Tue, 12 Jun 2001 09:33:40 -0400
 | |
| From: Darren Johnson <djohnson@greatbridge.com>
 | |
| Date: Tue, 12 Jun 2001 13:32:16 GMT
 | |
| Message-ID: <20010612.13321600@j2.us.greatbridge.com>
 | |
| Subject: Re: AW: [HACKERS] Postgres Replication
 | |
| To: The Hermit Hacker <scrappy@hub.org>
 | |
| cc: Zeugswetter Andreas SB <ZeugswetterA@wien.spardat.at>,
 | |
|    <pgsql-hackers@postgresql.org>
 | |
| Reply-To: Darren Johnson <djohnson@greatbridge.com>
 | |
| In-Reply-To: <Pine.BSF.4.33.0106120605130.411-100000@mobile.hub.org>
 | |
| References: <Pine.BSF.4.33.0106120605130.411-100000@mobile.hub.org>
 | |
| X-Mailer: Mozilla/3.0 (compatible; StarOffice/5.2;Linux)
 | |
| X-Priority: 3 (Normal)
 | |
| MIME-Version: 1.0
 | |
| Content-Type: text/plain; charset=ISO-8859-1
 | |
| Content-Transfer-Encoding: 8bit
 | |
| X-MIME-Autoconverted: from quoted-printable to 8bit by postgresql.org id f5CDYAE56166
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Status: OR
 | |
| 
 | |
| 
 | |
| > which I believe is what the rserv implementation in contrib currently does
 | |
| > ... no?
 | |
| 
 | |
| We tried rserv, PG Link (Joseph Conway), and PostgreSQL Replicator.  All
 | |
| these projects are trigger based asynchronous replication.  They all have
 | |
| some advantages over the current functionality of Postgres-R some of 
 | |
| which I believe can be addressed:
 | |
| 
 | |
| 1) Partial replication - being able to replicate just one or part of a
 | |
| table(s)
 | |
| 2) They make no changes to the PostgreSQL code base. (Postgres-R can't 
 | |
| address this one ;)
 | |
| 3) PostgreSQL Replicator has some very nice conflict resolution schemes.
 | |
| 
 | |
| 
 | |
| Here are some disadvantages to using a "trigger based" approach:
 | |
| 
 | |
| 1) Triggers simply transfer individual data items when they are modified,
 | |
| they do not keep track of transactions.
 | |
| 2) The execution of triggers within a database imposes a performance 
 | |
| overhead to that database.
 | |
| 3) Triggers require careful management by database administrators.  
 | |
| Someone needs to keep track of all the "alarms" going off.
 | |
| 4) The activation of triggers in a database cannot be easily 
 | |
| rolled back or undone.
 | |
| 
 | |
| 
 | |
| 
 | |
| > On Tue, 12 Jun 2001, Zeugswetter Andreas SB wrote:
 | |
| 
 | |
| > > Doing a replicate all or nothing approach that only works synchronous
 | |
| > > is imho not flexible enough.
 | |
| > >
 | |
| 
 | |
| 
 | |
| I agree.  Partial and asynchronous replication need to be addressed, 
 | |
| and some of the common functionality of Postgres-R could possibly 
 | |
| be used to meet those needs. 
 | |
|  
 | |
| 
 | |
| Thanks for your feedback,
 | |
| 
 | |
| Darren
 | |
| 
 | |
| ---------------------------(end of broadcast)---------------------------
 | |
| TIP 5: Have you checked our extensive FAQ?
 | |
| 
 | |
| http://www.postgresql.org/users-lounge/docs/faq.html
 | |
| 
 | |
| From pgsql-hackers-owner+M9969@postgresql.org Tue Jun 12 16:53:45 2001
 | |
| Return-path: <pgsql-hackers-owner+M9969@postgresql.org>
 | |
| Received: from postgresql.org (webmail.postgresql.org [216.126.85.28])
 | |
| 	by candle.pha.pa.us (8.10.1/8.10.1) with ESMTP id f5CKriL23104
 | |
| 	for <pgman@candle.pha.pa.us>; Tue, 12 Jun 2001 16:53:44 -0400 (EDT)
 | |
| Received: from postgresql.org.org (webmail.postgresql.org [216.126.85.28])
 | |
| 	by postgresql.org (8.11.3/8.11.1) with SMTP id f5CKrlE87423;
 | |
| 	Tue, 12 Jun 2001 16:53:47 -0400 (EDT)
 | |
| 	(envelope-from pgsql-hackers-owner+M9969@postgresql.org)
 | |
| Received: from sectorbase2.sectorbase.com (sectorbase2.sectorbase.com [63.88.121.62] (may be forged))
 | |
| 	by postgresql.org (8.11.3/8.11.1) with SMTP id f5CHWkE69562
 | |
| 	for <pgsql-hackers@postgresql.org>; Tue, 12 Jun 2001 13:32:46 -0400 (EDT)
 | |
| 	(envelope-from vmikheev@SECTORBASE.COM)
 | |
| Received: by sectorbase2.sectorbase.com with Internet Mail Service (5.5.2653.19)
 | |
| 	id <MX6MWMV8>; Tue, 12 Jun 2001 10:30:29 -0700
 | |
| Message-ID: <3705826352029646A3E91C53F7189E32016670@sectorbase2.sectorbase.com>
 | |
| From: "Mikheev, Vadim" <vmikheev@SECTORBASE.COM>
 | |
| To: "'Darren Johnson'" <djohnson@greatbridge.com>,
 | |
|    The Hermit Hacker
 | |
|   <scrappy@hub.org>
 | |
| cc: Zeugswetter Andreas SB <ZeugswetterA@wien.spardat.at>,
 | |
|    pgsql-hackers@postgresql.org
 | |
| Subject: RE: AW: [HACKERS] Postgres Replication
 | |
| Date: Tue, 12 Jun 2001 10:30:27 -0700
 | |
| MIME-Version: 1.0
 | |
| X-Mailer: Internet Mail Service (5.5.2653.19)
 | |
| Content-Type: text/plain;
 | |
| 	charset="iso-8859-1"
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Status: OR
 | |
| 
 | |
| > Here are some disadvantages to using a "trigger based" approach:
 | |
| > 
 | |
| > 1) Triggers simply transfer individual data items when they 
 | |
| > are modified, they do not keep track of transactions.
 | |
| 
 | |
| I don't know about other *async* replication engines but Rserv
 | |
| keeps track of transactions (if I understood you correctly).
 | |
| Rserv transfers not individual modified data items but
 | |
| a *consistent* snapshot of changes, to move the slave database from
 | |
| one *consistent* state (when all RI constraints are satisfied)
 | |
| to another *consistent* state.
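
A minimal sketch of the idea described above, written in Python and not taken from Rserv itself: row changes are logged together with a transaction id, and the slave only ever applies whole committed transactions, so it moves from one consistent state to the next. The names `Change`, `committed_batches`, and `apply_batches` are hypothetical, and ordering batches by first appearance in the log is a simplifying assumption (a real system would order by commit).

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional, Set


@dataclass(frozen=True)
class Change:
    xid: int               # id of the transaction that made the change
    key: str               # primary key of the modified row
    value: Optional[str]   # new row value, or None for a delete


def committed_batches(log: Iterable[Change], committed: Set[int]) -> List[List[Change]]:
    """Group logged changes by transaction, keeping only committed ones."""
    batches: Dict[int, List[Change]] = {}
    for change in log:
        if change.xid in committed:
            batches.setdefault(change.xid, []).append(change)
    return list(batches.values())


def apply_batches(slave: Dict[str, str], batches: List[List[Change]]) -> None:
    """Apply each transaction's changes as one unit."""
    for batch in batches:
        staged = dict(slave)          # work on a copy of the slave state
        for change in batch:
            if change.value is None:
                staged.pop(change.key, None)
            else:
                staged[change.key] = change.value
        # Swapping in the staged copy stands in for one transaction on the slave.
        slave.clear()
        slave.update(staged)


log = [
    Change(1, "order:7", "new"),
    Change(2, "order:8", "new"),      # transaction 2 never commits
    Change(1, "item:7/1", "widget"),
]
slave: Dict[str, str] = {}
apply_batches(slave, committed_batches(log, committed={1}))
```

Here the slave ends up with both rows written by transaction 1 and nothing from the uncommitted transaction 2, so a referential constraint between the order and its item would hold at every point where the slave is readable.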
 | |
| 
 | |
| > 4) The activation of triggers in a database cannot be easily
 | |
| > rolled back or undone.
 | |
| 
 | |
| What do you mean?
 | |
| 
 | |
| Vadim
 | |
| 
 | |
| ---------------------------(end of broadcast)---------------------------
 | |
| TIP 2: you can get off all lists at once with the unregister command
 | |
|     (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
 | |
| 
 | |
| From pgsql-hackers-owner+M9967@postgresql.org Tue Jun 12 16:42:11 2001
 | |
| Return-path: <pgsql-hackers-owner+M9967@postgresql.org>
 | |
| Received: from postgresql.org (webmail.postgresql.org [216.126.85.28])
 | |
| 	by candle.pha.pa.us (8.10.1/8.10.1) with ESMTP id f5CKgBL17982
 | |
| 	for <pgman@candle.pha.pa.us>; Tue, 12 Jun 2001 16:42:11 -0400 (EDT)
 | |
| Received: from postgresql.org.org (webmail.postgresql.org [216.126.85.28])
 | |
| 	by postgresql.org (8.11.3/8.11.1) with SMTP id f5CKgDE80566;
 | |
| 	Tue, 12 Jun 2001 16:42:13 -0400 (EDT)
 | |
| 	(envelope-from pgsql-hackers-owner+M9967@postgresql.org)
 | |
| Received: from mail.greatbridge.com (mail.greatbridge.com [65.196.68.36])
 | |
| 	by postgresql.org (8.11.3/8.11.1) with ESMTP id f5CIVdE07561
 | |
| 	for <pgsql-hackers@postgresql.org>; Tue, 12 Jun 2001 14:31:39 -0400 (EDT)
 | |
| 	(envelope-from djohnson@greatbridge.com)
 | |
| Received: from j2.us.greatbridge.com (djohnsonpc.us.greatbridge.com [65.196.69.70])
 | |
| 	by mail.greatbridge.com (8.11.2/8.11.2) with SMTP id f5CIUfQ10080;
 | |
| 	Tue, 12 Jun 2001 14:30:41 -0400
 | |
| From: Darren Johnson <djohnson@greatbridge.com>
 | |
| Date: Tue, 12 Jun 2001 18:29:20 GMT
 | |
| Message-ID: <20010612.18292000@j2.us.greatbridge.com>
 | |
| Subject: RE: AW: [HACKERS] Postgres Replication
 | |
| To: "Mikheev, Vadim" <vmikheev@SECTORBASE.COM>
 | |
| cc: The Hermit Hacker <scrappy@hub.org>,
 | |
|    Zeugswetter Andreas SB
 | |
| 	<ZeugswetterA@wien.spardat.at>,
 | |
|    pgsql-hackers@postgresql.org
 | |
| Reply-To: Darren Johnson <djohnson@greatbridge.com>
 | |
| 	<3705826352029646A3E91C53F7189E32016670@sectorbase2.sectorbase.com>
 | |
| References: <3705826352029646A3E91C53F7189E32016670@sectorbase2.sectorbase.com>
 | |
| X-Mailer: Mozilla/3.0 (compatible; StarOffice/5.2;Linux)
 | |
| X-Priority: 3 (Normal)
 | |
| MIME-Version: 1.0
 | |
| Content-Type: text/plain; charset=ISO-8859-1
 | |
| Content-Transfer-Encoding: 8bit
 | |
| X-MIME-Autoconverted: from quoted-printable to 8bit by postgresql.org id f5CIVdE07562
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Status: OR
 | |
| 
 | |
| 
 | |
| 
 | |
| > > Here are some disadvantages to using a "trigger based" approach:
 | |
| > >
 | |
| > > 1) Triggers simply transfer individual data items when they
 | |
| > > are modified, they do not keep track of transactions.
 | |
| 
 | |
| > I don't know about other *async* replication engines but Rserv
 | |
| > keeps track of transactions (if I understood you correctly).
 | |
| > Rserv transfers not individual modified data items but
 | |
| > a *consistent* snapshot of changes, to move the slave database from
 | |
| > one *consistent* state (when all RI constraints are satisfied)
 | |
| > to another *consistent* state.
 | |
| 
 | |
| I thought Andreas did a good job of correcting me here.  Points 1 and 4
 | |
| do not apply to transaction-based replication with triggers.  I
 | |
| should have made a distinction between non-transaction-based and
 | |
| transaction-based replication with triggers.  I was not trying to
 | |
| single out rserv or any other project, and I can see how my wording
 | |
| invites that misinterpretation (my apologies).
 | |
|  
 | |
| 
 | |
| > > 4) The activation of triggers in a database cannot be easily
 | |
| > > rolled back or undone.
 | |
| 
 | |
| > What do you mean?
 | |
| 
 | |
| Once the trigger fires, it is not an easy task to abort that
 | |
| execution via rollback or undo.  Again this is not an issue 
 | |
| with a transaction-based trigger approach.
 | |
| 
 | |
| 
 | |
| Sincerely,
 | |
| 
 | |
| Darren
 | |
| 
 | |
| ---------------------------(end of broadcast)---------------------------
 | |
| TIP 2: you can get off all lists at once with the unregister command
 | |
|     (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
 | |
| 
 | |
| From pgsql-hackers-owner+M9943@postgresql.org Tue Jun 12 10:03:02 2001
 | |
| Return-path: <pgsql-hackers-owner+M9943@postgresql.org>
 | |
| Received: from postgresql.org (webmail.postgresql.org [216.126.85.28])
 | |
| 	by candle.pha.pa.us (8.10.1/8.10.1) with ESMTP id f5CE32L04619
 | |
| 	for <pgman@candle.pha.pa.us>; Tue, 12 Jun 2001 10:03:02 -0400 (EDT)
 | |
| Received: from postgresql.org.org (webmail.postgresql.org [216.126.85.28])
 | |
| 	by postgresql.org (8.11.3/8.11.1) with SMTP id f5CE31E70430;
 | |
| 	Tue, 12 Jun 2001 10:03:01 -0400 (EDT)
 | |
| 	(envelope-from pgsql-hackers-owner+M9943@postgresql.org)
 | |
| Received: from fizbanrsm.server.lan.at (zep4.it-austria.net [213.150.1.74])
 | |
| 	by postgresql.org (8.11.3/8.11.1) with ESMTP id f5CDoQE64062
 | |
| 	for <pgsql-hackers@postgresql.org>; Tue, 12 Jun 2001 09:50:26 -0400 (EDT)
 | |
| 	(envelope-from ZeugswetterA@wien.spardat.at)
 | |
| Received: from gz0153.gc.spardat.at (gz0153.gc.spardat.at [172.20.10.149])
 | |
| 	by fizbanrsm.server.lan.at (8.11.2/8.11.2) with ESMTP id f5CDoJe11224
 | |
| 	for <pgsql-hackers@postgresql.org>; Tue, 12 Jun 2001 15:50:19 +0200
 | |
| Received: by sdexcgtw01.f000.d0188.sd.spardat.at with Internet Mail Service (5.5.2650.21)
 | |
| 	id <M3L15S4T>; Tue, 12 Jun 2001 15:50:15 +0200
 | |
| Message-ID: <11C1E6749A55D411A9670001FA68796336831F@sdexcsrv1.f000.d0188.sd.spardat.at>
 | |
| From: Zeugswetter Andreas SB  <ZeugswetterA@wien.spardat.at>
 | |
| To: "'Darren Johnson'" <djohnson@greatbridge.com>,
 | |
|    The Hermit Hacker
 | |
|   <scrappy@hub.org>
 | |
| cc: pgsql-hackers@postgresql.org
 | |
| Subject: AW: AW: [HACKERS] Postgres Replication
 | |
| Date: Tue, 12 Jun 2001 15:50:09 +0200
 | |
| MIME-Version: 1.0
 | |
| X-Mailer: Internet Mail Service (5.5.2650.21)
 | |
| Content-Type: text/plain;
 | |
| 	charset="iso-8859-1"
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Status: OR
 | |
| 
 | |
| 
 | |
| > Here are some disadvantages to using a "trigger based" approach:
 | |
| > 
 | |
| > 1) Triggers simply transfer individual data items when they 
 | |
| > are modified, they do not keep track of transactions.
 | |
| > 2) The execution of triggers within a database imposes a performance 
 | |
| > overhead to that database.
 | |
| > 3) Triggers require careful management by database administrators.  
 | |
| > Someone needs to keep track of all the "alarms" going off.
 | |
| > 4) The activation of triggers in a database cannot be easily 
 | |
| > rolled back or undone.
 | |
| 
 | |
| Yes, points 2 and 3 are a given, although point 2 buys you the functionality
 | |
| of transparent locking across all involved db servers.
 | |
| Points 1 and 4 are only the case for a trigger mechanism that does 
 | |
| not use remote connection and 2-phase commit. 
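
To make the two-phase commit point concrete, here is a small Python sketch under stated assumptions: `Replica` is an in-memory stand-in for a database server and `replicate` plays the coordinator. Neither name comes from rserv, Postgres-R, or PostgreSQL; they are hypothetical and only illustrate the protocol shape.

```python
from typing import Dict, List


class Replica:
    """In-memory stand-in for one database server (hypothetical)."""

    def __init__(self, name: str, reachable: bool = True) -> None:
        self.name = name
        self.reachable = reachable
        self.data: Dict[str, str] = {}
        self.pending: Dict[str, str] = {}

    def prepare(self, changes: Dict[str, str]) -> bool:
        """Phase 1: stage the changes and vote on whether they can commit."""
        if not self.reachable:
            return False
        self.pending = dict(changes)
        return True

    def commit(self) -> None:
        """Phase 2: make the staged changes visible."""
        self.data.update(self.pending)
        self.pending = {}

    def rollback(self) -> None:
        """Phase 2, abort path: discard the staged changes."""
        self.pending = {}


def replicate(changes: Dict[str, str], replicas: List[Replica]) -> bool:
    """Commit on every replica, or on none of them."""
    votes = [r.prepare(changes) for r in replicas]
    if all(votes):
        for r in replicas:
            r.commit()
        return True
    for r in replicas:
        r.rollback()
    return False


a, b = Replica("a"), Replica("b")
first = replicate({"k": "v"}, [a, b])           # both vote yes
down = Replica("c", reachable=False)
second = replicate({"k": "w"}, [a, down])       # one vote is missing
```

Because the change becomes visible only after every replica has voted yes, a write is either applied as a whole transaction everywhere or rolled back everywhere, which is why points 1 and 4 stop applying once the trigger mechanism works this way.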
 | |
| 
 | |
| Imho an implementation that opens a separate client connection to the 
 | |
| replication target is only suited for async replication, and for that a WAL 
 | |
| based solution would probably impose less overhead.
 | |
| 
 | |
| Andreas
 | |
| 
 | |
| ---------------------------(end of broadcast)---------------------------
 | |
| TIP 2: you can get off all lists at once with the unregister command
 | |
|     (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
 | |
| 
 | |
| From pgsql-hackers-owner+M9946@postgresql.org Tue Jun 12 10:47:09 2001
 | |
| Return-path: <pgsql-hackers-owner+M9946@postgresql.org>
 | |
| Received: from postgresql.org (webmail.postgresql.org [216.126.85.28])
 | |
| 	by candle.pha.pa.us (8.10.1/8.10.1) with ESMTP id f5CEl9L08144
 | |
| 	for <pgman@candle.pha.pa.us>; Tue, 12 Jun 2001 10:47:09 -0400 (EDT)
 | |
| Received: from postgresql.org.org (webmail.postgresql.org [216.126.85.28])
 | |
| 	by postgresql.org (8.11.3/8.11.1) with SMTP id f5CEihE88714;
 | |
| 	Tue, 12 Jun 2001 10:44:43 -0400 (EDT)
 | |
| 	(envelope-from pgsql-hackers-owner+M9946@postgresql.org)
 | |
| Received: from mail.greatbridge.com (mail.greatbridge.com [65.196.68.36])
 | |
| 	by postgresql.org (8.11.3/8.11.1) with ESMTP id f5CEd6E85859
 | |
| 	for <pgsql-hackers@postgresql.org>; Tue, 12 Jun 2001 10:39:06 -0400 (EDT)
 | |
| 	(envelope-from djohnson@greatbridge.com)
 | |
| Received: from j2.us.greatbridge.com (djohnsonpc.us.greatbridge.com [65.196.69.70])
 | |
| 	by mail.greatbridge.com (8.11.2/8.11.2) with SMTP id f5CEcgQ04905;
 | |
| 	Tue, 12 Jun 2001 10:38:42 -0400
 | |
| From: Darren Johnson <djohnson@greatbridge.com>
 | |
| Date: Tue, 12 Jun 2001 14:37:18 GMT
 | |
| Message-ID: <20010612.14371800@j2.us.greatbridge.com>
 | |
| Subject: Re: AW: AW: [HACKERS] Postgres Replication
 | |
| To: Zeugswetter Andreas SB <ZeugswetterA@wien.spardat.at>
 | |
| cc: pgsql-hackers@postgresql.org
 | |
| Reply-To: Darren Johnson <djohnson@greatbridge.com>
 | |
| 	<11C1E6749A55D411A9670001FA68796336831F@sdexcsrv1.f000.d0188.sd.spardat.at>
 | |
| References: <11C1E6749A55D411A9670001FA68796336831F@sdexcsrv1.f000.d0188.sd.spardat.at>
 | |
| X-Mailer: Mozilla/3.0 (compatible; StarOffice/5.2;Linux)
 | |
| X-Priority: 3 (Normal)
 | |
| MIME-Version: 1.0
 | |
| Content-Type: text/plain; charset=ISO-8859-1
 | |
| Content-Transfer-Encoding: 8bit
 | |
| X-MIME-Autoconverted: from quoted-printable to 8bit by postgresql.org id f5CEd6E85860
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Status: OR
 | |
| 
 | |
| 
 | |
| 
 | |
| > Imho an implementation that opens a separate client connection to the
 | |
| > replication target is only suited for async replication, and for that a
 | |
| > WAL based solution would probably impose less overhead.
 | |
| 
 | |
| 
 | |
| Yes, there is significant overhead in opening a connection to a
 | |
| client, so Postgres-R creates a pool of backends at start up.
 | |
| Coupled with the group communication system (Ensemble), this
 | |
| significantly reduces the issue.
 | |
| 
 | |
| 
 | |
| Very good points,
 | |
| 
 | |
| Darren
 | |
| 
 | |
| 
 | |
| 
 | |
| ---------------------------(end of broadcast)---------------------------
 | |
| TIP 6: Have you searched our list archives?
 | |
| 
 | |
| http://www.postgresql.org/search.mpl
 | |
| 
 | |
| From pgsql-hackers-owner+M9982@postgresql.org Tue Jun 12 19:04:06 2001
 | |
| Return-path: <pgsql-hackers-owner+M9982@postgresql.org>
 | |
| Received: from postgresql.org (webmail.postgresql.org [216.126.85.28])
 | |
| 	by candle.pha.pa.us (8.10.1/8.10.1) with ESMTP id f5CN46E10043
 | |
| 	for <pgman@candle.pha.pa.us>; Tue, 12 Jun 2001 19:04:06 -0400 (EDT)
 | |
| Received: from postgresql.org.org (webmail.postgresql.org [216.126.85.28])
 | |
| 	by postgresql.org (8.11.3/8.11.1) with SMTP id f5CN4AE62160;
 | |
| 	Tue, 12 Jun 2001 19:04:10 -0400 (EDT)
 | |
| 	(envelope-from pgsql-hackers-owner+M9982@postgresql.org)
 | |
| Received: from spoetnik.xs4all.nl (spoetnik.xs4all.nl [194.109.249.226])
 | |
| 	by postgresql.org (8.11.3/8.11.1) with ESMTP id f5CMxaE60194
 | |
| 	for <pgsql-hackers@postgresql.org>; Tue, 12 Jun 2001 18:59:36 -0400 (EDT)
 | |
| 	(envelope-from reinoud@xs4all.nl)
 | |
| Received: from KAYAK (kayak [192.168.1.20])
 | |
| 	by spoetnik.xs4all.nl (Postfix) with SMTP id 435353E1B
 | |
| 	for <pgsql-hackers@postgresql.org>; Wed, 13 Jun 2001 00:59:28 +0200 (CEST)
 | |
| From: reinoud@xs4all.nl (Reinoud van Leeuwen)
 | |
| To: pgsql-hackers@postgresql.org
 | |
| Subject: Re: AW: AW: [HACKERS] Postgres Replication
 | |
| Date: Tue, 12 Jun 2001 22:59:23 GMT
 | |
| Organization: Not organized in any way
 | |
| Reply-To: reinoud@xs4all.nl
 | |
| Message-ID: <3b499c5b.652202125@192.168.1.10>
 | |
| References: <11C1E6749A55D411A9670001FA68796336831F@sdexcsrv1.f000.d0188.sd.spardat.at>
 | |
| In-Reply-To: <11C1E6749A55D411A9670001FA68796336831F@sdexcsrv1.f000.d0188.sd.spardat.at>
 | |
| X-Mailer: Forte Agent 1.5/32.451
 | |
| MIME-Version: 1.0
 | |
| Content-Type: text/plain; charset=us-ascii
 | |
| Content-Transfer-Encoding: 8bit
 | |
| X-MIME-Autoconverted: from quoted-printable to 8bit by postgresql.org id f5CMxcE60196
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Status: OR
 | |
| 
 | |
| On Tue, 12 Jun 2001 15:50:09 +0200, you wrote:
 | |
| 
 | |
| >
 | |
| >> Here are some disadvantages to using a "trigger based" approach:
 | |
| >> 
 | |
| >> 1) Triggers simply transfer individual data items when they 
 | |
| >> are modified, they do not keep track of transactions.
 | |
| >> 2) The execution of triggers within a database imposes a performance 
 | |
| >> overhead to that database.
 | |
| >> 3) Triggers require careful management by database administrators.  
 | |
| >> Someone needs to keep track of all the "alarms" going off.
 | |
| >> 4) The activation of triggers in a database cannot be easily 
 | |
| >> rolled back or undone.
 | |
| >
 | |
| >Yes, points 2 and 3 are a given, although point 2 buys you the functionality
 | |
| >of transparent locking across all involved db servers.
 | |
| >Points 1 and 4 are only the case for a trigger mechanism that does 
 | |
| >not use remote connection and 2-phase commit. 
 | |
| >
 | |
| >Imho an implementation that opens a separate client connection to the 
 | |
| >replication target is only suited for async replication, and for that a WAL 
 | |
| >based solution would probably impose less overhead.
 | |
| 
 | |
| Well as I read back the thread I see 2 different approaches to
 | |
| replication:
 | |
| 
 | |
| 1: tightly integrated replication.
 | |
| pro:
 | |
| - bi-directional (or multidirectional): updates are possible
 | |
| everywhere
 | |
| - A cluster of servers always has the same state.
 | |
| - it does not matter to which server you connect
 | |
| con:
 | |
| - network between servers will be a bottleneck, especially if it is a
 | |
| WAN connection
 | |
| - only full replication possible
 | |
| - what happens if one server (or the network between servers) is down?
 | |
| Are commits still possible?
 | |
| 
 | |
| 2: async replication
 | |
| pro:
 | |
| - long distance possible
 | |
| - no problems with network outages
 | |
| - only changes are replicated, selects do not have impact 
 | |
| - no locking issues across servers
 | |
| - partial replication possible (many->one (data warehouse), or one->many
 | |
| (queries possible everywhere, updates only central))
 | |
| - good for failover situations (backup server is standing by)
 | |
| con:
 | |
| - bidirectional replication hard to set up (you'll have to implement
 | |
| conflict resolution according to your business rules)
 | |
| - different servers are not guaranteed to be in the same state.
 | |
| 
 | |
| I can think of some scenarios where I would definitely want to
 | |
| *choose* one of the options. A load-balanced web environment would
 | |
| likely want the first option, but synchronizing offices in different
 | |
| continents might not work with 2-phase commit over the network....
 | |
| 
 | |
| And we have not even started talking about *managing* replicated
 | |
| environments. A lot of fail-over scenarios stop planning after the
 | |
| backup host has taken control. But how to get back?
 | |
| -- 
 | |
| __________________________________________________
 | |
| "Nothing is as subjective as reality"
 | |
| Reinoud van Leeuwen       reinoud@xs4all.nl
 | |
| http://www.xs4all.nl/~reinoud
 | |
| __________________________________________________
 | |
| 
 | |
| ---------------------------(end of broadcast)---------------------------
 | |
| TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
 | |
| 
 | |
| From pgsql-hackers-owner+M9986@postgresql.org Tue Jun 12 19:48:48 2001
 | |
| Return-path: <pgsql-hackers-owner+M9986@postgresql.org>
 | |
| Received: from postgresql.org (webmail.postgresql.org [216.126.85.28])
 | |
| 	by candle.pha.pa.us (8.10.1/8.10.1) with ESMTP id f5CNmmE13125
 | |
| 	for <pgman@candle.pha.pa.us>; Tue, 12 Jun 2001 19:48:48 -0400 (EDT)
 | |
| Received: from postgresql.org.org (webmail.postgresql.org [216.126.85.28])
 | |
| 	by postgresql.org (8.11.3/8.11.1) with SMTP id f5CNmqE76673;
 | |
| 	Tue, 12 Jun 2001 19:48:52 -0400 (EDT)
 | |
| 	(envelope-from pgsql-hackers-owner+M9986@postgresql.org)
 | |
| Received: from sss.pgh.pa.us ([192.204.191.242])
 | |
| 	by postgresql.org (8.11.3/8.11.1) with ESMTP id f5CNdQE73923
 | |
| 	for <pgsql-hackers@postgresql.org>; Tue, 12 Jun 2001 19:39:26 -0400 (EDT)
 | |
| 	(envelope-from tgl@sss.pgh.pa.us)
 | |
| Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
 | |
| 	by sss.pgh.pa.us (8.11.3/8.11.3) with ESMTP id f5CNdI016442;
 | |
| 	Tue, 12 Jun 2001 19:39:18 -0400 (EDT)
 | |
| To: reinoud@xs4all.nl
 | |
| cc: pgsql-hackers@postgresql.org
 | |
| Subject: Re: AW: AW: [HACKERS] Postgres Replication 
 | |
| In-Reply-To: <3b499c5b.652202125@192.168.1.10> 
 | |
| References: <11C1E6749A55D411A9670001FA68796336831F@sdexcsrv1.f000.d0188.sd.spardat.at> <3b499c5b.652202125@192.168.1.10>
 | |
| Comments: In-reply-to reinoud@xs4all.nl (Reinoud van Leeuwen)
 | |
| 	message dated "Tue, 12 Jun 2001 22:59:23 +0000"
 | |
| Date: Tue, 12 Jun 2001 19:39:18 -0400
 | |
| Message-ID: <16439.992389158@sss.pgh.pa.us>
 | |
| From: Tom Lane <tgl@sss.pgh.pa.us>
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Status: OR
 | |
| 
 | |
| reinoud@xs4all.nl (Reinoud van Leeuwen) writes:
 | |
| > Well as I read back the thread I see 2 different approaches to
 | |
| > replication:
 | |
| > ...
 | |
| > I can think of some scenarios where I would definitely want to
 | |
| > *choose* one of the options.
 | |
| 
 | |
| Yes.  IIRC, it looks to be possible to support a form of async
 | |
| replication using the Postgres-R approach: you allow the cluster
 | |
| to break apart when communications fail, and then rejoin when
 | |
| your link comes back to life.  (This can work in principle, how
 | |
| close it is to reality is another question; but the rejoin operation
 | |
| is the same as crash recovery, so you have to have it anyway.)
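
The "rejoin is the same operation as crash recovery" remark can be sketched as log replay from the last applied position. This is a Python illustration under stated assumptions, not how Postgres-R implements it: `Node`, `LogRecord`, and `catch_up` are hypothetical names, and the sketch assumes the detached node accepted no writes of its own while it was away (otherwise conflict resolution would be needed).

```python
from typing import Dict, List, Tuple

LogRecord = Tuple[int, str, str]   # (sequence number, key, value)


class Node:
    """One server that remembers how far into the log it has applied."""

    def __init__(self) -> None:
        self.data: Dict[str, str] = {}
        self.last_applied = 0

    def apply(self, record: LogRecord) -> None:
        seq, key, value = record
        if seq <= self.last_applied:   # already applied, so skip it
            return
        self.data[key] = value
        self.last_applied = seq


def catch_up(node: Node, cluster_log: List[LogRecord]) -> int:
    """Replay every record the node missed; return how many were applied."""
    missed = [r for r in cluster_log if r[0] > node.last_applied]
    for record in sorted(missed):      # tuples sort by sequence number first
        node.apply(record)
    return len(missed)


cluster_log: List[LogRecord] = [(1, "a", "x"), (2, "b", "y")]
node = Node()
for record in cluster_log:
    node.apply(record)

# The link fails; the rest of the cluster keeps going.
cluster_log += [(3, "a", "z"), (4, "c", "w")]

# The link comes back: the same replay serves a crashed or a detached node.
replayed = catch_up(node, cluster_log)
```

The node does not need to know why it fell behind; it only needs its last applied sequence number and access to the records after it.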
 | |
| 
 | |
| So this seems to me to allow getting most of the benefits of the async
 | |
| approach.  OTOH it is difficult to see how to go the other way: getting
 | |
| the benefits of a synchronous solution atop a basically-async
 | |
| implementation doesn't seem like it can work.
 | |
| 
 | |
| 			regards, tom lane
 | |
| 
 | |
| ---------------------------(end of broadcast)---------------------------
 | |
| TIP 6: Have you searched our list archives?
 | |
| 
 | |
| http://www.postgresql.org/search.mpl
 | |
| 
 | |
| From pgsql-hackers-owner+M9997@postgresql.org Wed Jun 13 09:05:56 2001
 | |
| Return-path: <pgsql-hackers-owner+M9997@postgresql.org>
 | |
| Received: from postgresql.org (webmail.postgresql.org [216.126.85.28])
 | |
| 	by candle.pha.pa.us (8.10.1/8.10.1) with ESMTP id f5DD5tE28260
 | |
| 	for <pgman@candle.pha.pa.us>; Wed, 13 Jun 2001 09:05:55 -0400 (EDT)
 | |
| Received: from postgresql.org.org (webmail.postgresql.org [216.126.85.28])
 | |
| 	by postgresql.org (8.11.3/8.11.1) with SMTP id f5DD5xE12437;
 | |
| 	Wed, 13 Jun 2001 09:05:59 -0400 (EDT)
 | |
| 	(envelope-from pgsql-hackers-owner+M9997@postgresql.org)
 | |
| Received: from fizbanrsm.server.lan.at (zep4.it-austria.net [213.150.1.74])
 | |
| 	by postgresql.org (8.11.3/8.11.1) with ESMTP id f5DD19E00635
 | |
| 	for <pgsql-hackers@postgresql.org>; Wed, 13 Jun 2001 09:01:10 -0400 (EDT)
 | |
| 	(envelope-from ZeugswetterA@wien.spardat.at)
 | |
| Received: from gz0153.gc.spardat.at (gz0153.gc.spardat.at [172.20.10.149])
 | |
| 	by fizbanrsm.server.lan.at (8.11.2/8.11.2) with ESMTP id f5DD13m08153
 | |
| 	for <pgsql-hackers@postgresql.org>; Wed, 13 Jun 2001 15:01:03 +0200
 | |
| Received: by sdexcgtw01.f000.d0188.sd.spardat.at with Internet Mail Service (5.5.2650.21)
 | |
| 	id <M6AB97MY>; Wed, 13 Jun 2001 15:00:02 +0200
 | |
| Message-ID: <11C1E6749A55D411A9670001FA687963368322@sdexcsrv1.f000.d0188.sd.spardat.at>
 | |
| From: Zeugswetter Andreas SB  <ZeugswetterA@wien.spardat.at>
 | |
| To: "'reinoud@xs4all.nl'" <reinoud@xs4all.nl>, pgsql-hackers@postgresql.org
 | |
| Subject: AW: AW: AW: [HACKERS] Postgres Replication
 | |
| Date: Wed, 13 Jun 2001 11:55:48 +0200
 | |
| MIME-Version: 1.0
 | |
| X-Mailer: Internet Mail Service (5.5.2650.21)
 | |
| Content-Type: text/plain;
 | |
| 	charset="iso-8859-1"
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Status: OR
 | |
| 
 | |
| 
 | |
| > Well as I read back the thread I see 2 different approaches to
 | |
| > replication:
 | |
| > 
 | |
| > 1: tightly integrated replication.
 | |
| > pro:
 | |
| > - bi-directional (or multidirectional): updates are possible everywhere
 | |
| > - A cluster of servers always has the same state.
 | |
| > - it does not matter to which server you connect
 | |
| > con:
 | |
| > - network between servers will be a bottleneck, especially if it is a
 | |
| > WAN connection
 | |
| > - only full replication possible
 | |
| 
 | |
| I do not understand that point, if it is trigger based, you 
 | |
| have all the flexibility you need. (only some tables, only some rows,
 | |
| different rows to different targets ....), 
 | |
| (or do you mean not all targets, that could also be achieved with triggers)
 | |
| 
 | |
| > - what happens if one server is down? (or the network between) are
 | |
| > commits still possible
 | |
| 
 | |
| No, updates are not possible if one target is not reachable, 
 | |
| that would not be synchronous and would again need business rules
 | |
| to resolve conflicts.
 | |
| 
 | |
| Allowing updates when a target is not reachable would require admin 
 | |
| intervention.
 | |
| 
 | |
| Andreas
 | |
| 
 | |
| ---------------------------(end of broadcast)---------------------------
 | |
| TIP 4: Don't 'kill -9' the postmaster
 | |
| 
 | |
| From pgsql-hackers-owner+M10005@postgresql.org Wed Jun 13 11:15:48 2001
 | |
| Return-path: <pgsql-hackers-owner+M10005@postgresql.org>
 | |
| Received: from postgresql.org (webmail.postgresql.org [216.126.85.28])
 | |
| 	by candle.pha.pa.us (8.10.1/8.10.1) with ESMTP id f5DFFmE08382
 | |
| 	for <pgman@candle.pha.pa.us>; Wed, 13 Jun 2001 11:15:48 -0400 (EDT)
 | |
| Received: from postgresql.org.org (webmail.postgresql.org [216.126.85.28])
 | |
| 	by postgresql.org (8.11.3/8.11.1) with SMTP id f5DFFoE53621;
 | |
| 	Wed, 13 Jun 2001 11:15:50 -0400 (EDT)
 | |
| 	(envelope-from pgsql-hackers-owner+M10005@postgresql.org)
 | |
| Received: from mail.greatbridge.com (mail.greatbridge.com [65.196.68.36])
 | |
| 	by postgresql.org (8.11.3/8.11.1) with ESMTP id f5DEk7E38930
 | |
| 	for <pgsql-hackers@postgresql.org>; Wed, 13 Jun 2001 10:46:07 -0400 (EDT)
 | |
| 	(envelope-from djohnson@greatbridge.com)
 | |
| Received: from j2.us.greatbridge.com (djohnsonpc.us.greatbridge.com [65.196.69.70])
 | |
| 	by mail.greatbridge.com (8.11.2/8.11.2) with SMTP id f5DEhfQ22566;
 | |
| 	Wed, 13 Jun 2001 10:43:41 -0400
 | |
| From: Darren Johnson <djohnson@greatbridge.com>
 | |
| Date: Wed, 13 Jun 2001 14:44:11 GMT
 | |
| Message-ID: <20010613.14441100@j2.us.greatbridge.com>
 | |
| Subject: Re: AW: AW: AW: [HACKERS] Postgres Replication
 | |
| To: Zeugswetter Andreas SB <ZeugswetterA@wien.spardat.at>
 | |
| cc: "'reinoud@xs4all.nl'" <reinoud@xs4all.nl>, pgsql-hackers@postgresql.org
 | |
| Reply-To: Darren Johnson <djohnson@greatbridge.com>
 | |
| 	<11C1E6749A55D411A9670001FA687963368322@sdexcsrv1.f000.d0188.sd.spardat.at>
 | |
| References: <11C1E6749A55D411A9670001FA687963368322@sdexcsrv1.f000.d0188.sd.spardat.at>
 | |
| X-Mailer: Mozilla/3.0 (compatible; StarOffice/5.2;Linux)
 | |
| X-Priority: 3 (Normal)
 | |
| MIME-Version: 1.0
 | |
| Content-Type: text/plain; charset=ISO-8859-1
 | |
| Content-Transfer-Encoding: 8bit
 | |
| X-MIME-Autoconverted: from quoted-printable to 8bit by postgresql.org id f5DEk8E38931
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Status: OR
 | |
| 
 | |
| 
 | |
| > > - only full replication possible
 | |
| 
 | |
| > I do not understand that point, if it is trigger based, you
 | |
| > have all the flexibility you need. (only some tables, only some rows,
 | |
| > different rows to different targets ....),
 | |
| > (or do you mean not all targets, that could also be achieved with
 | |
| > triggers)
 | |
| 
 | |
| Currently with Postgres-R, it is one database replicating all tables to
 | |
| all servers in the group communication system.  There are some ways around
 | |
| this by invoking the -r option when a SQL statement should be replicated,
 | |
| and leaving the -r option off for non-replicated scenarios.  IMHO this is
 | |
| not a good solution.
 | |
| 
 | |
| A better solution will need to be implemented, which involves a 
 | |
| subscription table(s) with relation/server information.  There are two
 | |
| ideas for subscribing and receiving replicated data.
 | |
| 
 | |
| 1) Receiver driven propagation - A simple solution where all 
 | |
| transactions are propagated and the receiving servers will reference
 | |
| the subscription information before applying updates.
 | |
| 
 | |
| 2) Sender driven propagation - A more optimal and complex solution 
 | |
| where servers do not receive any messages regarding data items for 
 | |
| which they have not subscribed
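
The two propagation ideas can be compared with a short Python sketch. Everything here is an assumption made for illustration: the `SUBSCRIPTIONS` mapping stands in for the proposed subscription table, and the server and relation names are invented. Postgres-R has no such code as far as this thread says.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical subscription table: server name -> relations it subscribes to.
SUBSCRIPTIONS: Dict[str, Set[str]] = {
    "server_a": {"orders", "customers"},
    "server_b": {"orders"},
}

# A replicated write: (relation name, payload).
Update = Tuple[str, str]


def receiver_driven(updates: List[Update]) -> Dict[str, List[Update]]:
    """Every server receives every update and filters on arrival."""
    applied: Dict[str, List[Update]] = {}
    for server, relations in SUBSCRIPTIONS.items():
        received = list(updates)                      # full broadcast
        applied[server] = [u for u in received if u[0] in relations]
    return applied


def sender_driven(updates: List[Update]) -> Dict[str, List[Update]]:
    """The sender consults the table and sends only subscribed updates."""
    outbox: Dict[str, List[Update]] = {s: [] for s in SUBSCRIPTIONS}
    for update in updates:
        for server, relations in SUBSCRIPTIONS.items():
            if update[0] in relations:
                outbox[server].append(update)
    return outbox


updates = [("orders", "insert #1"), ("customers", "update #9")]
by_receiver = receiver_driven(updates)
by_sender = sender_driven(updates)
```

Both functions leave each server with the same applied updates; the difference is where the filtering happens. In the receiver-driven version every server still pays for the full broadcast, while the sender-driven version saves that traffic at the cost of the sender having to track subscriptions.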
 | |
| 
 | |
| 
 | |
| > > - what happens if one server is down? (or the network between) are
 | |
| > > commits still possible
 | |
| 
 | |
| > No, updates are not possible if one target is not reachable,
 | |
| 
 | |
| AFAIK, Postgres-R can still replicate if one target is not reachable,
 | |
| but only to the remaining servers ;).  
 | |
| 
 | |
| There is a scenario that could arise if a server issues a lock 
 | |
| request then fails or goes off line.  There is code that checks 
 | |
| for this condition, which needs to be merged with the branch we have.
 | |
| 
 | |
| > that would not be synchronous and would again need business rules
 | |
| > to resolve conflicts.
 | |
| 
 | |
| Yes the failed server would not be synchronized, and getting this
 | |
| failed server back in sync needs to be addressed.
 | |
| 
 | |
| > Allowing updates when a target is not reachable would require admin
 | |
| > intervention.
 | |
| 
 | |
| In its current state yes, but our goal would be to eliminate this
 | |
| requirement as well.
 | |
| 
 | |
| 
 | |
| 
 | |
| Darren
 | |
| 
 | |
| ---------------------------(end of broadcast)---------------------------
 | |
| TIP 3: if posting/reading through Usenet, please send an appropriate
 | |
| subscribe-nomail command to majordomo@postgresql.org so that your
 | |
| message can get through to the mailing list cleanly
 | |
| 
 | |
| From pgsql-hackers-owner+M18443=candle.pha.pa.us=pgman@postgresql.org Mon Feb  4 19:16:17 2002
 | |
| Return-path: <pgsql-hackers-owner+M18443=candle.pha.pa.us=pgman@postgresql.org>
 | |
| Received: from server1.pgsql.org (www.postgresql.org [64.49.215.9])
 | |
| 	by candle.pha.pa.us (8.11.6/8.10.1) with SMTP id g150GGP03822
 | |
| 	for <pgman@candle.pha.pa.us>; Mon, 4 Feb 2002 19:16:16 -0500 (EST)
 | |
| Received: (qmail 77444 invoked by alias); 5 Feb 2002 00:16:11 -0000
 | |
| Received: from unknown (HELO postgresql.org) (64.49.215.8)
 | |
|   by www.postgresql.org with SMTP; 5 Feb 2002 00:16:11 -0000
 | |
| Received: from snoopy.mohawksoft.com (h0050bf7a618d.ne.mediaone.net [24.147.138.78])
 | |
| 	by postgresql.org (8.11.3/8.11.4) with ESMTP id g150Esl77040
 | |
| 	for <pgsql-hackers@postgresql.org>; Mon, 4 Feb 2002 19:14:54 -0500 (EST)
 | |
| 	(envelope-from markw@mohawksoft.com)
 | |
| Received: from mohawksoft.com (localhost [127.0.0.1])
 | |
| 	by snoopy.mohawksoft.com (8.11.6/8.11.6) with ESMTP id g150AWh08676
 | |
| 	for <pgsql-hackers@postgresql.org>; Mon, 4 Feb 2002 19:10:33 -0500
 | |
| Message-ID: <3C5F22F8.C9B958F0@mohawksoft.com>
 | |
| Date: Mon, 04 Feb 2002 19:10:32 -0500
 | |
| From: mlw <markw@mohawksoft.com>
 | |
| X-Mailer: Mozilla 4.78 [en] (X11; U; Linux 2.4.17 i686)
 | |
| X-Accept-Language: en
 | |
| MIME-Version: 1.0
 | |
| To: PostgreSQL-development <pgsql-hackers@postgresql.org>
 | |
| Subject: [HACKERS] Replication
 | |
| Content-Type: text/plain; charset=us-ascii
 | |
| Content-Transfer-Encoding: 7bit
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Status: OR
 | |
| 
 | |
| I re-wrote RServ.pm in C, and wrote a replication daemon. It works, but it
 | |
| works like the whole rserv project. I don't like it.
 | |
| 
 | |
| OK, what the hell do we need to do to get PostgreSQL replicating?
 | |
| 
 | |
| ---------------------------(end of broadcast)---------------------------
 | |
| TIP 4: Don't 'kill -9' the postmaster
 | |
| 
 | |
| From pgsql-hackers-owner+M18445=candle.pha.pa.us=pgman@postgresql.org Mon Feb  4 19:57:01 2002
 | |
| Return-path: <pgsql-hackers-owner+M18445=candle.pha.pa.us=pgman@postgresql.org>
 | |
| Received: from server1.pgsql.org (www.postgresql.org [64.49.215.9])
 | |
| 	by candle.pha.pa.us (8.11.6/8.10.1) with SMTP id g150v0P06518
 | |
| 	for <pgman@candle.pha.pa.us>; Mon, 4 Feb 2002 19:57:00 -0500 (EST)
 | |
| Received: (qmail 90440 invoked by alias); 5 Feb 2002 00:56:59 -0000
 | |
| Received: from unknown (HELO postgresql.org) (64.49.215.8)
 | |
|   by www.postgresql.org with SMTP; 5 Feb 2002 00:56:59 -0000
 | |
| Received: from www1.navtechinc.com ([192.234.226.140])
 | |
| 	by postgresql.org (8.11.3/8.11.4) with ESMTP id g150rMl89885
 | |
| 	for <pgsql-hackers@postgresql.org>; Mon, 4 Feb 2002 19:53:22 -0500 (EST)
 | |
| 	(envelope-from ssinger@navtechinc.com)
 | |
| Received: from pcNavYkfAdm1.ykf.navtechinc.com (wall [192.234.226.190])
 | |
| 	by www1.navtechinc.com (8.9.3/8.9.3) with ESMTP id AAA06047;
 | |
| 	Tue, 5 Feb 2002 00:53:22 GMT
 | |
| Received: from localhost (ssinger@localhost)
 | |
| 	by pcNavYkfAdm1.ykf.navtechinc.com (8.9.3/8.9.3) with ESMTP id AAA10675;
 | |
| 	Tue, 5 Feb 2002 00:52:43 GMT
 | |
| Date: Tue, 5 Feb 2002 00:52:43 +0000 (GMT)
 | |
| From: Steven <ssinger@navtechinc.com>
 | |
| X-X-Sender: <ssinger@pcNavYkfAdm1.ykf.navtechinc.com>
 | |
| To: mlw <markw@mohawksoft.com>
 | |
| cc: PostgreSQL-development <pgsql-hackers@postgresql.org>
 | |
| Subject: Re: [HACKERS] Replication
 | |
| In-Reply-To: <3C5F22F8.C9B958F0@mohawksoft.com>
 | |
| Message-ID: <Pine.LNX.4.33.0202050040190.24027-100000@pcNavYkfAdm1.ykf.navtechinc.com>
 | |
| MIME-Version: 1.0
 | |
| Content-Type: TEXT/PLAIN; charset=US-ASCII
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Status: OR
 | |
| 
 | |
| On Mon, 4 Feb 2002, mlw wrote:
 | |
| 
 | |
| I've developed a replacement for Rserv, and we are planning on releasing
 | |
| it as open source (i.e., as a contrib module).
 | |
| 
 | |
| Like Rserv, it's trigger-based, but it's much more flexible.
 | |
| The key advantages it has over Rserv are:
 | |
| - Support for multiple slaves
 | |
| - It preserves transactions while doing the mirroring, i.e., if rows A and B are
 | |
| originally added in the same transaction, they will be mirrored in the same
 | |
| transaction.
 | |
| 
 | |
| We plan to add filtering based on data (selective mirroring) as
 | |
| well (i.e., only rows with COUNTRY='Canada' go to
 | |
| slave A, and rows with COUNTRY='China' go to slave B).
 | |
| But I'm not sure when I'll get to that.
 | |
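A minimal sketch of the selective mirroring idea above, assuming each slave is given a predicate over the row. The names `route`, `rules`, `slave_a` and `slave_b` are hypothetical and are not part of the posted code; a real implementation would evaluate the filter inside the mirroring script rather than in an in-memory dictionary.

```python
def route(row, rules):
    """Return the slaves whose filter accepts this row, in rule order."""
    return [slave for slave, accepts in rules.items() if accepts(row)]


# Hypothetical filters matching the COUNTRY example in the text.
rules = {
    "slave_a": lambda row: row.get("country") == "Canada",
    "slave_b": lambda row: row.get("country") == "China",
}
```

A row that matches no rule is simply not mirrored anywhere, which may or may not be the desired default.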
| 
 | |
| Support for conflict resolution (if edits are allowed on the slaves)
 | |
| would be nice.
 | |
| 
 | |
| I hope to be able to send a tarball with the source to the pgpatches list 
 | |
| within the next few days.
 | |
| 
 | |
| We've been using the system operationally for a number of months and have
 | |
| been happy with it.
 | |
| 
 | |
| > I re-wrote RServ.pm to C, and wrote a replication daemon. It works, but it
 | |
| > works like the whole rserv project. I don't like it. 
 | |
| > OK, what the hell do we need to do to get PostgreSQL replicating?
 | |
| 
 | |
| -- 
 | |
| Steven Singer                                       ssinger@navtechinc.com
 | |
| Aircraft Performance Systems                Phone:  519-747-1170 ext 282
 | |
| Navtech Systems Support Inc.                AFTN:   CYYZXNSX SITA: YYZNSCR
 | |
| Waterloo, Ontario                           ARINC:  YKFNSCR
 | |
| 
 | |
| 
 | |
| ---------------------------(end of broadcast)---------------------------
 | |
| TIP 2: you can get off all lists at once with the unregister command
 | |
|     (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
 | |
| 
 | |
| From pgsql-hackers-owner+M18447=candle.pha.pa.us=pgman@postgresql.org Mon Feb  4 20:06:57 2002
 | |
| Return-path: <pgsql-hackers-owner+M18447=candle.pha.pa.us=pgman@postgresql.org>
 | |
| Received: from server1.pgsql.org (www.postgresql.org [64.49.215.9])
 | |
| 	by candle.pha.pa.us (8.11.6/8.10.1) with SMTP id g1516vP07508
 | |
| 	for <pgman@candle.pha.pa.us>; Mon, 4 Feb 2002 20:06:57 -0500 (EST)
 | |
| Received: (qmail 92753 invoked by alias); 5 Feb 2002 01:06:55 -0000
 | |
| Received: from unknown (HELO postgresql.org) (64.49.215.8)
 | |
|   by www.postgresql.org with SMTP; 5 Feb 2002 01:06:55 -0000
 | |
| Received: from inflicted.crimelabs.net (crimelabs.net [66.92.101.112])
 | |
| 	by postgresql.org (8.11.3/8.11.4) with ESMTP id g150vhl91978
 | |
| 	for <pgsql-hackers@postgresql.org>; Mon, 4 Feb 2002 19:57:44 -0500 (EST)
 | |
| 	(envelope-from bpalmer@crimelabs.net)
 | |
| Received: from mizer.crimelabs.net (mizer.crimelabs.net [192.168.88.10])
 | |
| 	by inflicted.crimelabs.net (Postfix) with ESMTP
 | |
| 	id 9D6EE8779; Mon,  4 Feb 2002 19:57:46 -0500 (EST)
 | |
| Date: Mon, 4 Feb 2002 19:57:34 -0500 (EST)
 | |
| From: bpalmer <bpalmer@crimelabs.net>
 | |
| To: mlw <markw@mohawksoft.com>
 | |
| cc: PostgreSQL-development <pgsql-hackers@postgresql.org>
 | |
| Subject: Re: [HACKERS] Replication
 | |
| In-Reply-To: <3C5F22F8.C9B958F0@mohawksoft.com>
 | |
| Message-ID: <Pine.BSO.4.43.0202041955420.17121-100000@mizer.crimelabs.net>
 | |
| MIME-Version: 1.0
 | |
| Content-Type: TEXT/PLAIN; charset=US-ASCII
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Status: OR
 | |
| 
 | |
| >
 | |
| > OK, what the hell do we need to do to get PostgreSQL replicating?
 | |
| 
 | |
| I hope you understand that replication,  done right,  is a massive
 | |
| project.  I know that Darren and myself (and the rest of the pg-repl
 | |
| folks) have been waiting until 7.2 went gold before doing any more work.  I
 | |
| think we hope to have master / slave replication working for 7.3 and then
 | |
| target multimaster for 7.4.  At least that's the hope.
 | |
| 
 | |
| - Brandon
 | |
| 
 | |
| ----------------------------------------------------------------------------
 | |
|  c: 646-456-5455                                            h: 201-798-4983
 | |
|  b. palmer,  bpalmer@crimelabs.net           pgp:crimelabs.net/bpalmer.pgp5
 | |
| 
 | |
| 
 | |
| 
 | |
| From pgsql-hackers-owner+M18449=candle.pha.pa.us=pgman@postgresql.org Mon Feb  4 21:16:56 2002
 | |
| Return-path: <pgsql-hackers-owner+M18449=candle.pha.pa.us=pgman@postgresql.org>
 | |
| Received: from server1.pgsql.org (www.postgresql.org [64.49.215.9])
 | |
| 	by candle.pha.pa.us (8.11.6/8.10.1) with SMTP id g152GtP10503
 | |
| 	for <pgman@candle.pha.pa.us>; Mon, 4 Feb 2002 21:16:55 -0500 (EST)
 | |
| Received: (qmail 6711 invoked by alias); 5 Feb 2002 02:16:53 -0000
 | |
| Received: from unknown (HELO postgresql.org) (64.49.215.8)
 | |
|   by www.postgresql.org with SMTP; 5 Feb 2002 02:16:53 -0000
 | |
| Received: from snoopy.mohawksoft.com (h0050bf7a618d.ne.mediaone.net [24.147.138.78])
 | |
| 	by postgresql.org (8.11.3/8.11.4) with ESMTP id g151qSl99469
 | |
| 	for <pgsql-hackers@postgresql.org>; Mon, 4 Feb 2002 20:52:28 -0500 (EST)
 | |
| 	(envelope-from markw@mohawksoft.com)
 | |
| Received: from mohawksoft.com (localhost [127.0.0.1])
 | |
| 	by snoopy.mohawksoft.com (8.11.6/8.11.6) with ESMTP id g151lph09147;
 | |
| 	Mon, 4 Feb 2002 20:47:51 -0500
 | |
| Message-ID: <3C5F39C7.970F4549@mohawksoft.com>
 | |
| Date: Mon, 04 Feb 2002 20:47:51 -0500
 | |
| From: mlw <markw@mohawksoft.com>
 | |
| X-Mailer: Mozilla 4.78 [en] (X11; U; Linux 2.4.17 i686)
 | |
| X-Accept-Language: en
 | |
| MIME-Version: 1.0
 | |
| To: Steven <ssinger@navtechinc.com>
 | |
| cc: PostgreSQL-development <pgsql-hackers@postgresql.org>
 | |
| Subject: Re: [HACKERS] Replication
 | |
| References: <Pine.LNX.4.33.0202050040190.24027-100000@pcNavYkfAdm1.ykf.navtechinc.com>
 | |
| Content-Type: text/plain; charset=us-ascii
 | |
| Content-Transfer-Encoding: 7bit
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Status: OR
 | |
| 
 | |
| Steven wrote:
 | |
| > 
 | |
| > On Mon, 4 Feb 2002, mlw wrote:
 | |
| > 
 | |
| > I've developed a replacement for Rserv and we are planning on releasing
 | |
| > it as open source(ie as a contrib module).
 | |
| > 
 | |
| > Like Rserv its trigger based but its much more flexible.
 | |
| > The key advantages it has over Rserv is that it has
 | |
| > -Support for multiple slaves
 | |
| > -It preserves transactions while doing the mirroring. Ie  If rows A,B are
 | |
| > originally added in the same transaction they will be mirrored in the same
 | |
| > transaction.
 | |
| 
 | |
| I did a similar thing. I took the rserv trigger "as is," but rewrote the
 | |
| replication support code. What I eventually did was write a "snapshot daemon"
 | |
| which created snapshot files, and then a "slave daemon" which would check the last
 | |
| snapshot applied and apply all the snapshots, in order, as needed. One would
 | |
| run one of these daemons per slave server.
 | |
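A rough sketch of what the slave daemon's catch-up step could look like, assuming snapshots carry increasing integer ids. The function names and the `apply` callback are hypothetical; the actual daemon was written in C and is not shown in this thread.

```python
def pending_snapshots(available, last_applied):
    """Snapshot ids newer than the last one applied, oldest first."""
    return sorted(s for s in available if s > last_applied)


def catch_up(available, last_applied, apply):
    """Apply every outstanding snapshot in order; return the new high-water mark."""
    for snap_id in pending_snapshots(available, last_applied):
        apply(snap_id)          # hypothetical callback that loads one snapshot file
        last_applied = snap_id  # only advance after a successful apply
    return last_applied
```

Because the high-water mark only moves after `apply` returns, a slave that stops partway through resumes at the first snapshot it has not finished.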
| 
 | |
| ---------------------------(end of broadcast)---------------------------
 | |
| TIP 5: Have you checked our extensive FAQ?
 | |
| 
 | |
| http://www.postgresql.org/users-lounge/docs/faq.html
 | |
| 
 | |
| From pgsql-hackers-owner+M18448=candle.pha.pa.us=pgman@postgresql.org Mon Feb  4 20:57:25 2002
 | |
| Return-path: <pgsql-hackers-owner+M18448=candle.pha.pa.us=pgman@postgresql.org>
 | |
| Received: from server1.pgsql.org (www.postgresql.org [64.49.215.9])
 | |
| 	by candle.pha.pa.us (8.11.6/8.10.1) with SMTP id g151vOP09239
 | |
| 	for <pgman@candle.pha.pa.us>; Mon, 4 Feb 2002 20:57:24 -0500 (EST)
 | |
| Received: (qmail 99828 invoked by alias); 5 Feb 2002 01:57:19 -0000
 | |
| Received: from unknown (HELO postgresql.org) (64.49.215.8)
 | |
|   by www.postgresql.org with SMTP; 5 Feb 2002 01:57:19 -0000
 | |
| Received: from snoopy.mohawksoft.com (h0050bf7a618d.ne.mediaone.net [24.147.138.78])
 | |
| 	by postgresql.org (8.11.3/8.11.4) with ESMTP id g151s0l99529
 | |
| 	for <pgsql-hackers@postgresql.org>; Mon, 4 Feb 2002 20:54:00 -0500 (EST)
 | |
| 	(envelope-from markw@mohawksoft.com)
 | |
| Received: from mohawksoft.com (localhost [127.0.0.1])
 | |
| 	by snoopy.mohawksoft.com (8.11.6/8.11.6) with ESMTP id g151nah09156;
 | |
| 	Mon, 4 Feb 2002 20:49:37 -0500
 | |
| Message-ID: <3C5F3A30.A4C46FB8@mohawksoft.com>
 | |
| Date: Mon, 04 Feb 2002 20:49:36 -0500
 | |
| From: mlw <markw@mohawksoft.com>
 | |
| X-Mailer: Mozilla 4.78 [en] (X11; U; Linux 2.4.17 i686)
 | |
| X-Accept-Language: en
 | |
| MIME-Version: 1.0
 | |
| To: bpalmer <bpalmer@crimelabs.net>
 | |
| cc: PostgreSQL-development <pgsql-hackers@postgresql.org>
 | |
| Subject: Re: [HACKERS] Replication
 | |
| References: <Pine.BSO.4.43.0202041955420.17121-100000@mizer.crimelabs.net>
 | |
| Content-Type: text/plain; charset=us-ascii
 | |
| Content-Transfer-Encoding: 7bit
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Status: OR
 | |
| 
 | |
| bpalmer wrote:
 | |
| > 
 | |
| > >
 | |
| > > OK, what the hell do we need to do to get PostgreSQL replicating?
 | |
| > 
 | |
| > I hope you understand that replication,  done right,  is a massive
 | |
| > project.  I know that Darren and myself (and the rest of the pg-repl
 | |
| > folks) have been waiting until 7.2 went gold before doing any more work.  I
 | |
| > think we hope to have master / slave replication working for 7.3 and then
 | |
| > target multimaster for 7.4.  At least that's the hope.
 | |
| 
 | |
| I do know how hard replication is. I also understand how important it is.
 | |
| 
 | |
| If you guys have a project going, and need developers, I am more than willing.
 | |
| 
 | |
| 
 | |
| From pgsql-hackers-owner+M18450=candle.pha.pa.us=pgman@postgresql.org Mon Feb  4 21:42:13 2002
 | |
| Return-path: <pgsql-hackers-owner+M18450=candle.pha.pa.us=pgman@postgresql.org>
 | |
| Received: from server1.pgsql.org (www.postgresql.org [64.49.215.9])
 | |
| 	by candle.pha.pa.us (8.11.6/8.10.1) with SMTP id g152gCP11957
 | |
| 	for <pgman@candle.pha.pa.us>; Mon, 4 Feb 2002 21:42:13 -0500 (EST)
 | |
| Received: (qmail 14229 invoked by alias); 5 Feb 2002 02:42:09 -0000
 | |
| Received: from unknown (HELO postgresql.org) (64.49.215.8)
 | |
|   by www.postgresql.org with SMTP; 5 Feb 2002 02:42:09 -0000
 | |
| Received: from www1.navtechinc.com ([192.234.226.140])
 | |
| 	by postgresql.org (8.11.3/8.11.4) with ESMTP id g152SBl10682
 | |
| 	for <pgsql-hackers@postgresql.org>; Mon, 4 Feb 2002 21:28:11 -0500 (EST)
 | |
| 	(envelope-from ssinger@navtechinc.com)
 | |
| Received: from pcNavYkfAdm1.ykf.navtechinc.com (wall [192.234.226.190])
 | |
| 	by www1.navtechinc.com (8.9.3/8.9.3) with ESMTP id CAA06384;
 | |
| 	Tue, 5 Feb 2002 02:28:13 GMT
 | |
| Received: from localhost (ssinger@localhost)
 | |
| 	by pcNavYkfAdm1.ykf.navtechinc.com (8.9.3/8.9.3) with ESMTP id CAA10682;
 | |
| 	Tue, 5 Feb 2002 02:27:35 GMT
 | |
| Date: Tue, 5 Feb 2002 02:27:35 +0000 (GMT)
 | |
| From: Steven <ssinger@navtechinc.com>
 | |
| X-X-Sender: <ssinger@pcNavYkfAdm1.ykf.navtechinc.com>
 | |
| To: mlw <markw@mohawksoft.com>
 | |
| cc: PostgreSQL-development <pgsql-hackers@postgresql.org>
 | |
| Subject: Re: [HACKERS] Replication
 | |
| In-Reply-To: <3C5F39C7.970F4549@mohawksoft.com>
 | |
| Message-ID: <Pine.LNX.4.33.0202050159591.26756-100000@pcNavYkfAdm1.ykf.navtechinc.com>
 | |
| MIME-Version: 1.0
 | |
| Content-Type: TEXT/PLAIN; charset=US-ASCII
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Status: OR
 | |
| 
 | |
| 
 | |
| DBMirror doesn't use snapshots; instead, it records a log of transactions
 | |
| that are committed to the database in a pair of tables.
 | |
| In the case of an INSERT, this is the row that is being added.
 | |
| In the case of a DELETE, it is the primary key of the row being deleted.
 | |
| 
 | |
| And in the case of an UPDATE, the primary key before the update along with 
 | |
| all of the data the row should have after an update.
 | |
| 
 | |
| Then for each slave database a Perl script walks through the transactions
 | |
| that are pending for that host and reconstructs SQL to send the row edits
 | |
| to that host.  A record of the fact that transaction Y has been sent to
 | |
| host X is also kept.
 | |
| 
 | |
| When transaction Y has been sent to all of the hosts that are in the
 | |
| system, it is then deleted from the Pending tables.
 | |
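The bookkeeping described above can be sketched roughly as follows. This is an in-memory illustration only: the class `PendingLog`, the helper `to_sql`, and the `%s` placeholder style are assumptions made for the example, not DBMirror's actual tables or Perl code.

```python
from collections import OrderedDict


class PendingLog:
    """In-memory stand-in for the pair of Pending tables (hypothetical names)."""

    def __init__(self, hosts):
        self.hosts = set(hosts)
        self.edits = OrderedDict()   # xid -> list of (op, table, key, row)
        self.sent = {}               # xid -> set of hosts that already received it

    def record(self, xid, op, table, key=None, row=None):
        """Called where the trigger would fire: log one row edit."""
        self.edits.setdefault(xid, []).append((op, table, key, row))
        self.sent.setdefault(xid, set())

    def pending_for(self, host):
        """Transactions not yet sent to this host, in the order they were logged."""
        return [xid for xid in self.edits if host not in self.sent[xid]]

    def mark_sent(self, xid, host):
        """Remember delivery; drop the transaction once every host has it."""
        self.sent[xid].add(host)
        if self.sent[xid] >= self.hosts:
            del self.edits[xid]
            del self.sent[xid]


def to_sql(op, table, key, row):
    """Rebuild one parameterised statement from a logged edit."""
    if op == "INSERT":
        cols = sorted(row)
        marks = ", ".join(["%s"] * len(cols))
        return (f"INSERT INTO {table} ({', '.join(cols)}) VALUES ({marks})",
                [row[c] for c in cols])
    where = " AND ".join(f"{c} = %s" for c in sorted(key))
    key_vals = [key[c] for c in sorted(key)]
    if op == "DELETE":
        return f"DELETE FROM {table} WHERE {where}", key_vals
    if op == "UPDATE":
        cols = sorted(row)
        sets = ", ".join(f"{c} = %s" for c in cols)
        return (f"UPDATE {table} SET {sets} WHERE {where}",
                [row[c] for c in cols] + key_vals)
    raise ValueError(f"unknown operation: {op}")
```

Keeping all edits of one transaction under a single id is what lets the replay side wrap them in one transaction on the slave.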
| 
 | |
| I suspect that all of the information I'm storing in the Pending tables is 
 | |
| also being stored by Postgres in its log, but I haven't investigated how
 | |
| the information could be extracted (or how long it is kept for).  That
 | |
| would  reduce the extra storage overhead that the replication system 
 | |
| imposes.
 | |
| 
 | |
| As I remember (it's been a while since I've looked at it), RServ uses OIDs
 | |
| in its tables to point to the data that needs to be replicated.  We tried 
 | |
| a similar approach but found difficulties with doing partial updates.
 | |
| 
 | |
| 
 | |
| 
 | |
| 
 | |
| 
 | |
| 
 | |
| On Mon, 4 Feb 2002, mlw wrote:
 | |
| 
 | |
| > I did a similar thing. I took the rserv trigger "as is," but rewrote the
 | |
| > replication support code. What I eventually did was write a "snapshot daemon"
 | |
| > which created snapshot files. Then a "slave daemon" which would check the last
 | |
| > snapshot applied and apply all the snapshots, in order, as needed. One would
 | |
| > run one of these daemons per slave server.
 | |
| 
 | |
| 
 | |
| 
 | |
| 
 | |
|  
 | |
| 
 | |
| -- 
 | |
| Steven Singer                                       ssinger@navtechinc.com
 | |
| Aircraft Performance Systems                Phone:  519-747-1170 ext 282
 | |
| Navtech Systems Support Inc.                AFTN:   CYYZXNSX SITA: YYZNSCR
 | |
| Waterloo, Ontario                           ARINC:  YKFNSCR
 | |
| 
 | |
| 
 | |
| 
 | |
| From pgsql-hackers-owner+M18554=candle.pha.pa.us=pgman@postgresql.org Thu Feb  7 02:49:48 2002
 | |
| Return-path: <pgsql-hackers-owner+M18554=candle.pha.pa.us=pgman@postgresql.org>
 | |
| Received: from server1.pgsql.org (www.postgresql.org [64.49.215.9])
 | |
| 	by candle.pha.pa.us (8.11.6/8.10.1) with SMTP id g177nlP04347
 | |
| 	for <pgman@candle.pha.pa.us>; Thu, 7 Feb 2002 02:49:47 -0500 (EST)
 | |
| Received: (qmail 22556 invoked by alias); 7 Feb 2002 07:49:49 -0000
 | |
| Received: from unknown (HELO postgresql.org) (64.49.215.8)
 | |
|   by www.postgresql.org with SMTP; 7 Feb 2002 07:49:49 -0000
 | |
| Received: from linuxworld.com.au (www.linuxworld.com.au [203.34.46.50])
 | |
| 	by postgresql.org (8.11.3/8.11.4) with ESMTP id g177QfE19572
 | |
| 	for <pgsql-hackers@postgresql.org>; Thu, 7 Feb 2002 02:26:42 -0500 (EST)
 | |
| 	(envelope-from swm@linuxworld.com.au)
 | |
| Received: from localhost (swm@localhost)
 | |
| 	by linuxworld.com.au (8.11.4/8.11.4) with ESMTP id g177RiU06086;
 | |
| 	Thu, 7 Feb 2002 18:27:45 +1100
 | |
| Date: Thu, 7 Feb 2002 18:27:44 +1100 (EST)
 | |
| From: Gavin Sherry <swm@linuxworld.com.au>
 | |
| To: mlw <markw@mohawksoft.com>
 | |
| cc: PostgreSQL-development <pgsql-hackers@postgresql.org>
 | |
| Subject: Re: [HACKERS] Replication
 | |
| In-Reply-To: <3C5F22F8.C9B958F0@mohawksoft.com>
 | |
| Message-ID: <Pine.LNX.4.21.0202071751240.5160-100000@linuxworld.com.au>
 | |
| MIME-Version: 1.0
 | |
| Content-Type: TEXT/PLAIN; charset=US-ASCII
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Status: OR
 | |
| 
 | |
| On Mon, 4 Feb 2002, mlw wrote:
 | |
| 
 | |
| > I re-wrote RServ.pm to C, and wrote a replication daemon. It works, but it
 | |
| > works like the whole rserv project. I don't like it.
 | |
| > 
 | |
| > OK, what the hell do we need to do to get PostgreSQL replicating?
 | |
| 
 | |
| The trigger model is not a very sophisticated one. I think I have a better
 | |
| -- though more complicated -- one. This model would be able to handle
 | |
| multiple masters and master->slave.
 | |
| 
 | |
| First of all, all machines in the cluster would have to be aware of all the
 | |
| machines in the cluster. This would have to be stored in a new system
 | |
| table.
 | |
| 
 | |
| The FE/BE protocol would need to be modified to accept parsed node trees
 | |
| generated by pg_analyze_and_rewrite(). These could then be dispatched by
 | |
| the executing server, inside of pg_exec_query_string(), to all other servers
 | |
| in the cluster (excluding itself). Naturally, this dispatch would need to
 | |
| be non-blocking.
 | |
| 
 | |
| pg_exec_query_string() would need to check the node tags to make sure
 | |
| selects and perhaps some commands are not dispatched.
 | |
| 
 | |
| Before the executing server runs finish_xact_command(), it would check
 | |
| that the query was successfully executed on all machines, and otherwise
 | |
| abort. Such a system would need a few configuration options: whether or
 | |
| not you abort on failed replication to slaves, the ability to replicate
 | |
| only certain tables, etc.
 | |
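The control flow proposed above can be sketched as a toy simulation. It makes several simplifying assumptions: statements are plain strings rather than parsed node trees, dispatch is sequential rather than non-blocking, and `Node`, `is_replicated`, `run_statement` and `abort_on_failure` are hypothetical names, not backend functions or settings.

```python
class Node:
    """Toy stand-in for one server in the cluster (hypothetical)."""

    def __init__(self, name, fail=False):
        self.name = name      # must be unique within the cluster
        self.fail = fail      # simulate a server that cannot run the statement
        self.staged = []      # work done inside the open transaction
        self.applied = []     # statements that were finally committed

    def execute(self, stmt):
        if self.fail:
            return False
        self.staged.append(stmt)
        return True

    def commit(self):
        self.applied.extend(self.staged)
        self.staged.clear()

    def abort(self):
        self.staged.clear()


def is_replicated(stmt):
    """Selects are not dispatched; everything else is (a simplification)."""
    return not stmt.lstrip().upper().startswith("SELECT")


def run_statement(local, peers, stmt, abort_on_failure=True):
    """Run stmt locally, dispatch it to peers, then commit or abort everywhere."""
    targets = [local] + (list(peers) if is_replicated(stmt) else [])
    results = {node.name: node.execute(stmt) for node in targets}
    ok = results[local.name] and (all(results.values()) or not abort_on_failure)
    for node in targets:
        if ok and results[node.name]:
            node.commit()
        else:
            node.abort()
    return ok
```

With `abort_on_failure=False`, a failed peer is silently left behind, so some separate mechanism would be needed to bring it back in sync.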
| 
 | |
| Naturally, this would slow down writes to the system (possibly a lot
 | |
| depending on the performance difference between the executing machine and
 | |
| the least powerful machine in the cluster), but most usages of postgresql
 | |
| are read intensive, not write.
 | |
| 
 | |
| Any reason this model would not work?
 | |
| 
 | |
| Gavin
 | |
| 
 | |
| 
 | |
| 
 | |
| From pgsql-hackers-owner+M18558=candle.pha.pa.us=pgman@postgresql.org Thu Feb  7 08:31:00 2002
 | |
| Return-path: <pgsql-hackers-owner+M18558=candle.pha.pa.us=pgman@postgresql.org>
 | |
| Received: from server1.pgsql.org (www.postgresql.org [64.49.215.9])
 | |
| 	by candle.pha.pa.us (8.11.6/8.10.1) with SMTP id g17DUxP13923
 | |
| 	for <pgman@candle.pha.pa.us>; Thu, 7 Feb 2002 08:30:59 -0500 (EST)
 | |
| Received: (qmail 91796 invoked by alias); 7 Feb 2002 13:30:55 -0000
 | |
| Received: from unknown (HELO postgresql.org) (64.49.215.8)
 | |
|   by www.postgresql.org with SMTP; 7 Feb 2002 13:30:55 -0000
 | |
| Received: from snoopy.mohawksoft.com (h0050bf7a618d.ne.mediaone.net [24.147.138.78])
 | |
| 	by postgresql.org (8.11.3/8.11.4) with ESMTP id g17Cw0E87782
 | |
| 	for <pgsql-hackers@postgresql.org>; Thu, 7 Feb 2002 07:58:01 -0500 (EST)
 | |
| 	(envelope-from markw@mohawksoft.com)
 | |
| Received: from mohawksoft.com (localhost [127.0.0.1])
 | |
| 	by snoopy.mohawksoft.com (8.11.6/8.11.6) with ESMTP id g17CqNt16887;
 | |
| 	Thu, 7 Feb 2002 07:52:24 -0500
 | |
| Message-ID: <3C627887.CC9FF837@mohawksoft.com>
 | |
| Date: Thu, 07 Feb 2002 07:52:23 -0500
 | |
| From: mlw <markw@mohawksoft.com>
 | |
| X-Mailer: Mozilla 4.78 [en] (X11; U; Linux 2.4.17 i686)
 | |
| X-Accept-Language: en
 | |
| MIME-Version: 1.0
 | |
| To: Gavin Sherry <swm@linuxworld.com.au>
 | |
| cc: PostgreSQL-development <pgsql-hackers@postgresql.org>
 | |
| Subject: Re: [HACKERS] Replication
 | |
| References: <Pine.LNX.4.21.0202071751240.5160-100000@linuxworld.com.au>
 | |
| Content-Type: text/plain; charset=us-ascii
 | |
| Content-Transfer-Encoding: 7bit
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Status: OR
 | |
| 
 | |
| Gavin Sherry wrote:
 | |
| > Naturally, this would slow down writes to the system (possibly a lot
 | |
| > depending on the performance difference between the executing machine and
 | |
| > the least powerful machine in the cluster), but most usages of postgresql
 | |
| > are read intensive, not write.
 | |
| > 
 | |
| > Any reason this model would not work?
 | |
| 
 | |
| What, then, is the purpose of replication to multiple masters?
 | |
| 
 | |
| I can think of only two reasons why you want replication. (1) Redundancy: make
 | |
| sure that if one server dies, then another server has the same data and is used
 | |
| seamlessly. (2) Increase performance over one system.
 | |
| 
 | |
| For reason (1), I submit that a server load balancer which sits on top of
 | |
| PostgreSQL, and executes writes on both servers while distributing reads, would
 | |
| be best. This is a HUGE project. The load balancer must know EXACTLY how the
 | |
| system is configured, which includes all functions and everything.
 | |
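A deliberately naive sketch of the routing rule such a load balancer would apply, assuming statements can be classified by their leading keyword. The `Balancer` class and the server names are hypothetical. The keyword test is exactly where this gets hard: a SELECT that calls a function with side effects would be misrouted, which is why the balancer needs to know how the system is configured.

```python
import itertools


class Balancer:
    """Send writes to every server and spread reads round-robin (toy example)."""

    def __init__(self, servers):
        self.servers = list(servers)
        self._next = itertools.cycle(self.servers)

    def route(self, stmt):
        """Return the list of servers that should execute this statement."""
        if stmt.lstrip().upper().startswith("SELECT"):
            return [next(self._next)]   # reads go to one server at a time
        return list(self.servers)       # writes go to all servers
```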
| 
 | |
| For reason (2), your system would fail to provide the scalability that would be
 | |
| needed. If writes take a long time, but reads are fine, how does that differ
 | |
| from the trigger-based replicator?
 | |
| 
 | |
| I have in the back of my mind, an idea of patching into the WAL stuff, and
 | |
| using that mechanism to push changes out to the slaves.
 | |
| 
 | |
| Where one machine is still the master, but no trigger stuff, just a WAL patch.
 | |
| Perhaps some shared memory paradigm to manage WAL visibility? I'm not sure
 | |
| exactly, the idea hasn't completely formed yet.
 | |
| 
 | |
| 
 | |
| From pgsql-hackers-owner+M18574=candle.pha.pa.us=pgman@postgresql.org Thu Feb  7 12:51:42 2002
 | |
| Return-path: <pgsql-hackers-owner+M18574=candle.pha.pa.us=pgman@postgresql.org>
 | |
| Received: from server1.pgsql.org (www.postgresql.org [64.49.215.9])
 | |
| 	by candle.pha.pa.us (8.11.6/8.10.1) with SMTP id g17HpfP16661
 | |
| 	for <pgman@candle.pha.pa.us>; Thu, 7 Feb 2002 12:51:41 -0500 (EST)
 | |
| Received: (qmail 62955 invoked by alias); 7 Feb 2002 17:50:42 -0000
 | |
| Received: from unknown (HELO postgresql.org) (64.49.215.8)
 | |
|   by www.postgresql.org with SMTP; 7 Feb 2002 17:50:42 -0000
 | |
| Received: from www1.navtechinc.com ([192.234.226.140])
 | |
| 	by postgresql.org (8.11.3/8.11.4) with ESMTP id g17HnTE62256
 | |
| 	for <pgsql-hackers@postgresql.org>; Thu, 7 Feb 2002 12:49:29 -0500 (EST)
 | |
| 	(envelope-from ssinger@navtechinc.com)
 | |
| Received: from pcNavYkfAdm1.ykf.navtechinc.com (wall [192.234.226.190])
 | |
| 	by www1.navtechinc.com (8.9.3/8.9.3) with ESMTP id RAA07908;
 | |
| 	Thu, 7 Feb 2002 17:49:31 GMT
 | |
| Received: from localhost (ssinger@localhost)
 | |
| 	by pcNavYkfAdm1.ykf.navtechinc.com (8.9.3/8.9.3) with ESMTP id RAA05687;
 | |
| 	Thu, 7 Feb 2002 17:48:52 GMT
 | |
| Date: Thu, 7 Feb 2002 17:48:51 +0000 (GMT)
 | |
| From: Steven Singer <ssinger@navtechinc.com>
 | |
| X-X-Sender: <ssinger@pcNavYkfAdm1.ykf.navtechinc.com>
 | |
| To: Gavin Sherry <swm@linuxworld.com.au>
 | |
| cc: mlw <markw@mohawksoft.com>,
 | |
|    PostgreSQL-development <pgsql-hackers@postgresql.org>
 | |
| Subject: Re: [HACKERS] Replication
 | |
| In-Reply-To: <Pine.LNX.4.21.0202071751240.5160-100000@linuxworld.com.au>
 | |
| Message-ID: <Pine.LNX.4.33.0202071735360.6435-100000@pcNavYkfAdm1.ykf.navtechinc.com>
 | |
| MIME-Version: 1.0
 | |
| Content-Type: TEXT/PLAIN; charset=US-ASCII
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Status: OR
 | |
| 
 | |
| 
 | |
| 
 | |
| What you describe sounds like a form of a two-stage commit protocol.
 | |
| 
 | |
| If the command worked on two of the replicated databases but failed on a 
 | |
| third then the executing server would have to be able to undo the command
 | |
| on the replicated databases as well as itself.
 | |
| 
 | |
| The problems with two-stage-commit-type approaches to replication are
 | |
| 1) Speed, as you mentioned.  Write speed isn't a concern for some
 | |
| applications, but it is very important in others.
 | |
| 
 | |
| and 
 | |
| 2) All of the databases must be able to communicate with each other at 
 | |
| all times in order for any edits to work.   If the servers are 
 | |
| connected over some sort of WAN that periodically has short outages this 
 | |
| is a problem.   Also, if you're using replication because you want to be able
 | |
| to take down one of the databases for short periods of time without
 | |
| bringing down the others, you're in trouble.
 | |
| 
 | |
| 
 | |
| btw: I posted the alternative to Rserv that I mentioned the other day to
 | |
| the pg-patches mailing list.  If anyone is interested, you should be able
 | |
| to grab it off the archives.
 | |
| 
 | |
| On Thu, 7 Feb 2002, Gavin Sherry wrote:
 | |
| 
 | |
| > 
 | |
| > First of all, all machines in the cluster would have to be aware of all the
 | |
| > machines in the cluster. This would have to be stored in a new system
 | |
| > table.
 | |
| > 
 | |
| > The FE/BE protocol would need to be modified to accept parsed node trees
 | |
| > generated by pg_analyze_and_rewrite(). These could then be dispatched by 
 | |
| > the executing server, inside of pg_exec_query_string, to all other servers
 | |
| > in the cluster (excluding itself). Naturally, this dispatch would need to
 | |
| > be non-blocking.
 | |
| > 
 | |
| > pg_exec_query_string() would need to check the node tags to make sure
 | |
| > selects and perhaps some commands are not dispatched.
 | |
| > 
 | |
| > Before the executing server runs finish_xact_command(), it would check
 | |
| > that the query was successfully executed on all machines otherwise
 | |
| > abort. Such a system would need a few configuration options: whether or
 | |
| > not you abort on failed replication to slaves, the ability to replicate
 | |
| > only certain tables, etc.
 | |
| > 
 | |
| > Naturally, this would slow down writes to the system (possibly a lot
 | |
| > depending on the performance difference between the executing machine and
 | |
| > the least powerful machine in the cluster), but most usages of postgresql
 | |
| > are read intensive, not write.
 | |
| > 
 | |
| > Any reason this model would not work?
 | |
| > 
 | |
| > Gavin
 | |
| 
 | |
| -- 
 | |
| Steven Singer                                       ssinger@navtechinc.com
 | |
| Aircraft Performance Systems                Phone:  519-747-1170 ext 282
 | |
| Navtech Systems Support Inc.                AFTN:   CYYZXNSX SITA: YYZNSCR
 | |
| Waterloo, Ontario                           ARINC:  YKFNSCR
 | |
| 
 | |
| 
 | |
| ---------------------------(end of broadcast)---------------------------
 | |
| TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
 | |
| 
 | |
| From pgsql-hackers-owner+M18590=candle.pha.pa.us=pgman@postgresql.org Thu Feb  7 17:50:42 2002
 | |
| Return-path: <pgsql-hackers-owner+M18590=candle.pha.pa.us=pgman@postgresql.org>
 | |
| Received: from server1.pgsql.org (www.postgresql.org [64.49.215.9])
 | |
| 	by candle.pha.pa.us (8.11.6/8.10.1) with SMTP id g17MoeP27121
 | |
| 	for <pgman@candle.pha.pa.us>; Thu, 7 Feb 2002 17:50:40 -0500 (EST)
 | |
| Received: (qmail 39930 invoked by alias); 7 Feb 2002 22:50:17 -0000
 | |
| Received: from unknown (HELO postgresql.org) (64.49.215.8)
 | |
|   by www.postgresql.org with SMTP; 7 Feb 2002 22:50:17 -0000
 | |
| Received: from odin.fts.net (wall.icgate.net [209.26.177.2])
 | |
| 	by postgresql.org (8.11.3/8.11.4) with ESMTP id g17Ma4E38041
 | |
| 	for <pgsql-hackers@postgresql.org>; Thu, 7 Feb 2002 17:36:04 -0500 (EST)
 | |
| 	(envelope-from fharvell@odin.fts.net)
 | |
| Received: from odin.fts.net (fharvell@localhost)
 | |
| 	by odin.fts.net (8.11.6/8.11.6) with ESMTP id g17MZhR17707;
 | |
| 	Thu, 7 Feb 2002 17:35:43 -0500
 | |
| Message-ID: <200202072235.g17MZhR17707@odin.fts.net>
 | |
| X-Mailer: exmh version 2.2 06/23/2000 with nmh-1.0.4
 | |
| From: F Harvell <fharvell@fts.net>
 | |
| To: mlw <markw@mohawksoft.com>
 | |
| cc: Gavin Sherry <swm@linuxworld.com.au>,
 | |
|    PostgreSQL-development <pgsql-hackers@postgresql.org>
 | |
| Subject: Re: [HACKERS] Replication 
 | |
| In-Reply-To: Message from mlw
 | |
|     of "Thu, 07 Feb 2002 07:52:23 EST."
 | |
|     <3C627887.CC9FF837@mohawksoft.com> 
 | |
| MIME-Version: 1.0
 | |
| Content-Type: text/plain; charset=us-ascii
 | |
| Date: Thu, 07 Feb 2002 17:35:43 -0500
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Status: OR
 | |
| 
 | |
| I'm not that familiar with the replication issues in PostgreSQL;
 | |
| however, I would be partial to replication that was based upon the
 | |
| playback of the (a?) journal file.  (I believe that the WAL is a
 | |
| journal file.)
 | |
| 
 | |
| By being based upon a journal file, it would be possible to accomplish
 | |
| two significant items.  First, it would be possible to "restore" a
 | |
| database to an exact state just before a failure.  Most commercial
 | |
| databases provide the ability to do this.  Banks, etc. log the journal
 | |
| files directly to tape to provide a complete transaction history such
 | |
| that they can rebuild their database from any given snapshot.  (Note
 | |
| that the journal file needs to be "editable" as a failure may be
 | |
| "delete from x" with a missing where clause.)
 | |
| 
 | |
| This leads directly into the second advantage, the ability to have a
 | |
| replicated database operating anywhere, over any connection on any
 | |
| server.  Speed of writes would not be a factor.  In essence, as long
 | |
| as the replicated database had a snapshot of the database and then was
 | |
| provided with all journal files since the snapshot, it would be
 | |
| possible to build a current database.  If the replicant got behind in
 | |
| the processing, it would catch up when things slowed down.
 | |
| 
 | |
| In my opinion, the first advantage is in many ways most important.
 | |
| Replication becomes simply the restoration of the database in realtime
 | |
| on a second server.  The "replication" task becomes the definition of
 | |
| a protocol for distributing the journal file.  At least one major
 | |
| database vendor does replication (shadowing) in exactly this manner.
 | |
| 
 | |
| Maybe I'm all wet and the journal file and journal playback already
 | |
| exist.  If so, IMHO, basing replication off of this would be the
 | |
| right direction.
 | |
| 
 | |
| 
 | |
| On Thu, 07 Feb 2002 07:52:23 EST, mlw wrote:
 | |
| > 
 | |
| > I have in the back of my mind, an idea of patching into the WAL stuff, and
 | |
| > using that mechanism to push changes out to the slaves.
 | |
| > 
 | |
| > Where one machine is still the master, but no trigger stuff, just a WAL patch.
 | |
| > Perhaps some shared memory paradigm to manage WAL visibility? I'm not sure
 | |
| > exactly, the idea hasn't completely formed yet.
 | |
| > 
 | |
| 
 | |
| 
 | |
| 
 | |
| ---------------------------(end of broadcast)---------------------------
 | |
| TIP 4: Don't 'kill -9' the postmaster
 | |
| 
 | |
| From pgsql-hackers-owner+M18605=candle.pha.pa.us=pgman@postgresql.org Fri Feb  8 00:50:08 2002
 | |
| Return-path: <pgsql-hackers-owner+M18605=candle.pha.pa.us=pgman@postgresql.org>
 | |
| Received: from server1.pgsql.org (www.postgresql.org [64.49.215.9])
 | |
| 	by candle.pha.pa.us (8.11.6/8.10.1) with SMTP id g185o7P27878
 | |
| 	for <pgman@candle.pha.pa.us>; Fri, 8 Feb 2002 00:50:07 -0500 (EST)
 | |
| Received: (qmail 17348 invoked by alias); 8 Feb 2002 05:50:03 -0000
 | |
| Received: from unknown (HELO postgresql.org) (64.49.215.8)
 | |
|   by www.postgresql.org with SMTP; 8 Feb 2002 05:50:03 -0000
 | |
| Received: from lakemtao03.mgt.cox.net (mtao3.east.cox.net [68.1.17.242])
 | |
| 	by postgresql.org (8.11.3/8.11.4) with ESMTP id g185cTE15241
 | |
| 	for <pgsql-hackers@postgresql.org>; Fri, 8 Feb 2002 00:38:29 -0500 (EST)
 | |
| 	(envelope-from darren.johnson@cox.net)
 | |
| Received: from cox.net ([68.10.181.230]) by lakemtao03.mgt.cox.net
 | |
|           (InterMail vM.5.01.04.05 201-253-122-122-105-20011231) with ESMTP
 | |
|           id <20020208053833.YKTV6710.lakemtao03.mgt.cox.net@cox.net>
 | |
|           for <pgsql-hackers@postgresql.org>;
 | |
|           Fri, 8 Feb 2002 00:38:33 -0500
 | |
| Message-ID: <3C636232.6060206@cox.net>
 | |
| Date: Fri, 08 Feb 2002 00:29:22 -0500
 | |
| From: Darren Johnson <darren.johnson@cox.net>
 | |
| User-Agent: Mozilla/5.0 (Windows; U; WinNT4.0; en-US; m18) Gecko/20001108 Netscape6/6.0
 | |
| X-Accept-Language: en
 | |
| MIME-Version: 1.0
 | |
| To: PostgreSQL-development <pgsql-hackers@postgresql.org>
 | |
| Subject: Re: [HACKERS] Replication
 | |
| References: <Pine.LNX.4.33.0202071735360.6435-100000@pcNavYkfAdm1.ykf.navtechinc.com>
 | |
| Content-Type: text/plain; charset=us-ascii; format=flowed
 | |
| Content-Transfer-Encoding: 7bit
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Status: OR
 | |
| 
 | |
| 
 | |
|  >
 | |
|  > The problems with two stage commit type approaches to replication are
 | |
| 
 | |
| IMHO the biggest problem with two-phase commit is that it doesn't scale.
 | |
| The more servers you add to the replica, the slower it goes.  Also,
 | |
| there's the potential for deadlocks across server boundaries.
 | |
| 
 | |
|  >
 | |
|  > 2) All of the databases must be able to communicate with each other at
 | |
|  > all times in order for any edits to work.   If the servers are
 | |
|  > connected over some sort of WAN that periodically has short outages this
 | |
|  > is a problem.   Also if you're using replication because you want to be
 | |
|  > able to take down one of the databases for short periods of time without
 | |
|  > bringing down the others you're in trouble.
 | |
| 
 | |
| All true for the two-phase commit protocol.  To have multi-master
 | |
| replication, you must have all systems communicating, but you can use a
 | |
| multicast group communication system instead of 2PC.  Using total order
 | |
| messaging, you can ensure all changes are delivered to all servers in the
 | |
| replica in the same order.  This group communication system also allows
 | |
| failures to be detected while other servers in the replica continue
 | |
| processing.
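
A rough sketch of what total order delivery buys, assuming a single fixed sequencer. That is only one simple way to get a total order; real group communication systems may do it differently, and the `Sequencer` and `Replica` classes here are hypothetical. Each replica applies changes strictly by sequence number, so two replicas end up with the same history even if the network delivers the messages in different orders.

```python
import itertools


class Sequencer:
    """Stamps every broadcast change with one global sequence number."""

    def __init__(self):
        self._counter = itertools.count(1)

    def stamp(self, change):
        return (next(self._counter), change)


class Replica:
    def __init__(self):
        self.next_seq = 1   # next sequence number this replica may apply
        self.pending = {}   # changes that arrived ahead of their turn
        self.applied = []   # changes applied so far, in order

    def receive(self, stamped):
        seq, change = stamped
        self.pending[seq] = change
        # Apply in sequence order only; hold back anything that came early.
        while self.next_seq in self.pending:
            self.applied.append(self.pending.pop(self.next_seq))
            self.next_seq += 1


sequencer = Sequencer()
messages = [sequencer.stamp(c) for c in ["INSERT a", "UPDATE a", "DELETE a"]]

r1, r2 = Replica(), Replica()
for m in messages:            # arrives in order at the first replica
    r1.receive(m)
for m in reversed(messages):  # arrives reversed at the second replica
    r2.receive(m)
```

Failure detection, which the message above also mentions, is left out of this sketch.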
 | |
| 
 | |
| A few of us are working with this theory, and trying to integrate with
 | |
| 7.2.  There is a working model for 6.4, but it's very limited (insert,
 | |
| update, and delete).  We are currently hosted at
 | |
| 
 | |
| http://gborg.postgresql.org/project/pgreplication/projdisplay.php
 | |
| But the site has been down the last 2 days.  I've contacted the
 | |
| webmaster, but haven't seen any results yet.  If anyone knows what's
 | |
| going on with gborg, I'd appreciate a status.
 | |
| 
 | |
| Darren
 | |
| 
 | |
| 
 | |
| 
 | |
| From pgsql-hackers-owner+M18617=candle.pha.pa.us=pgman@postgresql.org Fri Feb  8 06:20:44 2002
 | |
| Return-path: <pgsql-hackers-owner+M18617=candle.pha.pa.us=pgman@postgresql.org>
 | |
| Received: from server1.pgsql.org (www.postgresql.org [64.49.215.9])
 | |
| 	by candle.pha.pa.us (8.11.6/8.10.1) with SMTP id g18BKhP06132
 | |
| 	for <pgman@candle.pha.pa.us>; Fri, 8 Feb 2002 06:20:43 -0500 (EST)
 | |
| Received: (qmail 90815 invoked by alias); 8 Feb 2002 11:20:40 -0000
 | |
| Received: from unknown (HELO postgresql.org) (64.49.215.8)
 | |
|   by www.postgresql.org with SMTP; 8 Feb 2002 11:20:40 -0000
 | |
| Received: from laptop.kieser.demon.co.uk (kieser.demon.co.uk [62.49.6.72])
 | |
| 	by postgresql.org (8.11.3/8.11.4) with ESMTP id g18B9ZE89589
 | |
| 	for <pgsql-hackers@postgresql.org>; Fri, 8 Feb 2002 06:09:36 -0500 (EST)
 | |
| 	(envelope-from brad@kieser.net)
 | |
| Received: from laptop.kieser.demon.co.uk (localhost.localdomain [127.0.0.1])
 | |
| 	by laptop.kieser.demon.co.uk (Postfix) with SMTP
 | |
| 	id 598393A132; Fri,  8 Feb 2002 11:09:36 +0000 (GMT)
 | |
| From: Bradley Kieser <brad@kieser.net>
 | |
| Date: Fri, 08 Feb 2002 11:09:36 GMT
 | |
| Message-ID: <20020208.11093600@laptop.kieser.demon.co.uk>
 | |
| Subject: Re: [HACKERS] Replication
 | |
| To: Darren Johnson <darren.johnson@cox.net>
 | |
| cc: PostgreSQL-development <pgsql-hackers@postgresql.org>
 | |
| In-Reply-To: <3C636232.6060206@cox.net>
 | |
| References: <Pine.LNX.4.33.0202071735360.6435-100000@pcNavYkfAdm1.ykf.navtechinc.com> <3C636232.6060206@cox.net>
 | |
| X-Mailer: Mozilla/3.0 (compatible; StarOffice/5.2;Linux)
 | |
| X-Priority: 3 (Normal)
 | |
| MIME-Version: 1.0
 | |
| Content-Type: text/plain; charset=ISO-8859-1
 | |
| Content-Transfer-Encoding: 8bit
 | |
| X-MIME-Autoconverted: from quoted-printable to 8bit by postgresql.org id g18BJoF90352
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Status: OR
 | |
| 
 | |
| Darren,
 | |
| Given that different replication strategies will probably be developed 
 | |
| for PG, do you envisage DBAs being able to select the type of replication 
 | |
| for their installation? I.e. replication being selectable, rather like 
 | |
| storage structures?
 | |
| 
 | |
| Would be a killer bit of flexibility, given how enormous the impact of 
 | |
| replication will be to corporate adoption of PG.
 | |
| 
 | |
| Brad 
 | |
| 
 | |
| 
 | |
| 
 | |
| From pgsql-hackers-owner+M18642=candle.pha.pa.us=pgman@postgresql.org Fri Feb  8 12:40:36 2002
 | |
| Return-path: <pgsql-hackers-owner+M18642=candle.pha.pa.us=pgman@postgresql.org>
 | |
| Received: from server1.pgsql.org (www.postgresql.org [64.49.215.9])
 | |
| 	by candle.pha.pa.us (8.11.6/8.10.1) with SMTP id g18HeZP08450
 | |
| 	for <pgman@candle.pha.pa.us>; Fri, 8 Feb 2002 12:40:35 -0500 (EST)
 | |
| Received: (qmail 74089 invoked by alias); 8 Feb 2002 17:40:30 -0000
 | |
| Received: from unknown (HELO postgresql.org) (64.49.215.8)
 | |
|   by www.postgresql.org with SMTP; 8 Feb 2002 17:40:30 -0000
 | |
| Received: from lakemtao03.mgt.cox.net (mtao3.east.cox.net [68.1.17.242])
 | |
| 	by postgresql.org (8.11.3/8.11.4) with ESMTP id g18HbwE73437
 | |
| 	for <pgsql-hackers@postgresql.org>; Fri, 8 Feb 2002 12:37:58 -0500 (EST)
 | |
| 	(envelope-from darren.johnson@cox.net)
 | |
| Received: from cox.net ([68.10.181.230]) by lakemtao03.mgt.cox.net
 | |
|           (InterMail vM.5.01.04.05 201-253-122-122-105-20011231) with ESMTP
 | |
|           id <20020208173804.DKQS6710.lakemtao03.mgt.cox.net@cox.net>;
 | |
|           Fri, 8 Feb 2002 12:38:04 -0500
 | |
| Message-ID: <3C63FB71.206@cox.net>
 | |
| Date: Fri, 08 Feb 2002 11:23:13 -0500
 | |
| From: Darren Johnson <darren.johnson@cox.net>
 | |
| User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; m18) Gecko/20010131 Netscape6/6.01
 | |
| X-Accept-Language: en
 | |
| MIME-Version: 1.0
 | |
| To: Bradley Kieser <brad@kieser.net>
 | |
| cc: pgsql-hackers@postgresql.org
 | |
| Subject: Re: [HACKERS] Replication
 | |
| References: <Pine.LNX.4.33.0202071735360.6435-100000@pcNavYkfAdm1.ykf.navtechinc.com> <3C636232.6060206@cox.net> <20020208.11093600@laptop.kieser.demon.co.uk>
 | |
| Content-Type: text/plain; charset=us-ascii; format=flowed
 | |
| Content-Transfer-Encoding: 7bit
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Status: OR
 | |
| 
 | |
| > 
 | |
| > Given that different replication strategies will probably be developed 
 | |
| > for PG, do you envisage DBAs to be able to select the type of replication 
 | |
| > for their installation? I.e. Replication being selectable rather like 
 | |
| > storage structures?
 | |
| 
 | |
| I can't speak for other replication solutions, but we are using the
 | |
| --with-replication or -r parameter when starting postmaster.  Some day I
 | |
| hope there will be parameters for master/slave, partial/full, and
 | |
| sync/async, but it will be some time before we cross those bridges.
 | |
| 
 | |
| Darren
 | |
| 
 | |
| 
 | |
| 
 | |
| 
 | |
| 
 | |
| From pgsql-hackers-owner+M18658=candle.pha.pa.us=pgman@postgresql.org Fri Feb  8 14:42:40 2002
 | |
| Return-path: <pgsql-hackers-owner+M18658=candle.pha.pa.us=pgman@postgresql.org>
 | |
| Received: from server1.pgsql.org (www.postgresql.org [64.49.215.9])
 | |
| 	by candle.pha.pa.us (8.11.6/8.10.1) with SMTP id g18JgdP28166
 | |
| 	for <pgman@candle.pha.pa.us>; Fri, 8 Feb 2002 14:42:39 -0500 (EST)
 | |
| Received: (qmail 18650 invoked by alias); 8 Feb 2002 19:42:39 -0000
 | |
| Received: from unknown (HELO postgresql.org) (64.49.215.8)
 | |
|   by www.postgresql.org with SMTP; 8 Feb 2002 19:42:39 -0000
 | |
| Received: from enigma.trueimpact.net (enigma.trueimpact.net [209.82.45.201])
 | |
| 	by postgresql.org (8.11.3/8.11.4) with ESMTP id g18JYBE17341
 | |
| 	for <pgsql-hackers@postgresql.org>; Fri, 8 Feb 2002 14:34:11 -0500 (EST)
 | |
| 	(envelope-from rjonasz@trueimpact.com)
 | |
| Received: from nietzsche.trueimpact.net (unknown [209.82.45.200])
 | |
| 	by enigma.trueimpact.net (Postfix) with ESMTP id A785066B04
 | |
| 	for <pgsql-hackers@postgresql.org>; Fri,  8 Feb 2002 14:33:28 -0500 (EST)
 | |
| Date: Fri, 8 Feb 2002 14:34:34 -0500 (EST)
 | |
| From: Randall Jonasz <rjonasz@trueimpact.com>
 | |
| X-X-Sender: <rjonasz@nietzsche.trueimpact.net>
 | |
| To: PostgreSQL-development <pgsql-hackers@postgresql.org>
 | |
| Subject: Re: [HACKERS] Replication
 | |
| In-Reply-To: <3C627887.CC9FF837@mohawksoft.com>
 | |
| Message-ID: <20020208142932.H6545-100000@nietzsche.trueimpact.net>
 | |
| MIME-Version: 1.0
 | |
| Content-Type: TEXT/PLAIN; charset=US-ASCII
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Status: OR
 | |
| 
 | |
| I've been looking into database replication theory lately and have found
 | |
| some interesting papers discussing various approaches.  (Here's
 | |
| one paper that struck me as being very helpful,
 | |
| http://citeseer.nj.nec.com/460405.html )  So far I favour an
 | |
| eager replication system which is predicated on a read local/write all
 | |
| available. The system should not depend on two phase commit or primary
 | |
| copy algorithms.  The former leads to the whole system being only as quick as
 | |
| the slowest machine.  In addition, 2 phase commit involves 2n messages for
 | |
| each transaction which does not scale well at all.  This idea will also
 | |
| have to take into account a crashed node which did not ack a transaction.
 | |
| The primary copy algorithms I've seen suffer from a single point of
 | |
| failure and potential bottlenecks at the primary node.
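
A back-of-the-envelope sketch of the two-phase commit cost argument above. It counts only the messages the coordinator sends (one prepare and one commit per participant, matching the 2n figure in the text; replies are not counted) and assumes each phase contacts all participants in parallel. The function name and the round-trip numbers are made up for illustration.

```python
def two_phase_commit_cost(round_trip_ms):
    """Estimate the cost of one 2PC transaction.

    round_trip_ms: one round-trip time per participant, in milliseconds.
    """
    n = len(round_trip_ms)
    slowest = max(round_trip_ms)
    return {
        # One prepare plus one commit message per participant.
        "coordinator_messages": 2 * n,
        # Each of the two phases has to wait for the slowest participant.
        "latency_ms": 2 * slowest,
    }


fast_cluster = two_phase_commit_cost([5, 8, 6])
with_slow_node = two_phase_commit_cost([5, 8, 200])
```

Adding one slow node leaves the message count unchanged but sets the latency of every transaction, which is the "as quick as the slowest machine" effect.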
 | |
| 
 | |
| Instead I like the master to master or peer to peer algorithm as discussed
 | |
| in the above paper.  This approach accounts for network partitions, nodes
 | |
| leaving and joining a cluster and the ability to commit a transaction once
 | |
| the communication module has determined the total order of the said
 | |
| transaction, i.e. no need for waiting for acks.   This scales well and
 | |
| research has shown it to increase the number of transactions/second a
 | |
| database cluster can handle over a single node.
 | |
| 
 | |
| Postgres-R is another interesting approach which I think should be taken
 | |
| seriously. Anyone interested can read a paper on this at
 | |
| http://citeseer.nj.nec.com/330257.html
 | |
| 
 | |
| Anyways, my two cents
 | |
| 
 | |
| Randall Jonasz
 | |
| Software Engineer
 | |
| Click2net Inc.
 | |
| 
 | |
| 
 | |
| On Thu, 7 Feb 2002, mlw wrote:
 | |
| 
 | |
| > Gavin Sherry wrote:
 | |
| > > Naturally, this would slow down writes to the system (possibly a lot
 | |
| > > depending on the performance difference between the executing machine and
 | |
| > > the least powerful machine in the cluster), but most usages of postgresql
 | |
| > > are read intensive, not write.
 | |
| > >
 | |
| > > Any reason this model would not work?
 | |
| >
 | |
| > What, then is the purpose of replication to multiple masters?
 | |
| >
 | |
| > I can think of only two reasons why you want replication. (1) Redundancy, make
 | |
| > sure that if one server dies, then another server has the same data and is used
 | |
| > seamlessly. (2) Increase performance over one system.
 | |
| >
 | |
| > In reason (1) I submit that a server load balance which sits on top of
 | |
| > PostgreSQL, and executes writes on both servers while distributing reads would
 | |
| > be best. This is a HUGE project. The load balancer must know EXACTLY how the
 | |
| > system is configured, which includes all functions and everything.
 | |
| >
 | |
| > In reason (2) your system would fail to provide the scalability that would be
 | |
| > needed. If writes take a long time, but reads are fine, what is the difference
 | |
| > between the trigger based replicator?
 | |
| >
 | |
| > I have in the back of my mind, an idea of patching into the WAL stuff, and
 | |
| > using that mechanism to push changes out to the slaves.
 | |
| >
 | |
| > Where one machine is still the master, but no trigger stuff, just a WAL patch.
 | |
| > Perhaps some shared memory paradigm to manage WAL visibility? I'm not sure
 | |
| > exactly, the idea hasn't completely formed yet.
 | |
| >
 | |
| 
 | |
| 
 | |
| 
 | |
| From pgsql-hackers-owner+M18660=candle.pha.pa.us=pgman@postgresql.org Fri Feb  8 15:20:32 2002
 | |
| Return-path: <pgsql-hackers-owner+M18660=candle.pha.pa.us=pgman@postgresql.org>
 | |
| Received: from server1.pgsql.org (www.postgresql.org [64.49.215.9])
 | |
| 	by candle.pha.pa.us (8.11.6/8.10.1) with SMTP id g18KKSP03731
 | |
| 	for <pgman@candle.pha.pa.us>; Fri, 8 Feb 2002 15:20:29 -0500 (EST)
 | |
| Received: (qmail 28961 invoked by alias); 8 Feb 2002 20:20:27 -0000
 | |
| Received: from unknown (HELO postgresql.org) (64.49.215.8)
 | |
|   by www.postgresql.org with SMTP; 8 Feb 2002 20:20:27 -0000
 | |
| Received: from inflicted.crimelabs.net (crimelabs.net [66.92.101.112])
 | |
| 	by postgresql.org (8.11.3/8.11.4) with ESMTP id g18KC7E27667
 | |
| 	for <pgsql-hackers@postgresql.org>; Fri, 8 Feb 2002 15:12:07 -0500 (EST)
 | |
| 	(envelope-from bpalmer@crimelabs.net)
 | |
| Received: from mizer.crimelabs.net (mizer.crimelabs.net [192.168.88.10])
 | |
| 	by inflicted.crimelabs.net (Postfix) with ESMTP
 | |
| 	id 1066F8787; Fri,  8 Feb 2002 15:12:08 -0500 (EST)
 | |
| Date: Fri, 8 Feb 2002 15:12:00 -0500 (EST)
 | |
| From: bpalmer <bpalmer@crimelabs.net>
 | |
| To: Randall Jonasz <rjonasz@trueimpact.com>
 | |
| cc: PostgreSQL-development <pgsql-hackers@postgresql.org>
 | |
| Subject: Re: [HACKERS] Replication
 | |
| In-Reply-To: <20020208142932.H6545-100000@nietzsche.trueimpact.net>
 | |
| Message-ID: <Pine.BSO.4.43.0202081510130.21860-100000@mizer.crimelabs.net>
 | |
| MIME-Version: 1.0
 | |
| Content-Type: TEXT/PLAIN; charset=US-ASCII
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Status: OR
 | |
| 
 | |
| I've not looked at the first paper, but I will.
 | |
| 
 | |
| > Postgres-R is another interesting approach which I think should be taken
 | |
| > seriously. Anyone interested can read a paper on this at
 | |
| > http://citeseer.nj.nec.com/330257.html
 | |
| 
 | |
| I would point you to the info on gborg,  but it seems to be down at the
 | |
| moment.
 | |
| 
 | |
| - Brandon
 | |
| 
 | |
| ----------------------------------------------------------------------------
 | |
|  c: 646-456-5455                                            h: 201-798-4983
 | |
|  b. palmer,  bpalmer@crimelabs.net           pgp:crimelabs.net/bpalmer.pgp5
 | |
| 
 | |
| 
 | |
| 
 | |
| From pgsql-hackers-owner+M18666=candle.pha.pa.us=pgman@postgresql.org Fri Feb  8 17:41:03 2002
 | |
| Return-path: <pgsql-hackers-owner+M18666=candle.pha.pa.us=pgman@postgresql.org>
 | |
| Received: from server1.pgsql.org (www.postgresql.org [64.49.215.9])
 | |
| 	by candle.pha.pa.us (8.11.6/8.10.1) with SMTP id g18Mf2P18046
 | |
| 	for <pgman@candle.pha.pa.us>; Fri, 8 Feb 2002 17:41:03 -0500 (EST)
 | |
| Received: (qmail 63057 invoked by alias); 8 Feb 2002 22:41:02 -0000
 | |
| Received: from unknown (HELO postgresql.org) (64.49.215.8)
 | |
|   by www.postgresql.org with SMTP; 8 Feb 2002 22:41:02 -0000
 | |
| Received: from lakemtao03.mgt.cox.net (mtao3.east.cox.net [68.1.17.242])
 | |
| 	by postgresql.org (8.11.3/8.11.4) with ESMTP id g18MR9E60361
 | |
| 	for <pgsql-hackers@postgresql.org>; Fri, 8 Feb 2002 17:27:11 -0500 (EST)
 | |
| 	(envelope-from darren.johnson@cox.net)
 | |
| Received: from cox.net ([68.10.181.230]) by lakemtao03.mgt.cox.net
 | |
|           (InterMail vM.5.01.04.05 201-253-122-122-105-20011231) with ESMTP
 | |
|           id <20020208222634.GTRG6710.lakemtao03.mgt.cox.net@cox.net>;
 | |
|           Fri, 8 Feb 2002 17:26:34 -0500
 | |
| Message-ID: <3C643F0F.70303@cox.net>
 | |
| Date: Fri, 08 Feb 2002 16:11:43 -0500
 | |
| From: Darren Johnson <darren.johnson@cox.net>
 | |
| User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; m18) Gecko/20010131 Netscape6/6.01
 | |
| X-Accept-Language: en
 | |
| MIME-Version: 1.0
 | |
| To: Randall Jonasz <rjonasz@trueimpact.com>
 | |
| cc: PostgreSQL-development <pgsql-hackers@postgresql.org>
 | |
| Subject: Re: [HACKERS] Replication
 | |
| References: <20020208142932.H6545-100000@nietzsche.trueimpact.net>
 | |
| Content-Type: text/plain; charset=us-ascii; format=flowed
 | |
| Content-Transfer-Encoding: 7bit
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Status: OR
 | |
| 
 | |
| 
 | |
| > I've been looking into database replication theory lately and have found
 | |
| > some interesting papers discussing various approaches.  (Here's
 | |
| > one paper that struck me as being very helpful,
 | |
| > http://citeseer.nj.nec.com/460405.html )
 | |
| 
 | |
| 
 | |
| Here is another one from that same group that addresses the WAN issues.
 | |
| 
 | |
| > http://www.cnds.jhu.edu/pub/papers/cnds-2002-1.pdf
 | |
| 
 | |
| 
 | |
| enjoy,
 | |
| 
 | |
| Darren
 | |
| 
 | |
| 
 | |
| 
 | |
| 
 | |
| 
 | |
| From pgsql-hackers-owner+M18674=candle.pha.pa.us=pgman@postgresql.org Fri Feb  8 19:20:30 2002
 | |
| Return-path: <pgsql-hackers-owner+M18674=candle.pha.pa.us=pgman@postgresql.org>
 | |
| Received: from server1.pgsql.org (www.postgresql.org [64.49.215.9])
 | |
| 	by candle.pha.pa.us (8.11.6/8.10.1) with SMTP id g190KTP26980
 | |
| 	for <pgman@candle.pha.pa.us>; Fri, 8 Feb 2002 19:20:29 -0500 (EST)
 | |
| Received: (qmail 88124 invoked by alias); 9 Feb 2002 00:20:27 -0000
 | |
| Received: from unknown (HELO postgresql.org) (64.49.215.8)
 | |
|   by www.postgresql.org with SMTP; 9 Feb 2002 00:20:27 -0000
 | |
| Received: from localhost.localdomain (bgp01077650bgs.wanarb01.mi.comcast.net [68.40.135.112])
 | |
| 	by postgresql.org (8.11.3/8.11.4) with ESMTP id g190H3E87489
 | |
| 	for <pgsql-hackers@postgresql.org>; Fri, 8 Feb 2002 19:17:03 -0500 (EST)
 | |
| 	(envelope-from camber@ais.org)
 | |
| Received: from localhost (camber@localhost)
 | |
| 	by localhost.localdomain (8.11.6/8.11.6) with ESMTP id g190H0P18427;
 | |
| 	Fri, 8 Feb 2002 19:17:00 -0500
 | |
| X-Authentication-Warning: localhost.localdomain: camber owned process doing -bs
 | |
| Date: Fri, 8 Feb 2002 19:17:00 -0500 (EST)
 | |
| From: Brian Bruns <camber@ais.org>
 | |
| X-X-Sender: <camber@localhost.localdomain>
 | |
| To: Randall Jonasz <rjonasz@trueimpact.com>
 | |
| cc: PostgreSQL-development <pgsql-hackers@postgresql.org>
 | |
| Subject: Re: [HACKERS] Replication
 | |
| In-Reply-To: <20020208142932.H6545-100000@nietzsche.trueimpact.net>
 | |
| Message-ID: <Pine.LNX.4.33.0202081904190.18420-100000@localhost.localdomain>
 | |
| MIME-Version: 1.0
 | |
| Content-Type: TEXT/PLAIN; charset=US-ASCII
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Status: OR
 | |
| 
 | |
| > > I have in the back of my mind, an idea of patching into the WAL stuff, and
 | |
| > > using that mechanism to push changes out to the slaves.
 | |
| > >
 | |
| > > Where one machine is still the master, but no trigger stuff, just a WAL patch.
 | |
| > > Perhaps some shared memory paradigm to manage WAL visibility? I'm not sure
 | |
| > > exactly, the idea hasn't completely formed yet.
 | |
| > >
 | |
| 
 | |
| FWIW, Sybase Replication Server does just such a thing.  
 | |
| 
 | |
| They have a secondary log marker (it prevents the log from truncating past
 | |
| the oldest unreplicated transaction).  A thread within the system called
 | |
| the "rep agent" (it used to be a separate process called the LTM) reads
 | |
| the log and forwards it to the rep server.  Once the rep server has the
 | |
| whole transaction and it is written to a stable device (aka synced to
 | |
| disk), the rep server responds to the LTM telling it that it's OK to move
 | |
| the log marker forward.
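
The log-marker handshake above could be sketched roughly as follows. This is a toy model, not Sybase's API: `TransactionLog`, `replicated_lsn` and the method names are assumptions introduced here. The point it illustrates is that truncation is capped at the last position the rep server has acknowledged as stable.

```python
class TransactionLog:
    """Toy log that can only be truncated up to the replication marker."""

    def __init__(self):
        self.records = []        # (lsn, payload) pairs, oldest first
        self.next_lsn = 1
        self.replicated_lsn = 0  # secondary marker: last LSN confirmed downstream

    def append(self, payload):
        lsn = self.next_lsn
        self.records.append((lsn, payload))
        self.next_lsn += 1
        return lsn

    def unreplicated(self):
        """Records the rep agent still has to forward."""
        return [r for r in self.records if r[0] > self.replicated_lsn]

    def acknowledge(self, lsn):
        """Called once the rep server reports the data is on stable storage."""
        self.replicated_lsn = max(self.replicated_lsn, lsn)

    def truncate(self, upto_lsn):
        """Drop records up to upto_lsn, but never past the replication marker."""
        limit = min(upto_lsn, self.replicated_lsn)
        self.records = [r for r in self.records if r[0] > limit]
        return limit


log = TransactionLog()
for stmt in ["INSERT 1", "INSERT 2", "INSERT 3"]:
    log.append(stmt)

blocked = log.truncate(upto_lsn=3)  # nothing acknowledged yet, nothing is dropped
log.acknowledge(2)                  # rep server has LSNs 1-2 on disk
allowed = log.truncate(upto_lsn=3)  # only LSNs 1-2 may be discarded
```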
 | |
| 
 | |
| Anyway, once the replication server proper has the transaction it uses a 
 | |
| publish/subscribe methodology to see who wants to get the update.
 | |
| 
 | |
| Bidirectional replication is done by making two one-way replications.  The 
 | |
| whole thing is table based; it marks the tables as replicated or not in 
 | |
| the database to save the trip to the rep server for unreplicated tables.
 | |
| 
 | |
| Plus you can take parts of a database (replicate all rows where the 
 | |
| country is "us" to this server and all the rows with "uk" to that server).  
 | |
| Or, conversely, you can roll up smaller regional databases into bigger ones; 
 | |
| it's very flexible.
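
The row-level publish/subscribe routing described above might look something like this. The subscriber names, the `country` column and the `route` helper are hypothetical, chosen only to mirror the "us"/"uk" example; each subscription is just a predicate over a row.

```python
def route(row, subscriptions):
    """Return the names of the subscribers whose predicate accepts the row."""
    return sorted(name for name, wants in subscriptions.items() if wants(row))


subscriptions = {
    "us_server": lambda row: row["country"] == "us",
    "uk_server": lambda row: row["country"] == "uk",
    # A roll-up target that takes every regional row.
    "global_rollup": lambda row: True,
}

us_targets = route({"id": 1, "country": "us"}, subscriptions)
uk_targets = route({"id": 2, "country": "uk"}, subscriptions)
other_targets = route({"id": 3, "country": "se"}, subscriptions)
```

Splitting a database by region and rolling regions up into a larger one are then the same mechanism with different predicates.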
 | |
| 
 | |
| 
 | |
| Cheers,
 | |
| 
 | |
| Brian
 | |
| 
 | |
| 
 | |
| 
 |