mirror of
				https://github.com/postgres/postgres.git
				synced 2025-10-24 01:29:19 +03:00 
			
		
		
		
	
		
			
				
	
	
		
			3380 lines
		
	
	
		
			149 KiB
		
	
	
	
		
			Plaintext
		
	
	
	
	
	
			
		
		
	
	
			3380 lines
		
	
	
		
			149 KiB
		
	
	
	
		
			Plaintext
		
	
	
	
	
	
| From pgsql-hackers-owner+M77861=pgman=candle.pha.pa.us@postgresql.org Fri Dec 23 05:19:20 2005
 | |
| X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
 | |
| X-Greylist: from auto-whitelisted by SQLgrey-
 | |
| Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
 | |
| From: Simon Riggs <simon@2ndquadrant.com>
 | |
| To: Stephen Frost <sfrost@snowman.net>
 | |
| cc: Martijn van Oosterhout <kleptog@svana.org>, 
 | |
| 	   Jim C. Nasby <jnasby@pervasive.com>, 
 | |
| 	   bizgres-general <bizgres-general@pgfoundry.org>, 
 | |
| 	  pgsql-hackers@postgresql.org
 | |
| In-Reply-To: <20051222223625.GC6026@ns.snowman.net>
 | |
| References: <1135261893.2964.502.camel@localhost.localdomain>
 | |
| 	  <20051222183751.GG72143@pervasive.com>  <20051222201826.GH21783@svana.org>
 | |
| 	  <1135289583.2964.536.camel@localhost.localdomain>
 | |
| 	  <20051222223625.GC6026@ns.snowman.net>
 | |
| Date: Fri, 23 Dec 2005 10:18:43 +0000
 | |
| Message-ID: <1135333123.2964.589.camel@localhost.localdomain>
 | |
| X-Mailer: Evolution 2.2.3 (2.2.3-2.fc4) 
 | |
| X-Virus-Scanned: by amavisd-new at hub.org
 | |
| X-Spam-Status: No, score=0.1 required=5 tests=[AWL=0.100]
 | |
| X-Spam-Score: 0.1
 | |
| X-Spam-Level: 
 | |
| X-Mailing-List: pgsql-hackers
 | |
| List-Archive: <http://archives.postgresql.org/pgsql-hackers>
 | |
| List-Help: <mailto:majordomo@postgresql.org?body=help>
 | |
| List-Id: <pgsql-hackers.postgresql.org>
 | |
| List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
 | |
| List-Post: <mailto:pgsql-hackers@postgresql.org>
 | |
| List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
 | |
| List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Content-Length:  4728
 | |
| 
 | |
| On Thu, 2005-12-22 at 17:36 -0500, Stephen Frost wrote:
 | |
| > * Simon Riggs (simon@2ndquadrant.com) wrote:
 | |
| > > On Thu, 2005-12-22 at 21:18 +0100, Martijn van Oosterhout wrote:
 | |
| > > > Considering "WAL bypass" is code for "breaks PITR"
 | |
| > > 
 | |
| > > No it isn't. All of the WAL bypass logic does *not* operate when PITR is
 | |
| > > active. The WAL bypass logic is aimed at Data Warehouses, which
 | |
| > > typically never operate in PITR mode for performance reasons, however
 | |
| > > the choice is yours.
 | |
| 
 | |
| OK, thanks for saying all of that; you probably speak for many in
 | |
| raising these concerns. I'll answer each bit as we come to it. Suffice
 | |
| to say, your concerns are good and so are the answers:
 | |
| 
 | |
| > Eh?  PITR mode is bad for performance?  Maybe I missed something but I
 | |
| > wouldn't have thought PITR would degrade regular performance all that
 | |
| > badly.  
 | |
| 
 | |
| PITR mode is *not* bad for performance. On a very heavily loaded
 | |
| write-intensive test system, the general PITR overhead on regular
 | |
| performance was around 1% - so almost negligible.
 | |
| 
 | |
| We have been discussing a number of optimizations to specific commands
 | |
| that would allow them to avoid writing WAL and thus speed up their
 | |
| performance. If archive_command is set then WAL will always be written;
 | |
| if it is not set then these commands will (or could) go faster:
 | |
| 
 | |
| - CREATE TABLE AS SELECT (in 8.1)
 | |
| - COPY LOCK (patch submitted)
 | |
| - COPY in same transaction as CREATE TABLE (patch submitted)
 | |
| - INSERT SELECT in same transaction as CREATE TABLE (this discussion)
 | |
| 
 | |
| (There are a number of other conditions also, such as there must be no
 | |
| indexes on a table. All of which now documented with the patch)
 | |
| 
 | |
| > So long as it doesn't take 15 minutes or some such to move the
 | |
| > WAL to somewhere else (and I'm not sure that'd even slow things down..).
 | |
| > For a Data Warehouse, have you got a better way of doing backups such
 | |
| > that you don't lose at minimum most of a day's work?  
 | |
| 
 | |
| Yes. Don't just use the backup facilities on their own. Think about how
 | |
| the architecture of your systems will work and see if there is a better
 | |
| way when you look at very large systems.
 | |
| 
 | |
| > I'm not exactly a
 | |
| > big fan do doing a pg_dump every night either given that the database is
 | |
| > 360GB.  Much nicer to take a weekly dump of the database and then do
 | |
| > PITR for a week or two before taking another dump of the db.
 | |
| 
 | |
| e.g. Keep your reference data (low volume) in an Operational Data Store
 | |
| (ODS) database, protected by archiving. Keep your main fact data (high
 | |
| volume) in the Data Warehouse, but save the data in slices as you load
 | |
| it, so that a recovery is simply a reload of the database: no PITR or
 | |
| pg_dump required, so high performance data transformation and load work
 | |
| is possible. This is a commonly used architectural design pattern.
 | |
| 
 | |
| > I like the idea of making COPY go faster, but please don't break my
 | |
| > backup system while you're at it.  
 | |
| 
 | |
| On a personal note, I would only add that I spent a long time working on
 | |
| PITR and I would never design anything that would intentionally break it
 | |
| (nor would patches be accepted that did that). That probably gives me
 | |
| the confidence to approach designs that might look like I'm doing that,
 | |
| but without actually straying over the edge.
 | |
| 
 | |
| > I'm honestly kind of nervous about
 | |
| > what you mean by checking it PITR is active- how is that done, exactly?
 | |
| > Check if you have a script set to rotate the logs elsewhere?  Or is it
 | |
| > checking if you're in the taking-a-full-database-backup stage?  Or what?
 | |
| 
 | |
| Internally, we use XLogArchivingActive(). Externally this will be set
 | |
| when the admin sets archive_command to a particular value.
 | |
| 
 | |
| My original preference was for a parameter called archive_mode= ON | OFF
 | |
| which would allow us to more easily discuss this, but this does not
 | |
| currently exist.
 | |
| 
 | |
| > What's the performance decrease when using PITR, and what's it from?  Is
 | |
| > it just that COPY isn't as fast?  Honestly, I could live with COPY being
 | |
| > not as fast as it could be if my backups work. :)
 | |
| 
 | |
| These commands will not be optimized for speed when archive_command is set:
 | |
| - CREATE TABLE AS SELECT (in 8.1)
 | |
| - COPY LOCK (patch submitted)
 | |
| 
 | |
| > Sorry for sounding concerned but, well, backups are very important and
 | |
| > so is performance and I'm afraid either I've not read all the
 | |
| > documentation about the issues being discussed here or there isn't
 | |
| > enough out there to make sense of it all yet. :)
 | |
| 
 | |
| If you choose PITR, then you are safe. If you do not, the crash recovery
 | |
| of the database is not endangered by these optimizations.
 | |
| 
 | |
| Hope that covers all of your concerns?
 | |
| 
 | |
| I'm just writing a course that explains many of these techniques,
 | |
| available in the New Year.
 | |
| 
 | |
| Best Regards, Simon Riggs
 | |
| 
 | |
| 
 | |
| ---------------------------(end of broadcast)---------------------------
 | |
| TIP 2: Don't 'kill -9' the postmaster
 | |
| 
 | |
| From pgsql-hackers-owner+M78004=pgman=candle.pha.pa.us@postgresql.org Wed Dec 28 20:59:03 2005
 | |
| X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
 | |
| X-Greylist: from auto-whitelisted by SQLgrey-
 | |
| From: Bruce Momjian <pgman@candle.pha.pa.us>
 | |
| Message-ID: <200512290158.jBT1wEK28785@candle.pha.pa.us>
 | |
| Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
 | |
| In-Reply-To: <20051226122206.GA12934@svana.org>
 | |
| To: Martijn van Oosterhout <kleptog@svana.org>
 | |
| Date: Wed, 28 Dec 2005 20:58:14 -0500 (EST)
 | |
| cc: Simon Riggs <simon@2ndquadrant.com>, Tom Lane <tgl@sss.pgh.pa.us>, 
 | |
| 	   Greg Stark <gsstark@mit.edu>, Rod Taylor <pg@rbt.ca>, 
 | |
| 	   Qingqing Zhou <zhouqq@cs.toronto.edu>, pgsql-hackers@postgresql.org
 | |
| X-Mailer: ELM [version 2.4ME+ PL121 (25)]
 | |
| X-Virus-Scanned: by amavisd-new at hub.org
 | |
| X-Spam-Status: No, score=0.122 required=5 tests=[AWL=0.122]
 | |
| X-Spam-Score: 0.122
 | |
| X-Spam-Level: 
 | |
| X-Mailing-List: pgsql-hackers
 | |
| List-Archive: <http://archives.postgresql.org/pgsql-hackers>
 | |
| List-Help: <mailto:majordomo@postgresql.org?body=help>
 | |
| List-Id: <pgsql-hackers.postgresql.org>
 | |
| List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
 | |
| List-Post: <mailto:pgsql-hackers@postgresql.org>
 | |
| List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
 | |
| List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Content-Length:  3461
 | |
| 
 | |
| 
 | |
| Having read through this thread, I would like to propose a
 | |
| syntax/behavior.
 | |
| 
 | |
| I think we all now agree that the logging is more part of the table than
 | |
| the command itself.  Right now we have a COPY LOCK patch, but people are
 | |
| going to want to control logging for INSERT INTO ... SELECT, and UPDATE,
 | |
| and all sorts of other things, so I think we are best adding an ALTER
 | |
| TABLE capability.  I am thinking of this syntax:
 | |
| 
 | |
| 	ALTER TABLE name RELIABILITY option
 | |
| 
 | |
| where "option" is:
 | |
| 
 | |
| 	DROP [ TABLE ON CRASH ]
 | |
| 	DELETE [ ROWS ON CRASH ]
 | |
| 	EXCLUSIVE
 | |
| 	SHARE
 | |
| 
 | |
| Let me explain each option.  DROP would drop the table on a restart
 | |
| after a non-clean shutdown.  It would do _no_ logging on the table and
 | |
| allow concurrent access, plus index access.  DELETE is the same as DROP,
 | |
| but it just truncates the table (perhaps TRUNCATE is a better word).
 | |
| 
 | |
| EXCLUSIVE would allow only a single session to modify the table, and
 | |
| would do all changes by appending to the table, similar to COPY LOCK. 
 | |
| EXCLUSIVE would also not allow indexes because those can not be isolated
 | |
| like appending to the heap.  EXCLUSIVE would write all dirty shared
 | |
| buffers for the table and fsync them before committing.  SHARE is the
 | |
| functionality we have now, with full logging.
 | |
| 
 | |
| Does this get us any closer to a TODO item?  It isn't great, but I think
 | |
| it is pretty clear, and I assume pg_dump would use ALTER to load each
 | |
| table.  The advanage is that the COPY statements themselves are
 | |
| unchanged so they would work in loading into older versions of
 | |
| PostgreSQL.
 | |
| 
 | |
| ---------------------------------------------------------------------------
 | |
| 
 | |
| Martijn van Oosterhout wrote:
 | |
| -- Start of PGP signed section.
 | |
| > On Mon, Dec 26, 2005 at 12:03:27PM +0000, Simon Riggs wrote:
 | |
| > > I would not be against such a table-level switch, but the exact
 | |
| > > behaviour would need to be specified more closely before this became a
 | |
| > > TODO item, IMHO.
 | |
| > 
 | |
| > Well, I think at a per table level is the only sensible level. If a
 | |
| > table isn't logged, neither are the indexes. After an unclean shutdown
 | |
| > the data could be anywhere between OK and rubbish, with no way of
 | |
| > finding out which way.
 | |
| > 
 | |
| > > If someone has a 100 GB table, they would not appreciate the table being
 | |
| > > truncated if a transaction to load 1 GB of data aborts, forcing recovery
 | |
| > > of the 100 GB table.
 | |
| > 
 | |
| > Ah, but wouldn't such a large table be partitioned in such a way that
 | |
| > you could have the most recent partition having the loaded data.
 | |
| > Personally, I think these "shared temp tables" have more applications
 | |
| > than meet the eye. I've had systems with cache tables which could be
 | |
| > wiped on boot. Though I think my preference would be to TRUNCATE rather
 | |
| > than DROP on unclean shutdown.
 | |
| > 
 | |
| > Have a nice day,
 | |
| > -- 
 | |
| > Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
 | |
| > > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 | |
| > > tool for doing 5% of the work and then sitting around waiting for someone
 | |
| > > else to do the other 95% so you can sue them.
 | |
| -- End of PGP section, PGP failed!
 | |
| 
 | |
| -- 
 | |
|   Bruce Momjian                        |  http://candle.pha.pa.us
 | |
|   pgman@candle.pha.pa.us               |  (610) 359-1001
 | |
|   +  If your life is a hard drive,     |  13 Roberts Road
 | |
|   +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
 | |
| 
 | |
| ---------------------------(end of broadcast)---------------------------
 | |
| TIP 3: Have you checked our extensive FAQ?
 | |
| 
 | |
|                http://www.postgresql.org/docs/faq
 | |
| 
 | |
| From pgsql-hackers-owner+M78007=pgman=candle.pha.pa.us@postgresql.org Wed Dec 28 22:06:13 2005
 | |
| X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
 | |
| X-Greylist: from auto-whitelisted by SQLgrey-
 | |
| Message-ID: <43B3527A.4040709@commandprompt.com>
 | |
| Date: Wed, 28 Dec 2005 19:05:30 -0800
 | |
| From: Joshua D. Drake <jd@commandprompt.com>
 | |
| Organization: Command Prompt, Inc.
 | |
| User-Agent: Mozilla Thunderbird 1.0.2 (Windows/20050317)
 | |
| X-Accept-Language: en-us, en
 | |
| To: Bruce Momjian <pgman@candle.pha.pa.us>
 | |
| cc: Martijn van Oosterhout <kleptog@svana.org>, 
 | |
| 	   Simon Riggs <simon@2ndquadrant.com>, Tom Lane <tgl@sss.pgh.pa.us>, 
 | |
| 	   Greg Stark <gsstark@mit.edu>, Rod Taylor <pg@rbt.ca>, 
 | |
| 	   Qingqing Zhou <zhouqq@cs.toronto.edu>, pgsql-hackers@postgresql.org
 | |
| Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
 | |
| References: <200512290158.jBT1wEK28785@candle.pha.pa.us>
 | |
| In-Reply-To: <200512290158.jBT1wEK28785@candle.pha.pa.us>
 | |
| X-Greylist: Sender succeded SMTP AUTH authentication, not delayed by milter-greylist-1.6 (hosting.commandprompt.com [192.168.1.101]); Wed, 28 Dec 2005 18:57:25 -0800 (PST)
 | |
| X-Virus-Scanned: by amavisd-new at hub.org
 | |
| X-Spam-Status: No, score=0.05 required=5 tests=[AWL=0.050, UPPERCASE_25_50=0]
 | |
| X-Spam-Score: 0.05
 | |
| X-Spam-Level: 
 | |
| X-Mailing-List: pgsql-hackers
 | |
| List-Archive: <http://archives.postgresql.org/pgsql-hackers>
 | |
| List-Help: <mailto:majordomo@postgresql.org?body=help>
 | |
| List-Id: <pgsql-hackers.postgresql.org>
 | |
| List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
 | |
| List-Post: <mailto:pgsql-hackers@postgresql.org>
 | |
| List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
 | |
| List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Content-Length:   725
 | |
| 
 | |
|   now agree that the logging is more part of the table than
 | |
| > the command itself.  Right now we have a COPY LOCK patch, but people are
 | |
| > going to want to control logging for INSERT INTO ... SELECT, and UPDATE,
 | |
| > and all sorts of other things, so I think we are best adding an ALTER
 | |
| > TABLE capability.  I am thinking of this syntax:
 | |
| > 
 | |
| > 	ALTER TABLE name RELIABILITY option
 | |
| > 
 | |
| > where "option" is:
 | |
| > 
 | |
| > 	DROP [ TABLE ON CRASH ]
 | |
| > 	DELETE [ ROWS ON CRASH ]
 | |
| > 	EXCLUSIVE
 | |
| > 	SHARE
 | |
| 
 | |
| I would say ON FAILURE (Crash just seems way to scary :))
 | |
| 
 | |
| Joshua D. Drake
 | |
| 
 | |
| 
 | |
| ---------------------------(end of broadcast)---------------------------
 | |
| TIP 3: Have you checked our extensive FAQ?
 | |
| 
 | |
|                http://www.postgresql.org/docs/faq
 | |
| 
 | |
| From pgsql-hackers-owner+M78008=pgman=candle.pha.pa.us@postgresql.org Wed Dec 28 23:09:58 2005
 | |
| X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
 | |
| X-Greylist: from auto-whitelisted by SQLgrey-
 | |
| From: Bruce Momjian <pgman@candle.pha.pa.us>
 | |
| Message-ID: <200512290409.jBT49LD13611@candle.pha.pa.us>
 | |
| Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
 | |
| In-Reply-To: <43B3527A.4040709@commandprompt.com>
 | |
| To: Joshua D. Drake <jd@commandprompt.com>
 | |
| Date: Wed, 28 Dec 2005 23:09:21 -0500 (EST)
 | |
| cc: Martijn van Oosterhout <kleptog@svana.org>, 
 | |
| 	   Simon Riggs <simon@2ndquadrant.com>, Tom Lane <tgl@sss.pgh.pa.us>, 
 | |
| 	   Greg Stark <gsstark@mit.edu>, Rod Taylor <pg@rbt.ca>, 
 | |
| 	   Qingqing Zhou <zhouqq@cs.toronto.edu>, pgsql-hackers@postgresql.org
 | |
| X-Mailer: ELM [version 2.4ME+ PL121 (25)]
 | |
| X-Virus-Scanned: by amavisd-new at hub.org
 | |
| X-Spam-Status: No, score=0.122 required=5 tests=[AWL=0.122, UPPERCASE_25_50=0]
 | |
| X-Spam-Score: 0.122
 | |
| X-Spam-Level: 
 | |
| X-Mailing-List: pgsql-hackers
 | |
| List-Archive: <http://archives.postgresql.org/pgsql-hackers>
 | |
| List-Help: <mailto:majordomo@postgresql.org?body=help>
 | |
| List-Id: <pgsql-hackers.postgresql.org>
 | |
| List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
 | |
| List-Post: <mailto:pgsql-hackers@postgresql.org>
 | |
| List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
 | |
| List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Content-Length:  1111
 | |
| 
 | |
| Joshua D. Drake wrote:
 | |
| >   now agree that the logging is more part of the table than
 | |
| > > the command itself.  Right now we have a COPY LOCK patch, but people are
 | |
| > > going to want to control logging for INSERT INTO ... SELECT, and UPDATE,
 | |
| > > and all sorts of other things, so I think we are best adding an ALTER
 | |
| > > TABLE capability.  I am thinking of this syntax:
 | |
| > > 
 | |
| > > 	ALTER TABLE name RELIABILITY option
 | |
| > > 
 | |
| > > where "option" is:
 | |
| > > 
 | |
| > > 	DROP [ TABLE ON CRASH ]
 | |
| > > 	DELETE [ ROWS ON CRASH ]
 | |
| > > 	EXCLUSIVE
 | |
| > > 	SHARE
 | |
| > 
 | |
| > I would say ON FAILURE (Crash just seems way to scary :))
 | |
| 
 | |
| Agreed, maybe ON RECOVERY.
 | |
| 
 | |
| -- 
 | |
|   Bruce Momjian                        |  http://candle.pha.pa.us
 | |
|   pgman@candle.pha.pa.us               |  (610) 359-1001
 | |
|   +  If your life is a hard drive,     |  13 Roberts Road
 | |
|   +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
 | |
| 
 | |
| ---------------------------(end of broadcast)---------------------------
 | |
| TIP 9: In versions below 8.0, the planner will ignore your desire to
 | |
|        choose an index scan if your joining column's datatypes do not
 | |
|        match
 | |
| 
 | |
| From simon@2ndquadrant.com Thu Dec 29 08:19:47 2005
 | |
| Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
 | |
| From: Simon Riggs <simon@2ndquadrant.com>
 | |
| To: Bruce Momjian <pgman@candle.pha.pa.us>
 | |
| cc: Martijn van Oosterhout <kleptog@svana.org>, Tom Lane <tgl@sss.pgh.pa.us>, 
 | |
| 	   Greg Stark <gsstark@mit.edu>, Rod Taylor <pg@rbt.ca>, 
 | |
| 	   Qingqing Zhou <zhouqq@cs.toronto.edu>, pgsql-hackers@postgresql.org
 | |
| In-Reply-To: <200512290158.jBT1wEK28785@candle.pha.pa.us>
 | |
| References: <200512290158.jBT1wEK28785@candle.pha.pa.us>
 | |
| Date: Thu, 29 Dec 2005 13:19:45 +0000
 | |
| Message-ID: <1135862385.2964.804.camel@localhost.localdomain>
 | |
| X-Mailer: Evolution 2.2.3 (2.2.3-2.fc4) 
 | |
| Content-Length:  7026
 | |
| 
 | |
| On Wed, 2005-12-28 at 20:58 -0500, Bruce Momjian wrote:
 | |
| > Having read through this thread, I would like to propose a
 | |
| > syntax/behavior.
 | |
| > 
 | |
| > I think we all now agree that the logging is more part of the table than
 | |
| > the command itself.  Right now we have a COPY LOCK patch, but people are
 | |
| > going to want to control logging for INSERT INTO ... SELECT, and UPDATE,
 | |
| > and all sorts of other things, so I think we are best adding an ALTER
 | |
| > TABLE capability.  I am thinking of this syntax:
 | |
| > 
 | |
| > 	ALTER TABLE name RELIABILITY option
 | |
| > 
 | |
| > where "option" is:
 | |
| > 
 | |
| > 	DROP [ TABLE ON CRASH ]
 | |
| > 	DELETE [ ROWS ON CRASH ]
 | |
| > 	EXCLUSIVE
 | |
| > 	SHARE
 | |
| > 
 | |
| > Let me explain each option.  DROP would drop the table on a restart
 | |
| > after a non-clean shutdown.  It would do _no_ logging on the table and
 | |
| > allow concurrent access, plus index access.  DELETE is the same as DROP,
 | |
| > but it just truncates the table (perhaps TRUNCATE is a better word).
 | |
| > 
 | |
| > EXCLUSIVE would allow only a single session to modify the table, and
 | |
| > would do all changes by appending to the table, similar to COPY LOCK. 
 | |
| > EXCLUSIVE would also not allow indexes because those can not be isolated
 | |
| > like appending to the heap.  EXCLUSIVE would write all dirty shared
 | |
| > buffers for the table and fsync them before committing.  SHARE is the
 | |
| > functionality we have now, with full logging.
 | |
| > 
 | |
| > Does this get us any closer to a TODO item?  It isn't great, but I think
 | |
| > it is pretty clear, and I assume pg_dump would use ALTER to load each
 | |
| > table.  The advanage is that the COPY statements themselves are
 | |
| > unchanged so they would work in loading into older versions of
 | |
| > PostgreSQL.
 | |
| 
 | |
| First off, thanks for summarising a complex thread.
 | |
| 
 | |
| My view would be that this thread has been complex because everybody has
 | |
| expressed a somewhat different requirement, which could be broken down
 | |
| as:
 | |
| 1. The need for a multi-user-accessible yet temporary table
 | |
| 2. Loading data into a table immediately after it is created (i.e. in
 | |
| same transaction), including but not limited to a reload from pg_dump
 | |
| 3. How to load data quickly into an existing table (COPY)
 | |
| 4. How to add/modify data quickly in an existing table (INSERT SELECT,
 | |
| UPDATE)
 | |
| 
 | |
| I can see the need for all of those individually; my existing patch
 | |
| submission covers (2) and (3) only. I very much like your thought to
 | |
| coalesce these various requirements into a single coherent model.
 | |
| 
 | |
| For requirement (1), table level options make sense. We would:
 | |
| - CREATE TABLE ALLTHINGS
 | |
| - ALTER TABLE ALLTHINGS RELIABILITY DELETE ROWS ON RECOVERY
 | |
| - lots of SQL, all fast because not logged
 | |
| 
 | |
| (2) is catered for adequately by the existing COPY patch i.e. it will
 | |
| detect whether a table has just been created and then avoid writing WAL.
 | |
| In the patch, pg_dump has *not* been altered to use COPY LOCK, so a
 | |
| pg_dump *will* work with any other version of PostgreSQL, which *would
 | |
| not* be the case if we added ALTER TABLE ... RELIABILITY statements into
 | |
| it. Also, a pg_dump created at an earlier version could also be loaded
 | |
| faster using the patch. The only requirement is to issue all SQL as part
 | |
| of the same transaction - which is catered for by the
 | |
| --single-transaction option on pg_restore and psql. So (2) is catered
 | |
| for fully without the need for an ALTER TABLE ... RELIABILITY statement
 | |
| or COPY LOCK.
 | |
| 
 | |
| For requirement (3), I would use table level options like this:
 | |
| (the table already exists and is reasonably big; we should not assume
 | |
| that everybody can and does use partitioning)
 | |
| - ALTER TABLE RELIABILITY ALLTHINGS2 EXCLUSIVE
 | |
| - COPY
 | |
| - ALTER TABLE RELIABILITY ALLTHINGS2 SHARE
 | |
| 
 | |
| For a load into an existing table I would always do all three actions
 | |
| together. COPY LOCK does exactly that *and* does it atomically. 
 | |
| 
 | |
| The two ways of doing (3) have a few pros/cons either way:
 | |
| Pro for ALTER TABLE:
 | |
| - same syntax as req (1)
 | |
| - doesn't need the keyword LOCK 
 | |
| - allows INSERT SELECT, UPDATE operations also (req 4)
 | |
| Cons:
 | |
| - existing programs have to add additional statements to take advantage
 | |
| of this; with COPY LOCK we would add just a single keyword
 | |
| - operation is not atomic, which might lead to some operations waiting
 | |
| for a lock to operate as unlogged, since they would execute before the
 | |
| second ALTER TABLE gets there
 | |
| - operation will be understood by some, but not others. They will forget
 | |
| to switch the RELIABILITY back on and then lose their whole table when
 | |
| the database crashes. (watch...)
 | |
| 
 | |
| ...but would it be a problem to have both?
 | |
| 
 | |
| 
 | |
| So, my thinking would be to separate things into two:
 | |
| a) Add a TODO item "shared temp tables" that caters for (1) and (4)
 | |
| 
 | |
| 	ALTER TABLE name RELIABILITY 
 | |
| 		{DELETE ROWS AT RECOVERY | FULL RECOVERY}
 | |
| (syntax TBD)
 | |
| 
 | |
| which would 
 | |
| - truncate all rows and remove all index entries during recovery
 | |
| - use shared_buffers, not temp_buffers
 | |
| - never write xlog records, even when in PITR mode
 | |
| - would avoid writing WAL for both heap *and* index tuples
 | |
| 
 | |
| b) Leave the COPY patch as is, since it caters for reqs (2) and (3) as
 | |
| *separate* optimizations (but using a common infrastructure in code).
 | |
| [This work was based upon discussions on -hackers only 6 months ago, so
 | |
| its not like its been snuck in or anything
 | |
| http://archives.postgresql.org/pgsql-hackers/2005-06/msg00069.php
 | |
| http://archives.postgresql.org/pgsql-hackers/2005-06/msg00075.php ]
 | |
| 
 | |
| These two thoughts are separable. There is no need to
 | |
| have-both-or-neither within PostgreSQL.
 | |
| 
 | |
| Eventually, I'd like all of these options, as a database designer.
 | |
| 
 | |
| Best Regards, Simon Riggs
 | |
| 
 | |
| > --------------------------------------------------------------------------
 | |
| > 
 | |
| > Martijn van Oosterhout wrote:
 | |
| > -- Start of PGP signed section.
 | |
| > > On Mon, Dec 26, 2005 at 12:03:27PM +0000, Simon Riggs wrote:
 | |
| > > > I would not be against such a table-level switch, but the exact
 | |
| > > > behaviour would need to be specified more closely before this became a
 | |
| > > > TODO item, IMHO.
 | |
| > > 
 | |
| > > Well, I think at a per table level is the only sensible level. If a
 | |
| > > table isn't logged, neither are the indexes. After an unclean shutdown
 | |
| > > the data could be anywhere between OK and rubbish, with no way of
 | |
| > > finding out which way.
 | |
| > > 
 | |
| > > > If someone has a 100 GB table, they would not appreciate the table being
 | |
| > > > truncated if a transaction to load 1 GB of data aborts, forcing recovery
 | |
| > > > of the 100 GB table.
 | |
| > > 
 | |
| > > Ah, but wouldn't such a large table be partitioned in such a way that
 | |
| > > you could have the most recent partition having the loaded data.
 | |
| > > Personally, I think these "shared temp tables" have more applications
 | |
| > > than meet the eye. I've had systems with cache tables which could be
 | |
| > > wiped on boot. Though I think my preference would be to TRUNCATE rather
 | |
| > > than DROP on unclean shutdown.
 | |
| > > 
 | |
| > > Have a nice day,
 | |
| > > -- 
 | |
| > > Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
 | |
| > > > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 | |
| > > > tool for doing 5% of the work and then sitting around waiting for someone
 | |
| > > > else to do the other 95% so you can sue them.
 | |
| > -- End of PGP section, PGP failed!
 | |
| > 
 | |
| 
 | |
| From pgsql-hackers-owner+M78019=pgman=candle.pha.pa.us@postgresql.org Thu Dec 29 08:20:11 2005
 | |
| X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
 | |
| X-Greylist: from auto-whitelisted by SQLgrey-
 | |
| Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
 | |
| From: Simon Riggs <simon@2ndquadrant.com>
 | |
| To: Bruce Momjian <pgman@candle.pha.pa.us>
 | |
| cc: Martijn van Oosterhout <kleptog@svana.org>, Tom Lane <tgl@sss.pgh.pa.us>, 
 | |
| 	   Greg Stark <gsstark@mit.edu>, Rod Taylor <pg@rbt.ca>, 
 | |
| 	   Qingqing Zhou <zhouqq@cs.toronto.edu>, pgsql-hackers@postgresql.org
 | |
| In-Reply-To: <200512290158.jBT1wEK28785@candle.pha.pa.us>
 | |
| References: <200512290158.jBT1wEK28785@candle.pha.pa.us>
 | |
| Date: Thu, 29 Dec 2005 13:19:45 +0000
 | |
| Message-ID: <1135862385.2964.804.camel@localhost.localdomain>
 | |
| X-Mailer: Evolution 2.2.3 (2.2.3-2.fc4) 
 | |
| X-Virus-Scanned: by amavisd-new at hub.org
 | |
| X-Spam-Status: No, score=0.112 required=5 tests=[AWL=0.112]
 | |
| X-Spam-Score: 0.112
 | |
| X-Spam-Level: 
 | |
| X-Mailing-List: pgsql-hackers
 | |
| List-Archive: <http://archives.postgresql.org/pgsql-hackers>
 | |
| List-Help: <mailto:majordomo@postgresql.org?body=help>
 | |
| List-Id: <pgsql-hackers.postgresql.org>
 | |
| List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
 | |
| List-Post: <mailto:pgsql-hackers@postgresql.org>
 | |
| List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
 | |
| List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Content-Length:  7139
 | |
| 
 | |
| On Wed, 2005-12-28 at 20:58 -0500, Bruce Momjian wrote:
 | |
| > Having read through this thread, I would like to propose a
 | |
| > syntax/behavior.
 | |
| > 
 | |
| > I think we all now agree that the logging is more part of the table than
 | |
| > the command itself.  Right now we have a COPY LOCK patch, but people are
 | |
| > going to want to control logging for INSERT INTO ... SELECT, and UPDATE,
 | |
| > and all sorts of other things, so I think we are best adding an ALTER
 | |
| > TABLE capability.  I am thinking of this syntax:
 | |
| > 
 | |
| > 	ALTER TABLE name RELIABILITY option
 | |
| > 
 | |
| > where "option" is:
 | |
| > 
 | |
| > 	DROP [ TABLE ON CRASH ]
 | |
| > 	DELETE [ ROWS ON CRASH ]
 | |
| > 	EXCLUSIVE
 | |
| > 	SHARE
 | |
| > 
 | |
| > Let me explain each option.  DROP would drop the table on a restart
 | |
| > after a non-clean shutdown.  It would do _no_ logging on the table and
 | |
| > allow concurrent access, plus index access.  DELETE is the same as DROP,
 | |
| > but it just truncates the table (perhaps TRUNCATE is a better word).
 | |
| > 
 | |
| > EXCLUSIVE would allow only a single session to modify the table, and
 | |
| > would do all changes by appending to the table, similar to COPY LOCK. 
 | |
| > EXCLUSIVE would also not allow indexes because those can not be isolated
 | |
| > like appending to the heap.  EXCLUSIVE would write all dirty shared
 | |
| > buffers for the table and fsync them before committing.  SHARE is the
 | |
| > functionality we have now, with full logging.
 | |
| > 
 | |
| > Does this get us any closer to a TODO item?  It isn't great, but I think
 | |
| > it is pretty clear, and I assume pg_dump would use ALTER to load each
 | |
| > table.  The advanage is that the COPY statements themselves are
 | |
| > unchanged so they would work in loading into older versions of
 | |
| > PostgreSQL.
 | |
| 
 | |
| First off, thanks for summarising a complex thread.
 | |
| 
 | |
| My view would be that this thread has been complex because everybody has
 | |
| expressed a somewhat different requirement, which could be broken down
 | |
| as:
 | |
| 1. The need for a multi-user-accessible yet temporary table
 | |
| 2. Loading data into a table immediately after it is created (i.e. in
 | |
| same transaction), including but not limited to a reload from pg_dump
 | |
| 3. How to load data quickly into an existing table (COPY)
 | |
| 4. How to add/modify data quickly in an existing table (INSERT SELECT,
 | |
| UPDATE)
 | |
| 
 | |
| I can see the need for all of those individually; my existing patch
 | |
| submission covers (2) and (3) only. I very much like your thought to
 | |
| coalesce these various requirements into a single coherent model.
 | |
| 
 | |
| For requirement (1), table level options make sense. We would:
 | |
| - CREATE TABLE ALLTHINGS
 | |
| - ALTER TABLE ALLTHINGS RELIABILITY DELETE ROWS ON RECOVERY
 | |
| - lots of SQL, all fast because not logged
 | |
| 
 | |
| (2) is catered for adequately by the existing COPY patch i.e. it will
 | |
| detect whether a table has just been created and then avoid writing WAL.
 | |
| In the patch, pg_dump has *not* been altered to use COPY LOCK, so a
 | |
| pg_dump *will* work with any other version of PostgreSQL, which *would
 | |
| not* be the case if we added ALTER TABLE ... RELIABILITY statements into
 | |
| it. Also, a pg_dump created at an earlier version could also be loaded
 | |
| faster using the patch. The only requirement is to issue all SQL as part
 | |
| of the same transaction - which is catered for by the
 | |
| --single-transaction option on pg_restore and psql. So (2) is catered
 | |
| for fully without the need for an ALTER TABLE ... RELIABILITY statement
 | |
| or COPY LOCK.
 | |
| 
 | |
| For requirement (3), I would use table level options like this:
 | |
| (the table already exists and is reasonably big; we should not assume
 | |
| that everybody can and does use partitioning)
 | |
| - ALTER TABLE RELIABILITY ALLTHINGS2 EXCLUSIVE
 | |
| - COPY
 | |
| - ALTER TABLE RELIABILITY ALLTHINGS2 SHARE
 | |
| 
 | |
| For a load into an existing table I would always do all three actions
 | |
| together. COPY LOCK does exactly that *and* does it atomically. 
 | |
| 
 | |
| The two ways of doing (3) have a few pros/cons either way:
 | |
| Pro for ALTER TABLE:
 | |
| - same syntax as req (1)
 | |
| - doesn't need the keyword LOCK 
 | |
| - allows INSERT SELECT, UPDATE operations also (req 4)
 | |
| Cons:
 | |
| - existing programs have to add additional statements to take advantage
 | |
| of this; with COPY LOCK we would add just a single keyword
 | |
| - operation is not atomic, which might lead to some operations waiting
 | |
| for a lock to operate as unlogged, since they would execute before the
 | |
| second ALTER TABLE gets there
 | |
| - operation will be understood by some, but not others. They will forget
 | |
| to switch the RELIABILITY back on and then lose their whole table when
 | |
| the database crashes. (watch...)
 | |
| 
 | |
| ...but would it be a problem to have both?
 | |
| 
 | |
| 
 | |
| So, my thinking would be to separate things into two:
 | |
| a) Add a TODO item "shared temp tables" that caters for (1) and (4)
 | |
| 
 | |
| 	ALTER TABLE name RELIABILITY 
 | |
| 		{DELETE ROWS AT RECOVERY | FULL RECOVERY}
 | |
| (syntax TBD)
 | |
| 
 | |
| which would 
 | |
| - truncate all rows and remove all index entries during recovery
 | |
| - use shared_buffers, not temp_buffers
 | |
| - never write xlog records, even when in PITR mode
 | |
| - would avoid writing WAL for both heap *and* index tuples
 | |
| 
 | |
| b) Leave the COPY patch as is, since it caters for reqs (2) and (3) as
 | |
| *separate* optimizations (but using a common infrastructure in code).
 | |
| [This work was based upon discussions on -hackers only 6 months ago, so
 | |
| its not like its been snuck in or anything
 | |
| http://archives.postgresql.org/pgsql-hackers/2005-06/msg00069.php
 | |
| http://archives.postgresql.org/pgsql-hackers/2005-06/msg00075.php ]
 | |
| 
 | |
| These two thoughts are separable. There is no need to
 | |
| have-both-or-neither within PostgreSQL.
 | |
| 
 | |
| Eventually, I'd like all of these options, as a database designer.
 | |
| 
 | |
| Best Regards, Simon Riggs
 | |
| 
 | |
| > --------------------------------------------------------------------------
 | |
| > 
 | |
| > Martijn van Oosterhout wrote:
 | |
| > -- Start of PGP signed section.
 | |
| > > On Mon, Dec 26, 2005 at 12:03:27PM +0000, Simon Riggs wrote:
 | |
| > > > I would not be against such a table-level switch, but the exact
 | |
| > > > behaviour would need to be specified more closely before this became a
 | |
| > > > TODO item, IMHO.
 | |
| > > 
 | |
| > > Well, I think at a per table level is the only sensible level. If a
 | |
| > > table isn't logged, neither are the indexes. After an unclean shutdown
 | |
| > > the data could be anywhere between OK and rubbish, with no way of
 | |
| > > finding out which way.
 | |
| > > 
 | |
| > > > If someone has a 100 GB table, they would not appreciate the table being
 | |
| > > > truncated if a transaction to load 1 GB of data aborts, forcing recovery
 | |
| > > > of the 100 GB table.
 | |
| > > 
 | |
| > > Ah, but wouldn't such a large table be partitioned in such a way that
 | |
| > > you could have the most recent partition having the loaded data.
 | |
| > > Personally, I think these "shared temp tables" have more applications
 | |
| > > than meet the eye. I've had systems with cache tables which could be
 | |
| > > wiped on boot. Though I think my preference would be to TRUNCATE rather
 | |
| > > than DROP on unclean shutdown.
 | |
| > > 
 | |
| > > Have a nice day,
 | |
| > > -- 
 | |
| > > Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
 | |
| > > > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 | |
| > > > tool for doing 5% of the work and then sitting around waiting for someone
 | |
| > > > else to do the other 95% so you can sue them.
 | |
| > -- End of PGP section, PGP failed!
 | |
| > 
 | |
| 
 | |
| 
 | |
| ---------------------------(end of broadcast)---------------------------
 | |
| TIP 6: explain analyze is your friend
 | |
| 
 | |
| From pgsql-hackers-owner+M78021=pgman=candle.pha.pa.us@postgresql.org Thu Dec 29 09:35:58 2005
 | |
| X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
 | |
| X-Greylist: from auto-whitelisted by SQLgrey-
 | |
| From: Rod Taylor <pg@rbt.ca>
 | |
| To: Simon Riggs <simon@2ndquadrant.com>
 | |
| cc: Bruce Momjian <pgman@candle.pha.pa.us>, 
 | |
| 	   Martijn van Oosterhout <kleptog@svana.org>, Tom Lane <tgl@sss.pgh.pa.us>, 
 | |
| 	   Greg Stark <gsstark@mit.edu>, Qingqing Zhou <zhouqq@cs.toronto.edu>, 
 | |
| 	  pgsql-hackers@postgresql.org
 | |
| In-Reply-To: <1135862385.2964.804.camel@localhost.localdomain>
 | |
| References: <200512290158.jBT1wEK28785@candle.pha.pa.us>
 | |
| 	  <1135862385.2964.804.camel@localhost.localdomain>
 | |
| Date: Thu, 29 Dec 2005 09:35:27 -0500
 | |
| Message-ID: <1135866927.61038.13.camel@home>
 | |
| X-Mailer: Evolution 2.4.2.1 FreeBSD GNOME Team Port 
 | |
| X-SA-Exim-Mail-From: pg@rbt.ca
 | |
| Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
 | |
| X-SA-Exim-Version: 3.1 (built Tue Feb 24 05:09:27 GMT 2004)
 | |
| X-SA-Exim-Scanned: Yes
 | |
| X-Virus-Scanned: by amavisd-new at hub.org
 | |
| X-Spam-Status: No, score=0.024 required=5 tests=[AWL=0.024, UPPERCASE_25_50=0]
 | |
| X-Spam-Score: 0.024
 | |
| X-Spam-Level: 
 | |
| X-Mailing-List: pgsql-hackers
 | |
| List-Archive: <http://archives.postgresql.org/pgsql-hackers>
 | |
| List-Help: <mailto:majordomo@postgresql.org?body=help>
 | |
| List-Id: <pgsql-hackers.postgresql.org>
 | |
| List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
 | |
| List-Post: <mailto:pgsql-hackers@postgresql.org>
 | |
| List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
 | |
| List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Content-Length:   506
 | |
| 
 | |
| 
 | |
| > So, my thinking would be to separate things into two:
 | |
| > a) Add a TODO item "shared temp tables" that caters for (1) and (4)
 | |
| > 
 | |
| > 	ALTER TABLE name RELIABILITY 
 | |
| > 		{DELETE ROWS AT RECOVERY | FULL RECOVERY}
 | |
| > (syntax TBD)
 | |
| 
 | |
| DELETE ROWS AT RECOVERY would need to be careful or disallowed when
 | |
| referenced via a foreign key to ensure the database is not restored in
 | |
| an inconsistent state.
 | |
| 
 | |
| -- 
 | |
| 
 | |
| 
 | |
| ---------------------------(end of broadcast)---------------------------
 | |
| TIP 2: Don't 'kill -9' the postmaster
 | |
| 
 | |
| From pg@rbt.ca Thu Dec 29 09:35:35 2005
 | |
| From: Rod Taylor <pg@rbt.ca>
 | |
| To: Simon Riggs <simon@2ndquadrant.com>
 | |
| cc: Bruce Momjian <pgman@candle.pha.pa.us>, 
 | |
| 	   Martijn van Oosterhout <kleptog@svana.org>, Tom Lane <tgl@sss.pgh.pa.us>, 
 | |
| 	   Greg Stark <gsstark@mit.edu>, Qingqing Zhou <zhouqq@cs.toronto.edu>, 
 | |
| 	  pgsql-hackers@postgresql.org
 | |
| In-Reply-To: <1135862385.2964.804.camel@localhost.localdomain>
 | |
| References: <200512290158.jBT1wEK28785@candle.pha.pa.us>
 | |
| 	  <1135862385.2964.804.camel@localhost.localdomain>
 | |
| Date: Thu, 29 Dec 2005 09:35:27 -0500
 | |
| Message-ID: <1135866927.61038.13.camel@home>
 | |
| X-Mailer: Evolution 2.4.2.1 FreeBSD GNOME Team Port 
 | |
| X-SA-Exim-Mail-From: pg@rbt.ca
 | |
| Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
 | |
| X-Spam-Checker-Version: SpamAssassin 2.63 (2004-01-11) on psi.look.ca
 | |
| X-Spam-Level: 
 | |
| X-Spam-Status: No, hits=0.7 required=9.0 tests=UPPERCASE_25_50 autolearn=no 
 | |
| 	version=2.63
 | |
| X-SA-Exim-Version: 3.1 (built Tue Feb 24 05:09:27 GMT 2004)
 | |
| X-SA-Exim-Scanned: Yes
 | |
| Content-Length:   393
 | |
| 
 | |
| 
 | |
| > So, my thinking would be to separate things into two:
 | |
| > a) Add a TODO item "shared temp tables" that caters for (1) and (4)
 | |
| > 
 | |
| > 	ALTER TABLE name RELIABILITY 
 | |
| > 		{DELETE ROWS AT RECOVERY | FULL RECOVERY}
 | |
| > (syntax TBD)
 | |
| 
 | |
| DELETE ROWS AT RECOVERY would need to be careful or disallowed when
 | |
| referenced via a foreign key to ensure the database is not restored in
 | |
| an inconsistent state.
 | |
| 
 | |
| -- 
 | |
| 
 | |
| From pgsql-hackers-owner+M78022=pgman=candle.pha.pa.us@postgresql.org Thu Dec 29 10:10:57 2005
 | |
| X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
 | |
| X-Greylist: from auto-whitelisted by SQLgrey-
 | |
| Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
 | |
| From: Simon Riggs <simon@2ndquadrant.com>
 | |
| To: Rod Taylor <pg@rbt.ca>
 | |
| cc: Bruce Momjian <pgman@candle.pha.pa.us>, 
 | |
| 	   Martijn van Oosterhout <kleptog@svana.org>, Tom Lane <tgl@sss.pgh.pa.us>, 
 | |
| 	   Greg Stark <gsstark@mit.edu>, Qingqing Zhou <zhouqq@cs.toronto.edu>, 
 | |
| 	  pgsql-hackers@postgresql.org
 | |
| In-Reply-To: <1135866927.61038.13.camel@home>
 | |
| References: <200512290158.jBT1wEK28785@candle.pha.pa.us>
 | |
| 	  <1135862385.2964.804.camel@localhost.localdomain>
 | |
| 	  <1135866927.61038.13.camel@home>
 | |
| Date: Thu, 29 Dec 2005 15:10:40 +0000
 | |
| Message-ID: <1135869040.2964.824.camel@localhost.localdomain>
 | |
| X-Mailer: Evolution 2.2.3 (2.2.3-2.fc4) 
 | |
| X-Virus-Scanned: by amavisd-new at hub.org
 | |
| X-Spam-Status: No, score=0.113 required=5 tests=[AWL=0.113]
 | |
| X-Spam-Score: 0.113
 | |
| X-Spam-Level: 
 | |
| X-Mailing-List: pgsql-hackers
 | |
| List-Archive: <http://archives.postgresql.org/pgsql-hackers>
 | |
| List-Help: <mailto:majordomo@postgresql.org?body=help>
 | |
| List-Id: <pgsql-hackers.postgresql.org>
 | |
| List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
 | |
| List-Post: <mailto:pgsql-hackers@postgresql.org>
 | |
| List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
 | |
| List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Content-Length:   888
 | |
| 
 | |
| On Thu, 2005-12-29 at 09:35 -0500, Rod Taylor wrote:
 | |
| > > So, my thinking would be to separate things into two:
 | |
| > > a) Add a TODO item "shared temp tables" that caters for (1) and (4)
 | |
| > > 
 | |
| > > 	ALTER TABLE name RELIABILITY 
 | |
| > > 		{DELETE ROWS AT RECOVERY | FULL RECOVERY}
 | |
| > > (syntax TBD)
 | |
| > 
 | |
| > DELETE ROWS AT RECOVERY would need to be careful or disallowed when
 | |
| > referenced via a foreign key to ensure the database is not restored in
 | |
| > an inconsistent state.
 | |
| 
 | |
| I think we'd need to apply the same rule as we do for temp tables: they
 | |
| cannot be referenced by a permanent table.
 | |
| 
 | |
| There are possibly some other restrictions also. Anyone?
 | |
| 
 | |
| Best Regards, Simon Riggs
 | |
| 
 | |
| 
 | |
| ---------------------------(end of broadcast)---------------------------
 | |
| TIP 9: In versions below 8.0, the planner will ignore your desire to
 | |
|        choose an index scan if your joining column's datatypes do not
 | |
|        match
 | |
| 
 | |
| From tgl@sss.pgh.pa.us Thu Dec 29 11:12:13 2005
 | |
| To: Simon Riggs <simon@2ndquadrant.com>
 | |
| cc: Bruce Momjian <pgman@candle.pha.pa.us>, 
 | |
| 	   Martijn van Oosterhout <kleptog@svana.org>, Greg Stark <gsstark@mit.edu>, 
 | |
| 	   Rod Taylor <pg@rbt.ca>, Qingqing Zhou <zhouqq@cs.toronto.edu>, 
 | |
| 	  pgsql-hackers@postgresql.org
 | |
| Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and 
 | |
| In-Reply-To: <1135862385.2964.804.camel@localhost.localdomain> 
 | |
| References: <200512290158.jBT1wEK28785@candle.pha.pa.us>  <1135862385.2964.804.camel@localhost.localdomain>
 | |
| Comments: In-reply-to Simon Riggs <simon@2ndquadrant.com>
 | |
| 	message dated "Thu, 29 Dec 2005 13:19:45 +0000"
 | |
| Date: Thu, 29 Dec 2005 11:12:11 -0500
 | |
| Message-ID: <7273.1135872731@sss.pgh.pa.us>
 | |
| From: Tom Lane <tgl@sss.pgh.pa.us>
 | |
| Content-Length:  1963
 | |
| 
 | |
| Simon Riggs <simon@2ndquadrant.com> writes:
 | |
| > My view would be that this thread has been complex because everybody has
 | |
| > expressed a somewhat different requirement, which could be broken down
 | |
| > as:
 | |
| > 1. The need for a multi-user-accessible yet temporary table
 | |
| > 2. Loading data into a table immediately after it is created (i.e. in
 | |
| > same transaction), including but not limited to a reload from pg_dump
 | |
| > 3. How to load data quickly into an existing table (COPY)
 | |
| > 4. How to add/modify data quickly in an existing table (INSERT SELECT,
 | |
| > UPDATE)
 | |
| 
 | |
| > I can see the need for all of those individually; my existing patch
 | |
| > submission covers (2) and (3) only. I very much like your thought to
 | |
| > coalesce these various requirements into a single coherent model.
 | |
| 
 | |
| However, you then seem to be arguing for still using the COPY LOCK
 | |
| syntax, which I think Bruce intended would go away in favor of using
 | |
| these ALTER commands.  Certainly that's what I'd prefer --- COPY has
 | |
| got too darn many options already.
 | |
| 
 | |
| > In the patch, pg_dump has *not* been altered to use COPY LOCK, so a
 | |
| > pg_dump *will* work with any other version of PostgreSQL, which *would
 | |
| > not* be the case if we added ALTER TABLE ... RELIABILITY statements into
 | |
| > it.
 | |
| 
 | |
| Wrong --- the good thing about ALTER TABLE is that an old version of
 | |
| Postgres would simply reject it and keep going.  Therefore we could get
 | |
| the speedup in dumps without losing compatibility, which is not true
 | |
| of COPY LOCK.
 | |
| 
 | |
| BTW, this is a perfect example of the use-case for not abandoning a
 | |
| dump-file load simply because one command fails.  (We have relied on
 | |
| this sort of reasoning many times before, too, for example by using
 | |
| "SET default_with_oids" in preference to CREATE TABLE WITH/WITHOUT OIDS.)
 | |
| I don't think that "wrap the whole load into begin/end" is really a very
 | |
| workable answer, because there are far too many scenarios where you
 | |
| can't do that.  Another one where it doesn't help is a data-only dump.
 | |
| 
 | |
| 			regards, tom lane
 | |
| 
 | |
| From pgsql-hackers-owner+M78028=pgman=candle.pha.pa.us@postgresql.org Thu Dec 29 11:12:41 2005
 | |
| X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
 | |
| X-Greylist: from auto-whitelisted by SQLgrey-
 | |
| To: Simon Riggs <simon@2ndquadrant.com>
 | |
| cc: Bruce Momjian <pgman@candle.pha.pa.us>, 
 | |
| 	   Martijn van Oosterhout <kleptog@svana.org>, Greg Stark <gsstark@mit.edu>, 
 | |
| 	   Rod Taylor <pg@rbt.ca>, Qingqing Zhou <zhouqq@cs.toronto.edu>, 
 | |
| 	  pgsql-hackers@postgresql.org
 | |
| Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and 
 | |
| In-Reply-To: <1135862385.2964.804.camel@localhost.localdomain> 
 | |
| References: <200512290158.jBT1wEK28785@candle.pha.pa.us>  <1135862385.2964.804.camel@localhost.localdomain>
 | |
| Comments: In-reply-to Simon Riggs <simon@2ndquadrant.com>
 | |
| 	message dated "Thu, 29 Dec 2005 13:19:45 +0000"
 | |
| Date: Thu, 29 Dec 2005 11:12:11 -0500
 | |
| Message-ID: <7273.1135872731@sss.pgh.pa.us>
 | |
| From: Tom Lane <tgl@sss.pgh.pa.us>
 | |
| X-Virus-Scanned: by amavisd-new at hub.org
 | |
| X-Spam-Status: No, score=0.053 required=5 tests=[AWL=0.053]
 | |
| X-Spam-Score: 0.053
 | |
| X-Spam-Level: 
 | |
| X-Mailing-List: pgsql-hackers
 | |
| List-Archive: <http://archives.postgresql.org/pgsql-hackers>
 | |
| List-Help: <mailto:majordomo@postgresql.org?body=help>
 | |
| List-Id: <pgsql-hackers.postgresql.org>
 | |
| List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
 | |
| List-Post: <mailto:pgsql-hackers@postgresql.org>
 | |
| List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
 | |
| List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Content-Length:  2075
 | |
| 
 | |
| Simon Riggs <simon@2ndquadrant.com> writes:
 | |
| > My view would be that this thread has been complex because everybody has
 | |
| > expressed a somewhat different requirement, which could be broken down
 | |
| > as:
 | |
| > 1. The need for a multi-user-accessible yet temporary table
 | |
| > 2. Loading data into a table immediately after it is created (i.e. in
 | |
| > same transaction), including but not limited to a reload from pg_dump
 | |
| > 3. How to load data quickly into an existing table (COPY)
 | |
| > 4. How to add/modify data quickly in an existing table (INSERT SELECT,
 | |
| > UPDATE)
 | |
| 
 | |
| > I can see the need for all of those individually; my existing patch
 | |
| > submission covers (2) and (3) only. I very much like your thought to
 | |
| > coalesce these various requirements into a single coherent model.
 | |
| 
 | |
| However, you then seem to be arguing for still using the COPY LOCK
 | |
| syntax, which I think Bruce intended would go away in favor of using
 | |
| these ALTER commands.  Certainly that's what I'd prefer --- COPY has
 | |
| got too darn many options already.
 | |
| 
 | |
| > In the patch, pg_dump has *not* been altered to use COPY LOCK, so a
 | |
| > pg_dump *will* work with any other version of PostgreSQL, which *would
 | |
| > not* be the case if we added ALTER TABLE ... RELIABILITY statements into
 | |
| > it.
 | |
| 
 | |
| Wrong --- the good thing about ALTER TABLE is that an old version of
 | |
| Postgres would simply reject it and keep going.  Therefore we could get
 | |
| the speedup in dumps without losing compatibility, which is not true
 | |
| of COPY LOCK.
 | |
| 
 | |
| BTW, this is a perfect example of the use-case for not abandoning a
 | |
| dump-file load simply because one command fails.  (We have relied on
 | |
| this sort of reasoning many times before, too, for example by using
 | |
| "SET default_with_oids" in preference to CREATE TABLE WITH/WITHOUT OIDS.)
 | |
| I don't think that "wrap the whole load into begin/end" is really a very
 | |
| workable answer, because there are far too many scenarios where you
 | |
| can't do that.  Another one where it doesn't help is a data-only dump.
 | |
| 
 | |
| 			regards, tom lane
 | |
| 
 | |
| ---------------------------(end of broadcast)---------------------------
 | |
| TIP 2: Don't 'kill -9' the postmaster
 | |
| 
 | |
| From pgsql-hackers-owner+M78025=pgman=candle.pha.pa.us@postgresql.org Thu Dec 29 10:57:46 2005
 | |
| X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
 | |
| X-Greylist: from auto-whitelisted by SQLgrey-
 | |
| Message-ID: <51082.68.143.134.146.1135872877.squirrel@www.dunslane.net>
 | |
| Date: Thu, 29 Dec 2005 10:14:37 -0600 (CST)
 | |
| Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
 | |
| From: Andrew Dunstan <andrew@dunslane.net>
 | |
| To: <pgman@candle.pha.pa.us>
 | |
| In-Reply-To: <200512290158.jBT1wEK28785@candle.pha.pa.us>
 | |
| References: <200512290158.jBT1wEK28785@candle.pha.pa.us>
 | |
| X-Priority: 3
 | |
| Importance: Normal
 | |
| X-MSMail-Priority: Normal
 | |
| cc: <kleptog@svana.org>,  <simon@2ndquadrant.com>,  <tgl@sss.pgh.pa.us>, 
 | |
| 	   <gsstark@mit.edu>,  <pg@rbt.ca>,  <zhouqq@cs.toronto.edu>, 
 | |
| 	   <pgsql-hackers@postgresql.org>
 | |
| X-Mailer: SquirrelMail (version 1.2.5)
 | |
| X-Virus-Scanned: by amavisd-new at hub.org
 | |
| X-Spam-Status: No, score=0.082 required=5 tests=[AWL=0.082]
 | |
| X-Spam-Score: 0.082
 | |
| X-Spam-Level: 
 | |
| X-Mailing-List: pgsql-hackers
 | |
| List-Archive: <http://archives.postgresql.org/pgsql-hackers>
 | |
| List-Help: <mailto:majordomo@postgresql.org?body=help>
 | |
| List-Id: <pgsql-hackers.postgresql.org>
 | |
| List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
 | |
| List-Post: <mailto:pgsql-hackers@postgresql.org>
 | |
| List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
 | |
| List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Content-Length:  1185
 | |
| 
 | |
| Bruce Momjian said:
 | |
| > DROP would drop the table on a restart
 | |
| > after a non-clean shutdown.  It would do _no_ logging on the table and
 | |
| > allow concurrent access, plus index access.  DELETE is the same as
 | |
| > DROP, but it just truncates the table (perhaps TRUNCATE is a better
 | |
| > word).
 | |
| >
 | |
| > EXCLUSIVE would allow only a single session to modify the table, and
 | |
| > would do all changes by appending to the table, similar to COPY LOCK.
 | |
| > EXCLUSIVE would also not allow indexes because those can not be
 | |
| > isolated like appending to the heap.  EXCLUSIVE would write all dirty
 | |
| > shared buffers for the table and fsync them before committing.  SHARE
 | |
| > is the functionality we have now, with full logging.
 | |
| 
 | |
| 
 | |
| I an horribly scared that this will be used as a "performance boost" for
 | |
| normal use. I would at least like to see some restrictions that make it
 | |
| harder to mis-use. Perhaps restrict to superuser?
 | |
| 
 | |
| cheers
 | |
| 
 | |
| andrew
 | |
| 
 | |
| 
 | |
| 
 | |
| 
 | |
| 
 | |
| ---------------------------(end of broadcast)---------------------------
 | |
| TIP 1: if posting/reading through Usenet, please send an appropriate
 | |
|        subscribe-nomail command to majordomo@postgresql.org so that your
 | |
|        message can get through to the mailing list cleanly
 | |
| 
 | |
| From tgl@sss.pgh.pa.us Thu Dec 29 11:24:30 2005
 | |
| To: Bruce Momjian <pgman@candle.pha.pa.us>
 | |
| cc: Andrew Dunstan <andrew@dunslane.net>, kleptog@svana.org, 
 | |
| 	  simon@2ndquadrant.com, gsstark@mit.edu, pg@rbt.ca, zhouqq@cs.toronto.edu, 
 | |
| 	  pgsql-hackers@postgresql.org
 | |
| Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and 
 | |
| In-Reply-To: <200512291605.jBTG5gi00396@candle.pha.pa.us> 
 | |
| References: <200512291605.jBTG5gi00396@candle.pha.pa.us>
 | |
| Comments: In-reply-to Bruce Momjian <pgman@candle.pha.pa.us>
 | |
| 	message dated "Thu, 29 Dec 2005 11:05:42 -0500"
 | |
| Date: Thu, 29 Dec 2005 11:24:28 -0500
 | |
| Message-ID: <7966.1135873468@sss.pgh.pa.us>
 | |
| From: Tom Lane <tgl@sss.pgh.pa.us>
 | |
| Content-Length:   612
 | |
| 
 | |
| Bruce Momjian <pgman@candle.pha.pa.us> writes:
 | |
| > Andrew Dunstan wrote:
 | |
| >> I an horribly scared that this will be used as a "performance boost" for
 | |
| >> normal use. I would at least like to see some restrictions that make it
 | |
| >> harder to mis-use. Perhaps restrict to superuser?
 | |
| 
 | |
| > Certainly restrict to table owner.
 | |
| 
 | |
| I can see the argument for superuser-only: decisions about data
 | |
| integrity tradeoffs should be reserved to the DBA, who is the one who
 | |
| will get blamed if the database loses data, no matter how stupid his
 | |
| users are.
 | |
| 
 | |
| But I'm not wedded to that.  I could live with table-owner.
 | |
| 
 | |
| 			regards, tom lane
 | |
| 
 | |
| From pgsql-hackers-owner+M78031=pgman=candle.pha.pa.us@postgresql.org Thu Dec 29 11:38:17 2005
 | |
| X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
 | |
| X-Greylist: from auto-whitelisted by SQLgrey-
 | |
| From: Bruce Momjian <pgman@candle.pha.pa.us>
 | |
| Message-ID: <200512291637.jBTGbdC03848@candle.pha.pa.us>
 | |
| Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
 | |
| In-Reply-To: <7273.1135872731@sss.pgh.pa.us>
 | |
| To: Tom Lane <tgl@sss.pgh.pa.us>
 | |
| Date: Thu, 29 Dec 2005 11:37:39 -0500 (EST)
 | |
| cc: Simon Riggs <simon@2ndquadrant.com>, 
 | |
| 	   Martijn van Oosterhout <kleptog@svana.org>, Greg Stark <gsstark@mit.edu>, 
 | |
| 	   Rod Taylor <pg@rbt.ca>, Qingqing Zhou <zhouqq@cs.toronto.edu>, 
 | |
| 	  pgsql-hackers@postgresql.org
 | |
| X-Mailer: ELM [version 2.4ME+ PL121 (25)]
 | |
| X-Virus-Scanned: by amavisd-new at hub.org
 | |
| X-Spam-Status: No, score=0.122 required=5 tests=[AWL=0.122]
 | |
| X-Spam-Score: 0.122
 | |
| X-Spam-Level: 
 | |
| X-Mailing-List: pgsql-hackers
 | |
| List-Archive: <http://archives.postgresql.org/pgsql-hackers>
 | |
| List-Help: <mailto:majordomo@postgresql.org?body=help>
 | |
| List-Id: <pgsql-hackers.postgresql.org>
 | |
| List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
 | |
| List-Post: <mailto:pgsql-hackers@postgresql.org>
 | |
| List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
 | |
| List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Content-Length:  3932
 | |
| 
 | |
| Tom Lane wrote:
 | |
| > Simon Riggs <simon@2ndquadrant.com> writes:
 | |
| > > My view would be that this thread has been complex because everybody has
 | |
| > > expressed a somewhat different requirement, which could be broken down
 | |
| > > as:
 | |
| > > 1. The need for a multi-user-accessible yet temporary table
 | |
| > > 2. Loading data into a table immediately after it is created (i.e. in
 | |
| > > same transaction), including but not limited to a reload from pg_dump
 | |
| > > 3. How to load data quickly into an existing table (COPY)
 | |
| > > 4. How to add/modify data quickly in an existing table (INSERT SELECT,
 | |
| > > UPDATE)
 | |
| > 
 | |
| > > I can see the need for all of those individually; my existing patch
 | |
| > > submission covers (2) and (3) only. I very much like your thought to
 | |
| > > coalesce these various requirements into a single coherent model.
 | |
| > 
 | |
| > However, you then seem to be arguing for still using the COPY LOCK
 | |
| > syntax, which I think Bruce intended would go away in favor of using
 | |
| > these ALTER commands.  Certainly that's what I'd prefer --- COPY has
 | |
| > got too darn many options already.
 | |
| > 
 | |
| > > In the patch, pg_dump has *not* been altered to use COPY LOCK, so a
 | |
| > > pg_dump *will* work with any other version of PostgreSQL, which *would
 | |
| > > not* be the case if we added ALTER TABLE ... RELIABILITY statements into
 | |
| > > it.
 | |
| > 
 | |
| > Wrong --- the good thing about ALTER TABLE is that an old version of
 | |
| > Postgres would simply reject it and keep going.  Therefore we could get
 | |
| > the speedup in dumps without losing compatibility, which is not true
 | |
| > of COPY LOCK.
 | |
| > 
 | |
| > BTW, this is a perfect example of the use-case for not abandoning a
 | |
| > dump-file load simply because one command fails.  (We have relied on
 | |
| > this sort of reasoning many times before, too, for example by using
 | |
| > "SET default_with_oids" in preference to CREATE TABLE WITH/WITHOUT OIDS.)
 | |
| > I don't think that "wrap the whole load into begin/end" is really a very
 | |
| > workable answer, because there are far too many scenarios where you
 | |
| > can't do that.  Another one where it doesn't help is a data-only dump.
 | |
| 
 | |
| Yep, Tom is echoing my reaction.  There is a temptation to add things up
 | |
| onto existing commands, e.g. LOCK, and while it works, it makes for some
 | |
| very complex user API's.  Having COPY behave differently because it is
 | |
| in a transaction is fine as long as it is user-invisible, but once you
 | |
| require users to do that to get the speedup, it isn't user-invisible
 | |
| anymore.
 | |
| 
 | |
| (I can see it now, "Why is pg_dump putting things in transactions?",
 | |
| "Because it prevents it from being logged."  "Oh, should I be doing that
 | |
| in my code?"  "Perhaps, if you want ..."  You can see where that
 | |
| discussion is going.  Having them see "ATER TABLE ... RELIBILITY
 | |
| TRUNCATE" is very clear, and very clear on how it can be used in user
 | |
| code.)
 | |
| 
 | |
| I think there is great utility in giving users one API, namely
 | |
| RELIABILITY (or some other keyword), and telling them that is where they
 | |
| control logging.  I realize adding one keyword, LOCK, to an existing
 | |
| command isn't a big deal, but once you decentralize your API enough
 | |
| times, you end up with a terribly complex database system.  It is this
 | |
| design rigidity that helps make PostgreSQL so much easier to use than
 | |
| other database systems.
 | |
| 
 | |
| I do think it is valid concern about someone use the table between the
 | |
| CREATE and the ALTER TABLE RELIABILITY.  One solution would be to allow
 | |
| the RELIABILITY as part of the CREATE TABLE, another is to tell users to
 | |
| create the table inside a transaction.
 | |
| 
 | |
| -- 
 | |
|   Bruce Momjian                        |  http://candle.pha.pa.us
 | |
|   pgman@candle.pha.pa.us               |  (610) 359-1001
 | |
|   +  If your life is a hard drive,     |  13 Roberts Road
 | |
|   +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
 | |
| 
 | |
| ---------------------------(end of broadcast)---------------------------
 | |
| TIP 9: In versions below 8.0, the planner will ignore your desire to
 | |
|        choose an index scan if your joining column's datatypes do not
 | |
|        match
 | |
| 
 | |
| From pgsql-hackers-owner+M78036=pgman=candle.pha.pa.us@postgresql.org Thu Dec 29 12:21:12 2005
 | |
| X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
 | |
| X-Greylist: from auto-whitelisted by SQLgrey-
 | |
| To: Andrew Dunstan <andrew@dunslane.net>
 | |
| cc: <pgman@candle.pha.pa.us>,  <kleptog@svana.org>,  <simon@2ndquadrant.com>, 
 | |
| 	   <tgl@sss.pgh.pa.us>,  <gsstark@mit.edu>,  <pg@rbt.ca>, 
 | |
| 	   <zhouqq@cs.toronto.edu>,  <pgsql-hackers@postgresql.org>
 | |
| Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
 | |
| References: <200512290158.jBT1wEK28785@candle.pha.pa.us>
 | |
| 	 <51082.68.143.134.146.1135872877.squirrel@www.dunslane.net>
 | |
| In-Reply-To: <51082.68.143.134.146.1135872877.squirrel@www.dunslane.net>
 | |
| From: Greg Stark <gsstark@mit.edu>
 | |
| Organization: The Emacs Conspiracy; member since 1992
 | |
| Date: 29 Dec 2005 12:20:32 -0500
 | |
| Message-ID: <87vex74y73.fsf@stark.xeocode.com>
 | |
| Lines: 42
 | |
| User-Agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.4
 | |
| X-Virus-Scanned: by amavisd-new at hub.org
 | |
| X-Spam-Status: No, score=0.112 required=5 tests=[AWL=0.112]
 | |
| X-Spam-Score: 0.112
 | |
| X-Spam-Level: 
 | |
| X-Mailing-List: pgsql-hackers
 | |
| List-Archive: <http://archives.postgresql.org/pgsql-hackers>
 | |
| List-Help: <mailto:majordomo@postgresql.org?body=help>
 | |
| List-Id: <pgsql-hackers.postgresql.org>
 | |
| List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
 | |
| List-Post: <mailto:pgsql-hackers@postgresql.org>
 | |
| List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
 | |
| List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Content-Length:  1983
 | |
| 
 | |
| "Andrew Dunstan" <andrew@dunslane.net> writes:
 | |
| 
 | |
| > Bruce Momjian said:
 | |
| > > DROP would drop the table on a restart
 | |
| > > after a non-clean shutdown.  It would do _no_ logging on the table and
 | |
| > > allow concurrent access, plus index access.  DELETE is the same as
 | |
| > > DROP, but it just truncates the table (perhaps TRUNCATE is a better
 | |
| > > word).
 | |
| > >
 | |
| > > EXCLUSIVE would allow only a single session to modify the table, and
 | |
| > > would do all changes by appending to the table, similar to COPY LOCK.
 | |
| > > EXCLUSIVE would also not allow indexes because those can not be
 | |
| > > isolated like appending to the heap.  EXCLUSIVE would write all dirty
 | |
| > > shared buffers for the table and fsync them before committing.  SHARE
 | |
| > > is the functionality we have now, with full logging.
 | |
| > 
 | |
| > I an horribly scared that this will be used as a "performance boost" for
 | |
| > normal use. I would at least like to see some restrictions that make it
 | |
| > harder to mis-use. Perhaps restrict to superuser?
 | |
| 
 | |
| Well that's its whole purpose. At least you can hardly argue that you didn't
 | |
| realize the consequences of "DELETE ROWS ON RECOVERY"... :)
 | |
| 
 | |
| Some thoughts:
 | |
| 
 | |
| a) I'm not sure I understand the purpose of EXCLUSIVE. When would I ever want to
 | |
|    use it instead of DELETE ROWS?
 | |
| 
 | |
| b) It seems like the other feature people were talking about of not logging
 | |
|    for a table created within the same transaction should be handled by
 | |
|    having this flag implicitly set for any such newly created table.
 | |
|    Ie, the test for whether to log would look like:
 | |
| 
 | |
|    if (!table->logged && table->xid != myxid) ...
 | |
| 
 | |
| c) Every option in ALTER TABLE should be in CREATE TABLE as well.
 | |
| 
 | |
| d) Yes as someone else mentioned, this should only be allowable on a table
 | |
|    with no foreign keys referencing it. 
 | |
| 
 | |
| -- 
 | |
| greg
 | |
| 
 | |
| 
 | |
| ---------------------------(end of broadcast)---------------------------
 | |
| TIP 9: In versions below 8.0, the planner will ignore your desire to
 | |
|        choose an index scan if your joining column's datatypes do not
 | |
|        match
 | |
| 
 | |
| From pgsql-hackers-owner+M78037=pgman=candle.pha.pa.us@postgresql.org Thu Dec 29 12:31:40 2005
 | |
| X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
 | |
| X-Greylist: from auto-whitelisted by SQLgrey-
 | |
| From: Bruce Momjian <pgman@candle.pha.pa.us>
 | |
| Message-ID: <200512291730.jBTHUnn09840@candle.pha.pa.us>
 | |
| Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
 | |
| In-Reply-To: <87vex74y73.fsf@stark.xeocode.com>
 | |
| To: Greg Stark <gsstark@mit.edu>
 | |
| Date: Thu, 29 Dec 2005 12:30:49 -0500 (EST)
 | |
| cc: Andrew Dunstan <andrew@dunslane.net>, kleptog@svana.org, 
 | |
| 	  simon@2ndquadrant.com, tgl@sss.pgh.pa.us, pg@rbt.ca, zhouqq@cs.toronto.edu, 
 | |
| 	  pgsql-hackers@postgresql.org
 | |
| X-Mailer: ELM [version 2.4ME+ PL121 (25)]
 | |
| X-Virus-Scanned: by amavisd-new at hub.org
 | |
| X-Spam-Status: No, score=0.122 required=5 tests=[AWL=0.122]
 | |
| X-Spam-Score: 0.122
 | |
| X-Spam-Level: 
 | |
| X-Mailing-List: pgsql-hackers
 | |
| List-Archive: <http://archives.postgresql.org/pgsql-hackers>
 | |
| List-Help: <mailto:majordomo@postgresql.org?body=help>
 | |
| List-Id: <pgsql-hackers.postgresql.org>
 | |
| List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
 | |
| List-Post: <mailto:pgsql-hackers@postgresql.org>
 | |
| List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
 | |
| List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Content-Length:  3304
 | |
| 
 | |
| Greg Stark wrote:
 | |
| > "Andrew Dunstan" <andrew@dunslane.net> writes:
 | |
| > 
 | |
| > > Bruce Momjian said:
 | |
| > > > DROP would drop the table on a restart
 | |
| > > > after a non-clean shutdown.  It would do _no_ logging on the table and
 | |
| > > > allow concurrent access, plus index access.  DELETE is the same as
 | |
| > > > DROP, but it just truncates the table (perhaps TRUNCATE is a better
 | |
| > > > word).
 | |
| > > >
 | |
| > > > EXCLUSIVE would allow only a single session to modify the table, and
 | |
| > > > would do all changes by appending to the table, similar to COPY LOCK.
 | |
| > > > EXCLUSIVE would also not allow indexes because those can not be
 | |
| > > > isolated like appending to the heap.  EXCLUSIVE would write all dirty
 | |
| > > > shared buffers for the table and fsync them before committing.  SHARE
 | |
| > > > is the functionality we have now, with full logging.
 | |
| > > 
 | |
| > > I an horribly scared that this will be used as a "performance boost" for
 | |
| > > normal use. I would at least like to see some restrictions that make it
 | |
| > > harder to mis-use. Perhaps restrict to superuser?
 | |
| > 
 | |
| > Well that's its whole purpose. At least you can hardly argue that you didn't
 | |
| > realize the consequences of "DELETE ROWS ON RECOVERY"... :)
 | |
| 
 | |
| True.  I think we are worried about non-owners using it, but the owner
 | |
| had to grant permissions for others to modify it, so we might be OK.
 | |
| 
 | |
| > Some thoughts:
 | |
| > 
 | |
| > a) I'm not sure I understand the purpose of EXCLUSIVE. When would I ever want to
 | |
| >    use it instead of DELETE ROWS?
 | |
| 
 | |
| Good question.  The use case is doing COPY into a table that already had
 | |
| data.  EXCLUSIVE allows additions to the table but preserves the
 | |
| existing data on a crash.
 | |
| 
 | |
| > b) It seems like the other feature people were talking about of not logging
 | |
| >    for a table created within the same transaction should be handled by
 | |
| >    having this flag implicitly set for any such newly created table.
 | |
| >    Ie, the test for whether to log would look like:
 | |
| > 
 | |
| >    if (!table->logged && table->xid != myxid) ...
 | |
| 
 | |
| Yes, the question is whether we want to limit users to having this
 | |
| optimization _only_ when they have created the table in the same
 | |
| transaction, and the short answer is we don't.
 | |
| 
 | |
| > c) Every option in ALTER TABLE should be in CREATE TABLE as well.
 | |
| 
 | |
| I looked into that and see that things like:
 | |
| 
 | |
|     ALTER [ COLUMN ] column SET STATISTICS integer
 | |
|     ALTER [ COLUMN ] column SET STORAGE { PLAIN | EXTERNAL | EXTENDED | MAIN }
 | |
| 
 | |
| are not supported by CREATE TABLE, and probably shouldn't be because the
 | |
| value can be changed after the table is created.  I think the only
 | |
| things we usually support in CREATE TABLE are those that cannot be
 | |
| altered.
 | |
| 
 | |
| > d) Yes as someone else mentioned, this should only be allowable on a table
 | |
| >    with no foreign keys referencing it. 
 | |
| 
 | |
| Right, and EXCLUSIVE can not have an index either.
 | |
| 
 | |
| -- 
 | |
|   Bruce Momjian                        |  http://candle.pha.pa.us
 | |
|   pgman@candle.pha.pa.us               |  (610) 359-1001
 | |
|   +  If your life is a hard drive,     |  13 Roberts Road
 | |
|   +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
 | |
| 
 | |
| ---------------------------(end of broadcast)---------------------------
 | |
| TIP 1: if posting/reading through Usenet, please send an appropriate
 | |
|        subscribe-nomail command to majordomo@postgresql.org so that your
 | |
|        message can get through to the mailing list cleanly
 | |
| 
 | |
| From simon@2ndquadrant.com Fri Dec 30 08:10:53 2005
 | |
| Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
 | |
| From: Simon Riggs <simon@2ndquadrant.com>
 | |
| To: Bruce Momjian <pgman@candle.pha.pa.us>
 | |
| cc: Andrew Dunstan <andrew@dunslane.net>, Tom Lane <tgl@sss.pgh.pa.us>, 
 | |
| 	   Martijn van Oosterhout <kleptog@svana.org>, Greg Stark <gsstark@mit.edu>, 
 | |
| 	   Rod Taylor <pg@rbt.ca>, Qingqing Zhou <zhouqq@cs.toronto.edu>, 
 | |
| 	  pgsql-hackers@postgresql.org
 | |
| In-Reply-To: <200512291637.jBTGbdC03848@candle.pha.pa.us>
 | |
| References: <200512291637.jBTGbdC03848@candle.pha.pa.us>
 | |
| Date: Fri, 30 Dec 2005 13:09:12 +0000
 | |
| Message-ID: <1135948152.2862.113.camel@localhost.localdomain>
 | |
| X-Mailer: Evolution 2.2.3 (2.2.3-2.fc4) 
 | |
| Content-Length:  6343
 | |
| 
 | |
| On Thu, 2005-12-29 at 11:37 -0500, Bruce Momjian wrote:
 | |
| > Tom Lane wrote:
 | |
| > > Simon Riggs <simon@2ndquadrant.com> writes:
 | |
| > > > My view would be that this thread has been complex because everybody has
 | |
| > > > expressed a somewhat different requirement, which could be broken down
 | |
| > > > as:
 | |
| > > > 1. The need for a multi-user-accessible yet temporary table
 | |
| > > > 2. Loading data into a table immediately after it is created (i.e. in
 | |
| > > > same transaction), including but not limited to a reload from pg_dump
 | |
| > > > 3. How to load data quickly into an existing table (COPY)
 | |
| > > > 4. How to add/modify data quickly in an existing table (INSERT SELECT,
 | |
| > > > UPDATE)
 | |
| 
 | |
| > > However, you then seem to be arguing for still using the COPY LOCK
 | |
| > > syntax, which I think Bruce intended would go away in favor of using
 | |
| > > these ALTER commands.  Certainly that's what I'd prefer --- COPY has
 | |
| > > got too darn many options already.
 | |
| 
 | |
| COPY LOCK was Tom's suggestion at the end of a long discussion thread on
 | |
| this precise issue. Nobody objected to it at that point; I implemented
 | |
| it *exactly* that way because I wanted to very visibly follow the
 | |
| consensus of the community, after informed debate.
 | |
| http://archives.postgresql.org/pgsql-hackers/2005-06/msg00068.php
 | |
| 
 | |
| Please re-read the links to previous discussions.
 | |
| http://archives.postgresql.org/pgsql-hackers/2005-06/msg00069.php
 | |
| There are points there, not made by me, that still apply and need to be
 | |
| considered here, yet have not been.
 | |
| 
 | |
| Just to restate my current thinking:
 | |
| - agree we should have ALTER TABLE ... RELIABILITY DELETE ROWS
 | |
| - we should have COPY LOCK rather than 
 | |
| ALTER TABLE .... RELIABILITY EXCLUSIVE
 | |
| (Though I welcome better wording and syntax in either case; it is the
 | |
| behaviour only that I discuss).
 | |
| 
 | |
| It seems now that we have agreed approaches for (1), (2) and (4). Please
 | |
| note that I have listened to the needs of others with regard to
 | |
| requirement (1), as espoused by earlier by Hannu and again now by
 | |
| Martijn. Some of the points about requirement (3) I made in my previous
 | |
| post have not yet been addressed, IMHO.
 | |
| 
 | |
| My mind is not fixed. AFAICS there are valid points remaining on both
 | |
| sides of the discussion about loading data quickly into an existing
 | |
| table.
 | |
| 
 | |
| > I do think it is valid concern about someone use the table between the
 | |
| > CREATE and the ALTER TABLE RELIABILITY.  One solution would be to allow
 | |
| > the RELIABILITY as part of the CREATE TABLE, another is to tell users to
 | |
| > create the table inside a transaction.
 | |
| 
 | |
| Neither solution works for this use case:
 | |
| 
 | |
| > > 3. How to load data quickly into an existing table (COPY)
 | |
| 
 | |
| This is the only use case for which ALTER TABLE ... EXCLUSIVE makes
 | |
| sense. That option means that any write lock held upon the table would
 | |
| be an EXCLUSIVE table lock, so would never be a performance gain with
 | |
| single row INSERT, UPDATE or DELETEs. 
 | |
| 
 | |
| Following Andrew's concerns, I'd also note that ALTER TABLE requires a
 | |
| much higher level of privilege to operate than does COPY. That sounds
 | |
| like it will make things more secure, but all it does is open up the
 | |
| administrative rights, since full ownership rights must be obtained
 | |
| merely to load data. 
 | |
| 
 | |
| > Having COPY behave differently because it is
 | |
| > in a transaction is fine as long as it is user-invisible
 | |
| 
 | |
| Good
 | |
| 
 | |
| > I think there is great utility in giving users one API, namely
 | |
| > RELIABILITY (or some other keyword), and telling them that is where they
 | |
| > control logging.  I realize adding one keyword, LOCK, to an existing
 | |
| > command isn't a big deal, but once you decentralize your API enough
 | |
| > times, you end up with a terribly complex database system.  It is this
 | |
| > design rigidity that helps make PostgreSQL so much easier to use than
 | |
| > other database systems.
 | |
| 
 | |
| I do see the appeal of your suggestion...
 | |
| 
 | |
| TRUNCATE is a special command to delete quickly. There is no requirement
 | |
| to do an ALTER TABLE statement before that command executes.
 | |
| 
 | |
| Balance would suggest that a special command to load data quickly would
 | |
| be reasonably accepted by users.
 | |
| 
 | |
| 
 | |
| 
 | |
| 
 | |
| Minor points below:
 | |
| 
 | |
| > > > In the patch, pg_dump has *not* been altered to use COPY LOCK, so a
 | |
| > > > pg_dump *will* work with any other version of PostgreSQL, which *would
 | |
| > > > not* be the case if we added ALTER TABLE ... RELIABILITY statements into
 | |
| > > > it.
 | |
| > > 
 | |
| > > Wrong --- the good thing about ALTER TABLE is that an old version of
 | |
| > > Postgres would simply reject it and keep going.  Therefore we could get
 | |
| > > the speedup in dumps without losing compatibility, which is not true
 | |
| > > of COPY LOCK.
 | |
| 
 | |
| That was pointing out one of Bruce's objections was not relevant because
 | |
| it assumed COPY LOCK was required to make pg_restore go faster; that was
 | |
| not the case - so there is no valid objection either way now.
 | |
| 
 | |
| > > BTW, this is a perfect example of the use-case for not abandoning a
 | |
| > > dump-file load simply because one command fails.  (We have relied on
 | |
| > > this sort of reasoning many times before, too, for example by using
 | |
| > > "SET default_with_oids" in preference to CREATE TABLE WITH/WITHOUT OIDS.)
 | |
| > > I don't think that "wrap the whole load into begin/end" is really a very
 | |
| > > workable answer, because there are far too many scenarios where you
 | |
| > > can't do that.  Another one where it doesn't help is a data-only dump.
 | |
| 
 | |
| Which is why --single-transaction is not the default, per the earlier
 | |
| discussion on that point (on -patches).
 | |
| 
 | |
| > Yep, Tom is echoing my reaction.  There is a temptation to add things up
 | |
| > onto existing commands, e.g. LOCK, and while it works, it makes for some
 | |
| > very complex user API's.  Having COPY behave differently because it is
 | |
| > in a transaction is fine as long as it is user-invisible, but once you
 | |
| > require users to do that to get the speedup, it isn't user-invisible
 | |
| > anymore.
 | |
| > 
 | |
| > (I can see it now, "Why is pg_dump putting things in transactions?",
 | |
| > "Because it prevents it from being logged."  "Oh, should I be doing that
 | |
| > in my code?"  "Perhaps, if you want ..."  You can see where that
 | |
| > discussion is going.  Having them see "ATER TABLE ... RELIBILITY
 | |
| > TRUNCATE" is very clear, and very clear on how it can be used in user
 | |
| > code.)
 | |
| 
 | |
| The above case is not an argument against COPY LOCK. Exactly what you
 | |
| say above would still occur even when we have ALTER TABLE ...
 | |
| RELIABILITY statement, since COPY LOCK and
 | |
| COPY-optimized-within-same-transaction are different things.
 | |
| 
 | |
| Best Regards, Simon Riggs
 | |
| 
 | |
| From pgsql-hackers-owner+M78064=pgman=candle.pha.pa.us@postgresql.org Fri Dec 30 11:50:49 2005
 | |
| X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
 | |
| X-Greylist: from auto-whitelisted by SQLgrey-
 | |
| From: Bruce Momjian <pgman@candle.pha.pa.us>
 | |
| Message-ID: <200512301649.jBUGnxn21488@candle.pha.pa.us>
 | |
| Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
 | |
| In-Reply-To: <1135948152.2862.113.camel@localhost.localdomain>
 | |
| To: Simon Riggs <simon@2ndquadrant.com>
 | |
| Date: Fri, 30 Dec 2005 11:49:59 -0500 (EST)
 | |
| cc: Andrew Dunstan <andrew@dunslane.net>, Tom Lane <tgl@sss.pgh.pa.us>, 
 | |
| 	   Martijn van Oosterhout <kleptog@svana.org>, Greg Stark <gsstark@mit.edu>, 
 | |
| 	   Rod Taylor <pg@rbt.ca>, Qingqing Zhou <zhouqq@cs.toronto.edu>, 
 | |
| 	  pgsql-hackers@postgresql.org
 | |
| X-Mailer: ELM [version 2.4ME+ PL121 (25)]
 | |
| X-Virus-Scanned: by amavisd-new at hub.org
 | |
| X-Spam-Status: No, score=0.12 required=5 tests=[AWL=0.120]
 | |
| X-Spam-Score: 0.12
 | |
| X-Spam-Level: 
 | |
| X-Mailing-List: pgsql-hackers
 | |
| List-Archive: <http://archives.postgresql.org/pgsql-hackers>
 | |
| List-Help: <mailto:majordomo@postgresql.org?body=help>
 | |
| List-Id: <pgsql-hackers.postgresql.org>
 | |
| List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
 | |
| List-Post: <mailto:pgsql-hackers@postgresql.org>
 | |
| List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
 | |
| List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Content-Length:  8888
 | |
| 
 | |
| Simon Riggs wrote:
 | |
| > On Thu, 2005-12-29 at 11:37 -0500, Bruce Momjian wrote:
 | |
| > > Tom Lane wrote:
 | |
| > > > Simon Riggs <simon@2ndquadrant.com> writes:
 | |
| > > > > My view would be that this thread has been complex because everybody has
 | |
| > > > > expressed a somewhat different requirement, which could be broken down
 | |
| > > > > as:
 | |
| > > > > 1. The need for a multi-user-accessible yet temporary table
 | |
| > > > > 2. Loading data into a table immediately after it is created (i.e. in
 | |
| > > > > same transaction), including but not limited to a reload from pg_dump
 | |
| > > > > 3. How to load data quickly into an existing table (COPY)
 | |
| > > > > 4. How to add/modify data quickly in an existing table (INSERT SELECT,
 | |
| > > > > UPDATE)
 | |
| > 
 | |
| > > > However, you then seem to be arguing for still using the COPY LOCK
 | |
| > > > syntax, which I think Bruce intended would go away in favor of using
 | |
| > > > these ALTER commands.  Certainly that's what I'd prefer --- COPY has
 | |
| > > > got too darn many options already.
 | |
| > 
 | |
| > COPY LOCK was Tom's suggestion at the end of a long discussion thread on
 | |
| > this precise issue. Nobody objected to it at that point; I implemented
 | |
| > it *exactly* that way because I wanted to very visibly follow the
 | |
| > consensus of the community, after informed debate.
 | |
| > http://archives.postgresql.org/pgsql-hackers/2005-06/msg00068.php
 | |
| > 
 | |
| > Please re-read the links to previous discussions.
 | |
| > http://archives.postgresql.org/pgsql-hackers/2005-06/msg00069.php
 | |
| > There are points there, not made by me, that still apply and need to be
 | |
| > considered here, yet have not been.
 | |
| 
 | |
| Yes, I know we agreed to the COPY LOCK, but new features now being
 | |
| requested, so we have to re-evaluate where we are going with COPY LOCK
 | |
| to get a more consistent solution.
 | |
| 
 | |
| > Just to restate my current thinking:
 | |
| > - agree we should have ALTER TABLE ... RELIABILITY DELETE ROWS
 | |
| > - we should have COPY LOCK rather than 
 | |
| > ALTER TABLE .... RELIABILITY EXCLUSIVE
 | |
| > (Though I welcome better wording and syntax in either case; it is the
 | |
| > behaviour only that I discuss).
 | |
| > 
 | |
| > It seems now that we have agreed approaches for (1), (2) and (4). Please
 | |
| > note that I have listened to the needs of others with regard to
 | |
| > requirement (1), as espoused by earlier by Hannu and again now by
 | |
| > Martijn. Some of the points about requirement (3) I made in my previous
 | |
| > post have not yet been addressed, IMHO.
 | |
| > 
 | |
| > My mind is not fixed. AFAICS there are valid points remaining on both
 | |
| > sides of the discussion about loading data quickly into an existing
 | |
| > table.
 | |
| > 
 | |
| > > I do think it is valid concern about someone use the table between the
 | |
| > > CREATE and the ALTER TABLE RELIABILITY.  One solution would be to allow
 | |
| > > the RELIABILITY as part of the CREATE TABLE, another is to tell users to
 | |
| > > create the table inside a transaction.
 | |
| > 
 | |
| > Neither solution works for this use case:
 | |
| > 
 | |
| > > > 3. How to load data quickly into an existing table (COPY)
 | |
| > 
 | |
| > This is the only use case for which ALTER TABLE ... EXCLUSIVE makes
 | |
| > sense. That option means that any write lock held upon the table would
 | |
| > be an EXCLUSIVE table lock, so would never be a performance gain with
 | |
| > single row INSERT, UPDATE or DELETEs. 
 | |
| 
 | |
| Ah, but people wanted fast INSERT INTO ... SELECT, and that would use
 | |
| EXCLUSIVE too.  What about a massive UPDATE?  Perhaps that could use
 | |
| EXCLUSIVE?  We don't want to add "LOCK" to every command that might use
 | |
| EXCLUSIVE.  ALTER is much better for this.
 | |
| 
 | |
| I agree if we thought EXCLUSIVE would only be used for COPY, we could
 | |
| use LOCK, but I am thinking it will be used for other commands as well.
 | |
| 
 | |
| > Following Andrew's concerns, I'd also note that ALTER TABLE requires a
 | |
| > much higher level of privilege to operate than does COPY. That sounds
 | |
| > like it will make things more secure, but all it does is open up the
 | |
| > administrative rights, since full ownership rights must be obtained
 | |
| > merely to load data. 
 | |
| 
 | |
| True, but as pointed out by others, I don't see that happening too
 | |
| often.
 | |
| 
 | |
| > > Having COPY behave differently because it is
 | |
| > > in a transaction is fine as long as it is user-invisible
 | |
| > 
 | |
| > Good
 | |
| > 
 | |
| > > I think there is great utility in giving users one API, namely
 | |
| > > RELIABILITY (or some other keyword), and telling them that is where they
 | |
| > > control logging.  I realize adding one keyword, LOCK, to an existing
 | |
| > > command isn't a big deal, but once you decentralize your API enough
 | |
| > > times, you end up with a terribly complex database system.  It is this
 | |
| > > design rigidity that helps make PostgreSQL so much easier to use than
 | |
| > > other database systems.
 | |
| > 
 | |
| > I do see the appeal of your suggestion...
 | |
| > 
 | |
| > TRUNCATE is a special command to delete quickly. There is no requirement
 | |
| > to do an ALTER TABLE statement before that command executes.
 | |
| 
 | |
| The TRUNCATE happens during recovery.  There is no user interaction.  It
 | |
| happens because we can't restore the contents of the table in a
 | |
| consistent state because no logging was used.  Basically, a table marked
 | |
| RELIABILITY TRUNCATE would be truncated on a recovery start of the
 | |
| postmaster.
 | |
| 
 | |
| > Balance would suggest that a special command to load data quickly would
 | |
| > be reasonably accepted by users.
 | |
| > 
 | |
| > 
 | |
| > 
 | |
| > 
 | |
| > Minor points below:
 | |
| > 
 | |
| > > > > In the patch, pg_dump has *not* been altered to use COPY LOCK, so a
 | |
| > > > > pg_dump *will* work with any other version of PostgreSQL, which *would
 | |
| > > > > not* be the case if we added ALTER TABLE ... RELIABILITY statements into
 | |
| > > > > it.
 | |
| > > > 
 | |
| > > > Wrong --- the good thing about ALTER TABLE is that an old version of
 | |
| > > > Postgres would simply reject it and keep going.  Therefore we could get
 | |
| > > > the speedup in dumps without losing compatibility, which is not true
 | |
| > > > of COPY LOCK.
 | |
| > 
 | |
| > That was pointing out one of Bruce's objections was not relevant because
 | |
| > it assumed COPY LOCK was required to make pg_restore go faster; that was
 | |
| > not the case - so there is no valid objection either way now.
 | |
| 
 | |
| I don't consider the single-transaction to be a no-cost solution.  You
 | |
| are adding flags to commands, and you are using a dump layout for
 | |
| performance where the purpose for the layout is not clear.  The ALTER is
 | |
| clear to the user, and it allows nologging operations to happen after
 | |
| the table is created.
 | |
| 
 | |
| In fact, for use in pg_dump, I think DROP is the proper operation for
 | |
| loading, not your transaction wrapping solution.  We already agree we
 | |
| need DROP (or TRUNCATE), so why not use that rather than the transaction
 | |
| wrap idea?
 | |
| 
 | |
| > > > BTW, this is a perfect example of the use-case for not abandoning a
 | |
| > > > dump-file load simply because one command fails.  (We have relied on
 | |
| > > > this sort of reasoning many times before, too, for example by using
 | |
| > > > "SET default_with_oids" in preference to CREATE TABLE WITH/WITHOUT OIDS.)
 | |
| > > > I don't think that "wrap the whole load into begin/end" is really a very
 | |
| > > > workable answer, because there are far too many scenarios where you
 | |
| > > > can't do that.  Another one where it doesn't help is a data-only dump.
 | |
| > 
 | |
| > Which is why --single-transaction is not the default, per the earlier
 | |
| > discussion on that point (on -patches).
 | |
| 
 | |
| Right, but why not use DROP/TRUNCATE?  That works for old dumps too, and
 | |
| has no downsides, meaning it can be always on.
 | |
| 
 | |
| > > Yep, Tom is echoing my reaction.  There is a temptation to add things up
 | |
| > > onto existing commands, e.g. LOCK, and while it works, it makes for some
 | |
| > > very complex user API's.  Having COPY behave differently because it is
 | |
| > > in a transaction is fine as long as it is user-invisible, but once you
 | |
| > > require users to do that to get the speedup, it isn't user-invisible
 | |
| > > anymore.
 | |
| > > 
 | |
| > > (I can see it now, "Why is pg_dump putting things in transactions?",
 | |
| > > "Because it prevents it from being logged."  "Oh, should I be doing that
 | |
| > > in my code?"  "Perhaps, if you want ..."  You can see where that
 | |
| > > discussion is going.  Having them see "ATER TABLE ... RELIBILITY
 | |
| > > TRUNCATE" is very clear, and very clear on how it can be used in user
 | |
| > > code.)
 | |
| > 
 | |
| > The above case is not an argument against COPY LOCK. Exactly what you
 | |
| > say above would still occur even when we have ALTER TABLE ...
 | |
| > RELIABILITY statement, since COPY LOCK and
 | |
| > COPY-optimized-within-same-transaction are different things.
 | |
| 
 | |
| See my posting above that we might want EXCLUSIVE for other commands,
 | |
| meaning ALTER makes more sense.
 | |
| 
 | |
| So, to summarize, I think we should add DROP/TRUNCATE, and use that by
 | |
| default (or optionally off?) in pg_dump, and, assuming we want EXCLUSIVE
 | |
| for more than just COPY, we need to add ALTER TABLE EXCLUSIVE.
 | |
| 
 | |
| -- 
 | |
|   Bruce Momjian                        |  http://candle.pha.pa.us
 | |
|   pgman@candle.pha.pa.us               |  (610) 359-1001
 | |
|   +  If your life is a hard drive,     |  13 Roberts Road
 | |
|   +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
 | |
| 
 | |
| ---------------------------(end of broadcast)---------------------------
 | |
| TIP 5: don't forget to increase your free space map settings
 | |
| 
 | |
| From pgsql-hackers-owner+M78065=pgman=candle.pha.pa.us@postgresql.org Fri Dec 30 12:40:48 2005
 | |
| X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
 | |
| X-Greylist: from auto-whitelisted by SQLgrey-
 | |
| Message-ID: <43B570C9.6060406@dunslane.net>
 | |
| Date: Fri, 30 Dec 2005 12:39:21 -0500
 | |
| From: Andrew Dunstan <andrew@dunslane.net>
 | |
| User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.12) Gecko/20050922 Fedora/1.7.12-1.3.1
 | |
| X-Accept-Language: en-us, en
 | |
| To: Tom Lane <tgl@sss.pgh.pa.us>
 | |
| cc: simon@2ndquadrant.com, pgman@candle.pha.pa.us, kleptog@svana.org, 
 | |
| 	  gsstark@mit.edu, pg@rbt.ca, zhouqq@cs.toronto.edu, 
 | |
| 	  pgsql-hackers@postgresql.org
 | |
| Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
 | |
| References: <1135948152.2862.113.camel@localhost.localdomain>  <56737.68.143.134.146.1135954413.squirrel@www.dunslane.net>  <11876.1135954626@sss.pgh.pa.us>
 | |
| In-Reply-To: <11876.1135954626@sss.pgh.pa.us>
 | |
| X-Virus-Scanned: by amavisd-new at hub.org
 | |
| X-Spam-Status: No, score=0.041 required=5 tests=[AWL=0.041]
 | |
| X-Spam-Score: 0.041
 | |
| X-Spam-Level: 
 | |
| X-Mailing-List: pgsql-hackers
 | |
| List-Archive: <http://archives.postgresql.org/pgsql-hackers>
 | |
| List-Help: <mailto:majordomo@postgresql.org?body=help>
 | |
| List-Id: <pgsql-hackers.postgresql.org>
 | |
| List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
 | |
| List-Post: <mailto:pgsql-hackers@postgresql.org>
 | |
| List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
 | |
| List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Content-Length:  1815
 | |
| 
 | |
| 
 | |
| 
 | |
| Tom Lane wrote:
 | |
| 
 | |
| >"Andrew Dunstan" <andrew@dunslane.net> writes:
 | |
| >  
 | |
| >
 | |
| >>Simon Riggs said:
 | |
| >>    
 | |
| >>
 | |
| >>>Following Andrew's concerns, I'd also note that ALTER TABLE requires a
 | |
| >>>much higher level of privilege to operate than does COPY. That sounds
 | |
| >>>like it will make things more secure, but all it does is open up the
 | |
| >>>administrative rights, since full ownership rights must be obtained
 | |
| >>>merely to load data.
 | |
| >>>      
 | |
| >>>
 | |
| >
 | |
| >  
 | |
| >
 | |
| >>My concern is more about making plain that this is for special operations,
 | |
| >>not normal operations. Or maybe I have misunderstood the purpose.
 | |
| >>    
 | |
| >>
 | |
| >
 | |
| >Rephrase that as "full ownership rights must be obtained to load data in
 | |
| >a way that requires dropping any existing indexes and locking out other
 | |
| >users of the table".  I don't think the use-case for this will be very
 | |
| >large for non-owners, or indeed even for owners except during initial
 | |
| >table creation; and so I don't think the above argument is strong.
 | |
| >
 | |
| >			
 | |
| >  
 | |
| >
 | |
| 
 | |
| Those restrictions aren't true of Bruce's proposed drop and
 | |
| delete/truncate recovery modes, are they?
 | |
| 
 | |
| People do crazy things in pursuit of performance. Illustration: a few
 | |
| months ago I was instrumenting an app (based on MySQL/ISAM) and I
 | |
| noticed that under load it simply didn't update the inventory properly -
 | |
| of 1000 orders placed within a few seconds it might reduce inventory by
 | |
| 3 or 4. I reported this and they shrugged their shoulders and said
 | |
| "well, we'd have to lock the table and that would slow everything down
 | |
| ...".
 | |
| 
 | |
| I just want to be sure we aren't providing a footgun. "Oh, just set
 | |
| recovery mode to delete. It won't make any difference unless you crash
 | |
| and you'll run faster."
 | |
| 
 | |
| cheers
 | |
| 
 | |
| andrew
 | |
| 
 | |
| 
 | |
| 
 | |
| 
 | |
| ---------------------------(end of broadcast)---------------------------
 | |
| TIP 5: don't forget to increase your free space map settings
 | |
| 
 | |
| From pgsql-hackers-owner+M78066=pgman=candle.pha.pa.us@postgresql.org Fri Dec 30 12:58:52 2005
 | |
| X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
 | |
| X-Greylist: from auto-whitelisted by SQLgrey-
 | |
| From: Bruce Momjian <pgman@candle.pha.pa.us>
 | |
| Message-ID: <200512301758.jBUHwFv03107@candle.pha.pa.us>
 | |
| Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
 | |
| In-Reply-To: <43B570C9.6060406@dunslane.net>
 | |
| To: Andrew Dunstan <andrew@dunslane.net>
 | |
| Date: Fri, 30 Dec 2005 12:58:15 -0500 (EST)
 | |
| cc: Tom Lane <tgl@sss.pgh.pa.us>, simon@2ndquadrant.com, kleptog@svana.org, 
 | |
| 	  gsstark@mit.edu, pg@rbt.ca, zhouqq@cs.toronto.edu, 
 | |
| 	  pgsql-hackers@postgresql.org
 | |
| X-Mailer: ELM [version 2.4ME+ PL121 (25)]
 | |
| X-Virus-Scanned: by amavisd-new at hub.org
 | |
| X-Spam-Status: No, score=0.12 required=5 tests=[AWL=0.120]
 | |
| X-Spam-Score: 0.12
 | |
| X-Spam-Level: 
 | |
| X-Mailing-List: pgsql-hackers
 | |
| List-Archive: <http://archives.postgresql.org/pgsql-hackers>
 | |
| List-Help: <mailto:majordomo@postgresql.org?body=help>
 | |
| List-Id: <pgsql-hackers.postgresql.org>
 | |
| List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
 | |
| List-Post: <mailto:pgsql-hackers@postgresql.org>
 | |
| List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
 | |
| List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Content-Length:  1996
 | |
| 
 | |
| Andrew Dunstan wrote:
 | |
| > >>My concern is more about making plain that this is for special operations,
 | |
| > >>not normal operations. Or maybe I have misunderstood the purpose.
 | |
| > >>    
 | |
| > >>
 | |
| > >
 | |
| > >Rephrase that as "full ownership rights must be obtained to load data in
 | |
| > >a way that requires dropping any existing indexes and locking out other
 | |
| > >users of the table".  I don't think the use-case for this will be very
 | |
| > >large for non-owners, or indeed even for owners except during initial
 | |
| > >table creation; and so I don't think the above argument is strong.
 | |
| > >
 | |
| > >			
 | |
| > >  
 | |
| > >
 | |
| > 
 | |
| > Those restrictions aren't true of Bruce's proposed drop and
 | |
| > delete/truncate recovery modes, are they?
 | |
| 
 | |
| Only the owner could do the ALTER, for sure, but once the owner sets it,
 | |
| any user with permission to write to the table would have those
 | |
| characteristics.
 | |
| 
 | |
| > People do crazy things in pursuit of performance. Illustration: a few
 | |
| > months ago I was instrumenting an app (based on MySQL/ISAM) and I
 | |
| > noticed that under load it simply didn't update the inventory properly -
 | |
| > of 1000 orders placed within a few seconds it might reduce inventory by
 | |
| > 3 or 4. I reported this and they shrugged their shoulders and said
 | |
| > "well, we'd have to lock the table and that would slow everything down
 | |
| > ...".
 | |
| > 
 | |
| > I just want to be sure we aren't providing a footgun. "Oh, just set
 | |
| > recovery mode to delete. It won't make any difference unless you crash
 | |
| > and you'll run faster."
 | |
| 
 | |
| I think we have to trust the object owner in this case.  I don't know of
 | |
| any super-user-only ALTER commands, but I suppose we could set it up
 | |
| that way if we wanted.
 | |
| 
 | |
| -- 
 | |
|   Bruce Momjian                        |  http://candle.pha.pa.us
 | |
|   pgman@candle.pha.pa.us               |  (610) 359-1001
 | |
|   +  If your life is a hard drive,     |  13 Roberts Road
 | |
|   +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
 | |
| 
 | |
| ---------------------------(end of broadcast)---------------------------
 | |
| TIP 2: Don't 'kill -9' the postmaster
 | |
| 
 | |
| From pgsql-hackers-owner+M78070=pgman=candle.pha.pa.us@postgresql.org Fri Dec 30 14:29:06 2005
 | |
| X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
 | |
| X-Greylist: from auto-whitelisted by SQLgrey-
 | |
| Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
 | |
| From: Simon Riggs <simon@2ndquadrant.com>
 | |
| To: Bruce Momjian <pgman@candle.pha.pa.us>
 | |
| cc: Andrew Dunstan <andrew@dunslane.net>, Tom Lane <tgl@sss.pgh.pa.us>, 
 | |
| 	   Martijn van Oosterhout <kleptog@svana.org>, Greg Stark <gsstark@mit.edu>, 
 | |
| 	   Rod Taylor <pg@rbt.ca>, Qingqing Zhou <zhouqq@cs.toronto.edu>, 
 | |
| 	  pgsql-hackers@postgresql.org
 | |
| In-Reply-To: <200512301649.jBUGnxn21488@candle.pha.pa.us>
 | |
| References: <200512301649.jBUGnxn21488@candle.pha.pa.us>
 | |
| Date: Fri, 30 Dec 2005 19:28:41 +0000
 | |
| Message-ID: <1135970921.5052.68.camel@localhost.localdomain>
 | |
| X-Mailer: Evolution 2.2.3 (2.2.3-2.fc4) 
 | |
| X-Virus-Scanned: by amavisd-new at hub.org
 | |
| X-Spam-Status: No, score=0.034 required=5 tests=[AWL=0.034]
 | |
| X-Spam-Score: 0.034
 | |
| X-Spam-Level: 
 | |
| X-Mailing-List: pgsql-hackers
 | |
| List-Archive: <http://archives.postgresql.org/pgsql-hackers>
 | |
| List-Help: <mailto:majordomo@postgresql.org?body=help>
 | |
| List-Id: <pgsql-hackers.postgresql.org>
 | |
| List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
 | |
| List-Post: <mailto:pgsql-hackers@postgresql.org>
 | |
| List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
 | |
| List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Content-Length:  3112
 | |
| 
 | |
| On Fri, 2005-12-30 at 11:49 -0500, Bruce Momjian wrote:
 | |
| 
 | |
| > Yes, I know we agreed to the COPY LOCK, but new features now being
 | |
| > requested, so we have to re-evaluate where we are going with COPY LOCK
 | |
| > to get a more consistent solution.
 | |
| 
 | |
| Thank you. 
 | |
| 
 | |
| > Ah, but people wanted fast INSERT INTO ... SELECT, and that would use
 | |
| > EXCLUSIVE too.  What about a massive UPDATE?  Perhaps that could use
 | |
| > EXCLUSIVE?  We don't want to add "LOCK" to every command that might use
 | |
| > EXCLUSIVE.  ALTER is much better for this.
 | |
| 
 | |
| > I agree if we thought EXCLUSIVE would only be used for COPY, we could
 | |
| > use LOCK, but I am thinking it will be used for other commands as well.
 | |
| 
 | |
| Agreed, I will look to implement this.
 | |
| 
 | |
| Could the internals of my recent patch be reviewed? Changing the user
 | |
| interface is less of a problem than changing the internals, which is
 | |
| where the hard work takes place. I do not want to extend this work
 | |
| further only to have that part rejected later. 
 | |
| 
 | |
| The implications of EXCLUSIVE are:
 | |
| - there will be a check on each and every I, U, D to check the state of
 | |
| the relation
 | |
| - *every* operation that attempts a write lock will attempt to acquire
 | |
| an EXCLUSIVE full table lock instead
 | |
| - following successful completion of *each* DML statement, the relation
 | |
| will be heap_sync'd involving a full scan of the buffer cache
 | |
| 
 | |
| Can I clarify the wording of the syntax? Is EXCLUSIVE the right word?
 | |
| How about FASTLOAD or BULKLOAD? Those words seem less likely to be
 | |
| misused in the future - i.e. we are invoking a special mode, rather than
 | |
| invoking a special "go faster" option.
 | |
| 
 | |
| > I don't consider the single-transaction to be a no-cost solution.  You
 | |
| > are adding flags to commands, and you are using a dump layout for
 | |
| > performance where the purpose for the layout is not clear.  The ALTER is
 | |
| > clear to the user, and it allows nologging operations to happen after
 | |
| > the table is created.
 | |
| > 
 | |
| > In fact, for use in pg_dump, I think DROP is the proper operation for
 | |
| > loading, not your transaction wrapping solution.  We already agree we
 | |
| > need DROP (or TRUNCATE), so why not use that rather than the transaction
 | |
| > wrap idea?
 | |
| 
 | |
| This was discussed on-list by 2 core team members, a committer and
 | |
| myself, but I see no requirements change here. You even accepted the
 | |
| invisible COPY optimization in your last post - why unpick that now?
 | |
| Please forgive my tone, but I am lost for reasonable yet expressive
 | |
| words. 
 | |
| 
 | |
| The --single-transaction mode would apply even if the dump was created
 | |
| using an earlier version of pg_dump. pg_dump has *not* been altered at
 | |
| all. (And I would again add that the idea was not my own)
 | |
| 
 | |
| > So, to summarize, I think we should add DROP/TRUNCATE, and use that by
 | |
| > default (or optionally off?) in pg_dump, and, assuming we want EXCLUSIVE
 | |
| > for more than just COPY, we need to add ALTER TABLE EXCLUSIVE.
 | |
| 
 | |
| Would you mind stating again what you mean, just so I can understand
 | |
| this? Your summary isn't enough.
 | |
| 
 | |
| Best Regards, Simon Riggs
 | |
| 
 | |
| 
 | |
| ---------------------------(end of broadcast)---------------------------
 | |
| TIP 3: Have you checked our extensive FAQ?
 | |
| 
 | |
|                http://www.postgresql.org/docs/faq
 | |
| 
 | |
| From pgsql-hackers-owner+M78072=pgman=candle.pha.pa.us@postgresql.org Fri Dec 30 16:15:30 2005
 | |
| X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
 | |
| X-Greylist: from auto-whitelisted by SQLgrey-
 | |
| From: Bruce Momjian <pgman@candle.pha.pa.us>
 | |
| Message-ID: <200512302114.jBULEno02301@candle.pha.pa.us>
 | |
| Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
 | |
| In-Reply-To: <1135970921.5052.68.camel@localhost.localdomain>
 | |
| To: Simon Riggs <simon@2ndquadrant.com>
 | |
| Date: Fri, 30 Dec 2005 16:14:49 -0500 (EST)
 | |
| cc: Andrew Dunstan <andrew@dunslane.net>, Tom Lane <tgl@sss.pgh.pa.us>, 
 | |
| 	   Martijn van Oosterhout <kleptog@svana.org>, Greg Stark <gsstark@mit.edu>, 
 | |
| 	   Rod Taylor <pg@rbt.ca>, Qingqing Zhou <zhouqq@cs.toronto.edu>, 
 | |
| 	  pgsql-hackers@postgresql.org
 | |
| X-Mailer: ELM [version 2.4ME+ PL121 (25)]
 | |
| X-Virus-Scanned: by amavisd-new at hub.org
 | |
| X-Spam-Status: No, score=0.12 required=5 tests=[AWL=0.120]
 | |
| X-Spam-Score: 0.12
 | |
| X-Spam-Level: 
 | |
| X-Mailing-List: pgsql-hackers
 | |
| List-Archive: <http://archives.postgresql.org/pgsql-hackers>
 | |
| List-Help: <mailto:majordomo@postgresql.org?body=help>
 | |
| List-Id: <pgsql-hackers.postgresql.org>
 | |
| List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
 | |
| List-Post: <mailto:pgsql-hackers@postgresql.org>
 | |
| List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
 | |
| List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Content-Length:  7285
 | |
| 
 | |
| Simon Riggs wrote:
 | |
| > On Fri, 2005-12-30 at 11:49 -0500, Bruce Momjian wrote:
 | |
| > 
 | |
| > > Yes, I know we agreed to the COPY LOCK, but new features now being
 | |
| > > requested, so we have to re-evaluate where we are going with COPY LOCK
 | |
| > > to get a more consistent solution.
 | |
| > 
 | |
| > Thank you. 
 | |
| 
 | |
| Good.  I think we can be happy that COPY LOCK didn't get into a release,
 | |
| so we don't have to support it forever.  When we are adding features, we
 | |
| have to consider not only the current release, but future releases and
 | |
| what people will ask for in the future so the syntax can be expanded
 | |
| without breaking previous usage.
 | |
| 
 | |
| > > Ah, but people wanted fast INSERT INTO ... SELECT, and that would use
 | |
| > > EXCLUSIVE too.  What about a massive UPDATE?  Perhaps that could use
 | |
| > > EXCLUSIVE?  We don't want to add "LOCK" to every command that might use
 | |
| > > EXCLUSIVE.  ALTER is much better for this.
 | |
| > 
 | |
| > > I agree if we thought EXCLUSIVE would only be used for COPY, we could
 | |
| > > use LOCK, but I am thinking it will be used for other commands as well.
 | |
| > 
 | |
| > Agreed, I will look to implement this.
 | |
| > 
 | |
| > Could the internals of my recent patch be reviewed? Changing the user
 | |
| > interface is less of a problem than changing the internals, which is
 | |
| > where the hard work takes place. I do not want to extend this work
 | |
| > further only to have that part rejected later. 
 | |
| 
 | |
| OK, I will look it over this week or next.
 | |
| 
 | |
| > The implications of EXCLUSIVE are:
 | |
| > - there will be a check on each and every I, U, D to check the state of
 | |
| > the relation
 | |
| > - *every* operation that attempts a write lock will attempt to acquire
 | |
| > an EXCLUSIVE full table lock instead
 | |
| > - following successful completion of *each* DML statement, the relation
 | |
| > will be heap_sync'd involving a full scan of the buffer cache
 | |
| 
 | |
| Yes, I think that is it.  What we can do is implement EXCLUSIVE to
 | |
| affect only COPY at this point, and document that, and later add other
 | |
| commands.
 | |
| 
 | |
| > Can I clarify the wording of the syntax? Is EXCLUSIVE the right word?
 | |
| > How about FASTLOAD or BULKLOAD? Those words seem less likely to be
 | |
| > misused in the future - i.e. we are invoking a special mode, rather than
 | |
| > invoking a special "go faster" option.
 | |
| 
 | |
| The problem with the FASTLOAD/BULKLOAD words is that EXCLUSIVE mode is
 | |
| probably not the best for loading.  I would think TRUNCATE would be a
 | |
| better option.
 | |
| 
 | |
| In fact, in loading a table, I think both EXCLUSIVE and TRUNCATE would be
 | |
| the same, mostly.  You would create the table, set its RELIABILITY to
 | |
| TRUNCATE, COPY into the table, then set the RELIABILITY to SHARE or
 | |
| DEFAULT.  The second ALTER has to sync all the dirty data blocks, which
 | |
| the same thing EXCLUSIVE does at the conclusion of COPY.
 | |
| 
 | |
| So, we need a name for EXCLUSIVE mode that suggests how it is different
 | |
| from TRUNCATE, and in this case, the difference is that EXCLUSIVE
 | |
| preserves the previous contents of the table on recovery, while TRUNCATE
 | |
| does not.  Do you want to call the mode PRESERVE, or EXCLUSIVE WRITER?
 | |
| Anyway, the keywords are easy to modify, even after the patch is
 | |
| submitted.  FYI, I usually go through keywords.c looking for a keyword
 | |
| we already use.
 | |
| 
 | |
| > > I don't consider the single-transaction to be a no-cost solution.  You
 | |
| > > are adding flags to commands, and you are using a dump layout for
 | |
| > > performance where the purpose for the layout is not clear.  The ALTER is
 | |
| > > clear to the user, and it allows nologging operations to happen after
 | |
| > > the table is created.
 | |
| > > 
 | |
| > > In fact, for use in pg_dump, I think DROP is the proper operation for
 | |
| > > loading, not your transaction wrapping solution.  We already agree we
 | |
| > > need DROP (or TRUNCATE), so why not use that rather than the transaction
 | |
| > > wrap idea?
 | |
| > 
 | |
| > This was discussed on-list by 2 core team members, a committer and
 | |
| > myself, but I see no requirements change here. You even accepted the
 | |
| > invisible COPY optimization in your last post - why unpick that now?
 | |
| > Please forgive my tone, but I am lost for reasonable yet expressive
 | |
| > words. 
 | |
| 
 | |
| Do you think you are the only one who has rewritten a patch multiple
 | |
| times?  We all have.  The goal is to get the functionality into the
 | |
| system in the most seamless way possible.  Considering the number of
 | |
| people who use PostgreSQL, if it takes use 10 tries, it is worth it
 | |
| considering the thousands of people who will use it.   Would you have us
 | |
| include a sub-optimal patch and have thousands of people adjust to its
 | |
| non-optimal functionality?  I am sure you would not.  Perhaps a company
 | |
| would say, "Oh, just ship it", but we don't.
 | |
| 
 | |
| > The --single-transaction mode would apply even if the dump was created
 | |
| > using an earlier version of pg_dump. pg_dump has *not* been altered at
 | |
| > all. (And I would again add that the idea was not my own)
 | |
| 
 | |
| I assume you mean this:
 | |
| 
 | |
| 	http://archives.postgresql.org/pgsql-patches/2005-12/msg00257.php
 | |
| 
 | |
| I guess with the ALTER commands I don't see much value in the
 | |
| --single-transaction flag.  I am sure others suggested it, but would
 | |
| they suggest it now given our current direction.  The fact that the
 | |
| patch was submitted does not give it any more weight --- the question is
 | |
| does this feature make sense for 8.2.  The goal is not to cram as many
 | |
| optimizations into PostgreSQL as possible, the goal is to present a
 | |
| consistent usable system to users.
 | |
| 
 | |
| > > So, to summarize, I think we should add DROP/TRUNCATE, and use that by
 | |
| > > default (or optionally off?) in pg_dump, and, assuming we want EXCLUSIVE
 | |
| > > for more than just COPY, we need to add ALTER TABLE EXCLUSIVE.
 | |
| > 
 | |
| > Would you mind stating again what you mean, just so I can understand
 | |
| > this? Your summary isn't enough.
 | |
| 
 | |
| New ALTER TABLE mode, perhaps call it PERSISTENCE:
 | |
| 
 | |
| 	ALTER TABLE tab PERSISTENCE DROP ON RECOVERY
 | |
| 	ALTER TABLE tab PERSISTENCE TRUNCATE ON RECOVERY
 | |
| 
 | |
| These would drop or truncate all tables with this flag on a non-clean
 | |
| start of the postmaster, and write something in the server logs. 
 | |
| However, I don't know that we have the code in place to DROP/TRUNCATE in
 | |
| recovery mode, and it would affect all databases, so it could be quite
 | |
| complex to implement.  In this mode, no WAL logs would be written for
 | |
| table modifications, though DDL commands would have to be logged.
 | |
| 
 | |
| 	ALTER TABLE tab PERSISTENCE PRESERVE (or STABLE?)
 | |
| 
 | |
| Table contents are preserved across recoveries, but data modifications
 | |
| can happen only one at a time.  I don't think we have a lock mode that
 | |
| does this, so I am worried a new lock mode will have to be created.  A
 | |
| simplified solution at this stage would be to take an exclusive lock on
 | |
| the table, but really we just need a single-writer table lock, which I
 | |
| don't think we have. initially this can implemented to only affect COPY
 | |
| but later can be done for other commands. 
 | |
| 
 | |
| 	ALTER TABLE tab PERSISTENCE DEFAULT
 | |
| 
 | |
| This would be our current default mode, which is full concurrency and
 | |
| persistence.
 | |
| 
 | |
| It took me over an hour to write this, but I feel the time is worth it
 | |
| because of the number of users who use our software.
 | |
|  
 | |
| -- 
 | |
|   Bruce Momjian                        |  http://candle.pha.pa.us
 | |
|   pgman@candle.pha.pa.us               |  (610) 359-1001
 | |
|   +  If your life is a hard drive,     |  13 Roberts Road
 | |
|   +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
 | |
| 
 | |
| ---------------------------(end of broadcast)---------------------------
 | |
| TIP 2: Don't 'kill -9' the postmaster
 | |
| 
 | |
| From pgsql-hackers-owner+M78076=pgman=candle.pha.pa.us@postgresql.org Fri Dec 30 17:37:00 2005
 | |
| X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
 | |
| X-Greylist: from auto-whitelisted by SQLgrey-
 | |
| To: Bruce Momjian <pgman@candle.pha.pa.us>
 | |
| cc: Simon Riggs <simon@2ndquadrant.com>, Andrew Dunstan <andrew@dunslane.net>, 
 | |
| 	   Tom Lane <tgl@sss.pgh.pa.us>, Martijn van Oosterhout <kleptog@svana.org>, 
 | |
| 	   Greg Stark <gsstark@mit.edu>, Rod Taylor <pg@rbt.ca>, 
 | |
| 	   Qingqing Zhou <zhouqq@cs.toronto.edu>, pgsql-hackers@postgresql.org
 | |
| Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
 | |
| References: <200512302114.jBULEno02301@candle.pha.pa.us>
 | |
| In-Reply-To: <200512302114.jBULEno02301@candle.pha.pa.us>
 | |
| From: Greg Stark <gsstark@mit.edu>
 | |
| Organization: The Emacs Conspiracy; member since 1992
 | |
| Date: 30 Dec 2005 17:36:24 -0500
 | |
| Message-ID: <87mzii8b6f.fsf@stark.xeocode.com>
 | |
| Lines: 28
 | |
| User-Agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.4
 | |
| X-Virus-Scanned: by amavisd-new at hub.org
 | |
| X-Spam-Status: No, score=0.113 required=5 tests=[AWL=0.113]
 | |
| X-Spam-Score: 0.113
 | |
| X-Spam-Level: 
 | |
| X-Mailing-List: pgsql-hackers
 | |
| List-Archive: <http://archives.postgresql.org/pgsql-hackers>
 | |
| List-Help: <mailto:majordomo@postgresql.org?body=help>
 | |
| List-Id: <pgsql-hackers.postgresql.org>
 | |
| List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
 | |
| List-Post: <mailto:pgsql-hackers@postgresql.org>
 | |
| List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
 | |
| List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Content-Length:  1424
 | |
| 
 | |
| 
 | |
| As far as EXCLUSIVE or COPY LOCK goes, I think this would be useful
 | |
| functionality but perhaps there doesn't have to be any proprietary user
 | |
| interface to it at all. Why not just check if the conditions are already
 | |
| present to allow the optimization and if so go ahead.
 | |
| 
 | |
| That is, if the current transaction already has an exclusive lock on the table
 | |
| and there are no indexes (and PITR isn't active) then Postgres could go ahead
 | |
| and use the same WAL skipping logic as the other operations that already so
 | |
| so. This would work for inserts whether coming from COPY or plain SQL INSERTs.
 | |
| 
 | |
| The nice thing about this is that the user's SQL wouldn't need any proprietary
 | |
| extensions at all. Just tell people to do
 | |
| 
 | |
| BEGIN;
 | |
| LOCK TABLE foo;
 | |
| COPY foo from ...
 | |
| COMMIT;
 | |
| 
 | |
| There could be a COPY LOCK option to obtain a lock, but it would be purely for
 | |
| user convenience so they don't have to bother with BEGIN and COMMIt.
 | |
| 
 | |
| The only downside is a check to see if an exclusive table lock is present on
 | |
| every copy and insert. That might be significant but perhaps there are ways to
 | |
| finess that. If not perhaps only doing it on COPY would be a good compromise.
 | |
| 
 | |
| -- 
 | |
| greg
 | |
| 
 | |
| 
 | |
| ---------------------------(end of broadcast)---------------------------
 | |
| TIP 1: if posting/reading through Usenet, please send an appropriate
 | |
|        subscribe-nomail command to majordomo@postgresql.org so that your
 | |
|        message can get through to the mailing list cleanly
 | |
| 
 | |
| From pgsql-hackers-owner+M78077=pgman=candle.pha.pa.us@postgresql.org Fri Dec 30 17:47:18 2005
 | |
| X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
 | |
| X-Greylist: from auto-whitelisted by SQLgrey-
 | |
| From: Bruce Momjian <pgman@candle.pha.pa.us>
 | |
| Message-ID: <200512302246.jBUMkjF25196@candle.pha.pa.us>
 | |
| Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
 | |
| In-Reply-To: <87mzii8b6f.fsf@stark.xeocode.com>
 | |
| To: Greg Stark <gsstark@mit.edu>
 | |
| Date: Fri, 30 Dec 2005 17:46:45 -0500 (EST)
 | |
| cc: Simon Riggs <simon@2ndquadrant.com>, Andrew Dunstan <andrew@dunslane.net>, 
 | |
| 	   Tom Lane <tgl@sss.pgh.pa.us>, Martijn van Oosterhout <kleptog@svana.org>, 
 | |
| 	   Rod Taylor <pg@rbt.ca>, Qingqing Zhou <zhouqq@cs.toronto.edu>, 
 | |
| 	  pgsql-hackers@postgresql.org
 | |
| X-Mailer: ELM [version 2.4ME+ PL121 (25)]
 | |
| X-Virus-Scanned: by amavisd-new at hub.org
 | |
| X-Spam-Status: No, score=0.12 required=5 tests=[AWL=0.120]
 | |
| X-Spam-Score: 0.12
 | |
| X-Spam-Level: 
 | |
| X-Mailing-List: pgsql-hackers
 | |
| List-Archive: <http://archives.postgresql.org/pgsql-hackers>
 | |
| List-Help: <mailto:majordomo@postgresql.org?body=help>
 | |
| List-Id: <pgsql-hackers.postgresql.org>
 | |
| List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
 | |
| List-Post: <mailto:pgsql-hackers@postgresql.org>
 | |
| List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
 | |
| List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Content-Length:  2135
 | |
| 
 | |
| Greg Stark wrote:
 | |
| > 
 | |
| > As far as EXCLUSIVE or COPY LOCK goes, I think this would be useful
 | |
| > functionality but perhaps there doesn't have to be any proprietary user
 | |
| > interface to it at all. Why not just check if the conditions are already
 | |
| > present to allow the optimization and if so go ahead.
 | |
| > 
 | |
| > That is, if the current transaction already has an exclusive lock on the table
 | |
| > and there are no indexes (and PITR isn't active) then Postgres could go ahead
 | |
| > and use the same WAL skipping logic as the other operations that already so
 | |
| > so. This would work for inserts whether coming from COPY or plain SQL INSERTs.
 | |
| > 
 | |
| > The nice thing about this is that the user's SQL wouldn't need any proprietary
 | |
| > extensions at all. Just tell people to do
 | |
| > 
 | |
| > BEGIN;
 | |
| > LOCK TABLE foo;
 | |
| > COPY foo from ...
 | |
| > COMMIT;
 | |
| > 
 | |
| > There could be a COPY LOCK option to obtain a lock, but it would be purely for
 | |
| > user convenience so they don't have to bother with BEGIN and COMMIt.
 | |
| > 
 | |
| > The only downside is a check to see if an exclusive table lock is present on
 | |
| > every copy and insert. That might be significant but perhaps there are ways to
 | |
| > finess that. If not perhaps only doing it on COPY would be a good compromise.
 | |
| 
 | |
| Well, again, if we wanted to use EXCLUSIVE only for COPY, this might
 | |
| make sense.  However, also consider that the idea for EXCLUSIVE was that
 | |
| users could continue read-only queries on the table while it is being
 | |
| loaded (like COPY allows now), and that in EXCLUSIVE mode, we are only
 | |
| going to write into new pages.  
 | |
| 
 | |
| If someone has an exclusive lock on the table and does a COPY or SELECT
 | |
| INTO do we want to assume we are only going to write into new pages, and
 | |
| do we want to force an exclusive lock rather than a single-writer lock? 
 | |
| I don't think so.
 | |
| 
 | |
| -- 
 | |
|   Bruce Momjian                        |  http://candle.pha.pa.us
 | |
|   pgman@candle.pha.pa.us               |  (610) 359-1001
 | |
|   +  If your life is a hard drive,     |  13 Roberts Road
 | |
|   +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
 | |
| 
 | |
| ---------------------------(end of broadcast)---------------------------
 | |
| TIP 6: explain analyze is your friend
 | |
| 
 | |
| From mpaesold@gmx.at Sat Dec 31 06:59:51 2005
 | |
| Date: Sat, 31 Dec 2005 12:59:44 +0100 (MET)
 | |
| From: Michael Paesold <mpaesold@gmx.at>
 | |
| To: Bruce Momjian <pgman@candle.pha.pa.us>
 | |
| cc: simon@2ndquadrant.com, andrew@dunslane.net, tgl@sss.pgh.pa.us, 
 | |
| 	  kleptog@svana.org, gsstark@mit.edu, pg@rbt.ca, zhouqq@cs.toronto.edu, 
 | |
| 	  pgsql-hackers@postgresql.org
 | |
| References: <200512302114.jBULEno02301@candle.pha.pa.us>
 | |
| Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
 | |
| X-Priority: 3 (Normal)
 | |
| X-Authenticated: #1946847
 | |
| Message-ID: <14969.1136030384@www6.gmx.net>
 | |
| X-Mailer: WWW-Mail 1.6 (Global Message Exchange)
 | |
| X-Flags: 0001
 | |
| Content-Length:  1305
 | |
| 
 | |
| Bruce Momjian wrote:
 | |
| 
 | |
| > > The --single-transaction mode would apply even if the dump was created
 | |
| > > using an earlier version of pg_dump. pg_dump has *not* been altered at
 | |
| > > all. (And I would again add that the idea was not my own)
 | |
| > 
 | |
| > I assume you mean this:
 | |
| > 
 | |
| > 	http://archives.postgresql.org/pgsql-patches/2005-12/msg00257.php
 | |
| > 
 | |
| > I guess with the ALTER commands I don't see much value in the
 | |
| > --single-transaction flag.  I am sure others suggested it, but would
 | |
| > they suggest it now given our current direction.
 | |
| 
 | |
| I just want to add that --single-transaction has a value of it's own. There
 | |
| were times when I wanted to restore parts of a dump all-or-nothing. 
 | |
| 
 | |
| This is possible with PostgreSQL, unlike many other DBM systems, because
 | |
| people like Tom Lane have invested in ensuring that all DDL is working
 | |
| without implicitly committing an enclosing transaction.
 | |
| 
 | |
| Using pg_restore directly into a database, it is not possible to get a
 | |
| single transaction right now. One has to restore to a file and manually
 | |
| added BEGIN/COMMIT. Just for that I think --single-transaction is a great
 | |
| addition and a missing feature.
 | |
| 
 | |
| I think more people have a use-case for that.
 | |
| 
 | |
| Best Regards,
 | |
| Michael Paesold
 | |
| 
 | |
| -- 
 | |
| Telefonieren Sie schon oder sparen Sie noch?
 | |
| NEU: GMX Phone_Flat http://www.gmx.net/de/go/telefonie
 | |
| 
 | |
| From pgsql-hackers-owner+M78213=pgman=candle.pha.pa.us@postgresql.org Tue Jan  3 12:08:43 2006
 | |
| X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
 | |
| X-Greylist: from auto-whitelisted by SQLgrey-
 | |
| From: Bruce Momjian <pgman@candle.pha.pa.us>
 | |
| Message-ID: <200601031708.k03H85j27170@candle.pha.pa.us>
 | |
| Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
 | |
| In-Reply-To: <17173.1136306881@sss.pgh.pa.us>
 | |
| To: Tom Lane <tgl@sss.pgh.pa.us>
 | |
| Date: Tue, 3 Jan 2006 12:08:05 -0500 (EST)
 | |
| cc: Jim C. Nasby <jnasby@pervasive.com>, 
 | |
| 	   Andrew Dunstan <andrew@dunslane.net>, simon@2ndquadrant.com, 
 | |
| 	  kleptog@svana.org, gsstark@mit.edu, pg@rbt.ca, zhouqq@cs.toronto.edu, 
 | |
| 	  pgsql-hackers@postgresql.org
 | |
| X-Mailer: ELM [version 2.4ME+ PL121 (25)]
 | |
| X-Virus-Scanned: by amavisd-new at hub.org
 | |
| X-Spam-Status: No, score=0.121 required=5 tests=[AWL=0.121]
 | |
| X-Spam-Score: 0.121
 | |
| X-Spam-Level: 
 | |
| X-Mailing-List: pgsql-hackers
 | |
| List-Archive: <http://archives.postgresql.org/pgsql-hackers>
 | |
| List-Help: <mailto:majordomo@postgresql.org?body=help>
 | |
| List-Id: <pgsql-hackers.postgresql.org>
 | |
| List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
 | |
| List-Post: <mailto:pgsql-hackers@postgresql.org>
 | |
| List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
 | |
| List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Content-Length:  1125
 | |
| 
 | |
| Tom Lane wrote:
 | |
| > "Jim C. Nasby" <jnasby@pervasive.com> writes:
 | |
| > > On Tue, Jan 03, 2006 at 11:26:51AM -0500, Tom Lane wrote:
 | |
| > >> Such an ALTER would certainly require exclusive lock on the table,
 | |
| > >> so I'm not sure that I see much use-case for doing it like that.
 | |
| > >> You'd want to do the ALTER and commit so as not to lock other people
 | |
| > >> out of the table entirely while doing the bulk data-pushing.
 | |
| > 
 | |
| > > Maybe this just isn't clear, but would EXCLUSIVE block writes from all
 | |
| > > other sessions then?
 | |
| > 
 | |
| > I don't think it should (which implies that EXCLUSIVE is a bad name).
 | |
| 
 | |
| Agreed, EXCLUSIVE was used to mean an _exclusive_ writer.  The new words
 | |
| I proposed were PRESERVE or STABLE.
 | |
| 
 | |
| -- 
 | |
|   Bruce Momjian                        |  http://candle.pha.pa.us
 | |
|   pgman@candle.pha.pa.us               |  (610) 359-1001
 | |
|   +  If your life is a hard drive,     |  13 Roberts Road
 | |
|   +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
 | |
| 
 | |
| ---------------------------(end of broadcast)---------------------------
 | |
| TIP 4: Have you searched our list archives?
 | |
| 
 | |
|                http://archives.postgresql.org
 | |
| 
 | |
| From tgl@sss.pgh.pa.us Tue Jan  3 12:37:34 2006
 | |
| To: Stephen Frost <sfrost@snowman.net>
 | |
| cc: Jim C. Nasby <jnasby@pervasive.com>, 
 | |
| 	   Bruce Momjian <pgman@candle.pha.pa.us>, 
 | |
| 	   Andrew Dunstan <andrew@dunslane.net>, kleptog@svana.org, 
 | |
| 	  simon@2ndquadrant.com, gsstark@mit.edu, pg@rbt.ca, zhouqq@cs.toronto.edu, 
 | |
| 	  pgsql-hackers@postgresql.org
 | |
| Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and 
 | |
| In-Reply-To: <20060103165359.GP6026@ns.snowman.net> 
 | |
| References: <200512291605.jBTG5gi00396@candle.pha.pa.us>  <7966.1135873468@sss.pgh.pa.us>  <20060103154521.GC82560@pervasive.com>  <20060103162137.GO6026@ns.snowman.net>  <16856.1136305742@sss.pgh.pa.us>  <20060103165359.GP6026@ns.snowman.net>
 | |
| Comments: In-reply-to Stephen Frost <sfrost@snowman.net>
 | |
| 	message dated "Tue, 03 Jan 2006 11:54:01 -0500"
 | |
| Date: Tue, 03 Jan 2006 12:37:32 -0500
 | |
| Message-ID: <17841.1136309852@sss.pgh.pa.us>
 | |
| From: Tom Lane <tgl@sss.pgh.pa.us>
 | |
| Content-Length:   976
 | |
| 
 | |
| Stephen Frost <sfrost@snowman.net> writes:
 | |
| > The problem is that you might want to grant 'truncate' to people who
 | |
| > *aren't* particularly trusted.  For truncate, at least I have a
 | |
| > real-world use-case for it.
 | |
| 
 | |
| I don't find this use-case particularly convincing.  If the users are
 | |
| allowed to delete all data in a given table, then that table must be
 | |
| dedicated to them anyway; so it's not that easy to see why you can't
 | |
| risk giving them ownership rights on it.  The worst they can do is
 | |
| screw up their own data, no?
 | |
| 
 | |
| In any case, I don't see what's so wrong with the model of using
 | |
| SECURITY DEFINER interface functions when you want a security
 | |
| restriction that's finer-grain than the system provides.  I really
 | |
| *don't* want to see us trying to, say, categorize every variety of
 | |
| ALTER TABLE as a separately grantable privilege.  I could live with
 | |
| something like a catchall "ADMIN" privilege ... except it's not
 | |
| clear how that would differ from ownership.
 | |
| 
 | |
| 			regards, tom lane
 | |
| 
 | |
| From pgsql-hackers-owner+M78221=pgman=candle.pha.pa.us@postgresql.org Tue Jan  3 13:30:34 2006
 | |
| X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
 | |
| X-Greylist: from auto-whitelisted by SQLgrey-
 | |
| Date: Tue, 3 Jan 2006 13:30:56 -0500
 | |
| From: Stephen Frost <sfrost@snowman.net>
 | |
| To: Tom Lane <tgl@sss.pgh.pa.us>
 | |
| cc: Jim C. Nasby <jnasby@pervasive.com>, 
 | |
| 	   Bruce Momjian <pgman@candle.pha.pa.us>, 
 | |
| 	   Andrew Dunstan <andrew@dunslane.net>, kleptog@svana.org, 
 | |
| 	  simon@2ndquadrant.com, gsstark@mit.edu, pg@rbt.ca, zhouqq@cs.toronto.edu, 
 | |
| 	  pgsql-hackers@postgresql.org
 | |
| Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
 | |
| Message-ID: <20060103183056.GR6026@ns.snowman.net>
 | |
| Mail-Followup-To: Tom Lane <tgl@sss.pgh.pa.us>,
 | |
| 	"Jim C. Nasby" <jnasby@pervasive.com>,
 | |
| 	Bruce Momjian <pgman@candle.pha.pa.us>,
 | |
| 	Andrew Dunstan <andrew@dunslane.net>, kleptog@svana.org,
 | |
| 	simon@2ndquadrant.com, gsstark@mit.edu, pg@rbt.ca,
 | |
| 	zhouqq@cs.toronto.edu, pgsql-hackers@postgresql.org
 | |
| References: <200512291605.jBTG5gi00396@candle.pha.pa.us>  <7966.1135873468@sss.pgh.pa.us>  <20060103154521.GC82560@pervasive.com>  <20060103162137.GO6026@ns.snowman.net>  <16856.1136305742@sss.pgh.pa.us>  <20060103165359.GP6026@ns.snowman.net>  <17841.1136309852@sss.pgh.pa.us>
 | |
| Content-Disposition: inline
 | |
| In-Reply-To: <17841.1136309852@sss.pgh.pa.us>
 | |
| X-Editor: Vim http://www.vim.org/
 | |
| X-Info: http://www.snowman.net
 | |
| X-Operating-System: Linux/2.4.24ns.3.0 (i686)
 | |
| X-Uptime: 12:39:16 up 206 days,  9:50, 11 users,  load average: 0.02, 0.05, 0.05
 | |
| User-Agent: Mutt/1.5.9i
 | |
| X-Virus-Scanned: by amavisd-new at hub.org
 | |
| X-Spam-Status: No, score=0.105 required=5 tests=[AWL=0.105]
 | |
| X-Spam-Score: 0.105
 | |
| X-Spam-Level: 
 | |
| X-Mailing-List: pgsql-hackers
 | |
| List-Archive: <http://archives.postgresql.org/pgsql-hackers>
 | |
| List-Help: <mailto:majordomo@postgresql.org?body=help>
 | |
| List-Id: <pgsql-hackers.postgresql.org>
 | |
| List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
 | |
| List-Post: <mailto:pgsql-hackers@postgresql.org>
 | |
| List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
 | |
| List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Content-Length:  2666
 | |
| 
 | |
| -- Start of PGP signed section.
 | |
| * Tom Lane (tgl@sss.pgh.pa.us) wrote:
 | |
| > I don't find this use-case particularly convincing.  If the users are
 | |
| > allowed to delete all data in a given table, then that table must be
 | |
| > dedicated to them anyway; so it's not that easy to see why you can't
 | |
| > risk giving them ownership rights on it.  The worst they can do is
 | |
| > screw up their own data, no?
 | |
| 
 | |
| Being able to delete all data in a given table in no way implies
 | |
| ownership rights.  The tables are part of a specification which the
 | |
| users are being asked to respond to.  Being able to change the table
 | |
| types or remove the constraints put on the tables would allow the 
 | |
| users to upload garbage which would then affect downstream processing.
 | |
| 
 | |
| We can't guarentee this won't happen anyway but we try to confine the
 | |
| things they can mess up to a reasonable set which we can check for (and
 | |
| do, through a rather involved error checking system).  There are *alot*
 | |
| of things built on top of the table structures and having them change
 | |
| would basically break the whole system (without the appropriate changes
 | |
| being made to the other parts of the system).
 | |
| 
 | |
| > In any case, I don't see what's so wrong with the model of using
 | |
| > SECURITY DEFINER interface functions when you want a security
 | |
| > restriction that's finer-grain than the system provides.  I really
 | |
| > *don't* want to see us trying to, say, categorize every variety of
 | |
| > ALTER TABLE as a separately grantable privilege.  I could live with
 | |
| > something like a catchall "ADMIN" privilege ... except it's not
 | |
| > clear how that would differ from ownership.
 | |
| 
 | |
| I don't think anyone's asked for 'ALTER TABLE' privileges to be
 | |
| seperately grantable.  It seems to me that the privileges which *need*
 | |
| to be grantable are ones associated with DML statements.  I would 
 | |
| classify TRUNCATE, VACUUM and ANALYZE as DML statements (along with 
 | |
| select, insert, update, and delete).  They're PostgreSQL-specific DML 
 | |
| statements but they still fall into that category.  I don't think 
 | |
| it's a coincidence that the SQL-defined DML statements are all, 
 | |
| individually, grantable.
 | |
| 
 | |
| That doesn't mean I think we should get rid of RULE, REFERENCES or
 | |
| TRIGGER, though honestly I've very rarely needed to grant any of them 
 | |
| (I don't think I've ever granted RULE or TRIGGER...).  References is
 | |
| DDL-oriented, but for *other* tables; RULE and TRIGGER are DDL and I
 | |
| can't really justify why someone other than the owner would need them
 | |
| but I'm guessing someone's using them.  I don't think their existance
 | |
| should imply that if we ever change the grants again we have to include
 | |
| all types of 'ALTER TABLE', etc, though.
 | |
| 
 | |
| 	Thanks,
 | |
| 
 | |
| 		Stephen
 | |
| -- End of PGP section, PGP failed!
 | |
| 
 | |
| From sfrost@snowman.net Tue Jan  3 13:30:13 2006
 | |
| Date: Tue, 3 Jan 2006 13:30:56 -0500
 | |
| From: Stephen Frost <sfrost@snowman.net>
 | |
| To: Tom Lane <tgl@sss.pgh.pa.us>
 | |
| cc: Jim C. Nasby <jnasby@pervasive.com>, 
 | |
| 	   Bruce Momjian <pgman@candle.pha.pa.us>, 
 | |
| 	   Andrew Dunstan <andrew@dunslane.net>, kleptog@svana.org, 
 | |
| 	  simon@2ndquadrant.com, gsstark@mit.edu, pg@rbt.ca, zhouqq@cs.toronto.edu, 
 | |
| 	  pgsql-hackers@postgresql.org
 | |
| Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
 | |
| Message-ID: <20060103183056.GR6026@ns.snowman.net>
 | |
| Mail-Followup-To: Tom Lane <tgl@sss.pgh.pa.us>,
 | |
| 	"Jim C. Nasby" <jnasby@pervasive.com>,
 | |
| 	Bruce Momjian <pgman@candle.pha.pa.us>,
 | |
| 	Andrew Dunstan <andrew@dunslane.net>, kleptog@svana.org,
 | |
| 	simon@2ndquadrant.com, gsstark@mit.edu, pg@rbt.ca,
 | |
| 	zhouqq@cs.toronto.edu, pgsql-hackers@postgresql.org
 | |
| References: <200512291605.jBTG5gi00396@candle.pha.pa.us>  <7966.1135873468@sss.pgh.pa.us>  <20060103154521.GC82560@pervasive.com>  <20060103162137.GO6026@ns.snowman.net>  <16856.1136305742@sss.pgh.pa.us>  <20060103165359.GP6026@ns.snowman.net>  <17841.1136309852@sss.pgh.pa.us>
 | |
| Content-Disposition: inline
 | |
| In-Reply-To: <17841.1136309852@sss.pgh.pa.us>
 | |
| X-Editor: Vim http://www.vim.org/
 | |
| X-Info: http://www.snowman.net
 | |
| X-Operating-System: Linux/2.4.24ns.3.0 (i686)
 | |
| X-Uptime: 12:39:16 up 206 days,  9:50, 11 users,  load average: 0.02, 0.05, 0.05
 | |
| User-Agent: Mutt/1.5.9i
 | |
| Content-Length:  2666
 | |
| 
 | |
| -- Start of PGP signed section.
 | |
| * Tom Lane (tgl@sss.pgh.pa.us) wrote:
 | |
| > I don't find this use-case particularly convincing.  If the users are
 | |
| > allowed to delete all data in a given table, then that table must be
 | |
| > dedicated to them anyway; so it's not that easy to see why you can't
 | |
| > risk giving them ownership rights on it.  The worst they can do is
 | |
| > screw up their own data, no?
 | |
| 
 | |
| Being able to delete all data in a given table in no way implies
 | |
| ownership rights.  The tables are part of a specification which the
 | |
| users are being asked to respond to.  Being able to change the table
 | |
| types or remove the constraints put on the tables would allow the 
 | |
| users to upload garbage which would then affect downstream processing.
 | |
| 
 | |
| We can't guarentee this won't happen anyway but we try to confine the
 | |
| things they can mess up to a reasonable set which we can check for (and
 | |
| do, through a rather involved error checking system).  There are *alot*
 | |
| of things built on top of the table structures and having them change
 | |
| would basically break the whole system (without the appropriate changes
 | |
| being made to the other parts of the system).
 | |
| 
 | |
| > In any case, I don't see what's so wrong with the model of using
 | |
| > SECURITY DEFINER interface functions when you want a security
 | |
| > restriction that's finer-grain than the system provides.  I really
 | |
| > *don't* want to see us trying to, say, categorize every variety of
 | |
| > ALTER TABLE as a separately grantable privilege.  I could live with
 | |
| > something like a catchall "ADMIN" privilege ... except it's not
 | |
| > clear how that would differ from ownership.
 | |
| 
 | |
| I don't think anyone's asked for 'ALTER TABLE' privileges to be
 | |
| seperately grantable.  It seems to me that the privileges which *need*
 | |
| to be grantable are ones associated with DML statements.  I would 
 | |
| classify TRUNCATE, VACUUM and ANALYZE as DML statements (along with 
 | |
| select, insert, update, and delete).  They're PostgreSQL-specific DML 
 | |
| statements but they still fall into that category.  I don't think 
 | |
| it's a coincidence that the SQL-defined DML statements are all, 
 | |
| individually, grantable.
 | |
| 
 | |
| That doesn't mean I think we should get rid of RULE, REFERENCES or
 | |
| TRIGGER, though honestly I've very rarely needed to grant any of them 
 | |
| (I don't think I've ever granted RULE or TRIGGER...).  References is
 | |
| DDL-oriented, but for *other* tables; RULE and TRIGGER are DDL and I
 | |
| can't really justify why someone other than the owner would need them
 | |
| but I'm guessing someone's using them.  I don't think their existance
 | |
| should imply that if we ever change the grants again we have to include
 | |
| all types of 'ALTER TABLE', etc, though.
 | |
| 
 | |
| 	Thanks,
 | |
| 
 | |
| 		Stephen
 | |
| -- End of PGP section, PGP failed!
 | |
| 
 | |
| From pgsql-hackers-owner+M78233=pgman=candle.pha.pa.us@postgresql.org Tue Jan  3 17:39:06 2006
 | |
| X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
 | |
| X-Greylist: from auto-whitelisted by SQLgrey-
 | |
| From: Bruce Momjian <pgman@candle.pha.pa.us>
 | |
| Message-ID: <200601032238.k03McP804163@candle.pha.pa.us>
 | |
| Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
 | |
| In-Reply-To: <20060103212750.GT82560@pervasive.com>
 | |
| To: Jim C. Nasby <jnasby@pervasive.com>
 | |
| Date: Tue, 3 Jan 2006 17:38:25 -0500 (EST)
 | |
| cc: Tom Lane <tgl@sss.pgh.pa.us>, Andrew Dunstan <andrew@dunslane.net>, 
 | |
| 	  simon@2ndquadrant.com, kleptog@svana.org, gsstark@mit.edu, pg@rbt.ca, 
 | |
| 	  zhouqq@cs.toronto.edu, pgsql-hackers@postgresql.org
 | |
| X-Mailer: ELM [version 2.4ME+ PL121 (25)]
 | |
| X-Virus-Scanned: by amavisd-new at hub.org
 | |
| X-Spam-Status: No, score=0.121 required=5 tests=[AWL=0.121]
 | |
| X-Spam-Score: 0.121
 | |
| X-Spam-Level: 
 | |
| X-Mailing-List: pgsql-hackers
 | |
| List-Archive: <http://archives.postgresql.org/pgsql-hackers>
 | |
| List-Help: <mailto:majordomo@postgresql.org?body=help>
 | |
| List-Id: <pgsql-hackers.postgresql.org>
 | |
| List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
 | |
| List-Post: <mailto:pgsql-hackers@postgresql.org>
 | |
| List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
 | |
| List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Content-Length:  1714
 | |
| 
 | |
| Jim C. Nasby wrote:
 | |
| > > We would be creating a new lock type for this.
 | |
| > 
 | |
| > Sorry if I've just missed this in the thread, but what would  the new
 | |
| > lock type do? My impression is that as it stands you can either do:
 | |
| > 
 | |
| > BEGIN;
 | |
| > ALTER TABLE EXCLUSIVE;
 | |
| > ...
 | |
| > ALTER TABLE SHARE; --fsync
 | |
| > COMMIT;
 | |
| > 
 | |
| > Which would block all other access to the table as soon as the first
 | |
| > ALTER TABLE happens. Or you can:
 | |
| > 
 | |
| > ALTER TABLE EXCLUSIVE;
 | |
| > ...
 | |
| > ALTER TABLE SHARE;
 | |
| > 
 | |
| > Which means that between the two ALTER TABLES every backend that does
 | |
| > DML on that table will not have that DML logged, but because there's no
 | |
| > exclusive lock that DML would be allowed to occur.
 | |
| 
 | |
| Right, the DML will be single-threaded and fsync of all dirty pages will
 | |
| happen before commit of each transaction.
 | |
| 
 | |
| > BTW, there might be some usecase for the second scenario, in which case
 | |
| > it would probably be better to tell the user to aquire a table-lock on
 | |
| > their own rather than do it automatically as part of the update...
 | |
| 
 | |
| > > Basically meaning your idea of update while EXCLUSIVE/PRESERVE/STABLE is
 | |
| > > happening is never going to be implemented because it is just too hard
 | |
| > > to do, and too prone to error.
 | |
| > 
 | |
| > What I figured. Never hurts to ask though. :)
 | |
| 
 | |
| Actually, it does hurt because it generates discussion volume for no
 | |
| purpose.
 | |
| 
 | |
| -- 
 | |
|   Bruce Momjian                        |  http://candle.pha.pa.us
 | |
|   pgman@candle.pha.pa.us               |  (610) 359-1001
 | |
|   +  If your life is a hard drive,     |  13 Roberts Road
 | |
|   +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
 | |
| 
 | |
| ---------------------------(end of broadcast)---------------------------
 | |
| TIP 5: don't forget to increase your free space map settings
 | |
| 
 | |
| From pgsql-hackers-owner+M78234=pgman=candle.pha.pa.us@postgresql.org Tue Jan  3 17:54:16 2006
 | |
| X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
 | |
| X-Greylist: from auto-whitelisted by SQLgrey-
 | |
| Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
 | |
| From: Simon Riggs <simon@2ndquadrant.com>
 | |
| To: Bruce Momjian <pgman@candle.pha.pa.us>
 | |
| cc: Andrew Dunstan <andrew@dunslane.net>, Tom Lane <tgl@sss.pgh.pa.us>, 
 | |
| 	   Martijn van Oosterhout <kleptog@svana.org>, Greg Stark <gsstark@mit.edu>, 
 | |
| 	   Rod Taylor <pg@rbt.ca>, Qingqing Zhou <zhouqq@cs.toronto.edu>, 
 | |
| 	  pgsql-hackers@postgresql.org
 | |
| In-Reply-To: <200512302114.jBULEno02301@candle.pha.pa.us>
 | |
| References: <200512302114.jBULEno02301@candle.pha.pa.us>
 | |
| Date: Tue, 03 Jan 2006 22:53:53 +0000
 | |
| Message-ID: <1136328833.5052.223.camel@localhost.localdomain>
 | |
| X-Mailer: Evolution 2.2.3 (2.2.3-2.fc4) 
 | |
| X-Virus-Scanned: by amavisd-new at hub.org
 | |
| X-Spam-Status: No, score=0.04 required=5 tests=[AWL=0.040]
 | |
| X-Spam-Score: 0.04
 | |
| X-Spam-Level: 
 | |
| X-Mailing-List: pgsql-hackers
 | |
| List-Archive: <http://archives.postgresql.org/pgsql-hackers>
 | |
| List-Help: <mailto:majordomo@postgresql.org?body=help>
 | |
| List-Id: <pgsql-hackers.postgresql.org>
 | |
| List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
 | |
| List-Post: <mailto:pgsql-hackers@postgresql.org>
 | |
| List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
 | |
| List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Content-Length:  5373
 | |
| 
 | |
| On Fri, 2005-12-30 at 16:14 -0500, Bruce Momjian wrote:
 | |
| > Simon Riggs wrote:
 | |
| > > The implications of EXCLUSIVE are:
 | |
| > > - there will be a check on each and every I, U, D to check the state of
 | |
| > > the relation
 | |
| > > - *every* operation that attempts a write lock will attempt to acquire
 | |
| > > an EXCLUSIVE full table lock instead
 | |
| > > - following successful completion of *each* DML statement, the relation
 | |
| > > will be heap_sync'd involving a full scan of the buffer cache
 | |
| > 
 | |
| > Yes, I think that is it.  What we can do is implement EXCLUSIVE to
 | |
| > affect only COPY at this point, and document that, and later add other
 | |
| > commands.
 | |
| > 
 | |
| > > Can I clarify the wording of the syntax? Is EXCLUSIVE the right word?
 | |
| > > How about FASTLOAD or BULKLOAD? Those words seem less likely to be
 | |
| > > misused in the future - i.e. we are invoking a special mode, rather than
 | |
| > > invoking a special "go faster" option.
 | |
| > 
 | |
| > The problem with the FASTLOAD/BULKLOAD words is that EXCLUSIVE mode is
 | |
| > probably not the best for loading.  I would think TRUNCATE would be a
 | |
| > better option.
 | |
| > 
 | |
| > In fact, in loading a table, I think both EXCLUSIVE and TRUNCATE would be
 | |
| > the same, mostly.  You would create the table, set its RELIABILITY to
 | |
| > TRUNCATE, COPY into the table, then set the RELIABILITY to SHARE or
 | |
| > DEFAULT.  The second ALTER has to sync all the dirty data blocks, which
 | |
| > the same thing EXCLUSIVE does at the conclusion of COPY.
 | |
| > 
 | |
| > So, we need a name for EXCLUSIVE mode that suggests how it is different
 | |
| > from TRUNCATE, and in this case, the difference is that EXCLUSIVE
 | |
| > preserves the previous contents of the table on recovery, while TRUNCATE
 | |
| > does not.  Do you want to call the mode PRESERVE, or EXCLUSIVE WRITER?
 | |
| > Anyway, the keywords are easy to modify, even after the patch is
 | |
| > submitted.  FYI, I usually go through keywords.c looking for a keyword
 | |
| > we already use.
 | |
| 
 | |
| I'm very happy for suggestions on what these new modes are called.
 | |
| 
 | |
| > > > So, to summarize, I think we should add DROP/TRUNCATE, and use that by
 | |
| > > > default (or optionally off?) in pg_dump, and, assuming we want EXCLUSIVE
 | |
| > > > for more than just COPY, we need to add ALTER TABLE EXCLUSIVE.
 | |
| > > 
 | |
| > > Would you mind stating again what you mean, just so I can understand
 | |
| > > this? Your summary isn't enough.
 | |
| > 
 | |
| > New ALTER TABLE mode, perhaps call it PERSISTENCE:
 | |
| > 
 | |
| > 	ALTER TABLE tab PERSISTENCE DROP ON RECOVERY
 | |
| > 	ALTER TABLE tab PERSISTENCE TRUNCATE ON RECOVERY
 | |
| > 
 | |
| > These would drop or truncate all tables with this flag on a non-clean
 | |
| > start of the postmaster, and write something in the server logs. 
 | |
| > However, I don't know that we have the code in place to DROP/TRUNCATE in
 | |
| > recovery mode, and it would affect all databases, so it could be quite
 | |
| > complex to implement.  In this mode, no WAL logs would be written for
 | |
| > table modifications, though DDL commands would have to be logged.
 | |
| 
 | |
| Right now, this will be a TODO item... it looks like it will take some
 | |
| thought to implement correctly.
 | |
| 
 | |
| > 	ALTER TABLE tab PERSISTENCE PRESERVE (or STABLE?)
 | |
| > 
 | |
| > Table contents are preserved across recoveries, but data modifications
 | |
| > can happen only one at a time.  I don't think we have a lock mode that
 | |
| > does this, so I am worried a new lock mode will have to be created.  A
 | |
| > simplified solution at this stage would be to take an exclusive lock on
 | |
| > the table, but really we just need a single-writer table lock, which I
 | |
| > don't think we have. initially this can implemented to only affect COPY
 | |
| > but later can be done for other commands. 
 | |
| 
 | |
| ExclusiveLock locks out everything apart from readers, no new lock mode
 | |
| AFAICS. Implementing that is little additional work for COPY.
 | |
| 
 | |
| Tom had a concern about setting this for I, U, D commands via the
 | |
| executor. Not sure what the details of that are, as yet.
 | |
| 
 | |
| We can use either of the unlogged modes for pg_dump, so I'd suggest its
 | |
| this one. Everybody happy with this being the new default in pg_dump, or
 | |
| should it be an option?
 | |
| 
 | |
| > 	ALTER TABLE tab PERSISTENCE DEFAULT
 | |
| > 
 | |
| > This would be our current default mode, which is full concurrency and
 | |
| > persistence.
 | |
| 
 | |
| I'm thinking whether the ALTER TABLE statement might be better with two
 | |
| bool flags rather than a 3-state char.
 | |
| 
 | |
| flag 1: ENABLE LOGGING | DISABLE LOGGING
 | |
| 
 | |
| flag 2: FULL RECOVERY | TRUNCATE ON RECOVERY
 | |
| 
 | |
| Giving 3 possible sets of options:
 | |
| 
 | |
| -- the default
 | |
| ALTER TABLE mytable ENABLE LOGGING FULL RECOVERY; (default)
 | |
| 
 | |
| -- EXCLUSIVE mode
 | |
| ALTER TABLE mytable DISABLE LOGGING FULL RECOVERY;
 | |
| ...which would be used like this
 | |
| 	ALTER TABLE mytable DISABLE LOGGING;
 | |
| 	COPY or other bulk data manipulation SQL
 | |
| 	ALTER TABLE mytable ENABLE LOGGING;
 | |
| ...since FULL RECOVERY is the default.
 | |
| 
 | |
| -- multiuser temp table mode
 | |
| ALTER TABLE mytable DISABLE LOGGING TRUNCATE ON RECOVERY;
 | |
| ...which would usually be left on all the time
 | |
| 
 | |
| which only uses one new keyword LOGGING and yet all the modes are fairly
 | |
| explicit as to what they do.
 | |
| 
 | |
| An alternative might be the slightly more verbose:
 | |
| 	ALTER TABLE mytable DISABLE LOGGING FORCE EXCLUSIVE TABLE LOCK;
 | |
| which would be turned off by
 | |
| 	ALTER TABLE mytable ENABLE LOGGING;
 | |
| 
 | |
| Comments?
 | |
| 
 | |
| Best Regards, Simon Riggs
 | |
| 
 | |
| 
 | |
| ---------------------------(end of broadcast)---------------------------
 | |
| TIP 1: if posting/reading through Usenet, please send an appropriate
 | |
|        subscribe-nomail command to majordomo@postgresql.org so that your
 | |
|        message can get through to the mailing list cleanly
 | |
| 
 | |
| From simon@2ndquadrant.com Tue Jan  3 18:10:32 2006
 | |
| Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
 | |
| From: Simon Riggs <simon@2ndquadrant.com>
 | |
| To: Jim C. Nasby <jnasby@pervasive.com>, 
 | |
| 	   Bruce Momjian <pgman@candle.pha.pa.us>
 | |
| cc: Tom Lane <tgl@sss.pgh.pa.us>, Andrew Dunstan <andrew@dunslane.net>, 
 | |
| 	  kleptog@svana.org, gsstark@mit.edu, pg@rbt.ca, zhouqq@cs.toronto.edu, 
 | |
| 	  pgsql-hackers@postgresql.org
 | |
| In-Reply-To: <200601032120.k03LKl609990@candle.pha.pa.us>
 | |
| References: <200601032120.k03LKl609990@candle.pha.pa.us>
 | |
| Date: Tue, 03 Jan 2006 23:10:16 +0000
 | |
| Message-ID: <1136329816.5052.239.camel@localhost.localdomain>
 | |
| X-Mailer: Evolution 2.2.3 (2.2.3-2.fc4) 
 | |
| Content-Length:  2118
 | |
| 
 | |
| On Tue, 2006-01-03 at 16:20 -0500, Bruce Momjian wrote:
 | |
| > Jim C. Nasby wrote:
 | |
| 
 | |
| > > Idealistically, if EXCLUSIVE/PRESERVE/STABLE does it's thing by only
 | |
| > > appending new pages, it would be nice if other backends could continue
 | |
| > > performing updates at the same time, assuming there's free space
 | |
| > > available elsewhere within the table (and that you'd be able to recover
 | |
| > > those logged changes regardless of the non-logged operations). But
 | |
| > > that's a pretty lofty goal...
 | |
| > 
 | |
| > "Idealistically", yep.  It would be great if we could put a helmet on
 | |
| > and the computer would read your mind.  :-)
 | |
| > 
 | |
| > Basically meaning your idea of update while EXCLUSIVE/PRESERVE/STABLE is
 | |
| > happening is never going to be implemented because it is just too hard
 | |
| > to do, and too prone to error.
 | |
| 
 | |
| The reason for locking the whole table was to ensure that we do not have
 | |
| a mixture of logged and non-logged writers writing to the same data
 | |
| blocks, since that could damage blocks unrecoverably in the event of a
 | |
| crash. (Though perhaps only if full_block_writes is on)
 | |
| 
 | |
| The ALTER TABLE .. EXCLUSIVE/(insert name) mode would mean that *any*
 | |
| backend who took a write lock on the table, would lock out the whole
 | |
| table. So this new mode is not restricted to the job/user who ran the
 | |
| ALTER TABLE command. (I would note that that is how Oracle and Teradata
 | |
| do this for pre-load utility table locking, but why should we follow
 | |
| them on that?)
 | |
| 
 | |
| Currently, when we add a new row when the FSM is empty, we check the
 | |
| last block of the table. That would cause multiple writers to access the
 | |
| same blocks and so we would be in danger. The only way to avoid that
 | |
| would be for logged writers (who would use the FSM if it were not empty)
 | |
| to notify back to the FSM that they have just added a block - and remove
 | |
| the behaviour to look for the last block.
 | |
| 
 | |
| Anyway, one step at a time. *Maybe* we can do that in the future, but
 | |
| right now I'd like to add the basic fast write/load functionality.
 | |
| 
 | |
| Also, I think I will do the docs first this time, just so everyone can
 | |
| read what we're getting ahead of time, to ensure we all agree.
 | |
| 
 | |
| Best Regards, Simon Riggs
 | |
| 
 | |
| From pgsql-hackers-owner+M78236=pgman=candle.pha.pa.us@postgresql.org Tue Jan  3 18:24:20 2006
 | |
| X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
 | |
| X-Greylist: from auto-whitelisted by SQLgrey-
 | |
| Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
 | |
| From: Simon Riggs <simon@2ndquadrant.com>
 | |
| To: Bruce Momjian <pgman@candle.pha.pa.us>
 | |
| cc: Jim C. Nasby <jnasby@pervasive.com>, Tom Lane <tgl@sss.pgh.pa.us>, 
 | |
| 	   Andrew Dunstan <andrew@dunslane.net>, kleptog@svana.org, gsstark@mit.edu, 
 | |
| 	  pg@rbt.ca, zhouqq@cs.toronto.edu, pgsql-hackers@postgresql.org
 | |
| In-Reply-To: <200601032238.k03McP804163@candle.pha.pa.us>
 | |
| References: <200601032238.k03McP804163@candle.pha.pa.us>
 | |
| Date: Tue, 03 Jan 2006 23:23:54 +0000
 | |
| Message-ID: <1136330634.5052.247.camel@localhost.localdomain>
 | |
| X-Mailer: Evolution 2.2.3 (2.2.3-2.fc4) 
 | |
| X-Virus-Scanned: by amavisd-new at hub.org
 | |
| X-Spam-Status: No, score=0.043 required=5 tests=[AWL=0.043]
 | |
| X-Spam-Score: 0.043
 | |
| X-Spam-Level: 
 | |
| X-Mailing-List: pgsql-hackers
 | |
| List-Archive: <http://archives.postgresql.org/pgsql-hackers>
 | |
| List-Help: <mailto:majordomo@postgresql.org?body=help>
 | |
| List-Id: <pgsql-hackers.postgresql.org>
 | |
| List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
 | |
| List-Post: <mailto:pgsql-hackers@postgresql.org>
 | |
| List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
 | |
| List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Content-Length:   725
 | |
| 
 | |
| On Tue, 2006-01-03 at 17:38 -0500, Bruce Momjian wrote:
 | |
| 
 | |
| > Right, the DML will be single-threaded and fsync of all dirty pages will
 | |
| > happen before commit of each transaction.
 | |
| 
 | |
| heap_sync() would occur at end of statement, as it does with CTAS. We
 | |
| could delay until EOT but I'm not sure I see why; in most cases they'd
 | |
| be the same point anyway.
 | |
| 
 | |
| I'd been toying with the idea of making the freshly added blocks live
 | |
| only in temp_buffers to avoid the shared_buffers overhead, but that was
 | |
| starting to sounds too wierd for my liking.
 | |
| 
 | |
| Best Regards, Simon Riggs
 | |
| 
 | |
| 
 | |
| ---------------------------(end of broadcast)---------------------------
 | |
| TIP 4: Have you searched our list archives?
 | |
| 
 | |
|                http://archives.postgresql.org
 | |
| 
 | |
| From simon@2ndquadrant.com Tue Jan  3 18:58:13 2006
 | |
| Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
 | |
| From: Simon Riggs <simon@2ndquadrant.com>
 | |
| To: Michael Paesold <mpaesold@gmx.at>
 | |
| cc: Bruce Momjian <pgman@candle.pha.pa.us>, andrew@dunslane.net, 
 | |
| 	  tgl@sss.pgh.pa.us, kleptog@svana.org, gsstark@mit.edu, pg@rbt.ca, 
 | |
| 	  zhouqq@cs.toronto.edu, pgsql-hackers@postgresql.org
 | |
| In-Reply-To: <14969.1136030384@www6.gmx.net>
 | |
| References: <200512302114.jBULEno02301@candle.pha.pa.us>
 | |
| 	  <14969.1136030384@www6.gmx.net>
 | |
| Date: Tue, 03 Jan 2006 23:58:09 +0000
 | |
| Message-ID: <1136332689.5052.263.camel@localhost.localdomain>
 | |
| X-Mailer: Evolution 2.2.3 (2.2.3-2.fc4) 
 | |
| Content-Length:  1493
 | |
| 
 | |
| On Sat, 2005-12-31 at 12:59 +0100, Michael Paesold wrote:
 | |
| > Bruce Momjian wrote:
 | |
| > 
 | |
| > > > The --single-transaction mode would apply even if the dump was created
 | |
| > > > using an earlier version of pg_dump. pg_dump has *not* been altered at
 | |
| > > > all. (And I would again add that the idea was not my own)
 | |
| > > 
 | |
| > > I assume you mean this:
 | |
| > > 
 | |
| > > 	http://archives.postgresql.org/pgsql-patches/2005-12/msg00257.php
 | |
| > > 
 | |
| > > I guess with the ALTER commands I don't see much value in the
 | |
| > > --single-transaction flag.  I am sure others suggested it, but would
 | |
| > > they suggest it now given our current direction.
 | |
| > 
 | |
| > I just want to add that --single-transaction has a value of it's own. There
 | |
| > were times when I wanted to restore parts of a dump all-or-nothing. 
 | |
| > 
 | |
| > This is possible with PostgreSQL, unlike many other DBM systems, because
 | |
| > people like Tom Lane have invested in ensuring that all DDL is working
 | |
| > without implicitly committing an enclosing transaction.
 | |
| > 
 | |
| > Using pg_restore directly into a database, it is not possible to get a
 | |
| > single transaction right now. One has to restore to a file and manually
 | |
| > added BEGIN/COMMIT. Just for that I think --single-transaction is a great
 | |
| > addition and a missing feature.
 | |
| > 
 | |
| > I think more people have a use-case for that.
 | |
| 
 | |
| I did originally separate the --single-transaction patch for this
 | |
| reason. I think its a valid patch on its own and its wrapped and ready
 | |
| to go, with some deletions from the doc patch.
 | |
| 
 | |
| Best Regards, Simon Riggs
 | |
| 
 | |
| From pgsql-hackers-owner+M78239=pgman=candle.pha.pa.us@postgresql.org Tue Jan  3 19:12:18 2006
 | |
| X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
 | |
| X-Greylist: from auto-whitelisted by SQLgrey-
 | |
| Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
 | |
| From: Simon Riggs <simon@2ndquadrant.com>
 | |
| To: Bruce Momjian <pgman@candle.pha.pa.us>
 | |
| cc: Tom Lane <tgl@sss.pgh.pa.us>, Martijn van Oosterhout <kleptog@svana.org>, 
 | |
| 	   Greg Stark <gsstark@mit.edu>, Rod Taylor <pg@rbt.ca>, 
 | |
| 	   Qingqing Zhou <zhouqq@cs.toronto.edu>, pgsql-hackers@postgresql.org
 | |
| In-Reply-To: <200512291637.jBTGbdC03848@candle.pha.pa.us>
 | |
| References: <200512291637.jBTGbdC03848@candle.pha.pa.us>
 | |
| Date: Wed, 04 Jan 2006 00:11:55 +0000
 | |
| Message-ID: <1136333515.5052.273.camel@localhost.localdomain>
 | |
| X-Mailer: Evolution 2.2.3 (2.2.3-2.fc4) 
 | |
| X-Virus-Scanned: by amavisd-new at hub.org
 | |
| X-Spam-Status: No, score=0.045 required=5 tests=[AWL=0.045]
 | |
| X-Spam-Score: 0.045
 | |
| X-Spam-Level: 
 | |
| X-Mailing-List: pgsql-hackers
 | |
| List-Archive: <http://archives.postgresql.org/pgsql-hackers>
 | |
| List-Help: <mailto:majordomo@postgresql.org?body=help>
 | |
| List-Id: <pgsql-hackers.postgresql.org>
 | |
| List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
 | |
| List-Post: <mailto:pgsql-hackers@postgresql.org>
 | |
| List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
 | |
| List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Content-Length:  1200
 | |
| 
 | |
| On Thu, 2005-12-29 at 11:37 -0500, Bruce Momjian wrote:
 | |
| > Having COPY behave differently because it is
 | |
| > in a transaction is fine as long as it is user-invisible, but once you
 | |
| > require users to do that to get the speedup, it isn't user-invisible
 | |
| > anymore.
 | |
| 
 | |
| Since we're agreed on adding ALTER TABLE rather than COPY LOCK, we have
 | |
| our explicit mechanism for speedup.
 | |
| 
 | |
| However, it costs a single line of code and very very little execution
 | |
| time to add in the optimization to COPY to make it bypass WAL when
 | |
| executed in the same transaction that created the table. Everything else
 | |
| is already there.
 | |
| 
 | |
| As part of the use_wal test:
 | |
| + 	if (resultRelInfo->ri_NumIndices == 0 && 
 | |
| +         !XLogArchivingActive()            &&
 | |
| >>         (cstate->rel->rd_createSubid != InvalidSubTransactionId ))
 | |
| +             use_wal = false;
 | |
| 
 | |
| the value is already retrieved from cache...
 | |
| 
 | |
| Can anyone see a reason *not* to put that change in also? We just don't
 | |
| advertise it as the "suggested" route to gaining performance, nor would
 | |
| we rely on it for pg_dump/restore performance. 
 | |
| 
 | |
| Best Regards, Simon Riggs
 | |
| 
 | |
| 
 | |
| ---------------------------(end of broadcast)---------------------------
 | |
| TIP 2: Don't 'kill -9' the postmaster
 | |
| 
 | |
| From pgsql-hackers-owner+M78303=pgman=candle.pha.pa.us@postgresql.org Thu Jan  5 12:23:39 2006
 | |
| X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
 | |
| X-Greylist: from auto-whitelisted by SQLgrey-
 | |
| From: Bruce Momjian <pgman@candle.pha.pa.us>
 | |
| Message-ID: <200601051722.k05HMSM02052@candle.pha.pa.us>
 | |
| Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
 | |
| In-Reply-To: <1136328833.5052.223.camel@localhost.localdomain>
 | |
| To: Simon Riggs <simon@2ndquadrant.com>
 | |
| Date: Thu, 5 Jan 2006 12:22:28 -0500 (EST)
 | |
| cc: Andrew Dunstan <andrew@dunslane.net>, Tom Lane <tgl@sss.pgh.pa.us>, 
 | |
| 	   Martijn van Oosterhout <kleptog@svana.org>, Greg Stark <gsstark@mit.edu>, 
 | |
| 	   Rod Taylor <pg@rbt.ca>, Qingqing Zhou <zhouqq@cs.toronto.edu>, 
 | |
| 	  pgsql-hackers@postgresql.org
 | |
| X-Mailer: ELM [version 2.4ME+ PL121 (25)]
 | |
| X-Virus-Scanned: by amavisd-new at hub.org
 | |
| X-Spam-Status: No, score=0.12 required=5 tests=[AWL=0.120]
 | |
| X-Spam-Score: 0.12
 | |
| X-Spam-Level: 
 | |
| X-Mailing-List: pgsql-hackers
 | |
| List-Archive: <http://archives.postgresql.org/pgsql-hackers>
 | |
| List-Help: <mailto:majordomo@postgresql.org?body=help>
 | |
| List-Id: <pgsql-hackers.postgresql.org>
 | |
| List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
 | |
| List-Post: <mailto:pgsql-hackers@postgresql.org>
 | |
| List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
 | |
| List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
 | |
| Precedence: bulk
 | |
| Sender: pgsql-hackers-owner@postgresql.org
 | |
| Content-Length:  6020
 | |
| 
 | |
| Simon Riggs wrote:
 | |
| > > So, we need a name for EXCLUSIVE mode that suggests how it is different
 | |
| > > from TRUNCATE, and in this case, the difference is that EXCLUSIVE
 | |
| > > preserves the previous contents of the table on recovery, while TRUNCATE
 | |
| > > does not.  Do you want to call the mode PRESERVE, or EXCLUSIVE WRITER?
 | |
| > > Anyway, the keywords are easy to modify, even after the patch is
 | |
| > > submitted.  FYI, I usually go through keywords.c looking for a keyword
 | |
| > > we already use.
 | |
| > 
 | |
| > I'm very happy for suggestions on what these new modes are called.
 | |
| > 
 | |
| > > > > So, to summarize, I think we should add DROP/TRUNCATE, and use that by
 | |
| > > > > default (or optionally off?) in pg_dump, and, assuming we want EXCLUSIVE
 | |
| > > > > for more than just COPY, we need to add ALTER TABLE EXCLUSIVE.
 | |
| > > > 
 | |
| > > > Would you mind stating again what you mean, just so I can understand
 | |
| > > > this? Your summary isn't enough.
 | |
| > > 
 | |
| > > New ALTER TABLE mode, perhaps call it PERSISTENCE:
 | |
| > > 
 | |
| > > 	ALTER TABLE tab PERSISTENCE DROP ON RECOVERY
 | |
| > > 	ALTER TABLE tab PERSISTENCE TRUNCATE ON RECOVERY
 | |
| > > 
 | |
| > > These would drop or truncate all tables with this flag on a non-clean
 | |
| > > start of the postmaster, and write something in the server logs. 
 | |
| > > However, I don't know that we have the code in place to DROP/TRUNCATE in
 | |
| > > recovery mode, and it would affect all databases, so it could be quite
 | |
| > > complex to implement.  In this mode, no WAL logs would be written for
 | |
| > > table modifications, though DDL commands would have to be logged.
 | |
| > 
 | |
| > Right now, this will be a TODO item... it looks like it will take some
 | |
| > thought to implement correctly.
 | |
| 
 | |
| OK, I know my suggestions have made it more complicated.
 | |
| 
 | |
| TODO added:
 | |
| 
 | |
| * Allow control over which tables are WAL-logged
 | |
| 
 | |
|   Allow tables to bypass WAL writes and just fsync() dirty pages on
 | |
|   commit.  To do this, only a single writer can modify the table, and
 | |
|   writes must happen only on new pages.  Readers can continue accessing
 | |
|   the table.  This would affect COPY, and perhaps INSERT/UPDATE too.
 | |
|   Another option is to avoid transaction logging entirely and truncate
 | |
|   or drop the table on crash recovery.  These should be implemented
 | |
|   using ALTER TABLE, e.g. ALTER TABLE PERSISTENCE [ DROP | TRUNCATE |
 | |
|   STABLE | DEFAULT ].  Tables using non-default logging should not use
 | |
|   referential integrity with default-logging tables, and tables using
 | |
|   stable logging probably can not have indexes.  [walcontrol]
 | |
| 
 | |
| 
 | |
| > > 	ALTER TABLE tab PERSISTENCE PRESERVE (or STABLE?)
 | |
| > > 
 | |
| > > Table contents are preserved across recoveries, but data modifications
 | |
| > > can happen only one at a time.  I don't think we have a lock mode that
 | |
| > > does this, so I am worried a new lock mode will have to be created.  A
 | |
| > > simplified solution at this stage would be to take an exclusive lock on
 | |
| > > the table, but really we just need a single-writer table lock, which I
 | |
| > > don't think we have. initially this can implemented to only affect COPY
 | |
| > > but later can be done for other commands. 
 | |
| > 
 | |
| > ExclusiveLock locks out everything apart from readers, no new lock mode
 | |
| > AFAICS. Implementing that is little additional work for COPY.
 | |
| 
 | |
| Nice.
 | |
| 
 | |
| > Tom had a concern about setting this for I, U, D commands via the
 | |
| > executor. Not sure what the details of that are, as yet.
 | |
| 
 | |
| That is much more complicated than the COPY-only idea, for sure.  I am
 | |
| thinking we could add the ALTER syntax and just do COPY at this stage,
 | |
| meaning that I/U/D still do full logging until we get to improving them.
 | |
| The big benefit is that the user API doesn't need to change when we
 | |
| improve the code.  In fact I think we could do the TRUNCATE/DROP easily
 | |
| for I/U/D, but the STABLE option would require work and we don't need to
 | |
| implement it in the first patch.
 | |
| 
 | |
| > We can use either of the unlogged modes for pg_dump, so I'd suggest its
 | |
| > this one. Everybody happy with this being the new default in pg_dump, or
 | |
| > should it be an option?
 | |
| > 
 | |
| > > 	ALTER TABLE tab PERSISTENCE DEFAULT
 | |
| > > 
 | |
| > > This would be our current default mode, which is full concurrency and
 | |
| > > persistence.
 | |
| > 
 | |
| > I'm thinking whether the ALTER TABLE statement might be better with two
 | |
| > bool flags rather than a 3-state char.
 | |
| > 
 | |
| > flag 1: ENABLE LOGGING | DISABLE LOGGING
 | |
| > 
 | |
| > flag 2: FULL RECOVERY | TRUNCATE ON RECOVERY
 | |
| > 
 | |
| > Giving 3 possible sets of options:
 | |
| > 
 | |
| > -- the default
 | |
| > ALTER TABLE mytable ENABLE LOGGING FULL RECOVERY; (default)
 | |
| > 
 | |
| > -- EXCLUSIVE mode
 | |
| > ALTER TABLE mytable DISABLE LOGGING FULL RECOVERY;
 | |
| > ...which would be used like this
 | |
| > 	ALTER TABLE mytable DISABLE LOGGING;
 | |
| > 	COPY or other bulk data manipulation SQL
 | |
| > 	ALTER TABLE mytable ENABLE LOGGING;
 | |
| > ...since FULL RECOVERY is the default.
 | |
| > 
 | |
| > -- multiuser temp table mode
 | |
| > ALTER TABLE mytable DISABLE LOGGING TRUNCATE ON RECOVERY;
 | |
| > ...which would usually be left on all the time
 | |
| > 
 | |
| > which only uses one new keyword LOGGING and yet all the modes are fairly
 | |
| > explicit as to what they do.
 | |
| > 
 | |
| > An alternative might be the slightly more verbose:
 | |
| > 	ALTER TABLE mytable DISABLE LOGGING FORCE EXCLUSIVE TABLE LOCK;
 | |
| > which would be turned off by
 | |
| > 	ALTER TABLE mytable ENABLE LOGGING;
 | |
| > 
 | |
| > Comments?
 | |
| 
 | |
| I had the same idea originally, but avoided it because the logging
 | |
| really does affect what other options you can use.  For example, if you
 | |
| want truncate on recovery, you certainly do not want logging, so it
 | |
| seems the options are not really independent.  In fact if someone asks
 | |
| for truncate on recovery, do we automatically turn off logging for them,
 | |
| or throw an error, or a warning.  It just seemed too error-prone and
 | |
| confusing, though perhaps more logical.  Of course, if others like the
 | |
| above, we can do it.
 | |
| 
 | |
| -- 
 | |
|   Bruce Momjian                        |  http://candle.pha.pa.us
 | |
|   pgman@candle.pha.pa.us               |  (610) 359-1001
 | |
|   +  If your life is a hard drive,     |  13 Roberts Road
 | |
|   +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
 | |
| 
 | |
| ---------------------------(end of broadcast)---------------------------
 | |
| TIP 2: Don't 'kill -9' the postmaster
 | |
| 
 | |
| From simon@2ndquadrant.com Thu Jan  5 16:56:25 2006
 | |
| Return-path: <simon@2ndquadrant.com>
 | |
| Received: from smtp.nildram.co.uk (smtp.nildram.co.uk [195.112.4.54])
 | |
| 	by candle.pha.pa.us (8.11.6/8.11.6) with ESMTP id k05LuPb02246
 | |
| 	for <pgman@candle.pha.pa.us>; Thu, 5 Jan 2006 16:56:25 -0500 (EST)
 | |
| Received: from [192.168.0.3] (unknown [84.12.184.6])
 | |
| 	by smtp.nildram.co.uk (Postfix) with ESMTP
 | |
| 	id A9F0F268C4E; Thu,  5 Jan 2006 21:56:18 +0000 (GMT)
 | |
| Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
 | |
| From: Simon Riggs <simon@2ndquadrant.com>
 | |
| To: Bruce Momjian <pgman@candle.pha.pa.us>
 | |
| cc: Tom Lane <tgl@sss.pgh.pa.us>, Martijn van Oosterhout <kleptog@svana.org>,
 | |
|    Greg Stark <gsstark@mit.edu>, Rod Taylor <pg@rbt.ca>,
 | |
|    Qingqing Zhou <zhouqq@cs.toronto.edu>, pgsql-hackers@postgresql.org
 | |
| In-Reply-To: <200601051727.k05HR5p02803@candle.pha.pa.us>
 | |
| References: <200601051727.k05HR5p02803@candle.pha.pa.us>
 | |
| Content-Type: text/plain
 | |
| Date: Thu, 05 Jan 2006 21:56:21 +0000
 | |
| Message-ID: <1136498181.21025.285.camel@localhost.localdomain>
 | |
| MIME-Version: 1.0
 | |
| X-Mailer: Evolution 2.2.3 (2.2.3-2.fc4) 
 | |
| Content-Transfer-Encoding: 7bit
 | |
| Status: OR
 | |
| 
 | |
| On Thu, 2006-01-05 at 12:27 -0500, Bruce Momjian wrote:
 | |
| 
 | |
| > Seems like a nice optimization.
 | |
| 
 | |
| Negative thoughts: Toast tables have a toast index on them, yes? We have
 | |
| agreed that we cannot use the optimization if we have indexes on the
 | |
| main table. It follows that we cannot use the optimization if we have
 | |
| *any* toasted data, since that would require a pointer between two
 | |
| blocks, which would not be correctly recovered following a crash. If we
 | |
| log the toast table then there could be a mismatch between heap and
 | |
| toast table; if we don't log the toast table there could be a mismatch
 | |
| between toast table and toast index.
 | |
| 
 | |
| We can test to see if the toast table is empty when we do ALTER TABLE,
 | |
| but loading operations may try to create toasted data rows.
 | |
| 
 | |
| Presumably that means we must either:
 | |
| i) abort a COPY if we get a toastable value
 | |
| ii) if we get a toastable value, insert the row into a new block, which
 | |
| we do logging of, then also log the toast insert and the toast index
 | |
| insert - i.e. some blocks we log, others not
 | |
| 
 | |
| This is still useful for many applications, IMHO, but the list of
 | |
| restrictions seems to be growing. Worse, we wouldn't know that the toast
 | |
| tables were empty until after we did the COPY TO for a pg_dump, so we
 | |
| wouldn't be able to retrospectively add an ALTER TABLE command ahead of
 | |
| the COPY. 
 | |
| 
 | |
| Thoughts? Hopefully there are some flaws in my thinking here,
 | |
| 
 | |
| Best Regards, Simon Riggs
 | |
| 
 | |
| 
 | |
| 
 |