mirror of https://github.com/postgres/postgres.git (synced 2025-10-25 13:17:41 +03:00)
	Remove unused TODO.detail functions.
doc/TODO (10 lines changed)
									
									
									
									
									
								
							| @@ -56,7 +56,7 @@ ENHANCEMENTS | ||||
|  | ||||
| URGENT | ||||
|  | ||||
| -* -Add OUTER joins, left and right[outer] (Tom, Thomas) | ||||
| +* -Add OUTER joins, left and right (Tom, Thomas) | ||||
| * -Allow long tuples by chaining or auto-storing outside db (TOAST) (Jan) | ||||
| * -Fix memory leak for expressions (Tom)  | ||||
| * Add replication of distributed databases [replication] | ||||
| @@ -95,7 +95,7 @@ TYPES | ||||
| 	o -Allow large object vacuuming | ||||
| 	o -Tables that start with xinv confused to be large objects | ||||
| * Add IPv6 capability to INET/CIDR types | ||||
| -* -Fix improper masking of some inet/cidr types [cidr] | ||||
| +* -Fix improper masking of some inet/cidr types  | ||||
| * Add conversion function from text to inet | ||||
| * Make a separate SERIAL type? | ||||
| * Store binary-compatible type information in the system | ||||
| @@ -224,7 +224,7 @@ EXOTIC FEATURES | ||||
| * Add the concept of dataspaces/tablespaces [tablespaces] | ||||
| * Allow queries across multiple databases | ||||
| * Allow nested transactions (Vadim) | ||||
| -* Allow [INSERT/UPDATE] ... RETURNING new.col or old.col (Philip) | ||||
| +* Allow INSERT/UPDATE ... RETURNING new.col or old.col (Philip) | ||||
| * SQL*Net listener that makes PostgreSQL appear as an Oracle database  | ||||
|   to clients | ||||
| * Incremental backups | ||||
| @@ -242,13 +242,13 @@ MISCELLANEOUS | ||||
| * Allow cursors to be DECLAREd/OPENed/CLOSEed outside transactions | ||||
| * Allow DELETE WHERE CURRENT OF cursor | ||||
| * -Transaction log, so re-do log can be on a separate disk by | ||||
| -  with after-row images (Vadim) [logging] | ||||
| +  with after-row images (Vadim) | ||||
| * Populate backend status area and write program to dump status data | ||||
| * Make oid use unsigned int more reliably, pg_atoi() | ||||
| * Put sort files in their own directory | ||||
| * Allow autocommit so always in a transaction block | ||||
| * Show location of syntax error in query [yacc] | ||||
| -* -Redesign the function call interface to handle NULLs better[function] (Tom) | ||||
| +* -Redesign the function call interface to handle NULLs better (Tom) | ||||
| * Missing optimizer selectivities for date, r-tree, etc. [optimizer] | ||||
| * Overhaul bufmgr/lockmgr/transaction manager | ||||
| * -redesign UNION structures to have separarate target lists | ||||
|   | ||||
| @@ -1,519 +0,0 @@ | ||||
| From owner-pgsql-hackers@hub.org Wed Sep 22 20:31:02 1999 | ||||
| Received: from renoir.op.net (root@renoir.op.net [209.152.193.4]) | ||||
| 	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id UAA15611 | ||||
| 	for <maillist@candle.pha.pa.us>; Wed, 22 Sep 1999 20:31:01 -0400 (EDT) | ||||
| Received: from hub.org (hub.org [216.126.84.1]) by renoir.op.net (o1/$ Revision: 1.18 $) with ESMTP id UAA02926 for <maillist@candle.pha.pa.us>; Wed, 22 Sep 1999 20:21:24 -0400 (EDT) | ||||
| Received: from hub.org (hub.org [216.126.84.1]) | ||||
| 	by hub.org (8.9.3/8.9.3) with ESMTP id UAA75413; | ||||
| 	Wed, 22 Sep 1999 20:09:35 -0400 (EDT) | ||||
| 	(envelope-from owner-pgsql-hackers@hub.org) | ||||
| Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Wed, 22 Sep 1999 20:08:50 +0000 (EDT) | ||||
| Received: (from majordom@localhost) | ||||
| 	by hub.org (8.9.3/8.9.3) id UAA75058 | ||||
| 	for pgsql-hackers-outgoing; Wed, 22 Sep 1999 20:06:58 -0400 (EDT) | ||||
| 	(envelope-from owner-pgsql-hackers@postgreSQL.org) | ||||
| Received: from sss.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) | ||||
| 	by hub.org (8.9.3/8.9.3) with ESMTP id UAA74982 | ||||
| 	for <pgsql-hackers@postgreSQL.org>; Wed, 22 Sep 1999 20:06:25 -0400 (EDT) | ||||
| 	(envelope-from tgl@sss.pgh.pa.us) | ||||
| Received: from sss.sss.pgh.pa.us (localhost [127.0.0.1]) | ||||
| 	by sss.sss.pgh.pa.us (8.9.1/8.9.1) with ESMTP id UAA06411 | ||||
| 	for <pgsql-hackers@postgreSQL.org>; Wed, 22 Sep 1999 20:05:40 -0400 (EDT) | ||||
| To: pgsql-hackers@postgreSQL.org | ||||
| Subject: [HACKERS] Progress report: buffer refcount bugs and SQL functions | ||||
| Date: Wed, 22 Sep 1999 20:05:39 -0400 | ||||
| Message-ID: <6408.938045139@sss.pgh.pa.us> | ||||
| From: Tom Lane <tgl@sss.pgh.pa.us> | ||||
| Sender: owner-pgsql-hackers@postgreSQL.org | ||||
| Precedence: bulk | ||||
| Status: RO | ||||
|  | ||||
| I have been finding a lot of interesting stuff while looking into | ||||
| the buffer reference count/leakage issue. | ||||
|  | ||||
| It turns out that there were two specific things that were camouflaging | ||||
| the existence of bugs in this area: | ||||
|  | ||||
| 1. The BufferLeakCheck routine that's run at transaction commit was | ||||
| only looking for nonzero PrivateRefCount to indicate a missing unpin. | ||||
| It failed to notice nonzero LastRefCount --- which meant that an | ||||
| error in refcount save/restore usage could leave a buffer pinned, | ||||
| and BufferLeakCheck wouldn't notice. | ||||
|  | ||||
| 2. The BufferIsValid macro, which you'd think just checks whether | ||||
| it's handed a valid buffer identifier or not, actually did more: | ||||
| it only returned true if the buffer ID was valid *and* the buffer | ||||
| had positive PrivateRefCount.  That meant that the common pattern | ||||
| 	if (BufferIsValid(buf)) | ||||
| 		ReleaseBuffer(buf); | ||||
| wouldn't complain if it were handed a valid but already unpinned buffer. | ||||
| And that behavior masks bugs that result in buffers being unpinned too | ||||
| early.  For example, consider a sequence like | ||||
|  | ||||
| 1. ReadBuffer (buffer now has refcount 1).  Store reference to | ||||
|    a tuple on that buffer page in a tuple table slot. | ||||
| 2. Copy buffer reference to a second tuple-table slot, but forget to | ||||
|    increment buffer's refcount. | ||||
| 3. Release second tuple table slot.  Buffer refcount drops to 0, | ||||
|    so it's unpinned. | ||||
| 4. Release original tuple slot.  Because of BufferIsValid behavior, | ||||
|    no assert happens here; in fact nothing at all happens. | ||||
|  | ||||
| This is, of course, buggy code: during the interval from 3 to 4 you | ||||
| still have an apparently valid tuple reference in the original slot, | ||||
| which someone might try to use; but the buffer it points to is unpinned | ||||
| and could be replaced at any time by another backend. | ||||
|  | ||||
| In short, we had errors that would mask both missing-pin bugs and | ||||
| missing-unpin bugs.  And naturally there were a few such bugs lurking | ||||
| behind them... | ||||
|  | ||||
| 3. The buffer refcount save/restore stuff, which I had suspected | ||||
| was useless, is not only useless but also buggy.  The reason it's | ||||
| buggy is that it only works if used in a nested fashion.  You could | ||||
| save state A, pin some buffers, save state B, pin some more | ||||
| buffers, restore state B (thereby unpinning what you pinned since | ||||
| the save), and finally restore state A (unpinning the earlier stuff). | ||||
| What you could not do is save state A, pin, save B, pin more, then | ||||
| restore state A --- that might unpin some of A's buffers, or some | ||||
| of B's buffers, or some unforeseen combination thereof.  If you | ||||
| restore A and then restore B, you do not necessarily return to a zero- | ||||
| pins state, either.  And it turns out the actual usage pattern was a | ||||
| nearly random sequence of saves and restores, compounded by a failure to | ||||
| do all of the restores reliably (which was masked by the oversight in | ||||
| BufferLeakCheck). | ||||
|  | ||||
|  | ||||
| What I have done so far is to rip out the buffer refcount save/restore | ||||
| support (including LastRefCount), change BufferIsValid to a simple | ||||
| validity check (so that you get an assert if you unpin something that | ||||
| wasn't pinned), change ExecStoreTuple so that it increments the refcount | ||||
| when it is handed a buffer reference (for symmetry with ExecClearTuple's | ||||
| decrement of the refcount), and fix about a dozen bugs exposed by these | ||||
| changes. | ||||
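As a rough illustration of the four-step sequence described earlier in this message, the toy model below compares the old BufferIsValid behaviour (valid ID and positive refcount) with a plain validity check. It is only a sketch: ToyBuffer, old_buffer_is_valid, new_buffer_is_valid, release and run_sequence are hypothetical names invented for the example and are not PostgreSQL code.

```python
# Toy model only: every name here is hypothetical, not PostgreSQL code.
class ToyBuffer:
    def __init__(self):
        self.refcount = 0  # stands in for PrivateRefCount


def old_buffer_is_valid(buf):
    # Old behaviour: "valid" also required a positive refcount.
    return buf is not None and buf.refcount > 0


def new_buffer_is_valid(buf):
    # Simple validity check: only asks whether the identifier is usable.
    return buf is not None


def release(buf):
    # Unpinning a buffer that is not pinned is a bug, so fail loudly.
    assert buf.refcount > 0, "unpin of a buffer that is not pinned"
    buf.refcount -= 1


def run_sequence(is_valid):
    buf = ToyBuffer()
    buf.refcount += 1          # step 1: pin; first slot refers to the buffer
    second_slot = buf          # step 2: copy the reference, forget to pin again
    if is_valid(second_slot):
        release(second_slot)   # step 3: refcount drops to 0
    first_slot = buf
    if is_valid(first_slot):
        release(first_slot)    # step 4: only reached with the simple check
    return buf.refcount
```

With the old check, step 4 silently does nothing and the missing pin goes unnoticed; with the simple check, step 4 trips the assertion in release.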
|  | ||||
| I am still getting Buffer Leak notices in the "misc" regression test, | ||||
| specifically in the queries that invoke more than one SQL function. | ||||
| What I find there is that SQL functions are not always run to | ||||
| completion.  Apparently, when a function can return multiple tuples, | ||||
| it won't necessarily be asked to produce them all.  And when it isn't, | ||||
| postquel_end() isn't invoked for the function's current query, so its | ||||
| tuple table isn't cleared, so we have dangling refcounts if any of the | ||||
| tuples involved are in disk buffers. | ||||
|  | ||||
| It may be that the save/restore code was a misguided attempt to fix | ||||
| this problem.  I can't tell.  But I think what we really need to do is | ||||
| find some way of ensuring that Postquel function execution contexts | ||||
| always get shut down by the end of the query, so that they don't leak | ||||
| resources. | ||||
|  | ||||
| I suppose a straightforward approach would be to keep a list of open | ||||
| function contexts somewhere (attached to the outer execution context, | ||||
| perhaps), and clean them up at outer-plan shutdown. | ||||
|  | ||||
| What I am wondering, though, is whether this addition is actually | ||||
| necessary, or is it a bug that the functions aren't run to completion | ||||
| in the first place?  I don't really understand the semantics of this | ||||
| "nested dot notation".  I suppose it is a Berkeleyism; I can't find | ||||
| anything about it in the SQL92 document.  The test cases shown in the | ||||
| misc regress test seem peculiar, not to say wrong.  For example: | ||||
|  | ||||
| regression=> SELECT p.hobbies.equipment.name, p.hobbies.name, p.name FROM person p; | ||||
| name         |name       |name | ||||
| -------------+-----------+----- | ||||
| advil        |posthacking|mike | ||||
| peet's coffee|basketball |joe | ||||
| hightops     |basketball |sally | ||||
| (3 rows) | ||||
|  | ||||
| which doesn't appear to agree with the contents of the underlying | ||||
| relations: | ||||
|  | ||||
| regression=> SELECT * FROM hobbies_r; | ||||
| name       |person | ||||
| -----------+------ | ||||
| posthacking|mike | ||||
| posthacking|jeff | ||||
| basketball |joe | ||||
| basketball |sally | ||||
| skywalking | | ||||
| (5 rows) | ||||
|  | ||||
| regression=> SELECT * FROM equipment_r; | ||||
| name         |hobby | ||||
| -------------+----------- | ||||
| advil        |posthacking | ||||
| peet's coffee|posthacking | ||||
| hightops     |basketball | ||||
| guts         |skywalking | ||||
| (4 rows) | ||||
|  | ||||
| I'd have expected an output along the lines of | ||||
|  | ||||
| advil        |posthacking|mike | ||||
| peet's coffee|posthacking|mike | ||||
| hightops     |basketball |joe | ||||
| hightops     |basketball |sally | ||||
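The expected rows above can be reproduced with a plain nested-loop expansion over the two listings. This is only a sketch: the person table is not shown in the message, so the persons list below is an assumption taken from the names in the query output, and expand is a hypothetical helper.

```python
# Rows copied from the hobbies_r and equipment_r listings above.
hobbies_r = [
    ("posthacking", "mike"),
    ("posthacking", "jeff"),
    ("basketball", "joe"),
    ("basketball", "sally"),
    ("skywalking", None),
]
equipment_r = [
    ("advil", "posthacking"),
    ("peet's coffee", "posthacking"),
    ("hightops", "basketball"),
    ("guts", "skywalking"),
]
# Assumption: the person table is not shown; these names come from the output.
persons = ["mike", "joe", "sally"]


def expand(persons, hobbies_r, equipment_r):
    """One row per (person, hobby, equipment) combination."""
    rows = []
    for person in persons:
        for hobby, owner in hobbies_r:
            if owner != person:
                continue
            for equipment, needed_for in equipment_r:
                if needed_for == hobby:
                    rows.append((equipment, hobby, person))
    return rows
```

Under that assumption, expand(persons, hobbies_r, equipment_r) yields two rows for mike and one each for joe and sally.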
|  | ||||
| Is the regression test's expected output wrong, or am I misunderstanding | ||||
| what this query is supposed to do?  Is there any documentation anywhere | ||||
| about how SQL functions returning multiple tuples are supposed to | ||||
| behave? | ||||
|  | ||||
| 			regards, tom lane | ||||
|  | ||||
| ************ | ||||
|  | ||||
|  | ||||
| From owner-pgsql-hackers@hub.org Thu Sep 23 11:03:19 1999 | ||||
| Received: from hub.org (hub.org [216.126.84.1]) | ||||
| 	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id LAA16211 | ||||
| 	for <maillist@candle.pha.pa.us>; Thu, 23 Sep 1999 11:03:17 -0400 (EDT) | ||||
| Received: from hub.org (hub.org [216.126.84.1]) | ||||
| 	by hub.org (8.9.3/8.9.3) with ESMTP id KAA58151; | ||||
| 	Thu, 23 Sep 1999 10:53:46 -0400 (EDT) | ||||
| 	(envelope-from owner-pgsql-hackers@hub.org) | ||||
| Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Thu, 23 Sep 1999 10:53:05 +0000 (EDT) | ||||
| Received: (from majordom@localhost) | ||||
| 	by hub.org (8.9.3/8.9.3) id KAA57948 | ||||
| 	for pgsql-hackers-outgoing; Thu, 23 Sep 1999 10:52:23 -0400 (EDT) | ||||
| 	(envelope-from owner-pgsql-hackers@postgreSQL.org) | ||||
| Received: from sss.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) | ||||
| 	by hub.org (8.9.3/8.9.3) with ESMTP id KAA57841 | ||||
| 	for <hackers@postgreSQL.org>; Thu, 23 Sep 1999 10:51:50 -0400 (EDT) | ||||
| 	(envelope-from tgl@sss.pgh.pa.us) | ||||
| Received: from sss.sss.pgh.pa.us (localhost [127.0.0.1]) | ||||
| 	by sss.sss.pgh.pa.us (8.9.1/8.9.1) with ESMTP id KAA14211; | ||||
| 	Thu, 23 Sep 1999 10:51:10 -0400 (EDT) | ||||
| To: Andreas Zeugswetter <andreas.zeugswetter@telecom.at> | ||||
| cc: hackers@postgreSQL.org | ||||
| Subject: Re: [HACKERS] Progress report: buffer refcount bugs and SQL functions  | ||||
| In-reply-to: Your message of Thu, 23 Sep 1999 10:07:24 +0200  | ||||
|              <37E9DFBC.5C0978F@telecom.at>  | ||||
| Date: Thu, 23 Sep 1999 10:51:10 -0400 | ||||
| Message-ID: <14209.938098270@sss.pgh.pa.us> | ||||
| From: Tom Lane <tgl@sss.pgh.pa.us> | ||||
| Sender: owner-pgsql-hackers@postgreSQL.org | ||||
| Precedence: bulk | ||||
| Status: RO | ||||
|  | ||||
| Andreas Zeugswetter <andreas.zeugswetter@telecom.at> writes: | ||||
| > That is what I use it for. I have never used it with a  | ||||
| > returns setof function, but reading the comments in the regression test, | ||||
| > -- mike needs advil and peet's coffee, | ||||
| > -- joe and sally need hightops, and | ||||
| > -- everyone else is fine. | ||||
| > it looks like the results you expected are correct, and currently the  | ||||
| > wrong result is given. | ||||
|  | ||||
| Yes, I have concluded the same (and partially fixed it, per my previous | ||||
| message). | ||||
|  | ||||
| > Those that don't have a hobby should return name|NULL|NULL. A hobby | ||||
| > that doesn't need equipment name|hobby|NULL. | ||||
|  | ||||
| That's a good point.  Currently (both with and without my uncommitted | ||||
| fix) you get *no* rows out from ExecTargetList if there are any Iters | ||||
| that return empty result sets.  It might be more reasonable to treat an | ||||
| empty result set as if it were NULL, which would give the behavior you | ||||
| suggest. | ||||
|  | ||||
| This would be an easy change to my current patch, and I'm prepared to | ||||
| make it before committing what I have, if people agree that that's a | ||||
| more reasonable definition.  Comments? | ||||
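The two candidate definitions can be contrasted in a few lines. This is a sketch under assumptions: rows_for is a hypothetical helper, and the hobbies argument stands in for the result set of a set-returning function.

```python
def rows_for(person, hobbies, treat_empty_as_null):
    """Rows produced for one person given the set returned for that person."""
    if not hobbies:
        # Current behaviour: no row at all.  Proposed: a single NULL-padded row.
        return [(person, None)] if treat_empty_as_null else []
    return [(person, hobby) for hobby in hobbies]
```

For a person with no hobbies, the first definition drops the row and the second keeps the person with a NULL (None here) in the hobby column.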
|  | ||||
| 			regards, tom lane | ||||
|  | ||||
| ************ | ||||
|  | ||||
|  | ||||
| From owner-pgsql-hackers@hub.org Thu Sep 23 04:31:15 1999 | ||||
| Received: from renoir.op.net (root@renoir.op.net [209.152.193.4]) | ||||
| 	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id EAA11344 | ||||
| 	for <maillist@candle.pha.pa.us>; Thu, 23 Sep 1999 04:31:15 -0400 (EDT) | ||||
| Received: from hub.org (hub.org [216.126.84.1]) by renoir.op.net (o1/$ Revision: 1.18 $) with ESMTP id EAA05350 for <maillist@candle.pha.pa.us>; Thu, 23 Sep 1999 04:24:29 -0400 (EDT) | ||||
| Received: from hub.org (hub.org [216.126.84.1]) | ||||
| 	by hub.org (8.9.3/8.9.3) with ESMTP id EAA85679; | ||||
| 	Thu, 23 Sep 1999 04:16:26 -0400 (EDT) | ||||
| 	(envelope-from owner-pgsql-hackers@hub.org) | ||||
| Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Thu, 23 Sep 1999 04:09:52 +0000 (EDT) | ||||
| Received: (from majordom@localhost) | ||||
| 	by hub.org (8.9.3/8.9.3) id EAA84708 | ||||
| 	for pgsql-hackers-outgoing; Thu, 23 Sep 1999 04:08:57 -0400 (EDT) | ||||
| 	(envelope-from owner-pgsql-hackers@postgreSQL.org) | ||||
| Received: from gandalf.telecom.at (gandalf.telecom.at [194.118.26.84]) | ||||
| 	by hub.org (8.9.3/8.9.3) with ESMTP id EAA84632 | ||||
| 	for <hackers@postgresql.org>; Thu, 23 Sep 1999 04:08:03 -0400 (EDT) | ||||
| 	(envelope-from andreas.zeugswetter@telecom.at) | ||||
| Received: from telecom.at (w0188000580.f000.d0188.sd.spardat.at [172.18.65.249]) | ||||
| 	by gandalf.telecom.at (xxx/xxx) with ESMTP id KAA195294 | ||||
| 	for <hackers@postgresql.org>; Thu, 23 Sep 1999 10:07:27 +0200 | ||||
| Message-ID: <37E9DFBC.5C0978F@telecom.at> | ||||
| Date: Thu, 23 Sep 1999 10:07:24 +0200 | ||||
| From: Andreas Zeugswetter <andreas.zeugswetter@telecom.at> | ||||
| X-Mailer: Mozilla 4.61 [en] (Win95; I) | ||||
| X-Accept-Language: en | ||||
| MIME-Version: 1.0 | ||||
| To: hackers@postgreSQL.org | ||||
| Subject: Re: [HACKERS] Progress report: buffer refcount bugs and SQL functions | ||||
| Content-Type: text/plain; charset=us-ascii | ||||
| Content-Transfer-Encoding: 7bit | ||||
| Sender: owner-pgsql-hackers@postgreSQL.org | ||||
| Precedence: bulk | ||||
| Status: RO | ||||
|  | ||||
| > Is the regression test's expected output wrong, or am I  | ||||
| > misunderstanding | ||||
| > what this query is supposed to do?  Is there any  | ||||
| > documentation anywhere | ||||
| > about how SQL functions returning multiple tuples are supposed to | ||||
| > behave? | ||||
|  | ||||
| They are supposed to behave somewhat like a view. | ||||
| Not all rows are necessarily fetched. | ||||
| If used in a context that needs a single row answer, | ||||
| and the answer has multiple rows it is supposed to  | ||||
| runtime elog. Like in: | ||||
|  | ||||
| select * from tbl where col=funcreturningmultipleresults(); | ||||
| -- this must elog | ||||
|  | ||||
| while this is ok: | ||||
| select * from tbl where col in (select funcreturningmultipleresults()); | ||||
|  | ||||
| But the caller could only fetch the first row if he wanted. | ||||
|  | ||||
| The nested notation is supposed to call the function passing it the tuple | ||||
| as the first argument. This is what can be used to "fake" a column | ||||
| onto a table (computed column).  | ||||
| That is what I use it for. I have never used it with a  | ||||
| returns setof function, but reading the comments in the regression test, | ||||
| -- mike needs advil and peet's coffee, | ||||
| -- joe and sally need hightops, and | ||||
| -- everyone else is fine. | ||||
| it looks like the results you expected are correct, and currently the  | ||||
| wrong result is given. | ||||
|  | ||||
| But I think this query could also elog without removing substantial | ||||
| functionality.  | ||||
|  | ||||
| SELECT p.name, p.hobbies.name, p.hobbies.equipment.name FROM person p; | ||||
|  | ||||
| Actually, to me it would be intuitive for this query to return one row per  | ||||
| person, but elog on those that have more than one hobby or a hobby that  | ||||
| needs more than one piece of equipment. Those that don't have a hobby should  | ||||
| return name|NULL|NULL. A hobby that doesn't need equipment should return | ||||
| name|hobby|NULL. | ||||
|  | ||||
| Andreas | ||||
|  | ||||
| ************ | ||||
|  | ||||
|  | ||||
| From owner-pgsql-hackers@hub.org Wed Sep 22 22:01:07 1999 | ||||
| Received: from renoir.op.net (root@renoir.op.net [209.152.193.4]) | ||||
| 	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id WAA16360 | ||||
| 	for <maillist@candle.pha.pa.us>; Wed, 22 Sep 1999 22:01:05 -0400 (EDT) | ||||
| Received: from hub.org (hub.org [216.126.84.1]) by renoir.op.net (o1/$ Revision: 1.18 $) with ESMTP id VAA08386 for <maillist@candle.pha.pa.us>; Wed, 22 Sep 1999 21:37:24 -0400 (EDT) | ||||
| Received: from hub.org (hub.org [216.126.84.1]) | ||||
| 	by hub.org (8.9.3/8.9.3) with ESMTP id VAA88083; | ||||
| 	Wed, 22 Sep 1999 21:28:11 -0400 (EDT) | ||||
| 	(envelope-from owner-pgsql-hackers@hub.org) | ||||
| Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Wed, 22 Sep 1999 21:27:48 +0000 (EDT) | ||||
| Received: (from majordom@localhost) | ||||
| 	by hub.org (8.9.3/8.9.3) id VAA87938 | ||||
| 	for pgsql-hackers-outgoing; Wed, 22 Sep 1999 21:26:52 -0400 (EDT) | ||||
| 	(envelope-from owner-pgsql-hackers@postgreSQL.org) | ||||
| Received: from orion.SAPserv.Hamburg.dsh.de (Tpolaris2.sapham.debis.de [53.2.131.8]) | ||||
| 	by hub.org (8.9.3/8.9.3) with SMTP id VAA87909 | ||||
| 	for <pgsql-hackers@postgresql.org>; Wed, 22 Sep 1999 21:26:36 -0400 (EDT) | ||||
| 	(envelope-from wieck@debis.com) | ||||
| Received: by orion.SAPserv.Hamburg.dsh.de  | ||||
| 	for pgsql-hackers@postgresql.org  | ||||
| 	id m11TxXw-0003kLC; Thu, 23 Sep 99 03:19 MET DST | ||||
| Message-Id: <m11TxXw-0003kLC@orion.SAPserv.Hamburg.dsh.de> | ||||
| From: wieck@debis.com (Jan Wieck) | ||||
| Subject: Re: [HACKERS] Progress report: buffer refcount bugs and SQL functions | ||||
| To: tgl@sss.pgh.pa.us (Tom Lane) | ||||
| Date: Thu, 23 Sep 1999 03:19:39 +0200 (MET DST) | ||||
| Cc: pgsql-hackers@postgreSQL.org | ||||
| Reply-To: wieck@debis.com (Jan Wieck) | ||||
| In-Reply-To: <6408.938045139@sss.pgh.pa.us> from "Tom Lane" at Sep 22, 99 08:05:39 pm | ||||
| X-Mailer: ELM [version 2.4 PL25] | ||||
| Content-Type: text | ||||
| Sender: owner-pgsql-hackers@postgreSQL.org | ||||
| Precedence: bulk | ||||
| Status: RO | ||||
|  | ||||
| Tom Lane wrote: | ||||
|  | ||||
| > [...] | ||||
| > | ||||
| > What I am wondering, though, is whether this addition is actually | ||||
| > necessary, or is it a bug that the functions aren't run to completion | ||||
| > in the first place?  I don't really understand the semantics of this | ||||
| > "nested dot notation".  I suppose it is a Berkeleyism; I can't find | ||||
| > anything about it in the SQL92 document.  The test cases shown in the | ||||
| > misc regress test seem peculiar, not to say wrong.  For example: | ||||
| > | ||||
| > [...] | ||||
| > | ||||
| > Is the regression test's expected output wrong, or am I misunderstanding | ||||
| > what this query is supposed to do?  Is there any documentation anywhere | ||||
| > about how SQL functions returning multiple tuples are supposed to | ||||
| > behave? | ||||
|  | ||||
|     I've  said some time (maybe too long) ago, that SQL functions | ||||
|     returning tuple sets are broken in general. This  nested  dot | ||||
|     notation  (which  I  think  is  an artefact from the postquel | ||||
|     querylanguage) is implemented via set functions. | ||||
|  | ||||
|     Set functions have totally different semantics from all other | ||||
|     functions.   First  they  don't  really return a tuple set as | ||||
|     someone might think  -  all  that  screwed  up  code  instead | ||||
|     simulates  that  they  return  something you could consider a | ||||
|     scan of the last SQL statement in  the  function.   Then,  on | ||||
|     each  subsequent call inside of the same command, they return | ||||
|     a "tupletable slot" containing the next found  tuple  (that's | ||||
|     why their Func node is mangled up after the first call). | ||||
|  | ||||
|     Second  they  have  a  targetlist that I think was originally | ||||
|     intended to extract attributes out  of  the  tuples  returned | ||||
|     when  the above scan is asked to get the next tuple. But as I | ||||
|     read the code it invokes the function again  and  this  might | ||||
|     cause the resource leakage you see. | ||||
|  | ||||
|     Third,   all  this  seems  to  never  have  been  implemented | ||||
|     (thought?) to the end. A targetlist  doesn't  make  sense  at | ||||
|     this place because it could at max contain a single attribute | ||||
|     - so a single attno would have the same  power.  And  if  set | ||||
|     functions  could appear in the rangetable (FROM clause), then | ||||
|     they would be treated as that and regular Var  nodes  in  the | ||||
|     query would do it. | ||||
|  | ||||
|     I  think  you  shouldn't really care for that regression test | ||||
|     and maybe we should disable set  functions  until  we  really | ||||
|     implement stored procedures returning sets in the rangetable. | ||||
|  | ||||
|     Set  functions  were  planned  by  Stonebraker's   team   as | ||||
|     something  that  today is called stored procedures. But AFAIK | ||||
|     they never reached the useful state because even in  Postgres | ||||
|     4.2  you haven't been able to get more than one attribute out | ||||
|     of a  set  function.   It  was  a  feature  of  the  postquel | ||||
|     querylanguage  that  you  could  get one attribute from a set | ||||
|     function via | ||||
|  | ||||
|         RETRIEVE (attributename(setfuncname())) | ||||
|  | ||||
|     While working on the constraint triggers I came across | ||||
|     another regression test (triggers :-) that's erroneous too. | ||||
|     The funny_dup17 trigger proc executes an INSERT into the same | ||||
|     relation for which it was fired by a previous INSERT. And it | ||||
|     stops this recursion only if it reaches a nesting level of | ||||
|     17, which could only occur if it is fired DURING the | ||||
|     execution of its own SPI_exec(). After Vadim quoted some | ||||
|     SQL92  definitions  about when constraint checks and triggers | ||||
|     are to be executed, I decided to fire regular triggers at the | ||||
|     end  of  a  query  too.  Thus, there is absolutely no nesting | ||||
|     possible for AFTER triggers resulting in an endless loop. | ||||
|  | ||||
|  | ||||
| Jan | ||||
|  | ||||
| -- | ||||
|  | ||||
| #======================================================================# | ||||
| # It's easier to get forgiveness for being wrong than for being right. # | ||||
| # Let's break this rule - forgive me.                                  # | ||||
| #========================================= wieck@debis.com (Jan Wieck) # | ||||
|  | ||||
|  | ||||
|  | ||||
| ************ | ||||
|  | ||||
|  | ||||
| From owner-pgsql-hackers@hub.org Thu Sep 23 11:01:06 1999 | ||||
| Received: from renoir.op.net (root@renoir.op.net [209.152.193.4]) | ||||
| 	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id LAA16162 | ||||
| 	for <maillist@candle.pha.pa.us>; Thu, 23 Sep 1999 11:01:04 -0400 (EDT) | ||||
| Received: from hub.org (hub.org [216.126.84.1]) by renoir.op.net (o1/$ Revision: 1.18 $) with ESMTP id KAA28544 for <maillist@candle.pha.pa.us>; Thu, 23 Sep 1999 10:45:54 -0400 (EDT) | ||||
| Received: from hub.org (hub.org [216.126.84.1]) | ||||
| 	by hub.org (8.9.3/8.9.3) with ESMTP id KAA52943; | ||||
| 	Thu, 23 Sep 1999 10:20:51 -0400 (EDT) | ||||
| 	(envelope-from owner-pgsql-hackers@hub.org) | ||||
| Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Thu, 23 Sep 1999 10:19:58 +0000 (EDT) | ||||
| Received: (from majordom@localhost) | ||||
| 	by hub.org (8.9.3/8.9.3) id KAA52472 | ||||
| 	for pgsql-hackers-outgoing; Thu, 23 Sep 1999 10:19:03 -0400 (EDT) | ||||
| 	(envelope-from owner-pgsql-hackers@postgreSQL.org) | ||||
| Received: from sss.sss.pgh.pa.us (sss.pgh.pa.us [209.114.166.2]) | ||||
| 	by hub.org (8.9.3/8.9.3) with ESMTP id KAA52431 | ||||
| 	for <pgsql-hackers@postgresql.org>; Thu, 23 Sep 1999 10:18:47 -0400 (EDT) | ||||
| 	(envelope-from tgl@sss.pgh.pa.us) | ||||
| Received: from sss.sss.pgh.pa.us (localhost [127.0.0.1]) | ||||
| 	by sss.sss.pgh.pa.us (8.9.1/8.9.1) with ESMTP id KAA13253; | ||||
| 	Thu, 23 Sep 1999 10:18:02 -0400 (EDT) | ||||
| To: wieck@debis.com (Jan Wieck) | ||||
| cc: pgsql-hackers@postgreSQL.org | ||||
| Subject: Re: [HACKERS] Progress report: buffer refcount bugs and SQL functions  | ||||
| In-reply-to: Your message of Thu, 23 Sep 1999 03:19:39 +0200 (MET DST)  | ||||
|              <m11TxXw-0003kLC@orion.SAPserv.Hamburg.dsh.de>  | ||||
| Date: Thu, 23 Sep 1999 10:18:01 -0400 | ||||
| Message-ID: <13251.938096281@sss.pgh.pa.us> | ||||
| From: Tom Lane <tgl@sss.pgh.pa.us> | ||||
| Sender: owner-pgsql-hackers@postgreSQL.org | ||||
| Precedence: bulk | ||||
| Status: RO | ||||
|  | ||||
| wieck@debis.com (Jan Wieck) writes: | ||||
| > Tom Lane wrote: | ||||
| >> What I am wondering, though, is whether this addition is actually | ||||
| >> necessary, or is it a bug that the functions aren't run to completion | ||||
| >> in the first place? | ||||
|  | ||||
| >     I've  said some time (maybe too long) ago, that SQL functions | ||||
| >     returning tuple sets are broken in general. | ||||
|  | ||||
| Indeed they are.  Try this on for size (using the regression database): | ||||
|  | ||||
| 	SELECT p.name, p.hobbies.equipment.name FROM person p; | ||||
| 	SELECT p.hobbies.equipment.name, p.name FROM person p; | ||||
|  | ||||
| You get different result sets!? | ||||
|  | ||||
| The problem in this example is that ExecTargetList returns the isDone | ||||
| flag from the last targetlist entry, regardless of whether there are | ||||
| incomplete iterations in previous entries.  More generally, the buffer | ||||
| leak problem that I started with only occurs if some Iter nodes are not | ||||
| run to completion --- but execQual.c has no mechanism to make sure that | ||||
| they have all reached completion simultaneously. | ||||
|  | ||||
| What we really need to make functions-returning-sets work properly is | ||||
| an implementation somewhat like aggregate functions.  We need to make | ||||
| a list of all the Iter nodes present in a targetlist and cycle through | ||||
| the values returned by each in a methodical fashion (run the rightmost | ||||
| through its full cycle, then advance the next-to-rightmost one value, | ||||
| run the rightmost through its cycle again, etc etc).  Also there needs | ||||
| to be an understanding of the hierarchy when an Iter appears in the | ||||
| arguments of another Iter's function.  (You cycle the upper one for | ||||
| *each* set of arguments created by cycling its sub-Iters.) | ||||
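The cycling order described above (rightmost Iter through its full cycle, then advance the next-to-rightmost by one value) can be sketched as an odometer over the value sets. This is an illustration under assumptions: cycle_iters is a hypothetical function, each Iter is modelled as a plain list of values, and the hierarchy of Iters nested in another Iter's arguments is not modelled.

```python
def cycle_iters(value_sets):
    """Yield every combination, cycling the rightmost set fastest."""
    if any(len(values) == 0 for values in value_sets):
        return  # in this sketch an empty set yields no rows at all
    positions = [0] * len(value_sets)
    while True:
        yield tuple(values[pos] for values, pos in zip(value_sets, positions))
        # Advance like an odometer, starting from the rightmost position.
        index = len(positions) - 1
        while index >= 0:
            positions[index] += 1
            if positions[index] < len(value_sets[index]):
                break
            positions[index] = 0  # this set finished its cycle; carry left
            index -= 1
        if index < 0:
            return  # every set completed its cycle at the same time
```

Generation stops only when all positions wrap around together, which is the "all Iters done at the same time" condition mentioned in the message.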
|  | ||||
| I am not particularly interested in working on this feature right now, | ||||
| since AFAIK it's a Berkeleyism not found in SQL92.  What I've done | ||||
| is to hack ExecTargetList so that it behaves semi-sanely when there's | ||||
| more than one Iter at the top level of the target list --- it still | ||||
| doesn't really give the right answer, but at least it will keep | ||||
| generating tuples until all the Iters are done at the same time. | ||||
| It happens that that's enough to give correct answers for the examples | ||||
| shown in the misc regress test.  Even when it fails to generate all | ||||
| the possible combinations, there will be no buffer leaks. | ||||
|  | ||||
| So, I'm going to declare victory and go home ;-).  We ought to add a | ||||
| TODO item along the lines of | ||||
|  * Functions returning sets don't really work right | ||||
| in hopes that someone will feel like tackling this someday. | ||||
|  | ||||
| 			regards, tom lane | ||||
|  | ||||
| ************ | ||||
|  | ||||
|  | ||||
| @@ -1,285 +0,0 @@ | ||||
| From owner-pgsql-hackers@hub.org Fri Nov 13 13:24:37 1998 | ||||
| Received: from hub.org (majordom@hub.org [209.47.148.200]) | ||||
| 	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id NAA13457 | ||||
| 	for <maillist@candle.pha.pa.us>; Fri, 13 Nov 1998 13:24:35 -0500 (EST) | ||||
| Received: from localhost (majordom@localhost) | ||||
| 	by hub.org (8.9.1/8.9.1) with SMTP id NAA02464; | ||||
| 	Fri, 13 Nov 1998 13:22:52 -0500 (EST) | ||||
| 	(envelope-from owner-pgsql-hackers@hub.org) | ||||
| Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Fri, 13 Nov 1998 13:21:14 +0000 (EST) | ||||
| Received: (from majordom@localhost) | ||||
| 	by hub.org (8.9.1/8.9.1) id NAA02331 | ||||
| 	for pgsql-hackers-outgoing; Fri, 13 Nov 1998 13:21:12 -0500 (EST) | ||||
| 	(envelope-from owner-pgsql-hackers@postgreSQL.org) | ||||
| Received: from orion.SAPserv.Hamburg.dsh.de (Tpolaris2.sapham.debis.de [53.2.131.8]) | ||||
| 	by hub.org (8.9.1/8.9.1) with SMTP id NAA02316 | ||||
| 	for <pgsql-hackers@postgreSQL.org>; Fri, 13 Nov 1998 13:21:06 -0500 (EST) | ||||
| 	(envelope-from wieck@sapserv.debis.de) | ||||
| Received: by orion.SAPserv.Hamburg.dsh.de  | ||||
| 	for pgsql-hackers@postgreSQL.org  | ||||
| 	id m0zeOEf-000EBPC; Fri, 13 Nov 98 19:46 MET | ||||
| Message-Id: <m0zeOEf-000EBPC@orion.SAPserv.Hamburg.dsh.de> | ||||
| From: jwieck@debis.com (Jan Wieck) | ||||
| Subject: [HACKERS] shmem limits and redolog | ||||
| To: pgsql-hackers@postgreSQL.org (PostgreSQL HACKERS) | ||||
| Date: Fri, 13 Nov 1998 19:46:20 +0100 (MET) | ||||
| Reply-To: jwieck@debis.com (Jan Wieck) | ||||
| X-Mailer: ELM [version 2.4 PL25] | ||||
| Content-Type: text | ||||
| Sender: owner-pgsql-hackers@postgreSQL.org | ||||
| Precedence: bulk | ||||
| Status: ROr | ||||
|  | ||||
| Hi, | ||||
|  | ||||
|     I'm  currently  hacking  around on a solution for logging all | ||||
|     database operations at query level that can recover a crashed | ||||
|     database  from  the last successful backup by redoing all the | ||||
|     commands. | ||||
|  | ||||
|     Well, I wanted it to be as flexible as possible. So I decided | ||||
|     to make it  configurable  per  database.  One could say which | ||||
|     databases  are  logged  and,  if a database is, whether it is | ||||
|     logged sync or async (in sync mode, every  COMMIT  forces  an | ||||
|     fsync of the actual logfile and controlfiles). | ||||
|  | ||||
|     To  make  async mode as fast as possible, I'm using a  shared | ||||
|     memory segment of 32K per database (not per backend) that  is | ||||
|     used as a wrap-around buffer where the backends  place  their | ||||
|     query information.  So the log writer can fall a little behind if | ||||
|     there are many backends doing  different  things  that  don't | ||||
|     lock each other. | ||||
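
A minimal single-process sketch of the wrap-around buffer idea described above. The class name `WrapAroundBuffer`, the 4-byte length prefix, and the `put`/`get` methods are assumptions made for illustration; the original proposal uses a shared memory segment guarded by semaphores, which this sketch leaves out.

```python
class WrapAroundBuffer:
    """Fixed-size byte ring: backends append records, one log writer drains them."""

    def __init__(self, size=32 * 1024):
        self.buf = bytearray(size)
        self.size = size
        self.head = 0   # position of the next byte the log writer will read
        self.used = 0   # bytes written but not yet drained

    def put(self, record: bytes) -> bool:
        """Append a length-prefixed record; return False if it does not fit."""
        framed = len(record).to_bytes(4, "big") + record
        if len(framed) > self.size - self.used:
            return False  # the log writer has fallen too far behind
        tail = (self.head + self.used) % self.size
        for i, b in enumerate(framed):
            self.buf[(tail + i) % self.size] = b  # wraps past the end
        self.used += len(framed)
        return True

    def _read(self, n: int) -> bytes:
        out = bytes(self.buf[(self.head + i) % self.size] for i in range(n))
        self.head = (self.head + n) % self.size
        self.used -= n
        return out

    def get(self):
        """Return the oldest record, or None if the buffer is empty."""
        if self.used == 0:
            return None
        length = int.from_bytes(self._read(4), "big")
        return self._read(length)
```

A backend that gets `False` from `put` would have to wait for the log writer; how long it may lag is bounded by the buffer size.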
|  | ||||
|     Now  I'm  a  little  in  doubt about the shared memory limits | ||||
|     reported.  Was it a good decision to use shared memory? Am  I | ||||
|     better off using sockets? | ||||
|  | ||||
|     The  bad  thing  in  what  I  have  up  to now (it's far from | ||||
|     complete) is, that even if a database isn't currently logged, | ||||
|     a redolog writer is started and creates the 32K shmem segment | ||||
|     (plus a semaphore set with 5 semaphores). This is  because  I | ||||
|     plan to create commands like | ||||
|  | ||||
|         ALTER DATABASE LOG MODE=ASYNC LOGDIR='/somewhere/dbname'; | ||||
|  | ||||
|     and the like that can be used at runtime (while more than one | ||||
|     backend is connected to the database) to turn logging on/off, | ||||
|     switch  to/from  backup  mode (all other activity is stopped) | ||||
|     etc. | ||||
|  | ||||
|     So every 32 databases will require another megabyte of shared | ||||
|     memory.  The  logging  master  controls  which databases have | ||||
|     activity  and  kills  redolog  writers  after  some  time  of | ||||
|     inactivity,  and  the shmem is freed then. But it can hurt if | ||||
|     someone really has many many databases that are all  used  at | ||||
|     the same time. | ||||
|  | ||||
|     What do the others say? | ||||
|  | ||||
|  | ||||
| Jan | ||||
|  | ||||
| -- | ||||
|  | ||||
| #======================================================================# | ||||
| # It's easier to get forgiveness for being wrong than for being right. # | ||||
| # Let's break this rule - forgive me.                                  # | ||||
| #======================================== jwieck@debis.com (Jan Wieck) # | ||||
|  | ||||
|  | ||||
|  | ||||
|  | ||||
| From owner-pgsql-hackers@hub.org Wed Dec 16 15:46:41 1998 | ||||
| Received: from renoir.op.net (root@renoir.op.net [209.152.193.4]) | ||||
| 	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id PAA00521 | ||||
| 	for <maillist@candle.pha.pa.us>; Wed, 16 Dec 1998 15:46:40 -0500 (EST) | ||||
| Received: from hub.org (majordom@hub.org [209.47.145.100]) by renoir.op.net (o1/$ Revision: 1.18 $) with ESMTP id PAA08772 for <maillist@candle.pha.pa.us>; Wed, 16 Dec 1998 15:10:01 -0500 (EST) | ||||
| Received: from localhost (majordom@localhost) | ||||
| 	by hub.org (8.9.1/8.9.1) with SMTP id PAA01254; | ||||
| 	Wed, 16 Dec 1998 15:06:56 -0500 (EST) | ||||
| 	(envelope-from owner-pgsql-hackers@hub.org) | ||||
| Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Wed, 16 Dec 1998 14:58:11 +0000 (EST) | ||||
| Received: (from majordom@localhost) | ||||
| 	by hub.org (8.9.1/8.9.1) id OAA00660 | ||||
| 	for pgsql-hackers-outgoing; Wed, 16 Dec 1998 14:58:10 -0500 (EST) | ||||
| 	(envelope-from owner-pgsql-hackers@postgreSQL.org) | ||||
| Received: from orion.SAPserv.Hamburg.dsh.de (Tpolaris2.sapham.debis.de [53.2.131.8]) | ||||
| 	by hub.org (8.9.1/8.9.1) with SMTP id OAA00643 | ||||
| 	for <pgsql-hackers@postgreSQL.org>; Wed, 16 Dec 1998 14:58:05 -0500 (EST) | ||||
| 	(envelope-from wieck@sapserv.debis.de) | ||||
| Received: by orion.SAPserv.Hamburg.dsh.de  | ||||
| 	for pgsql-hackers@postgreSQL.org  | ||||
| 	id m0zqNDo-000EBTC; Wed, 16 Dec 98 21:07 MET | ||||
| Message-Id: <m0zqNDo-000EBTC@orion.SAPserv.Hamburg.dsh.de> | ||||
| From: jwieck@debis.com (Jan Wieck) | ||||
| Subject: Re: [HACKERS] redolog - for discussion | ||||
| To: vadim@krs.ru (Vadim Mikheev) | ||||
| Date: Wed, 16 Dec 1998 21:07:00 +0100 (MET) | ||||
| Cc: jwieck@debis.com, pgsql-hackers@postgreSQL.org | ||||
| Reply-To: jwieck@debis.com (Jan Wieck) | ||||
| In-Reply-To: <3677B71D.C67462B3@krs.ru> from "Vadim Mikheev" at Dec 16, 98 08:35:25 pm | ||||
| X-Mailer: ELM [version 2.4 PL25] | ||||
| Content-Type: text | ||||
| Sender: owner-pgsql-hackers@postgreSQL.org | ||||
| Precedence: bulk | ||||
| Status: RO | ||||
|  | ||||
| Vadim wrote: | ||||
|  | ||||
| > | ||||
| > Jan Wieck wrote: | ||||
| > > | ||||
| > >     RECOVER DATABASE {ALL | UNTIL 'datetime' | RESET}; | ||||
| > > | ||||
| > ... | ||||
| > > | ||||
| > >         For  the  others, the backend starts the recovery program | ||||
| > >         which  reads  the  redolog  files,  establishes  database | ||||
| > >         connections  as  required  and reruns all the commands in | ||||
| >                                          ^^^^^^^^^^^^^^^^^^^^^^^^^^ | ||||
| > >         them. If a required logfile isn't  found,  it  tells  the | ||||
| >           ^^^^^ | ||||
| > | ||||
| > I foresee problems with using _commands_ logging for | ||||
| > recovery/replication -:(( | ||||
| > | ||||
| > Let's consider two concurrent updates in READ COMMITTED mode: | ||||
| > | ||||
| > update test set x = 2 where y = 1; | ||||
| > | ||||
| >    and | ||||
| > | ||||
| > update test set x = 3 where y = 1; | ||||
| > | ||||
| > The result of both committed transaction will be x = 2 | ||||
| > if the 1st transaction updated row _after_ 2nd transaction | ||||
| > and x = 3 if the 2nd transaction gets row after 1st one. | ||||
| > Order of updates is not defined by order in which commands | ||||
| > begun and so order in which commands should be rerun | ||||
| > will be unknown... | ||||
|  | ||||
|     Yepp,  the order in which commands began is absolutely not of | ||||
|     interest. Locking could already delay the  execution  of  one | ||||
|     command  until  another  one  started  later has finished and | ||||
|     released the lock.  It's a classic race condition. | ||||
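
The ordering problem can be sketched with a toy replay, assuming a one-row table and the two UPDATE commands quoted above. The helper names `apply_update` and `replay` are hypothetical and stand in for rerunning logged commands.

```python
def apply_update(table, new_x, y):
    """Emulate: UPDATE test SET x = <new_x> WHERE y = <y>."""
    for row in table:
        if row["y"] == y:
            row["x"] = new_x


def replay(commands):
    """Rerun logged commands in the given order against a fresh one-row table."""
    table = [{"x": 0, "y": 1}]
    for new_x, y in commands:
        apply_update(table, new_x, y)
    return table[0]["x"]


start_order = [(2, 1), (3, 1)]   # order in which the commands began
actual_order = [(3, 1), (2, 1)]  # order in which they actually updated the row

x_if_replayed_in_start_order = replay(start_order)
x_in_original_run = replay(actual_order)
```

The two replays end with different values of `x`, so a log ordered by command start cannot reproduce the original run.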
|  | ||||
|     Thus, my plan was to log the queries just before the call  to | ||||
|     CommitTransactionCommand()  in  tcop. This has the advantage, | ||||
|     that queries which bail out with errors don't  get  into  the | ||||
|     log  at  all  and  need not be rerun. And I can set a  static | ||||
|     flag to false before starting the command, which  is  set  to | ||||
|     true  in  the buffer manager when a buffer is written (marked | ||||
|     dirty), so filtering out queries that do no updates at all is | ||||
|     easy. | ||||
|  | ||||
|     Unfortunately  query  level  logging gets hit by the  current | ||||
|     implementation of sequence numbers.  If  a  query  that  gets | ||||
|     aborted  somewhere  in the middle (maybe by a trigger) called | ||||
|     nextval() for rows processed  earlier,  the  sequence  number | ||||
|     isn't  advanced  at  recovery  time,  because  the  query  is | ||||
|     suppressed entirely.  And  sequences  aren't  locked,  so for | ||||
|     concurrently  running  queries  getting numbers from the same | ||||
|     sequence,  the  results   aren't   reproducible.   If    some | ||||
|     application  selects  a  value  resulting from a sequence and | ||||
|     uses that later in another query, how could the redolog  know | ||||
|     that  this has changed? It's a Const in the query logged, and | ||||
|     all that corrupts the whole thing. | ||||
|  | ||||
|     All that is painful and I don't see another solution yet than | ||||
|     to  hook  into  nextval(),  log  out the numbers generated in | ||||
|     normal operation and getting back the same  numbers  in  redo | ||||
|     mode. | ||||
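
A rough sketch of the nextval() hook idea, assuming each logged query carries an identifier so that recorded numbers can be matched to the query that received them. The classes `LoggedSequence` and `ReplaySequence` and the `query_id` parameter are hypothetical and are not part of any PostgreSQL API.

```python
class LoggedSequence:
    """Normal operation: hand out numbers and record who received them."""

    def __init__(self, start=1):
        self.next_value = start
        self.log = {}  # query id -> values handed to that query

    def nextval(self, query_id):
        value = self.next_value
        self.next_value += 1
        self.log.setdefault(query_id, []).append(value)
        return value


class ReplaySequence:
    """Redo mode: return the numbers recorded during normal operation."""

    def __init__(self, log):
        self.pending = {qid: list(vals) for qid, vals in log.items()}

    def nextval(self, query_id):
        return self.pending[query_id].pop(0)
```

For example, if query `q1` consumed a number and then aborted, only `q2` is rerun at recovery time; with the recorded values it still receives the number it saw originally, whereas a fresh sequence would hand it the number that `q1` consumed.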
|  | ||||
|     The whole thing gets more and more complicated :-( | ||||
|  | ||||
|  | ||||
| Jan | ||||
|  | ||||
| -- | ||||
|  | ||||
| #======================================================================# | ||||
| # It's easier to get forgiveness for being wrong than for being right. # | ||||
| # Let's break this rule - forgive me.                                  # | ||||
| #======================================== jwieck@debis.com (Jan Wieck) # | ||||
|  | ||||
|  | ||||
|  | ||||
|  | ||||
| From owner-pgsql-hackers@hub.org Wed Jun 16 09:29:31 1999 | ||||
| Received: from hub.org (hub.org [209.167.229.1]) | ||||
| 	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id JAA22504 | ||||
| 	for <maillist@candle.pha.pa.us>; Wed, 16 Jun 1999 09:29:29 -0400 (EDT) | ||||
| Received: from hub.org (hub.org [209.167.229.1]) | ||||
| 	by hub.org (8.9.3/8.9.3) with ESMTP id JAA02132; | ||||
| 	Wed, 16 Jun 1999 09:18:20 -0400 (EDT) | ||||
| 	(envelope-from owner-pgsql-hackers@hub.org) | ||||
| Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Wed, 16 Jun 1999 09:14:07 +0000 (EDT) | ||||
| Received: (from majordom@localhost) | ||||
| 	by hub.org (8.9.3/8.9.3) id JAA01318 | ||||
| 	for pgsql-hackers-outgoing; Wed, 16 Jun 1999 09:14:06 -0400 (EDT) | ||||
| 	(envelope-from owner-pgsql-hackers@postgreSQL.org) | ||||
| X-Authentication-Warning: hub.org: majordom set sender to owner-pgsql-hackers@postgreSQL.org using -f | ||||
| Received: from sunpine.krs.ru (SunPine.krs.ru [195.161.16.37]) | ||||
| 	by hub.org (8.9.3/8.9.3) with ESMTP id JAA01278 | ||||
| 	for <hackers@postgreSQL.org>; Wed, 16 Jun 1999 09:13:48 -0400 (EDT) | ||||
| 	(envelope-from vadim@krs.ru) | ||||
| Received: from krs.ru (dune.krs.ru [195.161.16.38]) | ||||
| 	by sunpine.krs.ru (8.8.8/8.8.8) with ESMTP id VAA06276 | ||||
| 	for <hackers@postgreSQL.org>; Wed, 16 Jun 1999 21:12:49 +0800 (KRSS) | ||||
| Message-ID: <3767A2CF.E6E4A5F9@krs.ru> | ||||
| Date: Wed, 16 Jun 1999 21:12:47 +0800 | ||||
| From: Vadim Mikheev <vadim@krs.ru> | ||||
| Organization: OJSC Rostelecom (Krasnoyarsk) | ||||
| X-Mailer: Mozilla 4.5 [en] (X11; I; FreeBSD 3.0-RELEASE i386) | ||||
| X-Accept-Language: ru, en | ||||
| MIME-Version: 1.0 | ||||
| To: PostgreSQL Developers List <hackers@postgreSQL.org> | ||||
| Subject: [HACKERS] Savepoints... | ||||
| Content-Type: text/plain; charset=us-ascii | ||||
| Content-Transfer-Encoding: 7bit | ||||
| Sender: owner-pgsql-hackers@postgreSQL.org | ||||
| Precedence: bulk | ||||
| Status: ROr | ||||
|  | ||||
| To have them I need to add a tuple id (6 bytes) to the heap tuple | ||||
| header. Are there objections? Though it's not good to increase  | ||||
| the tuple header size, savepoints are, imho, a very nice feature... | ||||
|  | ||||
| Implementation is, hm, "easy": | ||||
|  | ||||
| - heap_insert/heap_delete/heap_replace/heap_mark4update will | ||||
|   remember updated tid (and current command id) in relation cache | ||||
|   and store previously updated tid (remembered in relation cache) | ||||
|   in additional heap header tid; | ||||
| - lmgr will remember command id when lock was acquired; | ||||
| - for a savepoint we will just store the command id at which | ||||
|   the savepoint was set; | ||||
| - when going to sleep due to a concurrent update of the same row, | ||||
|   the backend will store MyProc and the tuple id in a shmem hash table. | ||||
|  | ||||
| When rolling back to a savepoint, backend will: | ||||
|  | ||||
| - release locks acquired after savepoint; | ||||
| - for a relation updated after savepoint, get last updated tid  | ||||
|   from relation cache, walk through relation, set  | ||||
|   HEAP_XMIN_INVALID/HEAP_XMAX_INVALID in all tuples updated  | ||||
|   after savepoint and wake up concurrent writers blocked | ||||
|   on these tuples (using shmem hash table mentioned above). | ||||
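
A simplified sketch of the rollback step, under stated assumptions: each tuple records the command id that inserted it (`cmin`) and, if deleted, the command id that deleted it (`cmax`), and the rollback scans the whole heap instead of following the chain of previously updated tids proposed above. Lock release and waking blocked writers are omitted. The names `HeapTuple`, `rollback_to`, and `visible` are illustrative only.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class HeapTuple:
    value: str
    cmin: int                    # command id that inserted the tuple
    cmax: Optional[int] = None   # command id that deleted it, if any
    xmin_invalid: bool = False   # set when the insert is rolled back
    xmax_invalid: bool = False   # set when the delete is rolled back


def visible(t: HeapTuple) -> bool:
    """A tuple is visible if its insert stands and it was not validly deleted."""
    return not t.xmin_invalid and (t.cmax is None or t.xmax_invalid)


def rollback_to(heap: List[HeapTuple], savepoint_cid: int) -> None:
    """Undo every insert and delete performed after the savepoint's command id."""
    for t in heap:
        if t.cmin > savepoint_cid:
            t.xmin_invalid = True
        if t.cmax is not None and t.cmax > savepoint_cid:
            t.xmax_invalid = True
```

A savepoint is then just the current command id, which matches the "store command id" bullet above.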
|  | ||||
| The last feature (waking up concurrent writers) is the hardest | ||||
| part to implement. AFAIK, Oracle 7.3 was not able to do it. | ||||
| Can someone comment on whether this feature is implemented in | ||||
| Oracle 8.X or other DBMSes? | ||||
|  | ||||
| Now about implicit savepoints. The backend will place them before | ||||
| each user statement is executed. In the case of failure, the | ||||
| transaction state will be rolled back to the one before execution | ||||
| of the query. As a side effect, this means that we'll get rid of | ||||
| complaints about the entire transaction aborting when a typo | ||||
| causes an abort due to a parser error... | ||||
|  | ||||
| Comments? | ||||
|  | ||||
| Vadim | ||||
|  | ||||
|  | ||||
| @@ -1,392 +0,0 @@ | ||||
| From lockhart@alumni.caltech.edu Thu Jan  7 13:31:08 1999 | ||||
| Received: from renoir.op.net (root@renoir.op.net [209.152.193.4]) | ||||
| 	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id NAA07771 | ||||
| 	for <maillist@candle.pha.pa.us>; Thu, 7 Jan 1999 13:31:06 -0500 (EST) | ||||
| Received: from golem.jpl.nasa.gov (IDENT:root@hectic-2.jpl.nasa.gov [128.149.68.204]) by renoir.op.net (o1/$ Revision: 1.18 $) with ESMTP id NAA14597 for <maillist@candle.pha.pa.us>; Thu, 7 Jan 1999 13:27:37 -0500 (EST) | ||||
| Received: from alumni.caltech.edu (localhost [127.0.0.1]) | ||||
| 	by golem.jpl.nasa.gov (8.8.5/8.8.5) with ESMTP id SAA13416; | ||||
| 	Thu, 7 Jan 1999 18:26:56 GMT | ||||
| Sender: tgl@mythos.jpl.nasa.gov | ||||
| Message-ID: <3694FC70.FAD67BC3@alumni.caltech.edu> | ||||
| Date: Thu, 07 Jan 1999 18:26:56 +0000 | ||||
| From: "Thomas G. Lockhart" <lockhart@alumni.caltech.edu> | ||||
| Organization: Caltech/JPL | ||||
| X-Mailer: Mozilla 4.07 [en] (X11; I; Linux 2.0.30 i686) | ||||
| MIME-Version: 1.0 | ||||
| To: Bruce Momjian <maillist@candle.pha.pa.us> | ||||
| CC: Postgres Hackers List <hackers@postgresql.org> | ||||
| Subject: Outer Joins (and need CASE help) | ||||
| References: <199901071747.MAA07054@candle.pha.pa.us> | ||||
| Content-Type: text/plain; charset=us-ascii | ||||
| Content-Transfer-Encoding: 7bit | ||||
| Status: RO | ||||
|  | ||||
| > Thomas, do you need help on outer joins? | ||||
|  | ||||
| Yes. I'm going slowly partly because I get distracted with other | ||||
| Postgres stuff like docs, and partly because I don't understand all of | ||||
| the pieces I'm working with. | ||||
|  | ||||
| I've identified the place in the MergeJoin code where the null filling | ||||
| for outer joins needs to happen, and have the "merge walk" code done. | ||||
| But I don't have the supporting code which actually would know how to | ||||
| null-fill a result tuple from the left or right. I thought you might be | ||||
| interested in that? | ||||
|  | ||||
| I've done some work in the parser, and can now do things like: | ||||
|  | ||||
| postgres=> select * from t1 join t2 using (i); | ||||
| NOTICE:  JOIN not yet implemented | ||||
| i|j|i|k | ||||
| -+-+-+- | ||||
| 1|2|1|3 | ||||
| (1 row) | ||||
|  | ||||
| But this is just an inner join, and the result isn't quite right since | ||||
| the second "i" column should probably be omitted. At the moment I | ||||
| transform it from the syntax above into existing parse nodes, and | ||||
| everything from there on works. | ||||
|  | ||||
| I don't yet pass an explicit join node into the planner/optimizer, and | ||||
| that will be the hardest part I assume. Perhaps we can work on that | ||||
| together. | ||||
|  | ||||
| So, what I'll try to do (soon, in the next few days?) is put in | ||||
|  | ||||
|   #ifdef ENABLE_OUTER_JOINS | ||||
|  | ||||
| conditional code into the parser area (already there for the executor) | ||||
| and commit everything to the development tree. Does that sound OK? | ||||
|  | ||||
| Oh, and if anyone is looking for something to do, I've got a couple of | ||||
| CASE statements in the case.sql regression test which are commented out | ||||
| because they crash the backend. They involve references to multiple | ||||
| tables within a single result column, and in other contexts that | ||||
| construct works. It would be great if someone had time to track it | ||||
| down... | ||||
|  | ||||
|                      - Tom | ||||
|  | ||||
| From lockhart@alumni.caltech.edu Mon Feb 22 02:01:13 1999 | ||||
| Received: from renoir.op.net (root@renoir.op.net [209.152.193.4]) | ||||
| 	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id CAA22073 | ||||
| 	for <maillist@candle.pha.pa.us>; Mon, 22 Feb 1999 02:01:12 -0500 (EST) | ||||
| Received: from golem.jpl.nasa.gov (IDENT:root@hectic-2.jpl.nasa.gov [128.149.68.204]) by renoir.op.net (o1/$ Revision: 1.18 $) with ESMTP id BAA26054 for <maillist@candle.pha.pa.us>; Mon, 22 Feb 1999 01:57:00 -0500 (EST) | ||||
| Received: from alumni.caltech.edu (localhost [127.0.0.1]) | ||||
| 	by golem.jpl.nasa.gov (8.8.5/8.8.5) with ESMTP id GAA04715; | ||||
| 	Mon, 22 Feb 1999 06:56:36 GMT | ||||
| Sender: tgl@mythos.jpl.nasa.gov | ||||
| Message-ID: <36D0FFA4.32ADB75C@alumni.caltech.edu> | ||||
| Date: Mon, 22 Feb 1999 06:56:36 +0000 | ||||
| From: "Thomas G. Lockhart" <lockhart@alumni.caltech.edu> | ||||
| Organization: Caltech/JPL | ||||
| X-Mailer: Mozilla 4.07 [en] (X11; I; Linux 2.0.36 i686) | ||||
| MIME-Version: 1.0 | ||||
| To: Bruce Momjian <maillist@candle.pha.pa.us> | ||||
| CC: hackers@postgreSQL.org | ||||
| Subject: Re: start on outer join | ||||
| References: <199902220304.WAA10066@candle.pha.pa.us> | ||||
| Content-Type: text/plain; charset=us-ascii | ||||
| Content-Transfer-Encoding: 7bit | ||||
| Status: ROr | ||||
|  | ||||
| Bruce Momjian wrote: | ||||
| >  | ||||
| > > Will apply ... some other changes laying a bit of | ||||
| > > groundwork for outer joins so you can start on the planner/optimizer | ||||
| > > parts :) | ||||
| > Those will be a cinch now that I understand the optimizer.  In fact, I | ||||
| > think it all will happen in the executor. | ||||
|  | ||||
| I've modified executor/nodeMergeJoin.c to walk a left/right/both outer | ||||
| join, but didn't fill in the part which actually creates the result | ||||
| tuple (which will be the current left- or right-side tuple plus nulls | ||||
| for filler). I hope this is up your alley :) | ||||
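
The missing null-fill step can be sketched as follows, assuming both inputs are lists of dicts already sorted on the join key. This illustrates the technique only; it is not the code in executor/nodeMergeJoin.c, and the function name and the `r_` prefix for right-side columns are made up for the example.

```python
def merge_left_outer_join(left, right, key):
    """Merge-join two key-sorted row lists, keeping every left row."""
    result, j = [], 0
    right_cols = [c for c in (right[0] if right else {}) if c != key]
    for lrow in left:
        # Advance past right rows whose key is smaller than the left key.
        while j < len(right) and right[j][key] < lrow[key]:
            j += 1
        k, matched = j, False
        # Emit one joined row per right row with an equal key.
        while k < len(right) and right[k][key] == lrow[key]:
            result.append({**lrow, **{"r_" + c: right[k][c] for c in right_cols}})
            matched = True
            k += 1
        if not matched:
            # No partner on the right: fill the right-side columns with nulls.
            result.append({**lrow, **{"r_" + c: None for c in right_cols}})
    return result
```

A right or full outer join would additionally need to emit right rows that were skipped without a match, null-filled on the left side.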
|  | ||||
| So far, I'm not certain what to pass to the planner. The syntax leads me | ||||
| to pass a select structure from gram.y with a "JoinExpr" structure in | ||||
| the "fromClause" list. I need to expand that with a combination of | ||||
| column names and qualifications, but at the time I see the JoinExpr I | ||||
| don't have access to the top query structure itself. So I may just keep | ||||
| a modestly transformed JoinExpr to expand later or to pass to the | ||||
| planner. | ||||
|  | ||||
| btw, the EXCEPT/INTERSECT stuff from Stefan has some ugliness in gram.y | ||||
| which needs to be fixed (the shift/reduce conflict is not acceptable for | ||||
| our release version) and some of that code clearly needs to move to | ||||
| analyze.c or some other module. | ||||
|  | ||||
|                      - Tom | ||||
|  | ||||
| From maillist Wed Feb 24 05:27:08 1999 | ||||
| Received: (from maillist@localhost) | ||||
| 	by candle.pha.pa.us (8.9.0/8.9.0) id FAA09648; | ||||
| 	Wed, 24 Feb 1999 05:27:08 -0500 (EST) | ||||
| From: Bruce Momjian <maillist> | ||||
| Message-Id: <199902241027.FAA09648@candle.pha.pa.us> | ||||
| Subject: Re: [HACKERS] OUTER joins | ||||
| In-Reply-To: <199902240953.EAA08561@candle.pha.pa.us> from Bruce Momjian at "Feb 24, 1999  4:53:21 am" | ||||
| To: maillist@candle.pha.pa.us (Bruce Momjian) | ||||
| Date: Wed, 24 Feb 1999 05:27:07 -0500 (EST) | ||||
| Cc: lockhart@alumni.caltech.edu, hackers@postgreSQL.org | ||||
| X-Mailer: ELM [version 2.4ME+ PL47 (25)] | ||||
| MIME-Version: 1.0 | ||||
| Content-Type: text/plain; charset=US-ASCII | ||||
| Content-Transfer-Encoding: 7bit | ||||
| Status: RO | ||||
|  | ||||
| >  | ||||
| > How do you propose doing outer joins in non-mergejoin situations? | ||||
| > Mergejoins can only be used currently in equal joins. | ||||
|  | ||||
| Is your solution going to be to make sure the OUTER table is always a | ||||
| MergeJoin, or on the outside of a join loop?  That could work. | ||||
|  | ||||
| That could get tricky if the table is joined to _two_ other tables.  | ||||
| With the cleaned-up optimizer, we can disable non-merge joins in certain | ||||
| circumstances, and prevent OUTER tables from being inner in the others.  | ||||
| Is that the plan? | ||||
|  | ||||
| --  | ||||
|   Bruce Momjian                        |  http://www.op.net/~candle | ||||
|   maillist@candle.pha.pa.us            |  (610) 853-3000 | ||||
|   +  If your life is a hard drive,     |  830 Blythe Avenue | ||||
|   +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026 | ||||
|  | ||||
| From lockhart@alumni.caltech.edu Mon Mar  1 13:01:08 1999 | ||||
| Received: from renoir.op.net (root@renoir.op.net [209.152.193.4]) | ||||
| 	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id NAA21672 | ||||
| 	for <maillist@candle.pha.pa.us>; Mon, 1 Mar 1999 13:01:06 -0500 (EST) | ||||
| Received: from golem.jpl.nasa.gov (IDENT:root@hectic-2.jpl.nasa.gov [128.149.68.204]) by renoir.op.net (o1/$ Revision: 1.18 $) with ESMTP id MAA12756 for <maillist@candle.pha.pa.us>; Mon, 1 Mar 1999 12:14:16 -0500 (EST) | ||||
| Received: from alumni.caltech.edu (localhost [127.0.0.1]) | ||||
| 	by golem.jpl.nasa.gov (8.8.5/8.8.5) with ESMTP id RAA09406; | ||||
| 	Mon, 1 Mar 1999 17:10:49 GMT | ||||
| Sender: tgl@mythos.jpl.nasa.gov | ||||
| Message-ID: <36DACA19.E6DBE7D8@alumni.caltech.edu> | ||||
| Date: Mon, 01 Mar 1999 17:10:49 +0000 | ||||
| From: "Thomas G. Lockhart" <lockhart@alumni.caltech.edu> | ||||
| Organization: Caltech/JPL | ||||
| X-Mailer: Mozilla 4.07 [en] (X11; I; Linux 2.0.36 i686) | ||||
| MIME-Version: 1.0 | ||||
| To: Bruce Momjian <maillist@candle.pha.pa.us> | ||||
| CC: PostgreSQL-development <hackers@postgreSQL.org> | ||||
| Subject: Re: OUTER joins | ||||
| References: <199902240953.EAA08561@candle.pha.pa.us> | ||||
| Content-Type: text/plain; charset=us-ascii | ||||
| Content-Transfer-Encoding: 7bit | ||||
| Status: ROr | ||||
|  | ||||
| (back from a short vacation...) | ||||
|  | ||||
| > How do you propose doing outer joins in non-mergejoin situations? | ||||
| > Mergejoins can only be used currently in equal joins. | ||||
|  | ||||
| Hadn't thought about it, other than figuring that implementing the | ||||
| equi-join first was a good start. There is a class of outer join syntax | ||||
| (the USING clause) which is implicitly an equi-join... | ||||
|  | ||||
|                         - Tom | ||||
|  | ||||
| From lockhart@alumni.caltech.edu Mon Mar  8 21:55:02 1999 | ||||
| Received: from renoir.op.net (root@renoir.op.net [209.152.193.4]) | ||||
| 	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id VAA15978 | ||||
| 	for <maillist@candle.pha.pa.us>; Mon, 8 Mar 1999 21:54:57 -0500 (EST) | ||||
| Received: from golem.jpl.nasa.gov (IDENT:root@hectic-1.jpl.nasa.gov [128.149.68.203]) by renoir.op.net (o1/$ Revision: 1.18 $) with ESMTP id VAA15837 for <maillist@candle.pha.pa.us>; Mon, 8 Mar 1999 21:48:33 -0500 (EST) | ||||
| Received: from alumni.caltech.edu (localhost [127.0.0.1]) | ||||
| 	by golem.jpl.nasa.gov (8.8.5/8.8.5) with ESMTP id CAA06996; | ||||
| 	Tue, 9 Mar 1999 02:46:40 GMT | ||||
| Sender: tgl@mythos.jpl.nasa.gov | ||||
| Message-ID: <36E48B90.F3E902B7@alumni.caltech.edu> | ||||
| Date: Tue, 09 Mar 1999 02:46:40 +0000 | ||||
| From: "Thomas G. Lockhart" <lockhart@alumni.caltech.edu> | ||||
| Organization: Caltech/JPL | ||||
| X-Mailer: Mozilla 4.07 [en] (X11; I; Linux 2.0.36 i686) | ||||
| MIME-Version: 1.0 | ||||
| To: Bruce Momjian <maillist@candle.pha.pa.us> | ||||
| CC: hackers@postgreSQL.org | ||||
| Subject: Re: OUTER joins | ||||
| References: <199903070325.WAA10357@candle.pha.pa.us> | ||||
| Content-Type: text/plain; charset=us-ascii | ||||
| Content-Transfer-Encoding: 7bit | ||||
| Status: ROr | ||||
|  | ||||
| > > Hadn't thought about it, other than figuring that implementing the | ||||
| > > equi-join first was a good start. There is a class of outer join  | ||||
| > > syntax (the USING clause) which is implicitly an equi-join... | ||||
| > Not that easy.  You don't automatically get a mergejoin from an | ||||
| > equijoin.  I will have to force outer's to be either mergejoins, or | ||||
| > inners of non-merge joins.  Can you add code to non-merge joins in the | ||||
| > executor to throw out a null row if it does not find an inner match  | ||||
| > for the outer row, and I will handle the optimizer so it doesn't throw  | ||||
| > a non-conforming plan to the executor. | ||||
|  | ||||
| So far I don't have enough info in the parser to get the | ||||
| planner/optimizer going. Should we work from the front to the back, or | ||||
| should I go ahead and look at the non-merge joins? It's painfully | ||||
| obvious that I don't know anything about the middle parts of this to | ||||
| proceed without lots more research. | ||||
|  | ||||
|                         - Tom | ||||
|  | ||||
| From lockhart@alumni.caltech.edu Tue Mar  9 22:47:57 1999 | ||||
| Received: from golem.jpl.nasa.gov (IDENT:root@hectic-1.jpl.nasa.gov [128.149.68.203]) | ||||
| 	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id WAA07869 | ||||
| 	for <maillist@candle.pha.pa.us>; Tue, 9 Mar 1999 22:47:54 -0500 (EST) | ||||
| Received: from alumni.caltech.edu (localhost [127.0.0.1]) | ||||
| 	by golem.jpl.nasa.gov (8.8.5/8.8.5) with ESMTP id DAA14761; | ||||
| 	Wed, 10 Mar 1999 03:46:43 GMT | ||||
| Sender: tgl@mythos.jpl.nasa.gov | ||||
| Message-ID: <36E5EB23.F5CD959B@alumni.caltech.edu> | ||||
| Date: Wed, 10 Mar 1999 03:46:43 +0000 | ||||
| From: "Thomas G. Lockhart" <lockhart@alumni.caltech.edu> | ||||
| Organization: Caltech/JPL | ||||
| X-Mailer: Mozilla 4.07 [en] (X11; I; Linux 2.0.36 i686) | ||||
| MIME-Version: 1.0 | ||||
| To: Bruce Momjian <maillist@candle.pha.pa.us>, tgl@mythos.jpl.nasa.gov | ||||
| Subject: Re: SQL outer | ||||
| References: <199903100112.UAA05772@candle.pha.pa.us> | ||||
| Content-Type: text/plain; charset=us-ascii | ||||
| Content-Transfer-Encoding: 7bit | ||||
| Status: RO | ||||
|  | ||||
| >         select  * | ||||
| >         from    outer tab1, tab2, tab3 | ||||
| >         where   tab1.col1 = tab2.col1 and | ||||
| >                 tab1.col1 = tab3.col1 | ||||
|  | ||||
| select * | ||||
| from t1 left join t2 using (c1) | ||||
|         join t3 on (c1 = t3.c1) | ||||
|  | ||||
| Result: | ||||
| t1.c1	t1.c2	t2.c2	t3.c2 | ||||
| 2	12	NULL	32 | ||||
|  | ||||
| t1: | ||||
| c1	c2 | ||||
| 1	11 | ||||
| 2	12 | ||||
| 3	13 | ||||
| 4	14 | ||||
|  | ||||
| t2: | ||||
| c1	c2 | ||||
| 1	21 | ||||
| 3	23 | ||||
|  | ||||
| t3: | ||||
| c1	c2 | ||||
| 2	32 | ||||
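
The example can be checked with a short script, here using SQLite through Python's standard `sqlite3` module as a stand-in for a backend with outer join support. Two details are assumptions of this sketch: the join condition is written as `t1.c1 = t3.c1` to avoid an ambiguous column name, and the last selected column is `t3.c2`, which is where the value 32 is stored.

```python
import sqlite3


def run_example():
    """Load the three sample tables and run the left join followed by a join."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE t1 (c1 INTEGER, c2 INTEGER);
        CREATE TABLE t2 (c1 INTEGER, c2 INTEGER);
        CREATE TABLE t3 (c1 INTEGER, c2 INTEGER);
        INSERT INTO t1 VALUES (1, 11), (2, 12), (3, 13), (4, 14);
        INSERT INTO t2 VALUES (1, 21), (3, 23);
        INSERT INTO t3 VALUES (2, 32);
    """)
    rows = con.execute("""
        SELECT t1.c1, t1.c2, t2.c2, t3.c2
        FROM t1 LEFT JOIN t2 USING (c1)
                JOIN t3 ON (t1.c1 = t3.c1)
    """).fetchall()
    con.close()
    return rows
```

Only the t1 row with c1 = 2 survives the inner join with t3, and its t2 column is null because t2 has no row with c1 = 2.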
|  | ||||
| From lockhart@alumni.caltech.edu Wed Mar 10 10:48:54 1999 | ||||
| Received: from golem.jpl.nasa.gov (IDENT:root@hectic-1.jpl.nasa.gov [128.149.68.203]) | ||||
| 	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id KAA16741 | ||||
| 	for <maillist@candle.pha.pa.us>; Wed, 10 Mar 1999 10:48:51 -0500 (EST) | ||||
| Received: from alumni.caltech.edu (localhost [127.0.0.1]) | ||||
| 	by golem.jpl.nasa.gov (8.8.5/8.8.5) with ESMTP id PAA17723; | ||||
| 	Wed, 10 Mar 1999 15:48:31 GMT | ||||
| Sender: tgl@mythos.jpl.nasa.gov | ||||
| Message-ID: <36E6944F.1F93B08@alumni.caltech.edu> | ||||
| Date: Wed, 10 Mar 1999 15:48:31 +0000 | ||||
| From: "Thomas G. Lockhart" <lockhart@alumni.caltech.edu> | ||||
| Organization: Caltech/JPL | ||||
| X-Mailer: Mozilla 4.07 [en] (X11; I; Linux 2.0.36 i686) | ||||
| MIME-Version: 1.0 | ||||
| To: Bruce Momjian <maillist@candle.pha.pa.us> | ||||
| CC: Thomas Lockhart <lockhart@alumni.caltech.edu> | ||||
| Subject: Re: SQL outer | ||||
| References: <199903100112.UAA05772@candle.pha.pa.us> <36E5EB23.F5CD959B@alumni.caltech.edu> | ||||
| Content-Type: text/plain; charset=us-ascii | ||||
| Content-Transfer-Encoding: 7bit | ||||
| Status: ROr | ||||
|  | ||||
| Just thinking... | ||||
|  | ||||
| If the initial RelOptInfo groupings are derived from the WHERE clause | ||||
| expressions, how about marking the "outer" property in those expressions | ||||
| in the parser? istm that is where the parser knows about two tables in | ||||
| one place, and I'm generating those expressions anyway. We could add a | ||||
| field(s) to the expression structure, or pass along a slightly different | ||||
| structure... | ||||
|  | ||||
|                          - Tom | ||||
|  | ||||
| From owner-pgsql-hackers@hub.org Wed Jul 21 02:35:13 1999 | ||||
| Received: from hub.org (hub.org [216.126.84.1]) | ||||
| 	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id CAA13837 | ||||
| 	for <maillist@candle.pha.pa.us>; Wed, 21 Jul 1999 02:35:12 -0400 (EDT) | ||||
| Received: from hub.org (hub.org [216.126.84.1]) | ||||
| 	by hub.org (8.9.3/8.9.3) with ESMTP id CAA88539; | ||||
| 	Wed, 21 Jul 1999 02:27:41 -0400 (EDT) | ||||
| 	(envelope-from owner-pgsql-hackers@hub.org) | ||||
| Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Wed, 21 Jul 1999 02:24:08 +0000 (EDT) | ||||
| Received: (from majordom@localhost) | ||||
| 	by hub.org (8.9.3/8.9.3) id CAA87850 | ||||
| 	for pgsql-hackers-outgoing; Wed, 21 Jul 1999 02:23:13 -0400 (EDT) | ||||
| 	(envelope-from owner-pgsql-hackers@postgreSQL.org) | ||||
| Received: from localhost (IDENT:root@hectic-2.jpl.nasa.gov [128.149.68.204]) | ||||
| 	by hub.org (8.9.3/8.9.3) with ESMTP id CAA87810 | ||||
| 	for <pgsql-hackers@postgreSQL.org>; Wed, 21 Jul 1999 02:22:52 -0400 (EDT) | ||||
| 	(envelope-from lockhart@alumni.caltech.edu) | ||||
| Received: from alumni.caltech.edu (lockhart@localhost [127.0.0.1]) | ||||
| 	by localhost (8.8.7/8.8.7) with ESMTP id GAA14480; | ||||
| 	Wed, 21 Jul 1999 06:20:22 GMT | ||||
| Message-ID: <379566A6.A4CDF97F@alumni.caltech.edu> | ||||
| Date: Wed, 21 Jul 1999 06:20:22 +0000 | ||||
| From: Thomas Lockhart <lockhart@alumni.caltech.edu> | ||||
| X-Mailer: Mozilla 4.6 [en] (X11; I; Linux 2.0.36 i686) | ||||
| X-Accept-Language: en | ||||
| MIME-Version: 1.0 | ||||
| To: Tom Lane <tgl@sss.pgh.pa.us> | ||||
| CC: Bruce Momjian <maillist@candle.pha.pa.us>, pgsql-hackers@postgreSQL.org | ||||
| Subject: Re: [HACKERS] Another reason to redesign querytree representation | ||||
| References: <591.932505751@sss.pgh.pa.us> | ||||
| Content-Type: text/plain; charset=us-ascii | ||||
| Content-Transfer-Encoding: 7bit | ||||
| Sender: owner-pgsql-hackers@postgreSQL.org | ||||
| Precedence: bulk | ||||
| Status: RO | ||||
|  | ||||
| > Thomas, what do you think is needed for outer joins? | ||||
|  | ||||
| Bruce and I have talked about it some already: | ||||
|  | ||||
| For outer joins, tables must be combined in a particular order. For | ||||
| example, a left outer join requires that any entries in the left-side | ||||
| table which do not have a corresponding entry in the right-side table | ||||
| be expanded with nulls during the join. The information on the outer | ||||
| join can't be carried by the rte since the same table can appear twice | ||||
| in an outer join expression: | ||||
|  | ||||
|   select * from t1 left join t2 using (i) | ||||
|                 left join t1 on (i = t1.j); | ||||
|  | ||||
| For a query like | ||||
|  | ||||
|   select * from t1 left join t2 using (i) where t2.j = 3; | ||||
|  | ||||
| istm that the outer join must be done before the t2 qualification is | ||||
| applied, and that another ordering may produce the wrong result. | ||||
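
A small sketch of why the ordering matters, using made-up rows for t1 and t2 and a hypothetical nested-loop helper `left_join_using_i`. It compares applying the t2 qualification after the outer join with applying it to t2 before the join.

```python
t1 = [{"i": 1}, {"i": 2}]
t2 = [{"i": 1, "j": 3}, {"i": 2, "j": 4}]


def left_join_using_i(left, right):
    """Emulate: left LEFT JOIN right USING (i), null-filling unmatched rows."""
    out = []
    for l in left:
        matches = [r for r in right if r["i"] == l["i"]]
        if matches:
            out.extend({"i": l["i"], "j": r["j"]} for r in matches)
        else:
            out.append({"i": l["i"], "j": None})  # null-filled row
    return out


# Join first, then apply WHERE t2.j = 3 to the joined rows.
join_then_filter = [row for row in left_join_using_i(t1, t2) if row["j"] == 3]

# Apply the qualification to t2 first, then join: a different result.
filter_then_join = left_join_using_i(t1, [r for r in t2 if r["j"] == 3])
```

With these rows, filtering after the join returns one row, while filtering t2 first lets the row with i = 2 through as a null-filled row.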
|  | ||||
| From what I understand Bruce to say, the planner/optimizer is allowed | ||||
| to try all kinds of permutations of plans, choosing the one with the | ||||
| lowest cost. But if the info for the join is carried in a | ||||
| qualification node, then the planner/optimizer must know that it can't | ||||
| reorder the query as freely as it does now. | ||||
|  | ||||
| I was thinking of having a new qualification node to carry this info, | ||||
| and it could be transformed into a mergejoin node which has a couple | ||||
| of new fields indicating left and/or right outer join behavior. | ||||
|  | ||||
| A hashjoin method may be possible for queries which are structured as | ||||
| a left outer join; other outer joins will need to use the mergejoin | ||||
| method. Also, some poorly-qualified outer joins reduce to inner joins, | ||||
| and perhaps the optimizer can be smart enough to realize this. | ||||
|  | ||||
|                      - Thomas | ||||
|  | ||||
| --  | ||||
| Thomas Lockhart				lockhart@alumni.caltech.edu | ||||
| South Pasadena, California | ||||
|  | ||||
|  | ||||