mirror of
				https://github.com/postgres/postgres.git
				synced 2025-11-03 09:13:20 +03:00 
			
		
		
		
	
		
			
				
	
	
		
			1241 lines
		
	
	
		
			50 KiB
		
	
	
	
		
			Plaintext
		
	
	
	
	
	
			
		
		
	
	
			1241 lines
		
	
	
		
			50 KiB
		
	
	
	
		
			Plaintext
		
	
	
	
	
	
From owner-pgsql-hackers@hub.org Thu Nov 26 08:31:13 1998
 | 
						|
Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
 | 
						|
	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id IAA24423
 | 
						|
	for <maillist@candle.pha.pa.us>; Thu, 26 Nov 1998 08:31:08 -0500 (EST)
 | 
						|
Received: from hub.org (majordom@hub.org [209.47.148.200]) by renoir.op.net (o1/$ Revision: 1.18 $) with ESMTP id IAA04554 for <maillist@candle.pha.pa.us>; Thu, 26 Nov 1998 08:04:30 -0500 (EST)
 | 
						|
Received: from localhost (majordom@localhost)
 | 
						|
	by hub.org (8.9.1/8.9.1) with SMTP id HAA03761;
 | 
						|
	Thu, 26 Nov 1998 07:56:37 -0500 (EST)
 | 
						|
	(envelope-from owner-pgsql-hackers@hub.org)
 | 
						|
Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Thu, 26 Nov 1998 07:55:28 +0000 (EST)
 | 
						|
Received: (from majordom@localhost)
 | 
						|
	by hub.org (8.9.1/8.9.1) id HAA03689
 | 
						|
	for pgsql-hackers-outgoing; Thu, 26 Nov 1998 07:55:26 -0500 (EST)
 | 
						|
	(envelope-from owner-pgsql-hackers@postgreSQL.org)
 | 
						|
Received: from orion.SAPserv.Hamburg.dsh.de (Tpolaris2.sapham.debis.de [53.2.131.8])
 | 
						|
	by hub.org (8.9.1/8.9.1) with SMTP id HAA03674
 | 
						|
	for <pgsql-hackers@postgreSQL.org>; Thu, 26 Nov 1998 07:55:19 -0500 (EST)
 | 
						|
	(envelope-from wieck@sapserv.debis.de)
 | 
						|
Received: by orion.SAPserv.Hamburg.dsh.de 
 | 
						|
	for pgsql-hackers@postgreSQL.org 
 | 
						|
	id m0zj13G-000EBfC; Thu, 26 Nov 98 14:01 MET
 | 
						|
Message-Id: <m0zj13G-000EBfC@orion.SAPserv.Hamburg.dsh.de>
 | 
						|
From: jwieck@debis.com (Jan Wieck)
 | 
						|
Subject: Re: [HACKERS] Re: memory leak with Abort Transaction
 | 
						|
To: takehi-s@ascii.co.jp (SHIOZAKI Takehiko)
 | 
						|
Date: Thu, 26 Nov 1998 14:01:42 +0100 (MET)
 | 
						|
Cc: pgsql-hackers@postgreSQL.org
 | 
						|
Reply-To: jwieck@debis.com (Jan Wieck)
 | 
						|
In-Reply-To: <199811261240.VAA27516@libpc01.pb.ascii.co.jp> from "SHIOZAKI Takehiko" at Nov 26, 98 09:40:19 pm
 | 
						|
X-Mailer: ELM [version 2.4 PL25]
 | 
						|
Content-Type: text
 | 
						|
Sender: owner-pgsql-hackers@postgreSQL.org
 | 
						|
Precedence: bulk
 | 
						|
Status: RO
 | 
						|
 | 
						|
SHIOZAKI Takehiko wrote:
 | 
						|
 | 
						|
>
 | 
						|
> Hello!
 | 
						|
>
 | 
						|
> Releasing 6.4.1 is a good news.
 | 
						|
> But would you confirm the following "memory leak" problem?
 | 
						|
> It is reproducable on 6.4 (FreeBSD 2.2.7-RELEASE).
 | 
						|
 | 
						|
    It's  an far too old problem. And as far as I remember, there
 | 
						|
    are different locations in the code causing it.
 | 
						|
 | 
						|
    One place I remember well.  It's  in  the  tcop  mainloop  in
 | 
						|
    PostgresMain().  The querytree list is malloc()'ed (there and
 | 
						|
    in the parser) and free()'d after the query  is  processed  -
 | 
						|
    except  the  processing of the queries bails out with elog().
 | 
						|
    In that case it  never  runs  over  the  free()  because  the
 | 
						|
    longjmp() kick's it back to the beginning of the loop.
 | 
						|
 | 
						|
 | 
						|
Jan
 | 
						|
 | 
						|
--
 | 
						|
 | 
						|
#======================================================================#
 | 
						|
# It's easier to get forgiveness for being wrong than for being right. #
 | 
						|
# Let's break this rule - forgive me.                                  #
 | 
						|
#======================================== jwieck@debis.com (Jan Wieck) #
 | 
						|
 | 
						|
 | 
						|
 | 
						|
 | 
						|
From owner-pgsql-hackers@hub.org Fri Mar 19 16:01:29 1999
 | 
						|
Received: from hub.org (majordom@hub.org [209.47.145.100])
 | 
						|
	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id QAA05828
 | 
						|
	for <maillist@candle.pha.pa.us>; Fri, 19 Mar 1999 16:01:22 -0500 (EST)
 | 
						|
Received: from localhost (majordom@localhost)
 | 
						|
	by hub.org (8.9.2/8.9.1) with SMTP id PAA15701;
 | 
						|
	Fri, 19 Mar 1999 15:59:51 -0500 (EST)
 | 
						|
	(envelope-from owner-pgsql-hackers@hub.org)
 | 
						|
Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Fri, 19 Mar 1999 15:59:08 +0000 (EST)
 | 
						|
Received: (from majordom@localhost)
 | 
						|
	by hub.org (8.9.2/8.9.1) id PAA15551
 | 
						|
	for pgsql-hackers-outgoing; Fri, 19 Mar 1999 15:59:05 -0500 (EST)
 | 
						|
	(envelope-from owner-pgsql-hackers@postgreSQL.org)
 | 
						|
Received: from andrew.cmu.edu (ANDREW.CMU.EDU [128.2.10.101])
 | 
						|
	by hub.org (8.9.2/8.9.1) with ESMTP id PAA15524
 | 
						|
	for <pgsql-hackers@postgresql.org>; Fri, 19 Mar 1999 15:58:53 -0500 (EST)
 | 
						|
	(envelope-from er1p+@andrew.cmu.edu)
 | 
						|
Received: (from postman@localhost) by andrew.cmu.edu (8.8.5/8.8.2) id PAA29323 for pgsql-hackers@postgresql.org; Fri, 19 Mar 1999 15:58:50 -0500 (EST)
 | 
						|
Received: via switchmail; Fri, 19 Mar 1999 15:58:50 -0500 (EST)
 | 
						|
Received: from cloudy.me.cmu.edu via qmail
 | 
						|
          ID </afs/andrew.cmu.edu/service/mailqs/q005/QF.cqwfdxK00gNtE0TVBp>;
 | 
						|
          Fri, 19 Mar 1999 15:58:37 -0500 (EST)
 | 
						|
Received: from cloudy.me.cmu.edu via qmail
 | 
						|
          ID </afs/andrew.cmu.edu/usr2/er1p/.Outgoing/QF.MqwfdrO00gNtEmTPh2>;
 | 
						|
          Fri, 19 Mar 1999 15:58:31 -0500 (EST)
 | 
						|
Received: from mms.4.60.Jun.27.1996.03.05.56.sun4.41.EzMail.2.0.CUILIB.3.45.SNAP.NOT.LINKED.cloudy.me.cmu.edu.sun4m.412
 | 
						|
          via MS.5.6.cloudy.me.cmu.edu.sun4_41;
 | 
						|
          Fri, 19 Mar 1999 15:58:29 -0500 (EST)
 | 
						|
Message-ID: <wqwfdpu00gNtImTPUm@andrew.cmu.edu>
 | 
						|
Date: Fri, 19 Mar 1999 15:58:29 -0500 (EST)
 | 
						|
From: Erik Riedel <riedel+@CMU.EDU>
 | 
						|
To: pgsql-hackers@postgreSQL.org
 | 
						|
Subject: [HACKERS] aggregation memory leak and fix
 | 
						|
Sender: owner-pgsql-hackers@postgreSQL.org
 | 
						|
Precedence: bulk
 | 
						|
Status: ROr
 | 
						|
 | 
						|
 | 
						|
Platform:  Alpha, Digital UNIX 4.0D
 | 
						|
Software:  PostgreSQL 6.4.2 and 6.5 snaphot (11 March 1999)
 | 
						|
 | 
						|
I have a table as follows:
 | 
						|
 | 
						|
Table    = lineitem
 | 
						|
+------------------------+----------------------------------+-------+
 | 
						|
|              Field     |              Type                | Length|
 | 
						|
+------------------------+----------------------------------+-------+
 | 
						|
| l_orderkey             | int4 not null                    |     4 |
 | 
						|
| l_partkey              | int4 not null                    |     4 |
 | 
						|
| l_suppkey              | int4 not null                    |     4 |
 | 
						|
| l_linenumber           | int4 not null                    |     4 |
 | 
						|
| l_quantity             | float4 not null                  |     4 |
 | 
						|
| l_extendedprice        | float4 not null                  |     4 |
 | 
						|
| l_discount             | float4 not null                  |     4 |
 | 
						|
| l_tax                  | float4 not null                  |     4 |
 | 
						|
| l_returnflag           | char() not null                  |     1 |
 | 
						|
| l_linestatus           | char() not null                  |     1 |
 | 
						|
| l_shipdate             | date                             |     4 |
 | 
						|
| l_commitdate           | date                             |     4 |
 | 
						|
| l_receiptdate          | date                             |     4 |
 | 
						|
| l_shipinstruct         | char() not null                  |    25 |
 | 
						|
| l_shipmode             | char() not null                  |    10 |
 | 
						|
| l_comment              | char() not null                  |    44 |
 | 
						|
+------------------------+----------------------------------+-------+
 | 
						|
Index:    lineitem_index_
 | 
						|
 | 
						|
that ends up having on the order of 500,000 rows (about 100 MB on disk).  
 | 
						|
 | 
						|
I then run an aggregation query as:
 | 
						|
 | 
						|
--
 | 
						|
-- Query 1
 | 
						|
--
 | 
						|
select l_returnflag, l_linestatus, sum(l_quantity) as sum_qty, 
 | 
						|
sum(l_extendedprice) as sum_base_price, 
 | 
						|
sum(l_extendedprice*(1-l_discount)) as sum_disc_price, 
 | 
						|
sum(l_extendedprice*(1-l_discount)*(1+l_tax)) as sum_charge, 
 | 
						|
avg(l_quantity) as avg_qty, avg(l_extendedprice) as avg_price, 
 | 
						|
avg(l_discount) as avg_disc, count(*) as count_order 
 | 
						|
from lineitem 
 | 
						|
where l_shipdate <= ('1998-12-01'::datetime - interval '90 day')::date 
 | 
						|
group by l_returnflag, l_linestatus 
 | 
						|
order by l_returnflag, l_linestatus;
 | 
						|
 | 
						|
 | 
						|
when I run this against 6.4.2, the postgres process grows to upwards of
 | 
						|
1 GB of memory (at which point something overflows and it dumps core) -
 | 
						|
I watch it grow through 200 MB, 400 MB, 800 MB, dies somewhere near 1 GB
 | 
						|
of allocated memory).
 | 
						|
 | 
						|
If I take out a few of the "sum" expressions it gets better, removing
 | 
						|
sum_disk_price and sum_charge causes it to be only 600 MB and the query
 | 
						|
actually (eventually) completes.  Takes about 10 minutes on my 500 MHz
 | 
						|
machine with 256 MB core and 4 GB of swap.
 | 
						|
 | 
						|
The problem seems to be the memory allocation mechanism.  Looking at a
 | 
						|
call trace, it is doing some kind of "sub query" plan for each row in
 | 
						|
the database.  That means it does ExecEval and postquel_function and
 | 
						|
postquel_execute and all their friends for each row in the database. 
 | 
						|
Allocating a couple hundred bytes for each one.
 | 
						|
 | 
						|
The problem is that none of these allocations are freed - they seem to
 | 
						|
depend on the AllocSet to free them at the end of the transaction.  This
 | 
						|
means it isn't a "true" leak, because the bytes are all freed at the
 | 
						|
(very) end of the transaction, but it does mean that the process grows
 | 
						|
to unreasonable size in the meantime.  There is no need for this,
 | 
						|
because the individual expression results are aggregated as it goes
 | 
						|
along, so the intermediate nodes can be freed.
 | 
						|
 | 
						|
I spent half a day last week chasing down the offending palloc() calls
 | 
						|
and execution stacks sufficiently that I think I found the right places
 | 
						|
to put pfree() calls.
 | 
						|
 | 
						|
As a result, I have changes in the files:
 | 
						|
 | 
						|
src/backend/executor/execUtils.c
 | 
						|
src/backend/executor/nodeResult.c 
 | 
						|
src/backend/executor/nodeAgg.c 
 | 
						|
src/backend/executor/execMain.c 
 | 
						|
 | 
						|
patches to these files are attached at the end of this message.  These
 | 
						|
files are based on the 6.5.0 snapshot downloaded from ftp.postgreql.org
 | 
						|
on 11 March 1999.
 | 
						|
 | 
						|
Apologies for sending patches to a non-released version.  If anyone has
 | 
						|
problems applying the patches, I can send the full files (I wanted to
 | 
						|
avoid sending a 100K shell archive to the list).  If anyone cares about
 | 
						|
reproducing my exact problem with the above table, I can provide the 100
 | 
						|
MB pg_dump file for download as well.
 | 
						|
 | 
						|
Secondary Issue:  the reason I did not use the 6.4.2 code to make my
 | 
						|
changes is because the AllocSet calls in that one were particularly
 | 
						|
egregious - they only had the skeleton of the allocsets code that exists
 | 
						|
in the 6.5 snapshots, so they were calling malloc() for all of the 8 and
 | 
						|
16 byte allocations that the above query causes.
 | 
						|
 | 
						|
Using the fixed code reduces the maximum memory requirement on the above
 | 
						|
query to about 210 MB, and reduces the runtime to (an acceptable) 1.5
 | 
						|
minutes - a factor of more than 6x improvement on my 256 MB machine.
 | 
						|
 | 
						|
Now the biggest part of the execution time is in the sort before the
 | 
						|
aggregation (which isn't strictly needed, but that is an optimization
 | 
						|
for another day).
 | 
						|
 | 
						|
Open Issue: there is still a small "leak" that I couldn't eliminate, I
 | 
						|
think I chased it down to the constvalue allocated in
 | 
						|
execQual::ExecTargetList(), but I couldn't figure out where to properly
 | 
						|
free it.  8 bytes leaked was much better than 750 bytes, so I stopped
 | 
						|
banging my head on that particular item.
 | 
						|
 | 
						|
Secondary Open Issue: what I did have to do to get down to 210 MB of
 | 
						|
core was reduce the minimum allocation size in AllocSet to 8 bytes from
 | 
						|
16 bytes.  That reduces the 8 byte leak above to a true 8 byte, rather
 | 
						|
than a 16 byte leak.  Otherwise, I think the size was 280 MB (still a
 | 
						|
big improvement on 1000+ MB).  I only changed this in my code and I am
 | 
						|
not including a changed mcxt.c for that.
 | 
						|
 | 
						|
I hope my changes are understandable/reasonable.  Enjoy.
 | 
						|
 | 
						|
Erik Riedel
 | 
						|
Carnegie Mellon University
 | 
						|
www.cs.cmu.edu/~riedel
 | 
						|
 | 
						|
--------------[aggregation_memory_patch.sh]-----------------------
 | 
						|
 | 
						|
#! /bin/sh
 | 
						|
# This is a shell archive, meaning:
 | 
						|
# 1. Remove everything above the #! /bin/sh line.
 | 
						|
# 2. Save the resulting text in a file.
 | 
						|
# 3. Execute the file with /bin/sh (not csh) to create:
 | 
						|
#	execMain.c.diff
 | 
						|
#	execUtils.c.diff
 | 
						|
#	nodeAgg.c.diff
 | 
						|
#	nodeResult.c.diff
 | 
						|
# This archive created: Fri Mar 19 15:47:17 1999
 | 
						|
export PATH; PATH=/bin:/usr/bin:$PATH
 | 
						|
if test -f 'execMain.c.diff'
 | 
						|
then
 | 
						|
	echo shar: "will not over-write existing file 'execMain.c.diff'"
 | 
						|
else
 | 
						|
cat << \SHAR_EOF > 'execMain.c.diff'
 | 
						|
583c
 | 
						|
	
 | 
						|
.
 | 
						|
398a
 | 
						|
 | 
						|
.
 | 
						|
396a
 | 
						|
	/* XXX - clean up some more from ExecutorStart() - er1p */
 | 
						|
	if (NULL == estate->es_snapshot) {
 | 
						|
	  /* nothing to free */
 | 
						|
	} else {
 | 
						|
	  if (estate->es_snapshot->xcnt > 0) { 
 | 
						|
	    pfree(estate->es_snapshot->xip);
 | 
						|
	  }
 | 
						|
	  pfree(estate->es_snapshot);
 | 
						|
	}
 | 
						|
 | 
						|
	if (NULL == estate->es_param_exec_vals) {
 | 
						|
	  /* nothing to free */
 | 
						|
	} else {
 | 
						|
	  pfree(estate->es_param_exec_vals);
 | 
						|
	  estate->es_param_exec_vals = NULL;
 | 
						|
	}
 | 
						|
 | 
						|
.
 | 
						|
SHAR_EOF
 | 
						|
fi
 | 
						|
if test -f 'execUtils.c.diff'
 | 
						|
then
 | 
						|
	echo shar: "will not over-write existing file 'execUtils.c.diff'"
 | 
						|
else
 | 
						|
cat << \SHAR_EOF > 'execUtils.c.diff'
 | 
						|
368a
 | 
						|
}
 | 
						|
 | 
						|
/* ----------------
 | 
						|
 *		ExecFreeExprContext
 | 
						|
 * ----------------
 | 
						|
 */
 | 
						|
void
 | 
						|
ExecFreeExprContext(CommonState *commonstate)
 | 
						|
{
 | 
						|
	ExprContext *econtext;
 | 
						|
 | 
						|
	/* ----------------
 | 
						|
	 *	get expression context.  if NULL then this node has
 | 
						|
	 *	none so we just return.
 | 
						|
	 * ----------------
 | 
						|
	 */
 | 
						|
	econtext = commonstate->cs_ExprContext;
 | 
						|
	if (econtext == NULL)
 | 
						|
		return;
 | 
						|
 | 
						|
	/* ----------------
 | 
						|
	 *	clean up memory used.
 | 
						|
	 * ----------------
 | 
						|
	 */
 | 
						|
	pfree(econtext);
 | 
						|
	commonstate->cs_ExprContext = NULL;
 | 
						|
}
 | 
						|
 | 
						|
/* ----------------
 | 
						|
 *		ExecFreeTypeInfo
 | 
						|
 * ----------------
 | 
						|
 */
 | 
						|
void
 | 
						|
ExecFreeTypeInfo(CommonState *commonstate)
 | 
						|
{
 | 
						|
	TupleDesc tupDesc;
 | 
						|
 | 
						|
	tupDesc = commonstate->cs_ResultTupleSlot->ttc_tupleDescriptor;
 | 
						|
	if (tupDesc == NULL)
 | 
						|
		return;
 | 
						|
 | 
						|
	/* ----------------
 | 
						|
	 *	clean up memory used.
 | 
						|
	 * ----------------
 | 
						|
	 */
 | 
						|
	FreeTupleDesc(tupDesc);
 | 
						|
	commonstate->cs_ResultTupleSlot->ttc_tupleDescriptor = NULL;
 | 
						|
.
 | 
						|
274a
 | 
						|
 | 
						|
.
 | 
						|
SHAR_EOF
 | 
						|
fi
 | 
						|
if test -f 'nodeAgg.c.diff'
 | 
						|
then
 | 
						|
	echo shar: "will not over-write existing file 'nodeAgg.c.diff'"
 | 
						|
else
 | 
						|
cat << \SHAR_EOF > 'nodeAgg.c.diff'
 | 
						|
376a
 | 
						|
						pfree(oldVal); /* XXX - new, let's free the old datum - er1p */
 | 
						|
.
 | 
						|
374a
 | 
						|
						oldVal = value1[aggno]; /* XXX - save so we can free later - er1p */
 | 
						|
.
 | 
						|
112a
 | 
						|
	Datum  oldVal = (Datum) NULL;  /* XXX - so that we can save and free on
 | 
						|
each iteration - er1p */
 | 
						|
.
 | 
						|
SHAR_EOF
 | 
						|
fi
 | 
						|
if test -f 'nodeResult.c.diff'
 | 
						|
then
 | 
						|
	echo shar: "will not over-write existing file 'nodeResult.c.diff'"
 | 
						|
else
 | 
						|
cat << \SHAR_EOF > 'nodeResult.c.diff'
 | 
						|
278a
 | 
						|
	pfree(resstate); node->resstate = NULL; /* XXX - new for us - er1p */
 | 
						|
.
 | 
						|
265a
 | 
						|
	ExecFreeExprContext(&resstate->cstate); /* XXX - new for us - er1p */
 | 
						|
	ExecFreeTypeInfo(&resstate->cstate); /* XXX - new for us - er1p */
 | 
						|
.
 | 
						|
SHAR_EOF
 | 
						|
fi
 | 
						|
exit 0
 | 
						|
#	End of shell archive
 | 
						|
 | 
						|
 | 
						|
 | 
						|
From er1p+@andrew.cmu.edu Fri Mar 19 19:43:27 1999
 | 
						|
Received: from po8.andrew.cmu.edu (PO8.ANDREW.CMU.EDU [128.2.10.108])
 | 
						|
	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id TAA09183
 | 
						|
	for <maillist@candle.pha.pa.us>; Fri, 19 Mar 1999 19:43:26 -0500 (EST)
 | 
						|
Received: (from postman@localhost) by po8.andrew.cmu.edu (8.8.5/8.8.2) id TAA11773; Fri, 19 Mar 1999 19:43:18 -0500 (EST)
 | 
						|
Received: via switchmail; Fri, 19 Mar 1999 19:43:18 -0500 (EST)
 | 
						|
Received: from cloudy.me.cmu.edu via qmail
 | 
						|
          ID </afs/andrew.cmu.edu/service/mailqs/q007/QF.oqwiwLK00gNtQmTLgB>;
 | 
						|
          Fri, 19 Mar 1999 19:43:05 -0500 (EST)
 | 
						|
Received: from mms.4.60.Jun.27.1996.03.05.56.sun4.41.EzMail.2.0.CUILIB.3.45.SNAP.NOT.LINKED.cloudy.me.cmu.edu.sun4m.412
 | 
						|
          via MS.5.6.cloudy.me.cmu.edu.sun4_41;
 | 
						|
          Fri, 19 Mar 1999 19:43:02 -0500 (EST)
 | 
						|
Message-ID: <YqwiwKW00gNt4mTKsv@andrew.cmu.edu>
 | 
						|
Date: Fri, 19 Mar 1999 19:43:02 -0500 (EST)
 | 
						|
From: Erik Riedel <riedel+@CMU.EDU>
 | 
						|
To: Bruce Momjian <maillist@candle.pha.pa.us>
 | 
						|
Subject: Re: [HACKERS] aggregation memory leak and fix
 | 
						|
Cc: pgsql-hackers@postgreSQL.org
 | 
						|
In-Reply-To: <199903192223.RAA06691@candle.pha.pa.us>
 | 
						|
References: <199903192223.RAA06691@candle.pha.pa.us>
 | 
						|
Status: ROr
 | 
						|
 | 
						|
 | 
						|
> No apologies necessary.  Glad to have someone digging into that area of
 | 
						|
> the code.  We will gladly apply your patches to 6.5.  However, I request
 | 
						|
> that you send context diffs(diff -c).  Normal diffs are just too
 | 
						|
> error-prone in application.   Send them, and I will apply them right
 | 
						|
> away.
 | 
						|
>  
 | 
						|
Context diffs attached.  This was due to my ignorance of diff.  When I
 | 
						|
made the other files, I though "hmm, these could be difficult to apply
 | 
						|
if the code has changed a bit, wouldn't it be good if they included a
 | 
						|
few lines before and after the fix".  Now I know "-c".
 | 
						|
 | 
						|
> Not sure why that is there?  Perhaps for GROUP BY processing?
 | 
						|
>  
 | 
						|
Right, it is a result of the Group processing requiring sorted input. 
 | 
						|
Just that it doesn't "require" sorted input, it "could" be a little more
 | 
						|
flexible and the sort wouldn't be necessary.  Essentially this would be
 | 
						|
a single "AggSort" node that did the aggregation while sorting (probably
 | 
						|
with replacement selection rather than quicksort).  This definitely
 | 
						|
would require some code/smarts that isn't there today.
 | 
						|
 | 
						|
> > think I chased it down to the constvalue allocated in
 | 
						|
> > execQual::ExecTargetList(), but I couldn't figure out where to properly
 | 
						|
> > free it.  8 bytes leaked was much better than 750 bytes, so I stopped
 | 
						|
> > banging my head on that particular item.
 | 
						|
>  
 | 
						|
> Can you give me the exact line?  Is it the palloc(1)?
 | 
						|
>  
 | 
						|
No, the 8 bytes seem to come from the ExecEvalExpr() call near line
 | 
						|
1530.  Problem was when I tried to free these, I got "not in AllocSet"
 | 
						|
errors, so something more complicated was going on.
 | 
						|
 | 
						|
Thanks.
 | 
						|
 | 
						|
Erik
 | 
						|
 | 
						|
-----------[aggregation_memory_patch.sh]----------------------
 | 
						|
 | 
						|
#! /bin/sh
 | 
						|
# This is a shell archive, meaning:
 | 
						|
# 1. Remove everything above the #! /bin/sh line.
 | 
						|
# 2. Save the resulting text in a file.
 | 
						|
# 3. Execute the file with /bin/sh (not csh) to create:
 | 
						|
#	execMain.c.diff
 | 
						|
#	execUtils.c.diff
 | 
						|
#	nodeAgg.c.diff
 | 
						|
#	nodeResult.c.diff
 | 
						|
# This archive created: Fri Mar 19 19:35:42 1999
 | 
						|
export PATH; PATH=/bin:/usr/bin:$PATH
 | 
						|
if test -f 'execMain.c.diff'
 | 
						|
then
 | 
						|
	echo shar: "will not over-write existing file 'execMain.c.diff'"
 | 
						|
else
 | 
						|
cat << \SHAR_EOF > 'execMain.c.diff'
 | 
						|
***
 | 
						|
/afs/ece.cmu.edu/project/lcs/lcs-004/er1p/postgres/611/src/backend/executor/
 | 
						|
execMain.c	Thu Mar 11 23:59:11 1999
 | 
						|
---
 | 
						|
/afs/ece.cmu.edu/project/lcs/lcs-004/er1p/postgres/612/src/backend/executor/
 | 
						|
execMain.c	Fri Mar 19 15:03:28 1999
 | 
						|
***************
 | 
						|
*** 394,401 ****
 | 
						|
--- 394,419 ----
 | 
						|
  
 | 
						|
  	EndPlan(queryDesc->plantree, estate);
 | 
						|
  
 | 
						|
+ 	/* XXX - clean up some more from ExecutorStart() - er1p */
 | 
						|
+ 	if (NULL == estate->es_snapshot) {
 | 
						|
+ 	  /* nothing to free */
 | 
						|
+ 	} else {
 | 
						|
+ 	  if (estate->es_snapshot->xcnt > 0) { 
 | 
						|
+ 	    pfree(estate->es_snapshot->xip);
 | 
						|
+ 	  }
 | 
						|
+ 	  pfree(estate->es_snapshot);
 | 
						|
+ 	}
 | 
						|
+ 
 | 
						|
+ 	if (NULL == estate->es_param_exec_vals) {
 | 
						|
+ 	  /* nothing to free */
 | 
						|
+ 	} else {
 | 
						|
+ 	  pfree(estate->es_param_exec_vals);
 | 
						|
+ 	  estate->es_param_exec_vals = NULL;
 | 
						|
+ 	}
 | 
						|
+ 
 | 
						|
  	/* restore saved refcounts. */
 | 
						|
  	BufferRefCountRestore(estate->es_refcount);
 | 
						|
+ 
 | 
						|
  }
 | 
						|
  
 | 
						|
  void
 | 
						|
***************
 | 
						|
*** 580,586 ****
 | 
						|
  	/*
 | 
						|
  	 *	initialize result relation stuff
 | 
						|
  	 */
 | 
						|
! 
 | 
						|
  	if (resultRelation != 0 && operation != CMD_SELECT)
 | 
						|
  	{
 | 
						|
  		/*
 | 
						|
--- 598,604 ----
 | 
						|
  	/*
 | 
						|
  	 *	initialize result relation stuff
 | 
						|
  	 */
 | 
						|
! 	
 | 
						|
  	if (resultRelation != 0 && operation != CMD_SELECT)
 | 
						|
  	{
 | 
						|
  		/*
 | 
						|
SHAR_EOF
 | 
						|
fi
 | 
						|
if test -f 'execUtils.c.diff'
 | 
						|
then
 | 
						|
	echo shar: "will not over-write existing file 'execUtils.c.diff'"
 | 
						|
else
 | 
						|
cat << \SHAR_EOF > 'execUtils.c.diff'
 | 
						|
***
 | 
						|
/afs/ece.cmu.edu/project/lcs/lcs-004/er1p/postgres/611/src/backend/executor/
 | 
						|
execUtils.c	Thu Mar 11 23:59:11 1999
 | 
						|
---
 | 
						|
/afs/ece.cmu.edu/project/lcs/lcs-004/er1p/postgres/612/src/backend/executor/
 | 
						|
execUtils.c	Fri Mar 19 14:55:59 1999
 | 
						|
***************
 | 
						|
*** 272,277 ****
 | 
						|
--- 272,278 ----
 | 
						|
  #endif
 | 
						|
  		i++;
 | 
						|
  	}
 | 
						|
+ 
 | 
						|
  	if (len > 0)
 | 
						|
  	{
 | 
						|
  		ExecAssignResultType(commonstate,
 | 
						|
***************
 | 
						|
*** 366,371 ****
 | 
						|
--- 367,419 ----
 | 
						|
  
 | 
						|
  	pfree(projInfo);
 | 
						|
  	commonstate->cs_ProjInfo = NULL;
 | 
						|
+ }
 | 
						|
+ 
 | 
						|
+ /* ----------------
 | 
						|
+  *		ExecFreeExprContext
 | 
						|
+  * ----------------
 | 
						|
+  */
 | 
						|
+ void
 | 
						|
+ ExecFreeExprContext(CommonState *commonstate)
 | 
						|
+ {
 | 
						|
+ 	ExprContext *econtext;
 | 
						|
+ 
 | 
						|
+ 	/* ----------------
 | 
						|
+ 	 *	get expression context.  if NULL then this node has
 | 
						|
+ 	 *	none so we just return.
 | 
						|
+ 	 * ----------------
 | 
						|
+ 	 */
 | 
						|
+ 	econtext = commonstate->cs_ExprContext;
 | 
						|
+ 	if (econtext == NULL)
 | 
						|
+ 		return;
 | 
						|
+ 
 | 
						|
+ 	/* ----------------
 | 
						|
+ 	 *	clean up memory used.
 | 
						|
+ 	 * ----------------
 | 
						|
+ 	 */
 | 
						|
+ 	pfree(econtext);
 | 
						|
+ 	commonstate->cs_ExprContext = NULL;
 | 
						|
+ }
 | 
						|
+ 
 | 
						|
+ /* ----------------
 | 
						|
+  *		ExecFreeTypeInfo
 | 
						|
+  * ----------------
 | 
						|
+  */
 | 
						|
+ void
 | 
						|
+ ExecFreeTypeInfo(CommonState *commonstate)
 | 
						|
+ {
 | 
						|
+ 	TupleDesc tupDesc;
 | 
						|
+ 
 | 
						|
+ 	tupDesc = commonstate->cs_ResultTupleSlot->ttc_tupleDescriptor;
 | 
						|
+ 	if (tupDesc == NULL)
 | 
						|
+ 		return;
 | 
						|
+ 
 | 
						|
+ 	/* ----------------
 | 
						|
+ 	 *	clean up memory used.
 | 
						|
+ 	 * ----------------
 | 
						|
+ 	 */
 | 
						|
+ 	FreeTupleDesc(tupDesc);
 | 
						|
+ 	commonstate->cs_ResultTupleSlot->ttc_tupleDescriptor = NULL;
 | 
						|
  }
 | 
						|
  
 | 
						|
  /* ----------------------------------------------------------------
 | 
						|
SHAR_EOF
 | 
						|
fi
 | 
						|
if test -f 'nodeAgg.c.diff'
 | 
						|
then
 | 
						|
	echo shar: "will not over-write existing file 'nodeAgg.c.diff'"
 | 
						|
else
 | 
						|
cat << \SHAR_EOF > 'nodeAgg.c.diff'
 | 
						|
***
 | 
						|
/afs/ece.cmu.edu/project/lcs/lcs-004/er1p/postgres/611/src/backend/executor/
 | 
						|
nodeAgg.c	Thu Mar 11 23:59:11 1999
 | 
						|
---
 | 
						|
/afs/ece.cmu.edu/project/lcs/lcs-004/er1p/postgres/612/src/backend/executor/
 | 
						|
nodeAgg.c	Fri Mar 19 15:01:21 1999
 | 
						|
***************
 | 
						|
*** 110,115 ****
 | 
						|
--- 110,116 ----
 | 
						|
  				isNull2 = FALSE;
 | 
						|
  	bool		qual_result;
 | 
						|
  
 | 
						|
+ 	Datum  oldVal = (Datum) NULL;  /* XXX - so that we can save and free
 | 
						|
on each iteration - er1p */
 | 
						|
  
 | 
						|
  	/* ---------------------
 | 
						|
  	 *	get state info from node
 | 
						|
***************
 | 
						|
*** 372,379 ****
 | 
						|
--- 373,382 ----
 | 
						|
  						 */
 | 
						|
  						args[0] = value1[aggno];
 | 
						|
  						args[1] = newVal;
 | 
						|
+ 						oldVal = value1[aggno]; /* XXX - save so we can free later - er1p */
 | 
						|
  						value1[aggno] =	(Datum) fmgr_c(&aggfns->xfn1,
 | 
						|
  										   (FmgrValues *) args, &isNull1);
 | 
						|
+ 						pfree(oldVal); /* XXX - new, let's free the old datum - er1p */
 | 
						|
  						Assert(!isNull1);
 | 
						|
  					}
 | 
						|
  				}
 | 
						|
SHAR_EOF
 | 
						|
fi
 | 
						|
if test -f 'nodeResult.c.diff'
 | 
						|
then
 | 
						|
	echo shar: "will not over-write existing file 'nodeResult.c.diff'"
 | 
						|
else
 | 
						|
cat << \SHAR_EOF > 'nodeResult.c.diff'
 | 
						|
***
 | 
						|
/afs/ece.cmu.edu/project/lcs/lcs-004/er1p/postgres/611/src/backend/executor/
 | 
						|
nodeResult.c	Thu Mar 11 23:59:12 1999
 | 
						|
---
 | 
						|
/afs/ece.cmu.edu/project/lcs/lcs-004/er1p/postgres/612/src/backend/executor/
 | 
						|
nodeResult.c	Fri Mar 19 14:57:26 1999
 | 
						|
***************
 | 
						|
*** 263,268 ****
 | 
						|
--- 263,270 ----
 | 
						|
  	 *		  is freed at end-transaction time.  -cim 6/2/91
 | 
						|
  	 * ----------------
 | 
						|
  	 */
 | 
						|
+ 	ExecFreeExprContext(&resstate->cstate); /* XXX - new for us - er1p */
 | 
						|
+ 	ExecFreeTypeInfo(&resstate->cstate); /* XXX - new for us - er1p */
 | 
						|
  	ExecFreeProjectionInfo(&resstate->cstate);
 | 
						|
  
 | 
						|
  	/* ----------------
 | 
						|
***************
 | 
						|
*** 276,281 ****
 | 
						|
--- 278,284 ----
 | 
						|
  	 * ----------------
 | 
						|
  	 */
 | 
						|
  	ExecClearTuple(resstate->cstate.cs_ResultTupleSlot);
 | 
						|
+ 	pfree(resstate); node->resstate = NULL; /* XXX - new for us - er1p */
 | 
						|
  }
 | 
						|
  
 | 
						|
  void
 | 
						|
SHAR_EOF
 | 
						|
fi
 | 
						|
exit 0
 | 
						|
#	End of shell archive
 | 
						|
 | 
						|
 | 
						|
From owner-pgsql-hackers@hub.org Fri Mar 19 21:01:15 1999
 | 
						|
Received: from hub.org (majordom@hub.org [209.47.145.100])
 | 
						|
	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id VAA11368
 | 
						|
	for <maillist@candle.pha.pa.us>; Fri, 19 Mar 1999 21:01:13 -0500 (EST)
 | 
						|
Received: from localhost (majordom@localhost)
 | 
						|
	by hub.org (8.9.2/8.9.1) with SMTP id UAA40887;
 | 
						|
	Fri, 19 Mar 1999 20:59:47 -0500 (EST)
 | 
						|
	(envelope-from owner-pgsql-hackers@hub.org)
 | 
						|
Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Fri, 19 Mar 1999 20:58:14 +0000 (EST)
 | 
						|
Received: (from majordom@localhost)
 | 
						|
	by hub.org (8.9.2/8.9.1) id UAA40637
 | 
						|
	for pgsql-hackers-outgoing; Fri, 19 Mar 1999 20:58:12 -0500 (EST)
 | 
						|
	(envelope-from owner-pgsql-hackers@postgreSQL.org)
 | 
						|
Received: from candle.pha.pa.us (maillist@s5-03.ppp.op.net [209.152.195.67])
 | 
						|
	by hub.org (8.9.2/8.9.1) with ESMTP id UAA40620
 | 
						|
	for <pgsql-hackers@postgreSQL.org>; Fri, 19 Mar 1999 20:58:02 -0500 (EST)
 | 
						|
	(envelope-from maillist@candle.pha.pa.us)
 | 
						|
Received: (from maillist@localhost)
 | 
						|
	by candle.pha.pa.us (8.9.0/8.9.0) id UAA11263;
 | 
						|
	Fri, 19 Mar 1999 20:58:00 -0500 (EST)
 | 
						|
From: Bruce Momjian <maillist@candle.pha.pa.us>
 | 
						|
Message-Id: <199903200158.UAA11263@candle.pha.pa.us>
 | 
						|
Subject: Re: [HACKERS] aggregation memory leak and fix
 | 
						|
In-Reply-To: <YqwiwKW00gNt4mTKsv@andrew.cmu.edu> from Erik Riedel at "Mar 19, 1999  7:43: 2 pm"
 | 
						|
To: riedel+@CMU.EDU (Erik Riedel)
 | 
						|
Date: Fri, 19 Mar 1999 20:58:00 -0500 (EST)
 | 
						|
Cc: pgsql-hackers@postgreSQL.org
 | 
						|
X-Mailer: ELM [version 2.4ME+ PL47 (25)]
 | 
						|
MIME-Version: 1.0
 | 
						|
Content-Type: text/plain; charset=US-ASCII
 | 
						|
Content-Transfer-Encoding: 7bit
 | 
						|
Sender: owner-pgsql-hackers@postgreSQL.org
 | 
						|
Precedence: bulk
 | 
						|
Status: ROr
 | 
						|
 | 
						|
> 
 | 
						|
> > No apologies necessary.  Glad to have someone digging into that area of
 | 
						|
> > the code.  We will gladly apply your patches to 6.5.  However, I request
 | 
						|
> > that you send context diffs(diff -c).  Normal diffs are just too
 | 
						|
> > error-prone in application.   Send them, and I will apply them right
 | 
						|
> > away.
 | 
						|
> >  
 | 
						|
> Context diffs attached.  This was due to my ignorance of diff.  When I
 | 
						|
> made the other files, I though "hmm, these could be difficult to apply
 | 
						|
> if the code has changed a bit, wouldn't it be good if they included a
 | 
						|
> few lines before and after the fix".  Now I know "-c".
 | 
						|
 | 
						|
Applied.
 | 
						|
 | 
						|
> > Not sure why that is there?  Perhaps for GROUP BY processing?
 | 
						|
> >  
 | 
						|
> Right, it is a result of the Group processing requiring sorted input. 
 | 
						|
> Just that it doesn't "require" sorted input, it "could" be a little more
 | 
						|
> flexible and the sort wouldn't be necessary.  Essentially this would be
 | 
						|
> a single "AggSort" node that did the aggregation while sorting (probably
 | 
						|
> with replacement selection rather than quicksort).  This definitely
 | 
						|
> would require some code/smarts that isn't there today.
 | 
						|
 | 
						|
I think you will find make_groupPlan adds the sort as needed by the
 | 
						|
GROUP BY.  I assume you are suggesting to do the aggregate/GROUP on unsorted
 | 
						|
data, which is hard to do in a flexible way.
 | 
						|
 | 
						|
> > > think I chased it down to the constvalue allocated in
 | 
						|
> > > execQual::ExecTargetList(), but I couldn't figure out where to properly
 | 
						|
> > > free it.  8 bytes leaked was much better than 750 bytes, so I stopped
 | 
						|
> > > banging my head on that particular item.
 | 
						|
> >  
 | 
						|
> > Can you give me the exact line?  Is it the palloc(1)?
 | 
						|
> >  
 | 
						|
> No, the 8 bytes seem to come from the ExecEvalExpr() call near line
 | 
						|
> 1530.  Problem was when I tried to free these, I got "not in AllocSet"
 | 
						|
> errors, so something more complicated was going on.
 | 
						|
 | 
						|
Yes, if you look inside ExecEvalExpr(), you will see it tries to get a
 | 
						|
value for the expression(Datum).  It may return an int, float4, or a
 | 
						|
string.  In the last case, that is actually a pointer and not a specific
 | 
						|
value.
 | 
						|
 | 
						|
So, in some cases, the value can just be thrown away, or it may be a
 | 
						|
pointer to memory that can be freed after the call to heap_formtuple()
 | 
						|
later in the function.  The trick is to find the function call in
 | 
						|
ExecEvalExpr() that is allocating something, and conditionally free
 | 
						|
values[] after the call to heap_formtuple().  If you don't want find it,
 | 
						|
perhaps you can send me enough info so I can see it here.
 | 
						|
 | 
						|
I wonder whether it is the call to CreateTupleDescCopy() inside
 | 
						|
ExecEvalVar()?
 | 
						|
 | 
						|
Another problem I just fixed is that fjIsNull was not being pfree'ed if
 | 
						|
it was used with >64 targets, but I don't think that affects you.
 | 
						|
 | 
						|
I also assume you have run your recent patch through the the
 | 
						|
test/regression tests, so see it does not cause some other area to fail,
 | 
						|
right?
 | 
						|
 | 
						|
-- 
 | 
						|
  Bruce Momjian                        |  http://www.op.net/~candle
 | 
						|
  maillist@candle.pha.pa.us            |  (610) 853-3000
 | 
						|
  +  If your life is a hard drive,     |  830 Blythe Avenue
 | 
						|
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
 | 
						|
 | 
						|
 | 
						|
From owner-pgsql-hackers@hub.org Sat Mar 20 12:01:44 1999
 | 
						|
Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
 | 
						|
	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id MAA24855
 | 
						|
	for <maillist@candle.pha.pa.us>; Sat, 20 Mar 1999 12:01:43 -0500 (EST)
 | 
						|
Received: from hub.org (majordom@hub.org [209.47.145.100]) by renoir.op.net (o1/$ Revision: 1.18 $) with ESMTP id LAA11985 for <maillist@candle.pha.pa.us>; Sat, 20 Mar 1999 11:58:48 -0500 (EST)
 | 
						|
Received: from localhost (majordom@localhost)
 | 
						|
	by hub.org (8.9.2/8.9.1) with SMTP id LAA12367;
 | 
						|
	Sat, 20 Mar 1999 11:57:17 -0500 (EST)
 | 
						|
	(envelope-from owner-pgsql-hackers@hub.org)
 | 
						|
Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Sat, 20 Mar 1999 11:55:22 +0000 (EST)
 | 
						|
Received: (from majordom@localhost)
 | 
						|
	by hub.org (8.9.2/8.9.1) id LAA12026
 | 
						|
	for pgsql-hackers-outgoing; Sat, 20 Mar 1999 11:55:17 -0500 (EST)
 | 
						|
	(envelope-from owner-pgsql-hackers@postgreSQL.org)
 | 
						|
Received: from sss.sss.pgh.pa.us (sss.pgh.pa.us [206.210.65.6])
 | 
						|
	by hub.org (8.9.2/8.9.1) with ESMTP id LAA11871
 | 
						|
	for <pgsql-hackers@postgreSQL.org>; Sat, 20 Mar 1999 11:54:57 -0500 (EST)
 | 
						|
	(envelope-from tgl@sss.pgh.pa.us)
 | 
						|
Received: from sss.sss.pgh.pa.us (localhost [127.0.0.1])
 | 
						|
	by sss.sss.pgh.pa.us (8.9.1/8.9.1) with ESMTP id LAA28068;
 | 
						|
	Sat, 20 Mar 1999 11:48:58 -0500 (EST)
 | 
						|
To: Bruce Momjian <maillist@candle.pha.pa.us>
 | 
						|
cc: riedel+@CMU.EDU (Erik Riedel), pgsql-hackers@postgreSQL.org
 | 
						|
Subject: Re: [HACKERS] aggregation memory leak and fix 
 | 
						|
In-reply-to: Your message of Fri, 19 Mar 1999 21:33:33 -0500 (EST) 
 | 
						|
             <199903200233.VAA11816@candle.pha.pa.us> 
 | 
						|
Date: Sat, 20 Mar 1999 11:48:58 -0500
 | 
						|
Message-ID: <28066.921948538@sss.pgh.pa.us>
 | 
						|
From: Tom Lane <tgl@sss.pgh.pa.us>
 | 
						|
Sender: owner-pgsql-hackers@postgreSQL.org
 | 
						|
Precedence: bulk
 | 
						|
Status: ROr
 | 
						|
 | 
						|
Bruce Momjian <maillist@candle.pha.pa.us> writes:
 | 
						|
> My only quick solution would seem to be to add a new "expression" memory
 | 
						|
> context, that can be cleared after every tuple is processed, clearing
 | 
						|
> out temporary values allocated inside an expression.
 | 
						|
 | 
						|
Right, this whole problem of growing backend memory use during a large
 | 
						|
SELECT (or COPY, or probably a few other things) is one of the things
 | 
						|
that we were talking about addressing by revising the memory management
 | 
						|
structure.
 | 
						|
 | 
						|
I think what we want inside the executor is a distinction between
 | 
						|
storage that must live to the end of the statement and storage that is
 | 
						|
only needed while processing the current tuple.  The second kind of
 | 
						|
storage would go into a separate context that gets flushed every so
 | 
						|
often.  (It could be every tuple, or every dozen or hundred tuples
 | 
						|
depending on what seems the best tradeoff of cycles against memory
 | 
						|
usage.)
 | 
						|
 | 
						|
I'm not sure that just two contexts is enough, either.  For example in
 | 
						|
	SELECT field1, SUM(field2) GROUP BY field1;
 | 
						|
the working memory for the SUM aggregate could not be released after
 | 
						|
each tuple, but perhaps we don't want it to live for the whole statement
 | 
						|
either --- in that case we'd need a per-group context.  (This particular
 | 
						|
example isn't very convincing, because the same storage for the SUM
 | 
						|
*could* be recycled from group to group.  But I don't know whether it
 | 
						|
actually *is* reused or not.  If fresh storage is palloc'd for each
 | 
						|
instantiation of SUM then we have a per-group leak in this scenario.
 | 
						|
In any case, I'm not sure all aggregate functions have constant memory
 | 
						|
requirements that would let them recycle storage across groups.)
 | 
						|
 | 
						|
What we need to do is work out what the best set of memory context
 | 
						|
definitions is, and then decide on a strategy for making sure that
 | 
						|
lower-level routines allocate their return values in the right context.
 | 
						|
It'd be nice if the lower-level routines could still call palloc() and
 | 
						|
not have to worry about this explicitly --- otherwise we'll break not
 | 
						|
only a lot of our own code but perhaps a lot of user code.  (User-
 | 
						|
specific data types and SPI code all use palloc, no?)
 | 
						|
 | 
						|
I think it is too late to try to fix this for 6.5, but it ought to be a
 | 
						|
top priority for 6.6.
 | 
						|
 | 
						|
			regards, tom lane
 | 
						|
 | 
						|
 | 
						|
From tgl@sss.pgh.pa.us Sun Mar 21 16:01:46 1999
 | 
						|
Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
 | 
						|
	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id QAA00139
 | 
						|
	for <maillist@candle.pha.pa.us>; Sun, 21 Mar 1999 16:01:45 -0500 (EST)
 | 
						|
Received: from sss.sss.pgh.pa.us (sss.pgh.pa.us [206.210.65.6]) by renoir.op.net (o1/$ Revision: 1.18 $) with ESMTP id PAA27737 for <maillist@candle.pha.pa.us>; Sun, 21 Mar 1999 15:52:38 -0500 (EST)
 | 
						|
Received: from sss.sss.pgh.pa.us (localhost [127.0.0.1])
 | 
						|
	by sss.sss.pgh.pa.us (8.9.1/8.9.1) with ESMTP id PAA14946;
 | 
						|
	Sun, 21 Mar 1999 15:50:20 -0500 (EST)
 | 
						|
To: Bruce Momjian <maillist@candle.pha.pa.us>
 | 
						|
cc: pgsql-hackers@postgreSQL.org
 | 
						|
Subject: Re: [HACKERS] aggregation memory leak and fix 
 | 
						|
In-reply-to: Your message of Sun, 21 Mar 1999 14:20:39 -0500 (EST) 
 | 
						|
             <199903211920.OAA28744@candle.pha.pa.us> 
 | 
						|
Date: Sun, 21 Mar 1999 15:50:20 -0500
 | 
						|
Message-ID: <14944.922049420@sss.pgh.pa.us>
 | 
						|
From: Tom Lane <tgl@sss.pgh.pa.us>
 | 
						|
Status: ROr
 | 
						|
 | 
						|
Bruce Momjian <maillist@candle.pha.pa.us> writes:
 | 
						|
>> What we need to do is work out what the best set of memory context
 | 
						|
>> definitions is, and then decide on a strategy for making sure that
 | 
						|
>> lower-level routines allocate their return values in the right context.
 | 
						|
 | 
						|
> Let's suppose that we want to free all the memory used as expression
 | 
						|
> intermediate values after each row is processed.
 | 
						|
> It is my understanding that all these are created in utils/adt/*.c
 | 
						|
> files, and that the entry point to all those functions via
 | 
						|
> fmgr()/fmgr_c().
 | 
						|
 | 
						|
That's probably the bulk of the specific calls of palloc().  Someone
 | 
						|
(Jan?) did a scan of the code a while ago looking for palloc() calls,
 | 
						|
and there aren't that many outside of the data-type-specific functions.
 | 
						|
But we'd have to look individually at all the ones that are elsewhere.
 | 
						|
 | 
						|
> So, if we go into an expression memory context before calling
 | 
						|
> fmgr/fmgr_c in the executor, and return to the normal context after the
 | 
						|
> function call, all our intermediates are trapped in the expression
 | 
						|
> memory context.
 | 
						|
 | 
						|
OK, so you're saying we leave the data-type-specific functions as is
 | 
						|
(calling palloc() to allocate their result areas), and make each call
 | 
						|
site specifically responsible for setting the context that palloc() will
 | 
						|
allocate from?  That could work, I think.  We'd need to see what side
 | 
						|
effects it'd have on other uses of palloc().
 | 
						|
 | 
						|
What we'd probably want is to use a stack discipline for the current
 | 
						|
palloc-target memory context: when you set the context, you get back the
 | 
						|
ID of the old context, and you are supposed to restore that old context
 | 
						|
before returning.
 | 
						|
 | 
						|
> At the end of each row, we just free the expression memory context.  In
 | 
						|
> almost all cases, the data is stored in tuples, and we can free it.  In
 | 
						|
> a few cases like aggregates, we have to save off the value we need to
 | 
						|
> keep before freeing the expression context.
 | 
						|
 | 
						|
Actually, nodeAgg would just have to set an appropriate context before
 | 
						|
calling fmgr to execute the aggregate's transition functions, and then
 | 
						|
it wouldn't need an extra copy step.  The results would come back in the
 | 
						|
right context already.
 | 
						|
 | 
						|
> In fact, you could even optimize the cleanup to only do free'ing if
 | 
						|
> some expression memory was allocated.  In most cases, it is not.
 | 
						|
 | 
						|
Jan's stuff should already fall through pretty quickly if there's
 | 
						|
nothing in the context, I think.  Note that what we want to do between
 | 
						|
tuples is a "context clear" of the expression context, not a "context
 | 
						|
delete" and then "context create" a new expression context.  Context
 | 
						|
clear should be a pretty quick no-op if nothing's been allocated in that
 | 
						|
context...
 | 
						|
 | 
						|
> In fact the nodeAgg.c patch that I backed out attempted to do that,
 | 
						|
> though because there wasn't code that checked if the Datum was
 | 
						|
> pg_type.typbyval, it didn't work 100%.
 | 
						|
 | 
						|
Right.  But if we approach it this way (clear the context at appropriate
 | 
						|
times) rather than thinking in terms of explicitly pfree'ing individual
 | 
						|
objects, life gets much simpler.  Also, if we insist on being able to
 | 
						|
pfree individual objects inside a context, we can't use Jan's faster
 | 
						|
allocator!  Remember, the reason it is faster and lower overhead is that
 | 
						|
it doesn't keep track of individual objects, only pools.
 | 
						|
 | 
						|
I'd like to see us head in the direction of removing most of the
 | 
						|
explicit pfree calls that exist now, and instead rely on clearing
 | 
						|
memory contexts at appropriate times in order to manage memory.
 | 
						|
The fewer places where we need pfree, the more contexts can be run
 | 
						|
with the low-overhead space allocator.  Also, the fewer explicit
 | 
						|
pfrees we need, the simpler and more reliable the code gets.
 | 
						|
 | 
						|
			regards, tom lane
 | 
						|
 | 
						|
From owner-pgsql-hackers@hub.org Sun Mar 21 16:01:49 1999
 | 
						|
Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
 | 
						|
	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id QAA00149
 | 
						|
	for <maillist@candle.pha.pa.us>; Sun, 21 Mar 1999 16:01:48 -0500 (EST)
 | 
						|
Received: from hub.org (majordom@hub.org [209.47.145.100]) by renoir.op.net (o1/$ Revision: 1.18 $) with ESMTP id PAA27950 for <maillist@candle.pha.pa.us>; Sun, 21 Mar 1999 15:56:07 -0500 (EST)
 | 
						|
Received: from localhost (majordom@localhost)
 | 
						|
	by hub.org (8.9.2/8.9.1) with SMTP id PAA39413;
 | 
						|
	Sun, 21 Mar 1999 15:54:51 -0500 (EST)
 | 
						|
	(envelope-from owner-pgsql-hackers@hub.org)
 | 
						|
Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Sun, 21 Mar 1999 15:54:31 +0000 (EST)
 | 
						|
Received: (from majordom@localhost)
 | 
						|
	by hub.org (8.9.2/8.9.1) id PAA39249
 | 
						|
	for pgsql-hackers-outgoing; Sun, 21 Mar 1999 15:54:27 -0500 (EST)
 | 
						|
	(envelope-from owner-pgsql-hackers@postgreSQL.org)
 | 
						|
Received: from sss.sss.pgh.pa.us (sss.pgh.pa.us [206.210.65.6])
 | 
						|
	by hub.org (8.9.2/8.9.1) with ESMTP id PAA39235
 | 
						|
	for <pgsql-hackers@postgreSQL.org>; Sun, 21 Mar 1999 15:54:21 -0500 (EST)
 | 
						|
	(envelope-from tgl@sss.pgh.pa.us)
 | 
						|
Received: from sss.sss.pgh.pa.us (localhost [127.0.0.1])
 | 
						|
	by sss.sss.pgh.pa.us (8.9.1/8.9.1) with ESMTP id PAA14946;
 | 
						|
	Sun, 21 Mar 1999 15:50:20 -0500 (EST)
 | 
						|
To: Bruce Momjian <maillist@candle.pha.pa.us>
 | 
						|
cc: pgsql-hackers@postgreSQL.org
 | 
						|
Subject: Re: [HACKERS] aggregation memory leak and fix 
 | 
						|
In-reply-to: Your message of Sun, 21 Mar 1999 14:20:39 -0500 (EST) 
 | 
						|
             <199903211920.OAA28744@candle.pha.pa.us> 
 | 
						|
Date: Sun, 21 Mar 1999 15:50:20 -0500
 | 
						|
Message-ID: <14944.922049420@sss.pgh.pa.us>
 | 
						|
From: Tom Lane <tgl@sss.pgh.pa.us>
 | 
						|
Sender: owner-pgsql-hackers@postgreSQL.org
 | 
						|
Precedence: bulk
 | 
						|
Status: RO
 | 
						|
 | 
						|
Bruce Momjian <maillist@candle.pha.pa.us> writes:
 | 
						|
>> What we need to do is work out what the best set of memory context
 | 
						|
>> definitions is, and then decide on a strategy for making sure that
 | 
						|
>> lower-level routines allocate their return values in the right context.
 | 
						|
 | 
						|
> Let's suppose that we want to free all the memory used as expression
 | 
						|
> intermediate values after each row is processed.
 | 
						|
> It is my understanding that all these are created in utils/adt/*.c
 | 
						|
> files, and that the entry point to all those functions via
 | 
						|
> fmgr()/fmgr_c().
 | 
						|
 | 
						|
That's probably the bulk of the specific calls of palloc().  Someone
 | 
						|
(Jan?) did a scan of the code a while ago looking for palloc() calls,
 | 
						|
and there aren't that many outside of the data-type-specific functions.
 | 
						|
But we'd have to look individually at all the ones that are elsewhere.
 | 
						|
 | 
						|
> So, if we go into an expression memory context before calling
 | 
						|
> fmgr/fmgr_c in the executor, and return to the normal context after the
 | 
						|
> function call, all our intermediates are trapped in the expression
 | 
						|
> memory context.
 | 
						|
 | 
						|
OK, so you're saying we leave the data-type-specific functions as is
 | 
						|
(calling palloc() to allocate their result areas), and make each call
 | 
						|
site specifically responsible for setting the context that palloc() will
 | 
						|
allocate from?  That could work, I think.  We'd need to see what side
 | 
						|
effects it'd have on other uses of palloc().
 | 
						|
 | 
						|
What we'd probably want is to use a stack discipline for the current
 | 
						|
palloc-target memory context: when you set the context, you get back the
 | 
						|
ID of the old context, and you are supposed to restore that old context
 | 
						|
before returning.
 | 
						|
 | 
						|
> At the end of each row, we just free the expression memory context.  In
 | 
						|
> almost all cases, the data is stored in tuples, and we can free it.  In
 | 
						|
> a few cases like aggregates, we have to save off the value we need to
 | 
						|
> keep before freeing the expression context.
 | 
						|
 | 
						|
Actually, nodeAgg would just have to set an appropriate context before
 | 
						|
calling fmgr to execute the aggregate's transition functions, and then
 | 
						|
it wouldn't need an extra copy step.  The results would come back in the
 | 
						|
right context already.
 | 
						|
 | 
						|
> In fact, you could even optimize the cleanup to only do free'ing if
 | 
						|
> some expression memory was allocated.  In most cases, it is not.
 | 
						|
 | 
						|
Jan's stuff should already fall through pretty quickly if there's
 | 
						|
nothing in the context, I think.  Note that what we want to do between
 | 
						|
tuples is a "context clear" of the expression context, not a "context
 | 
						|
delete" and then "context create" a new expression context.  Context
 | 
						|
clear should be a pretty quick no-op if nothing's been allocated in that
 | 
						|
context...
 | 
						|
 | 
						|
> In fact the nodeAgg.c patch that I backed out attempted to do that,
 | 
						|
> though because there wasn't code that checked if the Datum was
 | 
						|
> pg_type.typbyval, it didn't work 100%.
 | 
						|
 | 
						|
Right.  But if we approach it this way (clear the context at appropriate
 | 
						|
times) rather than thinking in terms of explicitly pfree'ing individual
 | 
						|
objects, life gets much simpler.  Also, if we insist on being able to
 | 
						|
pfree individual objects inside a context, we can't use Jan's faster
 | 
						|
allocator!  Remember, the reason it is faster and lower overhead is that
 | 
						|
it doesn't keep track of individual objects, only pools.
 | 
						|
 | 
						|
I'd like to see us head in the direction of removing most of the
 | 
						|
explicit pfree calls that exist now, and instead rely on clearing
 | 
						|
memory contexts at appropriate times in order to manage memory.
 | 
						|
The fewer places where we need pfree, the more contexts can be run
 | 
						|
with the low-overhead space allocator.  Also, the fewer explicit
 | 
						|
pfrees we need, the simpler and more reliable the code gets.
 | 
						|
 | 
						|
			regards, tom lane
 | 
						|
 | 
						|
 | 
						|
From owner-pgsql-hackers@hub.org Wed Mar 24 19:10:53 1999
 | 
						|
Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
 | 
						|
	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id TAA00906
 | 
						|
	for <maillist@candle.pha.pa.us>; Wed, 24 Mar 1999 19:10:52 -0500 (EST)
 | 
						|
Received: from hub.org (majordom@hub.org [209.47.145.100]) by renoir.op.net (o1/$ Revision: 1.18 $) with ESMTP id NAA24258 for <maillist@candle.pha.pa.us>; Wed, 24 Mar 1999 13:09:47 -0500 (EST)
 | 
						|
Received: from localhost (majordom@localhost)
 | 
						|
	by hub.org (8.9.2/8.9.1) with SMTP id NAA60743;
 | 
						|
	Wed, 24 Mar 1999 13:07:26 -0500 (EST)
 | 
						|
	(envelope-from owner-pgsql-hackers@hub.org)
 | 
						|
Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Wed, 24 Mar 1999 13:06:47 +0000 (EST)
 | 
						|
Received: (from majordom@localhost)
 | 
						|
	by hub.org (8.9.2/8.9.1) id NAA60556
 | 
						|
	for pgsql-hackers-outgoing; Wed, 24 Mar 1999 13:06:43 -0500 (EST)
 | 
						|
	(envelope-from owner-pgsql-hackers@postgreSQL.org)
 | 
						|
Received: from po7.andrew.cmu.edu (PO7.ANDREW.CMU.EDU [128.2.10.107])
 | 
						|
	by hub.org (8.9.2/8.9.1) with ESMTP id NAA60540
 | 
						|
	for <pgsql-hackers@postgreSQL.org>; Wed, 24 Mar 1999 13:06:25 -0500 (EST)
 | 
						|
	(envelope-from er1p+@andrew.cmu.edu)
 | 
						|
Received: (from postman@localhost) by po7.andrew.cmu.edu (8.8.5/8.8.2) id NAA06323; Wed, 24 Mar 1999 13:06:16 -0500 (EST)
 | 
						|
Received: via switchmail; Wed, 24 Mar 1999 13:06:16 -0500 (EST)
 | 
						|
Received: from cloudy.me.cmu.edu via qmail
 | 
						|
          ID </afs/andrew.cmu.edu/service/mailqs/q001/QF.cqyGa::00gNtI0TZtD>;
 | 
						|
          Wed, 24 Mar 1999 13:06:02 -0500 (EST)
 | 
						|
Received: from cloudy.me.cmu.edu via qmail
 | 
						|
          ID </afs/andrew.cmu.edu/usr2/er1p/.Outgoing/QF.sqyGa8G00gNtImTGBe>;
 | 
						|
          Wed, 24 Mar 1999 13:06:00 -0500 (EST)
 | 
						|
Received: from mms.4.60.Jun.27.1996.03.05.56.sun4.41.EzMail.2.0.CUILIB.3.45.SNAP.NOT.LINKED.cloudy.me.cmu.edu.sun4m.412
 | 
						|
          via MS.5.6.cloudy.me.cmu.edu.sun4_41;
 | 
						|
          Wed, 24 Mar 1999 13:05:58 -0500 (EST)
 | 
						|
Message-ID: <QqyGa6600gNtMmTG1o@andrew.cmu.edu>
 | 
						|
Date: Wed, 24 Mar 1999 13:05:58 -0500 (EST)
 | 
						|
From: Erik Riedel <riedel+@CMU.EDU>
 | 
						|
To: Bruce Momjian <maillist@candle.pha.pa.us>
 | 
						|
Subject: Re: [HACKERS] aggregation memory leak and fix
 | 
						|
Cc: pgsql-hackers@postgreSQL.org
 | 
						|
In-Reply-To: <199903240611.BAA01206@candle.pha.pa.us>
 | 
						|
References: <199903240611.BAA01206@candle.pha.pa.us>
 | 
						|
Sender: owner-pgsql-hackers@postgreSQL.org
 | 
						|
Precedence: bulk
 | 
						|
Status: RO
 | 
						|
 | 
						|
 | 
						|
> I am interested to see if it fixes the expression leak you saw.  I have
 | 
						|
> not committed this yet.  I want to look at it some more.
 | 
						|
>  
 | 
						|
I'm afraid that this doesn't seem to have any effect on my query.
 | 
						|
 | 
						|
Looking at your code, I think the problem is that most of the
 | 
						|
allocations in my query are on the top part of the if statement that
 | 
						|
you modified (i.e. the == SQLlanguageId part).  Below is a snippet of
 | 
						|
a trace from my query, with approximate line numbers for execQual.c
 | 
						|
with your patch applied:
 | 
						|
 | 
						|
(execQual) language == SQLlanguageId (execQual.c:757)
 | 
						|
(execQual) execute postquel_function (execQual.c:759)
 | 
						|
(mcxt) MemoryContextAlloc 32 bytes in ** Blank Portal **-heap
 | 
						|
(mcxt) MemoryContextAlloc 16 bytes in ** Blank Portal **-heap
 | 
						|
(mcxt) MemoryContextAlloc 528 bytes in ** Blank Portal **-heap
 | 
						|
(mcxt) MemoryContextAlloc 56 bytes in ** Blank Portal **-heap
 | 
						|
(mcxt) MemoryContextAlloc 88 bytes in ** Blank Portal **-heap
 | 
						|
(mcxt) MemoryContextAlloc 24 bytes in ** Blank Portal **-heap
 | 
						|
(mcxt) MemoryContextAlloc 8 bytes in ** Blank Portal **-heap
 | 
						|
(mcxt) MemoryContextAlloc 65 bytes in ** Blank Portal **-heap
 | 
						|
(mcxt) MemoryContextAlloc 48 bytes in ** Blank Portal **-heap
 | 
						|
(mcxt) MemoryContextAlloc 8 bytes in ** Blank Portal **-heap
 | 
						|
(execQual) else clause NOT SQLlanguageId (execQual.c:822)
 | 
						|
(execQual) install qual memory context (execQual.c:858)
 | 
						|
(execQual) exit qual context (execQual.c:862)
 | 
						|
(mcxt) MemoryContextAlloc 60 bytes in ** Blank Portal **-heap
 | 
						|
(mcxt) MemoryContextFree in ** Blank Portal **-heap freed 16 bytes
 | 
						|
(mcxt) MemoryContextFree in ** Blank Portal **-heap freed 64 bytes
 | 
						|
(mcxt) MemoryContextFree in ** Blank Portal **-heap freed 64 bytes
 | 
						|
(mcxt) MemoryContextFree in ** Blank Portal **-heap freed 528 bytes
 | 
						|
(mcxt) MemoryContextFree in ** Blank Portal **-heap freed 16 bytes
 | 
						|
(execQual) return from postquel_function (execQual.c:764)
 | 
						|
(execQual) return from ExecEvalFuncArgs (execQual.c:792)
 | 
						|
(execQual) else clause NOT SQLlanguageId (execQual.c:822)
 | 
						|
(execQual) install qual memory context (execQual.c:858)
 | 
						|
(execQual) exit qual context (execQual.c:862)
 | 
						|
(mcxt) MemoryContextAlloc 108 bytes in ** Blank Portal **-heap
 | 
						|
(mcxt) MemoryContextAlloc 108 bytes in ** Blank Portal **-heap
 | 
						|
(mcxt) MemoryContextFree in ** Blank Portal **-heap freed 128 bytes
 | 
						|
(execQual) else clause NOT SQLlanguageId (execQual.c:822)
 | 
						|
(execQual) install qual memory context (execQual.c:858)
 | 
						|
(mcxt) MemoryContextAlloc 8 bytes in <Qual manager>-heap
 | 
						|
(execQual) exit qual context (execQual.c:862)
 | 
						|
 | 
						|
<pattern repeats>
 | 
						|
 | 
						|
(execQual) language == SQLlanguageId (execQual.c:757)
 | 
						|
(execQual) execute postquel_function (execQual.c:759)
 | 
						|
(mcxt) MemoryContextAlloc 32 bytes in ** Blank Portal **-heap
 | 
						|
(mcxt) MemoryContextAlloc 16 bytes in ** Blank Portal **-heap
 | 
						|
(mcxt) MemoryContextAlloc 528 bytes in ** Blank Portal **-heap
 | 
						|
(mcxt) MemoryContextAlloc 56 bytes in ** Blank Portal **-heap
 | 
						|
(mcxt) MemoryContextAlloc 88 bytes in ** Blank Portal **-heap
 | 
						|
(mcxt) MemoryContextAlloc 24 bytes in ** Blank Portal **-heap
 | 
						|
(mcxt) MemoryContextAlloc 8 bytes in ** Blank Portal **-heap
 | 
						|
(mcxt) MemoryContextAlloc 65 bytes in ** Blank Portal **-heap
 | 
						|
(mcxt) MemoryContextAlloc 48 bytes in ** Blank Portal **-heap
 | 
						|
(mcxt) MemoryContextAlloc 8 bytes in ** Blank Portal **-heap
 | 
						|
(execQual) else clause NOT SQLlanguageId (execQual.c:822)
 | 
						|
(execQual) install qual memory context (execQual.c:858)
 | 
						|
(execQual) exit qual context (execQual.c:862)
 | 
						|
(mcxt) MemoryContextAlloc 60 bytes in ** Blank Portal **-heap
 | 
						|
(mcxt) MemoryContextFree in ** Blank Portal **-heap freed 16 bytes
 | 
						|
(mcxt) MemoryContextFree in ** Blank Portal **-heap freed 64 bytes
 | 
						|
(mcxt) MemoryContextFree in ** Blank Portal **-heap freed 64 bytes
 | 
						|
(mcxt) MemoryContextFree in ** Blank Portal **-heap freed 528 bytes
 | 
						|
(mcxt) MemoryContextFree in ** Blank Portal **-heap freed 16 bytes
 | 
						|
(execQual) return from postquel_function (execQual.c:764)
 | 
						|
(execQual) return from ExecEvalFuncArgs (execQual.c:792)
 | 
						|
(execQual) else clause NOT SQLlanguageId (execQual.c:822)
 | 
						|
(execQual) install qual memory context (execQual.c:858)
 | 
						|
(execQual) exit qual context (execQual.c:862)
 | 
						|
(mcxt) MemoryContextAlloc 108 bytes in ** Blank Portal **-heap
 | 
						|
(mcxt) MemoryContextAlloc 108 bytes in ** Blank Portal **-heap
 | 
						|
(mcxt) MemoryContextFree in ** Blank Portal **-heap freed 128 bytes
 | 
						|
(execQual) else clause NOT SQLlanguageId (execQual.c:822)
 | 
						|
(execQual) install qual memory context (execQual.c:858)
 | 
						|
(mcxt) MemoryContextAlloc 8 bytes in <Qual manager>-heap
 | 
						|
(execQual) exit qual context (execQual.c:862)
 | 
						|
 | 
						|
 | 
						|
the MemoryContext lines give the name of the portal where each
 | 
						|
allocation is happening - you see that your Qual manager only captures
 | 
						|
a very small number (one) of the allocations, the rest are in the
 | 
						|
upper part of the if statement.
 | 
						|
 | 
						|
Note that I also placed a printf next to your EndPortalAllocMode() and
 | 
						|
StartPortalAllocMode() fix in ExecQual() - I believe this is what is
 | 
						|
supposed to clear the portal and free the memory - and that printf
 | 
						|
never appears in the above trace.
 | 
						|
 | 
						|
Sorry if the trace is a little confusing, but I hope that it helps you
 | 
						|
zero in.
 | 
						|
 | 
						|
Erik
 | 
						|
 | 
						|
 | 
						|
 | 
						|
 | 
						|
 | 
						|
 | 
						|
 | 
						|
From owner-pgsql-hackers@hub.org Sat May 15 23:13:50 1999
 | 
						|
Received: from hub.org (hub.org [209.167.229.1])
 | 
						|
	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id XAA29144
 | 
						|
	for <maillist@candle.pha.pa.us>; Sat, 15 May 1999 23:13:49 -0400 (EDT)
 | 
						|
Received: from hub.org (hub.org [209.167.229.1])
 | 
						|
	by hub.org (8.9.3/8.9.3) with ESMTP id XAA25173;
 | 
						|
	Sat, 15 May 1999 23:11:03 -0400 (EDT)
 | 
						|
	(envelope-from owner-pgsql-hackers@hub.org)
 | 
						|
Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Sat, 15 May 1999 23:10:29 +0000 (EDT)
 | 
						|
Received: (from majordom@localhost)
 | 
						|
	by hub.org (8.9.3/8.9.3) id XAA25111
 | 
						|
	for pgsql-hackers-outgoing; Sat, 15 May 1999 23:10:27 -0400 (EDT)
 | 
						|
	(envelope-from owner-pgsql-hackers@postgreSQL.org)
 | 
						|
X-Authentication-Warning: hub.org: majordom set sender to owner-pgsql-hackers@postgreSQL.org using -f
 | 
						|
Received: from sss.sss.pgh.pa.us (sss.pgh.pa.us [206.210.65.6])
 | 
						|
	by hub.org (8.9.3/8.9.3) with ESMTP id XAA25092
 | 
						|
	for <pgsql-hackers@postgreSQL.org>; Sat, 15 May 1999 23:10:22 -0400 (EDT)
 | 
						|
	(envelope-from tgl@sss.pgh.pa.us)
 | 
						|
Received: from sss.sss.pgh.pa.us (localhost [127.0.0.1])
 | 
						|
	by sss.sss.pgh.pa.us (8.9.1/8.9.1) with ESMTP id XAA17752
 | 
						|
	for <pgsql-hackers@postgreSQL.org>; Sat, 15 May 1999 23:09:46 -0400 (EDT)
 | 
						|
To: pgsql-hackers@postgreSQL.org
 | 
						|
Subject: [HACKERS] Memory leaks in relcache
 | 
						|
Date: Sat, 15 May 1999 23:09:46 -0400
 | 
						|
Message-ID: <17750.926824186@sss.pgh.pa.us>
 | 
						|
From: Tom Lane <tgl@sss.pgh.pa.us>
 | 
						|
Sender: owner-pgsql-hackers@postgreSQL.org
 | 
						|
Precedence: bulk
 | 
						|
Status: ROr
 | 
						|
 | 
						|
I have been looking into why a reference to a nonexistent table, eg
 | 
						|
	INSERT INTO nosuchtable VALUES(1);
 | 
						|
leaks a small amount of memory per occurrence.  What I find is a
 | 
						|
memory leak in the indexscan support.  Specifically,
 | 
						|
RelationGetIndexScan in backend/access/index/genam.c palloc's both
 | 
						|
an IndexScanDesc and some keydata storage.  The IndexScanDesc
 | 
						|
block is eventually pfree'd, at the bottom of CatalogIndexFetchTuple
 | 
						|
in backend/catalog/indexing.c.  But the keydata block is not.
 | 
						|
 | 
						|
This wouldn't matter so much if the palloc were coming from a
 | 
						|
transaction-local context.  But what we're doing is a lookup in pg_class
 | 
						|
on behalf of RelationBuildDesc in backend/utils/cache/relcache.c, and
 | 
						|
it's done a MemoryContextSwitchTo into the global CacheCxt before
 | 
						|
starting the lookup.  Therefore, the un-pfreed block represents a
 | 
						|
permanent memory leak.
 | 
						|
 | 
						|
In fact, *every* reference to a relation that is not already present in
 | 
						|
the relcache causes a similar leak.  The error case is just the one that
 | 
						|
is easiest to repeat.  The missing pfree of the keydata block is
 | 
						|
probably causing a bunch of other short-term and long-term leaks too.
 | 
						|
 | 
						|
It seems to me there are two things to fix here: indexscan ought to
 | 
						|
pfree everything it pallocs, and RelationBuildDesc ought to be warier
 | 
						|
about how much work gets done with CacheCxt as the active palloc
 | 
						|
context.  (Even if indexscan didn't leak anything ordinarily, there's
 | 
						|
still the risk of elog(ERROR) causing an abort before the indexscan code
 | 
						|
gets to clean up.)
 | 
						|
 | 
						|
Comments?  In particular, where is the cleanest place to add the pfree
 | 
						|
of the keydata block?  I don't especially like the fact that callers
 | 
						|
of index_endscan have to clean up the toplevel scan block; I think that
 | 
						|
ought to happen inside index_endscan.
 | 
						|
 | 
						|
			regards, tom lane
 | 
						|
 | 
						|
 |