1
0
mirror of https://github.com/postgres/postgres.git synced 2025-07-05 07:21:24 +03:00
Commit Graph

95 Commits

Author SHA1 Message Date
e7f89a5dec Consistently test for in-use shared memory.
postmaster startup scrutinizes any shared memory segment recorded in
postmaster.pid, exiting if that segment matches the current data
directory and has an attached process.  When the postmaster.pid file was
missing, a starting postmaster used weaker checks.  Change to use the
same checks in both scenarios.  This increases the chance of a startup
failure, in lieu of data corruption, if the DBA does "kill -9 `head -n1
postmaster.pid` && rm postmaster.pid && pg_ctl -w start".  A postmaster
will no longer recycle segments pertaining to other data directories.
That's good for production, but it's bad for integration tests that
crash a postmaster and immediately delete its data directory.  Such a
test now leaks a segment indefinitely.  No "make check-world" test does
that.  win32_shmem.c already avoided all these problems.  In 9.6 and
later, enhance PostgresNode to facilitate testing.  Back-patch to 9.4
(all supported versions).

Reviewed by Daniel Gustafsson and Kyotaro HORIGUCHI.

Discussion: https://postgr.es/m/20130911033341.GD225735@tornado.leadboat.com
2019-04-03 17:03:50 -07:00
bbd4a1b60b Provide a way to control SysV shmem attach address in EXEC_BACKEND builds.
In standard non-Windows builds, there's no particular reason to care what
address the kernel chooses to map the shared memory segment at.  However,
when building with EXEC_BACKEND, there's a risk that the chosen address
won't be available in all child processes.  Linux with ASLR enabled (which
it is by default) seems particularly at risk because it puts shmem segments
into the same area where it maps shared libraries.  We can work around
that by specifying a mapping address that's outside the range where
shared libraries could get mapped.  On x86_64 Linux, 0x7e0000000000
seems to work well.

This is only meant for testing/debugging purposes, so it doesn't seem
necessary to go as far as providing a GUC (or any user-visible
documentation, though we might change that later).  Instead, it's just
controlled by setting an environment variable PG_SHMEM_ADDR to the
desired attach address.

Back-patch to all supported branches, since the point here is to
remove intermittent buildfarm failures on EXEC_BACKEND animals.
Owners of affected animals will need to add a suitable setting of
PG_SHMEM_ADDR to their build_env configuration.

Discussion: https://postgr.es/m/7036.1492231361@sss.pgh.pa.us
2017-04-15 17:28:02 -04:00
43d17489d1 Try to find out the actual hugepage size when making a MAP_HUGETLB request.
Even if Linux's mmap() is okay with a partial-hugepage request, munmap()
is not, as reported by Chris Richards.  Therefore it behooves us to try
a bit harder to find out the actual hugepage size, instead of assuming
that we can skate by with a guess.

For the moment, just look into /proc/meminfo to find out the default
hugepage size, and use that.  Later, on kernels that support requests
for nondefault sizes, we might try to consider other alternatives.
But that smells more like a new feature than a bug fix, especially if
we want to provide any way for the DBA to control it, so leave it for
another day.

I set this up to allow easy addition of platform-specific code for
non-Linux platforms, if needed; but right now there are no reports
suggesting that we need to work harder on other platforms.

Back-patch to 9.4 where hugepage support was introduced.

Discussion: <31056.1476303954@sss.pgh.pa.us>
2016-10-13 15:07:04 -04:00
b1b56b653c Clean up handling of anonymous mmap'd shared-memory segment.
Fix detaching of the mmap'd segment to have its own on_shmem_exit callback,
rather than piggybacking on the one for detaching from the SysV segment.
That was confusing, and given the distance between the two attach calls,
it was trouble waiting to happen.

Make the detaching calls idempotent by clearing AnonymousShmem to show
we've already unmapped.  I spent quite a bit of time yesterday trying
to find a path that would allow the munmap()'s to be done twice, and
while I did not succeed, it seems silly that there's even a question.

Make the #ifdef logic less confusing by separating "do we want to use
anonymous shmem" from EXEC_BACKEND.  Even though there's no current
scenario where those conditions are different, it is not helpful for
different places in the same file to be testing EXEC_BACKEND for what
are fundamentally different reasons.

Don't do on_exit_reset() in StartBackgroundWorker().  At best that's
useless (InitPostmasterChild would have done it already) and at worst
it could zap some callback that's unrelated to shared memory.

Improve comments, and simplify the huge_pages enablement logic slightly.

Back-patch to 9.4 where hugepage support was introduced.
Arguably this should go into 9.3 as well, but the code looks
significantly different there, and I doubt it's worth the
trouble of adapting the patch given I can't show a live bug.
2016-10-13 13:59:56 -04:00
39ac293940 On Windows, ensure shared memory handle gets closed if not being used.
Postmaster child processes that aren't supposed to be attached to shared
memory were not bothering to close the shared memory mapping handle they
inherit from the postmaster process.  That's mostly harmless, since the
handle vanishes anyway when the child process exits -- but the syslogger
process, if used, doesn't get killed and restarted during recovery from a
backend crash.  That meant that Windows doesn't see the shared memory
mapping as becoming free, so it doesn't delete it and the postmaster is
unable to create a new one, resulting in failure to recover from crashes
whenever logging_collector is turned on.

Per report from Dmitry Vasilyev.  It's a bit astonishing that we'd not
figured this out long ago, since it's been broken from the very beginnings
of out native Windows support; probably some previously-unexplained trouble
reports trace to this.

A secondary problem is that on Cygwin (perhaps only in older versions?),
exec() may not detach from the shared memory segment after all, in which
case these child processes did remain attached to shared memory, posing
the risk of an unexpected shared memory clobber if they went off the rails
somehow.  That may be a long-gone bug, but we can deal with it now if it's
still live, by detaching within the infrastructure introduced here to deal
with closing the handle.

Back-patch to all supported branches.

Tom Lane and Amit Kapila
2015-10-13 11:21:33 -04:00
807b9e0dff pgindent run for 9.5 2015-05-23 21:35:49 -04:00
4baaf863ec Update copyright for 2015
Backpatch certain files through 9.0
2015-01-06 11:43:47 -05:00
65c9dc231a Assorted message improvements 2014-08-29 00:26:17 -04:00
66802246e2 Fix weird spacing in error message.
Seems to have been introduced in 1a3458b6d8.
2014-06-18 15:44:35 -04:00
0a78320057 pgindent run for 9.4
This includes removing tabs after periods in C comments, which was
applied to back branches, so this change should not effect backpatching.
2014-05-06 12:12:18 -04:00
11a65eed16 Get rid of the dynamic shared memory state file.
Instead of storing the ID of the dynamic shared memory control
segment in a file within the data directory, store it in the main
control segment.  This avoids a number of nasty corner cases,
most seriously that doing an online backup and then using it on
the same machine (e.g. to fire up a standby) would result in the
standby clobbering all of the master's dynamic shared memory
segments.

Per complaints from Heikki Linnakangas, Fujii Masao, and Tom
Lane.
2014-04-08 11:39:55 -04:00
f8ce16d0d2 Rename huge_tlb_pages to huge_pages, and improve docs.
Christian Kruse
2014-03-03 20:52:48 +02:00
571addd729 Fix unsafe references to errno within error messaging logic.
Various places were supposing that errno could be expected to hold still
within an ereport() nest or similar contexts.  This isn't true necessarily,
though in some cases it accidentally failed to fail depending on how the
compiler chanced to order the subexpressions.  This class of thinko
explains recent reports of odd failures on clang-built versions, typically
missing or inappropriate HINT fields in messages.

Problem identified by Christian Kruse, who also submitted the patch this
commit is based on.  (I fixed a few issues in his patch and found a couple
of additional places with the same disease.)

Back-patch as appropriate to all supported branches.
2014-01-29 20:04:43 -05:00
699b1f40da Fix thinko in huge_tlb_pages patch.
We calculated the rounded-up size for the allocation, but then failed to
use the rounded-up value in the mmap() call. Oops.

Also, initialize allocsize, to silence warnings seen with some compilers,
as pointed out by Jeff Janes.
2014-01-29 21:33:56 +02:00
1a3458b6d8 Allow using huge TLB pages on Linux (MAP_HUGETLB)
This patch adds an option, huge_tlb_pages, which allows requesting the
shared memory segment to be allocated using huge pages, by using the
MAP_HUGETLB flag in mmap(). This can improve performance.

The default is 'try', which means that we will attempt using huge pages,
and fall back to non-huge pages if it doesn't work. Currently, only Linux
has MAP_HUGETLB. On other platforms, the default 'try' behaves the same as
'off'.

In the passing, don't try to round the mmap() size to a multiple of
pagesize. mmap() doesn't require that, and there's no particular reason for
PostgreSQL to do that either. When using MAP_HUGETLB, however, round the
request size up to nearest 2MB boundary. This is to work around a bug in
some Linux kernel versions, but also to avoid wasting memory, because the
kernel will round the size up anyway.

Many people were involved in writing this patch, including Christian Kruse,
Richard Poole, Abhijit Menon-Sen, reviewed by Peter Geoghegan, Andres Freund
and me.
2014-01-29 14:08:30 +02:00
ac4ef637ad Allow use of "z" flag in our printf calls, and use it where appropriate.
Since C99, it's been standard for printf and friends to accept a "z" size
modifier, meaning "whatever size size_t has".  Up to now we've generally
dealt with printing size_t values by explicitly casting them to unsigned
long and using the "l" modifier; but this is really the wrong thing on
platforms where pointers are wider than longs (such as Win64).  So let's
start using "z" instead.  To ensure we can do that on all platforms, teach
src/port/snprintf.c to understand "z", and add a configure test to force
use of that implementation when the platform's version doesn't handle "z".

Having done that, modify a bunch of places that were using the
unsigned-long hack to use "z" instead.  This patch doesn't pretend to have
gotten everyplace that could benefit, but it catches many of them.  I made
an effort in particular to ensure that all uses of the same error message
text were updated together, so as not to increase the number of
translatable strings.

It's possible that this change will result in format-string warnings from
pre-C99 compilers.  We might have to reconsider if there are any popular
compilers that will warn about this; but let's start by seeing what the
buildfarm thinks.

Andres Freund, with a little additional work by me
2014-01-23 17:18:33 -05:00
7e04792a1c Update copyright for 2014
Update all files in head, and files COPYRIGHT and legal.sgml in all back
branches.
2014-01-07 16:05:30 -05:00
0ac5e5a7e1 Allow dynamic allocation of shared memory segments.
Patch by myself and Amit Kapila.  Design help from Noah Misch.  Review
by Andres Freund.
2013-10-09 21:05:02 -04:00
9d775d8894 Message style improvements 2013-08-07 22:48:40 -04:00
9af4159fce pgindent run for release 9.3
This is the first run of the Perl-based pgindent script.  Also update
pgindent instructions.
2013-05-29 16:58:43 -04:00
bd61a623ac Update copyrights for 2013
Fully update git head, and update back branches in ./COPYRIGHT and
legal.sgml files.
2013-01-01 17:15:01 -05:00
6a77bff086 Remove misleading hints about reducing the System V request size.
Since the request size will now be ~48 bytes regardless of how
shared_buffers et. al. are set, much of this advice is no longer
relevant.
2012-07-03 10:07:47 -04:00
81e8264383 Declare AnonymousShmem pointer as "void *".
The original coding had it as "PGShmemHeader *", but that doesn't offer any
notational benefit because we don't dereference it.  And it was resulting
in compiler warnings on some platforms, notably buildfarm member
castoroides, where mmap() and munmap() are evidently declared to take and
return "char *".
2012-06-30 17:19:46 -04:00
42e2ce6ae3 Fix confusion between "size" and "AnonymousShmemSize".
Noted by Andres Freund.  Also improve a couple of comments.
2012-06-29 15:12:10 -04:00
c1494b7330 Provide MAP_FAILED if sys/mman.h doesn't.
On old HPUX this has to be #defined to -1.  It might be that other values
are required on other dinosaur systems, but we'll worry about that when
and if we get reports.
2012-06-28 14:19:20 -04:00
39715af23a Fix broken mmap failure-detection code, and improve error message.
Per an observation by Thom Brown that my previous commit made an
overly large shmem allocation crash the server, on Linux.
2012-06-28 12:57:22 -04:00
b0fc0df936 Dramatically reduce System V shared memory consumption.
Except when compiling with EXEC_BACKEND, we'll now allocate only a tiny
amount of System V shared memory (as an interlock to protect the data
directory) and allocate the rest as anonymous shared memory via mmap.
This will hopefully spare most users the hassle of adjusting operating
system parameters before being able to start PostgreSQL with a
reasonable value for shared_buffers.

There are a bunch of documentation updates needed here, and we might
need to adjust some of the HINT messages related to shared memory as
well.  But it's not 100% clear how portable this is, so before we
write the documentation, let's give it a spin on the buildfarm and
see what turns red.
2012-06-28 11:05:16 -04:00
64f09ca386 Remove leftovers of BeOS port
These should have been removed when the BeOS port was removed in
44f9021223.
2012-05-14 04:50:39 +03:00
e126958c2e Update copyright notices for year 2012. 2012-01-01 18:01:58 -05:00
bb46d42859 Consistent spacing for lengthy error messages
Also, we removed the display of the current value of
max_connections/MaxBackends from some messages earlier, because it was
confusing, so do that in the remaining one as well.
2011-05-19 21:38:24 +03:00
bf50caf105 pgindent run before PG 9.1 beta 1. 2011-04-10 11:42:00 -04:00
9d56886112 Fix two missing spaces in error messages.
Josh Kupershmidt
2011-04-01 14:42:38 +03:00
67a5e727c8 Be less detailed about reporting shared memory failure by avoiding the
output of actual Postgres parameter _values_ related to shared memory,
and suggesting that these are only possible parameters to reduce.
2011-02-27 12:21:58 -05:00
52948169bc Code review for postmaster.pid contents changes.
Fix broken test for pre-existing postmaster, caused by wrong code for
appending lines to the lockfile; don't write a failed listen_address
setting into the lockfile; don't arbitrarily change the location of the
data directory in the lockfile compared to previous releases; provide more
consistent and useful definitions of the socket path and listen_address
entries; avoid assuming that pg_ctl has the same DEFAULT_PGSOCKET_DIR as
the postmaster; assorted code style improvements.
2011-01-13 19:01:28 -05:00
5d950e3b0c Stamp copyrights for year 2011. 2011-01-01 13:18:15 -05:00
30aeda4394 Include the first valid listen address in pg_ctl to improve server start
"wait" detection and add postmaster start time to help determine if the
postmaster is actually using the specified data directory.
2010-12-31 17:25:02 -05:00
9f2e211386 Remove cvs keywords from all files. 2010-09-20 22:08:53 +02:00
acac35adca Improve hint message for ENOMEM failure from shmget().
It turns out that some platforms return ENOMEM for a request that violates
SHMALL, whereas we were assuming that ENOSPC would always be used for that.
Apparently the latter is a Linuxism while ENOMEM is the BSD tradition.
Extend the ENOMEM hint to suggest that raising SHMALL might be needed.
Per gripe from A.M.

Backpatch to 9.0, but not further, because this doesn't seem important
enough to warrant creating extra translation work in the stable branches.
(If it were, we'd have figured this out years ago.)
2010-08-25 20:10:55 +00:00
239d769e7e pgindent run for 9.0, second run 2010-07-06 19:19:02 +00:00
154163238e Add code to InternalIpcMemoryCreate() to handle the case where shmget()
returns EINVAL for an existing shared memory segment.  Although it's not
terribly sensible, that behavior does meet the POSIX spec because EINVAL
is the appropriate error code when the existing segment is smaller than the
requested size, and the spec explicitly disclaims any particular ordering of
error checks.  Moreover, it does in fact happen on OS X and probably other
BSD-derived kernels.  (We were able to talk NetBSD into changing their code,
but purging that behavior from the wild completely seems unlikely to happen.)
We need to distinguish collision with a pre-existing segment from invalid size
request in order to behave sensibly, so it's worth some extra code here to get
it right.  Per report from Gavin Kistner and subsequent investigation.

Back-patch to all supported versions, since any of them could get used
with a kernel having the debatable behavior.
2010-05-01 22:46:30 +00:00
0239800893 Update copyright for the year 2010. 2010-01-02 16:58:17 +00:00
511db38ace Update copyright for 2009. 2009-01-01 17:24:05 +00:00
9098ab9e32 Update copyrights in source tree to 2008. 2008-01-01 19:46:01 +00:00
fdf5a5efb7 pgindent run for 8.3. 2007-11-15 21:14:46 +00:00
1c7fe33fdb Fix failure to restart Postgres when Linux kernel returns EIDRM for shmctl().
This is a Linux kernel bug that apparently exists in every extant kernel
version: sometimes shmctl() will fail with EIDRM when EINVAL is correct.
We were assuming that EIDRM indicates a possible conflict with pre-existing
backends, and refusing to start the postmaster when this happens.  Fortunately,
there does not seem to be any case where Linux can legitimately return EIDRM
(it doesn't track shmem segments in a way that would allow that), so we can
get away with just assuming that EIDRM means EINVAL on this platform.

Per reports from Michael Fuhr and Jon Lapham --- it's a bit surprising
we have not seen more reports, actually.
2007-07-02 20:11:55 +00:00
18d82d03b5 Native shared memory implementation for win32.
Uses same underlying tech as before, but not the sysv emulation layer.
2007-03-21 14:39:23 +00:00
28c3cd5c1c Fix typo in comment. 2007-02-06 16:20:23 +00:00
29dccf5fe0 Update CVS HEAD for 2007 copyright. Back branches are typically not
back-stamped for this.
2007-01-05 22:20:05 +00:00
ae643747b1 Fix a passel of recently-committed violations of the rule 'thou shalt
have no other gods before c.h'.  Also remove some demonstrably redundant
#include lines, mostly of <errno.h> which was added to c.h years ago.
2006-07-14 05:28:29 +00:00
f2f5b05655 Update copyright for 2006. Update scripts. 2006-03-05 15:59:11 +00:00