1
0
mirror of https://github.com/postgres/postgres.git synced 2025-07-30 11:03:19 +03:00
Commit Graph

32697 Commits

Author SHA1 Message Date
97395185b8 Make configure probe for mbstowcs_l as well as wcstombs_l.
We previously supposed that any given platform would supply both or neither
of these functions, so that one configure test would be sufficient.  It now
appears that at least on AIX this is not the case ... which is likely an
AIX bug, but nonetheless we need to cope with it.  So use separate tests.
Per bug #6758; thanks to Andrew Hastie for doing the followup testing
needed to confirm what was happening.

Backpatch to 9.1, where we began using these functions.
2012-08-31 14:18:08 -04:00
6707dd48cd Back-patch recent fixes for gistchoose and gistRelocateBuildBuffersOnSplit.
This back-ports commits c8ba697a4b and
e5db11c558, which fix one definite and one
speculative bug in gistchoose, and make the code a lot more intelligible as
well.  In 9.2 only, this also affects the largely-copied-and-pasted logic
in gistRelocateBuildBuffersOnSplit.

The impact of the bugs was that the functions might make poor decisions
as to which index tree branch to push a new entry down into, resulting in
GiST index bloat and poor performance.  The fixes rectify these decisions
for future insertions, but a REINDEX would be needed to clean up any
existing index bloat.

Alexander Korotkov, Robert Haas, Tom Lane
2012-08-30 23:47:54 -04:00
f6956eb74e Document how to prevent PostgreSQL itself from exhausting memory.
The existing documentation in Linux Memory Overcommit seemed to
assume that PostgreSQL itself could never be the problem, or at
least it didn't tell you what to do about it.

Per discussion with Craig Ringer and Kevin Grittner.
2012-08-30 14:23:05 -04:00
657face6fe Add missing period to detail message.
Per note from Peter Eisentraut.
2012-08-30 13:27:18 -04:00
ed597d0c32 Back-patch fixes for some issues in our Windows socket code into 9.1.
This is a backport of commit b85427f227.
Per discussion of bug #4958.  Some of these fixes probably need to be
back-patched further, but I'm just doing this much for now.
2012-08-27 15:00:08 -04:00
180ce0af33 Fix issues with checks for unsupported transaction states in Hot Standby.
The GUC check hooks for transaction_read_only and transaction_isolation
tried to check RecoveryInProgress(), so as to disallow setting read/write
mode or serializable isolation level (respectively) in hot standby
sessions.  However, GUC check hooks can be called in many situations where
we're not connected to shared memory at all, resulting in a crash in
RecoveryInProgress().  Among other cases, this results in EXEC_BACKEND
builds crashing during child process start if default_transaction_isolation
is serializable, as reported by Heikki Linnakangas.  Protect those calls
by silently allowing any setting when not inside a transaction; which is
okay anyway since these GUCs are always reset at start of transaction.

Also, add a check to GetSerializableTransactionSnapshot() to complain
if we are in hot standby.  We need that check despite the one in
check_XactIsoLevel() because default_transaction_isolation could be
serializable.  We don't want to complain any sooner than this in such
cases, since that would prevent running transactions at all in such a
state; but a transaction can be run, if SET TRANSACTION ISOLATION is done
before setting a snapshot.  Per report some months ago from Robert Haas.

Back-patch to 9.1, since these problems were introduced by the SSI patch.

Kevin Grittner and Tom Lane, with ideas from Heikki Linnakangas
2012-08-24 13:09:17 -04:00
ff122d3268 Fix cascading privilege revoke to notice when privileges are still held.
If we revoke a grant option from some role X, but X still holds the option
via another grant, we should not recursively revoke the privilege from
role(s) Y that X had granted it to.  This was supposedly fixed as one
aspect of commit 4b2dafcc0b, but I must not
have tested it, because in fact that code never worked: it forgot to shift
the grant-option bits back over when masking the bits being revoked.

Per bug #6728 from Daniel German.  Back-patch to all active branches,
since this has been wrong since 8.0.
2012-08-23 17:25:23 -04:00
874d97c2a8 Fix bugs in contrib/pg_trgm's LIKE pattern analysis code.
Extraction of trigrams did not process LIKE escape sequences properly,
leading to possible misidentification of trigrams near escapes, resulting
in incorrect index search results.

Fujii Masao
2012-08-20 13:25:03 -04:00
7665e82b6a Fix rescan logic in nodeCtescan.
The previous coding essentially assumed that nodes would be rescanned in
the same order they were initialized in; or at least that the "leader" of
a group of CTEscans would be rescanned before any others were required to
execute.  Unfortunately, that isn't even a little bit true.  It's possible
to devise queries in which the leader isn't rescanned until other CTEscans
on the same CTE have run to completion, or even in which the leader never
gets a rescan call at all.

The fix makes the leader specially responsible only for initial creation
and final destruction of the tuplestore; rescan resets are now a
symmetrically shared responsibility.  This means that we might reset the
tuplestore multiple times when restarting a plan subtree containing
multiple CTEscans; but resetting an already-empty tuplestore is cheap
enough that that doesn't seem like a problem.

Per report from Adam Mackler; the new regression test cases are based on
his example query.

Back-patch to 8.4 where CTE scans were introduced.
2012-08-15 19:01:29 -04:00
9e035184b0 Disallow extensions from owning the schema they are assigned to.
This situation creates a dependency loop that confuses pg_dump and probably
other things.  Moreover, since the mental model is that the extension
"contains" schemas it owns, but "is contained in" its extschema (even
though neither is strictly true), having both true at once is confusing for
people too.  So prevent the situation from being set up.

Reported and patched by Thom Brown.  Back-patch to 9.1 where extensions
were added.
2012-08-15 11:27:06 -04:00
04e96bc69d Stamp 9.1.5. REL9_1_5 2012-08-14 18:41:04 -04:00
18ee575df3 Update release notes for 9.1.5, 9.0.9, 8.4.13, 8.3.20. 2012-08-14 18:34:07 -04:00
e76e252286 Prevent access to external files/URLs via contrib/xml2's xslt_process().
libxslt offers the ability to read and write both files and URLs through
stylesheet commands, thus allowing unprivileged database users to both read
and write data with the privileges of the database server.  Disable that
through proper use of libxslt's security options.

Also, remove xslt_process()'s ability to fetch documents and stylesheets
from external files/URLs.  While this was a documented "feature", it was
long regarded as a terrible idea.  The fix for CVE-2012-3489 broke that
capability, and rather than expend effort on trying to fix it, we're just
going to summarily remove it.

While the ability to write as well as read makes this security hole
considerably worse than CVE-2012-3489, the problem is mitigated by the fact
that xslt_process() is not available unless contrib/xml2 is installed,
and the longstanding warnings about security risks from that should have
discouraged prudent DBAs from installing it in security-exposed databases.

Reported and fixed by Peter Eisentraut.

Security: CVE-2012-3488
2012-08-14 18:32:03 -04:00
0df2da9d72 Prevent access to external files/URLs via XML entity references.
xml_parse() would attempt to fetch external files or URLs as needed to
resolve DTD and entity references in an XML value, thus allowing
unprivileged database users to attempt to fetch data with the privileges
of the database server.  While the external data wouldn't get returned
directly to the user, portions of it could be exposed in error messages
if the data didn't parse as valid XML; and in any case the mere ability
to check existence of a file might be useful to an attacker.

The ideal solution to this would still allow fetching of references that
are listed in the host system's XML catalogs, so that documents can be
validated according to installed DTDs.  However, doing that with the
available libxml2 APIs appears complex and error-prone, so we're not going
to risk it in a security patch that necessarily hasn't gotten wide review.
So this patch merely shuts off all access, causing any external fetch to
silently expand to an empty string.  A future patch may improve this.

In HEAD and 9.2, also suppress warnings about undefined entities, which
would otherwise occur as a result of not loading referenced DTDs.  Previous
branches don't show such warnings anyway, due to different error handling
arrangements.

Credit to Noah Misch for first reporting the problem, and for much work
towards a solution, though this simplistic approach was not his preference.
Also thanks to Daniel Veillard for consultation.

Security: CVE-2012-3489
2012-08-14 18:32:02 -04:00
b5987c4f87 Translation updates 2012-08-14 16:34:12 -04:00
a2c04f189a Update time zone data files to tzdata release 2012e.
DST law changes in Morocco; Tokelau has relocated to the other side of
the International Date Line; and apparently Olson had Tokelau's GMT
offset wrong by an hour even before that.

There are also a large number of non-significant changes in this update.
Upstream took the opportunity to remove trailing whitespace, and the
SCCS-style version numbers on the individual files are gone too.
2012-08-14 10:54:36 -04:00
b2cc611959 Fix dependencies generated during ALTER TABLE ADD CONSTRAINT USING INDEX.
This command generated new pg_depend entries linking the index to the
constraint and the constraint to the table, which match the entries made
when a unique or primary key constraint is built de novo.  However, it did
not bother to get rid of the entries linking the index directly to the
table.  We had considered the issue when the ADD CONSTRAINT USING INDEX
patch was written, and concluded that we didn't need to get rid of the
extra entries.  But this is wrong: ALTER COLUMN TYPE wasn't expecting such
redundant dependencies to exist, as reported by Hubert Depesz Lubaczewski.
On reflection it seems rather likely to break other things as well, since
there are many bits of code that crawl pg_depend for one purpose or
another, and most of them are pretty naive about what relationships they're
expecting to find.  Fortunately it's not that hard to get rid of the extra
dependency entries, so let's do that.

Back-patch to 9.1, where ALTER TABLE ADD CONSTRAINT USING INDEX was added.
2012-08-11 12:51:36 -04:00
64d64a0530 Fix upper limit of superuser_reserved_connections, add limit for wal_senders
Should be limited to the maximum number of connections excluding
autovacuum workers, not including.

Add similar check for max_wal_senders, which should never be higher than
max_connections.
2012-08-10 14:52:16 +02:00
2bf6e8cbc0 Update isolation tests' README file.
The directions explaining about running the prepared-transactions test
were not updated in commit ae55d9fbe3.
2012-08-08 12:02:15 -04:00
efed8c0031 fsync backup_label after pg_start_backup()
Dave Kerr, backpatched by Simon Riggs
2012-08-07 16:21:49 +01:00
1c638c8074 Typo fixes for previous commit.
Noted by Thom Brown.
2012-08-06 16:12:39 -04:00
16a69120eb Warn more vigorously about the non-transactional behavior of sequences.
Craig Ringer, edited fairly heavily by me
2012-08-06 15:18:54 -04:00
3159599390 Perform conversion from Python unicode to string/bytes object via UTF-8.
We used to convert the unicode object directly to a string in the server
encoding by calling Python's PyUnicode_AsEncodedString function. In other
words, we used Python's routines to do the encoding. However, that has a
few problems. First of all, it required keeping a mapping table of Python
encoding names and PostgreSQL encodings. But the real killer was that Python
doesn't support EUC_TW and MULE_INTERNAL encodings at all.

Instead, convert the Python unicode object to UTF-8, and use PostgreSQL's
encoding conversion functions to convert from UTF-8 to server encoding. We
were already doing the same in the other direction in PLyUnicode_FromString,
so this is more consistent, too.

Note: This makes SQL_ASCII to behave more leniently. We used to map
SQL_ASCII to Python's 'ascii', which on Python means strict 7-bit ASCII
only, so you got an error if the python string contained anything but pure
ASCII. You no longer get an error; you get the UTF-8 representation of the
string instead.

Backpatch to 9.0, where these conversions were introduced.

Jan Urbański
2012-08-06 14:33:27 +03:00
c9c95202b0 Reword documentation for concurrent index rebuilds to be clearer.
Backpatch to 9.1 and 9.2.
2012-08-04 10:35:44 -04:00
805fc21c33 Fix bugs with parsing signed hh:mm and hh:mm:ss fields in interval input.
DecodeInterval() failed to honor the "range" parameter (the special SQL
syntax for indicating which fields appear in the literal string) if the
time was signed.  This seems inappropriate, so make it work like the
not-signed case.  The inconsistency was introduced in my commit
f867339c01, which as noted in its log message
was only really focused on making SQL-compliant literals work per spec.
Including a sign here is not per spec, but if we're going to allow it
then it's reasonable to expect it to work like the not-signed case.

Also, remove bogus setting of tmask, which caused subsequent processing to
think that what had been given was a timezone and not an hh:mm(:ss) field,
thus confusing checks for redundant fields.  This seems to be an aboriginal
mistake in Lockhart's commit 2cf1642461.

Add regression test cases to illustrate the changed behaviors.

Back-patch as far as 8.4, where support for spec-compliant interval
literals was added.

Range problem reported and diagnosed by Amit Kapila, tmask problem by me.
2012-08-03 17:39:50 -04:00
d06dfc1b63 Document that, for psql -c, only the result of the last command is
returned, per report from Aleksey Tsalolikhin

Backpatch to 9.2 and 9.1.
2012-08-03 14:02:22 -04:00
b1c891e90b Fix WITH attached to a nested set operation (UNION/INTERSECT/EXCEPT).
Parse analysis neglected to cover the case of a WITH clause attached to an
intermediate-level set operation; it only handled WITH at the top level
or WITH attached to a leaf-level SELECT.  Per report from Adam Mackler.

In HEAD, I rearranged the order of SelectStmt's fields to put withClause
with the other fields that can appear on non-leaf SelectStmts.  In back
branches, leave it alone to avoid a possible ABI break for third-party
code.

Back-patch to 8.4 where WITH support was added.
2012-07-31 17:56:32 -04:00
118b941614 Fix syslogger so that log_truncate_on_rotation works in the first rotation.
In the original coding of the log rotation stuff, we did not bother to make
the truncation logic work for the very first rotation after postmaster
start (or after a syslogger crash and restart).  It just always appended
in that case.  It did not seem terribly important at the time, but we've
recently had two separate complaints from people who expected it to work
unsurprisingly.  (Both users tend to restart the postmaster about as often
as a log rotation is configured to happen, which is maybe not typical use,
but still...)  Since the initial log file is opened in the postmaster,
fixing this requires passing down some more state to the syslogger child
process.

It's always been like this, so back-patch to all supported branches.
2012-07-31 14:37:04 -04:00
27394f76bf Now that the diskchecker.pl author has updated the download link on his
website, revert the separate link to the download git repository.

Backpatch from 9.0 to current.
2012-07-30 10:15:56 -04:00
022e8786da Improve reporting of error situations in find_other_exec().
This function suppressed any stderr output from the called program, which
is unnecessary in the normal case and unhelpful in error cases.  It also
gave a rather opaque message along the lines of "fgets failure: Success"
in case the called program failed to return anything on stdout.  Since
we've seen multiple reports of people not understanding what's wrong when
pg_ctl reports this, improve the message.

Back-patch to all active branches.
2012-07-27 19:31:24 -04:00
3d980e15ee Update doc mention of diskchecker.pl to add URL for script; retain URL
for description.

Patch to 9.0 and later, where script is mentioned.
2012-07-26 21:25:25 -04:00
7a7d970b36 Only allow autovacuum to be auto-canceled by a directly blocked process.
In the original coding of the autovacuum cancel feature, commit
acac68b2bc, an autovacuum process was
considered a target for cancellation if it was found to hard-block any
process examined in the deadlock search.  This patch tightens the test so
that the autovacuum must directly hard-block the current process.  This
should make the behavior more predictable in general, and in particular
it ensures that an autovacuum will not be canceled with less than
deadlock_timeout grace period.  In the old coding, it was possible for an
autovacuum to be canceled almost instantly, given unfortunate timing of two
or more other processes' lock attempts.

This also justifies the logging methodology in the recent commit
d7318d43d891bd63e82dcfc27948113ed7b1db80; without this restriction, that
patch isn't providing enough information to see the connection of the
canceling process to the autovacuum.  Like that one, patch all the way
back.
2012-07-26 14:29:37 -04:00
a1195a5c26 Log a better message when canceling autovacuum.
The old message was at DEBUG2, so typically it didn't show up in the
log at all.  As a result, in most cases where autovacuum was canceled,
the only information that was logged was the table being vacuumed,
with no indication as to what problem caused the cancel.  Crank up
the level to LOG and add some more details to assist with debugging.

Back-patch all the way, per discussion on pgsql-hackers.
2012-07-26 09:18:32 -04:00
aa7cd14406 Fix longstanding crash-safety bug with newly-created-or-reset sequences.
If a crash occurred immediately after the first nextval() call for a serial
column, WAL replay would restore the sequence to a state in which it
appeared that no nextval() had been done, thus allowing the first sequence
value to be returned again by the next nextval() call; as reported in
bug #6748 from Xiangming Mei.

More generally, the problem would occur if an ALTER SEQUENCE was executed
on a freshly created or reset sequence.  (The manifestation with serial
columns was introduced in 8.2 when we added an ALTER SEQUENCE OWNED BY step
to serial column creation.)  The cause is that sequence creation attempted
to save one WAL entry by writing out a WAL record that made it appear that
the first nextval() had already happened (viz, with is_called = true),
while marking the sequence's in-database state with log_cnt = 1 to show
that the first nextval() need not emit a WAL record.  However, ALTER
SEQUENCE would emit a new WAL entry reflecting the actual in-database state
(with is_called = false).  Then, nextval would allocate the first sequence
value and set is_called = true, but it would trust the log_cnt value and
not emit any WAL record.  A crash at this point would thus restore the
sequence to its post-ALTER state, causing the next nextval() call to return
the first sequence value again.

To fix, get rid of the idea of logging an is_called status different from
reality.  This means that the first nextval-driven WAL record will happen
at the first nextval call not the second, but the marginal cost of that is
pretty negligible.  In addition, make sure that ALTER SEQUENCE resets
log_cnt to zero in any case where it touches sequence parameters that
affect future nextval results.  This will result in some user-visible
changes in the contents of a sequence's log_cnt column, as reflected in the
patch's regression test changes; but no application should be depending on
that anyway, since it was already true that log_cnt changes rather
unpredictably depending on checkpoint timing.

In addition, make some basically-cosmetic improvements to get rid of
sequence.c's undesirable intimacy with page layout details.  It was always
really trying to WAL-log the contents of the sequence tuple, so we should
have it do that directly using a HeapTuple's t_data and t_len, rather than
backing into it with some magic assumptions about where the tuple would be
on the sequence's page.

Back-patch to all supported branches.
2012-07-25 17:40:48 -04:00
baf6090f8c Remove now unneeded results file for disabled prepared transactions case. 2012-07-20 16:29:00 -04:00
963bafde8b Remove prepared transactions from main isolation test schedule.
There is no point in running this test when prepared transactions are disabled,
which is the default. New make targets that include the test are provided. This
will save some useless waste of cycles on buildfarm machines.

Backpatch to 9.1 where these tests were introduced.
2012-07-20 15:56:57 -04:00
200ff8bf39 Fix whole-row Var evaluation to cope with resjunk columns (again).
When a whole-row Var is reading the result of a subquery, we need it to
ignore any "resjunk" columns that the subquery might have evaluated for
GROUP BY or ORDER BY purposes.  We've hacked this area before, in commit
68e40998d0, but that fix only covered
whole-row Vars of named composite types, not those of RECORD type; and it
was mighty klugy anyway, since it just assumed without checking that any
extra columns in the result must be resjunk.  A proper fix requires getting
hold of the subquery's targetlist so we can actually see which columns are
resjunk (whereupon we can use a JunkFilter to get rid of them).  So bite
the bullet and add some infrastructure to make that possible.

Per report from Andrew Dunstan and additional testing by Merlin Moncure.
Back-patch to all supported branches.  In 8.3, also back-patch commit
292176a118, which for some reason I had
not done at the time, but it's a prerequisite for this change.
2012-07-20 13:09:16 -04:00
2f961b1b5f Improve coding around the fsync request queue.
In all branches back to 8.3, this patch fixes a questionable assumption in
CompactCheckpointerRequestQueue/CompactBgwriterRequestQueue that there are
no uninitialized pad bytes in the request queue structs.  This would only
cause trouble if (a) there were such pad bytes, which could happen in 8.4
and up if the compiler makes enum ForkNumber narrower than 32 bits, but
otherwise would require not-currently-planned changes in the widths of
other typedefs; and (b) the kernel has not uniformly initialized the
contents of shared memory to zeroes.  Still, it seems a tad risky, and we
can easily remove any risk by pre-zeroing the request array for ourselves.
In addition to that, we need to establish a coding rule that struct
RelFileNode can't contain any padding bytes, since such structs are copied
into the request array verbatim.  (There are other places that are assuming
this anyway, it turns out.)

In 9.1 and up, the risk was a bit larger because we were also effectively
assuming that struct RelFileNodeBackend contained no pad bytes, and with
fields of different types in there, that would be much easier to break.
However, there is no good reason to ever transmit fsync or delete requests
for temp files to the bgwriter/checkpointer, so we can revert the request
structs to plain RelFileNode, getting rid of the padding risk and saving
some marginal number of bytes and cycles in fsync queue manipulation while
we are at it.  The savings might be more than marginal during deletion of
a temp relation, because the old code transmitted an entirely useless but
nonetheless expensive-to-process ForgetRelationFsync request to the
background process, and also had the background process perform the file
deletion even though that can safely be done immediately.

In addition, make some cleanup of nearby comments and small improvements to
the code in CompactCheckpointerRequestQueue/CompactBgwriterRequestQueue.
2012-07-17 16:57:22 -04:00
5dd19d10d2 Remove recently added PL/Perl encoding tests
These only pass cleanly on UTF8 and SQL_ASCII encodings, besides the
Japanese encoding in which they were originally written, which is clearly
not good enough.  Since the functionality they test has not ever been
tested from PL/Perl, the best answer seems to be to remove the new tests
completely.

Per buildfarm results and ensuing discussion.
2012-07-17 13:26:55 -04:00
d6270e2df9 Prevent corner-case core dump in rfree().
rfree() failed to cope with the case that pg_regcomp() had initialized the
regex_t struct but then failed to allocate any memory for re->re_guts (ie,
the first malloc call in pg_regcomp() failed).  It would try to touch the
guts struct anyway, and thus dump core.  This is a sufficiently narrow
corner case that it's not surprising it's never been seen in the field;
but still a bug is a bug, so patch all active branches.

Noted while investigating whether we need to call pg_regfree after a
failure return from pg_regcomp.  Other than this bug, it turns out we
don't, so adjust comments appropriately.
2012-07-15 13:28:09 -04:00
d066cc548d Fix walsender processes to establish a SIGALRM handler.
Walsenders must have working SIGALRM handling during InitPostgres,
but they set the handler to SIG_IGN so that nothing would happen
if a timeout was reached.  This could result in two failure modes:

* If a walsender participated in a deadlock during its authentication
transaction, and was the last to wait in the deadly embrace, the deadlock
would not get cleared automatically.  This would require somebody to be
trying to take out AccessExclusiveLock on multiple system catalogs, so
it's not very probable.

* If a client failed to respond to a walsender's authentication challenge,
the intended disconnect after AuthenticationTimeout wouldn't happen, and
the walsender would wait indefinitely for the client.

For the moment, fix in back branches only, since this is fixed in a
different way in the timeout-infrastructure patch that's awaiting
application to HEAD.  If we choose not to apply that, then we'll need
to do this in HEAD as well.
2012-07-12 14:30:04 -04:00
a9287de176 Back-patch fix for extraction of fixed prefixes from regular expressions.
Back-patch of commits 628cbb50ba and
c6aae3042b.  This has been broken since
7.3, so back-patch to all supported branches.
2012-07-10 18:00:44 -04:00
ed45a53730 Back-patch addition of pg_wchar-to-multibyte conversion functionality.
Back-patch of commits 72dd6291f2,
f6a05fd973, and
60e9c224a1.

This is needed to support fixing the regex prefix extraction bug in
back branches.
2012-07-10 16:53:27 -04:00
892a8d0544 Add forgotten PL/Perl regression test files
Due to a git hook blowing up in my face telling me I could not commit
Peter Eisentraut's patch on his name, I had to "git reset" to fix the
previous commit ... and then forgot that I needed to "git add" these
files :-(
2012-07-10 16:46:59 -04:00
fc661f78c6 plperl: Skip setting UTF8 flag when in SQL_ASCII encoding
When in SQL_ASCII encoding, strings passed around are not necessarily
UTF8-safe.  We had already fixed this in some places, but it looks like
we missed some.

I had to backpatch Peter Eisentraut's a8b92b60 to 9.1 in order for this
patch to cherry-pick more cleanly.

Patch from Alex Hunsaker, tweaked by Kyotaro HORIGUCHI and myself.

Some desultory cleanup and comment addition by me, during patch review.

Per bug report from Christoph Berg in
20120209102116.GA14429@msgid.df7cb.de
2012-07-10 15:50:58 -04:00
1fbe7d377c PL/Perl: Avoid compiler warning from clang
Use SvREFCNT_inc_simple_void() instead of SvREFCNT_inc() to avoid
warning about unused return value.
2012-07-10 15:49:48 -04:00
62b9045666 Refactor pattern_fixed_prefix() to avoid dealing in incomplete patterns.
Previously, pattern_fixed_prefix() was defined to return whatever fixed
prefix it could extract from the pattern, plus the "rest" of the pattern.
That definition was sensible for LIKE patterns, but not so much for
regexes, where reconstituting a valid pattern minus the prefix could be
quite tricky (certainly the existing code wasn't doing that correctly).
Since the only thing that callers ever did with the "rest" of the pattern
was to pass it to like_selectivity() or regex_selectivity(), let's cut out
the middle-man and just have pattern_fixed_prefix's subroutines do this
directly.  Then pattern_fixed_prefix can return a simple selectivity
number, and the question of how to cope with partial patterns is removed
from its API specification.

While at it, adjust the API spec so that callers who don't actually care
about the pattern's selectivity (which is a lot of them) can pass NULL for
the selectivity pointer to skip doing the work of computing a selectivity
estimate.

This patch is only an API refactoring that doesn't actually change any
processing, other than allowing a little bit of useless work to be skipped.
However, it's necessary infrastructure for my upcoming fix to regex prefix
extraction, because after that change there won't be any simple way to
identify the "rest" of the regex, not even to the low level of fidelity
needed by regex_selectivity.  We can cope with that if regex_fixed_prefix
and regex_selectivity communicate directly, but not if we have to work
within the old API.  Hence, back-patch to all active branches.
2012-07-09 23:23:09 -04:00
b6108fe59b Fix planner to pass correct collation to operator selectivity estimators.
We can do this without creating an API break for estimation functions
by passing the collation using the existing fmgr functionality for
passing an input collation as a hidden parameter.

The need for this was foreseen at the outset, but we didn't get around to
making it happen in 9.1 because of the decision to sort all pg_statistic
histograms according to the database's default collation.  That meant that
selectivity estimators generally need to use the default collation too,
even if they're estimating for an operator that will do something
different.  The reason it's suddenly become more interesting is that
regexp interpretation also uses a collation (for its LC_TYPE not LC_COLLATE
property), and we no longer want to use the wrong collation when examining
regexps during planning.  It's not that the selectivity estimate is likely
to change much from this; rather that we are thinking of caching compiled
regexps during planner estimation, and we won't get the intended benefit
if we cache them with a different collation than the executor will use.

Back-patch to 9.1, both because the regexp change is likely to get
back-patched and because we might as well get this right in all
collation-supporting branches, in case any third-party code wants to
rely on getting the collation.  The patch turns out to be minuscule
now that I've done it ...
2012-07-08 23:51:19 -04:00
3f35526479 Don't try to trim "../" in join_path_components().
join_path_components() tried to remove leading ".." components from its
tail argument, but it was not nearly bright enough to do so correctly
unless the head argument was (a) absolute and (b) canonicalized.
Rather than try to fix that logic, let's just get rid of it: there is no
correctness reason to remove "..", and cosmetic concerns can be taken
care of by a subsequent canonicalize_path() call.  Per bug #6715 from
Greg Davidson.

Back-patch to all supported branches.  It appears that pre-9.2, this
function is only used with absolute paths as head arguments, which is why
we'd not noticed the breakage before.  However, third-party code might be
expecting this function to work in more general cases, so it seems wise
to back-patch.

In HEAD and 9.2, also make some minor cosmetic improvements to callers.
2012-07-05 17:16:36 -04:00
b4234f8fc4 Revert part of the previous patch that avoided using PLy_elog().
That caused the plpython_unicode regression test to fail on SQL_ASCII
encoding, as evidenced by the buildfarm. The reason is that with the patch,
you don't get the detail in the error message that you got before. That
detail is actually very informative, so rather than just adjust the expected
output, let's revert that part of the patch for now to make the buildfarm
green again, and figure out some other way to avoid the recursion of
PLy_elog() that doesn't lose the detail.
2012-07-05 23:44:45 +03:00